The monotonic_cst parameter in scikit-learn’s DecisionTreeRegressor allows enforcing monotonic constraints on the tree’s predictions with respect to specified input features.
A monotonic relationship between a feature and the target means that, with all other features held fixed, the predicted target never decreases as the feature value increases (positive monotonic) or never increases (negative monotonic).
The monotonic_cst parameter accepts an array specifying the desired monotonic constraint for each input feature, one entry per feature. A value of 0 indicates no constraint, 1 indicates an increasing constraint, and -1 indicates a decreasing constraint.
By default, monotonic_cst is set to None, imposing no monotonic constraints on the tree.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
import numpy as np
# Generate synthetic dataset with monotonic relationships
X, y = make_regression(n_samples=1000, n_features=5, noise=0.1, random_state=42)
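# Note: sorting a column reorders it independently of y, so on its own it does
# not guarantee a monotonic relationship between that feature and the target
X[:, 0] = np.sort(X[:, 0])  # Feature 0 sorted in ascending order
X[:, 1] = np.sort(X[:, 1])[::-1]  # Feature 1 sorted in descending order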
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different monotonic_cst values
monotonic_cst_values = [None, [1, -1, 0, 0, 0], [0, 0, 1, -1, 0]]
mse_scores = []
for cst in monotonic_cst_values:
    dt = DecisionTreeRegressor(random_state=42, monotonic_cst=cst)
    dt.fit(X_train, y_train)
    y_pred = dt.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"monotonic_cst={cst}, MSE: {mse:.3f}")
Running the example gives an output like:
monotonic_cst=None, MSE: 5910.115
monotonic_cst=[1, -1, 0, 0, 0], MSE: 5602.567
monotonic_cst=[0, 0, 1, -1, 0], MSE: 4171.930
The key steps in this example are:
- Generate a synthetic regression dataset intended to have monotonic relationships for some features
- Train DecisionTreeRegressor models with different monotonic_cst settings
- Evaluate the mean squared error (MSE) of each model on the test set
Some tips and heuristics for setting monotonic_cst:
- Use domain knowledge to determine which features are expected to have monotonic relationships with the target
- Inspect partial dependence plots to verify the assumed monotonic relationships
- Compare model performance with and without the constraints to assess the benefit
Issues to consider:
- Applying incorrect monotonic constraints can harm the model’s performance
- Monotonic constraints may be too restrictive for some datasets, limiting the model’s flexibility