The monotonic_cst parameter in scikit-learn's DecisionTreeClassifier lets you specify a monotonic constraint for each feature. A constraint forces the tree's predictions to be monotonically increasing or decreasing in that feature.
Monotonic constraints can be useful when you have prior knowledge that certain features have a monotonic relationship with the target variable. For example, in a pricing model, you might expect that a higher quality rating would never lead to a lower price.
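The monotonic_cst parameter (added in scikit-learn 1.4) takes an array-like of integers with one entry per feature: 1 for an increasing constraint, -1 for a decreasing constraint, and 0 for no constraint. The default value is None, meaning no constraints are enforced. For DecisionTreeClassifier, the constraint applies to the predicted probability of the positive class, and constraints are supported only for binary, single-output classification on data without missing values.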
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different monotonic_cst settings
# (one entry per feature: 1 = increasing, -1 = decreasing, 0 = unconstrained)
settings = [None, [1, 0], [-1, 0], [0, 1]]
acc_scores = []

for setting in settings:
    dt = DecisionTreeClassifier(monotonic_cst=setting, random_state=42)
    dt.fit(X_train, y_train)
    y_pred = dt.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    acc_scores.append(acc)
    print(f"monotonic_cst={setting}, Accuracy: {acc:.3f}")
The output will look similar to:
monotonic_cst=None, Accuracy: 0.925
monotonic_cst=[1, 0], Accuracy: 0.875
monotonic_cst=[-1, 0], Accuracy: 0.840
monotonic_cst=[0, 1], Accuracy: 0.910
The key steps in this example are:
- Generate a synthetic binary classification dataset with two informative features
- Split the data into train and test sets
- Train DecisionTreeClassifier models with different monotonic_cst settings
- Evaluate the accuracy of each model on the test set
Some tips and heuristics for setting monotonic_cst:
- Use domain knowledge to identify features that are expected to have a monotonic relationship with the target
- Set the constraint to 1 for increasing relationships and -1 for decreasing relationships
- Leave the constraint as 0 for features without a monotonic relationship
Issues to consider:
- Incorrectly constraining a feature can lead to poorer model performance
- Monotonic constraints limit the expressiveness and flexibility of the decision tree