The `monotonic_cst` parameter in scikit-learn's `RandomForestClassifier` lets you enforce monotonic constraints on the decision trees in the ensemble.

Monotonic constraints specify whether a feature has a monotonically increasing or decreasing relationship with the target variable. This is useful when you have prior domain knowledge about the relationships between features and the target.
The `monotonic_cst` parameter takes a list whose length equals the number of features. Each value is 1, 0, or -1, indicating a monotonically increasing relationship, no constraint, or a monotonically decreasing relationship, respectively. By default, `monotonic_cst` is set to `None`, which means no monotonic constraints are applied. Note that this parameter requires scikit-learn 1.4 or later for tree-based estimators.
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Generate a synthetic classification dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=5,
                           n_redundant=0, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Train with different monotonic_cst configurations
configs = [
    None,              # No constraints
    [1, 0, 0, 0, 0],   # Monotonically increasing for feature 0
    [0, -1, 0, 0, 0],  # Monotonically decreasing for feature 1
    [1, -1, 0, 0, 0],  # Increasing for feature 0, decreasing for feature 1
]

for monotonic_cst in configs:
    rf = RandomForestClassifier(n_estimators=100, random_state=42,
                                monotonic_cst=monotonic_cst)
    rf.fit(X_train, y_train)
    y_pred = rf.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    print(f"monotonic_cst={monotonic_cst}, Accuracy: {acc:.3f}")
```
Running the example gives an output like:

```
monotonic_cst=None, Accuracy: 0.935
monotonic_cst=[1, 0, 0, 0, 0], Accuracy: 0.880
monotonic_cst=[0, -1, 0, 0, 0], Accuracy: 0.910
monotonic_cst=[1, -1, 0, 0, 0], Accuracy: 0.840
```
The key steps in this example are:

- Generate a synthetic classification dataset
- Split the data into train and test sets
- Train `RandomForestClassifier` models with different `monotonic_cst` configurations
- Evaluate the accuracy of each model on the test set
Some tips and heuristics for using `monotonic_cst`:
- Consider using monotonic constraints when you have strong prior knowledge about the feature-target relationships
- Determine the direction of the constraint (increasing or decreasing) based on domain understanding
- Be aware that using constraints may slightly reduce accuracy but can lead to more interpretable models
Issues to consider:
- Monotonic constraints are a strong assumption and may not always hold perfectly in real data
- Applying incorrect constraints can lead to reduced model performance