
Configure RandomForestRegressor "monotonic_cst" Parameter

The monotonic_cst parameter in scikit-learn’s RandomForestRegressor allows you to specify monotonic constraints for each feature. This can be useful when you have prior knowledge that certain features have a monotonic relationship with the target variable.

RandomForestRegressor is an ensemble learning method that combines predictions from multiple decision trees to improve regression performance. The monotonic_cst parameter takes one value per feature: 1 enforces a monotonically increasing constraint, -1 enforces a monotonically decreasing constraint, and 0 leaves the feature unconstrained.

By default, monotonic_cst is set to None, which means no monotonic constraints are applied. In practice, the parameter is set based on domain knowledge about the relationship between features and the target variable.
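
For example, with three hypothetical features where the first is believed to increase the target, the second has no known relationship, and the third is believed to decrease it, the constraint list would be [1, 0, -1]. A minimal sketch (assuming scikit-learn 1.4 or later, where monotonic_cst is supported for random forests; the dataset and constraint directions are purely illustrative):

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Hypothetical 3-feature dataset; the constraint list has one entry per feature:
# 1 = increasing, 0 = unconstrained, -1 = decreasing
X, y = make_regression(n_samples=200, n_features=3, random_state=0)
rf = RandomForestRegressor(n_estimators=50, monotonic_cst=[1, 0, -1], random_state=0)
rf.fit(X, y)

The complete example below compares models trained with different monotonic_cst values on a synthetic dataset: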

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Generate synthetic dataset with monotonic relationships
X, y = make_regression(n_samples=1000, n_features=5, noise=0.1, random_state=42,
                       effective_rank=5, tail_strength=0.5)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different monotonic_cst values
monotonic_cst_values = [None, [1, 1, 1, 1, 1], [-1, -1, -1, -1, -1]]
results = []

for cst in monotonic_cst_values:
    rf = RandomForestRegressor(n_estimators=100, monotonic_cst=cst, random_state=42)
    rf.fit(X_train, y_train)
    y_pred = rf.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    results.append((cst, mse, r2))

for cst, mse, r2 in results:
    print(f"monotonic_cst={cst}, MSE: {mse:.3f}, R-squared: {r2:.3f}")

Running the example gives an output like:

monotonic_cst=None, MSE: 0.755, R-squared: 0.926
monotonic_cst=[1, 1, 1, 1, 1], MSE: 2.512, R-squared: 0.754
monotonic_cst=[-1, -1, -1, -1, -1], MSE: 10.691, R-squared: -0.046

The key steps in this example are:

  1. Generate a synthetic regression dataset with features that have monotonic relationships with the target
  2. Split the data into train and test sets
  3. Train RandomForestRegressor models with different monotonic_cst values
  4. Evaluate and compare the performance of each model on the test set using MSE and R-squared
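
To confirm that a fitted model actually respects its constraints, you can sweep a single feature over a grid while holding the other features fixed and check that the predictions move in the expected direction. A minimal sketch of such a check (assuming scikit-learn 1.4 or later; the dataset is synthetic and the all-increasing constraint is illustrative):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Fit a forest with an increasing constraint on every feature
X, y = make_regression(n_samples=500, n_features=5, noise=0.1, random_state=42)
rf = RandomForestRegressor(n_estimators=100, monotonic_cst=[1, 1, 1, 1, 1], random_state=42)
rf.fit(X, y)

# Sweep feature 0 over its range while holding the other features at their medians
grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 50)
X_sweep = np.tile(np.median(X, axis=0), (50, 1))
X_sweep[:, 0] = grid
preds = rf.predict(X_sweep)

# With an increasing constraint, predictions should never decrease along the sweep
# (a tiny tolerance absorbs floating-point noise)
print("Non-decreasing:", bool(np.all(np.diff(preds) >= -1e-9)))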

Some tips and heuristics for setting monotonic_cst:

  * Only constrain features where domain knowledge clearly supports a monotonic relationship; leave the remaining entries at 0.
  * Provide exactly one value per feature, in the same order as the columns of the training data.
  * Compare constrained and unconstrained models on a held-out set to confirm the constraints actually help, as done in the example above.

Issues to consider:

  * Incorrectly specified constraints can hurt performance substantially, as the decreasing-constraint results above show.
  * monotonic_cst for RandomForestRegressor requires scikit-learn 1.4 or later.
  * Monotonic constraints are not supported for multioutput regression or for data containing missing values.
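
Following the first tip above, constraint directions should come from knowledge of the problem rather than be applied blanket-style. With a synthetic dataset, the true coefficients returned by make_regression (via coef=True) can stand in for that knowledge; in a real application the signs would come from domain expertise. A sketch of deriving the constraint list from coefficient signs:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Return the true coefficients as a stand-in for domain knowledge
X, y, coef = make_regression(n_samples=1000, n_features=5, noise=0.1,
                             random_state=42, coef=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build the constraint list from the sign of each true coefficient
# (1 for positive, -1 for negative, 0 for non-informative features)
cst = np.sign(coef).astype(int).tolist()

rf = RandomForestRegressor(n_estimators=100, monotonic_cst=cst, random_state=42)
rf.fit(X_train, y_train)
print(f"constraints={cst}, R-squared: {r2_score(y_test, rf.predict(X_test)):.3f}")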


