
Configure ExtraTreesRegressor "monotonic_cst" Parameter

The monotonic_cst parameter in scikit-learn’s ExtraTreesRegressor allows you to enforce monotonic constraints on the relationship between features and the target variable.

Extra Trees Regressor is an ensemble method that builds multiple randomized decision trees and averages their predictions to improve generalization and reduce overfitting. The monotonic_cst parameter enables you to specify whether each feature should have a positive, negative, or no monotonic relationship with the target.

Monotonic constraints ensure that the predicted output never decreases (for a positive constraint) or never increases (for a negative constraint) as a specific feature increases, regardless of the values of the other features. This is useful when you have domain knowledge about the expected direction of a feature's effect on the target.
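
To see this guarantee in action, here is a minimal sketch (assuming scikit-learn 1.4 or later, where monotonic_cst is available for forest models, and using synthetic random data) that fits a constrained model and sweeps the constrained feature while holding the others fixed:

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(500, 3))
y = X[:, 0] + 0.1 * rng.normal(size=500)

# Positive constraint on feature 0 only
etr = ExtraTreesRegressor(n_estimators=50, random_state=0, monotonic_cst=[1, 0, 0])
etr.fit(X, y)

# Sweep feature 0 from 0 to 1 while holding the other features fixed
grid = np.linspace(0, 1, 100)
X_sweep = np.tile(rng.uniform(size=(1, 3)), (100, 1))
X_sweep[:, 0] = grid
preds = etr.predict(X_sweep)

# With the constraint in place, predictions never decrease along feature 0
print(np.all(np.diff(preds) >= 0))  # True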

The default value for monotonic_cst is None, which means no monotonic constraints are applied. When specified, it should be an array-like of shape (n_features,) containing the value 1 (positive constraint), -1 (negative constraint), or 0 (no constraint) for each feature, in column order.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=3, noise=0.1, random_state=42)

# Ensure the first feature has a positive relationship with the target
X[:, 0] = np.abs(X[:, 0])
y += X[:, 0]

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define different monotonic constraints
constraints = [
    None,
    [1, 0, 0],  # Positive constraint on first feature
    [1, -1, 0]  # Positive on first, negative on second, no constraint on third
]

for cst in constraints:
    etr = ExtraTreesRegressor(n_estimators=100, random_state=42, monotonic_cst=cst)
    etr.fit(X_train, y_train)
    y_pred = etr.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"Monotonic constraints: {cst}, MSE: {mse:.4f}")

Running the example gives an output like:

Monotonic constraints: None, MSE: 11209.1497
Monotonic constraints: [1, 0, 0], MSE: 10275.3948
Monotonic constraints: [1, -1, 0], MSE: 17836.4720
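
These numbers reflect how the dataset was built: the target was explicitly made to increase with the first feature (y += X[:, 0]), so the matching positive constraint slightly improves MSE, while the negative constraint on the second feature contradicts the data and noticeably degrades performance.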

The key steps in this example are:

  1. Generate a synthetic regression dataset with features suitable for monotonic constraints
  2. Split the data into train and test sets
  3. Create ExtraTreesRegressor models with different monotonic_cst configurations
  4. Train the models and evaluate their performance using mean squared error
  5. Compare the results of different constraint configurations

Some tips and heuristics for setting monotonic_cst:

  - Only constrain features where domain knowledge clearly supports a monotonic relationship; leave the rest at 0.
  - Always fit an unconstrained model (monotonic_cst=None) as a baseline and compare against constrained models on held-out data (see the sketch after this list).
  - A constraint that matches the true relationship acts as a mild regularizer and can improve generalization; one that contradicts the data will typically hurt accuracy, as the [1, -1, 0] result above shows.
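
As a quick way to apply the baseline-comparison tip, this sketch (using the same kind of synthetic data as the main example; the setup is illustrative, not prescriptive) compares cross-validated error with and without a constraint:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=3, noise=0.1, random_state=0)
X[:, 0] = np.abs(X[:, 0])
y += X[:, 0]

for cst in [None, [1, 0, 0]]:
    etr = ExtraTreesRegressor(n_estimators=100, random_state=0, monotonic_cst=cst)
    scores = cross_val_score(etr, X, y, cv=5, scoring="neg_mean_squared_error")
    print(f"monotonic_cst={cst}: mean CV MSE = {-scores.mean():.4f}")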

Issues to consider:

  - monotonic_cst requires scikit-learn 1.4 or later for ExtraTreesRegressor.
  - Monotonic constraints are not supported for multioutput regression, or when the training data contains missing values (see the sketch after this list).
  - The constraint array must contain exactly one entry per feature.
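
As a quick check of the multioutput limitation, fitting a constrained model on a 2-D target raises a ValueError in current versions (the exact error message may vary across releases):

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

X = np.random.RandomState(0).uniform(size=(50, 3))
Y = np.column_stack([X[:, 0], X[:, 1]])  # two targets -> multioutput

etr = ExtraTreesRegressor(n_estimators=10, random_state=0, monotonic_cst=[1, 0, 0])
try:
    etr.fit(X, Y)  # expected to fail: monotonic_cst does not support multioutput
except ValueError as exc:
    print(exc)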


