
Configure HistGradientBoostingClassifier "monotonic_cst" Parameter

The monotonic_cst parameter in scikit-learn’s HistGradientBoostingClassifier allows you to enforce monotonic constraints on the relationship between features and the target variable.

Monotonic constraints ensure that the predicted probability (or score) can only increase (or only decrease) as a feature value increases, for any fixed values of the other features. This is useful when you have domain knowledge about the expected direction of the relationship between a feature and the target.

The monotonic_cst parameter accepts an array-like with one entry per feature (in column order) or, in scikit-learn 1.2 and later, a dict mapping feature names to constraints. Use -1 for a decreasing relationship, 1 for an increasing relationship, and 0 for no constraint.
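As a quick sketch of both formats (the dict form requires scikit-learn 1.2+ and fitting on a pandas DataFrame so feature names are known; "age" and "debt_ratio" are placeholder column names, not features from the example below):

from sklearn.ensemble import HistGradientBoostingClassifier

# Array form: one entry per feature, in column order
# (feature 0 increasing, feature 1 unconstrained, feature 2 decreasing)
clf = HistGradientBoostingClassifier(monotonic_cst=[1, 0, -1])

# Dict form (scikit-learn 1.2+): constraints by feature name;
# features left out of the dict default to 0 (no constraint)
clf = HistGradientBoostingClassifier(monotonic_cst={"age": 1, "debt_ratio": -1})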

By default, monotonic_cst is set to None, which means no monotonicity constraints are applied.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, n_clusters_per_class=1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train models with different monotonic constraints
constraints = [None, [1, 0], [-1, 0], [1, -1]]
models = []

for cst in constraints:
    model = HistGradientBoostingClassifier(monotonic_cst=cst, random_state=42)
    model.fit(X_train, y_train)
    models.append(model)
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Constraints: {cst}, Accuracy: {accuracy:.3f}")

# Visualize the effect of monotonic constraints
x_plot = np.linspace(X[:, 0].min(), X[:, 0].max(), 100).reshape(-1, 1)
x_plot = np.hstack([x_plot, np.zeros_like(x_plot)])

plt.figure(figsize=(12, 8))
for i, (model, cst) in enumerate(zip(models, constraints)):
    y_plot = model.predict_proba(x_plot)[:, 1]
    plt.subplot(2, 2, i+1)
    plt.plot(x_plot[:, 0], y_plot)
    plt.title(f"Constraints: {cst}")
    plt.xlabel("Feature 0")
    plt.ylabel("Predicted Probability")

plt.tight_layout()
plt.show()

Running the example gives an output like:

Constraints: None, Accuracy: 0.925
Constraints: [1, 0], Accuracy: 0.930
Constraints: [-1, 0], Accuracy: 0.930
Constraints: [1, -1], Accuracy: 0.520


The key steps in this example are:

  1. Generate a synthetic binary classification dataset with two features
  2. Split the data into train and test sets
  3. Train HistGradientBoostingClassifier models with different monotonic_cst configurations
  4. Evaluate the accuracy of each model on the test set
  5. Visualize the effect of monotonic constraints on the relationship between feature 0 and the predicted probability
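Beyond inspecting the plots, you can verify monotonicity numerically. A minimal check, reusing models and x_plot from the example above: under the [1, 0] constraint, the predicted probability must never decrease as feature 0 increases while feature 1 is held fixed.

# models[1] was trained with monotonic_cst=[1, 0];
# x_plot sweeps feature 0 while feature 1 stays fixed at 0
probs = models[1].predict_proba(x_plot)[:, 1]

# The probability sequence must be non-decreasing along the grid
assert np.all(np.diff(probs) >= 0), "monotonicity violated"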

Some tips for setting monotonic_cst:

  1. Only constrain features where domain knowledge clearly dictates the direction of the relationship; leave every other entry at 0
  2. When fitting on a pandas DataFrame, prefer the dict form (scikit-learn 1.2+) so constraints stay tied to feature names rather than column positions
  3. Compare accuracy with and without constraints; a well-chosen constraint can act as a mild regularizer, while a wrong one is costly

Issues to consider:

  1. Constraints that contradict the data can severely degrade performance, as the drop to 0.520 accuracy for [1, -1] above shows
  2. For binary classification, a constraint of 1 means the predicted probability of the positive class never decreases as that feature increases
  3. Monotonic constraints are not supported for multiclass classification, as sketched below
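A minimal sketch of that limitation, assuming current scikit-learn behavior (fit raises a ValueError when monotonic_cst is combined with more than two classes):

from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

# Three-class problem: monotonic constraints are not supported here
X_multi, y_multi = make_classification(n_samples=300, n_features=2,
                                       n_informative=2, n_redundant=0,
                                       n_classes=3, n_clusters_per_class=1,
                                       random_state=42)

try:
    HistGradientBoostingClassifier(monotonic_cst=[1, 0]).fit(X_multi, y_multi)
except ValueError as exc:
    print(exc)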


