Configure HistGradientBoostingRegressor "learning_rate" Parameter

The learning_rate parameter in scikit-learn’s HistGradientBoostingRegressor controls the contribution of each tree to the final prediction.

Histogram-based Gradient Boosting is an efficient implementation of gradient boosting that uses binning to speed up training. The learning_rate parameter scales the contribution of each tree, influencing the trade-off between fitting and generalization.

A smaller learning rate requires more trees to achieve similar performance, which can improve generalization but increases training time. Conversely, a larger learning rate may lead to faster convergence but potentially overfitting.

The default value for learning_rate is 0.1.

In practice, values between 0.01 and 0.3 are commonly used, depending on the dataset and other hyperparameters.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different learning_rate values
learning_rates = [0.01, 0.1, 0.3, 1.0]
mse_scores = []

for lr in learning_rates:
    hgb = HistGradientBoostingRegressor(learning_rate=lr, random_state=42, max_iter=100)
    hgb.fit(X_train, y_train)
    y_pred = hgb.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"learning_rate={lr}, MSE: {mse:.3f}")

# Find best learning rate
best_lr = learning_rates[np.argmin(mse_scores)]
print(f"Best learning_rate: {best_lr}")

Running the example gives an output like:

learning_rate=0.01, MSE: 14886.051
learning_rate=0.1, MSE: 3073.589
learning_rate=0.3, MSE: 3848.682
learning_rate=1.0, MSE: 15661.218
Best learning_rate: 0.1

The key steps in this example are:

Generate a synthetic regression dataset
Split the data into train and test sets
Train HistGradientBoostingRegressor models with different learning_rate values
Evaluate the mean squared error of each model on the test set
Identify the best performing learning_rate

Some tips and heuristics for setting learning_rate:

Start with the default value of 0.1 and adjust based on model performance
Use smaller learning rates with more trees for better generalization
Consider the trade-off between model performance and training time

Issues to consider:

The optimal learning rate often depends on other hyperparameters like max_iter
Very small learning rates may require a large number of trees to converge
Very large learning rates may lead to overshooting and poor convergence

See Also