The learning_rate parameter in scikit-learn's GradientBoostingClassifier controls the contribution of each tree to the ensemble prediction. Gradient boosting is an ensemble method that iteratively trains decision trees, each one fit to correct the errors of the trees before it, and learning_rate scales how much each new tree's output is added to the ensemble.
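To see what this scaling means mechanically, here is a minimal sketch of the boosting update for squared-error regression (a simplification; scikit-learn's actual implementation handles classification losses and other details). The boosted_predictions function is hypothetical and exists only for illustration:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosted_predictions(X, y, n_trees=50, learning_rate=0.1):
    # Start from a constant model: the mean of the targets
    pred = np.full(len(y), y.mean(), dtype=float)
    for _ in range(n_trees):
        # For squared error, the negative gradient is the residual
        residual = y - pred
        tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
        # learning_rate shrinks each tree's contribution before it is added
        pred += learning_rate * tree.predict(X)
    return pred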
A smaller learning_rate means each tree contributes less, requiring more trees to fit the data. This can lead to better generalization but longer training times. A larger learning_rate means fewer trees are needed, but the model may not fit the data as effectively.
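One way to see this tradeoff is to hold the product of learning_rate and n_estimators roughly constant and compare test accuracy. This is an illustrative sketch, not a tuning recipe; the specific pairs below are arbitrary choices:

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Pairs chosen so that learning_rate * n_estimators stays near 10
for lr, n in [(0.5, 20), (0.1, 100), (0.02, 500)]:
    gb = GradientBoostingClassifier(learning_rate=lr, n_estimators=n, random_state=42)
    gb.fit(X_train, y_train)
    print(f"learning_rate={lr}, n_estimators={n}, accuracy={gb.score(X_test, y_test):.3f}")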
The default value for learning_rate is 0.1. In practice, values between 0.01 and 0.1 are commonly used, with smaller datasets often benefiting from larger learning rates.
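If you want to search that range systematically rather than trying values by hand, a small grid search over learning_rate works. This sketch assumes a synthetic dataset and default settings for everything else:

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=42)

# 5-fold cross-validation over a small grid around the default of 0.1
param_grid = {"learning_rate": [0.01, 0.05, 0.1, 0.2]}
search = GridSearchCV(GradientBoostingClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, f"CV accuracy: {search.best_score_:.3f}")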
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different learning_rate values
learning_rates = [0.01, 0.05, 0.1, 0.5]
accuracies = []
for lr in learning_rates:
    gb = GradientBoostingClassifier(n_estimators=100, learning_rate=lr, random_state=42)
    gb.fit(X_train, y_train)
    y_pred = gb.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"learning_rate={lr}, Accuracy: {accuracy:.3f}")
Running the example gives an output like:
learning_rate=0.01, Accuracy: 0.830
learning_rate=0.05, Accuracy: 0.880
learning_rate=0.1, Accuracy: 0.880
learning_rate=0.5, Accuracy: 0.890
The key steps in this example are:
- Generate a synthetic binary classification dataset with informative and redundant features
- Split the data into train and test sets
- Train GradientBoostingClassifier models with different learning_rate values
- Evaluate the accuracy of each model on the test set
Some tips and heuristics for setting learning_rate:
- Start with the default of 0.1 and adjust up or down depending on performance
- Smaller learning rates generally lead to better generalization but require more trees; the staged-accuracy sketch after this list shows the tradeoff
- Larger learning rates can converge faster but may not fit the data as well
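To observe the smaller-rate/more-trees tradeoff directly, staged_predict yields the ensemble's predictions after each boosting stage without refitting, so you can track test accuracy as trees are added. A sketch:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

for lr in [0.01, 0.1, 0.5]:
    gb = GradientBoostingClassifier(n_estimators=500, learning_rate=lr, random_state=42)
    gb.fit(X_train, y_train)
    # Test accuracy after each boosting stage
    staged_acc = [accuracy_score(y_test, pred) for pred in gb.staged_predict(X_test)]
    best = int(np.argmax(staged_acc)) + 1
    print(f"learning_rate={lr}: best accuracy {max(staged_acc):.3f} at {best} trees")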
Issues to consider:
- There is a tradeoff between model complexity (number of trees) and performance
- Very small learning rates can be computationally expensive due to the number of trees needed; early stopping, sketched after this list, can cap that cost
- The optimal learning rate depends on the size and complexity of the dataset
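One way to keep small learning rates affordable is built-in early stopping: GradientBoostingClassifier can hold out part of the training data internally and stop adding trees when the validation score stops improving. A sketch:

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, random_state=42)

# Set a generous upper bound on trees and let early stopping decide when to quit:
# stop if the held-out score has not improved for 10 consecutive stages
gb = GradientBoostingClassifier(
    n_estimators=1000,
    learning_rate=0.01,
    validation_fraction=0.1,
    n_iter_no_change=10,
    random_state=42,
)
gb.fit(X, y)
print(f"Trees actually fit: {gb.n_estimators_}")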