Configure GradientBoostingClassifier "n_iter_no_change" Parameter

The n_iter_no_change parameter in scikit-learn’s GradientBoostingClassifier controls early stopping: it sets how many consecutive boosting iterations may pass without improvement before training is halted.

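GradientBoostingClassifier is an ensemble method that builds trees sequentially, each tree correcting the errors of the previous ones. When n_iter_no_change is set to an integer, a portion of the training data (validation_fraction, 0.1 by default) is held out as a validation set, and training stops early once the validation score has failed to improve by at least tol for n_iter_no_change consecutive iterations.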
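As a minimal sketch of this mechanism, the snippet below fits one model without early stopping and one with it, then compares how many boosting stages each actually fit using the fitted n_estimators_ attribute. The dataset sizes, the seed, and n_estimators=200 are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data; sizes and seed are arbitrary choices for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# No early stopping: all n_estimators boosting stages are fit
gb_full = GradientBoostingClassifier(n_estimators=200, random_state=0)
gb_full.fit(X, y)

# Early stopping: part of the training data is held out internally, and
# training stops once the validation score has not improved for 5
# consecutive iterations
gb_early = GradientBoostingClassifier(
    n_estimators=200, n_iter_no_change=5, random_state=0
)
gb_early.fit(X, y)

# n_estimators_ reports how many stages were actually fit
print("without early stopping:", gb_full.n_estimators_)
print("with early stopping:   ", gb_early.n_estimators_)
```

With early stopping enabled, n_estimators acts as an upper bound on the number of trees rather than a fixed count, so it can be set generously.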

The default value for n_iter_no_change is None, which disables early stopping, so all n_estimators trees are fit. Small integer values such as 5 or 10 are common starting points; the right choice depends on the problem and dataset.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, n_redundant=5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

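# Train with different n_iter_no_change values (None disables early stopping)
n_iter_no_change_values = [None, 5, 10]
accuracies = []  # test accuracy for each setting, in the same order

for n in n_iter_no_change_values:
    # With an integer n, part of X_train is held out internally for validation
    gb = GradientBoostingClassifier(n_iter_no_change=n, random_state=42)
    gb.fit(X_train, y_train)
    # Evaluate on the untouched test set
    y_pred = gb.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"n_iter_no_change={n}, Accuracy: {accuracy:.3f}")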

Running the example gives an output like:

n_iter_no_change=None, Accuracy: 0.880
n_iter_no_change=5, Accuracy: 0.885
n_iter_no_change=10, Accuracy: 0.885

The key steps in this example are:

  1. Generate a synthetic binary classification dataset with informative and redundant features.
  2. Split the data into training and testing sets.
  3. Train GradientBoostingClassifier models with different n_iter_no_change values.
  4. Evaluate the accuracy of each model on the test set.
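Early stopping also depends on two related parameters: validation_fraction (the share of training data held out for validation) and tol (the minimum improvement that counts as progress). The sketch below reuses the dataset from the example and varies both while keeping n_iter_no_change fixed at 5. The specific settings and n_estimators=200 are assumptions chosen for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Same synthetic dataset and split as the main example
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Illustrative (validation_fraction, tol) pairs; not tuned values
settings = [(0.1, 1e-4), (0.2, 1e-4), (0.1, 1e-2)]
results = []

for val_frac, tol in settings:
    gb = GradientBoostingClassifier(
        n_estimators=200,        # upper bound on the number of trees
        n_iter_no_change=5,      # patience for early stopping
        validation_fraction=val_frac,
        tol=tol,
        random_state=42,
    )
    gb.fit(X_train, y_train)
    test_acc = gb.score(X_test, y_test)  # mean accuracy on the test set
    results.append((val_frac, tol, gb.n_estimators_, test_acc))
    print(f"validation_fraction={val_frac}, tol={tol}, "
          f"trees fit: {gb.n_estimators_}, Accuracy: {test_acc:.3f}")
```

Reporting the number of trees fit next to the accuracy makes it easier to tell whether early stopping actually triggered for a given setting.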



