SKLearner Home | About | Contact | Examples

Configure GradientBoostingClassifier "warm_start" Parameter

The warm_start parameter in scikit-learn’s GradientBoostingClassifier allows you to add more trees to an already fitted model without starting the training from scratch.

GradientBoostingClassifier is a powerful machine learning algorithm used for classification tasks. It builds an additive model in a forward stage-wise fashion, allowing for the optimization of arbitrary differentiable loss functions.

The warm_start parameter allows the model to reuse the solution of the previous call to fit and add more estimators to the ensemble. This is useful for iterative training and fine-tuning.

The default value for warm_start is False.

In practice, warm_start=True is used when you want to iteratively add estimators to an existing model.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5, n_redundant=0, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train initial model
gb = GradientBoostingClassifier(n_estimators=50, warm_start=False, random_state=42)
gb.fit(X_train, y_train)

# Add more trees using warm_start
gb.set_params(warm_start=True, n_estimators=100)
gb.fit(X_train, y_train)

# Evaluate model performance
y_pred = gb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy after adding more trees with warm_start: {accuracy:.3f}")

Running the example gives an output like:

Accuracy after adding more trees with warm_start: 0.900

The key steps in this example are:

  1. Generate a synthetic binary classification dataset.
  2. Split the data into train and test sets.
  3. Train an initial GradientBoostingClassifier model without warm_start.
  4. Enable warm_start and add more trees to the existing model.
  5. Evaluate the accuracy of the model on the test set.

Some tips and heuristics for setting warm_start:

Issues to consider:



See Also