
Configure GradientBoostingClassifier "criterion" Parameter

The criterion parameter in scikit-learn’s GradientBoostingClassifier determines the function used to measure the quality of a split at each node of the decision trees.

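Gradient Boosting is an ensemble method that sequentially adds decision trees, each one fit to correct the errors made by the trees before it. The criterion parameter affects how the algorithm chooses node splits when building these trees. Because the individual trees are regression trees fit to the gradients of the loss, the available criteria are squared-error based even though the ensemble is a classifier.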

The default value for criterion is 'friedman_mse', the mean squared error combined with Friedman's improvement score. The other supported value is 'squared_error', the plain mean squared error.
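To make the difference between the two criteria more concrete, the sketch below computes both split scores by hand for one candidate split. It uses the textbook formulas with unit sample weights and is not scikit-learn's internal implementation, which may normalize or weight these terms differently. The helper names friedman_improvement and squared_error_reduction and the toy residual values are hypothetical, chosen only for illustration.

```python
import numpy as np


def friedman_improvement(y_left, y_right):
    """Friedman's improvement score for one candidate split (unit weights)."""
    n_left, n_right = len(y_left), len(y_right)
    mean_diff = np.mean(y_left) - np.mean(y_right)
    return n_left * n_right / (n_left + n_right) * mean_diff ** 2


def squared_error_reduction(y_left, y_right):
    """Drop in the sum of squared errors when the parent node is split."""
    y_parent = np.concatenate([y_left, y_right])

    def sse(y):
        # Sum of squared deviations from the node mean
        return np.sum((y - np.mean(y)) ** 2)

    return sse(y_parent) - sse(y_left) - sse(y_right)


# Hypothetical residuals falling on each side of a candidate split
y_left = np.array([0.9, 1.1, 1.0, 0.8])
y_right = np.array([-0.5, -0.7, -0.6])

print(f"Friedman improvement:    {friedman_improvement(y_left, y_right):.4f}")
print(f"Squared-error reduction: {squared_error_reduction(y_left, y_right):.4f}")
```

With unit weights the two quantities are algebraically equal, so they rank candidate splits within a node the same way. This may help explain why the two settings often produce very similar models.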

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=5,
                           n_redundant=0, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different criterion values
criterion_values = ['friedman_mse', 'squared_error']
accuracies = []

for criterion in criterion_values:
    gb = GradientBoostingClassifier(criterion=criterion, random_state=42)
    gb.fit(X_train, y_train)
    y_pred = gb.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"criterion='{criterion}', Accuracy: {accuracy:.3f}")

Running the example gives an output like:

criterion='friedman_mse', Accuracy: 0.785
criterion='squared_error', Accuracy: 0.785

The key steps in this example are:

  1. Generate a synthetic three-class classification dataset with 5 informative features; the remaining 15 of the default 20 features are noise
  2. Split the data into train and test sets
  3. Train GradientBoostingClassifier models with different criterion values
  4. Evaluate the accuracy of each model on the test set

Some tips and heuristics for setting criterion:

Issues to consider:


