The verbose parameter in scikit-learn's HistGradientBoostingClassifier controls the level of output during model training.
HistGradientBoostingClassifier is a gradient boosting algorithm that uses histogram-based decision trees. It's designed for efficiency and performance on large datasets.
The verbose parameter determines how much information is displayed during the training process. Higher values provide more detailed output, which can be useful for monitoring progress and debugging.
By default, verbose is set to 0, which means no output is produced during training. Common values are 0 (no output), 1 (some output), and 2 (detailed output). The example below trains a model with each value and captures what gets printed:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import accuracy_score
import sys
from io import StringIO
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=0, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different verbose values
verbose_values = [0, 1, 2]
for v in verbose_values:
    # Capture stdout to record verbose output
    old_stdout = sys.stdout
    sys.stdout = StringIO()
    hgbc = HistGradientBoostingClassifier(max_iter=100, random_state=42, verbose=v)
    hgbc.fit(X_train, y_train)
    # Restore stdout and keep the captured output
    verbose_output = sys.stdout.getvalue()
    sys.stdout = old_stdout
    y_pred = hgbc.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"verbose={v}, Accuracy: {accuracy:.3f}")
    print("Verbose output:")
    print((verbose_output[:200] + "...") if len(verbose_output) > 200 else verbose_output)
    print()
Running the example gives an output like:
verbose=0, Accuracy: 0.915
Verbose output:
verbose=1, Accuracy: 0.915
Verbose output:
Binning 0.000 GB of training data: 0.007 s
Fitting gradient boosted rounds:
[1/100] 1 tree, 20 leaves, max depth = 6, in 0.001s
[2/100] 1 tree, 24 leaves, max depth = 9, in 0.001s
[3/100] 1 tree, 27 l...
verbose=2, Accuracy: 0.915
Verbose output:
Binning 0.000 GB of training data: 0.006 s
Fitting gradient boosted rounds:
[1/100] 1 tree, 20 leaves, max depth = 6, in 0.003s
[2/100] 1 tree, 24 leaves, max depth = 9, in 0.002s
[3/100] 1 tree, 27 l...
The key steps in this example are:
- Generate a synthetic binary classification dataset
- Split the data into train and test sets
- Train HistGradientBoostingClassifier models with different verbose values
- Capture the output produced during training (a tidier capture approach is sketched after this list)
- Evaluate the accuracy of each model on the test set
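The manual sys.stdout swap in the example works, but contextlib.redirect_stdout from the standard library does the same thing more safely, since it restores stdout even if fit raises. A minimal sketch of the same capture step:

from io import StringIO
from contextlib import redirect_stdout
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Everything printed during fit() lands in the buffer
buffer = StringIO()
with redirect_stdout(buffer):
    HistGradientBoostingClassifier(max_iter=10, verbose=1, random_state=42).fit(X, y)

training_log = buffer.getvalue()
print(training_log[:200])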
Some tips for using the verbose parameter:
- Use verbose=0 for production environments where output is not needed
- Set verbose=1 or verbose=2 during development to monitor training progress (see the sketch after this list)
- Higher verbose values can slow down training slightly due to increased I/O
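For the development workflow mentioned above, verbose output pairs well with early stopping: with early_stopping enabled, the per-iteration log typically also reports training and validation scores, so you can see when the model stops improving. A sketch (the exact log wording varies across scikit-learn versions):

from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# verbose=1 prints one line per boosting iteration; with early stopping
# enabled, training halts once the validation score stops improving
model = HistGradientBoostingClassifier(
    max_iter=200,
    early_stopping=True,
    validation_fraction=0.1,
    n_iter_no_change=10,
    verbose=1,
    random_state=42,
)
model.fit(X, y)
print(f"Stopped after {model.n_iter_} iterations")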
Issues to consider:
- The amount of output can be overwhelming for large datasets or many iterations
- Verbose output is useful for debugging but may not be necessary once the model is tuned
- In some environments, capturing verbose output might require additional setup; one option is sketched below
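If capturing output is awkward in your environment (notebooks, job schedulers), a simple workaround is to redirect the training log to a file. A minimal sketch; the file name train.log is arbitrary:

from contextlib import redirect_stdout
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Send the verbose training log to a file for later inspection;
# "train.log" is just an example path
with open("train.log", "w") as log_file, redirect_stdout(log_file):
    HistGradientBoostingClassifier(max_iter=50, verbose=2, random_state=42).fit(X, y)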