The verbose parameter in scikit-learn's HistGradientBoostingClassifier controls the level of output during model training.
HistGradientBoostingClassifier is a gradient boosting algorithm that uses histogram-based decision trees. It's designed for efficiency and performance on large datasets.
The verbose parameter determines how much information is displayed during the training process. Higher values provide more detailed output, which can be useful for monitoring progress and debugging.
By default, verbose is set to 0, which means no output is produced during training. Common values are 0 (no output), 1 (some output), and 2 (detailed output). The example below trains a model with each value and captures what gets printed:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import accuracy_score
import sys
from io import StringIO
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=0, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different verbose values
verbose_values = [0, 1, 2]
for v in verbose_values:
    # Capture stdout to record verbose output
    old_stdout = sys.stdout
    sys.stdout = StringIO()
    hgbc = HistGradientBoostingClassifier(max_iter=100, random_state=42, verbose=v)
    hgbc.fit(X_train, y_train)
    # Restore stdout and keep the captured output
    verbose_output = sys.stdout.getvalue()
    sys.stdout = old_stdout
    y_pred = hgbc.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"verbose={v}, Accuracy: {accuracy:.3f}")
    print("Verbose output:")
    print((verbose_output[:200] + "...") if len(verbose_output) > 200 else verbose_output)
    print()
Running the example gives an output like:
verbose=0, Accuracy: 0.915
Verbose output:
verbose=1, Accuracy: 0.915
Verbose output:
Binning 0.000 GB of training data: 0.007 s
Fitting gradient boosted rounds:
[1/100] 1 tree, 20 leaves, max depth = 6, in 0.001s
[2/100] 1 tree, 24 leaves, max depth = 9, in 0.001s
[3/100] 1 tree, 27 l...
verbose=2, Accuracy: 0.915
Verbose output:
Binning 0.000 GB of training data: 0.006 s
Fitting gradient boosted rounds:
[1/100] 1 tree, 20 leaves, max depth = 6, in 0.003s
[2/100] 1 tree, 24 leaves, max depth = 9, in 0.002s
[3/100] 1 tree, 27 l...
The key steps in this example are:
- Generate a synthetic binary classification dataset
- Split the data into train and test sets
- Train HistGradientBoostingClassifier models with different verbose values
- Capture the output produced during training (a tidier capture approach is sketched after this list)
- Evaluate the accuracy of each model on the test set
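The manual sys.stdout swap in the example works, but contextlib.redirect_stdout from the standard library does the same thing more safely, since it restores stdout even if fit raises. A minimal sketch of the same capture step:

from io import StringIO
from contextlib import redirect_stdout
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

# Everything printed during fit() lands in the buffer
buffer = StringIO()
with redirect_stdout(buffer):
    HistGradientBoostingClassifier(max_iter=10, verbose=1, random_state=42).fit(X, y)

training_log = buffer.getvalue()
print(training_log[:200])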
Some tips for using the verbose parameter:
- Use verbose=0 for production environments where output is not needed
- Set verbose=1 or verbose=2 during development to monitor training progress (see the sketch after this list)
- Higher verbose values can slow down training slightly due to increased I/O
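For the development workflow mentioned above, verbose output pairs well with early stopping: with early_stopping enabled, the per-iteration log typically also reports training and validation scores, so you can see when the model stops improving. A sketch (the exact log wording varies across scikit-learn versions):

from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# verbose=1 prints one line per boosting iteration; with early stopping
# enabled, training halts once the validation score stops improving
model = HistGradientBoostingClassifier(
    max_iter=200,
    early_stopping=True,
    validation_fraction=0.1,
    n_iter_no_change=10,
    verbose=1,
    random_state=42,
)
model.fit(X, y)
print(f"Stopped after {model.n_iter_} iterations")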
Issues to consider:
- The amount of output can be overwhelming for large datasets or many iterations
- Verbose output is useful for debugging but may not be necessary once the model is tuned
- In some environments, capturing verbose output might require additional setup; one option is sketched below
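If capturing output is awkward in your environment (notebooks, job schedulers), a simple workaround is to redirect the training log to a file. A minimal sketch; the file name train.log is arbitrary:

from contextlib import redirect_stdout
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Send the verbose training log to a file for later inspection;
# "train.log" is just an example path
with open("train.log", "w") as log_file, redirect_stdout(log_file):
    HistGradientBoostingClassifier(max_iter=50, verbose=2, random_state=42).fit(X, y)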