The `verbose` parameter in scikit-learn's `BaggingClassifier` controls the verbosity of output during model fitting and prediction.

`BaggingClassifier` is an ensemble meta-estimator that fits base classifiers on random subsets of the original dataset and aggregates their predictions. The `verbose` parameter determines how much information is printed during these processes.
The `verbose` parameter affects the level of detail in the output logs. Higher values provide more detailed information, which can be useful for monitoring progress and debugging.

The default value for `verbose` is 0, which means no output is produced. With `verbose=1`, joblib prints brief summaries of each parallel run (note that these go to stderr rather than stdout); with `verbose=2` or higher, one line is printed for each estimator as it is built.
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
import io
import sys

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Fit a model while capturing anything it prints to stdout,
# returning both the fitted model and the captured text
def fit_and_capture(verbose):
    old_stdout = sys.stdout
    sys.stdout = buffer = io.StringIO()
    try:
        bagging = BaggingClassifier(estimator=DecisionTreeClassifier(),
                                    n_estimators=10, random_state=42,
                                    verbose=verbose)
        bagging.fit(X_train, y_train)
    finally:
        sys.stdout = old_stdout
    return bagging, buffer.getvalue()

# Train with different verbose values
verbose_values = [0, 1, 2]
accuracies = []

for verbose in verbose_values:
    bagging, output = fit_and_capture(verbose)
    y_pred = bagging.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"verbose={verbose}, Accuracy: {accuracy:.3f}")
    print("Output:")
    print(output)
    print("-" * 50)
```
Running the example gives an output like:

```
verbose=0, Accuracy: 0.885
Output:

--------------------------------------------------
verbose=1, Accuracy: 0.885
Output:

--------------------------------------------------
verbose=2, Accuracy: 0.885
Output:
Building estimator 1 of 10 for this parallel run (total 10)...
Building estimator 2 of 10 for this parallel run (total 10)...
Building estimator 3 of 10 for this parallel run (total 10)...
Building estimator 4 of 10 for this parallel run (total 10)...
Building estimator 5 of 10 for this parallel run (total 10)...
Building estimator 6 of 10 for this parallel run (total 10)...
Building estimator 7 of 10 for this parallel run (total 10)...
Building estimator 8 of 10 for this parallel run (total 10)...
Building estimator 9 of 10 for this parallel run (total 10)...
Building estimator 10 of 10 for this parallel run (total 10)...
--------------------------------------------------
```

Note that the captured output for `verbose=1` is empty: joblib writes its progress summaries to stderr, not stdout, so only the per-estimator messages produced at `verbose=2` are captured.
The key steps in this example are:

- Generate a synthetic binary classification dataset
- Split the data into train and test sets
- Create a function to capture printed output
- Train `BaggingClassifier` models with different `verbose` values
- Evaluate the accuracy of each model on the test set
- Display the captured output for each `verbose` level
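As an aside, the standard library's `contextlib.redirect_stdout` offers an idiomatic alternative to swapping `sys.stdout` by hand. A minimal sketch, using the same `verbose=2` setting as above:

```python
# Sketch: capture verbose fitting output with contextlib.redirect_stdout
# instead of manually reassigning sys.stdout.
import contextlib
import io
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=200, random_state=0)

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # verbose=2 prints one "Building estimator..." line per estimator
    BaggingClassifier(n_estimators=3, verbose=2, random_state=0).fit(X, y)

captured = buffer.getvalue()
print("Building estimator" in captured)
```

The context manager restores stdout automatically, even if `fit` raises, which is the main reason to prefer it over manual reassignment.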
Some tips for using the `verbose` parameter:

- Use `verbose=0` for silent operation in production environments
- Use `verbose=1` or `verbose=2` during development for progress monitoring
- Higher `verbose` values are useful for debugging and understanding model behavior
Issues to consider:

- Verbose output can slow down computation, especially with large datasets
- In production, consider using `verbose=0` to minimize overhead
- Ensure that sensitive information is not exposed in verbose output when logging
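One way to keep verbose output during development without cluttering the console is to route it to a log file. A minimal sketch (the filename `bagging_fit.log` is arbitrary):

```python
# Sketch: redirect verbose fitting output to a log file for later review.
import contextlib
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=200, random_state=0)

# The log filename here is arbitrary, chosen for illustration
with open("bagging_fit.log", "w") as log, contextlib.redirect_stdout(log):
    BaggingClassifier(n_estimators=3, verbose=2, random_state=0).fit(X, y)
```

The per-estimator messages end up in the file instead of the console, which also keeps them out of user-facing output in logging-sensitive environments.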