The `verbose` parameter in scikit-learn's `BaggingClassifier` controls the verbosity of output during model fitting and prediction.

`BaggingClassifier` is an ensemble meta-estimator that fits base classifiers on random subsets of the original dataset and aggregates their predictions. The `verbose` parameter determines how much information is printed during these processes.
The `verbose` parameter affects the level of detail in the output logs. Higher values provide more detailed information, which can be useful for monitoring progress and debugging.

The default value for `verbose` is 0, which means no output is produced. With `verbose=1`, joblib prints brief summaries of each parallel run (note that these go to stderr rather than stdout); with `verbose=2` or higher, one line is printed for each estimator as it is built.
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
import io
import sys

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Fit a model while capturing anything it prints to stdout,
# returning both the fitted model and the captured text
def fit_and_capture(verbose):
    old_stdout = sys.stdout
    sys.stdout = buffer = io.StringIO()
    try:
        bagging = BaggingClassifier(estimator=DecisionTreeClassifier(),
                                    n_estimators=10, random_state=42,
                                    verbose=verbose)
        bagging.fit(X_train, y_train)
    finally:
        sys.stdout = old_stdout
    return bagging, buffer.getvalue()

# Train with different verbose values
verbose_values = [0, 1, 2]
accuracies = []

for verbose in verbose_values:
    bagging, output = fit_and_capture(verbose)
    y_pred = bagging.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"verbose={verbose}, Accuracy: {accuracy:.3f}")
    print("Output:")
    print(output)
    print("-" * 50)
```
Running the example gives an output like:

```
verbose=0, Accuracy: 0.885
Output:

--------------------------------------------------
verbose=1, Accuracy: 0.885
Output:

--------------------------------------------------
verbose=2, Accuracy: 0.885
Output:
Building estimator 1 of 10 for this parallel run (total 10)...
Building estimator 2 of 10 for this parallel run (total 10)...
Building estimator 3 of 10 for this parallel run (total 10)...
Building estimator 4 of 10 for this parallel run (total 10)...
Building estimator 5 of 10 for this parallel run (total 10)...
Building estimator 6 of 10 for this parallel run (total 10)...
Building estimator 7 of 10 for this parallel run (total 10)...
Building estimator 8 of 10 for this parallel run (total 10)...
Building estimator 9 of 10 for this parallel run (total 10)...
Building estimator 10 of 10 for this parallel run (total 10)...
--------------------------------------------------
```

Note that the captured output for `verbose=1` is empty: joblib writes its progress summaries to stderr, not stdout, so only the per-estimator messages produced at `verbose=2` are captured.
The key steps in this example are:

- Generate a synthetic binary classification dataset
- Split the data into train and test sets
- Create a function to capture printed output
- Train `BaggingClassifier` models with different `verbose` values
- Evaluate the accuracy of each model on the test set
- Display the captured output for each `verbose` level
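As an aside, the standard library's `contextlib.redirect_stdout` offers an idiomatic alternative to swapping `sys.stdout` by hand. A minimal sketch, using the same `verbose=2` setting as above:

```python
# Sketch: capture verbose fitting output with contextlib.redirect_stdout
# instead of manually reassigning sys.stdout.
import contextlib
import io
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=200, random_state=0)

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # verbose=2 prints one "Building estimator..." line per estimator
    BaggingClassifier(n_estimators=3, verbose=2, random_state=0).fit(X, y)

captured = buffer.getvalue()
print("Building estimator" in captured)
```

The context manager restores stdout automatically, even if `fit` raises, which is the main reason to prefer it over manual reassignment.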
Some tips for using the `verbose` parameter:

- Use `verbose=0` for silent operation in production environments
- Use `verbose=1` or `verbose=2` during development for progress monitoring
- Higher `verbose` values are useful for debugging and understanding model behavior
Issues to consider:

- Verbose output can slow down computation, especially with large datasets
- In production, consider using `verbose=0` to minimize overhead
- Ensure that sensitive information is not exposed in verbose output when logging
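One way to keep verbose output during development without cluttering the console is to route it to a log file. A minimal sketch (the filename `bagging_fit.log` is arbitrary):

```python
# Sketch: redirect verbose fitting output to a log file for later review.
import contextlib
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier

X, y = make_classification(n_samples=200, random_state=0)

# The log filename here is arbitrary, chosen for illustration
with open("bagging_fit.log", "w") as log, contextlib.redirect_stdout(log):
    BaggingClassifier(n_estimators=3, verbose=2, random_state=0).fit(X, y)
```

The per-estimator messages end up in the file instead of the console, which also keeps them out of user-facing output in logging-sensitive environments.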