The verbose parameter in scikit-learn's ExtraTreesClassifier controls the level of output during model training.
ExtraTreesClassifier is an ensemble method that builds multiple decision trees using randomized feature selection and splitting criteria. It combines the predictions of these trees to make final classifications.
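As a quick illustration of that workflow, here is a minimal fit-and-predict sketch (the dataset shape and n_estimators value are illustrative, not prescriptive):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Toy dataset: 300 samples, 8 features
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Each of the 50 trees is grown with randomized splits;
# the ensemble combines their votes into a final prediction
clf = ExtraTreesClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

print("Training accuracy:", clf.score(X, y))
```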
The verbose parameter determines how much information is printed during the fitting process. Higher values produce more detailed output, which can be useful for monitoring training progress and debugging.
The default value for verbose is 0, which means no output is produced during training. Common values are 1 for basic progress information and 2 for more detailed, per-tree output.
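You can confirm the default directly via get_params:

```python
from sklearn.ensemble import ExtraTreesClassifier

clf = ExtraTreesClassifier()
print(clf.get_params()["verbose"])  # → 0
```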
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score
import sys
from io import StringIO

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different verbose levels
verbose_levels = [0, 1, 2]
for level in verbose_levels:
    # Redirect stdout to capture output
    # (note: joblib's [Parallel] progress messages go to stderr,
    # so they are not captured here)
    old_stdout = sys.stdout
    sys.stdout = StringIO()

    # Train model
    etc = ExtraTreesClassifier(n_estimators=100, random_state=42, verbose=level)
    etc.fit(X_train, y_train)

    # Get captured output and restore stdout
    output = sys.stdout.getvalue()
    sys.stdout = old_stdout

    # Evaluate model
    y_pred = etc.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)

    print(f"Verbose level: {level}")
    print(f"Output length: {len(output)} characters")
    print(f"Accuracy: {accuracy:.3f}")
    print("Sample output:")
    print(output[:200] + "..." if output else "No output")
    print()
Running the example gives output like the following. Note that the [Parallel(...)] lines come from joblib, which writes its progress messages to stderr; they therefore bypass the stdout capture and appear interleaved with the summaries:
Verbose level: 0
Output length: 0 characters
Accuracy: 0.925
Sample output:
No output
[Parallel(n_jobs=1)]: Done 49 tasks | elapsed: 0.0s
[Parallel(n_jobs=1)]: Done 49 tasks | elapsed: 0.0s
Verbose level: 1
Output length: 0 characters
Accuracy: 0.925
Sample output:
No output
[Parallel(n_jobs=1)]: Done 40 tasks | elapsed: 0.0s
[Parallel(n_jobs=1)]: Done 40 tasks | elapsed: 0.0s
Verbose level: 2
Output length: 2392 characters
Accuracy: 0.925
Sample output:
building tree 1 of 100
building tree 2 of 100
building tree 3 of 100
building tree 4 of 100
building tree 5 of 100
building tree 6 of 100
building tree 7 of 100
building tree 8 of 100
building tree 9 ...
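This also explains why the capture at verbose=1 shows zero characters: joblib's [Parallel(...)] progress messages go to stderr, while the per-tree "building tree" messages at verbose=2 are printed to stdout. A small sketch that captures both streams with contextlib (model and dataset sizes are illustrative):

```python
import contextlib
from io import StringIO

from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=42)

out, err = StringIO(), StringIO()
# Capture stdout and stderr separately during fitting
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    ExtraTreesClassifier(n_estimators=10, random_state=42, verbose=2).fit(X, y)

print("captured on stdout:", len(out.getvalue()), "chars")  # per-tree messages
print("captured on stderr:", len(err.getvalue()), "chars")  # joblib progress
```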
The key steps in this example are:
- Generate a synthetic classification dataset
- Split the data into train and test sets
- Train ExtraTreesClassifier models with different verbose levels
- Capture and display the output for each verbose level
- Evaluate the accuracy of each model on the test set
Some tips for using the verbose parameter:
- Use verbose=0 for production code to minimize overhead
- Set verbose=1 or verbose=2 during development for progress monitoring
- Higher verbose levels can be useful for debugging or understanding model behavior
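A sketch of that split between development and production settings — verbose affects only the logging, so the fitted models are identical for the same data and random_state (parameter values here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Development: progress messages while fitting
dev_model = ExtraTreesClassifier(n_estimators=50, verbose=1, random_state=0)
dev_model.fit(X, y)

# Production: same configuration, silent
prod_model = ExtraTreesClassifier(n_estimators=50, verbose=0, random_state=0)
prod_model.fit(X, y)

# Same seed, same data -> same trees, same predictions
print((dev_model.predict(X) == prod_model.predict(X)).all())
```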
Issues to consider:
- Verbose output can slow down training, especially with large datasets
- The amount of output increases with the number of trees (n_estimators)
- Very high verbose levels may produce excessive output that's difficult to interpret
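To see how output volume scales with n_estimators, here is a small sketch that counts the per-tree progress lines printed at verbose=2 (the tree counts are illustrative):

```python
import contextlib
from io import StringIO

from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

counts = []
for n in (5, 20):
    buf = StringIO()
    # verbose=2 prints one "building tree i of n" line per tree on stdout
    with contextlib.redirect_stdout(buf):
        ExtraTreesClassifier(n_estimators=n, verbose=2, random_state=0).fit(X, y)
    counts.append(buf.getvalue().lower().count("building tree"))

print(counts)  # one progress line per tree
```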