The verbose parameter in scikit-learn's SGDClassifier controls how much output is produced during model training.
Stochastic Gradient Descent (SGD) is an optimization method that iteratively updates model parameters to minimize the loss function, processing one training sample at a time. The SGDClassifier applies this method to classification problems.
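To make this concrete, here is a minimal sketch of a single SGD update for the hinge loss (SGDClassifier's default loss). It is deliberately simplified, assuming a constant learning rate eta and no regularization, whereas the real estimator also applies a learning-rate schedule and a penalty term:

import numpy as np

# One simplified SGD update on a single sample (x_i, y_i) with y_i in {-1, +1}:
# the weights move only when the sample is misclassified or inside the margin
def sgd_step(w, b, x_i, y_i, eta=0.01):
    margin = y_i * (np.dot(w, x_i) + b)
    if margin < 1:  # hinge loss has a nonzero subgradient here
        w = w + eta * y_i * x_i
        b = b + eta * y_i
    return w, b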
The verbose parameter determines how much information is printed during the training process, which can be useful for monitoring progress and debugging.
The default value for verbose is 0, which means no output is produced during training. Any positive value prints a per-epoch progress summary.
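You can confirm the default on a fresh estimator:

from sklearn.linear_model import SGDClassifier

# verbose defaults to 0, i.e. silent training
print(SGDClassifier().verbose)  # 0

The full example below compares verbose values of 0, 1, and 2: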
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different verbose values
verbose_values = [0, 1, 2]
for v in verbose_values:
    print(f"\nTraining with verbose={v}")
    sgd = SGDClassifier(max_iter=10, tol=1e-3, verbose=v, random_state=42)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Accuracy: {accuracy:.3f}")
Running the example gives an output like:
Training with verbose=0
Accuracy: 0.740
Training with verbose=1
-- Epoch 1
Norm: 89.49, NNZs: 20, Bias: 22.738696, T: 800, Avg. loss: 13.647219
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 55.37, NNZs: 20, Bias: -7.870457, T: 1600, Avg. loss: 10.208955
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 41.46, NNZs: 20, Bias: 0.531342, T: 2400, Avg. loss: 6.728514
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 45.47, NNZs: 20, Bias: 2.992868, T: 3200, Avg. loss: 5.284373
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 36.34, NNZs: 20, Bias: 0.908330, T: 4000, Avg. loss: 4.578592
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 22.87, NNZs: 20, Bias: 2.825235, T: 4800, Avg. loss: 4.170265
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 20.50, NNZs: 20, Bias: -0.504691, T: 5600, Avg. loss: 3.403925
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 19.57, NNZs: 20, Bias: -0.363597, T: 6400, Avg. loss: 3.011103
Total training time: 0.00 seconds.
-- Epoch 9
Norm: 18.21, NNZs: 20, Bias: -0.224667, T: 7200, Avg. loss: 2.864420
Total training time: 0.00 seconds.
-- Epoch 10
Norm: 20.55, NNZs: 20, Bias: 0.001183, T: 8000, Avg. loss: 2.702092
Total training time: 0.00 seconds.
Accuracy: 0.740
Training with verbose=2
-- Epoch 1
Norm: 89.49, NNZs: 20, Bias: 22.738696, T: 800, Avg. loss: 13.647219
Total training time: 0.00 seconds.
-- Epoch 2
Norm: 55.37, NNZs: 20, Bias: -7.870457, T: 1600, Avg. loss: 10.208955
Total training time: 0.00 seconds.
-- Epoch 3
Norm: 41.46, NNZs: 20, Bias: 0.531342, T: 2400, Avg. loss: 6.728514
Total training time: 0.00 seconds.
-- Epoch 4
Norm: 45.47, NNZs: 20, Bias: 2.992868, T: 3200, Avg. loss: 5.284373
Total training time: 0.00 seconds.
-- Epoch 5
Norm: 36.34, NNZs: 20, Bias: 0.908330, T: 4000, Avg. loss: 4.578592
Total training time: 0.00 seconds.
-- Epoch 6
Norm: 22.87, NNZs: 20, Bias: 2.825235, T: 4800, Avg. loss: 4.170265
Total training time: 0.00 seconds.
-- Epoch 7
Norm: 20.50, NNZs: 20, Bias: -0.504691, T: 5600, Avg. loss: 3.403925
Total training time: 0.00 seconds.
-- Epoch 8
Norm: 19.57, NNZs: 20, Bias: -0.363597, T: 6400, Avg. loss: 3.011103
Total training time: 0.00 seconds.
-- Epoch 9
Norm: 18.21, NNZs: 20, Bias: -0.224667, T: 7200, Avg. loss: 2.864420
Total training time: 0.00 seconds.
-- Epoch 10
Norm: 20.55, NNZs: 20, Bias: 0.001183, T: 8000, Avg. loss: 2.702092
Total training time: 0.00 seconds.
Accuracy: 0.740
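Note that the verbose=1 and verbose=2 runs print exactly the same per-epoch log. If you want to keep that log without cluttering the console, one option is to capture standard output during fit. This is a minimal sketch, assuming the messages are written through Python's stdout (true for recent scikit-learn releases) and reusing X_train and y_train from the example above:

import io
from contextlib import redirect_stdout

from sklearn.linear_model import SGDClassifier

# Capture the per-epoch training log into a string instead of the console
buffer = io.StringIO()
sgd = SGDClassifier(max_iter=10, tol=1e-3, verbose=1, random_state=42)
with redirect_stdout(buffer):
    sgd.fit(X_train, y_train)

training_log = buffer.getvalue()  # e.g. starts with "-- Epoch 1"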
The key steps in this example are:
- Generate a synthetic binary classification dataset
- Split the data into train and test sets
- Train SGDClassifier models with different verbose values
- Observe the output produced during training
- Evaluate the accuracy of each model on the test set
Some tips for using the verbose parameter:
- Use verbose=0 for silent operation in production environments
- Set verbose=1 for basic progress information during development
- Don't expect extra detail from verbose>1: as the output above shows, SGDClassifier prints the same per-epoch summary for any positive value; for finer-grained monitoring, see the sketch after this list
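For monitoring beyond what verbose offers, you can drive the epochs yourself with partial_fit and report whatever metric you care about between passes. A minimal sketch, reusing the train/test split from the example above; the epoch count and the choice of test accuracy as the metric are illustrative:

import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

# Each partial_fit call performs one pass over the data, so custom
# progress can be reported between epochs
sgd = SGDClassifier(random_state=42)
classes = np.unique(y_train)
for epoch in range(5):
    sgd.partial_fit(X_train, y_train, classes=classes)
    acc = accuracy_score(y_test, sgd.predict(X_test))
    print(f"Epoch {epoch + 1}: test accuracy = {acc:.3f}")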
Issues to consider:
- Verbose output adds printing overhead, which can add up over many epochs or many model fits (e.g., in a grid search)
- In some environments, excessive output might interfere with other processes
- Balance the need for information with performance requirements