The `early_stopping` parameter in scikit-learn's `SGDClassifier` determines whether to use early stopping to terminate training when validation scores stop improving.
Stochastic Gradient Descent (SGD) is an efficient method for fitting linear classifiers, but it can be challenging to determine the optimal number of iterations. Early stopping helps prevent overfitting by monitoring the model’s performance on a validation set.
When `early_stopping` is set to `True`, the algorithm sets aside a portion of the training data as a validation set and stops training when the validation score fails to improve by at least `tol` for `n_iter_no_change` consecutive epochs.
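The stopping rule can be sketched manually with `partial_fit`, a simplified version of what `early_stopping=True` does internally (the split size, tolerance, and patience values below are illustrative, not the library's internals):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.1, random_state=0)

clf = SGDClassifier(random_state=0)
classes = np.unique(y)
best_score, no_change = -np.inf, 0
for epoch in range(1000):
    clf.partial_fit(X_tr, y_tr, classes=classes)  # one epoch of SGD
    score = clf.score(X_val, y_val)               # validation accuracy
    if score > best_score + 1e-3:                 # tol: minimum improvement
        best_score, no_change = score, 0
    else:
        no_change += 1
    if no_change >= 5:                            # n_iter_no_change epochs
        print(f"stopped after {epoch + 1} epochs, val score {best_score:.3f}")
        break
```

Passing `early_stopping=True` to `SGDClassifier` gives you this behavior without writing the loop yourself.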
The default value for `early_stopping` is `False`. When enabling it, a common choice is a `validation_fraction` between 0.1 and 0.2.
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Split into train, validation, and test sets
# (note: SGDClassifier does not use X_val directly; with early_stopping=True
# it carves its own validation split from the training data via validation_fraction)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)

# Train without early stopping
sgd_no_early = SGDClassifier(max_iter=1000, random_state=42)
sgd_no_early.fit(X_train, y_train)

# Train with early stopping
sgd_early = SGDClassifier(early_stopping=True, validation_fraction=0.2,
                          n_iter_no_change=5, max_iter=1000, random_state=42)
sgd_early.fit(X_train, y_train)

# Evaluate models
print(f"No early stopping - iterations: {sgd_no_early.n_iter_}, "
      f"accuracy: {accuracy_score(y_test, sgd_no_early.predict(X_test)):.3f}")
print(f"With early stopping - iterations: {sgd_early.n_iter_}, "
      f"accuracy: {accuracy_score(y_test, sgd_early.predict(X_test)):.3f}")
```
Running the example gives an output like:

```
No early stopping - iterations: 125, accuracy: 0.761
With early stopping - iterations: 11, accuracy: 0.769
```
The key steps in this example are:
- Generate a synthetic binary classification dataset
- Split the data into train, validation, and test sets
- Train SGDClassifier models with and without early stopping
- Compare the number of iterations and final accuracy of both models
Tips for using `early_stopping`:
- Enable early stopping for large datasets or when the optimal number of iterations is unknown
- Set `validation_fraction` between 0.1 and 0.2 to balance training-set size against validation reliability
- Monitor validation scores during training to confirm the model is improving
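To see the trade-off in practice, you can sweep `validation_fraction` and check how many epochs each model ran via the fitted `n_iter_` attribute (the dataset and values below are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Compare how much data is held out for validation vs. epochs actually run
for frac in (0.1, 0.2):
    clf = SGDClassifier(early_stopping=True, validation_fraction=frac,
                        n_iter_no_change=5, max_iter=1000, random_state=0)
    clf.fit(X, y)
    print(f"validation_fraction={frac}: stopped after {clf.n_iter_} epochs")
```

A larger fraction gives a more reliable stopping signal but leaves fewer samples for fitting the weights.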
Issues to consider:
- Early stopping reduces training time but may halt before the model reaches its best achievable performance
- It may affect model convergence, especially with small validation sets
- Early stopping interacts with other parameters such as `max_iter` and `tol`; tune them together to fine-tune the stopping criteria
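One quick way to study those interactions is to vary `tol` and `n_iter_no_change` together and observe the resulting `n_iter_` (the parameter pairs below are illustrative choices, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# A stricter tol with more patience typically lets training run longer
for tol, patience in [(1e-3, 5), (1e-4, 10)]:
    clf = SGDClassifier(early_stopping=True, tol=tol, n_iter_no_change=patience,
                        max_iter=1000, random_state=0)
    clf.fit(X, y)
    print(f"tol={tol}, n_iter_no_change={patience}: {clf.n_iter_} iterations")
```

`max_iter` remains a hard cap either way: if the stopping criterion never triggers, training ends there and scikit-learn emits a convergence warning.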