
Configure SGDClassifier "early_stopping" Parameter

The early_stopping parameter in scikit-learn’s SGDClassifier determines whether to use early stopping to terminate training when validation scores stop improving.

Stochastic Gradient Descent (SGD) is an efficient method for fitting linear classifiers, but it can be challenging to determine the optimal number of iterations. Early stopping helps prevent overfitting by monitoring the model’s performance on a validation set.

When early_stopping is set to True, the algorithm automatically sets aside a fraction of the training data (controlled by validation_fraction) as a validation set. Training stops when the validation score fails to improve by at least tol for n_iter_no_change consecutive epochs.

The default value for early_stopping is False. When enabling it, a validation_fraction of 0.1 (the default) to 0.2 is a common choice.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Split into train and test sets; when early_stopping=True,
# SGDClassifier carves its own validation set out of the training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train without early stopping
sgd_no_early = SGDClassifier(max_iter=1000, random_state=42)
sgd_no_early.fit(X_train, y_train)

# Train with early stopping
sgd_early = SGDClassifier(early_stopping=True, validation_fraction=0.2,
                          n_iter_no_change=5, max_iter=1000, random_state=42)
sgd_early.fit(X_train, y_train)

# Evaluate models
print(f"No early stopping - iterations: {sgd_no_early.n_iter_}, "
      f"accuracy: {accuracy_score(y_test, sgd_no_early.predict(X_test)):.3f}")
print(f"With early stopping - iterations: {sgd_early.n_iter_}, "
      f"accuracy: {accuracy_score(y_test, sgd_early.predict(X_test)):.3f}")

Running the example gives an output like:

No early stopping - iterations: 125, accuracy: 0.761
With early stopping - iterations: 11, accuracy: 0.769

The key steps in this example are:

  1. Generate a synthetic binary classification dataset
  2. Split the data into train and test sets
  3. Train SGDClassifier models with and without early stopping
  4. Compare the number of iterations and final accuracy of both models
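To see how the patience setting interacts with early stopping, here is a small sketch that varies n_iter_no_change on the same kind of synthetic dataset used above (exact iteration counts will vary by data and machine):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic binary classification dataset, as in the main example
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Larger n_iter_no_change values make early stopping more patient:
# training only halts after that many epochs without improvement
results = {}
for patience in (2, 5, 10):
    clf = SGDClassifier(early_stopping=True, n_iter_no_change=patience,
                        max_iter=1000, random_state=42)
    clf.fit(X, y)
    results[patience] = clf.n_iter_
    print(f"n_iter_no_change={patience}: stopped after {clf.n_iter_} iterations")
```

A very small patience value risks stopping on a single noisy epoch, while a large one trades extra training time for more confidence that the validation score has truly plateaued.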

Tips for using early_stopping:

  1. Tune validation_fraction (default 0.1): a larger fraction gives a more reliable validation score but leaves less data for fitting the weights
  2. Increase n_iter_no_change (default 5) if noisy validation scores cause training to stop too early
  3. Keep a sensible max_iter so training has an upper bound whether or not early stopping triggers

Issues to consider:

  1. The validation set is carved out of the training data, so slightly less data is used to fit the model
  2. On small datasets the validation score can be noisy, which may trigger premature stopping
  3. Scoring the validation set after every epoch adds some computational overhead
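One point worth illustrating: even with early_stopping=False, SGDClassifier does not necessarily run for all max_iter epochs, because its default criterion stops training once the training loss improves by less than tol for n_iter_no_change consecutive epochs. A small sketch (synthetic data, illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=10000, n_features=20, random_state=42)

# Default behavior: stops when the training loss improves by less
# than tol for n_iter_no_change consecutive epochs
default_stop = SGDClassifier(tol=1e-3, max_iter=1000, random_state=42)
default_stop.fit(X, y)

# tol=None disables that criterion, so training runs for the full
# max_iter epochs
no_stop = SGDClassifier(tol=None, max_iter=50, random_state=42)
no_stop.fit(X, y)

print(f"tol=1e-3: {default_stop.n_iter_} iterations")
print(f"tol=None: {no_stop.n_iter_} iterations")
```

So early_stopping=True changes *what* is monitored (validation score instead of training loss), rather than being the only way training can terminate before max_iter.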


