
Configure SGDClassifier "learning_rate" Parameter

The learning_rate parameter in scikit-learn’s SGDClassifier selects the learning rate schedule: how the step size is set and how it changes over the course of training.

Stochastic Gradient Descent (SGD) is an optimization algorithm that iteratively updates model parameters based on individual training examples. The learning rate is the step size taken at each update while moving toward a minimum of the loss function.

A higher learning rate allows faster initial learning but may overshoot the optimal solution, while a lower learning rate provides more precise convergence but may require more iterations to reach the minimum.

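The default value for learning_rate is ‘optimal’, which sets the step size from the regularization strength alpha and the update count, using a heuristic proposed by Léon Bottou; it does not use eta0. The other schedules are ‘constant’, ‘invscaling’, and ‘adaptive’, each of which starts from the initial step size eta0.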
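To make the difference between the schedules concrete, the following is a minimal sketch of the step-size formulas given in the scikit-learn documentation. The helper step_size is hypothetical and not part of scikit-learn, the default values in its signature are chosen for illustration, and t0 is a stand-in because scikit-learn derives its own value internally for ‘optimal’. The ‘adaptive’ schedule is left out because it depends on training progress: it keeps the step size at eta0 while the loss keeps improving and divides it by 5 each time n_iter_no_change consecutive epochs fail to improve by tol.

```python
def step_size(schedule, t, eta0=0.01, power_t=0.5, alpha=0.0001, t0=1.0):
    """Return the step size at update t (t >= 1) for a given schedule.

    Illustrative helper only. t0 is a stand-in value; scikit-learn
    computes its own t0 internally for the 'optimal' schedule.
    """
    if schedule == "constant":
        # eta = eta0
        return eta0
    if schedule == "invscaling":
        # eta = eta0 / t^power_t
        return eta0 / (t ** power_t)
    if schedule == "optimal":
        # eta = 1 / (alpha * (t + t0)); eta0 is not used
        return 1.0 / (alpha * (t + t0))
    raise ValueError(f"unsupported schedule: {schedule!r}")


# Compare how the constant and invscaling step sizes evolve
for t in (1, 10, 100, 1000):
    print(t, step_size("constant", t), step_size("invscaling", t))
```

The ‘constant’ schedule never shrinks the step, while ‘invscaling’ decays it as the update count grows, which is why the two can behave very differently for the same eta0.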

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different learning_rate values
learning_rates = ['constant', 'optimal', 'invscaling', 'adaptive']
accuracies = []

for lr in learning_rates:
    # eta0 is the initial step size; the 'optimal' schedule ignores it
    sgd = SGDClassifier(learning_rate=lr, eta0=0.01, random_state=42)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"learning_rate={lr}, Accuracy: {accuracy:.3f}")

Running the example gives an output like:

learning_rate=constant, Accuracy: 0.880
learning_rate=optimal, Accuracy: 0.790
learning_rate=invscaling, Accuracy: 0.865
learning_rate=adaptive, Accuracy: 0.865

The key steps in this example are:

  1. Generate a synthetic binary classification dataset
  2. Split the data into train and test sets
  3. Train SGDClassifier models with different learning_rate values
  4. Evaluate the accuracy of each model on the test set
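The comparison above uses a single train/test split and a single eta0, so the ranking of schedules can change with a different split or step size. One possible way to compare them more robustly is to cross-validate over both learning_rate and eta0. The sketch below assumes GridSearchCV with 5-fold cross-validation; the eta0 values are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# Same synthetic dataset as in the example above
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)

# 'optimal' does not use eta0, so it gets its own grid entry
param_grid = [
    {"learning_rate": ["optimal"]},
    {
        "learning_rate": ["constant", "invscaling", "adaptive"],
        "eta0": [0.001, 0.01, 0.1],  # illustrative values only
    },
]

# 5-fold cross-validated search over schedule and initial step size
search = GridSearchCV(SGDClassifier(random_state=42), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
```

Splitting the grid into two dictionaries avoids fitting ‘optimal’ repeatedly with eta0 values it would ignore.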

Some tips and heuristics for setting learning_rate:

Issues to consider:


