The learning_rate parameter in scikit-learn’s SGDRegressor controls how quickly the model adapts to the problem.
Stochastic Gradient Descent (SGD) is an optimization algorithm that iteratively adjusts model parameters to minimize the loss function. The learning rate determines the step size at each iteration while moving toward a minimum of the loss function.
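To make the role of the step size concrete, here is a minimal sketch of a single SGD update for squared error loss on one sample; the names (w, x, eta) are illustrative, not scikit-learn internals:

```python
import numpy as np

# A single SGD update for squared error loss on one sample.
rng = np.random.default_rng(0)
w = np.zeros(3)          # current model weights
x = rng.normal(size=3)   # one training sample
y = 2.0                  # its target value
eta = 0.01               # the learning rate (step size)

# Gradient of 0.5 * (x @ w - y)**2 with respect to w:
grad = (x @ w - y) * x

# Step against the gradient; eta scales how far we move.
w -= eta * grad
```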
The learning_rate parameter in SGDRegressor can be set to one of four options: ‘constant’, ‘optimal’, ‘invscaling’, or ‘adaptive’. Each option defines a different schedule for how the learning rate changes during training.
The default value for learning_rate is ‘invscaling’. Common alternatives include ‘constant’ for a fixed learning rate and ‘adaptive’ for a rate that is automatically reduced whenever training stops improving.
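Each schedule is selected directly on the estimator, with eta0 and power_t as the companion parameters that shape it. The update rules in the comments below follow the scikit-learn documentation; the eta0 values shown are simply the library defaults:

```python
from sklearn.linear_model import SGDRegressor

# 'constant': eta = eta0 for every update
sgd_constant = SGDRegressor(learning_rate='constant', eta0=0.01)

# 'invscaling' (the default): eta = eta0 / pow(t, power_t)
sgd_invscaling = SGDRegressor(learning_rate='invscaling', eta0=0.01, power_t=0.25)

# 'adaptive': eta = eta0 until the loss stops improving for
# n_iter_no_change consecutive epochs, then eta is divided by 5
sgd_adaptive = SGDRegressor(learning_rate='adaptive', eta0=0.01)

# 'optimal': eta = 1.0 / (alpha * (t + t0)), a heuristic; eta0 is unused
sgd_optimal = SGDRegressor(learning_rate='optimal')
```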
```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different learning_rate options
learning_rates = ['constant', 'optimal', 'invscaling', 'adaptive']
mse_scores = []

for lr in learning_rates:
    sgd = SGDRegressor(learning_rate=lr, random_state=42, max_iter=1000)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"learning_rate={lr}, MSE: {mse:.3f}")

# Find the best learning rate
best_lr = learning_rates[np.argmin(mse_scores)]
print(f"Best learning_rate: {best_lr}")
```
Running the example gives an output like:
```
learning_rate=constant, MSE: 0.010
learning_rate=optimal, MSE: 0.031
learning_rate=invscaling, MSE: 0.010
learning_rate=adaptive, MSE: 0.010
Best learning_rate: invscaling
```
The key steps in this example are:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Train SGDRegressor models with different learning_rate options
- Evaluate the mean squared error of each model on the test set
- Identify the best performing learning rate option
Some tips and heuristics for setting learning_rate:
- ‘constant’ is suitable when you already have a good idea of the optimal learning rate; otherwise, eta0 can be tuned alongside the schedule (see the sketch after this list)
- ‘optimal’ works well for simple problems but may struggle with complex datasets
- ‘invscaling’ is a good default choice as it gradually decreases the learning rate
- ‘adaptive’ can be effective for datasets where the optimal learning rate changes during training
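Since a good step size is rarely known in advance, one reasonable approach is to cross-validate the schedule and eta0 together. Here is a sketch using GridSearchCV on the same synthetic data; the eta0 grid values are illustrative, not recommendations:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Search the schedule and the initial step size together;
# eta0 matters for 'constant', 'invscaling', and 'adaptive',
# and is ignored by 'optimal'.
param_grid = {
    'learning_rate': ['constant', 'invscaling', 'adaptive'],
    'eta0': [0.001, 0.01, 0.1],
}
search = GridSearchCV(
    SGDRegressor(max_iter=1000, random_state=42),
    param_grid,
    scoring='neg_mean_squared_error',
    cv=5,
)
search.fit(X, y)
print(search.best_params_)
```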
Issues to consider:
- The optimal learning rate depends on the specific dataset and problem
- Too high a learning rate can cause the model to overshoot the minimum or diverge outright
- Too low a learning rate can result in very slow convergence, leaving the model far from the minimum when max_iter is exhausted (both failure modes are shown in the sketch after this list)
- The effectiveness of each option can vary based on the scale of your features and the complexity of the relationship between features and target variable
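The overshooting and slow-convergence issues are easy to see with a ‘constant’ schedule and deliberately extreme eta0 values (chosen purely for illustration): the oversized step tends to diverge, while the tiny step leaves the model badly underfit within max_iter epochs.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Deliberately extreme step sizes alongside a sensible one.
for eta0 in [10.0, 0.01, 1e-7]:
    sgd = SGDRegressor(learning_rate='constant', eta0=eta0,
                       max_iter=1000, random_state=42)
    try:
        sgd.fit(X_train, y_train)
        mse = mean_squared_error(y_test, sgd.predict(X_test))
        print(f"eta0={eta0}: MSE={mse:.3f}")
    except ValueError as exc:
        # An oversized step can make the weights overflow and diverge,
        # which scikit-learn reports as a floating-point error.
        print(f"eta0={eta0}: diverged ({exc})")
```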