Configure SGDRegressor "learning_rate" Parameter

The learning_rate parameter in scikit-learn’s SGDRegressor selects the schedule that sets the step size of each weight update, and therefore how quickly and how stably the model converges.

Stochastic Gradient Descent (SGD) is an optimization algorithm that iteratively adjusts model parameters to minimize the loss function. The learning rate determines the step size at each iteration while moving toward a minimum of the loss function.
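To make the role of the step size concrete, here is a minimal NumPy sketch of SGD on a small noiseless linear problem. It is an illustration only, not how SGDRegressor is implemented internally; the data, the variable names, and the step size eta = 0.05 are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): y is an exact linear function of X
X_demo = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y_demo = X_demo @ true_w

w = np.zeros(3)   # model weights, start at zero
eta = 0.05        # fixed step size (the learning rate)

for epoch in range(20):
    # Visit the samples in a fresh random order each epoch
    for i in rng.permutation(len(X_demo)):
        error = X_demo[i] @ w - y_demo[i]   # prediction error on one sample
        grad = error * X_demo[i]            # gradient of 0.5 * error**2 w.r.t. w
        w -= eta * grad                     # step against the gradient

print("learned weights:", np.round(w, 3))
```

A larger eta takes bigger steps and can overshoot, while a smaller eta is more stable but needs more updates to reach the same point.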

The learning_rate parameter in SGDRegressor accepts one of four schedule names: ‘constant’, ‘optimal’, ‘invscaling’, or ‘adaptive’. Each option defines how the step size changes over the course of training.
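The three schedules that depend only on the update counter can be written out as a small function. This is a sketch based on the formulas in the scikit-learn documentation; the function name step_size is hypothetical, and the t0 value used for ‘optimal’ is a placeholder, since scikit-learn chooses t0 internally with a heuristic. The ‘adaptive’ schedule is left out because it reacts to the loss history rather than to the update count.

```python
def step_size(schedule, t, eta0=0.01, power_t=0.25, alpha=0.0001, t0=1.0):
    """Return the step size at update t (t starts at 1) for a named schedule."""
    if schedule == "constant":
        # Same step size for every update
        return eta0
    if schedule == "invscaling":
        # Step size shrinks as a power of the update counter
        return eta0 / t ** power_t
    if schedule == "optimal":
        # Depends on the regularization strength alpha, not on eta0.
        # t0 here is an assumed placeholder, not scikit-learn's actual value.
        return 1.0 / (alpha * (t + t0))
    raise ValueError(f"unsupported schedule: {schedule}")


for t in (1, 16, 256):
    print(t, step_size("constant", t), step_size("invscaling", t))
```

With the default power_t of 0.25, the ‘invscaling’ step size halves each time the update count grows by a factor of 16.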

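The default value for learning_rate is ‘invscaling’, which starts at eta0 (default 0.01) and shrinks the rate as training progresses, at a pace set by power_t (default 0.25). Common alternatives include ‘constant’, which keeps the rate fixed at eta0, and ‘adaptive’, which holds the rate at eta0 while the training loss keeps improving and divides it by 5 each time n_iter_no_change consecutive epochs fail to improve the loss by at least tol.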
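Each schedule is tuned through companion parameters on the estimator. The sketch below shows which ones are relevant for three of the options; the specific values are simply the library defaults written out explicitly, and the configs dictionary is a name introduced here for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

X_cfg, y_cfg = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# One estimator per schedule, with the parameters that schedule actually uses
configs = {
    "constant": SGDRegressor(learning_rate="constant", eta0=0.01, random_state=42),
    "invscaling": SGDRegressor(
        learning_rate="invscaling", eta0=0.01, power_t=0.25, random_state=42
    ),
    "adaptive": SGDRegressor(
        learning_rate="adaptive", eta0=0.01, n_iter_no_change=5, tol=1e-3, random_state=42
    ),
}

results = {}
for name, model in configs.items():
    model.fit(X_cfg, y_cfg)
    # n_iter_ is the number of epochs actually run before stopping
    results[name] = (model.n_iter_, model.score(X_cfg, y_cfg))
    print(f"{name}: epochs={results[name][0]}, training R^2={results[name][1]:.4f}")
```

Note that eta0 has no effect when learning_rate is ‘optimal’, because that schedule derives its step size from alpha instead.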

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different learning_rate options
learning_rates = ['constant', 'optimal', 'invscaling', 'adaptive']
mse_scores = []

for lr in learning_rates:
    sgd = SGDRegressor(learning_rate=lr, random_state=42, max_iter=1000)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"learning_rate={lr}, MSE: {mse:.3f}")

# Find the best learning rate
best_lr = learning_rates[np.argmin(mse_scores)]
print(f"Best learning_rate: {best_lr}")

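Running the example gives an output like the following. Three of the schedules tie at the printed three decimal places, so the reported best is decided by differences beyond that precision: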

learning_rate=constant, MSE: 0.010
learning_rate=optimal, MSE: 0.031
learning_rate=invscaling, MSE: 0.010
learning_rate=adaptive, MSE: 0.010
Best learning_rate: invscaling

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Train SGDRegressor models with different learning_rate options
  4. Evaluate the mean squared error of each model on the test set
  5. Identify the best performing learning rate option
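Picking the winner on the test set, as in step 5, can give an optimistic estimate. One common alternative is to select the schedule with cross-validation on the training data only. The following is a sketch of that idea using GridSearchCV; the eta0 grid is an assumption chosen for illustration, not a recommendation.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

X_gs, y_gs = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_gs_train, X_gs_test, y_gs_train, y_gs_test = train_test_split(
    X_gs, y_gs, test_size=0.2, random_state=42
)

# Candidate schedules and initial rates (the eta0 values are illustrative)
param_grid = {
    "learning_rate": ["constant", "optimal", "invscaling", "adaptive"],
    "eta0": [0.001, 0.01, 0.05],
}

search = GridSearchCV(
    SGDRegressor(random_state=42, max_iter=1000),
    param_grid,
    scoring="neg_mean_squared_error",  # higher (closer to zero) is better
    cv=5,
)
search.fit(X_gs_train, y_gs_train)

print("Best parameters:", search.best_params_)
print(f"Cross-validated MSE: {-search.best_score_:.3f}")
# The held-out test set is used once, for the final estimate only
print(f"Test score (negated MSE): {search.score(X_gs_test, y_gs_test):.3f}")
```

Because eta0 is ignored by the ‘optimal’ schedule, its three grid entries produce identical models; the redundancy is harmless but could be removed by passing a list of two separate grids.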

Some tips and heuristics for setting learning_rate:

Issues to consider:
