
Configure SGDRegressor "eta0" Parameter

The eta0 parameter in scikit-learn’s SGDRegressor sets the initial learning rate for the model’s gradient descent optimization.

Stochastic Gradient Descent (SGD) is an iterative optimization algorithm used for fitting linear models. At each iteration it updates the model's parameters using the gradient of the loss function computed on a single training example.

The eta0 parameter controls the step size taken during each update. A larger value can lead to faster initial convergence but may overshoot the optimal solution, while a smaller value provides more precise updates but may require more iterations to converge.

The default value for eta0 is 0.01.

In practice, values between 0.1 and 0.0001 are commonly used, depending on the specific problem and dataset characteristics.
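Because eta0 is the starting point of whichever learning rate schedule is in effect, it can help to set the learning_rate parameter explicitly alongside it. The snippet below is a minimal sketch; the schedule names are standard scikit-learn options and the specific values are only illustrative.

from sklearn.linear_model import SGDRegressor

# 'invscaling' (the default for SGDRegressor): eta = eta0 / pow(t, power_t),
# so the learning rate decays from eta0 as training progresses
sgd_invscaling = SGDRegressor(learning_rate='invscaling', eta0=0.01, power_t=0.25)

# 'constant': the learning rate stays fixed at eta0 for the entire run
sgd_constant = SGDRegressor(learning_rate='constant', eta0=0.01)

# 'adaptive': the learning rate starts at eta0 and is divided by 5 whenever
# the training loss stops improving by tol for n_iter_no_change epochs
sgd_adaptive = SGDRegressor(learning_rate='adaptive', eta0=0.01)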

import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different eta0 values
eta0_values = [0.1, 0.01, 0.001, 0.0001]
mse_scores = []

for eta0 in eta0_values:
    sgd = SGDRegressor(eta0=eta0, random_state=42, max_iter=1000, tol=1e-3)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"eta0={eta0}, MSE: {mse:.3f}")

Running the example gives an output like:

eta0=0.1, MSE: 0.010
eta0=0.01, MSE: 0.010
eta0=0.001, MSE: 0.026
eta0=0.0001, MSE: 22.820

The larger initial learning rates all reach a similarly low error, while the smallest value (0.0001) has clearly not converged within the allotted iterations.

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Train SGDRegressor models with different eta0 values
  4. Evaluate the mean squared error of each model on the test set

Some tips and heuristics for setting eta0:

  - Start with the default of 0.01 and adjust based on how quickly the training loss decreases.
  - If the loss oscillates or the error blows up, reduce eta0; if convergence is very slow, increase it.
  - Tune eta0 together with the learning_rate schedule ('constant', 'invscaling', or 'adaptive'), since the schedule determines how the initial rate is used over time.
  - Search over a logarithmic grid (for example 0.1, 0.01, 0.001, 0.0001) with cross-validation rather than tuning by hand, as sketched below.
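As an illustration of the last tip, here is a minimal sketch of tuning eta0 with GridSearchCV; the parameter grid and cv setting are only example choices, and it assumes the X_train and y_train arrays from the example above.

from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV

# Search over a logarithmic grid of initial learning rates
param_grid = {'eta0': [0.1, 0.01, 0.001, 0.0001]}
grid = GridSearchCV(SGDRegressor(random_state=42, max_iter=1000, tol=1e-3),
                    param_grid, cv=5, scoring='neg_mean_squared_error')
grid.fit(X_train, y_train)

print(f"Best eta0: {grid.best_params_['eta0']}")
print(f"Best CV MSE: {-grid.best_score_:.3f}")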

Issues to consider:

  - SGD is sensitive to feature scaling, so the effect of a given eta0 depends heavily on whether the inputs have been standardized; scaling is usually applied first, as in the sketch below.
  - A value that is too large can cause the loss to diverge entirely, not just overshoot.
  - The best eta0 depends on the data, the loss function, and the penalty, so a value that works on one problem may not transfer to another.
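To address the first issue, a common pattern is to standardize the features in a pipeline before fitting. The sketch below shows one way to do this; the eta0 value is only illustrative, and it reuses the train/test split from the example above.

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Standardize features so a single eta0 behaves consistently across columns
model = make_pipeline(StandardScaler(),
                      SGDRegressor(eta0=0.01, random_state=42, max_iter=1000, tol=1e-3))
model.fit(X_train, y_train)
print(f"Test MSE: {mean_squared_error(y_test, model.predict(X_test)):.3f}")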
