
Configure SGDClassifier "eta0" Parameter

The eta0 parameter in scikit-learn’s SGDClassifier sets the initial learning rate for the stochastic gradient descent optimization.

Stochastic Gradient Descent (SGD) is an optimization algorithm that iteratively updates model parameters to minimize the loss function. With the ‘constant’ learning rate schedule, eta0 is the step size used at every update; with the ‘invscaling’ and ‘adaptive’ schedules, it is the starting step size, which is then reduced during training.

A higher eta0 value can lead to faster convergence but may overshoot the minimum, while a lower value provides more precise updates but may result in slower convergence.
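The trade-off can be seen on a toy problem that is independent of scikit-learn. The sketch below runs plain gradient descent with a fixed step size on f(w) = w**2; the function name gradient_descent and the three step sizes are illustrative choices, not part of SGDClassifier.

```python
def gradient_descent(step_size, n_steps=50, w0=1.0):
    """Minimize f(w) = w**2 with a fixed step size and return the final w."""
    w = w0
    for _ in range(n_steps):
        grad = 2.0 * w            # derivative of w**2
        w = w - step_size * grad  # one gradient descent update
    return w


# Small, moderate, and too-large step sizes (illustrative values)
final_w = {eta: gradient_descent(eta) for eta in (0.01, 0.1, 1.1)}

for eta, w in final_w.items():
    print(f"step size={eta}: final w = {w:.6g}")
```

The minimum is at w = 0. The small step size moves toward it slowly, the moderate one gets much closer in the same number of steps, and the largest one overshoots on every update so that w grows instead of shrinking.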

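In many scikit-learn releases, SGDClassifier's default is eta0=0.0, because the default learning rate schedule, ‘optimal’, does not use it. With the ‘constant’, ‘invscaling’, or ‘adaptive’ schedules, eta0 must be set to a positive value; 0.01 is a common starting point.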

In practice, values between 0.001 and 0.1 are commonly used, depending on the specific problem and dataset characteristics.
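How eta0 enters the update depends on the schedule. The scikit-learn documentation gives eta = eta0 for ‘constant’ and eta = eta0 / pow(t, power_t) for ‘invscaling’, where t is the update counter. The helper below is a minimal sketch of those two formulas only; learning_rate_at is a hypothetical name, and the ‘optimal’ and ‘adaptive’ schedules are deliberately left out.

```python
def learning_rate_at(t, eta0, schedule="constant", power_t=0.5):
    """Return the step size at update t (t >= 1) for two documented schedules."""
    if schedule == "constant":
        # The step size never changes
        return eta0
    if schedule == "invscaling":
        # The step size decays as a power of the update counter
        return eta0 / (t ** power_t)
    raise ValueError(f"schedule not covered by this sketch: {schedule}")


for t in (1, 10, 100, 1000):
    constant = learning_rate_at(t, eta0=0.1, schedule="constant")
    decayed = learning_rate_at(t, eta0=0.1, schedule="invscaling")
    print(f"t={t:>4}: constant={constant:.4f}, invscaling={decayed:.4f}")
```

With a decaying schedule, a larger eta0 is often tolerable because later updates are automatically smaller, whereas a constant schedule keeps taking steps of the same size until training stops.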

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
import numpy as np

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different eta0 values
eta0_values = [0.0001, 0.001, 0.01, 0.1, 1.0]
accuracies = []

for eta0 in eta0_values:
    sgd = SGDClassifier(loss='log_loss', eta0=eta0, learning_rate='constant',
                        random_state=42, max_iter=1000)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"eta0={eta0:.4f}, Accuracy: {accuracy:.3f}")

# Find best eta0
best_eta0 = eta0_values[np.argmax(accuracies)]
print(f"\nBest eta0: {best_eta0:.4f}")

Running the example gives an output like:

eta0=0.0001, Accuracy: 0.815
eta0=0.0010, Accuracy: 0.810
eta0=0.0100, Accuracy: 0.830
eta0=0.1000, Accuracy: 0.755
eta0=1.0000, Accuracy: 0.740

Best eta0: 0.0100

The key steps in this example are:

  1. Generate a synthetic binary classification dataset
  2. Split the data into train and test sets
  3. Train SGDClassifier models with different eta0 values
  4. Evaluate the accuracy of each model on the test set
  5. Identify the best eta0 value based on accuracy

Some tips for setting eta0:

Issues to consider:



See Also