
Configure SGDRegressor "l1_ratio" Parameter

The l1_ratio parameter in scikit-learn’s SGDRegressor controls the balance between L1 and L2 regularization.

SGDRegressor supports elastic net regularization, which combines L1 and L2 penalties, when penalty="elasticnet" is set. The l1_ratio parameter then determines the mix of these penalties. With the default penalty="l2", l1_ratio is ignored.

l1_ratio ranges from 0 to 1. A value of 0 corresponds to L2 regularization only, while 1 means pure L1 regularization. Values between 0 and 1 represent a mix of both.
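To make the mix concrete, here is a minimal sketch of the penalty term, following the form given in scikit-learn's SGD user guide, where the L2 term carries a factor of 1/2. The function name elastic_net_penalty is hypothetical and not part of scikit-learn. In the full training objective this term is further scaled by alpha.

```python
import numpy as np


def elastic_net_penalty(w, l1_ratio):
    """Illustrative helper (not a scikit-learn API): mix of L1 and L2 terms."""
    w = np.asarray(w, dtype=float)
    l1_term = np.sum(np.abs(w))        # sum of absolute values
    l2_term = 0.5 * np.sum(w ** 2)     # half the squared Euclidean norm
    return l1_ratio * l1_term + (1.0 - l1_ratio) * l2_term


weights = [3.0, -4.0, 0.0]
for ratio in (0.0, 0.15, 1.0):
    print(f"l1_ratio={ratio}: penalty={elastic_net_penalty(weights, ratio)}")
```

At l1_ratio=0 only the L2 term contributes, at l1_ratio=1 only the L1 term, and values in between interpolate linearly between the two.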

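The default value for l1_ratio is 0.15, which favors L2 regularization. Common values range from 0.1 to 0.9. Higher values put more weight on the L1 term, which tends to drive some coefficients to exactly zero (feature selection), while lower values put more weight on the L2 term, which shrinks coefficients more smoothly (model stability).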
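The feature-selection effect can be checked by counting coefficients that end up exactly zero. The sketch below is illustrative only: the dataset settings (n_informative=5) and alpha=0.1 are assumptions chosen so the penalty is strong enough to matter, and the counts you see will depend on the data, alpha, and scikit-learn version.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

# Assumed setup: only 5 of the 20 features carry signal
X, y = make_regression(
    n_samples=1000, n_features=20, n_informative=5, noise=0.1, random_state=42
)

zero_counts = {}
for ratio in (0.0, 0.5, 1.0):
    # penalty="elasticnet" is needed for l1_ratio to take effect
    model = SGDRegressor(
        penalty="elasticnet", alpha=0.1, l1_ratio=ratio, random_state=42
    )
    model.fit(X, y)
    zero_counts[ratio] = int(np.sum(model.coef_ == 0.0))
    print(f"l1_ratio={ratio}: {zero_counts[ratio]} of {model.coef_.size} coefficients are exactly zero")
```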

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different l1_ratio values
l1_ratio_values = [0, 0.15, 0.5, 0.85, 1]
mse_scores = []

for ratio in l1_ratio_values:
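    # penalty="elasticnet" is required; with the default penalty="l2", l1_ratio is ignored
    sgd = SGDRegressor(penalty="elasticnet", l1_ratio=ratio, random_state=42)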
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"l1_ratio={ratio}, MSE: {mse:.3f}")

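Running the example gives an output like the following (exact values can vary by scikit-learn version). Near-identical scores are expected here: the default alpha=0.0001 applies only a weak penalty, so changing the L1/L2 mix barely moves the test error.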

l1_ratio=0, MSE: 0.012
l1_ratio=0.15, MSE: 0.012
l1_ratio=0.5, MSE: 0.012
l1_ratio=0.85, MSE: 0.012
l1_ratio=1, MSE: 0.012

The key steps in this example are:

  1. Generate a synthetic regression dataset with multiple features
  2. Split the data into train and test sets
  3. Train SGDRegressor models with different l1_ratio values
  4. Evaluate the mean squared error of each model on the test set
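Because the effect of l1_ratio depends on the overall penalty strength alpha, the two are often searched together. The following is a sketch of one way to do that with GridSearchCV, not a recommended recipe: the grid values, the smaller dataset, and the StandardScaler step are assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed setup: a small synthetic dataset with 5 informative features
X, y = make_regression(
    n_samples=500, n_features=20, n_informative=5, noise=0.1, random_state=42
)

# Scale features first, since SGD is sensitive to feature scaling
pipe = make_pipeline(
    StandardScaler(),
    SGDRegressor(penalty="elasticnet", random_state=42),
)

# Illustrative grid; make_pipeline names the step "sgdregressor"
param_grid = {
    "sgdregressor__alpha": [1e-4, 1e-2, 1.0],
    "sgdregressor__l1_ratio": [0.0, 0.15, 0.5, 0.85, 1.0],
}

search = GridSearchCV(pipe, param_grid, scoring="neg_mean_squared_error", cv=5)
search.fit(X, y)

best_params = search.best_params_
print("Best parameters:", best_params)
print("Best CV MSE:", -search.best_score_)
```

Searching alpha and l1_ratio jointly avoids drawing conclusions about l1_ratio from a single alpha where the penalty may be too weak to matter.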

Some tips and heuristics for setting l1_ratio:

Issues to consider:


