The `l1_ratio` parameter in scikit-learn's `SGDRegressor` controls the balance between L1 and L2 regularization.
`SGDRegressor` supports elastic net regularization, which combines L1 and L2 penalties. The `l1_ratio` parameter determines the mix of these penalties, allowing for fine-tuned regularization. Note that `l1_ratio` only takes effect when `penalty='elasticnet'` is set; the default penalty is `'l2'`.
`l1_ratio` ranges from 0 to 1. A value of 0 corresponds to L2 regularization only, while 1 means pure L1 regularization. Values between 0 and 1 represent a mix of both.
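Concretely, scikit-learn's SGD formulation uses the elastic net penalty `l1_ratio * ||w||_1 + (1 - l1_ratio)/2 * ||w||_2^2`, scaled by `alpha`. A minimal sketch of that formula (the helper name `elastic_net_penalty` is hypothetical, for illustration only):

```python
import numpy as np

def elastic_net_penalty(w, alpha, l1_ratio):
    # Convex combination of the L1 and (halved) squared L2 norms,
    # mirroring the elastic net penalty used by SGDRegressor
    l1 = np.sum(np.abs(w))
    l2 = 0.5 * np.sum(w ** 2)
    return alpha * (l1_ratio * l1 + (1 - l1_ratio) * l2)

w = np.array([1.0, -2.0, 0.5])
print(elastic_net_penalty(w, alpha=0.0001, l1_ratio=1.0))  # pure L1 term
print(elastic_net_penalty(w, alpha=0.0001, l1_ratio=0.0))  # pure L2 term
```

Sliding `l1_ratio` between 0 and 1 simply interpolates between these two penalty terms.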
The default value for `l1_ratio` is 0.15, which favors L2 regularization. Common values range from 0.1 to 0.9, depending on the desired balance between feature selection and model stability.
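You can confirm these defaults directly on a fresh estimator. Note that the default `penalty` is `'l2'`, so `l1_ratio` is ignored until you opt into elastic net:

```python
from sklearn.linear_model import SGDRegressor

sgd = SGDRegressor()
print(sgd.l1_ratio)  # 0.15
print(sgd.penalty)   # 'l2' -- l1_ratio only applies with penalty='elasticnet'
```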
```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different l1_ratio values; penalty='elasticnet' is required
# for l1_ratio to have any effect (the default penalty is 'l2')
l1_ratio_values = [0, 0.15, 0.5, 0.85, 1]
mse_scores = []

for ratio in l1_ratio_values:
    sgd = SGDRegressor(penalty='elasticnet', l1_ratio=ratio, random_state=42)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"l1_ratio={ratio}, MSE: {mse:.3f}")
```
Running the example gives output like the following (exact values depend on your environment and scikit-learn version):

```
l1_ratio=0, MSE: 0.012
l1_ratio=0.15, MSE: 0.012
l1_ratio=0.5, MSE: 0.012
l1_ratio=0.85, MSE: 0.012
l1_ratio=1, MSE: 0.012
```
The key steps in this example are:

- Generate a synthetic regression dataset with multiple features
- Split the data into train and test sets
- Train `SGDRegressor` models with different `l1_ratio` values
- Evaluate the mean squared error of each model on the test set
Some tips and heuristics for setting `l1_ratio`:
- Start with the default value of 0.15 and adjust based on model performance
- Use higher values (closer to 1) for sparse feature selection
- Lower values (closer to 0) can improve stability for dense feature sets
- Employ cross-validation to find the optimal value for your specific dataset
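As a sketch of the cross-validation tip above, a `GridSearchCV` over candidate `l1_ratio` values can pick the best mix automatically (the candidate grid and synthetic dataset here are illustrative choices, not recommendations):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=500, n_features=20, noise=0.1, random_state=42)

# penalty='elasticnet' is required for l1_ratio to matter
param_grid = {"l1_ratio": [0.1, 0.3, 0.5, 0.7, 0.9]}
grid = GridSearchCV(
    SGDRegressor(penalty="elasticnet", random_state=42),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
grid.fit(X, y)
print("Best l1_ratio:", grid.best_params_["l1_ratio"])
```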
Issues to consider:
- L1 regularization promotes sparsity, while L2 handles correlated features better
- Higher `l1_ratio` values may improve model interpretability by reducing the feature count
- The optimal `l1_ratio` depends on the nature of your data and the problem complexity
- Very low or high values may lead to underfitting or overfitting, respectively
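To see the sparsity effect in practice, one can count near-zero coefficients at different `l1_ratio` settings. The larger `alpha=0.1`, the `1e-2` threshold, and the dataset below are assumptions chosen to make the effect visible on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

# Only 5 of 20 features are informative, so L1 has something to prune
X, y = make_regression(n_samples=1000, n_features=20, n_informative=5,
                       noise=0.1, random_state=42)

counts = {}
for ratio in [0.0, 0.5, 1.0]:
    # alpha=0.1 is deliberately strong so the sparsity pattern is visible
    sgd = SGDRegressor(penalty="elasticnet", alpha=0.1, l1_ratio=ratio,
                       random_state=42)
    sgd.fit(X, y)
    counts[ratio] = int(np.sum(np.abs(sgd.coef_) < 1e-2))
    print(f"l1_ratio={ratio}: {counts[ratio]} near-zero coefficients")
```

Higher `l1_ratio` values should drive more of the uninformative coefficients toward exactly zero, which is what makes L1 useful for feature selection.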