The alpha parameter in scikit-learn's SGDRegressor controls the strength of regularization applied to the model's coefficients.
Stochastic Gradient Descent (SGD) is an optimization algorithm used for fitting linear models. SGDRegressor implements regularized linear regression models trained using SGD.
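Concretely, scikit-learn's SGD models are trained on an objective roughly of the form

E(w, b) = 1/n * sum(L(y_i, f(x_i))) + alpha * R(w)

where L is the per-sample loss (squared error by default for SGDRegressor) and R(w) is the penalty term, so alpha directly scales how much the penalty contributes relative to the data-fitting loss.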
The alpha parameter determines the amount of regularization. Higher values of alpha increase regularization, which can help prevent overfitting but may lead to underfitting if set too high.
The default value for alpha is 0.0001. In practice, values are often tuned in the range of 1e-5 to 1.0, depending on the scale of the target variable and the complexity of the problem.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different alpha values and record the test error for each
alpha_values = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]
mse_scores = []
for alpha in alpha_values:
    sgd = SGDRegressor(alpha=alpha, random_state=42, max_iter=1000)
    sgd.fit(X_train, y_train)
    y_pred = sgd.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"alpha={alpha}, MSE: {mse:.3f}")
Running the example gives an output like:
alpha=1e-05, MSE: 0.011
alpha=0.0001, MSE: 0.012
alpha=0.001, MSE: 0.052
alpha=0.01, MSE: 4.037
alpha=0.1, MSE: 339.606
alpha=1.0, MSE: 9813.859
The key steps in this example are:
- Generate a synthetic regression dataset with multiple features
- Split the data into train and test sets
- Train SGDRegressor models with different alpha values
- Evaluate the mean squared error of each model on the test set
- Visualize the relationship between alpha and model performance (see the plotting sketch after this list)
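The listing above imports matplotlib but stops at the training loop. A minimal plotting sketch for the visualization step, assuming the alpha_values and mse_scores lists built in the loop and the plt import from the example, could look like this:

```python
# Plot test MSE against alpha; log scales on both axes make the
# sharp increase in error at large alpha easier to see.
plt.figure(figsize=(8, 5))
plt.plot(alpha_values, mse_scores, marker="o")
plt.xscale("log")
plt.yscale("log")
plt.xlabel("alpha (log scale)")
plt.ylabel("Test MSE (log scale)")
plt.title("SGDRegressor: test error vs. alpha")
plt.tight_layout()
plt.show()
```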
Some tips and heuristics for setting alpha:
- Start with the default value and adjust based on model performance
- Use cross-validation to find the optimal alpha for your dataset (a sketch follows this list)
- Consider the scale of your target variable when choosing alpha values
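One way to do the cross-validation step is with GridSearchCV. The sketch below assumes the X_train and y_train arrays from the example above; the grid bounds are illustrative, not a recommendation:

```python
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV

# Search a log-spaced grid of alpha values with 5-fold cross-validation,
# scoring by (negative) mean squared error.
param_grid = {"alpha": [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]}
search = GridSearchCV(
    SGDRegressor(max_iter=1000, random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=5,
)
search.fit(X_train, y_train)
print("Best alpha:", search.best_params_["alpha"])
```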
Issues to consider:
- Too low alpha may lead to overfitting, especially with high-dimensional data
- Too high alpha can cause underfitting, resulting in poor model performance
- The optimal alpha depends on the noise level and complexity of your data
- Different regularization types (L1, L2, Elastic Net) may require different alpha ranges (see the sketch below)
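To illustrate the last point, here is a short sketch of how the penalty type is selected in SGDRegressor; the alpha and l1_ratio values shown are just the library defaults, not tuned settings:

```python
from sklearn.linear_model import SGDRegressor

# alpha scales whichever penalty is selected, so the useful range can
# shift between penalty types and each usually needs its own tuning.
ridge_like = SGDRegressor(penalty="l2", alpha=1e-4)   # L2 is the default penalty
lasso_like = SGDRegressor(penalty="l1", alpha=1e-4)
enet_like = SGDRegressor(penalty="elasticnet", alpha=1e-4, l1_ratio=0.15)
```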