Ridge regression is a linear regression technique that adds L2 regularization to ordinary least squares. The max_iter parameter in scikit-learn's Ridge class controls the maximum number of iterations used by the iterative solvers.
The max_iter parameter sets an upper limit on the number of iterations the solver may perform, and it only applies when an iterative solver (such as 'sag', 'saga', 'lsqr', or 'sparse_cg') is selected; direct solvers like 'cholesky' and 'svd' ignore it. Increasing the value gives the solver more room to converge, at the cost of increased computation time.
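As a minimal sketch, max_iter is passed directly to the constructor; pairing it with an explicitly iterative solver (here 'sag', chosen purely for illustration) makes its role visible:

from sklearn.linear_model import Ridge

# max_iter only takes effect with an iterative solver; alpha is the usual
# L2 regularization strength
ridge = Ridge(alpha=1.0, solver="sag", max_iter=5000)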
The default value for max_iter is None, which does not mean unlimited iterations; instead, each iterative solver falls back to its own default (for example, 1000 iterations for 'sag'). In practice, values between 1000 and 10000 are commonly used, depending on the size and complexity of the dataset.
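One way to see what happens under the default is to fit with an iterative solver and inspect the fitted model's n_iter_ attribute, which scikit-learn populates for the 'sag' and 'lsqr' solvers; a small sketch:

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=500, n_features=10, noise=10, random_state=0)

# With max_iter=None, 'sag' falls back to its solver-specific default of 1000
ridge = Ridge(solver="sag", max_iter=None, random_state=0)
ridge.fit(X, y)
print(ridge.n_iter_)  # iterations the solver actually performed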
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=20, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different max_iter values
max_iter_values = [100, 1000, 5000, 10000]
mse_scores = []
for max_iter in max_iter_values:
    ridge = Ridge(max_iter=max_iter)
    ridge.fit(X_train, y_train)
    y_pred = ridge.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"max_iter={max_iter}, MSE: {mse:.3f}")
Running the example gives an output like:
max_iter=100, MSE: 380.177
max_iter=1000, MSE: 380.177
max_iter=5000, MSE: 380.177
max_iter=10000, MSE: 380.177
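The identical scores are expected: with the default solver='auto', scikit-learn typically resolves to the closed-form 'cholesky' solver for a dense problem like this one, so max_iter is never consulted. To see max_iter make a difference, force an iterative solver; a hedged sketch, where a very small max_iter combined with a tight tol stops 'sag' before it converges (the exact MSE values will vary, and scikit-learn may emit a ConvergenceWarning for the smallest setting):

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=1000, n_features=10, noise=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Force the iterative 'sag' solver so max_iter actually limits the optimization;
# a tight tol keeps the solver from stopping early on its own
for max_iter in [5, 100, 1000]:
    ridge = Ridge(solver="sag", max_iter=max_iter, tol=1e-10, random_state=42)
    ridge.fit(X_train, y_train)
    mse = mean_squared_error(y_test, ridge.predict(X_test))
    print(f"max_iter={max_iter}, MSE: {mse:.3f}")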
The key steps in this example are:
- Generate a synthetic regression dataset with noise
- Split the data into train and test sets
- Train Ridge models with different max_iter values
- Evaluate the mean squared error (MSE) of each model on the test set
Some tips and heuristics for setting max_iter:
- Start with the default value (None) and increase it only if the solver warns that it has not converged (see the sketch after this list)
- Higher values allow more iterations, which can improve convergence for iterative solvers
- Consider the computational cost of running a large number of iterations
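To check for non-convergence programmatically rather than by eye, one option is to capture scikit-learn's ConvergenceWarning while fitting; a sketch, assuming the 'sag' solver and a deliberately low max_iter to trigger the warning:

import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=1000, n_features=10, noise=20, random_state=42)

# A tiny max_iter with a tight tol makes 'sag' stop before converging
ridge = Ridge(solver="sag", max_iter=5, tol=1e-10, random_state=42)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    ridge.fit(X, y)

if any(issubclass(w.category, ConvergenceWarning) for w in caught):
    print("Solver hit max_iter before converging; consider increasing it")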
Issues to consider:
- The optimal number of iterations depends on the size and complexity of the dataset
- Too few iterations can result in poor convergence and suboptimal performance
- Setting max_iter too high can be computationally expensive with diminishing returns