The `tol` parameter in scikit-learn's `Ridge` regressor controls the precision of the solution and serves as the stopping criterion for the iterative solvers.
Ridge regression is a regularized linear regression technique that adds an L2 penalty term to the ordinary least squares objective function. This helps to prevent overfitting and can improve generalization performance.
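For reference, `Ridge` minimizes the least squares loss plus the squared L2 norm of the coefficient vector, weighted by the `alpha` parameter:

$$\min_w \; \lVert Xw - y \rVert_2^2 + \alpha \lVert w \rVert_2^2$$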
The `tol` parameter specifies the tolerance the solver must reach before stopping: lower values lead to more precise solutions but require more iterations and longer training times. Note that `tol` only affects the iterative solvers (`sparse_cg`, `lsqr`, `sag`, `saga`, and `lbfgs`); the direct `svd` and `cholesky` solvers compute a closed-form solution and ignore it.
The default value for `tol` is 1e-4 (lowered from 1e-3 in scikit-learn 1.2).
In practice, values between 0.1 and 0.0001 are commonly used depending on the desired balance between solution quality and computational efficiency.
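A quick way to see which solvers actually use `tol` is to check the fitted model's `n_iter_` attribute, which scikit-learn populates only for the `lsqr` and `sag` solvers. A minimal sketch:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=1000, n_features=10, noise=0.5, random_state=42)

# 'svd' computes a closed-form solution and ignores tol (n_iter_ stays None);
# 'lsqr' iterates until the requested precision is reached
for solver in ["svd", "lsqr"]:
    ridge = Ridge(solver=solver, tol=1e-6).fit(X, y)
    print(f"solver={solver}, n_iter_={ridge.n_iter_}")
```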
```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5,
                       n_targets=1, noise=0.5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train and evaluate a Ridge model for each tol value
tol_values = [0.1, 0.01, 0.001, 0.0001]
mse_scores = []

for tol in tol_values:
    ridge = Ridge(tol=tol)
    ridge.fit(X_train, y_train)
    y_pred = ridge.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"tol={tol}, MSE: {mse:.3f}")
```
Running the example gives an output like:
```
tol=0.1, MSE: 0.200
tol=0.01, MSE: 0.200
tol=0.001, MSE: 0.200
tol=0.0001, MSE: 0.200
```
The key steps in this example are:

- Generate a synthetic regression dataset with informative and noise features
- Split the data into train and test sets
- Train `Ridge` models with different `tol` values
- Evaluate the mean squared error of each model on the test set
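Note that the default solver solves the problem almost exactly regardless of `tol`, which is why the MSE barely changes above. To see the stopping criterion actually at work, you can sweep `tol` with the iterative `sag` solver and inspect its iteration count via `n_iter_`. A minimal self-contained sketch:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=1000, n_features=10, noise=0.5, random_state=42)

# Tighter tolerances make the 'sag' solver run more iterations before stopping
for tol in [1e-1, 1e-3, 1e-5]:
    ridge = Ridge(solver="sag", tol=tol, max_iter=100000, random_state=42)
    ridge.fit(X, y)
    print(f"tol={tol}, iterations: {ridge.n_iter_}")
```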
Some tips and heuristics for setting `tol`:

- Start with the default value (1e-4 in scikit-learn 1.2 and later) and adjust as needed based on the problem
- Use a lower `tol` for high-precision solutions; increase it for faster training (see the timing sketch below)
- Find a balance between solution quality and computational cost
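As a rough way to gauge the cost side of that tradeoff, you can time fits at different tolerances. A minimal sketch using the iterative `lsqr` solver on a larger synthetic dataset (timings will vary by machine and data):

```python
import time

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=50000, n_features=200, noise=0.5, random_state=42)

# Time how long each fit takes as the tolerance tightens
for tol in [1e-1, 1e-3, 1e-5]:
    start = time.perf_counter()
    Ridge(solver="lsqr", tol=tol).fit(X, y)
    elapsed = time.perf_counter() - start
    print(f"tol={tol}, fit time: {elapsed:.3f}s")
```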
Issues to consider:

- Very low tolerance values can lead to long training times, or to the solver hitting `max_iter` before converging (see the sketch after this list)
- Extremely high tolerance may result in poor-quality solutions
- The optimal value depends on the scale of the data and the desired precision-versus-speed tradeoff
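One failure mode worth knowing: if `tol` is very tight and `max_iter` is too small, the iterative solvers stop before reaching the requested precision and scikit-learn emits a `ConvergenceWarning`. A minimal sketch with the `sag` solver and a deliberately tiny iteration budget:

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=1000, n_features=10, noise=0.5, random_state=42)

# An unreachable precision within a tiny iteration budget triggers a warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    Ridge(solver="sag", tol=1e-12, max_iter=5, random_state=42).fit(X, y)

for w in caught:
    print(w.message)
```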