
Configure Ridge "fit_intercept" Parameter

The fit_intercept parameter in scikit-learn’s Ridge class controls whether the model estimates an intercept (constant) term in addition to the feature coefficients.

Ridge regression is a regularized version of linear regression that adds an L2 penalty term to the loss function, which helps to prevent overfitting.
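To make the L2 penalty concrete, here is a minimal sketch that compares Ridge with fit_intercept=False against the textbook closed-form solution w = (XᵀX + alpha·I)⁻¹ Xᵀy. The dataset sizes and variable names such as w_closed_form are chosen for illustration only, and the match assumes dense input with the default solver.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# Small synthetic problem (sizes are illustrative)
X, y = make_regression(n_samples=100, n_features=3, noise=10, random_state=0)
alpha = 1.0

# Closed-form ridge solution without an intercept:
# solve (X^T X + alpha * I) w = X^T y
n_features = X.shape[1]
w_closed_form = np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# The same problem solved by scikit-learn, with no intercept term
model = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)

print("Closed form:", w_closed_form)
print("Ridge coef_:", model.coef_)
print("Match:", np.allclose(w_closed_form, model.coef_))
```

A larger alpha adds more weight to the diagonal of XᵀX, which shrinks the coefficients toward zero.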

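By default, fit_intercept is set to True, which means that the model will estimate an intercept term. This is generally recommended, as the intercept captures the baseline level of the response variable, that is, the prediction when every feature is zero. The L2 penalty applies only to the feature coefficients, not to the intercept.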
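As a rough illustration of what the intercept represents, the sketch below fits both settings on data with a known offset and inspects the intercept_ attribute. The bias=25.0 value and the variable names are assumptions made for this example. It relies on the fact that, for dense data without sample weights, the fitted intercept equals the mean of y minus the feature means multiplied by the coefficients.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

# bias=25.0 shifts the target so there is a nonzero intercept to recover
X, y = make_regression(n_samples=200, n_features=2, noise=5, bias=25.0, random_state=0)

with_intercept = Ridge(alpha=1.0, fit_intercept=True).fit(X, y)
without_intercept = Ridge(alpha=1.0, fit_intercept=False).fit(X, y)

# Intercept implied by the coefficients and the sample means
implied = y.mean() - X.mean(axis=0) @ with_intercept.coef_

print(f"Fitted intercept:  {with_intercept.intercept_:.3f}")
print(f"Implied intercept: {implied:.3f}")
print(f"intercept_ when fit_intercept=False: {without_intercept.intercept_}")
```

When fit_intercept=False, intercept_ is simply reported as 0.0 and predictions are forced through the origin.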

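However, there may be cases where setting fit_intercept to False is appropriate, such as when both the features and the target are already centered around zero, or when a constant column is included in the input features. Note that in the second case the coefficient of the constant column is subject to the L2 penalty like any other coefficient, so the result is not identical to fitting an unpenalized intercept.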
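The centered-data case can be checked directly. The following sketch, using an illustrative dataset and hypothetical variable names, centers X and y by hand and fits with fit_intercept=False, then compares the coefficients with those from fit_intercept=True on the raw data. It assumes dense input and no sample weights.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=200, n_features=3, noise=5, bias=40.0, random_state=1)
X = X + 3.0  # shift the features so they are clearly not centered

# Option A: let Ridge handle the intercept on the raw data
raw_fit = Ridge(alpha=1.0, fit_intercept=True).fit(X, y)

# Option B: center features and target manually, then skip the intercept
X_mean, y_mean = X.mean(axis=0), y.mean()
centered_fit = Ridge(alpha=1.0, fit_intercept=False).fit(X - X_mean, y - y_mean)

print("Coefficients match:", np.allclose(raw_fit.coef_, centered_fit.coef_))

# Predictions from option B need the means added back
pred_a = raw_fit.predict(X[:5])
pred_b = centered_fit.predict(X[:5] - X_mean) + y_mean
print("Predictions match:", np.allclose(pred_a, pred_b))
```

With option B, the training means must be stored and reapplied to any new data before and after calling predict.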

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with and without intercept
ridge_intercept = Ridge(alpha=1.0, fit_intercept=True)
ridge_no_intercept = Ridge(alpha=1.0, fit_intercept=False)

ridge_intercept.fit(X_train, y_train)
ridge_no_intercept.fit(X_train, y_train)

# Evaluate performance
y_pred_intercept = ridge_intercept.predict(X_test)
y_pred_no_intercept = ridge_no_intercept.predict(X_test)

mse_intercept = mean_squared_error(y_test, y_pred_intercept)
mse_no_intercept = mean_squared_error(y_test, y_pred_no_intercept)

print(f"With intercept: MSE = {mse_intercept:.3f}")
print(f"Without intercept: MSE = {mse_no_intercept:.3f}")
print(f"\nIntercept value: {ridge_intercept.intercept_:.3f}")

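Running the example gives an output like the following. The two MSE values are nearly identical because make_regression uses bias=0.0 by default, so the true intercept is zero and the fitted intercept is close to it: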

With intercept: MSE = 105.786
Without intercept: MSE = 105.933

Intercept value: 0.014

The key steps in this example are:

  1. Generate a synthetic regression dataset with a linear relationship and noise
  2. Split the data into train and test sets
  3. Train Ridge models with fit_intercept set to True and False
  4. Evaluate the models using mean squared error and compare their performance
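To see a case where the setting matters, the following sketch repeats the same four steps with a nonzero offset in the target by passing bias=50.0 to make_regression. The bias value is an illustrative assumption and is not part of the original example.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Same setup as before, but the target is shifted by a constant of 50
X, y = make_regression(n_samples=100, n_features=1, noise=10, bias=50.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

ridge_intercept = Ridge(alpha=1.0, fit_intercept=True).fit(X_train, y_train)
ridge_no_intercept = Ridge(alpha=1.0, fit_intercept=False).fit(X_train, y_train)

mse_intercept = mean_squared_error(y_test, ridge_intercept.predict(X_test))
mse_no_intercept = mean_squared_error(y_test, ridge_no_intercept.predict(X_test))

print(f"With intercept: MSE = {mse_intercept:.3f}")
print(f"Without intercept: MSE = {mse_no_intercept:.3f}")
print(f"Intercept value: {ridge_intercept.intercept_:.3f}")
```

Here the model without an intercept cannot represent the constant shift, so its test error should be far larger than that of the model with an intercept.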



