The fit_intercept parameter in scikit-learn's SGDRegressor controls whether the model includes a bias term (intercept) in the linear model.
Stochastic Gradient Descent (SGD) is an optimization algorithm used to find the parameters of a linear model that minimize the loss function. It updates the model parameters iteratively, using one training sample (or a small batch) at a time rather than the full dataset.
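To make the update rule concrete, here is a minimal hand-rolled sketch of SGD with squared-error loss on a single feature. It is a simplified illustration, not SGDRegressor's actual implementation (which adds a learning-rate schedule, regularization, shuffling, and stopping criteria); the fixed learning rate and epoch count are assumptions for the sketch.

import numpy as np

# Minimal sketch of SGD for squared-error loss on a one-feature linear model
# y_hat = w * x + b. Simplified compared with SGDRegressor: no learning-rate
# schedule, regularization, shuffling, or stopping criterion.
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 3.0 * x + 5.0 + rng.normal(scale=0.1, size=200)

w, b = 0.0, 0.0
eta = 0.01  # fixed learning rate (an assumption; the real estimator uses a schedule)
for _ in range(20):            # 20 passes over the data
    for xi, yi in zip(x, y):
        error = (w * xi + b) - yi
        w -= eta * error * xi  # gradient of 0.5 * error**2 with respect to w
        b -= eta * error       # gradient with respect to b; this update is what
                               # fit_intercept=False turns off

print(f"learned w: {w:.2f}, learned b: {b:.2f}")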
Setting fit_intercept=True allows the model to learn an intercept, which captures the baseline level of the target variable. Setting it to False forces the model through the origin, which may be appropriate if you know the relationship between features and target should pass through (0, 0).
The default value for fit_intercept is True.
In practice, fit_intercept=True is commonly used unless there's a specific reason to force the model through the origin.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=1, noise=0.1, bias=5.0, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train models with different fit_intercept values
sgd_with_intercept = SGDRegressor(fit_intercept=True, random_state=42)
sgd_without_intercept = SGDRegressor(fit_intercept=False, random_state=42)
sgd_with_intercept.fit(X_train, y_train)
sgd_without_intercept.fit(X_train, y_train)
# Make predictions
y_pred_with = sgd_with_intercept.predict(X_test)
y_pred_without = sgd_without_intercept.predict(X_test)
# Calculate MSE
mse_with = mean_squared_error(y_test, y_pred_with)
mse_without = mean_squared_error(y_test, y_pred_without)
print(f"MSE (with intercept): {mse_with:.4f}")
print(f"MSE (without intercept): {mse_without:.4f}")
print(f"Intercept (with intercept): {sgd_with_intercept.intercept_[0]:.4f}")
print(f"Intercept (without intercept): {sgd_without_intercept.intercept_}")
Running the example gives an output like:
MSE (with intercept): 0.0108
MSE (without intercept): 25.0518
Intercept (with intercept): 4.9986
Intercept (without intercept): [0.]
The key steps in this example are:
- Generate a synthetic regression dataset with a known bias
- Split the data into train and test sets
- Train SGDRegressor models with fit_intercept=True and fit_intercept=False
- Evaluate the mean squared error of each model on the test set
- Compare the intercept values of both models (an optional sanity check follows below)
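As an optional sanity check that is not part of the original example, you can compare the intercept learned by SGD against ordinary least squares, which computes its intercept in closed form. The snippet below assumes X_train, y_train, and sgd_with_intercept from the code above.

from sklearn.linear_model import LinearRegression

# Closed-form least squares as a reference point for the SGD intercept.
# Assumes X_train, y_train, and sgd_with_intercept from the example above.
ols = LinearRegression(fit_intercept=True)
ols.fit(X_train, y_train)

print(f"OLS intercept: {ols.intercept_:.4f}")
print(f"SGD intercept: {sgd_with_intercept.intercept_[0]:.4f}")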
Some tips for deciding when to use fit_intercept=True or fit_intercept=False:
- Use fit_intercept=True when you expect a non-zero baseline in your target variable
- Set fit_intercept=False if you're certain the relationship passes through the origin
- If unsure, start with fit_intercept=True and compare performance with False (one way to do this is sketched after this list)
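One way to make that comparison is to cross-validate both settings rather than rely on a single train/test split. The sketch below assumes the X and y generated by the synthetic dataset above.

from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import cross_val_score

# Compare both settings with 5-fold cross-validation.
# Assumes X and y from the synthetic dataset generated above.
for flag in (True, False):
    model = SGDRegressor(fit_intercept=flag, random_state=42)
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    print(f"fit_intercept={flag}: mean MSE = {-scores.mean():.4f}")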
Issues to consider:
- Setting fit_intercept=False when an intercept is needed can lead to poor model performance
- Including an intercept when not needed may slightly increase model complexity
- The impact of fit_intercept can vary depending on feature scaling and centering (see the sketch after this list)
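As a rough illustration of the last point, centering the features and target beforehand can make fit_intercept=False perform comparably to fit_intercept=True, since the intercept of the centered problem is zero. The sketch below assumes the X_train/X_test/y_train/y_test split from the example above.

from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Center the features and target so the true intercept of the centered problem is zero.
# Assumes X_train, X_test, y_train, y_test from the example above.
X_mean, y_mean = X_train.mean(axis=0), y_train.mean()
X_train_c, X_test_c = X_train - X_mean, X_test - X_mean
y_train_c = y_train - y_mean

sgd_centered = SGDRegressor(fit_intercept=False, random_state=42)
sgd_centered.fit(X_train_c, y_train_c)

# Add the target mean back when predicting on the original scale.
y_pred_c = sgd_centered.predict(X_test_c) + y_mean
print(f"MSE (centered, no intercept): {mean_squared_error(y_test, y_pred_c):.4f}")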