The fit_intercept parameter in scikit-learn’s SGDRegressor controls whether the model includes a bias term (intercept) in the linear model.
Stochastic Gradient Descent (SGD) is an optimization algorithm that finds the parameters of a linear model by minimizing a loss function. Rather than computing the gradient over the full dataset at each step, it updates the parameters iteratively using one training sample (or a small batch) at a time.
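The per-sample update can be sketched in plain NumPy. This is a toy illustration of the idea, not scikit-learn's actual implementation; the constant learning rate and the function name here are assumptions for the sketch:

```python
import numpy as np

def sgd_step(w, b, x_i, y_i, lr=0.01):
    """One SGD update for squared-error loss on a single sample (toy sketch)."""
    y_hat = np.dot(w, x_i) + b   # current prediction
    grad = y_hat - y_i           # derivative of 0.5 * (y_hat - y)**2 w.r.t. y_hat
    w = w - lr * grad * x_i      # update weights
    b = b - lr * grad            # update intercept (skipped if fit_intercept=False)
    return w, b

# Toy usage: recover y = 2x + 5 from noisy samples
rng = np.random.default_rng(0)
w, b = np.zeros(1), 0.0
for _ in range(5000):
    x_i = rng.normal(size=1)
    y_i = 2.0 * x_i[0] + 5.0 + rng.normal(scale=0.1)
    w, b = sgd_step(w, b, x_i, y_i)
print(w, b)  # w approaches 2.0, b approaches 5.0
```

Note that the intercept b is updated alongside the weights; setting fit_intercept=False amounts to freezing b at zero.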
Setting fit_intercept=True allows the model to learn an intercept, which can capture the baseline level of the target variable. Setting it to False forces the model through the origin, which may be appropriate if you know the relationship between features and target should pass through (0,0).
The default value for fit_intercept is True.
In practice, fit_intercept=True is commonly used unless there’s a specific reason to force the model through the origin.
```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=1, noise=0.1, bias=5.0, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train models with different fit_intercept values
sgd_with_intercept = SGDRegressor(fit_intercept=True, random_state=42)
sgd_without_intercept = SGDRegressor(fit_intercept=False, random_state=42)
sgd_with_intercept.fit(X_train, y_train)
sgd_without_intercept.fit(X_train, y_train)

# Make predictions
y_pred_with = sgd_with_intercept.predict(X_test)
y_pred_without = sgd_without_intercept.predict(X_test)

# Calculate MSE
mse_with = mean_squared_error(y_test, y_pred_with)
mse_without = mean_squared_error(y_test, y_pred_without)
print(f"MSE (with intercept): {mse_with:.4f}")
print(f"MSE (without intercept): {mse_without:.4f}")
print(f"Intercept (with intercept): {sgd_with_intercept.intercept_[0]:.4f}")
print(f"Intercept (without intercept): {sgd_without_intercept.intercept_}")
```
Running the example gives an output like:
```
MSE (with intercept): 0.0108
MSE (without intercept): 25.0518
Intercept (with intercept): 4.9986
Intercept (without intercept): [0.]
```
The key steps in this example are:
- Generate a synthetic regression dataset with a known bias
- Split the data into train and test sets
- Train SGDRegressor models with fit_intercept=True and fit_intercept=False
- Evaluate the mean squared error of each model on the test set
- Compare the intercept values of both models
Some tips for deciding when to use fit_intercept=True or fit_intercept=False:
- Use fit_intercept=True when you expect a non-zero baseline in your target variable
- Set fit_intercept=False if you’re certain the relationship passes through the origin
- If unsure, start with fit_intercept=True and compare performance with fit_intercept=False
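Since fit_intercept is an ordinary estimator parameter, one way to compare the two settings is to treat it as a hyperparameter in a cross-validated grid search. A minimal sketch, assuming a synthetic dataset like the one above:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data with a known non-zero baseline
X, y = make_regression(n_samples=500, n_features=3, noise=0.1,
                       bias=5.0, random_state=42)

# Let cross-validation pick the better fit_intercept setting
grid = GridSearchCV(SGDRegressor(random_state=42),
                    param_grid={"fit_intercept": [True, False]},
                    scoring="neg_mean_squared_error", cv=5)
grid.fit(X, y)
print(grid.best_params_)
```

On data with a non-zero baseline like this, the search should select fit_intercept=True.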
Issues to consider:
- Setting fit_intercept=False when an intercept is needed can lead to poor model performance
- Including an intercept when not needed may slightly increase model complexity
- The impact of fit_intercept can vary depending on feature scaling and centering
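To illustrate the last point: centering the target (subtracting its mean) absorbs the baseline, after which fit_intercept=False can perform well on the same data that previously required an intercept. A rough sketch:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Same kind of data as the main example: non-zero bias
X, y = make_regression(n_samples=1000, n_features=1, noise=0.1,
                       bias=5.0, random_state=42)

# Center the target so the relationship approximately passes through the origin
y_centered = y - y.mean()

model = SGDRegressor(fit_intercept=False, random_state=42)
model.fit(X, y_centered)
mse = mean_squared_error(y_centered, model.predict(X))
print(f"MSE without intercept on centered target: {mse:.4f}")
```

In practice, remember to add the stored target mean back to any predictions, and to apply the same centering at inference time.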