
Configure SGDRegressor "fit_intercept" Parameter

The fit_intercept parameter in scikit-learn’s SGDRegressor controls whether the model includes a bias term (intercept) in the linear model.

Stochastic Gradient Descent (SGD) is an optimization algorithm that iteratively adjusts the parameters of a linear model to minimize a loss function. Rather than computing the gradient over the full training set, SGDRegressor estimates it from one training sample at a time and updates the parameters after each sample.
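
To make the role of the intercept in these updates concrete, here is a minimal NumPy sketch of per-sample SGD on squared loss. It is a teaching sketch, not scikit-learn's implementation (which also applies regularization and a learning-rate schedule). The function name sgd_linear_fit, the constant learning rate, and the synthetic data (slope 3.0, bias 5.0) are all assumptions made for illustration.

```python
import numpy as np

def sgd_linear_fit(X, y, fit_intercept=True, lr=0.01, epochs=20, seed=0):
    """Per-sample SGD on squared loss (hypothetical helper, not sklearn's code)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch
        for i in rng.permutation(n_samples):
            error = X[i] @ w + b - y[i]
            w -= lr * error * X[i]
            if fit_intercept:
                # The intercept is only updated when fit_intercept=True
                b -= lr * error
    return w, b

# Synthetic data with an assumed slope of 3.0 and bias of 5.0
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(scale=0.1, size=500)

w_with, b_with = sgd_linear_fit(X, y, fit_intercept=True)
w_without, b_without = sgd_linear_fit(X, y, fit_intercept=False)

print(f"with intercept:    w={w_with[0]:.3f}, b={b_with:.3f}")
print(f"without intercept: w={w_without[0]:.3f}, b={b_without:.3f}")
```

With fit_intercept=False the intercept never leaves zero, so the constant offset in the target cannot be absorbed by the model.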

Setting fit_intercept=True lets the model learn an intercept, the predicted value when every feature is zero, which captures the baseline level of the target variable. Setting it to False fixes the intercept at zero and forces the fitted line through the origin. This is appropriate only if you know the relationship between features and target passes through (0,0), or if the data has already been centered.
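
One way to see the difference is to ask each model for a prediction at the origin. The short sketch below assumes a small noiseless dataset with a made-up slope of 2.0 and bias of 5.0; the variable names are illustrative only.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Hypothetical data: y = 2.0 * x + 5.0, so the true line does not pass through (0, 0)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 2.0 * X[:, 0] + 5.0

with_b = SGDRegressor(fit_intercept=True, random_state=0).fit(X, y)
without_b = SGDRegressor(fit_intercept=False, random_state=0).fit(X, y)

# Predict at x = 0: this is exactly the intercept of each fitted model
origin = np.zeros((1, 1))
pred_with = with_b.predict(origin)[0]
pred_without = without_b.predict(origin)[0]

print(f"Prediction at origin (fit_intercept=True):  {pred_with:.4f}")
print(f"Prediction at origin (fit_intercept=False): {pred_without:.4f}")
```

The model without an intercept always predicts zero at the origin, regardless of the data it was trained on.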

The default value for fit_intercept is True.
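
If in doubt, the default can be checked on the installed scikit-learn version with get_params(), as in this small sketch:

```python
from sklearn.linear_model import SGDRegressor

# Inspect the parameters of an estimator created with no arguments
model = SGDRegressor()
default_value = model.get_params()["fit_intercept"]

print(f"Default fit_intercept: {default_value}")
```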

In practice, fit_intercept=True is commonly used unless there’s a specific reason to force the model through the origin.
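
One such reason is data that has already been centered. The sketch below, which assumes made-up data (slope 3.0, bias 5.0, feature mean 2.0), subtracts the means from X and y, fits with fit_intercept=False, and then recovers the intercept implied by the centering. It is an illustration under those assumptions, not a recommended workflow.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Hypothetical data with a non-zero feature mean and a bias of 5.0
rng = np.random.default_rng(1)
X = rng.normal(loc=2.0, size=(1000, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(scale=0.1, size=1000)

# Center features and target so the relationship passes through the origin
x_mean = X.mean(axis=0)
y_mean = y.mean()

model = SGDRegressor(fit_intercept=False, random_state=1)
model.fit(X - x_mean, y - y_mean)

# Intercept on the original scale, implied by the centering step
implied_intercept = y_mean - model.coef_ @ x_mean

print(f"Learned coefficient: {model.coef_[0]:.4f}")
print(f"Implied intercept:   {implied_intercept:.4f}")
```

To predict on new data with this model, the same x_mean must be subtracted from the features and y_mean added back to the output.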

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=1, noise=0.1, bias=5.0, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train models with different fit_intercept values
sgd_with_intercept = SGDRegressor(fit_intercept=True, random_state=42)
sgd_without_intercept = SGDRegressor(fit_intercept=False, random_state=42)

sgd_with_intercept.fit(X_train, y_train)
sgd_without_intercept.fit(X_train, y_train)

# Make predictions
y_pred_with = sgd_with_intercept.predict(X_test)
y_pred_without = sgd_without_intercept.predict(X_test)

# Calculate MSE
mse_with = mean_squared_error(y_test, y_pred_with)
mse_without = mean_squared_error(y_test, y_pred_without)

print(f"MSE (with intercept): {mse_with:.4f}")
print(f"MSE (without intercept): {mse_without:.4f}")
print(f"Intercept (with intercept): {sgd_with_intercept.intercept_[0]:.4f}")
print(f"Intercept (without intercept): {sgd_without_intercept.intercept_}")

Running the example produces output similar to the following (exact values may vary across scikit-learn versions):

MSE (with intercept): 0.0108
MSE (without intercept): 25.0518
Intercept (with intercept): 4.9986
Intercept (without intercept): [0.]
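
The gap between the two MSE values is roughly what one would expect. Assuming zero-mean features, a model without an intercept misses the target by about the bias on every prediction, so its error should be close to the bias squared (5.0² = 25) plus the noise variance (0.1² = 0.01). The sketch below checks this reasoning with closed-form least squares in NumPy rather than SGD; the slope of 3.0 is a made-up value.

```python
import numpy as np

# Zero-mean feature, bias 5.0, noise std 0.1; the slope 3.0 is hypothetical
rng = np.random.default_rng(42)
x = rng.normal(size=1000)
y = 3.0 * x + 5.0 + rng.normal(scale=0.1, size=1000)

# Least squares through the origin: w = sum(x*y) / sum(x*x)
w_origin = (x @ y) / (x @ x)
mse_origin = np.mean((y - w_origin * x) ** 2)

# Least squares with an intercept (np.polyfit returns slope first)
slope, intercept = np.polyfit(x, y, 1)
mse_full = np.mean((y - (slope * x + intercept)) ** 2)

print(f"MSE through origin: {mse_origin:.4f}")
print(f"MSE with intercept: {mse_full:.4f}")
```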

The key steps in this example are:

  1. Generate a synthetic regression dataset with a known bias
  2. Split the data into train and test sets
  3. Train SGDRegressor models with fit_intercept=True and fit_intercept=False
  4. Evaluate the mean squared error of each model on the test set
  5. Compare the intercept values of both models

Some tips for deciding when to use fit_intercept=True or fit_intercept=False:

Issues to consider:


