Configure SGDClassifier "fit_intercept" Parameter

The fit_intercept parameter in scikit-learn’s SGDClassifier determines whether to include a bias term (intercept) in the model.

Stochastic Gradient Descent (SGD) is an optimization algorithm used to find the parameters that minimize the loss function of a linear model. It updates the model’s parameters iteratively, estimating the gradient from one training sample at a time rather than from the full dataset.
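To make the role of the intercept in these updates concrete, here is a toy sketch of SGD with hinge loss written in NumPy. It is a simplified illustration, not scikit-learn's implementation: the function name sgd_hinge and the parameters lr, alpha, epochs and seed are hypothetical, and the constant learning rate is an assumption made for brevity.

```python
import numpy as np

def sgd_hinge(X, y, fit_intercept=True, lr=0.01, alpha=1e-4, epochs=20, seed=0):
    """Toy SGD for a linear classifier with hinge loss; y must be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        # Visit the samples one at a time, in a fresh random order each epoch
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w + b)
            w *= 1.0 - lr * alpha      # L2 shrinkage applied to the weights only
            if margin < 1.0:           # sample is misclassified or inside the margin
                w += lr * y[i] * X[i]
                if fit_intercept:      # the bias is only updated when requested
                    b += lr * y[i]
    return w, b

# Two well-separated blobs placed symmetrically around the origin
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.array([-1] * 100 + [1] * 100)

w, b = sgd_hinge(X, y, fit_intercept=False)
accuracy = np.mean(np.sign(X @ w + b) == y)
print("weights:", w, "intercept:", b, "accuracy:", accuracy)
```

With fit_intercept=False the bias b is never touched, so it stays at its initial value of 0 and only the weights are learned.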

When fit_intercept is set to True, the model learns an intercept term, allowing the decision boundary to be shifted from the origin. When set to False, the model assumes the decision boundary passes through the origin.
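As a quick check of this behavior, the following sketch fits both variants on a small synthetic dataset and inspects the learned coef_ and intercept_ attributes. The dataset settings and variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

with_b = SGDClassifier(fit_intercept=True, random_state=0).fit(X, y)
without_b = SGDClassifier(fit_intercept=False, random_state=0).fit(X, y)

# The decision score of a linear model is w . x + b
manual_scores = X @ with_b.coef_.ravel() + with_b.intercept_[0]
scores_match = np.allclose(manual_scores, with_b.decision_function(X))

print("Manual scores match decision_function:", scores_match)
print("Intercept with fit_intercept=True: ", with_b.intercept_)
print("Intercept with fit_intercept=False:", without_b.intercept_)
```

When fit_intercept=False, intercept_ stays at zero, so the decision score reduces to w . x and the boundary w . x = 0 necessarily contains the origin.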

The default value for fit_intercept is True.

In practice, fit_intercept=True is commonly used unless there’s a specific reason to force the decision boundary through the origin.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, f1_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=0, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train models with different fit_intercept values
sgd_with_intercept = SGDClassifier(fit_intercept=True, random_state=42)  # True is the default
sgd_without_intercept = SGDClassifier(fit_intercept=False, random_state=42)

sgd_with_intercept.fit(X_train, y_train)
sgd_without_intercept.fit(X_train, y_train)

# Make predictions
y_pred_with = sgd_with_intercept.predict(X_test)
y_pred_without = sgd_without_intercept.predict(X_test)

# Evaluate performance
accuracy_with = accuracy_score(y_test, y_pred_with)
accuracy_without = accuracy_score(y_test, y_pred_without)
f1_with = f1_score(y_test, y_pred_with)
f1_without = f1_score(y_test, y_pred_without)

print(f"With intercept - Accuracy: {accuracy_with:.3f}, F1-score: {f1_with:.3f}")
print(f"Without intercept - Accuracy: {accuracy_without:.3f}, F1-score: {f1_without:.3f}")

Running the example gives an output like:

With intercept - Accuracy: 0.705, F1-score: 0.709
Without intercept - Accuracy: 0.685, F1-score: 0.648

The key steps in this example are:

Generate a synthetic binary classification dataset
Split the data into train and test sets
Train two SGDClassifier models, one with fit_intercept=True and one with fit_intercept=False
Evaluate and compare the performance of both models using accuracy and F1-score

Some tips for deciding when to use fit_intercept:

Use fit_intercept=True (default) in most cases, especially when you’re unsure about the data distribution
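Set fit_intercept=False only if you’re confident the decision boundary should pass through the origin, for example because the data has already been centered
If your features are already centered (mean=0), setting fit_intercept=False might lead to faster convergence, but centering alone does not guarantee a zero intercept (for example, when the classes are imbalanced)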
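The centering tip can be tried out with a pipeline that standardizes the features before fitting a model without an intercept. This is a minimal sketch that reuses the dataset settings from the example above; the pipeline and the variable names are assumptions for illustration, and the resulting accuracy will depend on the data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# StandardScaler subtracts the training mean, so the features seen by SGD have mean 0
centered_no_intercept = make_pipeline(
    StandardScaler(),
    SGDClassifier(fit_intercept=False, random_state=42),
)
centered_no_intercept.fit(X_train, y_train)

# Confirm that the training features are centered after scaling
X_train_scaled = centered_no_intercept.named_steps["standardscaler"].transform(X_train)
max_abs_mean = abs(X_train_scaled.mean(axis=0)).max()

print(f"Largest absolute feature mean after scaling: {max_abs_mean:.2e}")
print(f"Test accuracy without intercept: {centered_no_intercept.score(X_test, y_test):.3f}")
```

Putting the scaler inside the pipeline means the test data is shifted by the training mean, which keeps the comparison fair.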

Issues to consider:

Setting fit_intercept=False when it’s needed can severely limit the model’s ability to fit the data
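Including an intercept when it’s not necessary adds one extra parameter to learn; the added complexity and overfitting risk are usually small
The impact of fit_intercept depends on the scale and location of your features: the further the data sits from the origin, the more the intercept tends to matter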
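The first issue above is easiest to see on a one-dimensional problem. In the sketch below (the data and variable names are hypothetical), every feature value is positive and the true threshold sits at x = 5. Without an intercept the score w * x has the same sign for every sample, so the model can only ever predict a single class.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X = rng.uniform(1.0, 9.0, size=(500, 1))   # every feature value is positive
y = (X[:, 0] > 5.0).astype(int)            # true threshold is at x = 5, not at the origin

no_intercept = SGDClassifier(fit_intercept=False, random_state=0).fit(X, y)
with_intercept = SGDClassifier(fit_intercept=True, random_state=0).fit(X, y)

# Without a bias term the sign of w * x cannot change across positive x values
labels_no_intercept = np.unique(no_intercept.predict(X))

print("Labels predicted without intercept:", labels_no_intercept)
print(f"Accuracy without intercept: {no_intercept.score(X, y):.3f}")
print(f"Accuracy with intercept:    {with_intercept.score(X, y):.3f}")
```

The model with an intercept is at least able to place a threshold between the two classes; how well SGD actually finds it depends on feature scaling and the number of iterations.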

See Also