
Configure SGDRegressor "warm_start" Parameter

The warm_start parameter in scikit-learn’s SGDRegressor determines whether to reuse the solution of the previous call to fit as initialization for the next fit.

Stochastic Gradient Descent (SGD) is an optimization algorithm that iteratively updates model parameters to minimize the loss function. It’s particularly useful for large-scale and sparse machine learning problems.
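
A minimal sketch of plain SGDRegressor usage on synthetic data (the dataset size and seed are arbitrary; max_iter=1000 and tol=1e-3 are the library defaults, written out here only to make the stopping behavior explicit):

from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

# Small synthetic regression problem, values chosen purely for illustration
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=0)

# SGD updates the coefficients sample by sample, for up to max_iter passes (epochs)
# over the data or until the loss stops improving by more than tol
model = SGDRegressor(max_iter=1000, tol=1e-3, random_state=0)
model.fit(X, y)

print(model.coef_[:3])  # learned coefficients for the first three features
print(model.n_iter_)    # number of epochs actually run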

When warm_start is set to True, the model retains the coefficients learned from the previous fit and continues training from that point. This can be beneficial for incremental learning or when fine-tuning a model with new data.
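
A minimal sketch of this behavior, using arbitrary data split into two sequential batches: with warm_start=True, the coefficients learned on the first batch persist and are refined, not reset, by the second fit.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

# One dataset split into two sequential batches
X, y = make_regression(n_samples=2000, n_features=10, noise=0.1, random_state=0)
X_first, y_first = X[:1000], y[:1000]
X_second, y_second = X[1000:], y[1000:]

model = SGDRegressor(warm_start=True, random_state=0)
model.fit(X_first, y_first)            # first fit starts from scratch
coef_after_first = model.coef_.copy()

model.fit(X_second, y_second)          # second fit starts from the coefficients above
print("Largest coefficient change on the second fit:",
      np.max(np.abs(model.coef_ - coef_after_first)))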

The default value for warm_start is False. It is commonly set to True when a large dataset is processed in batches or in online learning scenarios where the model is updated as new data arrives.
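
For instance, a larger dataset can be streamed through fit in chunks with warm_start=True, so each call refines the coefficients learned from the chunks seen so far (the chunk count and data below are placeholders; for strict single-pass online learning, partial_fit is often the better tool):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

X, y = make_regression(n_samples=10000, n_features=20, noise=0.1, random_state=42)

model = SGDRegressor(warm_start=True, random_state=42)

# Feed the data in chunks; each fit call initializes from the current coefficients
for X_chunk, y_chunk in zip(np.array_split(X, 5), np.array_split(y, 5)):
    model.fit(X_chunk, y_chunk)

print(model.coef_[:3])

The complete example below fits the same training data repeatedly with warm_start=False and warm_start=True, comparing accuracy and training time: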

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
import time

# Generate synthetic dataset
X, y = make_regression(n_samples=10000, n_features=20, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with warm_start=False
sgd_cold = SGDRegressor(random_state=42)
start_time = time.time()
for _ in range(5):
    sgd_cold.fit(X_train, y_train)  # each fit starts again from freshly initialized coefficients
cold_time = time.time() - start_time
y_pred_cold = sgd_cold.predict(X_test)
mse_cold = mean_squared_error(y_test, y_pred_cold)

# Train with warm_start=True
sgd_warm = SGDRegressor(warm_start=True, random_state=42)
start_time = time.time()
for _ in range(5):
    sgd_warm.fit(X_train, y_train)  # each fit continues from the previously learned coefficients
warm_time = time.time() - start_time
y_pred_warm = sgd_warm.predict(X_test)
mse_warm = mean_squared_error(y_test, y_pred_warm)

print(f"Cold start - MSE: {mse_cold:.4f}, Time: {cold_time:.4f}s")
print(f"Warm start - MSE: {mse_warm:.4f}, Time: {warm_time:.4f}s")

Running the example gives an output like:

Cold start - MSE: 0.0106, Time: 0.0286s
Warm start - MSE: 0.0107, Time: 0.0250s

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Train SGDRegressor with warm_start=False, fitting multiple times
  4. Train SGDRegressor with warm_start=True, fitting multiple times
  5. Compare the mean squared error and training time for both approaches
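
The warm-start model is slightly faster here, most likely because after the first call each subsequent fit begins at an already good solution and tends to hit the default stopping criterion (tol=1e-3) in fewer epochs. A rough way to check this, assuming the same synthetic data as above, is to inspect n_iter_ after each call:

from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

X, y = make_regression(n_samples=10000, n_features=20, noise=0.1, random_state=42)

model = SGDRegressor(warm_start=True, random_state=42)
model.fit(X, y)
print("Epochs on first fit: ", model.n_iter_)   # starts from zero coefficients
model.fit(X, y)
print("Epochs on second fit:", model.n_iter_)   # starts from the previous solution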

Some tips and heuristics for using warm_start:

  1. Keep the default warm_start=False when each call to fit should produce an independent model, for example inside cross-validation.
  2. Set warm_start=True when refitting the same estimator repeatedly, such as after adjusting a hyperparameter or adding a modest amount of new data, so optimization starts from the previous coefficients rather than from scratch (a sketch of this pattern follows below).
  3. For genuine online or out-of-core learning, prefer partial_fit, which performs a single pass over each batch and does not need warm_start.

Issues to consider:

  1. Each call to fit with warm_start=True still runs a full optimization of up to max_iter epochs, so it is not a single incremental update.
  2. If new data differs substantially from the data used in earlier fits, starting from the old coefficients can slow convergence or hurt accuracy; in that case, create a fresh estimator.
  3. The timing gains from warm starting on small datasets, as in the example above, are often modest and can vary from run to run.
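
As a sketch of the refitting pattern mentioned in the tips above (the alpha values are arbitrary), warm_start lets a second fit with an adjusted regularization strength start from the coefficients of the first rather than from scratch:

from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

X, y = make_regression(n_samples=5000, n_features=20, noise=0.1, random_state=42)

model = SGDRegressor(warm_start=True, alpha=0.0001, random_state=42)
model.fit(X, y)                   # initial fit with the first regularization strength

model.set_params(alpha=0.001)     # adjust a hyperparameter...
model.fit(X, y)                   # ...and refit, starting from the previous coefficients
print(model.coef_[:3])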

See Also