The warm_start parameter in scikit-learn's MLPRegressor determines whether the solution from the previous call to fit is reused as the initialization for the next fit.
MLPRegressor (Multi-layer Perceptron Regressor) is a feed-forward neural network model for regression tasks. It learns a non-linear function that maps input features to continuous target values.
When warm_start is set to True, the model reuses the weights from the previous fit as the starting point for the next one, which can speed up training when fitting the model multiple times. The default value is False, meaning each call to fit() discards the previous solution and re-initializes the weights. Setting it to True is useful when you want to train the model incrementally or fine-tune an existing model.
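As a minimal sketch of this behavior (the tiny toy dataset and small max_iter below are illustrative assumptions), repeated calls to fit with warm_start=True keep lowering the training loss, because each call picks up from the weights where the previous one stopped:
from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Deliberately small max_iter so each fit call stops early
# (scikit-learn may emit a ConvergenceWarning here)
model = MLPRegressor(max_iter=20, warm_start=True, random_state=0)
for i in range(3):
    model.fit(X, y)  # continues from the weights of the previous call
    print(f"fit {i + 1}: training loss = {model.loss_:.4f}")
The more complete example below compares a cold-start model against a warm-start model on a synthetic regression task, measuring both test error and training time: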
from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import time
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize models (warm_start defaults to False for the cold-start model)
mlp_cold = MLPRegressor(hidden_layer_sizes=(100, 50), max_iter=50, random_state=42)
mlp_warm = MLPRegressor(hidden_layer_sizes=(100, 50), max_iter=50, random_state=42, warm_start=True)
# Train and evaluate models
n_iterations = 5
cold_errors, warm_errors = [], []
cold_times, warm_times = [], []
for _ in range(n_iterations):
    # Cold start: weights are re-initialized on every fit
    start_time = time.time()
    mlp_cold.fit(X_train, y_train)
    cold_times.append(time.time() - start_time)
    y_pred_cold = mlp_cold.predict(X_test)
    cold_errors.append(mean_squared_error(y_test, y_pred_cold))
    # Warm start: weights carry over from the previous fit
    start_time = time.time()
    mlp_warm.fit(X_train, y_train)
    warm_times.append(time.time() - start_time)
    y_pred_warm = mlp_warm.predict(X_test)
    warm_errors.append(mean_squared_error(y_test, y_pred_warm))
# Print results
for i in range(n_iterations):
    print(f"Iteration {i+1}:")
    print(f"  Cold start - MSE: {cold_errors[i]:.4f}, Time: {cold_times[i]:.4f}s")
    print(f"  Warm start - MSE: {warm_errors[i]:.4f}, Time: {warm_times[i]:.4f}s")
Running the example gives an output like:
Iteration 1:
Cold start - MSE: 1629.8617, Time: 0.2069s
Warm start - MSE: 1629.8617, Time: 0.1703s
Iteration 2:
Cold start - MSE: 1629.8617, Time: 0.1767s
Warm start - MSE: 53.1931, Time: 0.1650s
Iteration 3:
Cold start - MSE: 1629.8617, Time: 0.1664s
Warm start - MSE: 16.3913, Time: 0.1602s
Iteration 4:
Cold start - MSE: 1629.8617, Time: 0.1701s
Warm start - MSE: 10.9534, Time: 0.1629s
Iteration 5:
Cold start - MSE: 1629.8617, Time: 0.1652s
Warm start - MSE: 8.5747, Time: 0.1638s
Notice that the cold-start model reports the same MSE on every iteration: each fit() call re-initializes the weights from the same random_state and runs the same 50 iterations, so it reaches the same solution every time. The warm-start model keeps improving because each fit() continues from the previous weights.
The key steps in this example are:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Create two MLPRegressor instances, one with warm_start left at its default of False and another with warm_start=True
- Train both models multiple times, measuring performance and training time
- Compare the mean squared error and training time for each iteration
Tips and heuristics for using warm_start:
- Use warm_start=True when fine-tuning a model or training incrementally
- Can potentially speed up training when fitting the model multiple times
- Useful for online learning scenarios where data arrives in batches (see the sketch after this list)
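Here is a minimal sketch of that batch scenario (the five-way batch split and the hyperparameters are illustrative assumptions, not taken from the example above). The model is refit on each batch as it arrives, and warm_start=True lets the weights carry over:
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=0)
# Simulate data arriving in five batches
batches = np.array_split(np.arange(len(X)), 5)

model = MLPRegressor(hidden_layer_sizes=(50,), max_iter=30,
                     warm_start=True, random_state=0)
for i, idx in enumerate(batches):
    model.fit(X[idx], y[idx])  # weights carry over between batches
    print(f"batch {i + 1}: training loss = {model.loss_:.4f}")
Note that MLPRegressor also exposes partial_fit, which performs a single update pass per call; warm_start differs in that each fit() runs a full optimization loop (up to max_iter iterations) starting from the current weights.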
Issues to consider:
- May lead to getting stuck in local minima if the initial solution is poor
- Not always faster, especially for small datasets or simple models
- Can make hyperparameter tuning more challenging due to dependence on previous states (see the sketch after this list)
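One way to avoid state leaking between hyperparameter configurations is to work on a fresh copy of the estimator. This sketch uses scikit-learn's clone, which returns an unfitted estimator with the same hyperparameters (the tuning loop over alpha values is an illustrative assumption):
from sklearn.base import clone
from sklearn.neural_network import MLPRegressor

template = MLPRegressor(max_iter=50, warm_start=True, random_state=42)

for alpha in (1e-4, 1e-3, 1e-2):
    model = clone(template)  # fresh, unfitted copy: no state from earlier runs
    model.set_params(alpha=alpha)
    # model.fit(X_train, y_train) and evaluate as usual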