Configure MLPRegressor "n_iter_no_change" Parameter

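The n_iter_no_change parameter in scikit-learn’s MLPRegressor sets how many consecutive epochs may pass without the monitored score improving by at least tol before training stops. Stopping at that point can reduce training time and, when combined with early_stopping=True, help limit overfitting. The parameter only takes effect with the 'sgd' and 'adam' solvers.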

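Early stopping is a regularization technique that halts training once the score on a held-out validation set stops improving. In MLPRegressor it is enabled with early_stopping=True, which sets aside a fraction of the training data (validation_fraction) and monitors the validation score. With the default early_stopping=False, the same n_iter_no_change counter is applied to the training loss instead, so it acts as a convergence criterion rather than as regularization.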
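As a minimal sketch of validation-based early stopping, the snippet below enables early_stopping=True alongside n_iter_no_change and inspects the fitted attributes. The dataset size, layer size, and standardization of the target are illustrative assumptions, not part of the original example.

```python
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Small synthetic problem (sizes chosen for illustration only).
# Features and target are standardized so the default tol is on a
# comparable scale to the validation score.
X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=0)
X = StandardScaler().fit_transform(X)
y = (y - y.mean()) / y.std()

mlp = MLPRegressor(
    hidden_layer_sizes=(50,),
    early_stopping=True,      # hold out part of the training data
    validation_fraction=0.1,  # fraction of training data held out
    n_iter_no_change=10,      # patience, in epochs
    max_iter=500,
    random_state=0,
)
mlp.fit(X, y)

# validation_scores_ holds one score per epoch (R^2 for a regressor).
print("epochs run:", mlp.n_iter_)
print("best validation score:", mlp.best_validation_score_)
print("last validation scores:", mlp.validation_scores_[-3:])
```

If n_iter_ is below max_iter, the stopping criterion ended training; otherwise the iteration cap did.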

A larger value for n_iter_no_change tolerates longer plateaus before stopping. This gives the model more chances to resume improving, but it lengthens training and, when the training loss is the monitored quantity, can allow overfitting. A smaller value stops training sooner, which saves time but can cause underfitting if training halts during a temporary plateau.
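The trade-off can be illustrated with a simplified patience counter in plain Python. This is a sketch of the general idea, not scikit-learn’s internal code; the function name stopping_epoch and the score sequence are hypothetical, and the exact counting convention inside the library may differ slightly.

```python
def stopping_epoch(scores, patience, tol=1e-4):
    """Return the 1-based epoch at which training would stop, or None.

    An epoch counts as an improvement only if its score beats the best
    score seen so far by more than tol (higher scores are better).
    """
    best = float("-inf")
    stale = 0  # consecutive epochs without sufficient improvement
    for epoch, score in enumerate(scores, start=1):
        if score > best + tol:
            stale = 0
        else:
            stale += 1
        best = max(best, score)
        if stale >= patience:
            return epoch
    return None


# Hypothetical validation scores with a short plateau at epochs 4-5.
scores = [0.50, 0.60, 0.65, 0.65, 0.64, 0.66, 0.66, 0.66, 0.66]

print(stopping_epoch(scores, patience=2))  # stops during the first plateau
print(stopping_epoch(scores, patience=3))  # survives it, stops on the second
print(stopping_epoch(scores, patience=4))  # never triggers on this sequence
```

With a patience of 2 the counter stops before the improvement at epoch 6 is ever seen, which is the premature-stopping risk described above.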

The default value for n_iter_no_change is 10. In practice, values between 5 and 20 are commonly used, depending on the dataset’s complexity and the model’s architecture.
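The defaults of the parameters that interact with n_iter_no_change can be read directly from an unfitted estimator, which is a quick way to confirm them for the installed scikit-learn version:

```python
from sklearn.neural_network import MLPRegressor

# get_params() returns the constructor arguments, including defaults.
params = MLPRegressor().get_params()

for name in ("n_iter_no_change", "tol", "early_stopping",
             "validation_fraction", "solver"):
    print(f"{name} = {params[name]!r}")
```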

from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different n_iter_no_change values
# (early_stopping keeps its default of False, so the training loss is monitored)
n_iter_values = [5, 10, 20, 50]
mse_scores = []

for n_iter in n_iter_values:
    mlp = MLPRegressor(hidden_layer_sizes=(100, 50), max_iter=1000,
                       n_iter_no_change=n_iter, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"n_iter_no_change={n_iter}, MSE: {mse:.4f}, Iterations: {mlp.n_iter_}")

# Find the n_iter_no_change value with the lowest test MSE
# (np.argmin returns the first index when several values tie)
best_n_iter = n_iter_values[np.argmin(mse_scores)]
print(f"\nBest n_iter_no_change: {best_n_iter}")

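Running the example gives an output like the one below. In this run every model reached max_iter=1000 before the stopping criterion was triggered, so n_iter_no_change had no effect: the MSE values are identical, and the reported best value is simply the first entry of the tie, because np.argmin returns the first minimum.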

n_iter_no_change=5, MSE: 6.7081, Iterations: 1000
n_iter_no_change=10, MSE: 6.7081, Iterations: 1000
n_iter_no_change=20, MSE: 6.7081, Iterations: 1000
n_iter_no_change=50, MSE: 6.7081, Iterations: 1000

Best n_iter_no_change: 5

The key steps in this example are:

Generate a synthetic regression dataset
Split the data into train and test sets
Train MLPRegressor models with different n_iter_no_change values
Evaluate the mean squared error (MSE) of each model on the test set
Compare the number of iterations and MSE for different n_iter_no_change values
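In the output shown above, all four models ran for the full 1000 iterations, so the comparison does not separate the n_iter_no_change values. One possible variation, sketched below under the assumption that standardizing the inputs and target and enabling early_stopping=True lets the criterion trigger, is to rerun the loop on scaled data. The scaling steps and the results dictionary are additions for illustration, and no particular output is implied.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaling on the training split only, then apply it to both splits.
x_scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = x_scaler.transform(X_train), x_scaler.transform(X_test)
y_mean, y_std = y_train.mean(), y_train.std()
y_train_s = (y_train - y_mean) / y_std

results = {}
for n_iter in [5, 10, 20, 50]:
    mlp = MLPRegressor(hidden_layer_sizes=(100, 50), max_iter=1000,
                       early_stopping=True, n_iter_no_change=n_iter,
                       random_state=42)
    mlp.fit(X_train_s, y_train_s)
    # Map predictions back to the original target scale before scoring.
    y_pred = mlp.predict(X_test_s) * y_std + y_mean
    results[n_iter] = (mean_squared_error(y_test, y_pred), mlp.n_iter_)
    print(f"n_iter_no_change={n_iter}, MSE: {results[n_iter][0]:.4f}, "
          f"Iterations: {results[n_iter][1]}")
```

Comparing the Iterations column against max_iter shows whether the stopping criterion or the iteration cap ended each run.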

Some tips and heuristics for setting n_iter_no_change:

Start with the default value of 10 and adjust based on model performance
Use a larger value for complex datasets or when you suspect the model needs more time to converge
Use a smaller value if you observe overfitting or want to reduce training time
Monitor both training and validation scores to ensure the model isn’t stopping too early or late
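These heuristics can also be checked empirically. The sketch below uses GridSearchCV to compare a few n_iter_no_change values by cross-validated error instead of a single test split. The grid, dataset size, and pipeline are assumptions chosen to keep the example short.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=300, n_features=10, noise=0.1, random_state=0)
# Target standardized up front for brevity; in a real workflow the
# scaling would be fitted inside each cross-validation fold.
y = (y - y.mean()) / y.std()

pipeline = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(50,), early_stopping=True,
                 max_iter=300, random_state=0),
)

# make_pipeline names each step after its lowercased class name.
param_grid = {"mlpregressor__n_iter_no_change": [5, 10, 20]}

search = GridSearchCV(pipeline, param_grid, cv=3,
                      scoring="neg_mean_squared_error")
search.fit(X, y)

print("best params:", search.best_params_)
for params, score in zip(search.cv_results_["params"],
                         search.cv_results_["mean_test_score"]):
    print(params, "mean CV MSE:", -score)
```

Because the scores are averaged over folds, small differences between candidates are less likely to reflect one lucky split.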

Issues to consider:

The optimal value depends on the dataset complexity, model architecture, and other hyperparameters
Too small a value may cause premature stopping, while too large a value may lead to overfitting
Early stopping should be used in conjunction with other regularization techniques for best results
The effectiveness of n_iter_no_change can vary depending on the learning rate and optimizer used
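The choice of optimizer matters in a concrete way: the scikit-learn documentation states that n_iter_no_change is only effective with the 'sgd' and 'adam' solvers. A small check, assuming a toy dataset and a fixed random_state, is to fit two 'lbfgs' models that differ only in n_iter_no_change and compare them:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
y = (y - y.mean()) / y.std()


def fit_lbfgs(patience):
    """Fit an lbfgs-based model that differs only in n_iter_no_change."""
    mlp = MLPRegressor(hidden_layer_sizes=(20,), solver="lbfgs",
                       n_iter_no_change=patience, max_iter=200,
                       random_state=0)
    return mlp.fit(X, y)


short, long_ = fit_lbfgs(2), fit_lbfgs(50)

# With lbfgs the patience setting is not consulted, so both fits should match.
print("iterations:", short.n_iter_, long_.n_iter_)
print("same predictions:", np.allclose(short.predict(X), long_.predict(X)))
```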

See Also