The max_iter parameter in scikit-learn's MLPRegressor sets the maximum number of iterations the solver is allowed to run while trying to converge.
MLPRegressor is a multi-layer perceptron neural network for regression tasks. It is trained with backpropagation, using gradient-based solvers to minimize a squared-error loss.
The max_iter parameter sets an upper limit on the number of training iterations; for the default stochastic solvers, each iteration is an epoch, i.e. a complete pass through the training data. The cap keeps computation time bounded and limits the risk of overfitting from training for too long.
By default, max_iter is set to 200. Common values range from 100 to 1000, depending on the complexity of the problem and the dataset size.
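If the solver reaches the max_iter cap before the optimization has converged, scikit-learn emits a ConvergenceWarning. The short sketch below shows one way to catch that warning and report the iteration count; the tiny max_iter value and the synthetic dataset are chosen purely to trigger the warning, not as recommended settings.

import warnings
from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Deliberately small max_iter so the solver cannot finish converging
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    mlp = MLPRegressor(max_iter=20, random_state=0)
    mlp.fit(X, y)

if any(issubclass(w.category, ConvergenceWarning) for w in caught):
    print(f"Did not converge after {mlp.n_iter_} iterations; consider raising max_iter")

The complete example below compares several max_iter values on a synthetic regression dataset and measures test-set error for each: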
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
import numpy as np
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different max_iter values
max_iter_values = [50, 200, 500, 1000]
mse_scores = []
for iter_val in max_iter_values:
    mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=iter_val, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"max_iter={iter_val}, MSE: {mse:.3f}, Converged: {mlp.n_iter_ < iter_val}")
# Find best max_iter
best_max_iter = max_iter_values[np.argmin(mse_scores)]
print(f"\nBest max_iter: {best_max_iter}")
Running the example gives an output like:
max_iter=50, MSE: 13905.188, Converged: False
max_iter=200, MSE: 671.344, Converged: False
max_iter=500, MSE: 139.311, Converged: False
max_iter=1000, MSE: 30.530, Converged: False
Best max_iter: 1000
Key steps in this example:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Train MLPRegressor models with different max_iter values
- Evaluate mean squared error (MSE) for each model on the test set
- Check whether each model converged before reaching max_iter
- Determine the best max_iter value based on the lowest MSE (a cross-validated alternative is sketched after this list)
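As an alternative to the manual loop above, the same comparison can be run with cross-validation. A rough sketch using GridSearchCV on the same synthetic data and the same candidate values; the grid is illustrative, not a tuned recommendation.

from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Cross-validated search over candidate max_iter values
param_grid = {"max_iter": [50, 200, 500, 1000]}
search = GridSearchCV(
    MLPRegressor(hidden_layer_sizes=(100,), random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=3,
)
search.fit(X, y)

print(f"Best max_iter: {search.best_params_['max_iter']}")
print(f"Best cross-validated MSE: {-search.best_score_:.3f}")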
Tips and heuristics for setting max_iter:
- Start with the default value of 200 and increase it if the model hasn't converged
- Monitor convergence using the n_iter_ attribute of the fitted model
- Use early stopping with early_stopping=True to prevent overfitting (see the sketch after this list)
- Consider increasing max_iter when using larger or more complex networks
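The early-stopping tip is worth a concrete illustration. The sketch below combines a generous max_iter cap with early_stopping=True, so training stops as soon as the validation score stops improving; validation_fraction and n_iter_no_change are shown at their scikit-learn defaults, and the data is the same synthetic setup as the main example.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Generous cap; early stopping decides when training actually stops
mlp = MLPRegressor(
    hidden_layer_sizes=(100,),
    max_iter=2000,
    early_stopping=True,       # hold out part of the training data as a validation set
    validation_fraction=0.1,   # fraction of training data used for validation
    n_iter_no_change=10,       # stop after 10 epochs without validation improvement
    random_state=42,
)
mlp.fit(X_train, y_train)

print(f"Stopped after {mlp.n_iter_} of at most 2000 iterations")
print(f"Test R^2: {mlp.score(X_test, y_test):.3f}")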
Issues to consider:
- Too low a max_iter may result in underfitting if the model doesn't converge
- A very high max_iter increases training time and, without early stopping, can let the model overfit
- The optimal value depends on the dataset complexity and network architecture
- Always check whether the model converged, for example via the n_iter_ attribute or the loss curve shown in the sketch below
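Beyond comparing n_iter_ to max_iter, the fitted model's loss_curve_ attribute (recorded by the sgd and adam solvers) shows whether the training loss was still falling when training stopped. A rough sketch on synthetic data; the 1e-4 threshold simply mirrors the default tol and is not a universal rule.

from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=200, random_state=42)
mlp.fit(X, y)

# loss_curve_ holds the training loss at each iteration (adam and sgd solvers)
losses = mlp.loss_curve_
window = min(10, len(losses) - 1)
recent_drop = losses[-window - 1] - losses[-1]  # improvement over the last few iterations

print(f"Ran {mlp.n_iter_} iterations, final training loss {losses[-1]:.4f}")
if recent_drop > 1e-4:
    print("Loss was still decreasing; consider raising max_iter")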