The max_iter parameter in scikit-learn's MLPRegressor sets the maximum number of iterations the solver is allowed to run while trying to converge.
MLPRegressor is a multi-layer perceptron neural network for regression tasks. It is trained with backpropagation, using gradient-based solvers to minimize a squared-error loss.
The max_iter parameter sets an upper limit on the number of training iterations; for the default stochastic solvers, each iteration is an epoch, i.e. a complete pass through the training data. The cap keeps computation time bounded and limits the risk of overfitting from training for too long.
By default, max_iter is set to 200. Common values range from 100 to 1000, depending on the complexity of the problem and the dataset size.
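If the solver reaches the max_iter cap before the optimization has converged, scikit-learn emits a ConvergenceWarning. The short sketch below shows one way to catch that warning and report the iteration count; the tiny max_iter value and the synthetic dataset are chosen purely to trigger the warning, not as recommended settings.

import warnings
from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Deliberately small max_iter so the solver cannot finish converging
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    mlp = MLPRegressor(max_iter=20, random_state=0)
    mlp.fit(X, y)

if any(issubclass(w.category, ConvergenceWarning) for w in caught):
    print(f"Did not converge after {mlp.n_iter_} iterations; consider raising max_iter")

The complete example below compares several max_iter values on a synthetic regression dataset and measures test-set error for each: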
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
import numpy as np
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different max_iter values
max_iter_values = [50, 200, 500, 1000]
mse_scores = []
for iter_val in max_iter_values:
    mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=iter_val, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"max_iter={iter_val}, MSE: {mse:.3f}, Converged: {mlp.n_iter_ < iter_val}")
# Find best max_iter
best_max_iter = max_iter_values[np.argmin(mse_scores)]
print(f"\nBest max_iter: {best_max_iter}")
Running the example gives an output like:
max_iter=50, MSE: 13905.188, Converged: False
max_iter=200, MSE: 671.344, Converged: False
max_iter=500, MSE: 139.311, Converged: False
max_iter=1000, MSE: 30.530, Converged: False
Best max_iter: 1000
Key steps in this example:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Train MLPRegressor models with different max_iter values
- Evaluate mean squared error (MSE) for each model on the test set
- Check whether each model converged before reaching max_iter
- Determine the best max_iter value based on the lowest MSE (a cross-validated alternative is sketched after this list)
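As an alternative to the manual loop above, the same comparison can be run with cross-validation. A rough sketch using GridSearchCV on the same synthetic data and the same candidate values; the grid is illustrative, not a tuned recommendation.

from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Cross-validated search over candidate max_iter values
param_grid = {"max_iter": [50, 200, 500, 1000]}
search = GridSearchCV(
    MLPRegressor(hidden_layer_sizes=(100,), random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=3,
)
search.fit(X, y)

print(f"Best max_iter: {search.best_params_['max_iter']}")
print(f"Best cross-validated MSE: {-search.best_score_:.3f}")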
Tips and heuristics for setting max_iter:
- Start with the default value of 200 and increase it if the model hasn't converged
- Monitor convergence using the n_iter_ attribute of the fitted model
- Use early stopping with early_stopping=True to prevent overfitting (see the sketch after this list)
- Consider increasing max_iter when using larger or more complex networks
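The early-stopping tip is worth a concrete illustration. The sketch below combines a generous max_iter cap with early_stopping=True, so training stops as soon as the validation score stops improving; validation_fraction and n_iter_no_change are shown at their scikit-learn defaults, and the data is the same synthetic setup as the main example.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Generous cap; early stopping decides when training actually stops
mlp = MLPRegressor(
    hidden_layer_sizes=(100,),
    max_iter=2000,
    early_stopping=True,       # hold out part of the training data as a validation set
    validation_fraction=0.1,   # fraction of training data used for validation
    n_iter_no_change=10,       # stop after 10 epochs without validation improvement
    random_state=42,
)
mlp.fit(X_train, y_train)

print(f"Stopped after {mlp.n_iter_} of at most 2000 iterations")
print(f"Test R^2: {mlp.score(X_test, y_test):.3f}")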
Issues to consider:
- Too low a max_iter may result in underfitting if the model doesn't converge
- A very high max_iter increases training time and, without early stopping, can let the model overfit
- The optimal value depends on the dataset complexity and network architecture
- Always check whether the model converged, for example via the n_iter_ attribute or the loss curve shown in the sketch below
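Beyond comparing n_iter_ to max_iter, the fitted model's loss_curve_ attribute (recorded by the sgd and adam solvers) shows whether the training loss was still falling when training stopped. A rough sketch on synthetic data; the 1e-4 threshold simply mirrors the default tol and is not a universal rule.

from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=200, random_state=42)
mlp.fit(X, y)

# loss_curve_ holds the training loss at each iteration (adam and sgd solvers)
losses = mlp.loss_curve_
window = min(10, len(losses) - 1)
recent_drop = losses[-window - 1] - losses[-1]  # improvement over the last few iterations

print(f"Ran {mlp.n_iter_} iterations, final training loss {losses[-1]:.4f}")
if recent_drop > 1e-4:
    print("Loss was still decreasing; consider raising max_iter")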