The max_iter parameter in scikit-learn's MLPClassifier controls the maximum number of iterations the solver runs while trying to converge.

MLPClassifier is a multi-layer perceptron neural network for classification tasks. It uses backpropagation to compute gradients and a gradient-based solver ('adam' by default) to optimize the network weights and biases.

The max_iter parameter sets an upper limit on the number of iterations the solver can perform; for the stochastic solvers 'sgd' and 'adam', each iteration is one epoch (a full pass over the training data). If the solver hasn't converged within this limit, it stops, emits a ConvergenceWarning, and the resulting model may perform suboptimally.

The default value of max_iter is 200. In practice, values between 200 and 1000 are common, depending on the complexity of the dataset and the network architecture.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=3, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different max_iter values
max_iter_values = [100, 200, 500, 1000]
accuracies = []
for max_iter in max_iter_values:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=max_iter, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"max_iter={max_iter}, Accuracy: {accuracy:.3f}, Converged: {mlp.n_iter_ < max_iter}")
Running the example gives an output like:
max_iter=100, Accuracy: 0.885, Converged: False
max_iter=200, Accuracy: 0.890, Converged: False
max_iter=500, Accuracy: 0.885, Converged: False
max_iter=1000, Accuracy: 0.885, Converged: True
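To see why a run stops where it does, one option is to inspect the fitted model's loss_curve_ attribute, which the stochastic solvers ('adam' and 'sgd') record per epoch. A brief sketch, reusing the last mlp fitted in the loop above:

# loss_curve_ holds the training loss at each epoch for 'adam'/'sgd'
print(f"Epochs run: {mlp.n_iter_}, final loss: {mlp.loss_curve_[-1]:.4f}")
# A flat tail suggests convergence; a still-falling curve suggests raising max_iter
recent = mlp.loss_curve_[-10:]
print(f"Loss change over last 10 epochs: {recent[0] - recent[-1]:.5f}")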
The key steps in this example are:
- Generate a synthetic multi-class classification dataset
- Split the data into train and test sets
- Train MLPClassifier models with different max_iter values
- Evaluate the accuracy and convergence of each model
Some tips and heuristics for setting max_iter:
- Start with the default value of 200 and increase it if the model hasn't converged
- Monitor the n_iter_ attribute to check whether the model converged before reaching max_iter
- Use early stopping with a validation set to prevent overfitting on complex datasets (see the sketch after this list)
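A minimal sketch of the early-stopping tip, using MLPClassifier's built-in early_stopping option and the train/test split from the example above (the specific values are illustrative, not recommendations):

# Training halts when the validation score stops improving for
# n_iter_no_change consecutive epochs
mlp_es = MLPClassifier(hidden_layer_sizes=(100,),
                       max_iter=1000,             # generous upper bound
                       early_stopping=True,       # hold out a validation set internally
                       validation_fraction=0.1,   # 10% of training data for validation
                       n_iter_no_change=10,       # patience in epochs
                       random_state=42)
mlp_es.fit(X_train, y_train)
print(f"Stopped after {mlp_es.n_iter_} epochs, "
      f"test accuracy: {accuracy_score(y_test, mlp_es.predict(X_test)):.3f}")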
Issues to consider:
- Setting max_iter too low may result in underfitting if the model doesn’t converge
- Very high max_iter values can lead to overfitting or unnecessarily long training times
- The optimal max_iter depends on the dataset complexity and network architecture
- Consider using an adaptive learning-rate solver such as 'adam' (the default) for faster convergence; the sketch below shows one way to detect runs that hit the iteration limit
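As a complement to checking n_iter_, scikit-learn emits a ConvergenceWarning whenever the solver reaches max_iter without meeting the tolerance. A minimal sketch of catching it programmatically, reusing X_train and y_train from the example above (the low max_iter is deliberate, to force non-convergence):

import warnings
from sklearn.exceptions import ConvergenceWarning

# Fit with a deliberately low max_iter and record any convergence warnings
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", ConvergenceWarning)
    mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=50, random_state=42)
    mlp.fit(X_train, y_train)

hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
print(f"Hit max_iter without converging: {hit_limit} (n_iter_={mlp.n_iter_})")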