
Configure MLPClassifier "learning_rate_init" Parameter

The learning_rate_init parameter in scikit-learn’s MLPClassifier controls the step size at each iteration while moving toward a minimum of the loss function.

MLPClassifier implements a multi-layer perceptron (MLP) algorithm that trains using backpropagation. The learning rate determines how quickly the model adapts to the problem, with larger values resulting in faster initial learning.

The learning_rate_init parameter significantly affects both the speed of convergence and the quality of the final solution. A learning rate that's too high can overshoot minima, causing the loss to oscillate or the model to settle on a suboptimal solution, while a rate that's too low slows training and can leave the optimizer stuck in a poor local minimum.

The default value for learning_rate_init is 0.001.

In practice, values between 0.0001 and 0.1 are commonly used, often adjusted based on model performance and convergence behavior.
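Note that learning_rate_init is only used by the 'sgd' and 'adam' solvers, and with solver='sgd' it interacts with the separate learning_rate schedule parameter. As a minimal sketch (the specific values here are illustrative, not recommendations):

```python
from sklearn.neural_network import MLPClassifier

# learning_rate_init sets the starting step size; with solver='sgd',
# learning_rate controls how that step size evolves during training.
mlp = MLPClassifier(
    hidden_layer_sizes=(100,),
    solver="sgd",
    learning_rate="adaptive",  # hold the rate constant while loss improves,
                               # then divide it by 5 when progress stalls
    learning_rate_init=0.01,
    max_iter=500,
    random_state=42,
)
```

With learning_rate="adaptive", a larger initial rate is often safe because scikit-learn shrinks it automatically once training loss stops improving.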

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=3, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different learning_rate_init values
learning_rates = [0.0001, 0.001, 0.01, 0.1]
accuracies = []

for lr in learning_rates:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000,
                        learning_rate_init=lr, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"learning_rate_init={lr}, Accuracy: {accuracy:.3f}, Iterations: {mlp.n_iter_}")

Running the example gives an output like:

learning_rate_init=0.0001, Accuracy: 0.870, Iterations: 1000
learning_rate_init=0.001, Accuracy: 0.870, Iterations: 505
learning_rate_init=0.01, Accuracy: 0.905, Iterations: 119
learning_rate_init=0.1, Accuracy: 0.865, Iterations: 70

The key steps in this example are:

  1. Generate a synthetic multi-class classification dataset
  2. Split the data into train and test sets
  3. Train MLPClassifier models with different learning_rate_init values
  4. Evaluate the accuracy of each model on the test set
  5. Compare both accuracy and number of iterations needed for convergence
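Beyond the final accuracy, the convergence behavior can be inspected through the fitted model's loss_curve_ attribute, which records the training loss at each iteration. A short sketch reusing the same dataset setup (the two rates compared here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=3, random_state=42)

# loss_curve_ makes the speed/quality trade-off between learning
# rates easy to inspect: one entry per training iteration.
for lr in [0.001, 0.1]:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=200,
                        learning_rate_init=lr, random_state=42)
    mlp.fit(X, y)
    print(f"lr={lr}: {len(mlp.loss_curve_)} iterations, "
          f"final loss={mlp.loss_curve_[-1]:.3f}")
```

Plotting these curves (e.g. with matplotlib) shows the larger rate dropping the loss quickly in early iterations, while the smaller rate descends more gradually.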

Some tips and heuristics for setting learning_rate_init:

  1. Start with the default of 0.001 and adjust by factors of 10 in either direction
  2. If training loss decreases very slowly, try a larger value; if it oscillates or diverges, reduce it
  3. Standardize or normalize input features first, since an appropriate step size depends on feature scale
  4. Re-tune the value whenever you change the architecture, solver, or batch size

Issues to consider:

  1. learning_rate_init only sets the initial rate, and it is only used by the 'sgd' and 'adam' solvers; 'lbfgs' ignores it
  2. Larger rates typically converge in fewer iterations but may settle on worse solutions, as the 0.1 run above shows
  3. A run that uses all of max_iter iterations (like 0.0001 above) may not have converged; scikit-learn raises a ConvergenceWarning in that case


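Rather than comparing values manually, learning_rate_init can be tuned with cross-validation. A sketch using GridSearchCV (the grid values and model settings are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_classes=3, random_state=42)

# Cross-validated search over candidate initial learning rates
param_grid = {"learning_rate_init": [0.0001, 0.001, 0.01, 0.1]}
search = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(50,), max_iter=200, random_state=42),
    param_grid, cv=3, n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, f"mean CV accuracy: {search.best_score_:.3f}")
```

Because the best rate interacts with other settings, a larger grid that also varies hidden_layer_sizes or the solver often pays off, at the cost of more fits.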

See Also