The learning_rate parameter in scikit-learn’s MLPClassifier selects the schedule that controls the step size at each iteration while moving toward a minimum of the loss function.
The Multi-layer Perceptron (MLP) is a feedforward artificial neural network that uses backpropagation for training. The learning_rate setting determines how quickly or slowly the model learns from the training data.
A higher learning rate can lead to faster convergence but may overshoot the optimal solution, while a lower learning rate might require more iterations to converge but can find a more precise solution.
The default value for learning_rate is ‘constant’, which keeps the rate fixed at learning_rate_init (0.001 by default) throughout training. Common alternatives are ‘adaptive’, which divides the rate by 5 whenever the training loss stops improving by at least tol, and ‘invscaling’, which gradually decreases the rate at each step t as learning_rate_init / t**power_t. Note that this schedule is only used when solver=‘sgd’; the default ‘adam’ solver ignores it.
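To make the decay concrete, here is a small sketch of the documented ‘invscaling’ formula, using the default values learning_rate_init=0.001 and power_t=0.5:

# 'invscaling' effective learning rate at step t, per the scikit-learn docs
learning_rate_init = 0.001  # MLPClassifier default initial rate
power_t = 0.5               # MLPClassifier default inverse-scaling exponent

for t in (1, 10, 100, 1000):
    print(f"step {t:>4}: effective learning rate = {learning_rate_init / t ** power_t:.6f}")

The example below trains MLPClassifier with each of the three schedules on a synthetic multi-class dataset: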
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=3, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with each learning_rate schedule (only used when solver='sgd')
learning_rates = ['constant', 'invscaling', 'adaptive']
max_iter = 1000
for lr in learning_rates:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), solver='sgd', learning_rate=lr,
                        max_iter=max_iter, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"learning_rate={lr}, Accuracy: {accuracy:.3f}, Iterations: {mlp.n_iter_}")
Running the example prints one line per schedule, reporting the test accuracy and the number of training iterations, so you can compare both final accuracy and convergence speed; exact values depend on your scikit-learn version.
The key steps in this example are:
- Generate a synthetic multi-class classification dataset
- Split the data into train and test sets
- Train MLPClassifier models with different learning_rate values
- Evaluate the accuracy and convergence speed of each model (see the loss-curve sketch after this list)
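To compare convergence speed beyond the final iteration count, each fitted model exposes a loss_curve_ attribute with the training loss at every iteration; a minimal sketch, reusing X_train and y_train from the example above:

# Inspect the per-iteration training loss recorded under each schedule
for lr in ['constant', 'invscaling', 'adaptive']:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), solver='sgd', learning_rate=lr,
                        max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    # loss_curve_ holds one loss value per completed iteration
    print(f"learning_rate={lr}: {len(mlp.loss_curve_)} iterations, "
          f"final loss {mlp.loss_curve_[-1]:.4f}")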
Some tips and heuristics for setting learning_rate:

- Start with the default ‘constant’ schedule and experiment with ‘adaptive’ or ‘invscaling’ if performance is unsatisfactory (see the grid-search sketch after this list)
- Use ‘adaptive’ when dealing with large or complex datasets to automatically adjust the learning rate
- Consider ‘invscaling’ when you want explicit control over how the learning rate decays (tuned via power_t and learning_rate_init)
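One way to run that experiment systematically is a small grid search over the schedule and the initial rate; a sketch with an illustrative parameter grid, reusing X_train and y_train from above:

from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Illustrative grid over the schedule and the initial step size
param_grid = {
    'learning_rate': ['constant', 'invscaling', 'adaptive'],
    'learning_rate_init': [0.0001, 0.001, 0.01],
}
grid = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(100,), solver='sgd', max_iter=1000,
                  random_state=42),
    param_grid, cv=3, n_jobs=-1,
)
grid.fit(X_train, y_train)
print(grid.best_params_, f"CV accuracy: {grid.best_score_:.3f}")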
Issues to consider:
- The optimal learning rate depends on the specific dataset and problem complexity
- A learning rate that’s too high can cause the model to diverge or oscillate around the optimal solution
- A learning rate that’s too low may result in slow convergence or getting stuck in local minima
- The effectiveness of different learning rates can vary depending on the chosen solver and other hyperparameters; in particular, the learning_rate schedule is only consulted when solver=‘sgd’ and is ignored by the default ‘adam’ solver, as the check below shows
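A quick check of that last point, reusing the train/test split from the example above:

# With the default 'adam' solver the learning_rate schedule has no effect,
# so all three settings produce identical models here.
for lr in ['constant', 'invscaling', 'adaptive']:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), solver='adam', learning_rate=lr,
                        max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    print(f"adam + learning_rate={lr}: test accuracy {mlp.score(X_test, y_test):.3f}")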