The learning_rate parameter in scikit-learn’s MLPRegressor controls the step size at each iteration while moving toward a minimum of the loss function. MLPRegressor is a multi-layer perceptron regressor that optimizes the squared error using LBFGS or stochastic gradient descent, and the learning_rate parameter determines how quickly or slowly the model learns from the training data. A higher learning rate can lead to faster convergence but may overshoot the optimal solution, while a lower learning rate may require more iterations to converge but can find a more precise solution.
The default value for learning_rate is ‘constant’, which keeps the learning rate fixed at the value of learning_rate_init (0.001 by default) throughout training. The other options, ‘invscaling’ and ‘adaptive’, are learning rate schedules: ‘invscaling’ gradually decreases the rate, while ‘adaptive’ keeps it constant as long as the training loss keeps decreasing. In every case, the initial rate itself is set via the learning_rate_init parameter. Note that the schedule is only consulted when solver='sgd'; the default ‘adam’ solver ignores learning_rate entirely.
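For instance, a minimal sketch of configuring a schedule explicitly (the 0.01 initial rate is an illustrative value, not a recommendation):

from sklearn.neural_network import MLPRegressor

# 'constant' keeps the rate fixed at learning_rate_init for every update
mlp = MLPRegressor(
    solver="sgd",             # the learning_rate schedule only applies to SGD
    learning_rate="constant",
    learning_rate_init=0.01,  # default is 0.001
    max_iter=500,
    random_state=42,
)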
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different learning_rate values
learning_rates = ['constant', 'invscaling', 'adaptive']
mse_scores = []
for lr in learning_rates:
    # Note: with the default 'adam' solver, learning_rate is ignored
    mlp = MLPRegressor(hidden_layer_sizes=(100,), learning_rate=lr, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"learning_rate={lr}, MSE: {mse:.3f}")
Running the example gives an output like:
learning_rate=constant, MSE: 30.530
learning_rate=invscaling, MSE: 30.530
learning_rate=adaptive, MSE: 30.530
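All three settings produce an identical MSE here because the model is trained with the default ‘adam’ solver, which ignores the learning_rate parameter. To actually compare the schedules, the solver must be set to ‘sgd’. A variant of the loop that does this (exact scores will differ from those above and depend on the dataset):

for lr in learning_rates:
    mlp = MLPRegressor(
        hidden_layer_sizes=(100,),
        solver="sgd",              # schedules only take effect with SGD
        learning_rate=lr,
        learning_rate_init=0.001,  # same starting rate for each schedule
        max_iter=1000,
        random_state=42,
    )
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    print(f"learning_rate={lr}, MSE: {mean_squared_error(y_test, y_pred):.3f}")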
The key steps in this example are:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Train MLPRegressor models with different learning_rate values
- Evaluate the mean squared error of each model on the test set
Some tips and heuristics for setting learning_rate:
- Start with the default ‘constant’ schedule and experiment with ‘adaptive’ for automatic adjustment (both require solver='sgd' to take effect)
- If using a constant rate, try learning_rate_init values on a logarithmic scale (e.g., 0.1, 0.01, 0.001)
- Monitor the loss during training to detect if the learning rate is too high (unstable loss) or too low (slow convergence); see the sketch after this list
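One way to apply the last two tips, reusing X_train and y_train from the example above, is to sweep learning_rate_init on a logarithmic scale and plot each model’s loss_curve_ attribute, which records the training loss at every iteration (the grid of rates is illustrative):

import matplotlib.pyplot as plt

for lr_init in [0.1, 0.01, 0.001]:
    mlp = MLPRegressor(
        hidden_layer_sizes=(100,),
        solver="sgd",
        learning_rate="constant",
        learning_rate_init=lr_init,
        max_iter=1000,
        random_state=42,
    )
    mlp.fit(X_train, y_train)
    # loss_curve_ holds the training loss at each iteration
    plt.plot(mlp.loss_curve_, label=f"learning_rate_init={lr_init}")
plt.xlabel("Iteration")
plt.ylabel("Training loss")
plt.legend()
plt.show()

An unstable or diverging curve suggests the rate is too high; a nearly flat one suggests it is too low.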
Issues to consider:
- The optimal learning rate depends on the specific dataset and model architecture
- A learning rate that’s too high can cause the model to converge to a suboptimal solution or diverge
- A learning rate that’s too low can result in slow convergence or getting stuck in local minima
- The ‘adaptive’ option can be useful for finding a good learning rate automatically, but may not always be optimal
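Because the best setting is dataset-specific, a small grid search is a common way to choose it. A minimal sketch using GridSearchCV over both the schedule and the initial rate (the grid values are illustrative):

from sklearn.model_selection import GridSearchCV

param_grid = {
    "learning_rate": ["constant", "invscaling", "adaptive"],
    "learning_rate_init": [0.1, 0.01, 0.001],
}
search = GridSearchCV(
    MLPRegressor(hidden_layer_sizes=(100,), solver="sgd", max_iter=1000, random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=3,
)
search.fit(X_train, y_train)
print(f"Best parameters: {search.best_params_}")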