Configure MLPRegressor "learning_rate_init" Parameter

The learning_rate_init parameter in scikit-learn’s MLPRegressor controls the initial learning rate for weight updates during training. It is only used when the solver is ‘sgd’ or ‘adam’.

MLPRegressor is a multi-layer perceptron regressor trained with backpropagation. Its default solver, ‘adam’, is a stochastic gradient-based optimizer. It’s a versatile model capable of learning non-linear relationships in data.

The learning_rate_init parameter determines the step size at the beginning of training. A larger value can lead to faster initial learning but may overshoot optimal weights, while a smaller value provides more precise updates but may result in slower convergence.
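
The trade-off described above can be illustrated with a toy sketch. It is not how MLPRegressor's solvers work internally; it assumes plain gradient descent on the one-dimensional function f(w) = w**2, and the helper name gradient_descent is made up for this illustration.

```python
def gradient_descent(lr, w0=1.0, steps=20):
    """Minimize f(w) = w**2 with a fixed step size lr and return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w     # derivative of w**2
        w = w - lr * grad  # plain gradient descent update
    return w


small = gradient_descent(lr=0.001)  # tiny steps: w barely moves away from 1.0
medium = gradient_descent(lr=0.1)   # w shrinks steadily towards the minimum at 0
large = gradient_descent(lr=1.1)    # every step overshoots 0 and |w| grows

print(f"lr=0.001 -> w={small:.4f}")
print(f"lr=0.1   -> w={medium:.4f}")
print(f"lr=1.1   -> w={large:.4f}")
```

After the same number of steps, the smallest rate is still close to the starting point, the middle rate is near the minimum, and the largest rate has moved further away from it.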

The default value for learning_rate_init is 0.001. In practice, values between 0.0001 and 0.1 are commonly used, depending on the specific problem and dataset characteristics.

from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different learning_rate_init values
learning_rates = [0.0001, 0.001, 0.01, 0.1]
mse_scores = []

for lr in learning_rates:
    mlp = MLPRegressor(hidden_layer_sizes=(100,), learning_rate_init=lr, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"learning_rate_init={lr}, MSE: {mse:.3f}")

# Find best learning rate
best_lr = learning_rates[np.argmin(mse_scores)]
print(f"Best learning_rate_init: {best_lr}")

Running the example gives an output like:

learning_rate_init=0.0001, MSE: 8040.666
learning_rate_init=0.001, MSE: 30.530
learning_rate_init=0.01, MSE: 2.109
learning_rate_init=0.1, MSE: 0.523
Best learning_rate_init: 0.1

The key steps in this example are:

Generate a synthetic regression dataset
Split the data into train and test sets
Train MLPRegressor models with different learning_rate_init values
Evaluate the mean squared error of each model on the test set
Identify the best performing learning rate
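
The steps above pick the learning rate by its test-set error, which tends to make the reported test score optimistic. A possible variant, sketched here under the assumption that a cross-validated search on the training split is acceptable, uses GridSearchCV and keeps the test set for a final check. The dataset size, hidden layer size, and grid values are illustrative choices, not recommendations.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor

# Small synthetic problem so the search runs quickly (sizes are illustrative)
X, y = make_regression(n_samples=300, n_features=5, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

param_grid = {"learning_rate_init": [0.001, 0.01, 0.1]}
search = GridSearchCV(
    MLPRegressor(hidden_layer_sizes=(20,), max_iter=300, random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=3,
)

with warnings.catch_warnings():
    # Small learning rates may not converge within max_iter; silence that warning here
    warnings.simplefilter("ignore", ConvergenceWarning)
    search.fit(X_train, y_train)

best_lr = search.best_params_["learning_rate_init"]
test_score = search.score(X_test, y_test)  # negative MSE on data not used for selection
print(f"Best learning_rate_init (3-fold CV): {best_lr}")
print(f"Held-out test MSE: {-test_score:.3f}")
```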

Tips and heuristics for setting learning_rate_init:

Start with the default value of 0.001 and adjust based on model performance
If the loss oscillates, increases, or stalls early, try a smaller learning rate
If the loss is decreasing very slowly, try a larger learning rate
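Consider methods that adjust the step size during training: the ‘adam’ solver (the default), or the ‘adaptive’ learning_rate schedule, which applies only when the solver is ‘sgd’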
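
As a configuration sketch for the last tip: in MLPRegressor, ‘adam’ is a value of the solver parameter, while ‘adaptive’ is a value of the learning_rate parameter and takes effect only with solver=‘sgd’. In both cases learning_rate_init sets the starting step size. The specific values below are illustrative assumptions, and the models are only constructed, not fitted.

```python
from sklearn.neural_network import MLPRegressor

# Adam solver: step sizes are adapted during training, starting from learning_rate_init
adam_model = MLPRegressor(solver="adam", learning_rate_init=0.001, random_state=42)

# SGD with the 'adaptive' schedule: the rate stays at learning_rate_init while the
# training loss keeps improving and is divided by 5 when progress stalls
sgd_adaptive_model = MLPRegressor(
    solver="sgd",
    learning_rate="adaptive",
    learning_rate_init=0.01,
    random_state=42,
)

for name, model in [("adam", adam_model), ("sgd + adaptive", sgd_adaptive_model)]:
    params = model.get_params()
    print(name, params["solver"], params["learning_rate"], params["learning_rate_init"])
```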

Issues to consider:

The optimal learning rate depends on the scale and distribution of your features
A learning rate that’s too high can cause the model to diverge
A learning rate that’s too low can result in slow convergence or getting stuck in local minima
The learning rate interacts with other parameters like max_iter and batch_size
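
One way to check for these issues, sketched here with illustrative settings, is to standardize the inputs in a pipeline and inspect the fitted model's loss_curve_ and n_iter_ attributes. The synthetic features below are already roughly unit scale, so the scaler is expected to matter more on real data.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=300, n_features=5, noise=0.1, random_state=42)

curves = {}
for lr in [0.001, 0.01]:
    model = make_pipeline(
        StandardScaler(),  # put features on a comparable scale before training
        MLPRegressor(hidden_layer_sizes=(20,), learning_rate_init=lr,
                     max_iter=200, random_state=42),
    )
    with warnings.catch_warnings():
        # Hitting max_iter before convergence is expected for small learning rates
        warnings.simplefilter("ignore", ConvergenceWarning)
        model.fit(X, y)

    mlp = model.named_steps["mlpregressor"]
    curves[lr] = mlp.loss_curve_  # training loss recorded after each epoch
    print(f"lr={lr}: epochs={mlp.n_iter_}, "
          f"first loss={mlp.loss_curve_[0]:.1f}, last loss={mlp.loss_curve_[-1]:.1f}")
```

A curve that rises or jumps around suggests the rate is too high; a curve that is still falling steadily when training stops at max_iter suggests a higher rate or more iterations.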
