The solver parameter in scikit-learn’s MLPRegressor determines the algorithm used for weight optimization during training.

Multi-layer Perceptron (MLP) is a type of artificial neural network that can be used for regression tasks. The solver parameter affects how the network learns from the data and can significantly impact both performance and training time.

The solver parameter offers different optimization algorithms, each with its own strengths and weaknesses. The choice of solver can affect convergence speed, final model performance, and the ability to handle different types of problems.

The default value for solver is ‘adam’. Common alternatives include ‘sgd’ (stochastic gradient descent) and ‘lbfgs’ (limited-memory BFGS).
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
import numpy as np
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different solver options
solvers = ['adam', 'sgd', 'lbfgs']
mse_scores = []
for solver in solvers:
    mlp = MLPRegressor(solver=solver, random_state=42, max_iter=1000)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"Solver: {solver}, MSE: {mse:.4f}")
best_solver = solvers[np.argmin(mse_scores)]
print(f"Best solver: {best_solver}")
Running the example gives an output like:
Solver: adam, MSE: 30.5298
Solver: sgd, MSE: 0.4607
Solver: lbfgs, MSE: 0.3230
Best solver: lbfgs
The key steps in this example are:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Train MLPRegressor models with different solver options
- Evaluate the mean squared error of each model on the test set
- Identify the best-performing solver
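One caveat when reading the scores above: MLPs are sensitive to feature scaling, so solver comparisons are usually run on standardized data. Below is a minimal sketch of the same loop wrapped in a pipeline with StandardScaler (an addition not in the original example; exact scores will differ from those shown):

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error

# Same synthetic data and split as the main example
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

for solver in ['adam', 'sgd', 'lbfgs']:
    # The scaler is fit on the training split only, then applied to both splits
    model = make_pipeline(StandardScaler(),
                          MLPRegressor(solver=solver, random_state=42, max_iter=1000))
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"Solver: {solver}, scaled-pipeline MSE: {mse:.4f}")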
Some tips and heuristics for choosing the solver:
- ‘adam’ is a good default choice for most problems
- ‘sgd’ can be effective for large datasets or online learning (see the incremental-training sketch after this list)
- ‘lbfgs’ often works well for smaller datasets
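To make the online-learning point for ‘sgd’ concrete: MLPRegressor exposes partial_fit (available with the ‘sgd’ and ‘adam’ solvers, but not ‘lbfgs’), which updates the weights one mini-batch at a time. A minimal sketch on a synthetic stream of batches (the stream itself is made up for illustration):

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(42)
true_coef = rng.normal(size=10)  # fixed linear target behind the synthetic stream

# partial_fit is only defined for the 'sgd' and 'adam' solvers
mlp = MLPRegressor(solver='sgd', learning_rate_init=0.01, random_state=42)

for step in range(200):
    # Simulate a mini-batch arriving from a stream
    X_batch = rng.normal(size=(32, 10))
    y_batch = X_batch @ true_coef + rng.normal(scale=0.1, size=32)
    mlp.partial_fit(X_batch, y_batch)

# Evaluate on a fresh batch from the same distribution
X_eval = rng.normal(size=(256, 10))
print("R^2 on fresh data:", mlp.score(X_eval, X_eval @ true_coef))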
Issues to consider:
- ‘adam’ and ‘sgd’ support early stopping, while ‘lbfgs’ does not (illustrated in the sketch after this list)
- ‘sgd’ requires tuning of the learning rate and its schedule (also shown below)
- ‘lbfgs’ can converge faster on some problems but uses more memory
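A short sketch of the options behind the first two points, using documented MLPRegressor parameters (the values here are illustrative, not tuned):

from sklearn.neural_network import MLPRegressor

# Early stopping ('adam' or 'sgd' only): a validation_fraction of the training
# data is held out, and training stops once the validation score fails to
# improve for n_iter_no_change consecutive iterations
mlp_adam = MLPRegressor(solver='adam',
                        early_stopping=True,
                        validation_fraction=0.1,
                        n_iter_no_change=10,
                        max_iter=1000,
                        random_state=42)

# 'sgd' adds learning-rate controls: learning_rate (the schedule) and momentum
# apply only to 'sgd'; learning_rate_init is shared with 'adam'
mlp_sgd = MLPRegressor(solver='sgd',
                       learning_rate='adaptive',  # reduce the rate when loss plateaus
                       learning_rate_init=0.01,
                       momentum=0.9,
                       max_iter=1000,
                       random_state=42)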