The solver parameter in scikit-learn's MLPClassifier determines the algorithm used to optimize the neural network weights during training.
MLPClassifier is a multi-layer perceptron neural network for classification tasks. It learns non-linear decision boundaries by adjusting weights between interconnected neurons organized in layers.
The solver parameter affects both the training speed and the quality of the final model. Different solvers are better suited for different types of problems and dataset sizes.
The default value for solver is 'adam'. Common alternatives include 'sgd' (stochastic gradient descent) and 'lbfgs' (Limited-memory BFGS).
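For example, a non-default solver is selected at construction time. This is a minimal sketch; the hidden_layer_sizes value shown is just the library default, included for illustration:
from sklearn.neural_network import MLPClassifier

# Choose the optimizer explicitly; 'lbfgs' is a full-batch quasi-Newton method
clf = MLPClassifier(solver='lbfgs', hidden_layer_sizes=(100,), random_state=42)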
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import time
# Generate synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=3, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different solvers
solvers = ['adam', 'sgd', 'lbfgs']
results = []
for solver in solvers:
    start_time = time.time()
    mlp = MLPClassifier(solver=solver, random_state=42, max_iter=1000)
    mlp.fit(X_train, y_train)
    train_time = time.time() - start_time
    y_pred = mlp.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    results.append((solver, accuracy, train_time))
    print(f"Solver: {solver}, Accuracy: {accuracy:.3f}, Training Time: {train_time:.2f} seconds")
Running the example gives an output like:
Solver: adam, Accuracy: 0.933, Training Time: 11.90 seconds
Solver: sgd, Accuracy: 0.936, Training Time: 19.35 seconds
Solver: lbfgs, Accuracy: 0.923, Training Time: 9.13 seconds
The key steps in this example are:
- Generate a synthetic multi-class classification dataset
- Split the data into train and test sets
- Train MLPClassifier models with different solver options
- Measure training time and evaluate accuracy for each solver
Some tips and heuristics for setting solver:
- Use ‘adam’ for large datasets or when training with mini-batches
- Try 'lbfgs' for smaller datasets (less than a few thousand samples)
- Use 'sgd' if you need fine control over learning rate schedules (see the sketch after this list)
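As an illustration of that last point, here is a minimal sketch of configuring 'sgd' with an adaptive learning-rate schedule. The specific hyperparameter values are arbitrary choices for demonstration, not tuned recommendations:
from sklearn.neural_network import MLPClassifier

# 'sgd' exposes schedule-related options that 'adam' and 'lbfgs' do not use.
# learning_rate='adaptive' keeps the rate at learning_rate_init while training
# loss keeps improving, then divides it by 5 when progress stalls.
mlp_sgd = MLPClassifier(
    solver='sgd',
    learning_rate='adaptive',  # schedule: 'constant', 'invscaling', or 'adaptive'
    learning_rate_init=0.01,   # starting step size (illustrative value)
    momentum=0.9,              # classical momentum term
    random_state=42,
    max_iter=1000,
)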
Issues to consider:
- 'adam' and 'sgd' support early stopping, while 'lbfgs' does not
- 'lbfgs' may converge faster and perform better for small datasets
- 'sgd' is sensitive to feature scaling and may require more tuning (see the sketch after this list)
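The scaling and early-stopping points can be addressed together. This minimal sketch standardizes features with StandardScaler in a pipeline and enables early stopping; the hyperparameter values are illustrative:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

# Standardized features help gradient-based solvers like 'sgd' and 'adam'.
# early_stopping=True holds out a validation fraction and stops training when
# the validation score stops improving; it applies to 'sgd' and 'adam' only.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(
        solver='sgd',
        early_stopping=True,      # uses an internal validation split
        validation_fraction=0.1,  # 10% of training data held out
        n_iter_no_change=10,      # patience, in epochs
        random_state=42,
        max_iter=1000,
    ),
)
# The pipeline is then fit and used exactly like a bare MLPClassifier,
# e.g. model.fit(X_train, y_train) followed by model.predict(X_test).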