The `hidden_layer_sizes` parameter in scikit-learn's `MLPClassifier` determines the architecture of the neural network by specifying the number of hidden layers and the number of neurons in each layer.
`MLPClassifier` implements a multi-layer perceptron (MLP) neural network for classification tasks. The `hidden_layer_sizes` parameter defines the network's depth and width, which directly affect its capacity to learn complex patterns.
This parameter accepts a tuple where each element represents the number of neurons in a hidden layer. The length of the tuple determines the number of hidden layers.
The default value for `hidden_layer_sizes` is `(100,)`, which creates a single hidden layer with 100 neurons.
Common configurations include `(100,)`, `(100, 100)`, and `(100, 50, 25)` for increasing network depth.
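As a minimal sketch of the tuple format before the full example (the `(64, 32)` tuple here is an arbitrary choice for illustration):

```python
from sklearn.neural_network import MLPClassifier

# Default: a single hidden layer with 100 neurons
mlp_default = MLPClassifier()
print(mlp_default.hidden_layer_sizes)  # (100,)

# Two hidden layers: 64 neurons, then 32 neurons
mlp_custom = MLPClassifier(hidden_layer_sizes=(64, 32))
print(mlp_custom.hidden_layer_sizes)  # (64, 32)
```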
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=3, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different hidden_layer_sizes
layer_sizes = [(100,), (100, 50), (100, 100), (50, 25, 10)]
accuracies = []

for layers in layer_sizes:
    mlp = MLPClassifier(hidden_layer_sizes=layers, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"hidden_layer_sizes={layers}, Accuracy: {accuracy:.3f}")
```
Running the example produces output like:
```
hidden_layer_sizes=(100,), Accuracy: 0.885
hidden_layer_sizes=(100, 50), Accuracy: 0.895
hidden_layer_sizes=(100, 100), Accuracy: 0.895
hidden_layer_sizes=(50, 25, 10), Accuracy: 0.860
```
The key steps in this example are:
- Generate a synthetic multi-class classification dataset
- Split the data into train and test sets
- Create `MLPClassifier` models with different `hidden_layer_sizes` configurations
- Train each model and evaluate its accuracy on the test set
- Compare the performance of different network architectures
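Each element of the tuple becomes one layer of weights inside the fitted model. As a quick sanity check (a sketch reusing `X_train` and `y_train` from the example above), you can inspect the shapes of the fitted weight matrices via the `coefs_` attribute:

```python
# Fit a two-hidden-layer network and inspect its weight matrices
mlp = MLPClassifier(hidden_layer_sizes=(100, 50), max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)

# coefs_[i] holds the weights connecting layer i to layer i + 1
for i, w in enumerate(mlp.coefs_):
    print(f"layer {i} -> layer {i + 1}: weights of shape {w.shape}")
# With 20 input features and 3 classes, the shapes come out as
# (20, 100), (100, 50), and (50, 3)
```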
Some tips and heuristics for setting `hidden_layer_sizes`:
- Start with a simple architecture and gradually increase complexity
- Use more neurons in earlier layers and fewer in later layers
- Consider the input dimensionality when choosing the size of the first hidden layer
- Experiment with both deep (more layers) and wide (more neurons) architectures; a cross-validated search, as sketched below, makes this comparison systematic
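One systematic way to run these experiments is a cross-validated grid search. The snippet below is a minimal sketch using `GridSearchCV` on the training data from the example above; the candidate tuples are illustrative choices, not recommendations:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Candidate architectures to compare (illustrative choices)
param_grid = {"hidden_layer_sizes": [(50,), (100,), (100, 50), (100, 100)]}

grid = GridSearchCV(
    MLPClassifier(max_iter=1000, random_state=42),
    param_grid,
    cv=3,       # 3-fold cross-validation
    n_jobs=-1,  # use all available CPU cores
)
grid.fit(X_train, y_train)

print("Best architecture:", grid.best_params_["hidden_layer_sizes"])
print(f"Best cross-validated accuracy: {grid.best_score_:.3f}")
```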
Issues to consider:
- Deeper networks can learn more complex patterns but are prone to overfitting
- Larger networks require more computational resources and training time
- The optimal architecture depends on the complexity of the problem and available data
- Too few neurons may lead to underfitting, while too many can cause overfitting (see the diagnostic sketch below)
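A simple diagnostic for the last point is to compare train and test accuracy as the network grows. The sketch below reuses the train/test split from the example above; the two architectures are deliberately extreme to make the contrast visible:

```python
# A large gap between train and test accuracy suggests overfitting;
# low accuracy on both suggests underfitting.
for layers in [(2,), (300, 300)]:
    mlp = MLPClassifier(hidden_layer_sizes=layers, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    train_acc = mlp.score(X_train, y_train)
    test_acc = mlp.score(X_test, y_test)
    print(f"hidden_layer_sizes={layers}: train={train_acc:.3f}, test={test_acc:.3f}")
```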