The `hidden_layer_sizes` parameter in scikit-learn's `MLPClassifier` determines the architecture of the neural network by specifying the number of hidden layers and the number of neurons in each layer.
`MLPClassifier` implements a multi-layer perceptron (MLP) neural network for classification tasks. The `hidden_layer_sizes` parameter defines the network's depth and width, which directly affect its capacity to learn complex patterns.
This parameter accepts a tuple where each element represents the number of neurons in a hidden layer. The length of the tuple determines the number of hidden layers.
The default value for `hidden_layer_sizes` is `(100,)`, which creates a single hidden layer with 100 neurons.
Common configurations include `(100,)`, `(100, 100)`, and `(100, 50, 25)` for increasing network depth.
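As a minimal sketch of the tuple format before the full example (the `(64, 32)` tuple here is an arbitrary choice for illustration):

```python
from sklearn.neural_network import MLPClassifier

# Default: a single hidden layer with 100 neurons
mlp_default = MLPClassifier()
print(mlp_default.hidden_layer_sizes)  # (100,)

# Two hidden layers: 64 neurons, then 32 neurons
mlp_custom = MLPClassifier(hidden_layer_sizes=(64, 32))
print(mlp_custom.hidden_layer_sizes)  # (64, 32)
```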
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=3, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different hidden_layer_sizes
layer_sizes = [(100,), (100, 50), (100, 100), (50, 25, 10)]
accuracies = []

for layers in layer_sizes:
    mlp = MLPClassifier(hidden_layer_sizes=layers, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"hidden_layer_sizes={layers}, Accuracy: {accuracy:.3f}")
```
Running the example produces output like:
```
hidden_layer_sizes=(100,), Accuracy: 0.885
hidden_layer_sizes=(100, 50), Accuracy: 0.895
hidden_layer_sizes=(100, 100), Accuracy: 0.895
hidden_layer_sizes=(50, 25, 10), Accuracy: 0.860
```
The key steps in this example are:
- Generate a synthetic multi-class classification dataset
- Split the data into train and test sets
- Create `MLPClassifier` models with different `hidden_layer_sizes` configurations
- Train each model and evaluate its accuracy on the test set
- Compare the performance of different network architectures
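Each element of the tuple becomes one layer of weights inside the fitted model. As a quick sanity check (a sketch reusing `X_train` and `y_train` from the example above), you can inspect the shapes of the fitted weight matrices via the `coefs_` attribute:

```python
# Fit a two-hidden-layer network and inspect its weight matrices
mlp = MLPClassifier(hidden_layer_sizes=(100, 50), max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)

# coefs_[i] holds the weights connecting layer i to layer i + 1
for i, w in enumerate(mlp.coefs_):
    print(f"layer {i} -> layer {i + 1}: weights of shape {w.shape}")
# With 20 input features and 3 classes, the shapes come out as
# (20, 100), (100, 50), and (50, 3)
```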
Some tips and heuristics for setting `hidden_layer_sizes`:
- Start with a simple architecture and gradually increase complexity
- Use more neurons in earlier layers and fewer in later layers
- Consider the input dimensionality when choosing the size of the first hidden layer
- Experiment with both deep (more layers) and wide (more neurons) architectures; a cross-validated search, as sketched below, makes this comparison systematic
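One systematic way to run these experiments is a cross-validated grid search. The snippet below is a minimal sketch using `GridSearchCV` on the training data from the example above; the candidate tuples are illustrative choices, not recommendations:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Candidate architectures to compare (illustrative choices)
param_grid = {"hidden_layer_sizes": [(50,), (100,), (100, 50), (100, 100)]}

grid = GridSearchCV(
    MLPClassifier(max_iter=1000, random_state=42),
    param_grid,
    cv=3,       # 3-fold cross-validation
    n_jobs=-1,  # use all available CPU cores
)
grid.fit(X_train, y_train)

print("Best architecture:", grid.best_params_["hidden_layer_sizes"])
print(f"Best cross-validated accuracy: {grid.best_score_:.3f}")
```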
Issues to consider:
- Deeper networks can learn more complex patterns but are prone to overfitting
- Larger networks require more computational resources and training time
- The optimal architecture depends on the complexity of the problem and available data
- Too few neurons may lead to underfitting, while too many can cause overfitting (see the diagnostic sketch below)
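A simple diagnostic for the last point is to compare train and test accuracy as the network grows. The sketch below reuses the train/test split from the example above; the two architectures are deliberately extreme to make the contrast visible:

```python
# A large gap between train and test accuracy suggests overfitting;
# low accuracy on both suggests underfitting.
for layers in [(2,), (300, 300)]:
    mlp = MLPClassifier(hidden_layer_sizes=layers, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    train_acc = mlp.score(X_train, y_train)
    test_acc = mlp.score(X_test, y_test)
    print(f"hidden_layer_sizes={layers}: train={train_acc:.3f}, test={test_acc:.3f}")
```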