The hidden_layer_sizes parameter in scikit-learn’s MLPRegressor defines the architecture of the neural network by specifying the number of hidden layers and the number of neurons in each layer.
Multi-layer Perceptron (MLP) is a feedforward neural network model that maps input data to a set of outputs. The hidden_layer_sizes parameter determines the network’s capacity and its ability to model complex relationships in the data.
This parameter accepts a tuple where each element represents the number of neurons in a hidden layer. The length of the tuple defines the number of hidden layers in the network.
The default value for hidden_layer_sizes is (100,), which creates a single hidden layer with 100 neurons. Common configurations include (50, 50) for two hidden layers with 50 neurons each, or (100, 50, 25) for three hidden layers with decreasing neuron counts.
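To see how the tuple maps onto the fitted network, here is a minimal sketch that inspects the n_layers_ and coefs_ attributes of a fitted MLPRegressor (the dataset and architecture are arbitrary choices for illustration):
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor
X, y = make_regression(n_samples=100, n_features=10, random_state=42)
mlp = MLPRegressor(hidden_layer_sizes=(100, 50, 25), max_iter=2000, random_state=42)
mlp.fit(X, y)
# n_layers_ counts the input and output layers too: 3 hidden layers -> 5 total
print(mlp.n_layers_)
# coefs_ holds one weight matrix per layer-to-layer connection:
# (10, 100), (100, 50), (50, 25), (25, 1)
print([w.shape for w in mlp.coefs_])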
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
import numpy as np
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different hidden_layer_sizes
layer_sizes = [(100,), (50,50), (100,50,25)]
mse_scores = []
for layers in layer_sizes:
    mlp = MLPRegressor(hidden_layer_sizes=layers, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"hidden_layer_sizes={layers}, MSE: {mse:.3f}")
# Find best configuration
best_config = layer_sizes[np.argmin(mse_scores)]
print(f"Best configuration: {best_config}")
Running the example gives an output like:
hidden_layer_sizes=(100,), MSE: 30.530
hidden_layer_sizes=(50, 50), MSE: 11.140
hidden_layer_sizes=(100, 50, 25), MSE: 5.085
Best configuration: (100, 50, 25)
The key steps in this example are:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Train MLPRegressor models with different hidden_layer_sizes configurations
- Evaluate the mean squared error (MSE) of each model on the test set
- Identify the best performing configuration
Some tips and heuristics for setting hidden_layer_sizes:
- Start with a simple architecture and gradually increase complexity
- Consider using a pyramid structure with decreasing neuron counts in deeper layers
- Experiment with both shallow (1-2 layers) and deep (3+ layers) architectures
- Use cross-validation to find the optimal configuration for your specific dataset (see the sketch after this list)
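One common way to cross-validate over architectures is a grid search with hidden_layer_sizes as the tuned parameter. This is a sketch; the candidate tuples and synthetic dataset are illustrative choices, not recommendations:
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=42)
# Candidate architectures to compare via 5-fold cross-validation
param_grid = {"hidden_layer_sizes": [(50,), (100,), (50, 50), (100, 50, 25)]}
grid = GridSearchCV(
    MLPRegressor(max_iter=1000, random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_)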
Issues to consider:
- Larger networks have more capacity but are prone to overfitting and longer training times
- The optimal architecture depends on the complexity of the underlying data relationships
- Too few neurons may lead to underfitting, while too many can cause overfitting
- Consider using regularization techniques (e.g., the alpha parameter) with larger networks, as sketched below
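As a minimal sketch of that last point, alpha sets the L2 penalty in MLPRegressor (its default is 0.0001); the specific values compared here are arbitrary:
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=42)
for alpha in [0.0001, 0.01, 1.0]:
    # Higher alpha applies a stronger L2 penalty on the network weights
    mlp = MLPRegressor(hidden_layer_sizes=(100, 50, 25), alpha=alpha,
                       max_iter=1000, random_state=42)
    scores = cross_val_score(mlp, X, y, scoring="neg_mean_squared_error", cv=3)
    print(f"alpha={alpha}: mean MSE={-scores.mean():.3f}")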