The alpha parameter in scikit-learn’s MLPClassifier controls the strength of L2 regularization applied to the model’s weights.
MLPClassifier is a multi-layer perceptron neural network model used for classification tasks. It learns non-linear decision boundaries by training on the input data.
The alpha parameter adds a penalty term to the loss function, discouraging large weights and helping to prevent overfitting. Larger values of alpha result in stronger regularization.
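To make the effect of the penalty concrete, the following sketch fits the same small network with three alpha values and compares the sum of squared weights in each fitted model. The dataset sizes, the hidden layer size, and the helper name squared_weight_norm are assumptions chosen for illustration, and alpha=10.0 is deliberately larger than typical settings so the shrinkage is easy to see.

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Small synthetic dataset (sizes chosen only for illustration)
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_redundant=5, random_state=0)


def squared_weight_norm(model):
    """Sum of squared entries over all weight matrices (biases excluded)."""
    return sum(float(np.sum(w ** 2)) for w in model.coefs_)


norms = {}
for alpha in [1e-5, 1e-2, 10.0]:
    mlp = MLPClassifier(hidden_layer_sizes=(20,), alpha=alpha,
                        max_iter=500, random_state=0)
    with warnings.catch_warnings():
        # Hide convergence warnings; full convergence does not matter here
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        mlp.fit(X, y)
    norms[alpha] = squared_weight_norm(mlp)
    print(f"alpha={alpha:g}, sum of squared weights: {norms[alpha]:.2f}")
```

With a setup like this, the largest alpha should yield a noticeably smaller weight norm than the smallest one, which is the shrinkage the penalty term is meant to produce.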
The default value for alpha is 0.0001. In practice, values are often tuned in the range of 1e-5 to 1.0, depending on the specific problem and dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different alpha values
alpha_values = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]
accuracies = []
for alpha in alpha_values:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), alpha=alpha, max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"alpha={alpha:.5f}, Accuracy: {accuracy:.3f}")
Running the example gives an output like:
alpha=0.00001, Accuracy: 0.945
alpha=0.00010, Accuracy: 0.945
alpha=0.00100, Accuracy: 0.945
alpha=0.01000, Accuracy: 0.945
alpha=0.10000, Accuracy: 0.955
alpha=1.00000, Accuracy: 0.945
The key steps in this example are:
- Generate a synthetic classification dataset with informative and redundant features
- Split the data into train and test sets
- Train MLPClassifier models with different alpha values
- Evaluate the accuracy of each model on the test set
Some tips and heuristics for setting alpha:
- Start with the default value of 0.0001 and adjust based on model performance
- Use smaller alpha values for complex datasets with many features
- Increase alpha if the model shows signs of overfitting (high training accuracy, low test accuracy)
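One way to look for the overfitting signal mentioned in the tips is to compare training and test accuracy for each alpha. The sketch below reuses the dataset settings from the example above; the three alpha values and the gaps dictionary are assumptions for illustration, and the exact numbers will vary with the data and the scikit-learn version.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Same synthetic setup as the main example
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

gaps = {}
for alpha in [1e-4, 1e-2, 1.0]:
    mlp = MLPClassifier(hidden_layer_sizes=(100,), alpha=alpha,
                        max_iter=1000, random_state=42)
    mlp.fit(X_train, y_train)
    # score() returns mean accuracy for classifiers
    train_acc = mlp.score(X_train, y_train)
    test_acc = mlp.score(X_test, y_test)
    gaps[alpha] = train_acc - test_acc
    print(f"alpha={alpha:g}, train={train_acc:.3f}, "
          f"test={test_acc:.3f}, gap={gaps[alpha]:.3f}")
```

A large positive gap suggests the model is memorizing the training set, in which case a larger alpha is worth trying. A small gap combined with low accuracy on both sets points toward underfitting instead.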
Issues to consider:
- The optimal alpha value depends on the size and complexity of the dataset
- Too small alpha values may lead to overfitting, while too large values can cause underfitting
- alpha interacts with other hyperparameters like learning rate and network architecture
- Cross-validation can help find the best alpha value for your specific problem
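As a minimal sketch of the cross-validation suggestion, the snippet below searches a log-spaced grid of alpha values with GridSearchCV. The grid bounds follow the 1e-5 to 1.0 range mentioned earlier. The dataset size, hidden layer size, and the choice of 3 folds are assumptions picked to keep the run short.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)

# Six log-spaced candidates from 1e-5 to 1.0
param_grid = {"alpha": np.logspace(-5, 0, 6)}

search = GridSearchCV(
    MLPClassifier(hidden_layer_sizes=(50,), max_iter=500, random_state=42),
    param_grid,
    cv=3,                 # 3-fold cross-validation
    scoring="accuracy",
)
search.fit(X, y)

print("Best alpha:", search.best_params_["alpha"])
print(f"Mean CV accuracy: {search.best_score_:.3f}")
```

Because alpha interacts with other settings, the same grid could be extended with keys such as hidden_layer_sizes or learning_rate_init, at the cost of a longer search.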