
Configure MLPRegressor "beta_2" Parameter

The beta_2 parameter in scikit-learn's MLPRegressor controls the exponential decay rate for the second moment estimate in the Adam optimizer. It is only used when solver='adam', which is the default solver.

Adam (Adaptive Moment Estimation) is an optimization algorithm used for updating network weights. The beta_2 parameter specifically affects how the optimizer estimates the second moment (uncentered variance) of the gradients.
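For reference, the textbook form of the Adam update can be written as below. The symbols are assumptions taken from the standard formulation rather than from this page: g_t is the gradient at step t, m_t and v_t are the first and second moment estimates, \eta is the learning rate, and \epsilon is a small stability constant. scikit-learn's internal implementation may arrange the bias correction differently, so treat this as a sketch of the idea rather than the library's exact code.

```latex
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{m}_t &= \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t} \\
\theta_t &= \theta_{t-1} - \eta\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
\end{aligned}
```

Only the second line involves beta_2: it sets how much weight the previous estimate v_{t-1} keeps relative to the newest squared gradient.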

A higher beta_2 value results in a slower decay of the second moment estimate, which can help smooth out the learning process in the presence of noisy gradients. Conversely, a lower value allows for quicker adaptation to changes in the gradient.
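The trade-off between smoothing and adaptation can be illustrated with a minimal sketch of the second moment update on its own. The helper name second_moment_trace and the gradient stream are hypothetical, chosen only for illustration; the update follows the standard Adam formulation, not scikit-learn's internal code.

```python
def second_moment_trace(grads, beta_2):
    """Return the bias-corrected running average of squared gradients at each step."""
    v = 0.0
    trace = []
    for t, g in enumerate(grads, start=1):
        # Exponential moving average of the squared gradient
        v = beta_2 * v + (1.0 - beta_2) * g * g
        # Bias correction compensates for initializing v at zero
        trace.append(v / (1.0 - beta_2 ** t))
    return trace


# Hypothetical gradient stream: magnitude jumps from 1.0 to 10.0 after 100 steps,
# so the true squared gradient jumps from 1 to 100
grads = [1.0] * 100 + [10.0] * 100

for beta_2 in (0.9, 0.999):
    trace = second_moment_trace(grads, beta_2)
    print(f"beta_2={beta_2}: estimate 10 steps after the jump = {trace[109]:.2f}")
```

With beta_2=0.9 the estimate moves most of the way toward the new level within ten steps, while with beta_2=0.999 it is still far below it. The same inertia is what damps the effect of a single noisy gradient.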

The default value for beta_2 is 0.999, and valid values lie in the range [0, 1). In practice, values between 0.9 and 0.999 are commonly used, with the default of 0.999 being a popular choice for many problems.
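One rough way to compare candidate values is to convert each into the number of steps over which it averages. The sketch below uses two common rules of thumb for exponential moving averages, which are assumptions rather than anything scikit-learn reports: an effective window of 1 / (1 - beta_2) steps, and a half-life of log(0.5) / log(beta_2) steps. The function names are hypothetical.

```python
import math


def effective_window(beta_2):
    """Approximate number of recent steps that dominate the moving average."""
    return 1.0 / (1.0 - beta_2)


def half_life(beta_2):
    """Number of steps after which a squared gradient's weight has halved."""
    return math.log(0.5) / math.log(beta_2)


for beta_2 in (0.9, 0.99, 0.999):
    print(
        f"beta_2={beta_2}: window ~ {effective_window(beta_2):.0f} steps, "
        f"half-life ~ {half_life(beta_2):.1f} steps"
    )
```

Read this way, moving from 0.9 to 0.999 stretches the averaging horizon from roughly ten updates to roughly a thousand.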

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different beta_2 values
beta_2_values = [0.9, 0.99, 0.999, 0.9999]
mse_scores = []

for beta_2 in beta_2_values:
    mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=500, random_state=42, beta_2=beta_2)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"beta_2={beta_2}, MSE: {mse:.4f}")

Running the example gives an output like:

beta_2=0.9, MSE: 3.3350
beta_2=0.99, MSE: 21.6741
beta_2=0.999, MSE: 139.3109
beta_2=0.9999, MSE: 152.1716

The key steps in this example are:

  1. Generate a synthetic regression dataset with multiple features
  2. Split the data into train and test sets
  3. Train MLPRegressor models with different beta_2 values
  4. Evaluate the mean squared error of each model on the test set
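When comparing runs like these, it can help to check whether each fit stopped on its own or ran into the max_iter limit, since a model that has not finished training may score poorly for reasons unrelated to its final quality. The sketch below is one possible check, assuming a smaller dataset and a low max_iter so that it runs quickly. It reads the fitted attributes n_iter_ and loss_ from MLPRegressor and does not attempt to reproduce the numbers shown above.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Smaller dataset than the main example, used only to keep the sketch fast
X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

max_iter = 50  # deliberately low; an assumption for illustration
results = {}

for beta_2 in (0.9, 0.999):
    mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=max_iter,
                       random_state=42, beta_2=beta_2)
    with warnings.catch_warnings():
        # Silence the warning raised when the optimizer stops at max_iter
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        mlp.fit(X_train, y_train)
    hit_limit = mlp.n_iter_ >= max_iter
    results[beta_2] = (mlp.n_iter_, mlp.loss_, hit_limit)
    print(f"beta_2={beta_2}, iterations: {mlp.n_iter_}, "
          f"final training loss: {mlp.loss_:.4f}, hit max_iter: {hit_limit}")
```

If a run reports that it hit the limit, raising max_iter before comparing beta_2 values gives a fairer picture.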

Some tips and heuristics for setting beta_2:

Issues to consider:


