
Configure MLPRegressor "beta_1" Parameter

The beta_1 parameter in scikit-learn’s MLPRegressor controls the exponential decay rate for the first moment estimates in the Adam optimizer.

MLPRegressor uses the Adam (Adaptive Moment Estimation) optimizer by default (solver="adam"), an algorithm for first-order gradient-based optimization of stochastic objective functions. The beta_1 parameter is only used with this solver, and it influences how quickly the optimizer adapts to changes in the gradient.

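The first moment estimate is an exponentially weighted running average of the gradients, and beta_1 sets how fast older gradients fade from that average. A higher value gives more weight to past gradients and smooths the update direction, while a lower value makes the optimizer more responsive to recent gradients.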
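To make the weighting concrete, here is a minimal sketch of the standard Adam first moment update, m = beta_1 * m + (1 - beta_1) * g, with bias correction. It is an illustration rather than scikit-learn's internal code; the function name first_moment_trace and the gradient sequence are hypothetical.

```python
import numpy as np


def first_moment_trace(gradients, beta_1):
    """Return the bias-corrected first moment estimate after each step."""
    m = 0.0
    trace = []
    for t, g in enumerate(gradients, start=1):
        # Exponentially weighted average of past gradients
        m = beta_1 * m + (1.0 - beta_1) * g
        # Bias correction compensates for initializing m at zero
        trace.append(m / (1.0 - beta_1 ** t))
    return np.array(trace)


# Hypothetical gradient sequence: steady at 1.0, then flips to -1.0
gradients = np.array([1.0] * 20 + [-1.0] * 20)

for beta_1 in (0.5, 0.9, 0.99):
    trace = first_moment_trace(gradients, beta_1)
    # Count steps after the flip until the estimate changes sign
    steps_to_flip = int(np.argmax(trace[20:] < 0)) + 1
    print(f"beta_1={beta_1}: estimate turns negative {steps_to_flip} step(s) after the flip")
```

With a larger beta_1 the estimate should take more steps to follow the sign change, which is the "more weight to past gradients" behavior described above.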

The default value for beta_1 is 0.9, and valid values lie in the range [0, 1). In practice, values between 0.9 and 0.999 are commonly used, with 0.9 being a good starting point for most problems.
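As a quick check of how the parameter is set and validated, the sketch below builds a model with an explicit beta_1 and then tries an out-of-range value. The tiny dataset is hypothetical, and the exact error message depends on the scikit-learn version, so only the exception type is relied on here.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Tiny hypothetical dataset, used only to trigger parameter validation in fit()
rng = np.random.RandomState(0)
X = rng.rand(20, 3)
y = X.sum(axis=1)

# beta_1 is passed at construction; it only has an effect with solver="adam"
mlp = MLPRegressor(solver="adam", beta_1=0.9, max_iter=50, random_state=0)
print(mlp.get_params()["beta_1"])

# A value outside [0, 1) is expected to be rejected when fit() is called
try:
    MLPRegressor(beta_1=1.0, max_iter=50, random_state=0).fit(X, y)
    rejected = False
except ValueError as err:
    rejected = True
    print(f"Rejected: {err}")
```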

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different beta_1 values
beta_1_values = [0.8, 0.9, 0.95, 0.99]
mse_scores = []

for beta_1 in beta_1_values:
    mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=1000, random_state=42, beta_1=beta_1)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"beta_1={beta_1}, MSE: {mse:.3f}")

# Find the beta_1 value with the lowest test MSE
best_beta_1 = beta_1_values[np.argmin(mse_scores)]
print(f"Best beta_1: {best_beta_1}")

Running the example gives output like the following (exact values may differ across scikit-learn versions):

beta_1=0.8, MSE: 31.643
beta_1=0.9, MSE: 30.530
beta_1=0.95, MSE: 28.340
beta_1=0.99, MSE: 21.502
Best beta_1: 0.99
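When scores differ this much between beta_1 values, it can help to check whether each run actually converged or simply stopped at max_iter. The sketch below reads the fitted attributes n_iter_ and loss_ for two values; the smaller dataset and the lower max_iter are assumptions chosen to keep the run short, not the settings from the example above.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor

# Smaller dataset than the main example, to keep this check quick
X, y = make_regression(n_samples=300, n_features=10, noise=0.1, random_state=42)

max_iter = 200
results = {}
for beta_1 in [0.8, 0.99]:
    mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=max_iter,
                       random_state=42, beta_1=beta_1)
    with warnings.catch_warnings():
        # Silence the warning; the iteration count is inspected directly instead
        warnings.simplefilter("ignore", ConvergenceWarning)
        mlp.fit(X, y)
    results[beta_1] = (mlp.n_iter_, mlp.loss_)
    hit_cap = mlp.n_iter_ >= max_iter
    print(f"beta_1={beta_1}: n_iter_={mlp.n_iter_}, final loss={mlp.loss_:.3f}, "
          f"stopped at max_iter={hit_cap}")
```

If a run stops at the iteration cap, its score may say more about training time than about the beta_1 value itself.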

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Train MLPRegressor models with different beta_1 values
  4. Evaluate the mean squared error of each model on the test set
  5. Identify the best performing beta_1 value
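The steps above pick beta_1 by its test-set score, which can make the reported error optimistic. One alternative, sketched here as an assumption rather than as part of the original example, is to select beta_1 with cross-validation on the training data via GridSearchCV and keep the test set for a final check. The dataset size, cv=3, and max_iter=500 are illustrative choices to keep the run short.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor

# Smaller synthetic dataset than the main example, for a faster search
X, y = make_regression(n_samples=300, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Candidate beta_1 values, scored by cross-validated negative MSE
param_grid = {"beta_1": [0.8, 0.9, 0.95, 0.99]}
search = GridSearchCV(
    MLPRegressor(hidden_layer_sizes=(100,), max_iter=500, random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",
    cv=3,
)
search.fit(X_train, y_train)

# The best model is refit on the full training set (refit=True is the default)
test_mse = mean_squared_error(y_test, search.predict(X_test))
print(f"Best beta_1 (cross-validated): {search.best_params_['beta_1']}")
print(f"Test MSE of the refit model: {test_mse:.3f}")
```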

Some tips and heuristics for setting beta_1:

Issues to consider:


