The tol parameter in scikit-learn’s MLPRegressor controls the tolerance for optimization convergence.
Multi-layer Perceptron (MLP) is a type of artificial neural network that can be used for regression tasks. It consists of multiple layers of neurons and uses backpropagation for training.
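The tol parameter sets the smallest improvement that still counts as progress. With the stochastic solvers, training stops once the loss (or the validation score, if early stopping is enabled) has failed to improve by at least tol for n_iter_no_change consecutive epochs. It therefore decides when the optimizer considers the loss to have converged, which can affect both training time and model performance.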
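The stopping rule can be illustrated in plain Python. This is a simplified sketch of the documented behavior, not scikit-learn's actual implementation; the function name epochs_until_stop and the loss values are hypothetical.

```python
def epochs_until_stop(losses, tol=1e-4, n_iter_no_change=10):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified rule: an epoch counts as "no improvement" when its loss is
    not at least `tol` below the best loss seen so far.
    """
    best_loss = float("inf")
    stalled = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss > best_loss - tol:
            stalled += 1   # improvement smaller than tol
        else:
            stalled = 0    # meaningful improvement resets the counter
        best_loss = min(best_loss, loss)
        if stalled >= n_iter_no_change:
            return epoch
    return None


# Hypothetical loss curve that flattens out after the third epoch
losses = [1.0, 0.5, 0.4, 0.39995, 0.39994, 0.39993]
loose = epochs_until_stop(losses, tol=1e-4, n_iter_no_change=3)
strict = epochs_until_stop(losses, tol=1e-6, n_iter_no_change=3)
print(loose, strict)
```

With the looser tolerance the tiny improvements are ignored and the run stops, while the stricter tolerance still treats them as progress and keeps training.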
The default value for tol is 1e-4 (0.0001).
In practice, values between 1e-5 and 1e-3 are commonly used, depending on the desired trade-off between precision and training time.
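As a quick check, the default can be read from an unfitted estimator, together with n_iter_no_change, the related patience setting. The value 1e-5 below is only an illustrative choice from the range mentioned above.

```python
from sklearn.neural_network import MLPRegressor

# Inspect the defaults on an unfitted estimator
default_model = MLPRegressor()
print(default_model.tol)               # tolerance used by the stopping rule
print(default_model.n_iter_no_change)  # epochs without improvement before stopping

# A stricter tolerance (illustrative value)
strict_model = MLPRegressor(tol=1e-5, random_state=42)
print(strict_model.get_params()["tol"])
```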
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
import numpy as np
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different tol values
tol_values = [1e-2, 1e-3, 1e-4, 1e-5]
mse_scores = []
for tol in tol_values:
    mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=1000, tol=tol, random_state=42)
    mlp.fit(X_train, y_train)
    y_pred = mlp.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"tol={tol}, MSE: {mse:.4f}, Iterations: {mlp.n_iter_}")
# Find best tol value
best_tol = tol_values[np.argmin(mse_scores)]
print(f"\nBest tol value: {best_tol}")
Running the example gives an output like:
tol=0.01, MSE: 30.5298, Iterations: 1000
tol=0.001, MSE: 30.5298, Iterations: 1000
tol=0.0001, MSE: 30.5298, Iterations: 1000
tol=1e-05, MSE: 30.5298, Iterations: 1000
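Best tol value: 0.01
Note that every run above stopped at the max_iter limit of 1000 rather than through the tol criterion, so all four models are identical and tol had no effect here. The reported best value of 0.01 is simply the first entry, because np.argmin returns the first index on ties.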
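One way to tell whether tol or max_iter ended a run is to compare n_iter_ with max_iter after fitting. The sketch below assumes the default adam solver; the helper stopped_by_max_iter and the small dataset are hypothetical, and max_iter=5 is deliberately too low so that the limit is reached.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPRegressor


def stopped_by_max_iter(model):
    """True if training used every allowed epoch, so tol did not end the run."""
    return model.n_iter_ >= model.max_iter


# Small illustrative dataset and a deliberately low epoch limit
X_small, y_small = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5, random_state=0)

with warnings.catch_warnings():
    # scikit-learn warns when max_iter is reached before convergence
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    model.fit(X_small, y_small)

print(model.n_iter_, stopped_by_max_iter(model))
```

If the helper returns True for every tol value being compared, raising max_iter (or otherwise speeding up convergence) is needed before the comparison says anything about tol.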
The key steps in this example are:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Train MLPRegressor models with different tol values
- Evaluate the mean squared error (MSE) of each model on the test set
- Compare the number of iterations and MSE for different tol values
Some tips and heuristics for setting tol:
- Start with the default value of 1e-4 and adjust based on model performance and training time
- Decrease tol for potentially better performance, but at the cost of longer training time
- Increase tol to reduce training time, but be cautious of potential underfitting
Issues to consider:
- A smaller tol value may lead to more precise solutions but can significantly increase training time
- Too large a tol value might cause premature convergence, resulting in suboptimal performance
- The optimal tol value depends on the specific problem and dataset characteristics
- Monitor both model performance and training time when tuning this parameter
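Monitoring both quantities can be sketched by timing each fit alongside its test error. This is a minimal illustration, not a benchmark: the helper compare_tol, the smaller dataset, the network size, and max_iter=200 are all assumptions chosen to keep the run short.

```python
import time
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor


def compare_tol(tol_values, max_iter=200, random_state=42):
    """Fit one model per tol value and record MSE, epochs, and fit time."""
    X, y = make_regression(n_samples=300, n_features=10, noise=0.1,
                           random_state=random_state)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state)
    results = []
    for tol in tol_values:
        model = MLPRegressor(hidden_layer_sizes=(50,), max_iter=max_iter,
                             tol=tol, random_state=random_state)
        start = time.perf_counter()
        with warnings.catch_warnings():
            # Silence the warning raised when max_iter is reached
            warnings.simplefilter("ignore", category=ConvergenceWarning)
            model.fit(X_train, y_train)
        elapsed = time.perf_counter() - start
        mse = mean_squared_error(y_test, model.predict(X_test))
        results.append({"tol": tol, "mse": mse,
                        "n_iter": model.n_iter_, "seconds": elapsed})
    return results


results = compare_tol([1e-2, 1e-4])
for row in results:
    print(f"tol={row['tol']}, MSE: {row['mse']:.4f}, "
          f"Iterations: {row['n_iter']}, Time: {row['seconds']:.2f}s")
```

Timings depend on the machine, so they are best compared within a single run rather than across environments.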