
Configure AdaBoostRegressor "loss" Parameter

The loss parameter in scikit-learn's AdaBoostRegressor determines the loss function used to update the sample weights after each boosting iteration.

AdaBoost (Adaptive Boosting) is an ensemble learning method that combines multiple weak learners, typically decision trees, to create a strong predictor. The loss parameter affects how much importance the algorithm gives to samples with large prediction errors between iterations.

The loss parameter accepts three options: 'linear', 'square', and 'exponential'. Each option defines a different way to calculate the loss and update the sample weights.
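
Under the hood, AdaBoostRegressor implements the AdaBoost.R2 algorithm, which first scales each sample's absolute error by the largest error in the current round and then applies the chosen transform. The following is a simplified NumPy sketch of that calculation with made-up values; it illustrates the algorithm, not scikit-learn's internal code:

import numpy as np

# Made-up targets and one weak learner's predictions
y_true = np.array([3.0, -1.5, 2.0, 0.5])
y_pred = np.array([2.5, -1.0, 4.0, 0.5])

# AdaBoost.R2 first normalizes absolute errors into [0, 1]
abs_error = np.abs(y_pred - y_true)
norm_error = abs_error / abs_error.max()

# Each loss option is a different transform of the normalized error
losses = {
    'linear': norm_error,                      # normalized error as-is
    'square': norm_error ** 2,                 # squared normalized error
    'exponential': 1.0 - np.exp(-norm_error),  # 1 - exp(-normalized error)
}
for name, loss in losses.items():
    print(f"{name:>11}: {np.round(loss, 3)}")

The weighted average of this per-sample loss then determines each weak learner's contribution and how the sample weights are boosted for the next iteration.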

The default value for loss is 'linear'. In practice, 'linear' and 'square' are commonly used, while 'exponential' is less frequent due to its sensitivity to outliers. The example below compares all three options on a synthetic dataset.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_squared_error
import numpy as np

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different loss functions
loss_functions = ['linear', 'square', 'exponential']
mse_scores = []

for loss in loss_functions:
    regressor = AdaBoostRegressor(n_estimators=100, loss=loss, random_state=42)
    regressor.fit(X_train, y_train)
    y_pred = regressor.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"Loss function: {loss}, MSE: {mse:.4f}")

# Find the best performing loss function
best_loss = loss_functions[np.argmin(mse_scores)]
print(f"\nBest performing loss function: {best_loss}")

Running the example gives an output like:

Loss function: linear, MSE: 9149.6702
Loss function: square, MSE: 7670.0453
Loss function: exponential, MSE: 9415.7582

Best performing loss function: square

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Train AdaBoostRegressor models with different loss functions
  4. Evaluate the mean squared error of each model on the test set
  5. Identify the best performing loss function
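
Note that this comparison rests on a single train/test split, which can favor one option by chance. A minimal sketch of a cross-validated comparison using GridSearchCV, reusing the X and y generated above:

from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import GridSearchCV

# Cross-validate the three loss options; 5-fold CV averages out split noise
param_grid = {'loss': ['linear', 'square', 'exponential']}
search = GridSearchCV(
    AdaBoostRegressor(n_estimators=100, random_state=42),
    param_grid,
    scoring='neg_mean_squared_error',
    cv=5,
)
search.fit(X, y)
print(f"Best loss: {search.best_params_['loss']}")
print(f"Best CV MSE: {-search.best_score_:.4f}")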

Some tips for choosing the loss parameter:

  - Start with the default 'linear'; it is a reasonable, robust baseline.
  - Try 'square' when larger errors should be penalized more heavily.
  - Use 'exponential' with care, since it can be sensitive to outliers.
  - Compare the options empirically, ideally with cross-validation rather than a single train/test split.

Issues to consider:

  - The best loss function is dataset-dependent; the ranking on one synthetic dataset does not generalize to other problems.
  - The loss parameter interacts with other hyperparameters such as n_estimators and learning_rate, so tune them together.
  - Noisy targets and outliers can change which loss performs best; the sketch below probes this directly.
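
To probe the outlier issue concretely, one option is to corrupt a few training targets and re-run the comparison. A sketch reusing the same synthetic setup; the number and magnitude of the injected outliers are arbitrary choices for illustration:

import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Inject 10 large outliers into the training targets
rng = np.random.default_rng(42)
y_train_noisy = y_train.copy()
idx = rng.choice(len(y_train_noisy), size=10, replace=False)
y_train_noisy[idx] += rng.normal(0, 10 * y_train.std(), size=10)

# Compare how each loss option copes with the corrupted targets
for loss in ['linear', 'square', 'exponential']:
    model = AdaBoostRegressor(n_estimators=100, loss=loss, random_state=42)
    model.fit(X_train, y_train_noisy)
    mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"Loss: {loss}, test MSE with outliers: {mse:.4f}")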
