
Configure AdaBoostRegressor "estimator" Parameter

The estimator parameter in scikit-learn’s AdaBoostRegressor determines the base regressor used in the ensemble.

AdaBoost (Adaptive Boosting) is an ensemble learning method that combines multiple weak learners to create a strong predictor. The estimator parameter specifies the type of weak learner to use as the base model.

By default, AdaBoostRegressor uses DecisionTreeRegressor with max_depth=3 as the base estimator. This default works well in many cases, but changing the base estimator can significantly impact the model’s performance and characteristics.
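Leaving estimator=None is therefore the same as passing the depth-3 tree explicitly. A minimal sketch of the two equivalent configurations:

from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# estimator=None falls back to a depth-3 decision tree
default_model = AdaBoostRegressor(random_state=42)

# Explicitly passing the same weak learner
explicit_model = AdaBoostRegressor(
    estimator=DecisionTreeRegressor(max_depth=3),
    random_state=42
)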

Common alternatives to the default include decision trees with different depths, linear models like LinearRegression, or other regressors that can be considered “weak learners”.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and evaluate AdaBoostRegressor with different base estimators
estimators = {
    'Default': None,
    'DecisionTree(max_depth=1)': DecisionTreeRegressor(max_depth=1),
    'LinearRegression': LinearRegression()
}

for name, estimator in estimators.items():
    model = AdaBoostRegressor(estimator=estimator, random_state=42)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"{name} - MSE: {mse:.4f}")

Running the example gives an output like:

Default - MSE: 3767.9097
DecisionTree(max_depth=1) - MSE: 6149.6108
LinearRegression - MSE: 0.0097

LinearRegression achieves a near-zero error here because make_regression generates a linear target with relatively little noise, while the depth-1 stumps underfit compared to the default depth-3 trees. On non-linear data, tree-based weak learners are usually the stronger choice.

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Create AdaBoostRegressor instances with different base estimators
  4. Train models and evaluate using mean squared error
  5. Compare the performance of different base estimators (see the grid-search sketch below)
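To select the base estimator more systematically, you can treat estimator as a hyperparameter and compare candidates with cross-validation. Below is a minimal sketch using GridSearchCV; it assumes X_train and y_train from the example above, and the candidate list and grid values are illustrative choices rather than recommendations:

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

# Candidate base estimators searched alongside the ensemble size
param_grid = {
    'estimator': [
        DecisionTreeRegressor(max_depth=1),
        DecisionTreeRegressor(max_depth=3),
        LinearRegression()
    ],
    'n_estimators': [50, 100]
}

grid = GridSearchCV(
    AdaBoostRegressor(random_state=42),
    param_grid,
    scoring='neg_mean_squared_error',
    cv=5
)
grid.fit(X_train, y_train)
print(grid.best_params_)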

Some tips and heuristics for setting the estimator parameter:

  - Keep the base estimator simple ("weak"); shallow decision trees such as the default depth-3 tree, or depth-1 stumps, are the usual starting point.
  - Match the base estimator to the structure of the data: linear models can excel when the target is close to linear (as in the example above), while trees handle non-linear relationships.
  - Tune the base estimator's own hyperparameters (e.g. max_depth) together with n_estimators and learning_rate, ideally with cross-validation.

Issues to consider:

  - The base estimator must support sample weighting (a sample_weight argument in fit), because AdaBoost reweights the training samples at each boosting iteration; the check below shows one way to verify this.
  - Complex base estimators increase training time and can overfit, which undermines the benefit of boosting weak learners.
  - Results are dataset-dependent, so compare candidate estimators on a held-out set or with cross-validation rather than relying on the default.

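Because AdaBoost reweights the training samples at every boosting round, a custom base estimator must accept a sample_weight argument in fit. A quick way to check this is scikit-learn's has_fit_parameter utility; KNeighborsRegressor is included here only as an example of a regressor that does not qualify:

from sklearn.utils.validation import has_fit_parameter
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor

# AdaBoost requires fit(X, y, sample_weight=...) on the base estimator
for reg in [DecisionTreeRegressor(), LinearRegression(), KNeighborsRegressor()]:
    supported = has_fit_parameter(reg, "sample_weight")
    print(f"{reg.__class__.__name__}: sample_weight supported = {supported}")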
