
Configure StackingRegressor "estimators" Parameter

The estimators parameter in scikit-learn’s StackingRegressor defines the set of first-level estimators used in the stacking ensemble.

Stacking is an ensemble learning technique that combines multiple base models to improve prediction performance. The estimators parameter specifies the list of these base models.
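
Concretely, each entry in the list is a (name, estimator) tuple, where the name is a unique string identifying that base model. A minimal sketch of the expected format (the model choices here are only illustrative):

from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

# List of (name, estimator) tuples; each name must be a unique string
estimators = [
    ('ridge', Ridge()),
    ('dt', DecisionTreeRegressor(random_state=42))
]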

Effective configuration of estimators is crucial for the performance of the stacking ensemble. It typically involves selecting diverse models that capture different aspects of the data.

The estimators parameter has no default value and must be supplied explicitly. Common configurations mix different algorithm types, such as linear models, decision trees, and neural networks.
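
As a rough sketch of such a mix (the specific models and hyperparameters are illustrative assumptions, not requirements), a configuration could combine a linear model, a decision tree, and a small neural network:

from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor

# Diverse base models: linear, tree-based, and a small neural network
mixed_estimators = [
    ('ridge', Ridge()),
    ('dt', DecisionTreeRegressor(max_depth=5, random_state=42)),
    ('mlp', MLPRegressor(hidden_layer_sizes=(50,), max_iter=1000, random_state=42))
]
stacking = StackingRegressor(estimators=mixed_estimators, final_estimator=LinearRegression())

The complete example below compares several estimator configurations on a synthetic regression dataset.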

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define different estimator configurations
configs = [
    ("Basic", [('lr', LinearRegression())]),
    ("Diverse", [('lr', LinearRegression()), ('ridge', Ridge()), ('dt', DecisionTreeRegressor())]),
    ("Complex", [('lr', LinearRegression()), ('ridge', Ridge()), ('dt', DecisionTreeRegressor()),
                 ('dt2', DecisionTreeRegressor(max_depth=5))])
]

for name, estimators in configs:
    stacking = StackingRegressor(estimators=estimators, final_estimator=LinearRegression())
    stacking.fit(X_train, y_train)
    y_pred = stacking.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"{name} configuration - MSE: {mse:.4f}")

Running the example gives an output like:

Basic configuration - MSE: 0.0095
Diverse configuration - MSE: 0.0095
Complex configuration - MSE: 0.0095

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Define different estimator configurations for StackingRegressor
  4. Train and evaluate models with each configuration (see the inspection sketch after this list)
  5. Compare mean squared error (MSE) for each setup
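
After training, the fitted base models and meta-model can be inspected. A brief sketch, continuing from the last stacking model fitted in the loop above:

# Inspect the fitted ensemble from the last configuration in the loop
print(stacking.estimators_)             # fitted base models
print(stacking.final_estimator_)        # fitted meta-model (LinearRegression here)
print(stacking.final_estimator_.coef_)  # weight given to each base model's predictions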

Tips for configuring estimators:

  - Choose base models that are diverse (for example, a linear model alongside a tree-based model) so their errors are less correlated
  - Give each estimator a unique, descriptive name; the name is how the fitted model is referenced later
  - Tune promising base models individually before combining them in the ensemble

Issues to consider:

  - Training cost grows with the number of base estimators, since each one is fit with internal cross-validation to produce meta-features for the final estimator
  - Adding near-duplicate base models rarely helps, as the nearly identical MSE values above suggest
  - Related options such as final_estimator, cv, and passthrough also affect how base predictions are combined (see the sketch below)
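
The estimators parameter interacts with other StackingRegressor options such as cv and passthrough. A short sketch with illustrative values:

from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.tree import DecisionTreeRegressor

# Each base estimator is fit across cv folds to produce out-of-fold predictions
# for the final estimator; passthrough=True also passes the raw features to it.
stacking = StackingRegressor(
    estimators=[('ridge', Ridge()), ('dt', DecisionTreeRegressor(max_depth=5))],
    final_estimator=LinearRegression(),
    cv=5,
    passthrough=True
)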


