The weights parameter in scikit-learn’s VotingRegressor allows you to assign different importance to each base regressor in the ensemble.
VotingRegressor combines the predictions of several regressors by averaging them, which often yields a more robust model; the weights parameter controls how much each regressor contributes to that average.
By default, weights is set to None, which means all regressors have equal importance. Custom weights can be used to give more influence to better-performing or more reliable regressors.
Common configurations include equal weights (e.g., [1, 1, 1]), normalized weights based on individual regressor performance, or weights determined through cross-validation.
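One performance-based scheme can be sketched as follows: score each base regressor with cross-validation, then use normalized inverse-MSE values as weights so that lower-error models contribute more. The inverse-MSE weighting and the normalization to sum to 1 are illustrative choices, not the only reasonable ones.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=42)
estimators = [('rf', RandomForestRegressor(n_estimators=50, random_state=42)),
              ('lr', LinearRegression())]

# cross_val_score returns negative MSE; flip the sign to get MSE per model.
mses = [-cross_val_score(est, X, y, cv=3,
                         scoring='neg_mean_squared_error').mean()
        for _, est in estimators]

# Weight each model by inverse MSE, normalized to sum to 1.
inv = np.array([1.0 / m for m in mses])
weights = (inv / inv.sum()).tolist()

vr = VotingRegressor(estimators=estimators, weights=weights)
vr.fit(X, y)
print([round(w, 3) for w in weights])
```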
```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create base regressors
rf = RandomForestRegressor(n_estimators=100, random_state=42)
lr = LinearRegression()
svr = SVR(kernel='rbf')

# Create VotingRegressor instances with different weight configurations
vr_equal = VotingRegressor(estimators=[('rf', rf), ('lr', lr), ('svr', svr)])
vr_weighted = VotingRegressor(estimators=[('rf', rf), ('lr', lr), ('svr', svr)],
                              weights=[2, 1, 1])

# Train and evaluate models
models = [vr_equal, vr_weighted]
for model in models:
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    print(f"{model.__class__.__name__} - Weights: {model.weights}, MSE: {mse:.4f}")
```
Running the example gives an output like:

```
VotingRegressor - Weights: None, MSE: 2571.5179
VotingRegressor - Weights: [2, 1, 1], MSE: 2423.3455
```
Key steps in this example:
- Generate a synthetic regression dataset
- Split the data into train and test sets
- Create base regressors (RandomForestRegressor, LinearRegression, SVR)
- Create VotingRegressor instances with different weight configurations
- Train the models and evaluate their performance using mean squared error
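Under the hood, the ensemble prediction is the weighted arithmetic mean of the base models' predictions. This can be verified by recomputing it by hand from the fitted sub-estimators (a small sketch with two illustrative base models):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
estimators = [('rf', RandomForestRegressor(n_estimators=20, random_state=0)),
              ('lr', LinearRegression())]
weights = [2, 1]

vr = VotingRegressor(estimators=estimators, weights=weights).fit(X, y)

# Recompute the ensemble output as a weighted mean over the fitted base models.
base_preds = np.array([est.predict(X) for est in vr.estimators_])
manual = np.average(base_preds, axis=0, weights=weights)
print(np.allclose(vr.predict(X), manual))  # → True
```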
Tips and heuristics for setting weights:
- Start with equal weights and adjust based on individual regressor performance
- Use cross-validation to determine optimal weights
- Consider the strengths and weaknesses of each base regressor when assigning weights
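Because weights is an ordinary constructor parameter, cross-validated tuning can be done with GridSearchCV over a set of candidate weight vectors. The grid below is a small, hand-picked assumption for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=8, noise=0.1, random_state=42)

vr = VotingRegressor(estimators=[
    ('rf', RandomForestRegressor(n_estimators=50, random_state=42)),
    ('lr', LinearRegression())])

# Search a small grid of candidate weight vectors via 3-fold cross-validation.
param_grid = {'weights': [[1, 1], [2, 1], [1, 2], [3, 1]]}
search = GridSearchCV(vr, param_grid, cv=3,
                      scoring='neg_mean_squared_error')
search.fit(X, y)
print(search.best_params_['weights'])
```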
Issues to consider:
- Weights should be non-negative values
- Only the relative scale of weights matters, since predictions are normalized by the weight sum (e.g., [1, 1, 2] is equivalent to [0.5, 0.5, 1])
- Overfitting can occur if weights are tuned too aggressively to the training data
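The scale-equivalence point above is easy to check directly: two proportional weight vectors produce identical predictions. A minimal sketch with two deterministic base models:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression, Ridge

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=1)
estimators = [('lr', LinearRegression()), ('ridge', Ridge(alpha=1.0))]

# Two weight vectors that differ only by a constant factor of 2.
vr_a = VotingRegressor(estimators=estimators, weights=[1, 2]).fit(X, y)
vr_b = VotingRegressor(estimators=estimators, weights=[0.5, 1.0]).fit(X, y)

# Proportional weights yield identical ensemble predictions.
print(np.allclose(vr_a.predict(X), vr_b.predict(X)))  # → True
```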