
Configure StackingClassifier "passthrough" Parameter

The passthrough parameter in scikit-learn’s StackingClassifier determines whether the original input features are passed to the final estimator (the meta-classifier) alongside the base classifiers' predictions.

StackingClassifier is an ensemble method that combines multiple base classifiers by training a meta-classifier on their cross-validated predictions. By default, those predictions are the only inputs the meta-classifier sees.

When passthrough=True, the meta-classifier receives the base classifiers' predictions concatenated with the original features. This can improve performance, but it widens the meta-classifier's input by the number of original features, which adds training cost and can make overfitting more likely.

The default value for passthrough is False. Setting it to True can be beneficial when the original features contain information not fully captured by the base classifiers.
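The effect on the meta-classifier's input can be checked directly. The following is a minimal sketch, not part of the original example: it fits two small stacks and compares the width of the array returned by transform(), which is the input the final estimator works with. The helper name meta_feature_width and the reduced dataset and forest sizes are assumptions chosen to keep the run short. For a binary problem with probability-based base classifiers, scikit-learn typically keeps one prediction column per base classifier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

# Small synthetic binary problem (sizes are illustrative assumptions)
X, y = make_classification(n_samples=200, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)


def meta_feature_width(passthrough):
    """Hypothetical helper: fit a stack and return the number of columns
    that the final estimator receives."""
    model = StackingClassifier(
        estimators=[
            ('rf', RandomForestClassifier(n_estimators=20, random_state=42)),
            ('lr', LogisticRegression(max_iter=1000)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        passthrough=passthrough,
        cv=3,
    )
    model.fit(X, y)
    # transform() returns the base predictions, plus X when passthrough=True
    return model.transform(X).shape[1]


width_without = meta_feature_width(False)
width_with = meta_feature_width(True)

print(f"Meta-features with passthrough=False: {width_without}")
print(f"Meta-features with passthrough=True:  {width_with}")
print(f"Extra columns from original features: {width_with - width_without}")
```

The difference between the two widths should equal the number of original features, which is the extra dimensionality the meta-classifier has to handle.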

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import StackingClassifier
from sklearn.metrics import roc_auc_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define base classifiers
base_classifiers = [
    ('rf', RandomForestClassifier(n_estimators=100, random_state=42)),
    ('lr', LogisticRegression(random_state=42))
]

# Create StackingClassifier with passthrough=False
stacking_false = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=LogisticRegression(),
    passthrough=False,
    cv=5
)

# Create StackingClassifier with passthrough=True
stacking_true = StackingClassifier(
    estimators=base_classifiers,
    final_estimator=LogisticRegression(),
    passthrough=True,
    cv=5
)

# Fit and evaluate models
for model in [stacking_false, stacking_true]:
    model.fit(X_train, y_train)
    y_pred_proba = model.predict_proba(X_test)[:, 1]
    auc = roc_auc_score(y_test, y_pred_proba)
    print(f"AUC score with passthrough={model.passthrough}: {auc:.3f}")

Running the example gives output similar to the following (exact values may vary with the scikit-learn version):

AUC score with passthrough=False: 0.973
AUC score with passthrough=True: 0.979

The key steps in this example are:

  1. Generate a synthetic classification dataset with informative and redundant features
  2. Split the data into train and test sets
  3. Define the base classifiers (a random forest and a logistic regression) and create two StackingClassifier models, one with passthrough=False and one with passthrough=True
  4. Train both models and evaluate their performance using ROC AUC score
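
A single train/test split can make a small difference in AUC look more meaningful than it is. One way to choose the setting more carefully is to treat passthrough as a hyperparameter and compare both values with cross-validation. The sketch below uses GridSearchCV for this; the smaller dataset, the 25-tree forest, and the 3-fold settings are assumptions made to keep the run short, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Smaller dataset than the main example so the search finishes quickly
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_redundant=5, n_classes=2, random_state=42)

stack = StackingClassifier(
    estimators=[
        ('rf', RandomForestClassifier(n_estimators=25, random_state=42)),
        ('lr', LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,  # inner folds used to build the meta-features
)

# Outer cross-validation compares the two passthrough settings by ROC AUC
search = GridSearchCV(
    stack,
    param_grid={'passthrough': [False, True]},
    scoring='roc_auc',
    cv=3,
)
search.fit(X, y)

results = search.cv_results_
for params, mean, std in zip(results['params'],
                             results['mean_test_score'],
                             results['std_test_score']):
    print(f"passthrough={params['passthrough']}: "
          f"mean AUC {mean:.3f} (+/- {std:.3f})")
print("Selected:", search.best_params_)
```

If the mean scores differ by less than their standard deviations, the simpler default (passthrough=False) is usually a reasonable choice.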

Tips for configuring the passthrough parameter:

Issues to consider:


