
Configure BaggingRegressor "oob_score" Parameter

The oob_score parameter in scikit-learn’s BaggingRegressor enables out-of-bag (OOB) estimation of the model’s generalization performance.

Bagging (Bootstrap Aggregating) is an ensemble method that fits multiple base estimators on random subsets of the original dataset. The oob_score parameter allows for model evaluation using samples not used in training individual estimators.

When oob_score is set to True, the model computes an additional score using only the samples that each base estimator did not see during training. This gives an estimate of the model’s generalization performance without the need for a separate validation set.
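To make this concrete, here is a small sketch (plain NumPy, separate from the scikit-learn example below) showing why out-of-bag samples exist at all: a bootstrap sample of n rows drawn with replacement leaves roughly a third of the rows unused, and those unused rows are what the OOB score is computed on.

import numpy as np

rng = np.random.default_rng(42)
n_samples = 1000

# A bootstrap sample: n_samples row indices drawn with replacement
bootstrap_idx = rng.integers(0, n_samples, size=n_samples)

# Out-of-bag rows are those that never appear in the bootstrap sample
oob_mask = ~np.isin(np.arange(n_samples), bootstrap_idx)

print(f"Fraction of rows out of bag: {oob_mask.mean():.3f}")  # roughly 0.37 (about 1/e)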

The default value for oob_score is False. It’s commonly set to True when you want to get an estimate of the model’s performance without using a separate validation set or cross-validation.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import BaggingRegressor
from sklearn.metrics import r2_score

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with oob_score=False (default)
br_no_oob = BaggingRegressor(random_state=42)
br_no_oob.fit(X_train, y_train)
y_pred_no_oob = br_no_oob.predict(X_test)
r2_no_oob = r2_score(y_test, y_pred_no_oob)

# Train with oob_score=True
br_oob = BaggingRegressor(oob_score=True, random_state=42)
br_oob.fit(X_train, y_train)
y_pred_oob = br_oob.predict(X_test)
r2_oob = r2_score(y_test, y_pred_oob)

print(f"R-squared (oob_score=False): {r2_no_oob:.3f}")
print(f"R-squared (oob_score=True): {r2_oob:.3f}")
print(f"OOB Score: {br_oob.oob_score_:.3f}")

Running the example gives an output like:

R-squared (oob_score=False): 0.809
R-squared (oob_score=True): 0.809

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Train two BaggingRegressor models, one with oob_score=False and one with oob_score=True
  4. Evaluate both models using R-squared score on the test set
  5. Print the OOB score for the model with oob_score=True
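As a follow-up to step 5: when oob_score=True, the fitted model also exposes an oob_prediction_ attribute holding the averaged out-of-bag prediction for each training sample, and the OOB score is the R-squared of those predictions against the training targets. Below is a minimal sketch of that relationship, reusing X_train and y_train from the example above and a larger ensemble so that (almost) every sample is left out of bag at least once.

from sklearn.ensemble import BaggingRegressor
from sklearn.metrics import r2_score

# Larger ensemble so nearly every training sample is out of bag for at least one estimator
br = BaggingRegressor(n_estimators=100, oob_score=True, random_state=42)
br.fit(X_train, y_train)

# oob_prediction_ averages the predictions of the estimators that did not see each sample
r2_from_oob_predictions = r2_score(y_train, br.oob_prediction_)

print(f"oob_score_: {br.oob_score_:.3f}")
print(f"R-squared of oob_prediction_: {r2_from_oob_predictions:.3f}")  # should match oob_score_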

Tips for using oob_score:

  - Set oob_score=True when data is limited and you would rather not hold out a separate validation set or run cross-validation.
  - For BaggingRegressor the OOB score is an R-squared computed from out-of-bag predictions, so it is directly comparable to r2_score on a test set.
  - Use enough estimators (n_estimators) that every training sample is left out of bag at least once; with very few estimators some samples may never receive an OOB prediction, and scikit-learn will warn about it.

Issues to consider:

  - oob_score=True relies on bootstrap sampling; combining it with bootstrap=False raises an error.
  - Computing the OOB score adds overhead during fit, since each base estimator must also predict on its out-of-bag samples.
  - The OOB estimate can be noisy for small datasets or small ensembles and may differ from a held-out test score or cross-validation (see the sketch below).
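A quick way to sanity-check the OOB estimate is to compare it with k-fold cross-validation on the same training data. The sketch below reuses X_train and y_train from the example above (the exact numbers will vary with the data and the random seed):

from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import cross_val_score

# OOB estimate: a single fit, with validation coming from the bootstrap leftovers
br = BaggingRegressor(n_estimators=100, oob_score=True, random_state=42)
br.fit(X_train, y_train)

# 5-fold cross-validation: five fits, each scored (R-squared) on a held-out fold
cv_scores = cross_val_score(
    BaggingRegressor(n_estimators=100, random_state=42),
    X_train, y_train, cv=5, scoring="r2",
)

print(f"OOB R-squared: {br.oob_score_:.3f}")
print(f"Cross-validated R-squared: {cv_scores.mean():.3f} (std {cv_scores.std():.3f})")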



See Also