Configure ExtraTreesRegressor "oob_score" Parameter

The oob_score parameter in scikit-learn’s ExtraTreesRegressor enables out-of-bag (OOB) error estimation during training.

Extra Trees (Extremely Randomized Trees) is an ensemble method similar to Random Forests, but with additional randomization in the tree-building process. It creates multiple decision trees and aggregates their predictions.

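When oob_score is set to True, each training sample is predicted using only the trees whose bootstrap sample did not include it. The score of these out-of-bag predictions is stored in the oob_score_ attribute; for a regressor it is the R² (coefficient of determination), not an accuracy. This gives an estimate of the model’s generalization performance without needing a separate validation set.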
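To make the meaning of the score concrete, the following sketch checks that oob_score_ matches the R² computed by hand from the fitted model's oob_prediction_ attribute. The dataset size and seeds are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import r2_score

# Small synthetic dataset (sizes and seeds are illustrative assumptions)
X, y = make_regression(n_samples=300, n_features=10, noise=0.1, random_state=0)

model = ExtraTreesRegressor(n_estimators=100, bootstrap=True, oob_score=True, random_state=0)
model.fit(X, y)

# oob_prediction_[i] averages only the trees that did not see row i during fitting
manual_r2 = r2_score(y, model.oob_prediction_)

print(f"oob_score_:               {model.oob_score_:.6f}")
print(f"R^2 from oob_prediction_: {manual_r2:.6f}")
```

The two printed values should agree, which shows that oob_score_ is simply R² evaluated on the out-of-bag predictions.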

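By default, oob_score is set to False. ExtraTreesRegressor also defaults to bootstrap=False, so you must set bootstrap=True to use it; otherwise fit raises a ValueError. It’s commonly enabled when you want a quick estimate of model performance from the training data alone, since the score is computed at the end of fitting.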
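Because ExtraTreesRegressor does not bootstrap by default, enabling oob_score on its own is a common mistake. The short sketch below shows what happens in that case. The dataset is an arbitrary illustration, and the exact error message may differ between scikit-learn versions.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

# bootstrap is left at its default (False), so there are no out-of-bag samples
model = ExtraTreesRegressor(n_estimators=10, oob_score=True, random_state=0)

try:
    model.fit(X, y)
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")
```

Passing bootstrap=True alongside oob_score=True avoids the error.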

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit ExtraTreesRegressor with oob_score=True
et_oob = ExtraTreesRegressor(n_estimators=100, bootstrap=True, oob_score=True, random_state=42)
et_oob.fit(X_train, y_train)

# Print OOB score
print(f"OOB Score: {et_oob.oob_score_:.3f}")

# Evaluate on test set
y_pred = et_oob.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Test MSE: {mse:.3f}")

# Compare with model without OOB scoring
et_no_oob = ExtraTreesRegressor(n_estimators=100, bootstrap=True, oob_score=False, random_state=42)
et_no_oob.fit(X_train, y_train)
y_pred_no_oob = et_no_oob.predict(X_test)
mse_no_oob = mean_squared_error(y_test, y_pred_no_oob)
print(f"Test MSE (without OOB): {mse_no_oob:.3f}")

Running the example gives an output like:

OOB Score: 0.859
Test MSE: 2122.805
Test MSE (without OOB): 2122.805

Key steps in this example:

  1. Generate a synthetic regression dataset
  2. Split data into train and test sets
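  3. Create ExtraTreesRegressor with bootstrap=True and oob_score=True
  4. Fit model and print OOB score
  5. Evaluate on the test set with MSE (the OOB score is an R² value, so the two numbers are on different scales)
  6. Create and evaluate a model with oob_score=False for comparison; the test MSE is identical because oob_score only adds a scoring step and does not change the fitted trees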
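The example above reports the OOB score as R² and the test error as MSE, so the two cannot be compared directly. A minimal sketch of a like-for-like comparison, reusing the same dataset settings, computes both metrics from the out-of-bag predictions and from the test set. The variable names are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = ExtraTreesRegressor(n_estimators=100, bootstrap=True, oob_score=True, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# R^2 on both sides: oob_score_ is already an R^2 value
oob_r2 = model.oob_score_
test_r2 = r2_score(y_test, y_pred)

# MSE on both sides: derive the OOB MSE from the out-of-bag predictions
oob_mse = mean_squared_error(y_train, model.oob_prediction_)
test_mse = mean_squared_error(y_test, y_pred)

print(f"R^2  OOB: {oob_r2:.3f}   test: {test_r2:.3f}")
print(f"MSE  OOB: {oob_mse:.3f}   test: {test_mse:.3f}")
```

If the OOB and test values are close on the same metric, the OOB estimate is a reasonable stand-in for a held-out evaluation on this data.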

Tips for using oob_score:

Issues to consider:
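One issue worth checking is the number of trees. With very few estimators, some training samples end up in every bootstrap sample and so have no out-of-bag prediction at all. The sketch below uses a deliberately tiny forest and assumes that scikit-learn signals this situation with a warning; the wording of that warning may vary by version.

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Only two trees: many rows appear in both bootstrap samples
    tiny = ExtraTreesRegressor(n_estimators=2, bootstrap=True, oob_score=True, random_state=0)
    tiny.fit(X, y)

# Keep only the warnings that mention OOB scores
oob_warnings = [str(w.message) for w in caught if "OOB" in str(w.message)]
for message in oob_warnings:
    print(message)
```

If this warning appears, the reported oob_score_ is unreliable, and increasing n_estimators is the usual remedy.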


