
Configure ExtraTreesRegressor "min_weight_fraction_leaf" Parameter

The min_weight_fraction_leaf parameter in scikit-learn’s ExtraTreesRegressor controls the minimum weighted fraction of the sum total of weights required to be at a leaf node.

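The Extra Trees algorithm, short for Extremely Randomized Trees, is an ensemble method that builds multiple decision trees with increased randomization. It differs from Random Forests in two ways: it draws candidate split thresholds at random instead of searching for the best threshold for each feature, and by default it grows every tree on the entire training set rather than on a bootstrap sample.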

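The min_weight_fraction_leaf parameter sets the minimum share of the total sample weight that every leaf node must hold. When no sample_weight is passed to fit, all samples carry equal weight, so the parameter acts as a minimum fraction of training samples per leaf. A split is rejected if it would leave either child below this threshold, so larger values produce shallower, simpler trees.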
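To make the threshold concrete, the following minimal sketch fits two small ensembles, one without sample weights and one with weights, and checks the lightest leaf in each against the expected minimum. The helper min_leaf_weight and the randomly drawn weights are illustrative assumptions, not part of scikit-learn or of the example further down.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=42)
fraction = 0.1


def min_leaf_weight(model):
    """Hypothetical helper: smallest total sample weight held by any leaf in the ensemble."""
    smallest = np.inf
    for est in model.estimators_:
        tree = est.tree_
        is_leaf = tree.children_left == -1  # leaf nodes have no children
        smallest = min(smallest, tree.weighted_n_node_samples[is_leaf].min())
    return smallest


# Unweighted fit: every sample counts as 1, so the threshold is fraction * n_samples
unweighted = ExtraTreesRegressor(
    n_estimators=10, min_weight_fraction_leaf=fraction, random_state=42
).fit(X, y)
unweighted_threshold = fraction * X.shape[0]

# Weighted fit: the threshold becomes fraction * sum of the weights
# (these weights are made up purely for illustration)
rng = np.random.default_rng(0)
weights = rng.uniform(0.5, 2.0, size=X.shape[0])
weighted = ExtraTreesRegressor(
    n_estimators=10, min_weight_fraction_leaf=fraction, random_state=42
).fit(X, y, sample_weight=weights)
weighted_threshold = fraction * weights.sum()

print(f"unweighted: lightest leaf={min_leaf_weight(unweighted):.1f}, threshold={unweighted_threshold:.1f}")
print(f"weighted:   lightest leaf={min_leaf_weight(weighted):.1f}, threshold={weighted_threshold:.1f}")
```

In both cases the lightest leaf should sit at or above the threshold, which is what separates this parameter from min_samples_leaf: it counts weight, not rows.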

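The default value for min_weight_fraction_leaf is 0.0, which means no minimum fraction is imposed. Valid values lie in the range 0.0 to 0.5 inclusive; anything outside that range is rejected when the model is fit. Smaller values allow more complex trees, while larger values force larger leaves and smoother, coarser predictions.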
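One way to see the effect on tree complexity is to inspect the fitted trees directly. The sketch below is an illustration under assumed settings (500 samples, 20 trees, an arbitrary set of fractions) and reports the average number of leaves and the average depth per tree.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=42)

tree_stats = {}
for fraction in [0.0, 0.05, 0.1, 0.3]:
    etr = ExtraTreesRegressor(
        n_estimators=20, min_weight_fraction_leaf=fraction, random_state=42
    )
    etr.fit(X, y)
    # Each fitted tree exposes its own leaf count and depth
    avg_leaves = np.mean([tree.get_n_leaves() for tree in etr.estimators_])
    avg_depth = np.mean([tree.get_depth() for tree in etr.estimators_])
    tree_stats[fraction] = (avg_leaves, avg_depth)
    print(f"fraction={fraction}: avg leaves={avg_leaves:.1f}, avg depth={avg_depth:.1f}")
```

With 500 unweighted samples, a fraction of 0.3 requires at least 150 samples per leaf, so no tree can have more than three leaves, whereas the default of 0.0 lets trees grow until the leaves can no longer be split.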

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different min_weight_fraction_leaf values
leaf_fractions = [0.0, 0.1, 0.2, 0.3]
mse_scores = []

for fraction in leaf_fractions:
    etr = ExtraTreesRegressor(n_estimators=100, min_weight_fraction_leaf=fraction, random_state=42)
    etr.fit(X_train, y_train)
    y_pred = etr.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"min_weight_fraction_leaf={fraction}, MSE: {mse:.3f}")

Running the example gives an output like:

min_weight_fraction_leaf=0.0, MSE: 2036.183
min_weight_fraction_leaf=0.1, MSE: 8603.181
min_weight_fraction_leaf=0.2, MSE: 12409.299
min_weight_fraction_leaf=0.3, MSE: 14671.269

The key steps in this example are:

  1. Generate a synthetic regression dataset
  2. Split the data into train and test sets
  3. Train ExtraTreesRegressor models with different min_weight_fraction_leaf values
  4. Evaluate the Mean Squared Error (MSE) of each model on the test set
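A single train/test split can give a noisy picture, so a cross-validated search is a common alternative for choosing the value. The sketch below uses GridSearchCV with an assumed grid of small fractions and a reduced dataset to keep it fast; the grid values and the number of trees are illustrative choices rather than recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=42)

# Candidate values are an assumption for illustration; adjust to the dataset
param_grid = {"min_weight_fraction_leaf": [0.0, 0.01, 0.05, 0.1]}

search = GridSearchCV(
    ExtraTreesRegressor(n_estimators=50, random_state=42),
    param_grid,
    scoring="neg_mean_squared_error",  # scikit-learn maximizes scores, so MSE is negated
    cv=5,
)
search.fit(X, y)

print("Best value:", search.best_params_["min_weight_fraction_leaf"])
print(f"Cross-validated MSE: {-search.best_score_:.3f}")
```

Because the scorer returns negated MSE, the sign is flipped before printing so the reported number is comparable to the MSE values in the example above.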
