The max_features parameter in scikit-learn's DecisionTreeRegressor controls the number of features to consider when looking for the best split at each node of the tree.
A decision tree is a non-parametric supervised learning algorithm used for both classification and regression tasks. It makes predictions by learning simple decision rules inferred from the data features.
The max_features parameter determines how many features are considered at each split. It can be set as an integer, a float, or a string. A smaller value can reduce overfitting, while a larger value can improve model performance but may lead to more complex trees.
The default value for max_features is None, which means that all features are considered at every split.
In practice, common values are "sqrt" (the square root of the total number of features), "log2" (the base-2 logarithm of the total number of features), or a float between 0 and 1 representing the fraction of features to consider.
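As a rough sketch of how these settings resolve to a per-split feature count (scikit-learn truncates to an integer, with a floor of 1), here is the arithmetic for the 20-feature dataset used in the example below:

import math

n_features = 20
print(max(1, int(math.sqrt(n_features))))   # "sqrt" -> 4
print(max(1, int(math.log2(n_features))))   # "log2" -> 4
print(max(1, int(0.5 * n_features)))        # 0.5   -> 10
print(n_features)                           # None  -> 20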
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, n_informative=10,
                       noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different max_features values
max_features_values = [5, 10, "sqrt", "log2", None]
mse_scores = []
for mf in max_features_values:
    dt = DecisionTreeRegressor(max_features=mf, random_state=42)
    dt.fit(X_train, y_train)
    y_pred = dt.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"max_features={mf}, MSE: {mse:.3f}")
Running the example gives an output like:
max_features=5, MSE: 16620.603
max_features=10, MSE: 21154.319
max_features=sqrt, MSE: 34597.493
max_features=log2, MSE: 34597.493
max_features=None, MSE: 20519.298

Note that "sqrt" and "log2" score identically here: with 20 features, both resolve to 4 features per split (as in the arithmetic sketch above), so with the same random_state the two trees are identical.
The key steps in this example are:
- Generate a synthetic regression dataset with informative and noise features
- Split the data into train and test sets
- Train DecisionTreeRegressor models with different max_features values
- Evaluate the mean squared error of each model on the test set
Some tips and heuristics for setting max_features:
- Try values between the square root and the total number of features
- Consider the trade-off between model complexity and performance
- Use cross-validation to select the optimal value for your specific dataset (see the sketch after this list)
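As a minimal sketch of that cross-validation step, GridSearchCV can search over a handful of candidate values; the grid below is illustrative, and X_train and y_train are reused from the example above:

from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Illustrative grid of candidate max_features values
param_grid = {"max_features": [0.3, 0.5, "sqrt", "log2", None]}
grid = GridSearchCV(DecisionTreeRegressor(random_state=42), param_grid,
                    scoring="neg_mean_squared_error", cv=5)
grid.fit(X_train, y_train)
print(f"Best max_features: {grid.best_params_['max_features']}")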
Issues to consider:
- A smaller max_features value can increase model interpretability but may underfit (the sketch after this list compares tree size at the extremes)
- A larger max_features value can improve performance but may lead to overfitting
- The optimal value depends on the number and relevance of the features in the dataset
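To make the complexity side of this trade-off concrete, a quick illustrative check (reusing X_train and y_train from the example above) compares tree depth and leaf counts at two extreme max_features settings:

# Smaller max_features often forces deeper trees, since each split sees fewer candidates
for mf in [2, None]:
    dt = DecisionTreeRegressor(max_features=mf, random_state=42)
    dt.fit(X_train, y_train)
    print(f"max_features={mf}, depth: {dt.get_depth()}, leaves: {dt.get_n_leaves()}")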