Configure DecisionTreeRegressor "criterion" Parameter

The criterion parameter in scikit-learn’s DecisionTreeRegressor determines the function used to measure the quality of a split at each node of the tree.

It supports four options: “squared_error” for mean squared error, “friedman_mse” for mean squared error with Friedman’s improvement score, “absolute_error” for mean absolute error, and “poisson” for Poisson deviance. The older aliases “mse” and “mae” were deprecated in scikit-learn 1.0 and removed in 1.2, so use the new names on current versions.
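To make the idea of "split quality" concrete, here is a minimal NumPy sketch that scores candidate thresholds on a single feature by the weighted impurity of the two child nodes. It is a simplified illustration, not scikit-learn's actual implementation; the helper names (squared_error_impurity, absolute_error_impurity, best_threshold) and the toy data are assumptions made up for this example.

```python
import numpy as np


def squared_error_impurity(y):
    # Mean squared deviation from the node mean (the node's variance).
    return float(np.mean((y - np.mean(y)) ** 2))


def absolute_error_impurity(y):
    # Mean absolute deviation from the node median.
    return float(np.mean(np.abs(y - np.median(y))))


def best_threshold(x, y, impurity):
    # Try every midpoint between consecutive sorted feature values and
    # keep the one with the lowest sample-weighted child impurity.
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    n = len(y_sorted)
    best = None
    for i in range(1, n):
        threshold = (x_sorted[i - 1] + x_sorted[i]) / 2
        left, right = y_sorted[:i], y_sorted[i:]
        cost = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical toy data: the target jumps between x=2 and x=3.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 1.2, 0.9, 10.0, 10.5, 9.8])

for name, impurity in [("squared_error", squared_error_impurity),
                       ("absolute_error", absolute_error_impurity)]:
    threshold, cost = best_threshold(x, y, impurity)
    print(f"{name}: best threshold={threshold}, weighted child impurity={cost:.4f}")
```

On this toy data both criteria pick the threshold at the jump in the target. On real data they can disagree, because one penalizes squared deviations from the mean and the other absolute deviations from the median.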

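The default value for criterion is “squared_error”, which minimizes the L2 loss by predicting the mean of each leaf and is a reasonable starting point for most regression problems. “absolute_error” minimizes the L1 loss by predicting the median of each leaf, which can make it more robust to outliers in the target, but it is noticeably slower to fit.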

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=10, n_informative=5,
                       noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different criterion values
criterion_values = ["squared_error", "friedman_mse", "absolute_error"]
mse_scores = []
mae_scores = []

for criterion in criterion_values:
    dt = DecisionTreeRegressor(criterion=criterion, random_state=42)
    dt.fit(X_train, y_train)
    y_pred = dt.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mae = mean_absolute_error(y_test, y_pred)
    mse_scores.append(mse)
    mae_scores.append(mae)
    print(f"criterion={criterion}, MSE: {mse:.3f}, MAE: {mae:.3f}")

Running the example gives an output like:

criterion=squared_error, MSE: 481.286, MAE: 17.384
criterion=friedman_mse, MSE: 480.421, MAE: 17.443
criterion=absolute_error, MSE: 566.492, MAE: 18.470

The key steps in this example are:

  1. Generate a synthetic regression dataset with informative and noise features
  2. Split the data into train and test sets
  3. Train DecisionTreeRegressor models with different criterion values
  4. Evaluate the mean squared error and mean absolute error of each model on the test set

Some tips and heuristics for setting criterion:

Issues to consider:
