
Scikit-Learn Get GridSearchCV "multimetric_" Attribute

The multimetric_ attribute of scikit-learn's GridSearchCV indicates whether multiple metrics are evaluated during the hyperparameter tuning process. It is a boolean set when the search is fitted: True when the scoring parameter specifies more than one metric, and False otherwise. Supplying several metrics lets you assess the performance of different hyperparameter combinations against multiple evaluation criteria simultaneously.
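
To make the attribute's behavior concrete, here is a minimal sketch; the estimator, dataset, and parameter values are placeholders chosen for illustration, not part of the main example below:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)
params = {'max_depth': [2, 4]}

# Single metric: multimetric_ is False after fitting
single = GridSearchCV(DecisionTreeClassifier(random_state=0), params,
                      scoring='accuracy', cv=3)
single.fit(X, y)
print(single.multimetric_)  # False

# Multiple metrics: multimetric_ is True; refit names the deciding metric
multi = GridSearchCV(DecisionTreeClassifier(random_state=0), params,
                     scoring=['accuracy', 'f1'], refit='accuracy', cv=3)
multi.fit(X, y)
print(multi.multimetric_)  # True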

The grid search method is a powerful technique for hyperparameter optimization. It exhaustively searches a specified grid of hyperparameter values and selects the combination that yields the best performance according to a chosen evaluation metric. When you pass multiple metrics through the scoring parameter, this evaluation extends to several metrics at once, giving a more comprehensive picture of the model's performance; in that case, refit must name the metric used to select the best estimator (or be set to False or a callable).

Multi-metric evaluation is particularly useful when you want to judge your model against more than one criterion. For example, you might want hyperparameters that yield both high precision and high recall, or both low mean squared error and low mean absolute error. By evaluating multiple metrics, you can make a more informed decision about the optimal hyperparameter settings for your specific problem.
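
For the classification case just mentioned, scoring can also be a dict that maps names of your choosing to built-in scorers; those names then appear in the cv_results_ keys. A minimal sketch (LogisticRegression and the parameter values here are illustrative placeholders):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

# Map custom names to built-in scorers; refit picks the deciding metric
scoring = {'prec': 'precision', 'rec': 'recall'}
grid = GridSearchCV(LogisticRegression(max_iter=1000),
                    param_grid={'C': [0.1, 1.0, 10.0]},
                    scoring=scoring, refit='prec', cv=5)
grid.fit(X, y)

# The custom names show up in the cv_results_ keys
print(grid.cv_results_['mean_test_prec'])
print(grid.cv_results_['mean_test_rec'])

The complete regression example below applies the same idea with three metrics and a RandomForestRegressor.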

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Generate a synthetic regression dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Create a RandomForestRegressor estimator
rf = RandomForestRegressor(random_state=42)

# Define the parameter grid
param_grid = {
    'n_estimators': [5, 10, 50],
    'max_depth': [None, 5, 10],
    'min_samples_split': [2, 5, 10]
}

# Define the evaluation metrics
scoring = ['neg_mean_squared_error', 'neg_mean_absolute_error', 'r2']

# Create a GridSearchCV object with multiple scoring metrics; with multi-metric
# scoring, refit must name the metric used to select the best estimator
grid_search = GridSearchCV(estimator=rf, param_grid=param_grid, scoring=scoring,
                           refit='neg_mean_squared_error', cv=5)

# Fit the GridSearchCV object
grid_search.fit(X, y)

# Access the cv_results_ attribute
cv_results = grid_search.cv_results_

# View the computed metric scores for each hyperparameter combination
for metric in scoring:
    print(f"Mean {metric} scores:")
    print(cv_results[f'mean_test_{metric}'])

Running the example gives an output like:

Mean neg_mean_squared_error scores:
[-3492.35317803 -3014.75257152 -2724.98854702 -3565.39019004
 -3044.57358806 -2776.12838769 -3533.9862478  -3091.48377249
 -2927.64831163 -4889.82981275 -4537.49557754 -4329.02029683
 -4884.95488152 -4533.32788178 -4321.33717677 -4872.14925062
 -4541.47261513 -4330.42916875 -3584.81749516 -3047.50945101
 -2770.7603297  -3528.14239708 -3024.24168636 -2793.01710748
 -3561.82538021 -3105.84373036 -2943.95676489]
Mean neg_mean_absolute_error scores:
[-45.96345833 -42.24887854 -39.83825699 -46.77608926 -42.82187279
 -40.21173261 -46.2825783  -42.99261201 -41.46781208 -54.70476796
 -52.90128089 -51.4394526  -54.64324253 -52.80033737 -51.39101685
 -54.48934779 -52.83751767 -51.44532026 -46.35395731 -42.38970843
 -40.1820285  -46.15380194 -42.2420137  -40.40714788 -46.34443227
 -43.02151952 -41.60463459]
Mean r2 scores:
[0.79780036 0.82646515 0.8436458  0.79339947 0.824501   0.84053192
 0.79549326 0.82198094 0.83180249 0.71823209 0.73828229 0.75086861
 0.71843121 0.73842313 0.7513126  0.71923198 0.73805285 0.75082006
 0.79258171 0.82438643 0.84092062 0.79566948 0.82584376 0.83946088
 0.7936689  0.82111316 0.83079434]

The key steps in this example are:

  1. Preparing a synthetic regression dataset using make_regression to demonstrate multi-metric evaluation.
  2. Creating a RandomForestRegressor estimator and defining the parameter grid for hyperparameter tuning.
  3. Specifying the evaluation metrics to be computed using the scoring parameter.
  4. Creating the GridSearchCV object with multiple metrics by passing a list to the scoring parameter and naming the metric used for refitting the best estimator; this is what sets multimetric_ to True (see the follow-up snippet after this list).
  5. Fitting the GridSearchCV object on the synthetic dataset.
  6. Accessing the computed metric scores for each hyperparameter combination from the cv_results_ attribute.
  7. Printing the mean scores for each metric to assess the model’s performance across different hyperparameter settings.
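
Because the search above was given three metrics, its multimetric_ attribute is True once fitting completes, and the refit metric determines how the best candidate is chosen. A short follow-up, reusing grid_search from the example:

# True: scoring listed more than one metric, so multimetric_ was set during fit
print(grid_search.multimetric_)

# best_params_ and best_score_ are chosen by the refit metric
# (neg_mean_squared_error here)
print(grid_search.best_params_)
print(grid_search.best_score_)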

By passing multiple metrics to GridSearchCV's scoring parameter (reflected in the fitted multimetric_ attribute), you can evaluate and optimize your model against several evaluation metrics at once, enabling more informed decisions about the best hyperparameter combination for your specific problem.



See Also