The GridSearchCV class in scikit-learn is a powerful tool for hyperparameter tuning and model selection. It allows you to define a grid of hyperparameter values and fits a specified model for each combination of those values using cross-validation.
After the grid search is complete, the GridSearchCV object stores the best mean cross-validated score across all hyperparameter combinations in the best_score_ attribute. For each combination, the score is averaged over the cross-validation folds, and best_score_ is the highest of those averages. Unless a scoring argument is passed, the score comes from the estimator's own score method, which is accuracy for classifiers.
Accessing the best_score_ attribute lets you quickly check how well the best hyperparameter combination performed during cross-validation, which is a convenient way to judge whether tuning was effective. The score is computed on the validation folds of the data passed to fit, not on a separate test set, and the hyperparameter values that produced it are stored in the companion best_params_ attribute.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
# Generate a synthetic classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
# Create a RandomForestClassifier instance
rf = RandomForestClassifier(random_state=42)
# Define the parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 5, 10],
    'min_samples_split': [2, 5, 10]
}
# Create a GridSearchCV object
grid_search = GridSearchCV(estimator=rf, param_grid=param_grid, cv=5)
# Fit the GridSearchCV object
grid_search.fit(X, y)
# Access the best_score_ attribute
best_score = grid_search.best_score_
# Print the best score
print("Best score:", best_score)
Running the example gives an output like:
Best score: 0.9039999999999999
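To see where this number comes from, the sketch below compares best_score_ with the per-combination means stored in cv_results_. It is illustrative only: the smaller dataset, the reduced grid named small_grid, and cv=3 are assumptions chosen to keep the run short, so the printed values will differ from the output above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Smaller dataset and grid than the main example (assumption, for speed)
X, y = make_classification(n_samples=300, n_classes=2, random_state=42)
small_grid = {"n_estimators": [10, 20], "max_depth": [None, 5]}

search = GridSearchCV(RandomForestClassifier(random_state=42), small_grid, cv=3)
search.fit(X, y)

# One mean cross-validated score per hyperparameter combination
mean_scores = search.cv_results_["mean_test_score"]

print("Mean CV score per combination:", mean_scores)
print("Index of best combination:", search.best_index_)
print("best_score_:", search.best_score_)
print("best_params_:", search.best_params_)

# best_score_ is the entry of mean_test_score at best_index_
matches = np.isclose(search.best_score_, mean_scores[search.best_index_])
print("best_score_ matches cv_results_ entry:", matches)
```

Here best_index_ points at the row of cv_results_ with the top-ranked mean score, and best_params_ holds the settings for that row.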
The key steps in this example are:
- Generating a synthetic classification dataset using make_classification for demonstration purposes.
- Creating an instance of the RandomForestClassifier model.
- Defining the parameter grid with hyperparameters to tune, such as n_estimators, max_depth, and min_samples_split.
- Creating a GridSearchCV object with the model, parameter grid, and cross-validation strategy.
- Fitting the GridSearchCV object on the dataset to perform the grid search.
- Accessing the best_score_ attribute from the fitted GridSearchCV object to retrieve the best mean cross-validated score.
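Because best_score_ is measured on the same data used to pick the hyperparameters, it can be somewhat optimistic. One common way to check it is to hold out a test set before the search, as in the following sketch. The split size, the reduced grid, cv=3, and the explicit scoring="accuracy" are assumptions made for illustration, not part of the example above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=400, n_classes=2, random_state=42)

# Hold out 25% of the data; the grid search never sees it (assumed split size)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    {"n_estimators": [10, 20], "max_depth": [None, 5]},  # reduced grid for speed
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Mean cross-validated accuracy of the best combination (training data only)
cv_score = search.best_score_

# Accuracy of the refitted best model on the held-out test set
test_score = search.score(X_test, y_test)

print("Best CV score:", cv_score)
print("Held-out test score:", test_score)
```

With the default refit=True, GridSearchCV retrains the best combination on all of X_train, so search.score evaluates that refitted model on data it has not seen.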