RandomizedSearchCV is an efficient method for hyperparameter optimization that randomly samples from specified parameter distributions to find the best combination of hyperparameters for a given model.
After running a random search, you can identify the best-performing hyperparameter configuration through the best_index_ attribute.
The best_index_ attribute is an integer index into the arrays of cv_results_. It points to the candidate configuration that achieved the highest mean cross-validated score (rank 1) during the random search.
Accessing best_index_ is useful when you want to retrieve the best hyperparameter values, or any other result recorded for that candidate, for further analysis, model training, or deployment without manually searching through the cv_results_ dictionary.
```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

# Generate a random classification dataset
X, y = make_classification(n_samples=100, n_classes=2, n_informative=5, n_redundant=5, random_state=42)

# Set up a RandomForestClassifier
rf = RandomForestClassifier(random_state=42)

# Define hyperparameter distributions to sample from
param_dist = {
    'n_estimators': randint(5, 50),
    'max_depth': [3, 5, 10, None],
    'min_samples_split': randint(2, 10),
}

# Run random search with 5-fold cross-validation
random_search = RandomizedSearchCV(rf, param_distributions=param_dist, n_iter=10, cv=5, random_state=42)
random_search.fit(X, y)

# Access the best_index_ attribute
best_index = random_search.best_index_

# Retrieve the best hyperparameters
best_params = random_search.cv_results_['params'][best_index]

# Print the best hyperparameters
print("Best hyperparameters:")
print(best_params)
```
Running the example gives an output like:
```
Best hyperparameters:
{'max_depth': None, 'min_samples_split': 4, 'n_estimators': 26}
```
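To make the meaning of best_index_ more concrete, the following minimal sketch checks how it lines up with the other attributes of a fitted search. The smaller search (n_iter=5, cv=3) and the names search, results, and i are illustrative choices made here to keep the run short; they are not part of the example above.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset (same generator settings as the main example)
X, y = make_classification(n_samples=100, n_classes=2, n_informative=5, n_redundant=5, random_state=42)

# Illustrative, smaller search so the check runs quickly
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={"n_estimators": randint(5, 50), "max_depth": [3, 5, 10, None]},
    n_iter=5,
    cv=3,
    random_state=42,
)
search.fit(X, y)

i = search.best_index_
results = search.cv_results_

# The entry at best_index_ is the top-ranked candidate (rank 1)
print(results["rank_test_score"][i])

# It matches the convenience attributes best_params_ and best_score_
print(results["params"][i] == search.best_params_)
print(results["mean_test_score"][i] == search.best_score_)
```

In other words, best_params_ and best_score_ are shortcuts for two specific lookups at best_index_; the index itself is what you need for every other column of cv_results_.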
The example follows these steps:
- Generate a synthetic classification dataset using make_classification.
- Configure a RandomForestClassifier and define the hyperparameter distributions to sample from.
- Run RandomizedSearchCV with the classifier, hyperparameter distributions, 10 iterations, and 5-fold cross-validation.
- After fitting, access the best_index_ attribute from the random_search object.
- Use best_index_ to retrieve the best hyperparameters from the cv_results_ dictionary.
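The last step above can be taken a little further: because every entry in cv_results_ is indexed by candidate, the same index can pull out the full record for the best configuration, not just its parameters. The sketch below assumes a reduced search (n_iter=5, cv=3) for speed, and best_row is a hypothetical helper name introduced here.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=100, n_classes=2, n_informative=5, n_redundant=5, random_state=42)

# Illustrative, smaller search than the main example
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={"n_estimators": randint(5, 50), "min_samples_split": randint(2, 10)},
    n_iter=5,
    cv=3,
    random_state=42,
)
search.fit(X, y)

# Collect every recorded field for the best candidate into one dict
best_row = {key: values[search.best_index_] for key, values in search.cv_results_.items()}

# Mean and spread of the cross-validated score, plus timing, for that candidate
print("mean test score:", best_row["mean_test_score"])
print("std test score: ", best_row["std_test_score"])
print("mean fit time:  ", best_row["mean_fit_time"])
```

Looking at std_test_score alongside the mean is a quick way to judge how stable the winning configuration was across folds.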
By using the best_index_ attribute, you can look up the best-performing hyperparameter configuration found during the random search directly and reuse those hyperparameters for further model development and deployment.
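As one possible way to reuse the result, the sketch below trains a fresh model from the parameters found at best_index_. The reduced search settings and the name final_model are assumptions made for illustration. Note that with the default refit=True, the fitted search already exposes an equivalent model as best_estimator_, so building one by hand is mainly useful when you want to train on different data or change other settings.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=100, n_classes=2, n_informative=5, n_redundant=5, random_state=42)

# Illustrative, smaller search than the main example
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={"n_estimators": randint(5, 50), "max_depth": [3, 5, 10, None]},
    n_iter=5,
    cv=3,
    random_state=42,
)
search.fit(X, y)

# Look up the winning parameters via best_index_
best_params = search.cv_results_["params"][search.best_index_]

# Train a new classifier with those parameters
final_model = RandomForestClassifier(random_state=42, **best_params)
final_model.fit(X, y)

print("Trained with:", best_params)
print("Number of trees:", len(final_model.estimators_))
```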