RandomizedSearchCV performs hyperparameter optimization by sampling a fixed number of configurations from distributions (or lists) of hyperparameter values, rather than trying every combination.
After running a random search, you can access detailed results about the cross-validation process using the cv_results_ attribute.
The cv_results_ attribute is a dictionary that contains key information about the performance of each hyperparameter configuration tested during the random search.
This includes metrics like mean test scores, fit times, and score times. Accessing cv_results_ is useful for analyzing and comparing the performance of different configurations to select the best one for your model.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint
# Generate a random classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, n_informative=5, n_redundant=5, random_state=42)
# Set up a RandomForestClassifier
rf = RandomForestClassifier(random_state=42)
# Define hyperparameter distributions to sample from
param_dist = {
    'n_estimators': randint(50, 200),
    'max_depth': [3, 5, 10, None],
    'min_samples_split': randint(2, 10),
}
# Run random search with 5-fold cross-validation
random_search = RandomizedSearchCV(rf, param_distributions=param_dist, n_iter=10, cv=5, random_state=42)
random_search.fit(X, y)
# Access cv_results_ attribute
cv_results = random_search.cv_results_
# Retrieve mean test scores for each configuration
mean_test_scores = cv_results['mean_test_score']
# Print mean test scores
print("Mean test scores:")
print(mean_test_scores)
Running the example gives an output like:
Mean test scores:
[0.926 0.924 0.887 0.912 0.923 0.924 0.914 0.927 0.908 0.887]
The steps are as follows:
- Import required libraries and prepare a synthetic classification dataset using make_classification.
- Configure a RandomForestClassifier and define distributions to sample hyperparameters from.
- Run RandomizedSearchCV with the classifier, hyperparameter distributions, 10 iterations, and 5-fold cross-validation.
- After fitting, access the cv_results_ attribute from the random_search object.
- Retrieve the mean_test_score values for each hyperparameter configuration from cv_results_.
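The example above prints only the mean_test_score array, which does not show which configuration produced which score. A minimal sketch of pairing each sampled configuration with its scores and timings follows. It uses the params, std_test_score, and mean_fit_time keys of cv_results_. The smaller dataset, the reduced search settings, and the variable names search and results are assumptions chosen only to keep the sketch quick to run.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Assumption: a smaller dataset and search than the main example, to keep this fast
X, y = make_classification(n_samples=300, n_classes=2, n_informative=5,
                           n_redundant=5, random_state=42)

param_dist = {
    'n_estimators': randint(10, 50),
    'max_depth': [3, 5, None],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=5,
    cv=3,
    random_state=42,
)
search.fit(X, y)

results = search.cv_results_

# Each entry of 'params' is the dict of sampled values for one configuration,
# in the same order as the score and timing arrays
for params, mean, std, fit_time in zip(
    results['params'],
    results['mean_test_score'],
    results['std_test_score'],
    results['mean_fit_time'],
):
    print(f"{params} -> mean score {mean:.3f} (+/- {std:.3f}), fit time {fit_time:.2f}s")
```

Every array in cv_results_ has one entry per sampled configuration, so the entries at a given index all describe the same configuration.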
By accessing the detailed results in cv_results_, you can gain insights into the performance of different hyperparameter configurations and make informed decisions about which ones to use for your final model.
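One convenient way to compare configurations is to load cv_results_ into a table and sort it by rank. The sketch below assumes pandas is installed (it is not used elsewhere in this article) and again uses a reduced search so it runs quickly. The names search and df are illustrative.

```python
import pandas as pd
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Assumption: a small dataset and search, used only for illustration
X, y = make_classification(n_samples=300, n_classes=2, n_informative=5,
                           n_redundant=5, random_state=42)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        'n_estimators': randint(10, 50),
        'max_depth': [3, 5, None],
    },
    n_iter=5,
    cv=3,
    random_state=42,
)
search.fit(X, y)

# cv_results_ is a dict of equal-length arrays, so it converts directly to a DataFrame
df = pd.DataFrame(search.cv_results_)

# Keep the most useful columns and put the best-ranked configuration first
columns = ['param_n_estimators', 'param_max_depth',
           'mean_test_score', 'std_test_score', 'rank_test_score']
df = df[columns].sort_values('rank_test_score').reset_index(drop=True)

print(df)
print("Best parameters:", search.best_params_)
```

The top row of the sorted table has rank 1, and its mean_test_score matches search.best_score_. Looking at std_test_score alongside the mean helps to judge whether the gap between two configurations is larger than the fold-to-fold variation.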