
Scikit-Learn Get RandomizedSearchCV "cv_results_" Attribute

RandomizedSearchCV is a powerful tool for hyperparameter optimization that allows you to sample from distributions of hyperparameter values.

After running a random search, you can access detailed results about the cross-validation process using the cv_results_ attribute.

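The cv_results_ attribute is a dictionary with one entry per hyperparameter configuration tested during the random search. Each key acts as a column header and each value as a column, so the dictionary can be loaded directly into a pandas DataFrame.

The keys include mean_test_score, std_test_score, and rank_test_score, the per-fold scores (split0_test_score, split1_test_score, and so on), the timing keys mean_fit_time and mean_score_time, and params, which holds the sampled settings. Accessing cv_results_ is useful for analyzing and comparing the performance of different configurations to select the best one for your model.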
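To see what the dictionary actually holds, it can help to list its keys. The following is a minimal sketch, assuming a deliberately small dataset and search (200 samples, 4 configurations, 3 folds) so it runs quickly; the names search and results are illustrative only.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset (sizes chosen only to keep the sketch fast)
X, y = make_classification(n_samples=200, n_informative=5, n_redundant=5, random_state=0)

# A small random search: 4 sampled configurations, 3-fold cross-validation
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(10, 30), "max_depth": [3, 5, None]},
    n_iter=4,
    cv=3,
    random_state=0,
)
search.fit(X, y)

# Every key maps to a sequence with one entry per sampled configuration
results = search.cv_results_
for key in sorted(results):
    print(key, len(results[key]))
```

Each key should report a length equal to n_iter, since every column has one value per configuration.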

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

# Generate a random classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, n_informative=5, n_redundant=5, random_state=42)

# Set up a RandomForestClassifier
rf = RandomForestClassifier(random_state=42)

# Define hyperparameter distributions to sample from
param_dist = {
    'n_estimators': randint(50, 200),
    'max_depth': [3, 5, 10, None],
    'min_samples_split': randint(2, 10),
}

# Run random search with 5-fold cross-validation
random_search = RandomizedSearchCV(rf, param_distributions=param_dist, n_iter=10, cv=5, random_state=42)
random_search.fit(X, y)

# Access cv_results_ attribute
cv_results = random_search.cv_results_

# Retrieve mean test scores for each configuration
mean_test_scores = cv_results['mean_test_score']

# Print mean test scores
print("Mean test scores:")
print(mean_test_scores)

Running the example gives an output like:

Mean test scores:
[0.926 0.924 0.887 0.912 0.923 0.924 0.914 0.927 0.908 0.887]

The steps are as follows:

  1. Import required libraries and prepare a synthetic classification dataset using make_classification.
  2. Configure a RandomForestClassifier and define distributions to sample hyperparameters from.
  3. Run RandomizedSearchCV with the classifier, the hyperparameter distributions, 10 sampled configurations (n_iter=10), and 5-fold cross-validation.
  4. After fitting, access the cv_results_ attribute from the random_search object.
  5. Retrieve the mean_test_score values for each hyperparameter configuration from cv_results_.
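The mean scores in step 5 are easier to interpret when each one is paired with the configuration that produced it. The following is a minimal sketch, assuming a smaller dataset and search than the main example so that it runs quickly; the names search, results, and order are illustrative.

```python
import numpy as np
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset and search (sizes chosen only to keep the sketch fast)
X, y = make_classification(n_samples=200, n_informative=5, n_redundant=5, random_state=0)
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(10, 30), "max_depth": [3, 5, None]},
    n_iter=5,
    cv=3,
    random_state=0,
)
search.fit(X, y)

results = search.cv_results_

# rank_test_score is 1 for the best mean test score, so sorting by it
# lists configurations from best to worst
order = np.argsort(results["rank_test_score"])
for i in order:
    print(
        f"rank {results['rank_test_score'][i]}: "
        f"mean={results['mean_test_score'][i]:.3f} "
        f"std={results['std_test_score'][i]:.3f} "
        f"params={results['params'][i]}"
    )
```

Printing std_test_score next to the mean shows how much the score varies across folds, which matters when two configurations have nearly identical means.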

By accessing the detailed results in cv_results_, you can gain insights into the performance of different hyperparameter configurations and make informed decisions about which ones to use for your final model.
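One way to connect cv_results_ to the final choice is through the fitted search object's best_index_ attribute, which is the row of cv_results_ that best_params_ and best_score_ are taken from. The following is a minimal sketch, assuming a small dataset and search chosen only for speed; the variable names are illustrative.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Small synthetic dataset and search (sizes chosen only to keep the sketch fast)
X, y = make_classification(n_samples=200, n_informative=5, n_redundant=5, random_state=0)
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(10, 30), "min_samples_split": randint(2, 10)},
    n_iter=5,
    cv=3,
    random_state=0,
)
search.fit(X, y)

results = search.cv_results_
best = search.best_index_  # row of cv_results_ for the selected configuration

print("Best params:", results["params"][best])
print("Mean test score:", results["mean_test_score"][best])
print("Mean fit time (s):", results["mean_fit_time"][best])

# The summary attributes are read from that same row
print(results["params"][best] == search.best_params_)
print(results["mean_test_score"][best] == search.best_score_)
```

Looking at mean_fit_time for the selected row is a simple way to check whether the best-scoring configuration is also affordable to train.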


