The refit_time_ attribute of a RandomizedSearchCV object stores the time, in seconds, it took to refit the best model on the full dataset after the random search.
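When refit=True (the default), RandomizedSearchCV clones the base estimator, sets the best hyperparameters found during the search, and refits it on the whole dataset. This final model is accessible via the best_estimator_ attribute. If refit=False, no final model is fitted and refit_time_ is not set.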
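A minimal sketch of how the refit setting controls whether the attribute exists. The variable names (with_refit, no_refit) and the small search space are illustrative choices, not part of the original example.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=0)

# Small, illustrative search space to keep the run fast
param_dist = {"n_estimators": randint(5, 10), "max_depth": [3, 5, None]}

# refit=True is the default: a final model is fitted and timed
with_refit = RandomizedSearchCV(
    RandomForestRegressor(random_state=0), param_dist, n_iter=5, cv=3, random_state=0
)
with_refit.fit(X, y)
print("refit=True  has refit_time_:", hasattr(with_refit, "refit_time_"))

# refit=False: the search still runs, but no final model is fitted
no_refit = RandomizedSearchCV(
    RandomForestRegressor(random_state=0), param_dist, n_iter=5, cv=3,
    refit=False, random_state=0,
)
no_refit.fit(X, y)
print("refit=False has refit_time_:", hasattr(no_refit, "refit_time_"))
print("refit=False has best_estimator_:", hasattr(no_refit, "best_estimator_"))

# The best hyperparameters are still reported for a single-metric search
print("Best parameters without refit:", no_refit.best_params_)
```

With refit=False you can still read best_params_ and fit a model yourself later, which may be preferable when the final fit is expensive and you only want the search results.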
Accessing refit_time_ is useful for evaluating the computational cost of refitting the best model, which can be a noticeable part of the total running time of the hyperparameter search.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint
# Generate a random regression dataset
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)
# Set up a RandomForestRegressor
rf = RandomForestRegressor(random_state=42)
# Define hyperparameter distributions to sample from
param_dist = {
    'n_estimators': randint(5, 10),
    'max_depth': [3, 5, 10, None],
    'min_samples_split': randint(2, 10),
}
# Run random search with 5-fold cross-validation and refit the best model
random_search = RandomizedSearchCV(rf, param_distributions=param_dist, n_iter=10, cv=5, refit=True, random_state=42)
random_search.fit(X, y)
# Access refit_time_ attribute
refit_time = random_search.refit_time_
# Print refit time
print(f"Time to refit the best model: {refit_time:.2f} seconds")
Running the example gives an output like:
Time to refit the best model: 0.01 seconds
- Prepare a synthetic regression dataset using make_regression.
- Configure a RandomForestRegressor and define distributions to sample hyperparameters from.
- Run RandomizedSearchCV with the regressor, hyperparameter distributions, 10 iterations, 5-fold cross-validation, and refit=True.
- After fitting, access the refit_time_ attribute from the random_search object.
- Print the refit time to see how long it took to refit the best model on the full dataset.
By monitoring refit_time_, you can assess the computational overhead of refitting the best model found during the random search, which can help in optimizing your hyperparameter tuning workflow.
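One way to put refit_time_ in context is to compare it with the time spent fitting candidates during cross-validation. The sketch below estimates that from cv_results_["mean_fit_time"] and n_splits_. The names cv_fit_time and refit_share are illustrative, and the estimate is rough: it ignores scoring time and any parallel overhead.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)

param_dist = {"n_estimators": randint(5, 10), "max_depth": [3, 5, 10, None]}
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=42),
    param_distributions=param_dist,
    n_iter=10,
    cv=5,
    random_state=42,
)
search.fit(X, y)

# mean_fit_time holds one per-fold average for each sampled candidate,
# so multiplying the sum by the number of folds approximates total CV fit time
cv_fit_time = float(search.cv_results_["mean_fit_time"].sum() * search.n_splits_)
refit_time = search.refit_time_

# Guard against a zero total on platforms with coarse timer resolution
total = cv_fit_time + refit_time
refit_share = refit_time / total if total > 0 else float("nan")

print(f"Approximate CV fit time: {cv_fit_time:.3f} seconds")
print(f"Refit time:              {refit_time:.3f} seconds")
print(f"Refit share of fit time: {refit_share:.1%}")
```

For a small dataset like this one the refit is usually a small fraction of the total, but the share can grow when the best configuration is much more expensive than the average candidate.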