RandomizedSearchCV is a powerful tool for hyperparameter optimization that allows you to sample from distributions of hyperparameter values.
When running a random search, you can specify a custom scoring metric for evaluating the performance of different hyperparameter configurations using the scoring parameter.
The scorer_ attribute of a fitted RandomizedSearchCV object stores the actual scorer that was used during the search.
Accessing scorer_ is useful for understanding how the models were evaluated and can be helpful if you want to use the same scorer for evaluating the final selected model.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import mean_absolute_error
from scipy.stats import randint
# Generate a random regression dataset
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)
# Define a custom scorer that calculates mean absolute error (MAE)
def custom_scorer(estimator, X, y):
    y_pred = estimator.predict(X)
    mae = mean_absolute_error(y, y_pred)
    return -mae  # Negate MAE since sklearn optimizes for higher scores
# Set up a RandomForestRegressor
rf = RandomForestRegressor(random_state=42)
# Define hyperparameter distributions to sample from
param_dist = {
    'n_estimators': randint(5, 50),
    'max_depth': [3, 5, 10, None],
    'min_samples_split': randint(2, 10),
}
# Run random search with custom scorer and 5-fold cross-validation
random_search = RandomizedSearchCV(rf, param_distributions=param_dist,
                                   scoring=custom_scorer, n_iter=10, cv=5, random_state=42)
random_search.fit(X, y)
# Access scorer_ attribute
scorer = random_search.scorer_
# Print scorer
print("Scorer used in RandomizedSearchCV:")
print(scorer)
Running the example gives an output like:
Scorer used in RandomizedSearchCV:
<function custom_scorer at 0x107317060>
The steps are as follows:
- Prepare a synthetic regression dataset using make_regression.
- Define a custom scoring function custom_scorer that calculates the negated mean absolute error (MAE).
- Configure a RandomForestRegressor and define distributions to sample hyperparameters from.
- Run RandomizedSearchCV with the regressor, hyperparameter distributions, custom scorer, 10 iterations, and 5-fold cross-validation.
- After fitting, access the scorer_ attribute from the random_search object and print its value.
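For comparison, here is a minimal sketch of the same kind of search using a built-in metric name instead of a custom function. The string neg_mean_absolute_error is scikit-learn's predefined negated-MAE metric; the reduced search settings (3 iterations, 3 folds) and the variable names search, builtin_scorer, score, and manual are assumptions chosen to keep the example short and fast. In this case scorer_ holds a scorer object built by scikit-learn rather than a plain function, but it is called the same way.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)

# Small search settings are an assumption to keep the sketch quick to run
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=42),
    param_distributions={"n_estimators": randint(5, 50)},
    scoring="neg_mean_absolute_error",  # built-in name instead of a custom function
    n_iter=3,
    cv=3,
    random_state=42,
)
search.fit(X, y)

builtin_scorer = search.scorer_
# A scorer object; its exact printed form depends on the scikit-learn version
print(builtin_scorer)

# The scorer is called as scorer(estimator, X, y), just like custom_scorer
score = builtin_scorer(search.best_estimator_, X, y)
manual = -mean_absolute_error(y, search.best_estimator_.predict(X))
print(score, manual)
```

Both printed numbers should agree, since the built-in scorer computes the same negated MAE that custom_scorer computes by hand.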
By specifying a custom scorer in RandomizedSearchCV, you can evaluate models based on a metric that is most relevant to your problem. The scorer_ attribute allows you to access the actual scorer used during the search, which can be useful for consistency in model evaluation.
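As a sketch of what that consistency can look like, the example below reuses scorer_ to score the refitted best model on held-out data. The train/test split, the 80/20 ratio, the reduced search settings, and the names search and test_score are assumptions for illustration and are not part of the example above.

```python
from scipy.stats import randint
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split


def custom_scorer(estimator, X, y):
    """Return negated MAE so that higher scores are better."""
    return -mean_absolute_error(y, estimator.predict(X))


X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)

# Hold out a test set that the search never sees (the 80/20 split is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=42),
    param_distributions={"n_estimators": randint(5, 50), "max_depth": [3, 5, 10, None]},
    scoring=custom_scorer,
    n_iter=5,
    cv=3,
    random_state=42,
)
search.fit(X_train, y_train)

# Reuse the search's own scorer on the refitted best model: scorer(estimator, X, y)
test_score = search.scorer_(search.best_estimator_, X_test, y_test)
print(f"Held-out negated MAE: {test_score:.3f}")
```

With the default refit=True, calling search.score(X_test, y_test) applies the same scorer to the best estimator, so it should return the same value.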