The random_state parameter in RandomizedSearchCV controls the randomness of the hyperparameter search. It seeds the random sampling of parameter combinations from the supplied distributions, which makes the search results reproducible.
The default value for random_state is None, which means the sampled parameter combinations can differ each time the code is run. Setting random_state to an integer makes the sampling deterministic, so runs that use the same seed, data, and cross-validation splits produce the same results.
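The effect of the seed can be seen without fitting any model. The sketch below uses scikit-learn's ParameterSampler, the class RandomizedSearchCV relies on to draw its candidates, with the same distributions as the SVC example in this article. The helper name sample_candidates is an assumption made for illustration.

```python
from scipy.stats import randint
from sklearn.model_selection import ParameterSampler

# Same search space as the SVC example in this article
param_dist = {'C': randint(1, 10), 'kernel': ['linear', 'rbf']}


def sample_candidates(seed, n_iter=5):
    """Hypothetical helper: draw n_iter parameter settings with a given seed."""
    return list(ParameterSampler(param_dist, n_iter=n_iter, random_state=seed))


first = sample_candidates(42)
second = sample_candidates(42)
other = sample_candidates(123)

# The same seed reproduces the same candidates in the same order
print(first == second)

# A different seed is free to draw a different set of candidates
print(first)
print(other)
```

Comparing the printed lists shows which candidates each seed actually evaluates, which is often the quickest way to check whether two searches covered the same part of the space.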
As a heuristic, set random_state to a fixed integer when you need reproducible results, such as for debugging or for comparing runs. Use None or different integer values to explore different random subsets of the hyperparameter space.
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import randint

# Generate a synthetic binary classification dataset
X, y = make_classification(n_samples=100, n_features=10, n_informative=5, n_redundant=5, random_state=42)

# Define a parameter distribution for SVC hyperparameters
param_dist = {'C': randint(1, 10), 'kernel': ['linear', 'rbf']}

# Create a base SVC model
svc = SVC()

# List of random_state values to test
random_states = [None, 42, 123]

for rs in random_states:
    # Run RandomizedSearchCV with the current random_state value
    search = RandomizedSearchCV(svc, param_dist, n_iter=5, cv=3, random_state=rs)
    search.fit(X, y)
    print(f"Best score for random_state={rs}: {search.best_score_:.3f}")
    print(f"Best parameters for random_state={rs}: {search.best_params_}")
    print()
Running the example gives an output like:
Best score for random_state=None: 0.880
Best parameters for random_state=None: {'C': 6, 'kernel': 'rbf'}
Best score for random_state=42: 0.890
Best parameters for random_state=42: {'C': 7, 'kernel': 'rbf'}
Best score for random_state=123: 0.890
Best parameters for random_state=123: {'C': 7, 'kernel': 'rbf'}
The steps are as follows:
- Generate a synthetic binary classification dataset using make_classification().
- Define a parameter distribution dictionary param_dist for SVC hyperparameters.
- Create a base SVC model.
- Iterate over different random_state values (None, 42, 123).
- For each random_state value:
  - Run RandomizedSearchCV with 5 iterations and 3-fold cross-validation.
  - Print the best score and best parameters for the current random_state value.
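To check reproducibility on a fitted search rather than only on the best result, one option is to compare the full list of evaluated candidates stored in cv_results_. The following is a minimal sketch that reuses the dataset and distributions from the example. The helper name run_search is an assumption made for illustration.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Same data and search space as the example above
X, y = make_classification(n_samples=100, n_features=10, n_informative=5,
                           n_redundant=5, random_state=42)
param_dist = {'C': randint(1, 10), 'kernel': ['linear', 'rbf']}


def run_search(seed):
    """Hypothetical helper: fit one randomized search with the given seed."""
    search = RandomizedSearchCV(SVC(), param_dist, n_iter=5, cv=3, random_state=seed)
    return search.fit(X, y)


a = run_search(42)
b = run_search(42)

# cv_results_['params'] lists every sampled candidate in evaluation order
for params, score in zip(a.cv_results_['params'], a.cv_results_['mean_test_score']):
    print(params, round(score, 3))

# With the same seed, both the candidates and the winner should match
print(a.cv_results_['params'] == b.cv_results_['params'])
print(a.best_params_ == b.best_params_)
```

Looking at all candidates also helps explain cases where two different seeds report the same best parameters: both seeds may simply have sampled the same winning combination.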