The error_score parameter in scikit-learn's GridSearchCV specifies the value to assign to the score if an error occurs while fitting the model. Handling fit errors gracefully during hyperparameter tuning lets the grid search run to completion instead of aborting on the first failure.
Grid search exhaustively searches a specified set of parameter values to find the best combination. It trains and evaluates the model for each combination of parameters, handling any fitting errors according to the error_score setting.
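To make "each combination of parameters" concrete, the short sketch below enumerates the candidates with scikit-learn's ParameterGrid helper. The grid mirrors the one used in the example further down; the 5-fold count is an assumption taken from that example's cv=5 setting.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as the SVC example in this article
param_grid = {"C": [0.1, 1, 10], "gamma": [1, 0.1, 0.01]}

# ParameterGrid expands the dictionary into every combination of values
candidates = list(ParameterGrid(param_grid))

n_splits = 5  # assumed: matches cv=5 in the example below
total_fits = len(candidates) * n_splits

print(f"{len(candidates)} candidates, {total_fits} cross-validation fits")
for params in candidates[:3]:
    print(params)
```

Each of these fits is a point where an error can occur, and error_score decides what happens to the search when one does.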
The error_score parameter can be set to np.nan (default), a specific numeric value, or the string 'raise'. By default, a fit that fails is assigned a score of nan and a FitFailedWarning is issued. Setting error_score to a numerical value (e.g., 0) also lets the grid search continue, assigning that value to the failed fit instead, while 'raise' stops the process and raises the error.
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
import numpy as np
# create a synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
# define the parameter grid
param_grid = {'C': [0.1, 1, 10], 'gamma': [1, 0.1, 0.01]}
# define and perform a grid search with default error_score
grid_default = GridSearchCV(estimator=SVC(), param_grid=param_grid, error_score=np.nan, cv=5)
grid_default.fit(X, y)
# define and perform a grid search with error_score set to 0
grid_zero = GridSearchCV(estimator=SVC(), param_grid=param_grid, error_score=0, cv=5)
grid_zero.fit(X, y)
# report the results
print("Best parameters found with default error_score (nan):")
print(grid_default.best_params_)
print("Best parameters found with error_score set to 0:")
print(grid_zero.best_params_)
Running the example gives an output like:
Best parameters found with default error_score (nan):
{'C': 1, 'gamma': 0.01}
Best parameters found with error_score set to 0:
{'C': 1, 'gamma': 0.01}
The key steps in this example are:
- Generate a synthetic binary classification dataset using make_classification.
- Define a parameter grid for SVC with C and gamma values to search over.
- Create a GridSearchCV object with the default error_score (np.nan).
- Fit the grid search object to find the best parameters; any failed fit would be scored as nan.
- Create another GridSearchCV object with error_score set to 0.
- Fit the second grid search object; any failed fit would be scored as 0.
- Print the best parameters found by each grid search. No fit fails on this dataset, so both settings select the same parameters; error_score only changes the results when a fit raises an error.