The return_train_score parameter in scikit-learn's GridSearchCV controls whether training scores are computed and returned in the cv_results_ attribute. By default, this parameter is set to False, meaning only validation scores are reported for each hyperparameter combination.
Setting return_train_score to True computes training scores in addition to validation scores. This can provide valuable insight into whether a model is overfitting (high training scores but low validation scores) or underfitting (low scores on both sets).
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
# generate a synthetic regression dataset
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)
# create a pipeline
pipeline = Pipeline([
('scaler', StandardScaler()),
('ridge', Ridge())
])
# define the parameter grid
param_grid = {'ridge__alpha': [0.1, 1.0, 10.0]}
# run GridSearchCV with default 'return_train_score' (False)
grid_search_default = GridSearchCV(estimator=pipeline, param_grid=param_grid, cv=5)
grid_search_default.fit(X, y)
# run GridSearchCV with 'return_train_score' set to True
grid_search_train = GridSearchCV(estimator=pipeline, param_grid=param_grid, cv=5, return_train_score=True)
grid_search_train.fit(X, y)
# print the cv_results_ without train scores
print("CV results without train scores:")
print(grid_search_default.cv_results_)
# print the cv_results_ with train scores
print("CV results with train scores:")
print(grid_search_train.cv_results_)
Running the example gives an output like:
CV results without train scores:
{'mean_fit_time': array([0.00139108, 0.00100942, 0.00122395]), 'std_fit_time': array([2.89526146e-04, 5.24390811e-06, 2.30003954e-04]), 'mean_score_time': array([0.00055017, 0.00052509, 0.00051861]), 'std_score_time': array([2.32443320e-05, 4.34611893e-05, 2.53865524e-05]), 'param_ridge__alpha': masked_array(data=[0.1, 1.0, 10.0],
mask=[False, False, False],
fill_value=1e+20), 'params': [{'ridge__alpha': 0.1}, {'ridge__alpha': 1.0}, {'ridge__alpha': 10.0}], 'split0_test_score': array([0.99999748, 0.99976147, 0.9818852 ]), 'split1_test_score': array([0.99999799, 0.99983417, 0.98733181]), 'split2_test_score': array([0.99999749, 0.99986857, 0.99000378]), 'split3_test_score': array([0.99999874, 0.99986289, 0.98883627]), 'split4_test_score': array([0.99999648, 0.99974793, 0.98175641]), 'mean_test_score': array([0.99999764, 0.99981501, 0.9859627 ]), 'std_test_score': array([7.38456956e-07, 5.07835855e-05, 3.48657640e-03]), 'rank_test_score': array([1, 2, 3], dtype=int32)}
CV results with train scores:
{'mean_fit_time': array([0.00124483, 0.00109639, 0.00112467]), 'std_fit_time': array([4.01812467e-04, 7.52905488e-05, 2.40980140e-04]), 'mean_score_time': array([0.00057836, 0.0006794 , 0.00050402]), 'std_score_time': array([1.30767442e-04, 2.25248183e-04, 1.24772896e-05]), 'param_ridge__alpha': masked_array(data=[0.1, 1.0, 10.0],
mask=[False, False, False],
fill_value=1e+20), 'params': [{'ridge__alpha': 0.1}, {'ridge__alpha': 1.0}, {'ridge__alpha': 10.0}], 'split0_test_score': array([0.99999748, 0.99976147, 0.9818852 ]), 'split1_test_score': array([0.99999799, 0.99983417, 0.98733181]), 'split2_test_score': array([0.99999749, 0.99986857, 0.99000378]), 'split3_test_score': array([0.99999874, 0.99986289, 0.98883627]), 'split4_test_score': array([0.99999648, 0.99974793, 0.98175641]), 'mean_test_score': array([0.99999764, 0.99981501, 0.9859627 ]), 'std_test_score': array([7.38456956e-07, 5.07835855e-05, 3.48657640e-03]), 'rank_test_score': array([1, 2, 3], dtype=int32), 'split0_train_score': array([0.9999978 , 0.99981222, 0.98578377]), 'split1_train_score': array([0.99999828, 0.99985773, 0.98888046]), 'split2_train_score': array([0.99999855, 0.99987773, 0.9902186 ]), 'split3_train_score': array([0.99999836, 0.99986652, 0.9893865 ]), 'split4_train_score': array([0.99999859, 0.99988565, 0.99102202]), 'mean_train_score': array([0.99999832, 0.99985997, 0.98905827]), 'std_train_score': array([2.84199683e-07, 2.57041481e-05, 1.79245133e-03])}
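The raw cv_results_ dictionary is hard to read at a glance. As a sketch (reusing the same dataset and pipeline as the example above), the mean train and test scores for each candidate can be pulled out and compared directly; the train-minus-test gap is one rough indicator of overfitting:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

# same synthetic dataset and pipeline as in the example above
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)
pipeline = Pipeline([('scaler', StandardScaler()), ('ridge', Ridge())])
param_grid = {'ridge__alpha': [0.1, 1.0, 10.0]}

gs = GridSearchCV(pipeline, param_grid, cv=5, return_train_score=True)
gs.fit(X, y)

# compare mean train and test scores per candidate;
# a large positive gap suggests overfitting, low scores on both suggest underfitting
for params, train, test in zip(gs.cv_results_['params'],
                               gs.cv_results_['mean_train_score'],
                               gs.cv_results_['mean_test_score']):
    print(f"{params}: train={train:.5f} test={test:.5f} gap={train - test:.5f}")
```

Here the gaps are tiny for every alpha, which matches the output above: this model is not overfitting on this dataset.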
The key steps in this example are:
- Generate a synthetic regression dataset using make_regression
- Create a Pipeline with StandardScaler and Ridge steps
- Define a parameter grid for the alpha parameter of Ridge
- Run GridSearchCV with default return_train_score (False) and print the cv_results_ attribute
- Run GridSearchCV with return_train_score set to True and print the cv_results_ attribute
- Compare the two cv_results_ outputs, noting the additional training score columns (mean_train_score, std_train_score, split0_train_score, etc.) when return_train_score is True
This example demonstrates how to use the return_train_score parameter in GridSearchCV to include training scores in the results, enabling evaluation of overfitting or underfitting in the model.
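One practical caveat: with the default return_train_score=False, the *_train_score keys are simply absent from cv_results_, so any code that indexes them will raise a KeyError. A minimal check, using the same setup as the example above:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=42)
pipeline = Pipeline([('scaler', StandardScaler()), ('ridge', Ridge())])
param_grid = {'ridge__alpha': [0.1, 1.0, 10.0]}

# default return_train_score=False: no *_train_score keys in cv_results_
gs = GridSearchCV(pipeline, param_grid, cv=5)
gs.fit(X, y)
print('mean_train_score' in gs.cv_results_)  # False
print('mean_test_score' in gs.cv_results_)   # True
```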