The verbose parameter in scikit-learn’s GradientBoostingRegressor controls the amount of information printed during the training process.
GradientBoostingRegressor is an ensemble learning algorithm that builds models sequentially, with each new model correcting the errors made by the previous ones. It is effective for regression tasks.
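To make the sequential error correction concrete, here is a minimal, self-contained sketch (on its own small synthetic dataset, separate from the main example below) that uses staged_predict to check the training error after each group of boosting stages:

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Small synthetic dataset just for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
gbr = GradientBoostingRegressor(n_estimators=50, random_state=0)
gbr.fit(X, y)

# staged_predict yields predictions after each boosting stage,
# showing how each added tree reduces the training error
for i, y_pred in enumerate(gbr.staged_predict(X), start=1):
    if i % 10 == 0:
        print(f"After {i} trees: train MSE = {mean_squared_error(y, y_pred):.2f}")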
The verbose parameter controls how much of this training process is printed to the console. A higher value increases verbosity, showing more detail about each boosting iteration.
The default value for verbose is 0, meaning no information is printed. Common values include 1 for basic progress information and higher values for detailed, per-iteration logs.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
# Generate synthetic regression dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different verbose values
verbose_values = [0, 1, 2]
for v in verbose_values:
    print(f"Training with verbose={v}")
    gbr = GradientBoostingRegressor(verbose=v, random_state=42)
    gbr.fit(X_train, y_train)
Running the example gives an output like:
Training with verbose=0
Training with verbose=1
Iter Train Loss Remaining Time
1 15639.0171 0.27s
2 14025.2982 0.26s
3 12648.9240 0.27s
4 11489.1200 0.26s
5 10472.7671 0.26s
6 9598.3205 0.26s
7 8847.4255 0.26s
8 8167.1511 0.25s
9 7574.5246 0.25s
10 7050.7603 0.25s
20 3741.6244 0.21s
30 2197.1783 0.18s
40 1375.2888 0.16s
50 906.3430 0.13s
60 625.6540 0.10s
70 455.0845 0.08s
80 337.7009 0.06s
90 262.8778 0.03s
100 212.3854 0.00s
Training with verbose=2
Iter Train Loss Remaining Time
1 15639.0171 0.29s
2 14025.2982 0.28s
3 12648.9240 0.27s
4 11489.1200 0.27s
5 10472.7671 0.27s
6 9598.3205 0.26s
7 8847.4255 0.26s
8 8167.1511 0.26s
9 7574.5246 0.25s
10 7050.7603 0.25s
11 6599.9888 0.25s
12 6174.8602 0.24s
13 5787.0896 0.24s
14 5410.2320 0.24s
15 5073.8338 0.24s
16 4759.4362 0.23s
17 4469.7631 0.23s
18 4204.3685 0.23s
19 3960.2994 0.22s
20 3741.6244 0.22s
21 3534.4619 0.22s
22 3354.1048 0.22s
23 3173.9828 0.21s
24 3006.9971 0.21s
25 2853.4060 0.21s
26 2710.7689 0.20s
27 2571.8114 0.20s
28 2438.8436 0.20s
29 2315.4097 0.20s
30 2197.1783 0.19s
31 2095.5956 0.19s
32 1996.8812 0.19s
33 1898.2286 0.18s
34 1809.4907 0.18s
35 1722.2440 0.18s
36 1650.2903 0.18s
37 1568.9035 0.17s
38 1496.7612 0.17s
39 1436.2522 0.17s
40 1375.2888 0.17s
41 1310.8514 0.16s
42 1255.6380 0.16s
43 1202.1333 0.16s
44 1153.3463 0.15s
45 1107.4294 0.15s
46 1065.8504 0.15s
47 1022.9279 0.15s
48 982.7985 0.14s
49 943.5579 0.14s
50 906.3430 0.14s
51 870.4796 0.13s
52 836.8071 0.13s
53 805.1794 0.13s
54 775.2262 0.13s
55 747.9249 0.12s
56 719.2642 0.12s
57 696.5100 0.12s
58 673.1561 0.12s
59 649.5509 0.11s
60 625.6540 0.11s
61 604.1354 0.11s
62 586.0589 0.10s
63 565.8572 0.10s
64 548.4621 0.10s
65 529.3157 0.10s
66 512.8351 0.09s
67 495.7652 0.09s
68 481.8541 0.09s
69 467.6236 0.09s
70 455.0845 0.08s
71 441.2193 0.08s
72 428.5847 0.08s
73 413.3311 0.07s
74 401.1224 0.07s
75 391.2015 0.07s
76 380.1832 0.07s
77 369.3776 0.06s
78 357.1929 0.06s
79 348.2380 0.06s
80 337.7009 0.06s
81 329.2550 0.05s
82 321.3797 0.05s
83 312.6293 0.05s
84 305.6230 0.04s
85 297.5882 0.04s
86 289.3643 0.04s
87 281.9410 0.04s
88 275.6386 0.03s
89 269.1171 0.03s
90 262.8778 0.03s
91 257.8060 0.02s
92 251.7892 0.02s
93 246.2700 0.02s
94 240.6213 0.02s
95 234.9281 0.01s
96 229.9583 0.01s
97 224.4990 0.01s
98 220.5929 0.01s
99 216.5582 0.00s
100 212.3854 0.00s
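Each log line reports the boosting iteration, the current training loss, and an estimate of the remaining training time. With verbose=1 the first 10 iterations are printed and then only every 10th, while verbose=2 logs every iteration.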
The key steps in this example are:
- Generate a synthetic regression dataset.
- Split the data into training and testing sets.
- Train GradientBoostingRegressor models with different verbose values.
- Observe and compare the training logs for each verbosity level.
Some tips and heuristics for setting verbose:
- Use verbose=1 for basic progress information.
- Use higher values for detailed information when debugging or optimizing (see the sketch after this list).
- Adjust verbosity based on the complexity of the model and dataset size.
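For the debugging and optimizing case, the sketch below (reusing the synthetic data from the main example) pairs verbose=2 with early stopping via n_iter_no_change, so the per-iteration log shows exactly where training stops:

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Per-iteration logging plus early stopping: training ends once the
# validation score has not improved for 10 consecutive iterations
gbr = GradientBoostingRegressor(
    n_estimators=500,
    verbose=2,
    validation_fraction=0.1,
    n_iter_no_change=10,
    random_state=42,
)
gbr.fit(X, y)
print(f"Stopped after {gbr.n_estimators_} of 500 estimators")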
Issues to consider:
- High verbosity can slow down training due to logging overhead.
- Excessive logging may clutter the output, making it hard to find relevant information (one way to work around this is sketched below).
- Balance the need for information with training efficiency.
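If detailed logs are useful but console clutter is a concern, one option (a plain-Python sketch using the standard library, not a feature of GradientBoostingRegressor itself) is to redirect the verbose output to a log file:

import contextlib

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
gbr = GradientBoostingRegressor(verbose=2, random_state=42)

# The verbose reporter prints to stdout, so redirecting stdout during fit
# captures the full training log in a file instead of the console
with open("gbr_training.log", "w") as log_file, contextlib.redirect_stdout(log_file):
    gbr.fit(X, y)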