The shrinking parameter in scikit-learn’s SVC class controls whether the shrinking heuristic is used during training.
Support Vector Machines (SVMs) are powerful models for classification and regression. The SVC class is scikit-learn’s implementation of support vector classification.
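The shrinking parameter determines whether the shrinking heuristic is used during the SVM optimization process. The heuristic lets the solver temporarily set aside training samples whose dual coefficients appear to have settled at their bounds, so later iterations work on a smaller problem. When enabled, shrinking can significantly speed up training, especially on large datasets.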
The default value for shrinking is True. The only other valid option is False, which disables the heuristic.
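As a quick check of the default, the constructor parameters can be inspected with get_params() before fitting anything. This is a minimal sketch that assumes a standard scikit-learn install; the variable names are illustrative only.

```python
from sklearn.svm import SVC

# Inspect the constructor default without fitting a model
default_shrinking = SVC().get_params()["shrinking"]
print(f"Default shrinking: {default_shrinking}")

# Explicitly disable the heuristic
no_shrink = SVC(shrinking=False)
print(f"Explicit setting: {no_shrink.get_params()['shrinking']}")
```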
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import time
# Generate synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with shrinking enabled and disabled
for shrinking in [True, False]:
    svc = SVC(kernel='linear', shrinking=shrinking, random_state=42)
    # Time only the call to fit so the figure reflects training alone
    start = time.perf_counter()
    svc.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    # Evaluate on the held-out test set
    y_pred = svc.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"shrinking={shrinking}, Training time: {elapsed:.3f}s, Accuracy: {accuracy:.3f}")
Running the example gives output like the following (exact timings depend on hardware and library versions):
shrinking=True, Training time: 5.404s, Accuracy: 0.834
shrinking=False, Training time: 30.949s, Accuracy: 0.834
The key steps in this example are:
- Generate a large synthetic binary classification dataset
- Split the data into train and test sets
- Train SVC models with shrinking enabled and disabled
- Measure the training time and test accuracy for each model
Some tips and heuristics for setting shrinking:
- Shrinking is most beneficial when training on large datasets
- If training time is not a bottleneck, leaving shrinking enabled is usually best
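One way to see how the benefit changes with dataset size is to time fit for both settings at a few sample counts. The sketch below is illustrative rather than a rigorous benchmark: the helper name time_fit and the sample sizes are assumptions chosen to keep the run short, and the timings will differ from machine to machine.

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC


def time_fit(n_samples, shrinking):
    """Return the seconds spent in SVC.fit for one setting (hypothetical helper)."""
    X, y = make_classification(n_samples=n_samples, n_features=20,
                               n_informative=10, n_redundant=5,
                               random_state=42)
    model = SVC(kernel="linear", shrinking=shrinking)
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start


# Small sizes keep this quick; increase them to approach realistic workloads
timings = {}
for n in (500, 1000, 2000):
    for shrinking in (True, False):
        timings[(n, shrinking)] = time_fit(n, shrinking)
        print(f"n_samples={n}, shrinking={shrinking}: "
              f"{timings[(n, shrinking)]:.3f}s")
```

A single run per configuration is noisy, so repeating each measurement and comparing medians would give a more reliable picture.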
Issues to consider:
- The impact of shrinking on training speed depends on the size and complexity of the dataset
- Disabling shrinking should not significantly affect the model’s accuracy
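To check the last point on your own data, the predictions of two otherwise identical models can be compared directly. The following is a minimal sketch on a smaller synthetic dataset; the dataset size and the variable names preds and agreement are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Smaller dataset than the main example so the check runs quickly
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit one model per setting and keep its test-set predictions
preds = {}
for shrinking in (True, False):
    model = SVC(kernel="linear", shrinking=shrinking).fit(X_train, y_train)
    preds[shrinking] = model.predict(X_test)

# Fraction of test samples on which the two models agree
agreement = np.mean(preds[True] == preds[False])
print(f"Prediction agreement: {agreement:.3f}")
```

Both settings solve the same optimization problem to the same tolerance, so the agreement is expected to be at or very close to 1.0.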