Configure SVC "tol" Parameter

The tol parameter in scikit-learn’s SVC class controls the tolerance for stopping criteria during optimization.

Support Vector Machines (SVMs) are powerful algorithms for classification and regression tasks. The SVC class implements support vector classification for binary and multi-class problems.

The tol parameter sets the tolerance for the stopping criterion. The underlying libsvm solver stops iterating once the remaining violation of the optimality conditions falls below tol, so the value controls how close to the exact optimum the solution must be before training ends.

Smaller values of tol lead to tighter convergence criteria and potentially longer training times. Larger values allow for looser convergence and faster training, but may result in lower accuracy.

The default value for tol is 1e-3, which is a good starting point for most datasets.

In practice, values between 1e-5 and 1e-2 are commonly used depending on the desired balance between training time and model accuracy.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import time

# Generate synthetic dataset
X, y = make_classification(n_samples=10000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different tol values
tol_values = [1e-5, 1e-4, 1e-3, 1e-2]
accuracies = []
train_times = []

for tol in tol_values:
    # perf_counter is a monotonic clock intended for measuring elapsed time
    start_time = time.perf_counter()
    # random_state only affects SVC when probability=True; it is harmless here
    svc = SVC(tol=tol, random_state=42)
    svc.fit(X_train, y_train)
    train_time = time.perf_counter() - start_time

    y_pred = svc.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)

    accuracies.append(accuracy)
    train_times.append(train_time)

    print(f"tol={tol:.1e}, Accuracy: {accuracy:.3f}, Training time: {train_time:.2f}s")

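The output will look similar to the following; exact timings depend on hardware and library versions:

tol=1.0e-05, Accuracy: 0.944, Training time: 0.67s
tol=1.0e-04, Accuracy: 0.944, Training time: 0.65s
tol=1.0e-03, Accuracy: 0.944, Training time: 0.67s
tol=1.0e-02, Accuracy: 0.944, Training time: 1.03s

In this run the accuracy is identical for every tol value, and the timing differences are small and not monotonic, so on this dataset the choice of tol within this range has little practical effect.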

The key steps in this example are:

1. Generate a synthetic binary classification dataset with informative and redundant features
2. Split the data into train and test sets
3. Train SVC models with different tol values
4. Evaluate the accuracy and training time of each model on the test set

Some tips and heuristics for setting tol:

- Start with the default value of 1e-3 and adjust only if results call for it
- Smaller values give tighter convergence, generally at the cost of longer training
- Larger values generally train faster but may reduce accuracy
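Adjusting tol based on results can be done with cross-validation. The following is a minimal sketch, assuming a grid of the same four tol values used earlier and a smaller synthetic dataset to keep the search quick; the grid, the fold count, and the scoring metric are illustrative choices, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)

# Candidate tolerances around the default of 1e-3
param_grid = {"tol": [1e-5, 1e-4, 1e-3, 1e-2]}

search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

# cv_results_ reports the mean score and mean fit time for each candidate
results = search.cv_results_
for params, score, fit_time in zip(results["params"],
                                   results["mean_test_score"],
                                   results["mean_fit_time"]):
    print(f"tol={params['tol']:.0e}: CV accuracy={score:.3f}, "
          f"mean fit time={fit_time:.3f}s")

print("Best tol by CV accuracy:", search.best_params_["tol"])
```

If several tol values tie on accuracy, the mean fit time column gives a reason to prefer the loosest of them.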

Issues to consider:

- The optimal value of tol depends on the specific dataset and problem
- There is a trade-off between training time and model accuracy
- Very small tol values can lead to much longer training times with diminishing returns in accuracy
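To check whether a tighter tolerance changes the model in any way that matters, the predictions of two fitted models can be compared directly. This sketch assumes a comparison between the default tol and a much smaller value of 1e-6 on a synthetic dataset; both the values and the variable names are chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_redundant=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

default_model = SVC(tol=1e-3).fit(X_train, y_train)
tight_model = SVC(tol=1e-6).fit(X_train, y_train)

# Largest difference in decision values between the two solutions
max_gap = np.max(np.abs(default_model.decision_function(X_test)
                        - tight_model.decision_function(X_test)))
# Fraction of test points on which the two models predict the same label
agreement = np.mean(default_model.predict(X_test) == tight_model.predict(X_test))

print(f"max decision-value gap: {max_gap:.2e}")
print(f"prediction agreement:   {agreement:.3f}")
```

When the two models agree on nearly every prediction, the extra training time spent on the tighter tolerance is unlikely to pay off.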
