Configure SVC "probability" Parameter

The probability parameter in scikit-learn’s SVC class controls whether the model supports probability estimates. When set to True, fitting also trains an additional calibration model so that predict_proba() can return class probabilities. The parameter must be set before calling fit().

Support Vector Machines (SVMs) like SVC do not directly provide probability estimates; they produce decision function scores. When probability is True, scikit-learn’s SVC uses Platt scaling, which fits a sigmoid to those scores using internal cross-validation, to map them to probabilities.
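
As a rough illustration of the idea, Platt scaling can be sketched by hand: fit a one-dimensional logistic (sigmoid) model that maps decision_function() scores to probabilities. This is a simplified sketch, not what scikit-learn runs internally; SVC delegates the calibration to libsvm, which uses its own cross-validated fit. The held-out calibration split and the variable names below are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=10,
                           n_informative=5, random_state=0)

# Hypothetical split: one part to fit the SVC, one part to fit the sigmoid
X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)

# Plain SVC without built-in probability estimates
svc = SVC(probability=False).fit(X_fit, y_fit)

# Decision scores are signed margins, not probabilities
scores_cal = svc.decision_function(X_cal).reshape(-1, 1)

# Platt-style step: 1-D logistic regression on the held-out scores
sigmoid = LogisticRegression().fit(scores_cal, y_cal)

# Probabilities for the calibration points (illustration only)
manual_proba = sigmoid.predict_proba(scores_cal)
print(manual_proba[:3])
```

Fitting the sigmoid on data the SVC has not seen avoids calibrating on overly confident training scores, which is the same reason the built-in option relies on cross-validation.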

The default value for probability is False, which avoids the extra training cost of the calibration step. Set probability=True only when you need the output of predict_proba().

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, n_features=10,
                           n_informative=5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train SVC with probability=False
svc_no_prob = SVC(probability=False)
svc_no_prob.fit(X_train, y_train)
y_pred_no_prob = svc_no_prob.predict(X_test)
accuracy_no_prob = accuracy_score(y_test, y_pred_no_prob)
print(f"SVC with probability=False, Accuracy: {accuracy_no_prob:.3f}")

# Train SVC with probability=True
svc_prob = SVC(probability=True)
svc_prob.fit(X_train, y_train)
y_pred_prob = svc_prob.predict(X_test)
accuracy_prob = accuracy_score(y_test, y_pred_prob)
print(f"SVC with probability=True, Accuracy: {accuracy_prob:.3f}")

# Get probability estimates
probabilities = svc_prob.predict_proba(X_test)
print("First 5 probability estimates:")
print(probabilities[:5])

# Attempting to get probabilities from svc_no_prob will raise an AttributeError
try:
    svc_no_prob.predict_proba(X_test)
except AttributeError as e:
    print(f"Error when trying to get probabilities from svc_no_prob: {e}")

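The output will look similar to the following (the exact probabilities can vary between runs because no random_state is set on the SVC):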

SVC with probability=False, Accuracy: 0.920
SVC with probability=True, Accuracy: 0.920
First 5 probability estimates:
[[0.97766859 0.02233141]
 [0.10281995 0.89718005]
 [0.93834819 0.06165181]
 [0.96767281 0.03232719]
 [0.94002999 0.05997001]]
Error when trying to get probabilities from svc_no_prob: This 'SVC' has no attribute 'predict_proba'
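
The two accuracies match because predict() is based on the decision function, not on the probability estimates. The sketch below checks this directly and also measures how often the most probable class from predict_proba() agrees with predict(). The random_state argument and the variable names are choices made for this example, not part of the example above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_classes=2, n_features=10,
                           n_informative=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

plain = SVC(probability=False).fit(X_train, y_train)
# random_state only affects the shuffling used for the probability estimates
with_proba = SVC(probability=True, random_state=42).fit(X_train, y_train)

# predict() should not depend on the probability setting
same_labels = np.array_equal(plain.predict(X_test), with_proba.predict(X_test))
print(f"predict() identical across both models: {same_labels}")

# Compare the argmax of predict_proba() with predict() on the same model
proba = with_proba.predict_proba(X_test)
labels_from_proba = with_proba.classes_[np.argmax(proba, axis=1)]
agreement = np.mean(labels_from_proba == with_proba.predict(X_test))
print(f"Fraction where argmax(predict_proba) matches predict(): {agreement:.3f}")
```

An agreement below 1.0 is not a bug: the sigmoid is fitted separately from the decision boundary, so samples close to the boundary can land on different sides.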

The key steps in this example are:

Generate a synthetic binary classification dataset
Split the data into train and test sets
Train two SVC models, one with probability=False and one with probability=True
Use predict_proba() to get probability estimates from the model with probability=True
Show that predict_proba() is not available for the model with probability=False

Some tips and heuristics for setting probability:

Set probability=True when you need probability estimates for your application
Using probability=True increases training time, because fitting the calibration step relies on internal cross-validation
The probabilities come from Platt scaling and may still be poorly calibrated for some applications, so check them on held-out data
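
To get a feel for the training cost, the fit time of both settings can be measured on your own data. The following is a minimal sketch; the dataset size and the fit_seconds helper are hypothetical choices for this example, and the timings depend on the machine and the data.

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=5, random_state=42)

def fit_seconds(model):
    """Return the wall-clock time needed to fit the model on X, y."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start

t_plain = fit_seconds(SVC(probability=False))
t_proba = fit_seconds(SVC(probability=True, random_state=42))

print(f"fit time with probability=False: {t_plain:.3f}s")
print(f"fit time with probability=True:  {t_proba:.3f}s")
```

A single run is noisy, so repeat the measurement a few times before drawing conclusions.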

Issues to consider:

Outputting probabilities requires storing additional calibration parameters (exposed as probA_ and probB_), although the main overhead is extra training time rather than memory
The underlying SVC model does not directly estimate probabilities, so the calibrated estimates may be less reliable than those from inherently probabilistic models, and predict_proba() can disagree with predict() on some samples
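
Two alternatives are worth knowing when these issues matter. If only a ranking of samples is needed, decision_function() works with probability=False. If probabilities are needed with explicit control over the calibration, the SVC can be wrapped in CalibratedClassifierCV. The sketch below shows both; the dataset, the cv=5 setting, and the choice of metrics are assumptions for illustration.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_classes=2, n_features=10,
                           n_informative=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Option 1: ranking metrics only need decision scores, not probabilities
svc = SVC(probability=False).fit(X_train, y_train)
auc = roc_auc_score(y_test, svc.decision_function(X_test))
print(f"ROC AUC from decision_function: {auc:.3f}")

# Option 2: explicit sigmoid (Platt-style) calibration with 5-fold cross-validation
calibrated = CalibratedClassifierCV(SVC(), method="sigmoid", cv=5)
calibrated.fit(X_train, y_train)
cal_proba = calibrated.predict_proba(X_test)

# Brier score: mean squared error of the predicted probability (lower is better)
brier = brier_score_loss(y_test, cal_proba[:, 1])
print(f"Brier score of calibrated SVC: {brier:.3f}")
```

Passing method="isotonic" instead selects a non-parametric calibration, which usually needs more data than the sigmoid method.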
