
Configure SVC "probability" Parameter

The probability parameter in scikit-learn’s SVC class controls whether the model supports probability estimates. When it is set to True, fitting also trains an additional calibration model, which makes predict_proba() and predict_log_proba() available.

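Support Vector Machines (SVMs) such as SVC do not produce probability estimates directly; their native output is a decision function score, the signed distance of a sample from the decision boundary. When probability is True, scikit-learn’s SVC applies Platt scaling: it fits a sigmoid to the decision function scores, using an internal 5-fold cross-validation, to map them to probabilities.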
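The idea behind Platt scaling can be sketched by hand. The snippet below is a simplified illustration, not what SVC does internally: it assumes a single held-out calibration split instead of 5-fold cross-validation, and it uses LogisticRegression (with its default regularization) as a stand-in for the sigmoid fit. The variable names are chosen for this sketch only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_classes=2, n_features=10,
                           n_informative=5, random_state=42)

# Hold out a calibration set that the SVC never sees during fitting
X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.3, random_state=42)

# Plain SVC: only decision_function scores are available
svc = SVC(probability=False).fit(X_fit, y_fit)

# Sigmoid step (assumption: LogisticRegression approximates Platt's fit):
# learn a one-dimensional mapping from decision scores to class probabilities
scores_cal = svc.decision_function(X_cal).reshape(-1, 1)
platt = LogisticRegression().fit(scores_cal, y_cal)

# Apply the mapping; reusing the calibration scores here only to show the output shape
platt_proba = platt.predict_proba(scores_cal)
print(platt_proba[:5])
```

Each row holds the estimated probabilities for class 0 and class 1. Setting probability=True lets SVC handle the equivalent steps itself during fit.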

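The default value for probability is False, which avoids the extra cross-validated fitting and keeps training faster. Because the calibration model is fit during training, probability=True must be set when the estimator is constructed, before fit() is called, if predict_proba() is needed later.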
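To get a rough feel for the extra training cost, fit times can be compared directly. This is a minimal sketch: fit_seconds is a hypothetical helper written for this illustration, and the measured times depend on the machine, the data, and the scikit-learn version, so no specific numbers are shown.

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_classes=2, n_features=10,
                           n_informative=5, random_state=42)

def fit_seconds(model, X, y):
    """Hypothetical helper: return wall-clock seconds taken by model.fit(X, y)."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start

# Time the same estimator with and without the probability model
t_no_prob = fit_seconds(SVC(probability=False), X, y)
t_prob = fit_seconds(SVC(probability=True), X, y)

print(f"probability=False fit time: {t_no_prob:.3f} s")
print(f"probability=True fit time:  {t_prob:.3f} s")
```

A single run like this is noisy; repeating the measurement several times gives a more reliable comparison.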

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, n_features=10,
                           n_informative=5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train SVC with probability=False
svc_no_prob = SVC(probability=False)
svc_no_prob.fit(X_train, y_train)
y_pred_no_prob = svc_no_prob.predict(X_test)
accuracy_no_prob = accuracy_score(y_test, y_pred_no_prob)
print(f"SVC with probability=False, Accuracy: {accuracy_no_prob:.3f}")

# Train SVC with probability=True
svc_prob = SVC(probability=True)
svc_prob.fit(X_train, y_train)
y_pred_prob = svc_prob.predict(X_test)
accuracy_prob = accuracy_score(y_test, y_pred_prob)
print(f"SVC with probability=True, Accuracy: {accuracy_prob:.3f}")

# Get probability estimates
probabilities = svc_prob.predict_proba(X_test)
print("First 5 probability estimates:")
print(probabilities[:5])

# Attempting to get probabilities from svc_no_prob will raise an AttributeError
try:
    svc_no_prob.predict_proba(X_test)
except AttributeError as e:
    print(f"Error when trying to get probabilities from svc_no_prob: {e}")

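Running the example produces output similar to the following (exact values and the error message wording can vary across scikit-learn versions):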

SVC with probability=False, Accuracy: 0.920
SVC with probability=True, Accuracy: 0.920
First 5 probability estimates:
[[0.97766859 0.02233141]
 [0.10281995 0.89718005]
 [0.93834819 0.06165181]
 [0.96767281 0.03232719]
 [0.94002999 0.05997001]]
Error when trying to get probabilities from svc_no_prob: This 'SVC' has no attribute 'predict_proba'

The key steps in this example are:

  1. Generate a synthetic binary classification dataset
  2. Split the data into train and test sets
  3. Train two SVC models, one with probability=False and one with probability=True
  4. Use predict_proba() to get probability estimates from the model with probability=True
  5. Show that predict_proba() is not available for the model with probability=False
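Instead of catching the AttributeError, code that receives an already-fitted SVC can check for predict_proba() up front and fall back to decision_function() scores. The sketch below assumes a hypothetical helper, scores_or_probabilities, written only for this illustration.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_classes=2, n_features=10,
                           n_informative=5, random_state=42)

def scores_or_probabilities(model, X):
    """Hypothetical helper: return probabilities if available, else decision scores."""
    if hasattr(model, "predict_proba"):
        return "proba", model.predict_proba(X)
    return "decision", model.decision_function(X)

kind_a, out_a = scores_or_probabilities(SVC(probability=False).fit(X, y), X)
kind_b, out_b = scores_or_probabilities(SVC(probability=True).fit(X, y), X)

# Binary problem: one decision score per sample, or two probability columns
print(kind_a, out_a.shape)
print(kind_b, out_b.shape)
```

Decision scores are unbounded signed distances rather than probabilities, but they are enough for ranking-based uses such as thresholding or ROC curves.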

Some tips and heuristics for setting probability:

Issues to consider:


