The decision_function_shape parameter in scikit-learn’s SVC class determines the shape of the decision function used for multi-class classification.
SVC (Support Vector Classification) is a powerful algorithm for classification tasks. It finds the optimal hyperplane that maximally separates the classes in the feature space.
The decision_function_shape parameter controls the shape of the decision function that SVC returns for multi-class problems. It takes the values ‘ovo’ for one-vs-one and ‘ovr’ for one-vs-rest.
The default value is ‘ovr’, which returns a decision function of shape (n_samples, n_classes), one column per class, consistent with other scikit-learn classifiers. ‘ovo’ returns the original one-vs-one decision function of libsvm, of shape (n_samples, n_classes * (n_classes - 1) / 2), one column per pair of classes. Internally, SVC always trains one binary classifier per pair of classes (one-vs-one); this parameter only changes how the decision values are reported, not how the model is fit or how it predicts.
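To make the difference concrete, here is a minimal sketch (the small 4-class dataset and its parameters are assumed purely for illustration, separate from the full example below) that prints the shape of decision_function under each setting:
from sklearn.datasets import make_classification
from sklearn.svm import SVC
# Small synthetic 4-class dataset, assumed only for this illustration
X_demo, y_demo = make_classification(n_samples=200, n_classes=4, n_features=10,
                                     n_informative=8, random_state=0)
for shape in ['ovr', 'ovo']:
    svc = SVC(decision_function_shape=shape).fit(X_demo, y_demo)
    print(shape, svc.decision_function(X_demo).shape)
# ovr -> (200, 4): one column per class
# ovo -> (200, 6): one column per pair of classes (4 * 3 / 2 = 6)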
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
# Generate synthetic multi-class dataset
X, y = make_classification(n_samples=1000, n_classes=4, n_features=10,
                           n_informative=8, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train with different decision_function_shape values
shapes = ['ovo', 'ovr']
accuracies = []
for shape in shapes:
    svc = SVC(decision_function_shape=shape, random_state=42)
    svc.fit(X_train, y_train)
    y_pred = svc.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    accuracies.append(accuracy)
    print(f"decision_function_shape='{shape}', Accuracy: {accuracy:.3f}")
Running the example gives an output like:
decision_function_shape='ovo', Accuracy: 0.835
decision_function_shape='ovr', Accuracy: 0.835
The key steps in this example are:
- Generate a synthetic multi-class classification dataset
- Split the data into train and test sets
- Train SVC models with different decision_function_shape values
- Evaluate the accuracy of each model on the test set
Some tips and heuristics for setting decision_function_shape:
- Use the default ‘ovr’ when you want one decision score per class, consistent with other scikit-learn classifiers
- Use ‘ovo’ only when you need the raw pairwise decision values produced by libsvm
- The setting does not change training cost, predictions, or accuracy, as the quick check after this list shows
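Because the setting changes neither training nor prediction, this is easy to verify; the sketch below reuses X_train, X_test, and y_train from the example above:
import numpy as np
# Fit the same model twice, changing only decision_function_shape
pred_ovr = SVC(decision_function_shape='ovr', random_state=42).fit(X_train, y_train).predict(X_test)
pred_ovo = SVC(decision_function_shape='ovo', random_state=42).fit(X_train, y_train).predict(X_test)
print(np.array_equal(pred_ovr, pred_ovo))  # True: predictions are identical for both settings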
Issues to consider:
- The choice of ‘ovo’ vs ‘ovr’ does not affect accuracy, but it does affect how you interpret the decision values: per-class scores for ‘ovr’, per-pair scores for ‘ovo’ (see the sketch after this list)
- One-vs-one voting, which SVC uses internally for prediction regardless of this setting, can leave ambiguous regions of the feature space where several classes receive the same number of votes
- The better choice depends on how the decision values are consumed downstream, for example by calibration, thresholding, or tools that expect one score per class
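If you do work with the ‘ovo’ decision values, each column corresponds to one pair of classes, in the order 0 vs 1, 0 vs 2, ..., 1 vs 2, ... described in the scikit-learn SVM user guide. The minimal sketch below (reusing X_train, y_train, and X_test from the example above) labels each column with its class pair:
from itertools import combinations
svc_ovo = SVC(decision_function_shape='ovo', random_state=42).fit(X_train, y_train)
pairs = list(combinations(svc_ovo.classes_, 2))   # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
scores = svc_ovo.decision_function(X_test[:1])    # shape (1, 6) for 4 classes
for (a, b), value in zip(pairs, scores[0]):
    print(f"class {a} vs class {b}: {value:+.3f}")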