
Scikit-Learn "LinearSVC" versus "SVC"

LinearSVC and SVC are both effective support vector machine classifiers for binary classification tasks in scikit-learn. This example demonstrates their differences and helps you choose the appropriate model for your needs.

LinearSVC is optimized for linear decision boundaries and is generally faster on large, high-dimensional datasets. Its key hyperparameters include C (regularization parameter), max_iter (maximum number of iterations), and dual (whether to solve the dual or the primal optimization problem).
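As a brief sketch of those hyperparameters in use (the values below are illustrative only, not recommendations -- tune C, max_iter, and dual for your own data):

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Small synthetic dataset just to demonstrate the hyperparameters
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# C controls regularization strength (smaller C = stronger regularization),
# max_iter caps the solver's iterations, dual selects the optimization problem
model = LinearSVC(C=0.5, max_iter=5000, dual=True, random_state=0)
model.fit(X, y)

# Binary classification yields a single weight vector over the features
print(model.coef_.shape)
```

Because LinearSVC fits an explicit weight vector, its fit time grows roughly linearly with the number of samples, which is what makes it attractive on large datasets.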

SVC, on the other hand, supports various kernel functions, making it more versatile but potentially slower, since kernel computation scales poorly as the number of samples grows. Its key hyperparameters include C (regularization parameter), kernel (specifies the kernel type to be used in the algorithm), and gamma (kernel coefficient for 'rbf', 'poly', and 'sigmoid').
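A minimal sketch of SVC with those hyperparameters set explicitly (the values shown are in fact scikit-learn's defaults, spelled out for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# kernel='rbf' with gamma='scale' is the default; C trades margin width
# against training-set accuracy
model = SVC(C=1.0, kernel='rbf', gamma='scale', random_state=0)
model.fit(X, y)

# SVC stores the support vectors it found; n_support_ counts them per class
print(model.n_support_)
```

Unlike LinearSVC, a fitted SVC keeps its support vectors, so both memory use and prediction time depend on how many training points end up as support vectors.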

The main difference is that LinearSVC is best suited for large datasets with linear relationships, offering faster computation, while SVC provides more flexibility with kernel functions, performing better on datasets with non-linear relationships.
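The speed difference is easy to measure directly. The sketch below times both models on the same linearly separable data; exact numbers depend on your machine and scikit-learn version, but LinearSVC is typically much faster here:

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC, SVC

# A moderately large dataset makes the fit-time difference visible
X, y = make_classification(n_samples=5000, n_features=50, random_state=0)

start = time.perf_counter()
LinearSVC(random_state=0, max_iter=5000).fit(X, y)
linear_time = time.perf_counter() - start

start = time.perf_counter()
SVC(kernel='linear', random_state=0).fit(X, y)
svc_time = time.perf_counter() - start

print(f"LinearSVC fit: {linear_time:.2f}s")
print(f"SVC (linear kernel) fit: {svc_time:.2f}s")
```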

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC, SVC
from sklearn.metrics import accuracy_score, f1_score

# Generate synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, n_features=20,
                           n_informative=2, n_redundant=2,
                           n_clusters_per_class=1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit and evaluate LinearSVC with default hyperparameters
linear_svc = LinearSVC(random_state=42)
linear_svc.fit(X_train, y_train)
y_pred_linear_svc = linear_svc.predict(X_test)
print(f"LinearSVC accuracy: {accuracy_score(y_test, y_pred_linear_svc):.3f}")
print(f"LinearSVC F1 score: {f1_score(y_test, y_pred_linear_svc):.3f}")

# Fit and evaluate SVC with a linear kernel
svc = SVC(kernel='linear', random_state=42)
svc.fit(X_train, y_train)
y_pred_svc = svc.predict(X_test)
print(f"\nSVC accuracy: {accuracy_score(y_test, y_pred_svc):.3f}")
print(f"SVC F1 score: {f1_score(y_test, y_pred_svc):.3f}")

Running the example gives an output like:

LinearSVC accuracy: 0.905
LinearSVC F1 score: 0.899

SVC accuracy: 0.920
SVC F1 score: 0.915

The steps are as follows:

  1. Generate a synthetic binary classification dataset using make_classification.
  2. Split the data into training and test sets using train_test_split.
  3. Instantiate LinearSVC with default hyperparameters, fit it on the training data, and evaluate its performance on the test set.
  4. Instantiate SVC with a linear kernel, fit it on the training data, and evaluate its performance on the test set.
  5. Compare the test set performance (accuracy and F1 score) of both models.
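On this linearly separable synthetic data the two models perform almost identically. To see where SVC's kernel flexibility pays off, the sketch below repeats the comparison on data with a non-linear decision boundary (make_moons, chosen here purely for illustration), where an RBF-kernel SVC should clearly beat LinearSVC:

```python
from sklearn.datasets import make_moons
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC, SVC

# Two interleaving half-moons: not separable by any straight line
X, y = make_moons(n_samples=500, noise=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

linear = LinearSVC(random_state=42).fit(X_train, y_train)
rbf = SVC(kernel='rbf', random_state=42).fit(X_train, y_train)

acc_linear = accuracy_score(y_test, linear.predict(X_test))
acc_rbf = accuracy_score(y_test, rbf.predict(X_test))
print(f"LinearSVC accuracy: {acc_linear:.3f}")
print(f"SVC (rbf) accuracy: {acc_rbf:.3f}")
```

The RBF kernel lets SVC bend its decision boundary around the two moons, while LinearSVC is limited to the best available straight line.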
