The Area Under the Receiver Operating Characteristic Curve (AUC-ROC or simply AUC) is a popular metric for evaluating the performance of binary classification models. AUC measures the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve.
The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The AUC represents the area under this curve, providing an aggregate measure of performance across all possible classification thresholds. AUC ranges from 0 to 1: a model whose predictions are 100% wrong has an AUC of 0.0, a model that ranks examples no better than chance has an AUC of 0.5, and a model whose predictions are 100% correct has an AUC of 1.0.
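To make the curve concrete, here is a minimal sketch (with purely illustrative labels and scores) that uses scikit-learn's roc_curve() to obtain the (FPR, TPR) points and auc() to integrate the area under them:
from sklearn.metrics import roc_curve, auc
# Illustrative true labels and predicted probabilities for the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
# roc_curve() returns one (FPR, TPR) point per threshold implied by the scores
fpr, tpr, thresholds = roc_curve(y_true, y_score)
# auc() integrates the area under those points with the trapezoidal rule
print(f"AUC: {auc(fpr, tpr):.2f}")  # 0.75: three of the four positive/negative pairs are ranked correctly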
AUC is particularly useful for imbalanced classification problems, where the number of instances in one class is significantly higher than in the other. In such cases, accuracy can be misleading, and AUC provides a more reliable metric for evaluating the model’s performance.
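As a quick hedged illustration of why accuracy can mislead here, consider a hypothetical 90/10 dataset and a degenerate classifier that always predicts the majority class: it reaches 90% accuracy while its AUC is only 0.5, no better than chance.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score
# Hypothetical 90/10 imbalanced labels: 90 negatives, 10 positives
y_true = np.array([0] * 90 + [1] * 10)
# A degenerate "classifier" that always outputs the majority (negative) class
y_pred = np.zeros(100, dtype=int)
print(f"Accuracy: {accuracy_score(y_true, y_pred):.2f}")  # 0.90, looks deceptively good
print(f"AUC: {roc_auc_score(y_true, y_pred):.2f}")        # 0.50, no better than chance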
However, it’s important to note that AUC summarizes the model’s performance over all possible thresholds and may not reflect the performance at a specific threshold. If the model needs to be optimized for a particular threshold, other metrics like precision, recall, or F1 score might be more appropriate.
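As a hedged sketch of that threshold-specific view (with illustrative labels and probabilities, and 0.5 as an arbitrary cut-off), the predicted probabilities can be binarized at a chosen threshold and scored with precision, recall, and F1:
from sklearn.metrics import precision_score, recall_score, f1_score
# Illustrative true labels and predicted probabilities for the positive class
y_true = [0, 1, 0, 1, 1, 0, 0, 1]
y_prob = [0.2, 0.7, 0.4, 0.9, 0.3, 0.1, 0.6, 0.8]
# Binarize the probabilities at a chosen operating threshold
threshold = 0.5
y_pred = [1 if p >= threshold else 0 for p in y_prob]
print(f"Precision: {precision_score(y_true, y_pred):.2f}")
print(f"Recall:    {recall_score(y_true, y_pred):.2f}")
print(f"F1 score:  {f1_score(y_true, y_pred):.2f}")
The complete example below puts the AUC metric itself to work on a synthetic imbalanced dataset: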
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
# Generate a synthetic imbalanced binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.9, 0.1], random_state=42)
# Split the dataset into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a logistic regression model
clf = LogisticRegression(random_state=42)
clf.fit(X_train, y_train)
# Generate predicted probabilities on the test set
y_pred_prob = clf.predict_proba(X_test)[:, 1]
# Calculate the AUC score
auc = roc_auc_score(y_test, y_pred_prob)
print(f"AUC: {auc:.2f}")
Running the code above provides an output similar to:
AUC: 0.86
Here’s a summary of the key steps:
- Generate an imbalanced binary classification dataset using make_classification(), with 90% of the instances belonging to one class and 10% to the other.
- Split the dataset into training and test sets using train_test_split().
- Train a logistic regression model on the training set using the LogisticRegression class.
- Generate predicted probabilities for the positive class on the test set using predict_proba().
- Calculate the AUC score using roc_auc_score() by comparing the predicted probabilities with the true labels.
This example demonstrates how to use the roc_auc_score() function from scikit-learn to evaluate the performance of a binary classification model using the AUC metric. By generating an imbalanced dataset, splitting it into train and test sets, training a logistic regression model, generating predicted probabilities, and calculating the AUC score, we can assess the model’s ability to distinguish between classes, even when the class distribution is imbalanced.