SKLearner

Scikit-Learn balanced_accuracy_score() Metric

The balanced accuracy score is a metric used to evaluate the performance of classification models, particularly in cases where the dataset is imbalanced. It calculates the average recall obtained on each class, providing a more informative measure than regular accuracy.

Balanced accuracy is computed as the macro-average of per-class recall. Recall, also known as sensitivity or the true positive rate, is the ratio of correctly predicted positive instances to the total number of actual positive instances for a class. Averaging these per-class recall scores means the metric gives equal weight to performance on majority and minority classes.
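This averaging can be verified by hand. The brief sketch below uses made-up labels for an imbalanced two-class problem and compares the per-class recall average against scikit-learn's result:

```python
from sklearn.metrics import balanced_accuracy_score

# Hypothetical labels: six negatives, four positives
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

# Per-class recall: class 0 -> 5 of 6 correct, class 1 -> 2 of 4 correct
recall_0 = 5 / 6
recall_1 = 2 / 4
manual = (recall_0 + recall_1) / 2

# Matches the manual macro-average of recall (2/3)
print(manual, balanced_accuracy_score(y_true, y_pred))
```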

Balanced accuracy is commonly used for binary and multiclass classification problems, especially when dealing with imbalanced datasets. In such scenarios, a classifier that simply predicts the majority class all the time can achieve high regular accuracy, even though it fails to classify the minority class correctly. Balanced accuracy addresses this limitation by giving equal importance to the performance on each class.
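The majority-class failure mode is easy to demonstrate. In this sketch (with made-up 90/10 labels), a trivial classifier that always predicts the majority class scores 90% regular accuracy but only chance-level balanced accuracy:

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Hypothetical 90/10 imbalanced labels
y_true = np.array([0] * 90 + [1] * 10)

# A trivial classifier that always predicts the majority class
y_pred = np.zeros_like(y_true)

print(accuracy_score(y_true, y_pred))           # 0.9 -- looks good
print(balanced_accuracy_score(y_true, y_pred))  # 0.5 -- no better than chance
```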

However, balanced accuracy does not take into account the cost of different types of errors, which may vary depending on the problem at hand. When false positives and false negatives have different consequences, this should be weighed alongside the score when evaluating a classifier.
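To illustrate this blind spot, the sketch below (with made-up labels) constructs two hypothetical classifiers that earn identical balanced accuracy even though one makes only false positives and the other only false negatives:

```python
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

# Hypothetical labels: 10 negatives followed by 10 positives
y_true = [0] * 10 + [1] * 10

# Classifier A makes 2 false positives; classifier B makes 2 false negatives
y_pred_a = [0] * 8 + [1] * 2 + [1] * 10
y_pred_b = [0] * 10 + [1] * 8 + [0] * 2

for name, y_pred in [("A", y_pred_a), ("B", y_pred_b)]:
    # Both print a balanced accuracy of 0.9, but the confusion
    # matrices show very different error types
    print(name, balanced_accuracy_score(y_true, y_pred))
    print(confusion_matrix(y_true, y_pred))
```

If one error type is more costly, a confusion matrix or a cost-sensitive metric is needed to tell these two apart.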

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score

# Generate imbalanced synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.8, 0.2], random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a logistic regression classifier
clf = LogisticRegression(random_state=42)
clf.fit(X_train, y_train)

# Predict on test set
y_pred = clf.predict(X_test)

# Calculate balanced accuracy
balanced_acc = balanced_accuracy_score(y_test, y_pred)
print(f"Balanced Accuracy: {balanced_acc:.2f}")

Running the example gives an output like:

Balanced Accuracy: 0.73
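balanced_accuracy_score() also accepts an adjusted=True flag, which rescales the score so that random guessing maps to 0 and a perfect classifier to 1. A brief sketch with made-up labels:

```python
from sklearn.metrics import balanced_accuracy_score

y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 0, 1]

raw = balanced_accuracy_score(y_true, y_pred)
adjusted = balanced_accuracy_score(y_true, y_pred, adjusted=True)

# With 2 classes, chance level is 0.5, so adjusted = (raw - 0.5) / (1 - 0.5)
print(raw, adjusted)
```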

The steps involved in this example are as follows:

  1. Generate an imbalanced binary classification dataset using make_classification(), with 80% of samples belonging to the majority class and 20% to the minority class.
  2. Split the dataset into training and test sets using train_test_split(), with 80% of the data used for training and 20% for testing.
  3. Train a logistic regression classifier on the training set using LogisticRegression.
  4. Make predictions on the test set using the trained classifier’s predict() method.
  5. Calculate the balanced accuracy score using balanced_accuracy_score() by comparing the predicted labels to the true labels.
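The same workflow carries over to multiclass problems, where balanced accuracy averages recall over all classes. The sketch below adapts the example above to three imbalanced classes (the class weights and extra make_classification parameters are illustrative choices):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score

# Generate an imbalanced 3-class dataset (weights chosen for illustration)
X, y = make_classification(n_samples=1000, n_classes=3, n_informative=4,
                           weights=[0.7, 0.2, 0.1], random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a logistic regression classifier
clf = LogisticRegression(max_iter=1000, random_state=42)
clf.fit(X_train, y_train)

# Balanced accuracy is the average recall across the three classes
bal_acc = balanced_accuracy_score(y_test, clf.predict(X_test))
print(f"Balanced Accuracy: {bal_acc:.2f}")
```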


See Also