Scikit-Learn label_ranking_average_precision_score() Metric

Label ranking average precision (LRAP) is a metric for evaluating multi-label ranking models. It measures how highly a model ranks each sample's true labels relative to the labels that do not apply to that sample.

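The label_ranking_average_precision_score() function in scikit-learn calculates LRAP as follows: for each true label of a sample, it takes the fraction of labels ranked at or above that label that are also true, then averages these values over the sample's true labels and over all samples. It takes a binary indicator matrix of true labels (y_true) and a matrix of per-label scores (y_score), such as predicted probabilities or decision values, and returns a float greater than 0 and at most 1, with 1 indicating a perfect ranking.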
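To make the calculation concrete, here is a minimal sketch that recomputes LRAP by hand and compares it with scikit-learn. The helper name lrap_manual and the toy arrays are illustrative assumptions, not part of the library; tied scores are handled by counting every label whose score is at least as high as the true label's score.

```python
import numpy as np
from sklearn.metrics import label_ranking_average_precision_score


def lrap_manual(y_true, y_score):
    """Hypothetical helper: recompute LRAP with explicit loops."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    per_sample = []
    for truth, scores in zip(y_true, y_score):
        relevant = np.flatnonzero(truth)
        # Samples with no true labels, or with every label true, count as 1.0
        if relevant.size == 0 or relevant.size == truth.size:
            per_sample.append(1.0)
            continue
        precisions = []
        for j in relevant:
            # Rank of label j: how many labels score at least as high
            rank = np.sum(scores >= scores[j])
            # How many of those labels are true labels
            hits = np.sum(scores[relevant] >= scores[j])
            precisions.append(hits / rank)
        per_sample.append(np.mean(precisions))
    return float(np.mean(per_sample))


# Toy data: 2 samples, 3 labels, one true label per sample
y_true = np.array([[1, 0, 0], [0, 0, 1]])
y_score = np.array([[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]])

manual = lrap_manual(y_true, y_score)
library = label_ranking_average_precision_score(y_true, y_score)
print(f"manual: {manual:.4f}, scikit-learn: {library:.4f}")
```

In the first sample the true label is ranked second (precision 1/2) and in the second sample it is ranked third (precision 1/3), so both values come out at about 0.4167.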

LRAP is used for multi-label classification tasks where each instance can carry several labels at once. It is a good fit when the number of labels is manageable; with a very large label set, ranking every label for every sample becomes computationally expensive. A high LRAP score indicates that the model ranks the relevant labels above the irrelevant ones.
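As a quick illustration of how the score responds to ranking quality, the sketch below scores one sample twice: once with the true labels ranked first and once with them ranked last. The arrays are made-up toy values chosen for this illustration.

```python
import numpy as np
from sklearn.metrics import label_ranking_average_precision_score

# One sample with four labels; labels 0 and 3 are the true ones
y_true = np.array([[1, 0, 0, 1]])

# True labels receive the two highest scores
good_scores = np.array([[0.9, 0.1, 0.2, 0.8]])
# True labels receive the two lowest scores
bad_scores = np.array([[0.1, 0.9, 0.8, 0.2]])

lrap_good = label_ranking_average_precision_score(y_true, good_scores)
lrap_bad = label_ranking_average_precision_score(y_true, bad_scores)
print(f"true labels ranked first: {lrap_good:.4f}")
print(f"true labels ranked last:  {lrap_bad:.4f}")
```

The first ranking scores 1.0. The second scores about 0.4167 rather than 0, because each true label still counts itself among the labels ranked at or above it, which is why the metric is always strictly greater than 0.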

from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import label_ranking_average_precision_score

# Generate synthetic multi-label classification dataset
X, y = make_multilabel_classification(n_samples=1000, n_classes=5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a multi-label classifier using Logistic Regression
clf = MultiOutputClassifier(LogisticRegression(random_state=42))
clf.fit(X_train, y_train)

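# Predict hard 0/1 labels on the test set
y_pred = clf.predict(X_test)

# Calculate label ranking average precision score
# (the 0/1 predictions act as scores, so labels with equal predictions are tied)
lrap_score = label_ranking_average_precision_score(y_test, y_pred)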
print(f"Label Ranking Average Precision Score: {lrap_score:.2f}")

Running the example gives an output like:

Label Ranking Average Precision Score: 0.86

Generate a synthetic multi-label classification dataset using make_multilabel_classification(), creating 1000 samples with 5 possible labels.
Split the dataset into training and test sets, reserving 20% of the data for testing.
Train a multi-label classifier using MultiOutputClassifier with LogisticRegression as the base estimator.
Predict hard 0/1 labels for the test set using the predict() method.
Calculate the label ranking average precision score with label_ranking_average_precision_score() by passing the true test labels and the predicted labels, which the function treats as ranking scores, and print the result.
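Because LRAP is a ranking metric, it can also be given continuous scores instead of hard labels. The sketch below is one possible variant of the example, assuming every label has both classes present in the training split so that each predict_proba() array has two columns; the names y_score, lrap_from_scores, and lrap_from_labels are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import label_ranking_average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier

# Same data, split, and model as in the example above
X, y = make_multilabel_classification(n_samples=1000, n_classes=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = MultiOutputClassifier(LogisticRegression(random_state=42))
clf.fit(X_train, y_train)

# predict_proba() returns one (n_samples, 2) array per label;
# column 1 holds the probability that the label applies
# (assumes both classes were seen for every label during training)
proba_per_label = clf.predict_proba(X_test)
y_score = np.column_stack([p[:, 1] for p in proba_per_label])

# Score the probability ranking and, for comparison, the hard 0/1 predictions
lrap_from_scores = label_ranking_average_precision_score(y_test, y_score)
lrap_from_labels = label_ranking_average_precision_score(y_test, clf.predict(X_test))
print(f"LRAP from probabilities: {lrap_from_scores:.2f}")
print(f"LRAP from hard labels:   {lrap_from_labels:.2f}")
```

Probabilities rarely tie, so they give the metric a full ordering of the labels to evaluate, whereas hard 0/1 predictions leave all labels with the same prediction tied. The two values will generally differ.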

See Also