
Scikit-Learn log_loss() Metric

Log loss, also known as logistic loss or cross-entropy loss, evaluates the performance of a classification model.

It measures how well the predicted probabilities match the true labels, with lower values indicating better performance.

log_loss() is calculated by taking the negative log of the predicted probability assigned to the true class of each sample and averaging over all samples.
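
As a concrete sketch (with made-up labels and probabilities for a binary problem), the same quantity can be computed by hand and checked against log_loss():

import numpy as np
from sklearn.metrics import log_loss

# Hypothetical true labels and predicted probabilities of the positive class
y_true = np.array([1, 0, 1, 1])
p_pos = np.array([0.9, 0.2, 0.6, 0.8])

# Probability the model assigned to the true class of each sample
p_true = np.where(y_true == 1, p_pos, 1 - p_pos)

# Negative log of those probabilities, averaged over all samples
print(f"Manual:  {-np.mean(np.log(p_true)):.4f}")
print(f"sklearn: {log_loss(y_true, p_pos):.4f}")  # should match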

It is commonly used in binary and multiclass classification problems to assess models that output probability estimates.
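
In the multiclass case, log_loss() expects one row of class probabilities per sample, with columns ordered by class label. A small sketch with made-up values for a three-class problem:

from sklearn.metrics import log_loss

# Hypothetical three-class problem: one row of probabilities per sample,
# columns ordered as classes 0, 1, 2 (each row sums to 1)
y_true = [0, 2, 1, 2]
y_prob = [[0.7, 0.2, 0.1],
          [0.1, 0.3, 0.6],
          [0.2, 0.6, 0.2],
          [0.1, 0.1, 0.8]]
print(f"Multiclass log loss: {log_loss(y_true, y_prob):.4f}")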

Log loss heavily penalizes predictions that are both confident and wrong, making it a strict, informative metric for probabilistic classifiers.
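
A quick sketch with hand-picked binary probabilities illustrates the penalty: a single confidently wrong prediction can dominate the average:

from sklearn.metrics import log_loss

y_true = [1, 0, 1]
cautious = [0.6, 0.4, 0.6]          # mildly confident, all leaning correct
overconfident = [0.99, 0.99, 0.99]  # second prediction is confidently wrong

print(f"Cautious:      {log_loss(y_true, cautious):.3f}")       # ~0.51
print(f"Overconfident: {log_loss(y_true, overconfident):.3f}")  # ~1.54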

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a logistic regression classifier
clf = LogisticRegression(solver='liblinear', random_state=42)
clf.fit(X_train, y_train)

# Predict probabilities on test set
y_prob = clf.predict_proba(X_test)

# Calculate log loss
loss = log_loss(y_test, y_prob)
print(f"Log Loss: {loss:.2f}")

Running the example gives an output like:

Log Loss: 0.37

The steps are as follows:

  1. Generate a synthetic binary classification dataset using make_classification().
  2. Split the dataset into training and test sets using train_test_split().
  3. Train a LogisticRegression classifier on the training set.
  4. Use the trained classifier to predict probabilities on the test set with predict_proba().
  5. Calculate the log loss using log_loss() by comparing the true labels to the predicted probabilities.

First, we generate a synthetic binary classification dataset using the make_classification() function from scikit-learn. This function creates a dataset with 1000 samples and 2 classes, allowing us to simulate a classification problem without using real-world data.

Next, we split the dataset into training and test sets using the train_test_split() function. This step is crucial for evaluating the performance of our classifier on unseen data. We use 80% of the data for training and reserve 20% for testing.

With our data prepared, we train a logistic regression classifier using the LogisticRegression class from scikit-learn. We specify the liblinear solver and set a random state for reproducibility. The fit() method is called on the classifier object, passing in the training features (X_train) and labels (y_train) to learn the underlying patterns in the data.

After training, we use the trained classifier to predict probabilities on the test set by calling the predict_proba() method with X_test. This returns the predicted probability of each class for every sample in the test set.
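
Continuing from the example above, you can inspect this output directly: the columns of y_prob line up with clf.classes_, and each row sums to 1.

# Columns correspond to clf.classes_ (here [0 1]); each row sums to 1
print(clf.classes_)
print(y_prob[:3])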

Finally, we evaluate the log loss of our classifier using the log_loss() function. This function takes the true labels (y_test) and the predicted probabilities (y_prob) as input and calculates the negative log likelihood of the predictions, averaged over all samples. The resulting log loss score is printed, giving us a quantitative measure of our classifier’s performance.
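
To put the score in context, it helps to compare against an uninformative baseline: on a binary problem, predicting a probability of 0.5 for every sample scores ln(2) ≈ 0.693, so our 0.37 is a clear improvement. Continuing from the example above:

import numpy as np

# Baseline: predict probability 0.5 for every test sample
baseline = np.full(len(y_test), 0.5)
print(f"Baseline log loss: {log_loss(y_test, baseline):.3f}")  # ln(2) ≈ 0.693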

This example demonstrates how to use the log_loss() function from scikit-learn to evaluate the performance of a binary classification model.


