The F-beta score combines precision and recall into a single metric, weighted by a factor beta. The fbeta_score() function in scikit-learn balances the importance of precision and recall, making it well suited to binary and multiclass classification problems where the precision-recall trade-off is crucial. Keep in mind, however, that a single F-beta value can hide poor performance on a minority class, so on heavily imbalanced datasets it may not reflect overall model performance accurately and is best read alongside per-class metrics.
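Concretely, F-beta is defined as (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. A minimal sketch of that formula in plain Python, using illustrative precision and recall values (0.8 and 0.6 are made up for demonstration, not taken from the example below):

```python
def f_beta(precision, recall, beta):
    # F-beta from its definition: (1 + beta^2) * P * R / (beta^2 * P + R)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.8, 0.6  # illustrative values
print(f_beta(p, r, 0.5))  # beta < 1: precision weighted more heavily
print(f_beta(p, r, 1.0))  # beta = 1: the harmonic mean of P and R (F1)
print(f_beta(p, r, 2.0))  # beta > 1: recall weighted more heavily
```

With precision above recall, the score drops as beta grows, since larger beta shifts the weight toward the weaker recall.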
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a Logistic Regression classifier
clf = LogisticRegression(random_state=42)
clf.fit(X_train, y_train)
# Predict on test set
y_pred = clf.predict(X_test)
# Calculate F-beta scores
fbeta_05 = fbeta_score(y_test, y_pred, beta=0.5)
fbeta_1 = fbeta_score(y_test, y_pred, beta=1)
fbeta_2 = fbeta_score(y_test, y_pred, beta=2)
print(f"F-beta score (beta=0.5): {fbeta_05:.2f}")
print(f"F-beta score (beta=1): {fbeta_1:.2f}")
print(f"F-beta score (beta=2): {fbeta_2:.2f}")
Running the example gives an output like:
F-beta score (beta=0.5): 0.73
F-beta score (beta=1): 0.67
F-beta score (beta=2): 0.63
The steps are as follows:
- Generate a synthetic binary classification dataset with a slight class imbalance using make_classification().
- Split the dataset into training and test sets using train_test_split().
- Train a LogisticRegression classifier on the training set.
- Use the trained classifier to make predictions on the test set with predict().
- Calculate F-beta scores for beta values of 0.5, 1, and 2 using fbeta_score().
First, we generate a synthetic dataset with 1000 samples and two classes, with a 70-30 class imbalance, using make_classification(). This setup helps demonstrate the F-beta score in scenarios where the class distribution is not equal.
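You can confirm the imbalance by counting the labels directly. A quick check (the exact counts can deviate slightly from 700/300, since make_classification() treats weights as approximate proportions):

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)

counts = np.bincount(y)           # samples per class label
print(counts)                     # roughly 700 negatives vs 300 positives
print(counts / counts.sum())      # class proportions
```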
Next, we split the dataset into training and test sets, with 80% for training and 20% for testing, using train_test_split(). This ensures that we have separate data for evaluating our model's performance.
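With imbalanced data it is often worth passing stratify=y so that both splits preserve the original class ratio; this is a variant of the example's split, not something the original code does. A sketch:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)

# stratify=y keeps the class proportions (approximately) equal in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(np.bincount(y_train) / len(y_train))
print(np.bincount(y_test) / len(y_test))
```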
We then train a LogisticRegression classifier on the training data by calling the fit() method with X_train and y_train. Logistic regression is chosen for its simplicity and effectiveness in binary classification problems.
After training, we predict the labels for the test set using the predict() method on X_test, resulting in predicted labels y_pred.
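Note that predict() applies a fixed 0.5 probability threshold. If you care more about recall (beta > 1) or precision (beta < 1), one option is to threshold predict_proba() yourself; a sketch, where the 0.3 threshold is an arbitrary illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = LogisticRegression(random_state=42).fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]   # probability of the positive class
y_pred_low = (proba >= 0.3).astype(int)   # lower threshold -> more positives, higher recall
print(y_pred_low.sum(), "positives at threshold 0.3 vs",
      clf.predict(X_test).sum(), "at the default 0.5")
```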
Finally, we calculate the F-beta scores using fbeta_score() for beta values of 0.5, 1, and 2. The F-beta score balances precision and recall, with beta < 1 emphasizing precision and beta > 1 emphasizing recall. This provides a more nuanced view of model performance than accuracy alone. The calculated F-beta scores are printed, giving insight into the model's ability to balance precision and recall under different weightings.
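You can verify what fbeta_score() computes by rebuilding it from precision_score() and recall_score() with the formula (1 + beta^2) * P * R / (beta^2 * P + R); the manual values match the library's output:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, precision_score, recall_score

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
y_pred = LogisticRegression(random_state=42).fit(X_train, y_train).predict(X_test)

p = precision_score(y_test, y_pred)
r = recall_score(y_test, y_pred)
for beta in (0.5, 1, 2):
    # Recompute F-beta by hand and compare with scikit-learn's value
    manual = (1 + beta**2) * p * r / (beta**2 * p + r)
    print(beta, round(manual, 4), round(fbeta_score(y_test, y_pred, beta=beta), 4))
```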