The F-beta score combines precision and recall into a single metric, weighted by a factor beta. The fbeta_score() function in scikit-learn balances the importance of precision and recall, making it well suited to binary and multiclass classification problems where the precision-recall trade-off is crucial. Keep in mind, however, that a single F-beta value can hide poor performance on a minority class, so on heavily imbalanced datasets it may not reflect overall model performance accurately and is best read alongside per-class metrics.
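Concretely, F-beta is defined as (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall. A minimal sketch of that formula in plain Python, using illustrative precision and recall values (0.8 and 0.6 are made up for demonstration, not taken from the example below):

```python
def f_beta(precision, recall, beta):
    # F-beta from its definition: (1 + beta^2) * P * R / (beta^2 * P + R)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.8, 0.6  # illustrative values
print(f_beta(p, r, 0.5))  # beta < 1: precision weighted more heavily
print(f_beta(p, r, 1.0))  # beta = 1: the harmonic mean of P and R (F1)
print(f_beta(p, r, 2.0))  # beta > 1: recall weighted more heavily
```

With precision above recall, the score drops as beta grows, since larger beta shifts the weight toward the weaker recall.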
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score
# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a Logistic Regression classifier
clf = LogisticRegression(random_state=42)
clf.fit(X_train, y_train)
# Predict on test set
y_pred = clf.predict(X_test)
# Calculate F-beta scores
fbeta_05 = fbeta_score(y_test, y_pred, beta=0.5)
fbeta_1 = fbeta_score(y_test, y_pred, beta=1)
fbeta_2 = fbeta_score(y_test, y_pred, beta=2)
print(f"F-beta score (beta=0.5): {fbeta_05:.2f}")
print(f"F-beta score (beta=1): {fbeta_1:.2f}")
print(f"F-beta score (beta=2): {fbeta_2:.2f}")
Running the example gives an output like:
F-beta score (beta=0.5): 0.73
F-beta score (beta=1): 0.67
F-beta score (beta=2): 0.63
The steps are as follows:
- Generate a synthetic binary classification dataset with a slight class imbalance using make_classification().
- Split the dataset into training and test sets using train_test_split().
- Train a LogisticRegression classifier on the training set.
- Use the trained classifier to make predictions on the test set with predict().
- Calculate F-beta scores for beta values of 0.5, 1, and 2 using fbeta_score().
First, we generate a synthetic dataset with 1000 samples and two classes, with a 70-30 class imbalance, using make_classification(). This setup helps demonstrate the F-beta score in scenarios where the class distribution is not equal.
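You can confirm the imbalance by counting the labels directly. A quick check (the exact counts can deviate slightly from 700/300, since make_classification() treats weights as approximate proportions):

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)

counts = np.bincount(y)           # samples per class label
print(counts)                     # roughly 700 negatives vs 300 positives
print(counts / counts.sum())      # class proportions
```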
Next, we split the dataset into training and test sets, with 80% for training and 20% for testing, using train_test_split(). This ensures that we have separate data for evaluating our model's performance.
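With imbalanced data it is often worth passing stratify=y so that both splits preserve the original class ratio; this is a variant of the example's split, not something the original code does. A sketch:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)

# stratify=y keeps the class proportions (approximately) equal in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(np.bincount(y_train) / len(y_train))
print(np.bincount(y_test) / len(y_test))
```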
We then train a LogisticRegression classifier on the training data by calling the fit() method with X_train and y_train. Logistic regression is chosen for its simplicity and effectiveness in binary classification problems.
After training, we predict the labels for the test set using the predict() method on X_test, resulting in predicted labels y_pred.
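Note that predict() applies a fixed 0.5 probability threshold. If you care more about recall (beta > 1) or precision (beta < 1), one option is to threshold predict_proba() yourself; a sketch, where the 0.3 threshold is an arbitrary illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = LogisticRegression(random_state=42).fit(X_train, y_train)

proba = clf.predict_proba(X_test)[:, 1]   # probability of the positive class
y_pred_low = (proba >= 0.3).astype(int)   # lower threshold -> more positives, higher recall
print(y_pred_low.sum(), "positives at threshold 0.3 vs",
      clf.predict(X_test).sum(), "at the default 0.5")
```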
Finally, we calculate the F-beta scores using fbeta_score() for beta values of 0.5, 1, and 2. The F-beta score balances precision and recall, with beta < 1 emphasizing precision and beta > 1 emphasizing recall. This provides a more nuanced view of model performance than accuracy alone. The calculated F-beta scores are printed, giving insight into the model's ability to balance precision and recall under different weightings.
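You can verify what fbeta_score() computes by rebuilding it from precision_score() and recall_score() with the formula (1 + beta^2) * P * R / (beta^2 * P + R); the manual values match the library's output:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score, precision_score, recall_score

X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.7, 0.3], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
y_pred = LogisticRegression(random_state=42).fit(X_train, y_train).predict(X_test)

p = precision_score(y_test, y_pred)
r = recall_score(y_test, y_pred)
for beta in (0.5, 1, 2):
    # Recompute F-beta by hand and compare with scikit-learn's value
    manual = (1 + beta**2) * p * r / (beta**2 * p + r)
    print(beta, round(manual, 4), round(fbeta_score(y_test, y_pred, beta=beta), 4))
```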