
Scikit-Learn d2_pinball_score() Metric

d2_pinball_score() is a regression metric that reports the fraction of pinball loss explained by a model's predictions at a chosen quantile level.

It compares the model's mean pinball loss at quantile level alpha with the loss of a baseline that always predicts the empirical alpha-quantile of the true values. The best possible score is 1.0. A score of 0.0 means the model does no better than that constant baseline, and the score can be negative when the model does worse.

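This metric is suited to quantile regression, for example when evaluating the lower or upper bound of a prediction interval. It is not appropriate for classification problems, and it is less informative for models that predict the conditional mean, where a metric such as r2_score() is usually a better fit. With alpha=0.5 it is equivalent to d2_absolute_error_score().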

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import d2_pinball_score

# Generate synthetic regression dataset
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a Gradient Boosting Regressor
regressor = GradientBoostingRegressor(random_state=42)
regressor.fit(X_train, y_train)

# Predict on test set
y_pred = regressor.predict(X_test)

# Calculate d2 pinball score; alpha=0.5 evaluates the predictions against the median
d2_pinball = d2_pinball_score(y_test, y_pred, alpha=0.5)
print(f"d2 Pinball Score: {d2_pinball:.2f}")

Running the example gives an output like:

d2 Pinball Score: 0.72

The steps are as follows:

  1. Generate a synthetic regression dataset using make_regression().
  2. Split the dataset into training and test sets using train_test_split().
  3. Train a GradientBoostingRegressor on the training set.
  4. Use the trained regressor to make predictions on the test set with predict().
  5. Calculate the score with d2_pinball_score() by comparing the predicted and true target values at the chosen alpha.

The code example shows how to use the d2_pinball_score() metric in scikit-learn to evaluate a regression model. The metric is most informative when the alpha passed to it matches the quantile the model was trained to predict.


