Scikit-Learn mean_poisson_deviance() Metric

Mean Poisson Deviance is a measure of how well a model predicts counts or event rates.

It is calculated as the average, over all samples, of the unit Poisson deviance 2 * (y * log(y / y_pred) - y + y_pred), where y is the observed value and y_pred is the predicted value.
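To make the calculation concrete, here is a minimal sketch that computes the metric by hand with NumPy and compares it to scikit-learn's result. The helper name manual_mean_poisson_deviance and the small arrays are made up for illustration; the formula assumes the usual convention that 0 * log(0) is treated as 0.

```python
import numpy as np
from sklearn.metrics import mean_poisson_deviance


def manual_mean_poisson_deviance(y_true, y_pred):
    """Hypothetical helper: average unit Poisson deviance, computed by hand."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # y * log(y / y_pred), using a ratio of 1 where y == 0 so the term is 0
    ratio = np.where(y_true > 0, y_true / y_pred, 1.0)
    log_term = y_true * np.log(ratio)
    unit_deviance = 2.0 * (log_term - y_true + y_pred)
    return unit_deviance.mean()


# Illustrative values only
y_true = np.array([0, 1, 2, 5, 3])
y_pred = np.array([0.5, 1.2, 1.8, 4.0, 3.5])

manual = manual_mean_poisson_deviance(y_true, y_pred)
library = mean_poisson_deviance(y_true, y_pred)
print(f"Manual:  {manual:.6f}")
print(f"sklearn: {library:.6f}")
```

The two printed values should agree up to floating-point rounding.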

Lower values indicate better model performance, with a value of 0 indicating perfect predictions.
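As a quick illustration of this ordering, the sketch below scores three sets of predictions against the same observed values: an exact match, a close match, and a constant guess. The arrays are invented for the example.

```python
import numpy as np
from sklearn.metrics import mean_poisson_deviance

# Illustrative observed counts
y_true = np.array([2, 4, 6, 8])

perfect = mean_poisson_deviance(y_true, np.array([2.0, 4.0, 6.0, 8.0]))
close = mean_poisson_deviance(y_true, np.array([2.1, 3.9, 6.2, 7.8]))
far = mean_poisson_deviance(y_true, np.array([5.0, 5.0, 5.0, 5.0]))

print(f"Perfect predictions:  {perfect:.4f}")
print(f"Close predictions:    {close:.4f}")
print(f"Constant predictions: {far:.4f}")
```

The exact match scores 0, and the deviance grows as the predictions move away from the observed values.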

This metric is commonly used for count data or event rates, such as predicting the number of occurrences of an event in a fixed period. It is not appropriate for classification problems or for targets that can take negative values: scikit-learn requires the true values to be non-negative and the predictions to be strictly positive.
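The sketch below shows how these input requirements surface in practice: mean_poisson_deviance() accepts zeros in the observed values, but raises a ValueError when a prediction is 0 or an observed value is negative. The arrays are illustrative, and the exact wording of the error message may vary between scikit-learn versions.

```python
import numpy as np
from sklearn.metrics import mean_poisson_deviance

y_true = np.array([0, 2, 5])

# Valid: zeros are allowed in y_true as long as every prediction is > 0
valid_score = mean_poisson_deviance(y_true, np.array([0.3, 2.0, 4.5]))
print(f"Valid inputs: {valid_score:.4f}")

# Invalid: a prediction of exactly 0 is rejected
try:
    mean_poisson_deviance(y_true, np.array([0.0, 2.0, 4.5]))
    zero_pred_rejected = False
except ValueError as exc:
    zero_pred_rejected = True
    print(f"Zero prediction rejected: {exc}")

# Invalid: a negative observed value is rejected
try:
    mean_poisson_deviance(np.array([-1, 2, 5]), np.array([0.3, 2.0, 4.5]))
    negative_true_rejected = False
except ValueError as exc:
    negative_true_rejected = True
    print(f"Negative target rejected: {exc}")
```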

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import PoissonRegressor
from sklearn.metrics import mean_poisson_deviance

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=2, noise=0.1, random_state=42)
y = y - y.min()  # Shift target so all values are non-negative (the minimum becomes 0)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a Poisson Regressor
model = PoissonRegressor()
model.fit(X_train, y_train)

# Predict on test set
y_pred = model.predict(X_test)

# Calculate mean Poisson deviance
mpd = mean_poisson_deviance(y_test, y_pred)
print(f"Mean Poisson Deviance: {mpd:.2f}")

Running the example gives an output like:

Mean Poisson Deviance: 0.41

The steps are as follows:

Generate a synthetic regression dataset using make_regression(), then shift the target so that all values are non-negative. This gives a target that PoissonRegressor and mean_poisson_deviance() accept, although the values are continuous rather than true counts.
Split the dataset into training and test sets using train_test_split(). This allows us to evaluate the model’s performance on unseen data.
Train a PoissonRegressor on the training set. This model is suitable for count data regression.
Use the trained regressor to make predictions on the test set with predict(). This step applies the model to new data.
Calculate the mean Poisson deviance of the predictions using mean_poisson_deviance() by comparing the predicted values to the true values. This metric gives us a quantitative measure of the model’s performance.

This example demonstrates how to use the mean_poisson_deviance() function from scikit-learn to evaluate the performance of a regression model for count data.
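As a variation on the steps above, the following sketch uses targets that are actual counts and adds a mean-only baseline for comparison. The data-generating process (Poisson counts whose log-rate is linear in two features), the coefficient values, and the choice of alpha=0.01 are all assumptions made for illustration, not part of the original example.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import PoissonRegressor
from sklearn.metrics import mean_poisson_deviance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical data: counts drawn from a Poisson distribution whose
# log-rate is a linear function of two features
X = rng.normal(size=(1000, 2))
true_rate = np.exp(1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1])
y = rng.poisson(true_rate)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Poisson regressor with light regularization (alpha chosen for illustration)
model = PoissonRegressor(alpha=0.01)
model.fit(X_train, y_train)

# Baseline that always predicts the mean of the training targets
baseline = DummyRegressor(strategy="mean")
baseline.fit(X_train, y_train)

mpd_model = mean_poisson_deviance(y_test, model.predict(X_test))
mpd_baseline = mean_poisson_deviance(y_test, baseline.predict(X_test))

print(f"PoissonRegressor deviance: {mpd_model:.3f}")
print(f"Mean baseline deviance:    {mpd_baseline:.3f}")
```

Comparing against a simple baseline helps put the deviance in context, since the raw value depends on the scale of the counts.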
