Mean Poisson Deviance is a measure of how well a model predicts counts or event rates.
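It is calculated as the average, over all samples, of the unit Poisson deviance 2 * (y * log(y / y_pred) - y + y_pred) between each observed value y and its predicted value y_pred, where the term y * log(y / y_pred) is taken to be 0 when y is 0.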
Lower values indicate better model performance, with a value of 0 indicating perfect predictions.
This metric is commonly used for count data or event rates, such as predicting the number of occurrences of an event in a fixed period. It is not suitable for classification problems or for targets that can take negative values: scikit-learn's mean_poisson_deviance() requires non-negative true values and strictly positive predictions.
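To make the calculation concrete, here is a minimal sketch that computes the metric by hand with NumPy and compares it to scikit-learn's result. The helper name manual_mean_poisson_deviance and the sample values are assumptions made for illustration; they are not part of scikit-learn or of the example in this article.

```python
import numpy as np
from scipy.special import xlogy
from sklearn.metrics import mean_poisson_deviance


def manual_mean_poisson_deviance(y_true, y_pred):
    """Hypothetical helper: mean of 2 * (y * log(y / mu) - y + mu)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # xlogy(y, y / mu) evaluates y * log(y / mu) and returns 0 where y == 0
    unit_deviance = 2.0 * (xlogy(y_true, y_true / y_pred) - y_true + y_pred)
    return unit_deviance.mean()


# Illustrative values only (assumed, not taken from the article's example)
y_true = np.array([0, 1, 2, 4, 7])
y_pred = np.array([0.5, 1.2, 1.8, 3.5, 8.0])

manual = manual_mean_poisson_deviance(y_true, y_pred)
reference = mean_poisson_deviance(y_true, y_pred)
print(f"Manual:       {manual:.6f}")
print(f"scikit-learn: {reference:.6f}")

# Predictions that match the observations exactly give a deviance of 0.
# The zero count is left out here because predictions must be strictly positive.
perfect = manual_mean_poisson_deviance(y_true[1:], y_true[1:])
print(f"Perfect predictions: {perfect:.6f}")
```

The two printed values should agree up to floating-point error, and the last line illustrates why 0 is the best achievable score.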
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import PoissonRegressor
from sklearn.metrics import mean_poisson_deviance
# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=2, noise=0.1, random_state=42)
y = y - y.min()  # Shift target so it is non-negative (the minimum becomes 0)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a Poisson Regressor
model = PoissonRegressor()
model.fit(X_train, y_train)
# Predict on test set
y_pred = model.predict(X_test)
# Calculate mean Poisson deviance
mpd = mean_poisson_deviance(y_test, y_pred)
print(f"Mean Poisson Deviance: {mpd:.2f}")
Running the example gives an output like:
Mean Poisson Deviance: 0.41
The steps are as follows:
- Generate a synthetic regression dataset using make_regression(), then shift the target so that all values are non-negative. This gives a target that a Poisson model can be fitted to, although it is continuous rather than true count data.
- Split the dataset into training and test sets using train_test_split(). This allows us to evaluate the model's performance on unseen data.
- Train a PoissonRegressor on the training set. This model is suitable for count data regression.
- Use the trained regressor to make predictions on the test set with predict(). This step applies the model to new data.
- Calculate the mean Poisson deviance of the predictions using mean_poisson_deviance() by comparing the predicted values to the true values. This metric gives us a quantitative measure of the model's performance.
This example demonstrates how to use the mean_poisson_deviance() function from scikit-learn to evaluate the performance of a regression model for count data.
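The target in the example above is a shifted continuous variable rather than a set of counts. As a variation, the following sketch draws integer counts from a Poisson distribution and compares the fitted model against a baseline that always predicts the training mean, which gives the deviance value something to be measured against. The coefficients, the alpha value, and the choice of DummyRegressor as a baseline are assumptions made for illustration only.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import PoissonRegressor
from sklearn.metrics import mean_poisson_deviance
from sklearn.model_selection import train_test_split

# Simulate count data: the event rate depends log-linearly on two features
# (the coefficients 1.0, 0.5 and -0.3 are arbitrary illustrative choices)
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2))
true_rate = np.exp(1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1])
y = rng.poisson(true_rate)  # non-negative integer counts

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Poisson regressor with light regularization (alpha is an assumed setting)
model = PoissonRegressor(alpha=1e-3).fit(X_train, y_train)

# Baseline that ignores the features and predicts the mean training count
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)

model_mpd = mean_poisson_deviance(y_test, model.predict(X_test))
baseline_mpd = mean_poisson_deviance(y_test, baseline.predict(X_test))

print(f"Model mean Poisson deviance:    {model_mpd:.3f}")
print(f"Baseline mean Poisson deviance: {baseline_mpd:.3f}")
```

A model that has learned something useful from the features should score lower than the baseline; if the two values are close, the features add little predictive power.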