Scikit-Learn explained_variance_score() Metric

Explained variance score measures the proportion of the variance in the dependent variable that is predictable from the independent variables.

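It compares the variance of the errors (residuals) of the model with the variance of the actual values. The best possible score is 1.0, and lower values are worse. The score is not bounded below by 0: it becomes negative when the residuals vary more than the actual values do.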

A value close to 1 indicates that the model explains a large portion of the variance. A value near 0 suggests the model explains little of it, roughly what you would get by always predicting the same constant.
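To make the definition concrete, here is a minimal sketch that computes the score by hand as 1 - Var(y_true - y_pred) / Var(y_true) and checks it against scikit-learn. The helper name manual_explained_variance and the small arrays are illustrative assumptions, not part of the library.

```python
import numpy as np
from sklearn.metrics import explained_variance_score

def manual_explained_variance(y_true, y_pred):
    """Hypothetical helper: 1 - Var(residuals) / Var(y_true), population variance."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    return 1.0 - np.var(residuals) / np.var(y_true)

# Small made-up arrays, chosen only for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_good = np.array([2.5, 0.0, 2.0, 8.0])   # close to the true values
y_bad = np.array([7.0, 2.0, -0.5, 3.0])   # true values in reverse order

manual_good = manual_explained_variance(y_true, y_good)
sklearn_good = explained_variance_score(y_true, y_good)
manual_bad = manual_explained_variance(y_true, y_bad)
sklearn_bad = explained_variance_score(y_true, y_bad)

print(f"Good predictions: manual={manual_good:.3f}, sklearn={sklearn_good:.3f}")
print(f"Bad predictions:  manual={manual_bad:.3f}, sklearn={sklearn_bad:.3f}")
```

The second pair of values comes out negative, which shows that the score can drop below 0 when the predictions are poor enough.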

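This metric is commonly used in regression problems to understand how well the model captures the variability of the target variable. However, the score by itself says nothing about overfitting: a high value on the training data does not guarantee good generalization, so it should be computed on held-out data. Estimates from small datasets or highly complex models can also be unreliable.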
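One further limitation is worth illustrating. Because the score depends only on the variance of the residuals, adding a constant offset to every prediction leaves it unchanged. The sketch below, which uses made-up arrays and an arbitrary offset of 10, contrasts this with r2_score(), which does penalize the offset.

```python
import numpy as np
from sklearn.metrics import explained_variance_score, r2_score

# Made-up values for illustration only
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])
y_shifted = y_pred + 10.0  # same predictions with a constant bias added

evs_original = explained_variance_score(y_true, y_pred)
evs_shifted = explained_variance_score(y_true, y_shifted)
r2_original = r2_score(y_true, y_pred)
r2_shifted = r2_score(y_true, y_shifted)

print(f"Explained variance: original={evs_original:.3f}, shifted={evs_shifted:.3f}")
print(f"R^2:                original={r2_original:.3f}, shifted={r2_shifted:.3f}")
```

For this reason it can be useful to report explained variance alongside a metric that is sensitive to systematic bias.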

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import explained_variance_score

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on test set
y_pred = model.predict(X_test)

# Calculate explained variance score
evs = explained_variance_score(y_test, y_pred)
print(f"Explained Variance Score: {evs:.2f}")

Running the example gives an output like:

Explained Variance Score: 1.00

The steps are as follows:

1. Generate a synthetic regression dataset using make_regression() with 1000 samples and 20 features.
2. Split the dataset into training and test sets using train_test_split(), reserving 20% for testing.
3. Train a LinearRegression model on the training data using the fit() method.
4. Use the trained model to predict the target values for the test set with predict().
5. Calculate the explained variance score using explained_variance_score(), comparing the predicted and true values of the test set.
6. Print the resulting explained variance score to evaluate the model’s performance.

This example demonstrates how to use the explained_variance_score() function from scikit-learn to evaluate the performance of a regression model.
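As a complement, the following sketch shows why the score should be read on held-out data rather than on the training set. It swaps in an unpruned DecisionTreeRegressor and a noisier, smaller dataset; both choices are assumptions made only to provoke overfitting and are not part of the example above.

```python
from sklearn.datasets import make_regression
from sklearn.metrics import explained_variance_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Smaller, noisier dataset than the main example (illustrative settings)
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# An unpruned tree can memorize the training data
tree = DecisionTreeRegressor(random_state=0)
tree.fit(X_train, y_train)

# Score the same model on the data it saw and on the data it did not
evs_train = explained_variance_score(y_train, tree.predict(X_train))
evs_test = explained_variance_score(y_test, tree.predict(X_test))

print(f"Train explained variance: {evs_train:.2f}")
print(f"Test explained variance:  {evs_test:.2f}")
```

A large gap between the two scores is a sign that the model has fit the training data more closely than it generalizes.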

See Also