Scikit-Learn mean_absolute_error() Metric

Mean Absolute Error (MAE) measures the average magnitude of errors in a set of predictions, without considering their direction.

It is calculated as the average of the absolute differences between predicted and actual values.
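
As a minimal sketch of this calculation, the snippet below computes MAE by hand with NumPy and checks the result against mean_absolute_error(). The small arrays are hypothetical values chosen only for illustration; they are not taken from the example later in this article.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual and predicted values, chosen only for illustration
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

# MAE by hand: average of the absolute differences
manual_mae = np.mean(np.abs(y_true - y_pred))

# The same quantity computed by scikit-learn
sklearn_mae = mean_absolute_error(y_true, y_pred)

print(f"Manual MAE: {manual_mae:.2f}")
print(f"scikit-learn MAE: {sklearn_mae:.2f}")
```

The absolute differences here are 0.5, 0.5, 0.0 and 1.0, so both lines should report 0.50.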

A lower MAE indicates better performance, with 0 being perfect. MAE is used for regression problems to evaluate the accuracy of continuous predictions.

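Because MAE averages absolute values, it cannot show whether a model tends to over-predict or under-predict. Each error also contributes in proportion to its size, so large errors are not emphasized the way they are by metrics that square the errors, such as Mean Squared Error (MSE).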
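
To illustrate the difference from MSE, here is a small sketch using two hypothetical sets of predictions with the same total absolute error: one spreads the error evenly, the other concentrates it in a single prediction. The arrays and variable names are made up for this illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical targets: all zeros, so each prediction equals its own error
y_true = np.zeros(4)

# Two hypothetical prediction sets with the same total absolute error
small_errors = np.array([1.0, 1.0, 1.0, 1.0])     # four errors of size 1
one_large_error = np.array([0.0, 0.0, 0.0, 4.0])  # a single error of size 4

mae_small = mean_absolute_error(y_true, small_errors)
mae_large = mean_absolute_error(y_true, one_large_error)
mse_small = mean_squared_error(y_true, small_errors)
mse_large = mean_squared_error(y_true, one_large_error)

print(f"MAE: {mae_small:.2f} vs {mae_large:.2f}")
print(f"MSE: {mse_small:.2f} vs {mse_large:.2f}")
```

MAE is 1.00 for both sets, while MSE rises from 1.00 to 4.00 for the set containing the single large error.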

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Generate synthetic regression dataset
X, y = make_regression(n_samples=1000, n_features=1, noise=0.1, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on test set
y_pred = model.predict(X_test)

# Calculate mean absolute error
mae = mean_absolute_error(y_test, y_pred)
print(f"Mean Absolute Error: {mae:.2f}")

Running the example gives an output like:

Mean Absolute Error: 0.08

The steps are as follows:

1. Generate a synthetic regression dataset using the make_regression() function, creating data with 1000 samples and 1 feature.
2. Split the dataset into training and testing sets using train_test_split(), reserving 20% of the data for testing.
3. Train a linear regression model using the LinearRegression class from scikit-learn, fitting it to the training data with fit().
4. Predict the target values for the test set using the trained model’s predict() method.
5. Calculate the Mean Absolute Error (MAE) using the mean_absolute_error() function by comparing the predicted values (y_pred) to the actual test values (y_test).

This example demonstrates how to use the mean_absolute_error() function from scikit-learn to evaluate the performance of a regression model.

See Also