Scikit-Learn ElasticNet Regression Model

ElasticNet is a linear regression model that combines L1 (Lasso) and L2 (Ridge) regularization techniques. It is particularly useful when dealing with high-dimensional datasets that may have correlated features.
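One way to make the combination of the two penalties concrete is to look at an edge case. The sketch below assumes the mix is controlled by scikit-learn's l1_ratio parameter, and that at l1_ratio=1.0 the L2 part drops out, so ElasticNet should give the same coefficients as Lasso with the same alpha. The dataset settings and variable names are illustrative choices, not part of the original example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# small synthetic dataset (illustrative settings)
X, y = make_regression(n_samples=100, n_features=10, noise=0.5, random_state=42)

# l1_ratio=1.0 leaves only the L1 penalty, which is the Lasso objective
enet_as_lasso = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# both models solve the same problem, so the coefficients should agree
print(np.allclose(enet_as_lasso.coef_, lasso.coef_))
```

Lowering l1_ratio from 1.0 moves weight from the L1 term to the L2 term, which is where ElasticNet starts to differ from Lasso.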

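The key hyperparameters of ElasticNet are alpha, which controls the overall regularization strength, and l1_ratio, which determines the balance between L1 and L2 penalties. alpha is a non-negative value (default 1.0), and larger values shrink the coefficients more strongly; values between 0.1 and 1.0 are a common starting point. l1_ratio must lie between 0 and 1 (default 0.5): a value of 1 gives a pure L1 (Lasso) penalty, a value of 0 gives a pure L2 (Ridge) penalty, and values in between mix the two.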
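To show how the two hyperparameters interact, here is a minimal sketch of the penalty term as described in the scikit-learn documentation for ElasticNet: alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2. The function name elastic_net_penalty and the coefficient vector w are hypothetical, introduced only for illustration.

```python
import numpy as np

def elastic_net_penalty(w, alpha, l1_ratio):
    """Penalty part of the ElasticNet objective (hypothetical helper)."""
    w = np.asarray(w, dtype=float)
    l1 = np.sum(np.abs(w))   # L1 norm of the coefficients
    l2_sq = np.sum(w ** 2)   # squared L2 norm of the coefficients
    return alpha * l1_ratio * l1 + 0.5 * alpha * (1.0 - l1_ratio) * l2_sq

# hypothetical coefficient vector, not taken from a fitted model
w = [1.0, -2.0, 0.0]

# same alpha, three different mixes of the L1 and L2 terms
for l1_ratio in (0.0, 0.5, 1.0):
    penalty = elastic_net_penalty(w, alpha=0.5, l1_ratio=l1_ratio)
    print('l1_ratio=%.1f penalty=%.3f' % (l1_ratio, penalty))
```

alpha scales the whole penalty, while l1_ratio only redistributes it between the two terms.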

ElasticNet is suitable for regression problems, especially when working with datasets that have a large number of features.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_absolute_error

# generate regression dataset
X, y = make_regression(n_samples=100, n_features=10, noise=0.5, random_state=42)

# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# create model
model = ElasticNet(alpha=0.5, l1_ratio=0.5)

# fit model
model.fit(X_train, y_train)

# evaluate model
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print('Mean Absolute Error: %.3f' % mae)

# make a prediction
row = [[1.91964511, 1.05497484, 0.58295121, 0.16552129, 1.7411697,
        0.19983267, 0.51308785, 0.20136889, 1.35178086, 0.14795505]]
yhat = model.predict(row)
print('Predicted Value: %.3f' % yhat[0])

Running the example gives an output like:

Mean Absolute Error: 46.678
Predicted Value: 273.374

The steps in this example are as follows:

1. A synthetic regression dataset is generated using make_regression() with specified parameters such as the number of samples (n_samples), features (n_features), and noise level (noise). The dataset is then split into training and test sets using train_test_split().
2. An ElasticNet model is created with alpha set to 0.5 (the overall regularization strength) and l1_ratio set to 0.5 (an equal balance between the L1 and L2 penalties). The model is then fit on the training data using the fit() method.
3. The model’s performance is evaluated by making predictions on the test set (y_pred) and comparing them to the actual values (y_test) using the mean absolute error metric.
4. A single prediction is made by passing a new data point to the predict() method.
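The example fixes alpha and l1_ratio by hand. As a possible variation, the sketch below uses scikit-learn's ElasticNetCV to choose both values by cross-validation on the training set. The candidate grids alphas and l1_ratios and the choice of cv=5 are assumptions made for illustration, not recommendations from the original example.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNetCV
from sklearn.metrics import mean_absolute_error

# same synthetic dataset and split as in the example
X, y = make_regression(n_samples=100, n_features=10, noise=0.5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# candidate values to search over (illustrative grids)
alphas = [0.01, 0.1, 0.5, 1.0]
l1_ratios = [0.1, 0.5, 0.9]

# pick alpha and l1_ratio by 5-fold cross-validation on the training data
model = ElasticNetCV(alphas=alphas, l1_ratio=l1_ratios, cv=5)
model.fit(X_train, y_train)

# report the selected hyperparameters and the test error
print('Selected alpha: %.2f' % model.alpha_)
print('Selected l1_ratio: %.1f' % model.l1_ratio_)
mae = mean_absolute_error(y_test, model.predict(X_test))
print('Mean Absolute Error: %.3f' % mae)
```

After fitting, the model is refit on the full training set with the selected values, so predict() can be used exactly as in step 4.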

This example demonstrates how to quickly set up and use an ElasticNet model for regression tasks in scikit-learn. It showcases the simplicity of creating, fitting, and evaluating the model, as well as making predictions on new data.

See Also