LassoLarsIC is a linear model that uses the LARS (Least Angle Regression) algorithm with an information criterion to determine the best value for the regularization parameter. Unlike LassoLarsCV, it does not use cross-validation; instead it selects the regularization strength directly from an information criterion, which is faster because the model is fit only once per candidate value. The algorithm is suited to regression problems and automatically selects important features by applying L1 regularization.
The key hyperparameter of LassoLarsIC is the criterion (either 'aic' or 'bic'), which determines the information criterion used for model selection.
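To see how the criterion hyperparameter affects model selection, a short sketch (using an illustrative synthetic dataset, not part of the example below) can fit one model per criterion and compare the regularization strength each one picks:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsIC

# illustrative synthetic dataset (parameters chosen for this sketch)
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=1)

# fit one model per information criterion and report the selected alpha
for criterion in ('aic', 'bic'):
    model = LassoLarsIC(criterion=criterion)
    model.fit(X, y)
    print('%s selected alpha: %.6f' % (criterion, model.alpha_))
```

BIC penalizes model complexity more heavily than AIC, so it tends to select a sparser model (a larger alpha).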
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LassoLarsIC
from sklearn.metrics import mean_squared_error
# generate regression dataset
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=1)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
# create model
model = LassoLarsIC(criterion='aic')
# fit model
model.fit(X_train, y_train)
# evaluate model
yhat = model.predict(X_test)
mse = mean_squared_error(y_test, yhat)
print('Mean Squared Error: %.3f' % mse)
# make a prediction
row = [[0.5, -0.1, 0.3, 0.2, -0.5, 0.1, -0.3, 0.2, 0.1, 0.3]]
yhat = model.predict(row)
print('Predicted: %.3f' % yhat[0])
Running the example gives an output like:
Mean Squared Error: 0.013
Predicted: 74.900
The steps are as follows:

1. Generate a synthetic regression dataset using the make_regression() function. This creates a dataset with a specified number of samples (n_samples), features (n_features), and noise level (noise) for added variability. Use a fixed random seed (random_state) for reproducibility.
2. Split the dataset into training and testing sets using train_test_split().
3. Create a LassoLarsIC model with the criterion set to 'aic' (Akaike Information Criterion).
4. Fit the model on the training data using the fit() method.
5. Evaluate the model's performance by predicting on the test set and calculating the mean squared error (mse) using the mean_squared_error() function.
6. Make a single prediction by passing a new data sample to the predict() method.
This example demonstrates how to use the LassoLarsIC
model for regression tasks, emphasizing its capability to automatically select important features through regularization. The use of the AIC criterion helps in model selection, ensuring a balance between model complexity and goodness of fit.
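The feature selection can be observed directly by inspecting the fitted coefficients: an L1-penalized model drives the coefficients of uninformative features exactly to zero. The sketch below assumes a variant of the synthetic dataset where only some features are informative (n_informative=5 is an assumption added here so that some true coefficients are zero):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsIC

# synthetic data where only 5 of 10 features carry signal
# (n_informative=5 is assumed for this sketch)
X, y = make_regression(n_samples=100, n_features=10, n_informative=5,
                       noise=0.1, random_state=1)

model = LassoLarsIC(criterion='aic')
model.fit(X, y)

# coefficients that are exactly zero correspond to features the
# L1 penalty has dropped from the model
selected = np.flatnonzero(model.coef_)
print('selected alpha: %.6f' % model.alpha_)
print('non-zero coefficients: %d of %d' % (len(selected), X.shape[1]))
```

Fewer non-zero coefficients than total features indicates that the model has pruned the ones it judged uninformative.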