
Scikit-Learn LassoLarsIC Regression Model

LassoLarsIC is a linear model that uses the LARS (Least Angle Regression) algorithm and selects the regularization parameter with an information criterion (AIC or BIC) rather than cross-validation, which makes model selection fast on small to medium datasets. Because it applies L1 regularization, it can drive some coefficients to exactly zero and thereby performs automatic feature selection.

The key hyperparameter of LassoLarsIC is criterion (either 'aic' or 'bic'), which determines the information criterion used to choose the regularization strength.

The algorithm is appropriate for regression problems.
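As a brief sketch of how the criterion choice plays out in practice, the snippet below (using the same synthetic-data setup as the full example that follows) fits one model per criterion and prints the regularization strength each one selects via the fitted alpha_ attribute:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsIC

# small synthetic regression problem
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=1)

# fit once per information criterion and compare the chosen alpha
for crit in ('aic', 'bic'):
    model = LassoLarsIC(criterion=crit)
    model.fit(X, y)
    print('%s selected alpha_=%.6f' % (crit, model.alpha_))
```

BIC penalizes model complexity more heavily than AIC, so it tends to select sparser models (a larger alpha) when the two disagree.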

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LassoLarsIC
from sklearn.metrics import mean_squared_error

# generate regression dataset
X, y = make_regression(n_samples=100, n_features=10, noise=0.1, random_state=1)

# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# create model
model = LassoLarsIC(criterion='aic')

# fit model
model.fit(X_train, y_train)

# evaluate model
yhat = model.predict(X_test)
mse = mean_squared_error(y_test, yhat)
print('Mean Squared Error: %.3f' % mse)

# make a prediction
row = [[0.5, -0.1, 0.3, 0.2, -0.5, 0.1, -0.3, 0.2, 0.1, 0.3]]
yhat = model.predict(row)
print('Predicted: %.3f' % yhat[0])

Running the example gives an output like:

Mean Squared Error: 0.013
Predicted: 74.900

The steps are as follows:

  1. Generate a synthetic regression dataset using the make_regression() function. This creates a dataset with a specified number of samples (n_samples), features (n_features), and noise level (noise) for added variability. Use a fixed random seed (random_state) for reproducibility. Split the dataset into training and testing sets using train_test_split().

  2. Create a LassoLarsIC model with the criterion set to 'aic' (Akaike Information Criterion).

  3. Fit the model on the training data using the fit() method.

  4. Evaluate the model’s performance by predicting on the test set and calculating the mean squared error (mse) using the mean_squared_error() function.

  5. Make a single prediction by passing a new data sample to the predict() method.

This example demonstrates how to use the LassoLarsIC model for regression tasks, emphasizing its capability to automatically select important features through regularization. The use of the AIC criterion helps in model selection, ensuring a balance between model complexity and goodness of fit.
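To see the feature selection directly, you can inspect the fitted coef_ attribute: coefficients shrunk to exactly zero correspond to features the model has dropped. A minimal sketch, using a dataset where only a few features are informative (the n_informative value here is an illustrative choice):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsIC

# dataset with 20 features, of which only 5 carry signal
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=0.1, random_state=1)

model = LassoLarsIC(criterion='aic')
model.fit(X, y)

# zero coefficients mark features the L1 penalty has discarded
n_selected = np.count_nonzero(model.coef_)
print('features kept: %d of %d' % (n_selected, X.shape[1]))
```

With low noise, the number of retained features should be close to the number of informative ones, illustrating the sparsity induced by the L1 penalty.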


