LassoLars is a regression algorithm that combines the Lasso method with the Least Angle Regression (LARS) algorithm. LassoLarsCV, on the other hand, extends LassoLars by incorporating built-in cross-validation for automatic hyperparameter tuning.
In scikit-learn, the LassoLars class provides an implementation of the Lasso model using the LARS algorithm. Key hyperparameters include alpha (regularization strength) and fit_intercept (whether to calculate the intercept). Manually tuning these hyperparameters can be challenging without prior knowledge of the data.
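To make the manual-tuning point concrete, here is a minimal sketch of hand-picking alpha on a held-out validation split. The candidate grid and dataset sizes are illustrative assumptions, not part of the main example below:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLars
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Small synthetic dataset (sizes chosen for illustration only)
X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Try a few candidate alpha values by hand and inspect validation error
for alpha in [1.0, 0.1, 0.01, 0.001]:
    model = LassoLars(alpha=alpha).fit(X_train, y_train)
    mse = mean_squared_error(y_val, model.predict(X_val))
    print(f"alpha={alpha}: validation MSE={mse:.3f}")
```

Picking the grid itself still requires judgment, which is exactly the step LassoLarsCV automates.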
LassoLarsCV simplifies this process by using cross-validation to automatically select the optimal alpha value. Its key hyperparameters include cv (number of cross-validation folds) and max_n_alphas (the maximum number of candidate alpha values, taken from points along the LARS path, to evaluate). This automation helps ensure better model performance, but at the cost of increased computation time.
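A minimal sketch of the cross-validated variant follows; the dataset here is an illustrative assumption. After fitting, the chosen alpha is exposed as the alpha_ attribute, and mse_path_ holds the per-fold cross-validation errors along the candidate grid:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsCV

# Small synthetic dataset (sizes chosen for illustration only)
X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)

# cv sets the number of folds; candidate alphas come from the LARS path,
# capped at max_n_alphas points
model = LassoLarsCV(cv=5, max_n_alphas=1000).fit(X, y)
print(f"Selected alpha: {model.alpha_:.6f}")
print(f"CV error grid shape (alphas x folds): {model.mse_path_.shape}")
```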
The primary difference between the two is that LassoLars requires manual alpha selection, while LassoLarsCV automates it. LassoLars is faster and suitable for quick experiments when a good alpha value is already known, whereas LassoLarsCV is better for thorough model selection, especially on new datasets.
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLars, LassoLarsCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Generate synthetic regression dataset
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Fit and evaluate LassoLars with default hyperparameters
lasso_lars = LassoLars(alpha=1.0, fit_intercept=True)
lasso_lars.fit(X_train, y_train)
y_pred_lars = lasso_lars.predict(X_test)
print(f"LassoLars MSE: {mean_squared_error(y_test, y_pred_lars):.3f}")
# Fit and evaluate LassoLarsCV with cross-validation
lasso_lars_cv = LassoLarsCV(cv=5)
lasso_lars_cv.fit(X_train, y_train)
y_pred_lars_cv = lasso_lars_cv.predict(X_test)
print(f"\nLassoLarsCV MSE: {mean_squared_error(y_test, y_pred_lars_cv):.3f}")
print(f"Best alpha: {lasso_lars_cv.alpha_}")
Running the example gives an output like:
LassoLars MSE: 10.715
LassoLarsCV MSE: 0.011
Best alpha: 0.0014663914873503468
- Generate a synthetic regression dataset using make_regression.
- Split the data into training and test sets using train_test_split.
- Instantiate LassoLars with default hyperparameters, fit it on the training data, and evaluate its performance on the test set.
- Instantiate LassoLarsCV with 5-fold cross-validation, fit it on the training data, and evaluate its performance on the test set.
- Compare the test set performance (mean squared error) of both models and print the best alpha found by LassoLarsCV.
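Because Lasso drives uninformative coefficients to exactly zero, it can also be instructive to compare the sparsity of the two fitted models. This sketch sets n_informative=5 (an assumption the original example does not make) so that most of the 20 features are pure noise:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLars, LassoLarsCV
from sklearn.metrics import mean_squared_error

# Only 5 of 20 features carry signal (an illustrative assumption)
X, y = make_regression(n_samples=1000, n_features=20, n_informative=5,
                       noise=0.1, random_state=42)

strong = LassoLars(alpha=1.0).fit(X, y)  # fixed, untuned regularization
tuned = LassoLarsCV(cv=5).fit(X, y)      # alpha chosen by cross-validation

print(f"Nonzero coefficients at alpha=1.0:   {np.sum(strong.coef_ != 0)}")
print(f"Nonzero coefficients at tuned alpha: {np.sum(tuned.coef_ != 0)}")
print(f"In-sample MSE at alpha=1.0:   {mean_squared_error(y, strong.predict(X)):.3f}")
print(f"In-sample MSE at tuned alpha: {mean_squared_error(y, tuned.predict(X)):.3f}")
```

The fixed alpha=1.0 biases the surviving coefficients toward zero, which is one reason the untuned model's MSE in the output above is so much larger than the cross-validated one's.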