
Configure Lasso "selection" Parameter

Lasso (Least Absolute Shrinkage and Selection Operator) is a linear regression model that performs regularization and feature selection by adding an L1 penalty term to the loss function. The selection parameter in scikit-learn’s Lasso class controls the order in which coefficients are updated during coordinate descent.
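Before looking at selection itself, a minimal sketch of the L1 penalty’s feature-selection effect may help (the dataset sizes and alpha value here are illustrative choices, not from the example below):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: only 5 of 50 features carry signal
X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       noise=1.0, random_state=0)

lasso = Lasso(alpha=1.0, random_state=0)
lasso.fit(X, y)

# The L1 penalty drives many coefficients exactly to zero
n_nonzero = (lasso.coef_ != 0).sum()
print(f"non-zero coefficients: {n_nonzero} of {lasso.coef_.size}")
```

The count of surviving coefficients depends on alpha; larger values zero out more features.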

The two options for selection are ‘cyclic’ (default) and ‘random’. With ‘cyclic’, coefficients are updated sequentially by looping over the features in order; with ‘random’, a randomly chosen coefficient is updated at each iteration.

The default value for selection is ‘cyclic’. In practice, ‘cyclic’ is often kept for its deterministic behavior, while ‘random’ can converge faster on high-dimensional datasets, particularly when the convergence tolerance (tol) is loose.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=1000, n_features=100, n_informative=10,
                       noise=5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different selection values
selection_values = ['cyclic', 'random']
mse_scores = []

for sel in selection_values:
    lasso = Lasso(selection=sel, random_state=42)
    lasso.fit(X_train, y_train)
    y_pred = lasso.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_scores.append(mse)
    print(f"selection={sel}, MSE: {mse:.3f}")

Running the example gives an output like:

selection=cyclic, MSE: 38.693
selection=random, MSE: 38.618

The key steps in this example are:

  1. Generate a synthetic regression dataset with informative and noise features
  2. Split the data into train and test sets
  3. Train Lasso models with different selection values
  4. Evaluate the mean squared error of each model on the test set
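To probe the speed claim rather than just accuracy, the comparison can be extended with a rough timing sketch on a wider dataset (the dimensions, tol, and max_iter here are illustrative; timings are machine-dependent and ‘random’ is not guaranteed to be faster on any given run):

```python
import time
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# A wider dataset, where coordinate order can matter more
X, y = make_regression(n_samples=200, n_features=1000, n_informative=50,
                       noise=5, random_state=42)

for sel in ['cyclic', 'random']:
    lasso = Lasso(selection=sel, random_state=42, max_iter=1000, tol=1e-3)
    start = time.perf_counter()
    lasso.fit(X, y)
    elapsed = time.perf_counter() - start
    print(f"selection={sel}, iterations={lasso.n_iter_}, fit time: {elapsed:.3f}s")
```

The fitted model’s n_iter_ attribute reports how many coordinate-descent passes were needed, which is often a more stable comparison than wall-clock time.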

Some tips and heuristics for setting selection:

  1. Keep the default ‘cyclic’ when deterministic behavior matters and the feature count is modest
  2. Try ‘random’ on high-dimensional datasets, where it can converge in fewer passes
  3. When using ‘random’, set random_state so results are reproducible

Issues to consider:

  1. The two settings can yield slightly different coefficients at the same tolerance, since they traverse the coordinates in different orders
  2. Any speed difference is usually negligible on small, low-dimensional problems
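Because ‘random’ introduces stochasticity, it is worth confirming that fixing random_state makes it fully reproducible. A minimal check (dataset parameters are illustrative):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=50, n_informative=10,
                       noise=5, random_state=42)

# Same seed -> identical coefficients, even with selection='random'
a = Lasso(selection='random', random_state=0).fit(X, y)
b = Lasso(selection='random', random_state=0).fit(X, y)
print(np.allclose(a.coef_, b.coef_))
```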

See Also