Configure Lasso "precompute" Parameter

The precompute parameter in scikit-learn’s Lasso class controls whether the Gram matrix (X^T X) is computed once before fitting and reused by the solver, or whether the solver works on X directly.

Lasso, or Least Absolute Shrinkage and Selection Operator, is a linear regression model that performs L1 regularization. It adds a penalty term to the loss function, encouraging sparse coefficients and feature selection.

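The precompute parameter accepts True, False, or an array-like of shape (n_features, n_features). When True, the Gram matrix is computed once before fitting. When False, no Gram matrix is formed and the coordinate descent solver operates on X directly. An array-like value is used as the Gram matrix itself. For sparse input, scikit-learn always treats this option as False to preserve sparsity.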
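As a minimal sketch of the array-like option, the snippet below builds the Gram matrix by hand and passes it to Lasso. It assumes fit_intercept=False so that a matrix computed from the raw X matches the data the solver actually sees; when an intercept is fitted, scikit-learn centers X internally and a Gram matrix built from uncentered data would no longer correspond to it. The dataset size and the alpha value are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Small synthetic problem (sizes chosen only for illustration)
X, y = make_regression(n_samples=200, n_features=20, noise=0.5, random_state=42)

# Gram matrix of the features, shape (n_features, n_features)
gram = X.T @ X

# fit_intercept=False: X is used as-is, so the hand-built Gram matrix matches it
lasso_gram = Lasso(alpha=0.1, precompute=gram, fit_intercept=False)
lasso_gram.fit(X, y)

# Reference fit that works on X directly
lasso_plain = Lasso(alpha=0.1, precompute=False, fit_intercept=False)
lasso_plain.fit(X, y)

# Both fits minimize the same objective, so the coefficients should agree closely
max_diff = np.max(np.abs(lasso_gram.coef_ - lasso_plain.coef_))
print(f"Gram shape: {gram.shape}, max coefficient difference: {max_diff:.2e}")
```

Passing the matrix explicitly is mainly useful when the same Gram matrix can be reused across several fits, for example when trying multiple alpha values on the same data.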

The default value for precompute is False.

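In practice, setting precompute to True can speed up training when the number of samples is large compared to the number of features: the Gram matrix has shape (n_features, n_features), so it is small relative to the data, and the solver updates no longer need a pass over all samples. The trade-off is the one-time cost of computing the matrix and the extra memory needed to store it, which grows quadratically with the number of features. Whether this pays off depends on how many iterations the solver needs, so it is worth timing both settings on your own data.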
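To make the memory cost concrete, here is a rough back-of-the-envelope calculation. It assumes dense float64 arrays and uses the shape of the training split from the example on this page (80% of 100,000 samples, 1000 features); it counts only the raw array storage, not any temporary copies made during fitting.

```python
import numpy as np

# Shape of the training split in the example on this page
n_samples, n_features = 80_000, 1_000
bytes_per_value = np.dtype(np.float64).itemsize  # 8 bytes per float64

# The Gram matrix is (n_features, n_features); the data is (n_samples, n_features)
gram_bytes = n_features * n_features * bytes_per_value
data_bytes = n_samples * n_features * bytes_per_value

print(f"Gram matrix: {gram_bytes / 1e6:.0f} MB")    # Gram matrix: 8 MB
print(f"Training data: {data_bytes / 1e6:.0f} MB")  # Training data: 640 MB
```

With 1000 features the extra storage is modest, but because it scales with the square of the feature count, 100,000 features would already need about 80 GB for the Gram matrix alone.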

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
import time

# Generate synthetic dataset
X, y = make_regression(n_samples=100000, n_features=1000, noise=0.5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different precompute settings
precompute_settings = [True, False]
scores = []
times = []

for setting in precompute_settings:
    lasso = Lasso(precompute=setting, random_state=42)
    # Time only the fit so the measurement reflects training cost
    start = time.perf_counter()
    lasso.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    # Evaluate on the held-out test set
    y_pred = lasso.predict(X_test)
    score = r2_score(y_test, y_pred)
    scores.append(score)
    times.append(elapsed)
    print(f"precompute={setting}, R^2 Score: {score:.3f}, Time: {elapsed:.3f}s")

Running the example gives an output like:

precompute=True, R^2 Score: 1.000, Time: 1.905s
precompute=False, R^2 Score: 1.000, Time: 1.657s

The key steps in this example are:

  1. Generate a synthetic regression dataset with 100,000 samples and 1000 features
  2. Split the data into train and test sets
  3. Train Lasso models with precompute set to True and False
  4. Evaluate the R^2 score and training time for each model
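Single timings like these are noisy, so a slightly more careful comparison repeats each fit and keeps the best time. The sketch below is one way to do that; time_fit is a hypothetical helper written for this illustration, not part of scikit-learn, and the dataset is deliberately smaller than the one above so the snippet runs quickly.

```python
import time
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso


def time_fit(precompute, X, y, repeats=3):
    """Return the best wall-clock fit time over `repeats` runs (hypothetical helper)."""
    best = float("inf")
    for _ in range(repeats):
        model = Lasso(precompute=precompute)
        start = time.perf_counter()
        model.fit(X, y)  # only the fit is timed
        best = min(best, time.perf_counter() - start)
    return best


# Smaller dataset than the main example so the sketch runs quickly
X, y = make_regression(n_samples=2000, n_features=50, noise=0.5, random_state=42)

fit_times = {setting: time_fit(setting, X, y) for setting in (True, False)}
for setting, seconds in fit_times.items():
    print(f"precompute={setting}, fit time: {seconds:.4f}s")
```

The absolute numbers depend on hardware and on the dataset shape, so treat the result as a guide for your own data, not a general ranking of the two settings.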

Some tips and heuristics for setting precompute:

Issues to consider:


