Scikit-Learn MultiTaskLasso Model

MultiTaskLasso is an extension of Lasso regression that fits multiple regression targets jointly. It applies a mixed L1/L2 penalty to the coefficient matrix, so each feature is either kept for every task or dropped for every task. This makes it particularly useful when the outputs are related and depend on a common subset of features.
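
To make the shared sparsity concrete, here is a minimal sketch on hypothetical synthetic data. The arrays, the random seed, and alpha=0.5 are illustrative assumptions and are not part of the main example. It fits a MultiTaskLasso model and compares, feature by feature, whether any target uses the feature with whether every target uses it.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

# Hypothetical data: 3 targets that all depend on the same 2 of 10 features
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
true_coef = np.zeros((3, 10))
true_coef[:, :2] = rng.uniform(1.0, 3.0, size=(3, 2))
Y = X @ true_coef.T + 0.1 * rng.randn(200, 3)

# alpha=0.5 is an arbitrary illustrative choice
model = MultiTaskLasso(alpha=0.5, max_iter=1000).fit(X, Y)

# coef_ has shape (n_targets, n_features): one row per target, one column per feature
nonzero = model.coef_ != 0

# For each feature, check whether any target and whether every target uses it
used_by_any = nonzero.any(axis=0)
used_by_all = nonzero.all(axis=0)
print('Used by at least one target:', used_by_any)
print('Used by every target:       ', used_by_all)
```

If the joint penalty behaves as described, the two printed masks should match: no feature is selected for only some of the targets.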

Key hyperparameters for MultiTaskLasso include alpha (regularization strength; larger values zero out more features across all tasks) and max_iter (maximum number of iterations for the coordinate descent solver).
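
Rather than fixing alpha by hand, one option is to choose it by cross-validation. The sketch below uses MultiTaskLassoCV from sklearn.linear_model, which is not used elsewhere in this example; the 5-fold setting and the dataset parameters are assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import MultiTaskLassoCV

# Same kind of synthetic multi-output dataset as in the main example
X, y = make_regression(n_samples=100, n_features=5, n_targets=3, noise=0.1, random_state=1)

# 5-fold cross-validation over an automatically generated grid of alpha values
cv_model = MultiTaskLassoCV(cv=5, max_iter=1000)
cv_model.fit(X, y)

# alpha_ holds the regularization strength selected by cross-validation
print('Selected alpha: %.4f' % cv_model.alpha_)
```

The selected value can then be passed as alpha when constructing a plain MultiTaskLasso model.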

This algorithm is suitable for multi-output regression problems, where there are multiple dependent variables to predict simultaneously.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import MultiTaskLasso
from sklearn.metrics import mean_squared_error

# generate synthetic dataset for multi-output regression
X, y = make_regression(n_samples=100, n_features=5, n_targets=3, noise=0.1, random_state=1)

# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# create model
model = MultiTaskLasso(alpha=1.0, max_iter=1000)

# fit model
model.fit(X_train, y_train)

# evaluate model
yhat = model.predict(X_test)
mse = mean_squared_error(y_test, yhat)
print('Mean Squared Error: %.3f' % mse)

# make a prediction
row = [[-1.10325445, -0.49821356, -0.05962247, -0.89224592, -0.70158632]]
yhat = model.predict(row)
print('Predicted: %s' % yhat[0])

Running the example gives an output like:

Mean Squared Error: 1.595
Predicted: [-170.0502607   -95.75512989 -138.72270689]

The steps are as follows:

Generate a synthetic multi-output regression dataset using the make_regression() function. This creates a dataset with a specified number of samples (n_samples), features (n_features), and target variables (n_targets). A fixed random seed (random_state) ensures reproducibility.
Split the dataset into training and test sets using train_test_split().
Instantiate a MultiTaskLasso model with specified hyperparameters (alpha for regularization strength and max_iter for the maximum number of iterations).
Fit the MultiTaskLasso model on the training data using the fit() method.
Evaluate the model’s performance by predicting the test set values (yhat) and comparing them to the actual values (y_test) using the mean squared error metric.
Make a single prediction by passing a new data sample to the predict() method.
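
The evaluation step above reports a single error averaged over all targets. As a hedged extension, the sketch below repeats the same setup and passes multioutput='raw_values' to mean_squared_error() to get one error per target; the variable names per_target_mse and overall_mse are introduced here for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import MultiTaskLasso
from sklearn.metrics import mean_squared_error

# Recreate the dataset, split, and model from the main example
X, y = make_regression(n_samples=100, n_features=5, n_targets=3, noise=0.1, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
model = MultiTaskLasso(alpha=1.0, max_iter=1000).fit(X_train, y_train)
yhat = model.predict(X_test)

# One MSE value per target column instead of a single averaged score
per_target_mse = mean_squared_error(y_test, yhat, multioutput='raw_values')
# Default behaviour: uniform average of the per-target errors
overall_mse = mean_squared_error(y_test, yhat)

print('Per-target MSE:', per_target_mse)
print('Overall MSE: %.3f' % overall_mse)
```

Per-target errors can reveal when one output is fit much worse than the others, which the averaged score hides.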

This example demonstrates how to implement and use a MultiTaskLasso model for multi-output regression tasks, highlighting its ability to handle multiple related regression problems simultaneously with enforced sparsity across tasks.

See Also