MultiTaskLassoCV fits a multi-task Lasso model, in which several regression tasks are solved jointly under a shared mixed L1/L2 penalty, and it selects the regularization strength alpha by cross-validation.
The key hyperparameters include alphas (the alpha values to try), cv (the cross-validation splitting strategy), and n_jobs (the number of jobs to run in parallel).
This algorithm is appropriate for multi-target regression problems, where multiple outputs are predicted simultaneously.
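For reference, the objective that the multi-task Lasso minimizes, as given in the scikit-learn documentation, can be written as below. The symbols are notation introduced here rather than taken from this article: X is the input matrix, Y the matrix of targets, and W the coefficient matrix with one row per feature and one column per task.

```latex
\min_{W} \; \frac{1}{2\, n_{\text{samples}}} \lVert Y - XW \rVert_{\text{Fro}}^{2} + \alpha \lVert W \rVert_{21},
\qquad
\lVert W \rVert_{21} = \sum_{i} \sqrt{\sum_{j} w_{ij}^{2}}
```

Because the penalty sums the Euclidean norm of each row of W, it acts on all of a feature's coefficients as a group, and alpha controls how strongly whole rows are pushed toward zero.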
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import MultiTaskLassoCV
from sklearn.metrics import mean_squared_error
# generate multi-output regression dataset
X, y = make_regression(n_samples=100, n_features=20, n_targets=3, noise=0.1, random_state=1)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
# create model
model = MultiTaskLassoCV(cv=5)
# fit model
model.fit(X_train, y_train)
# evaluate model
yhat = model.predict(X_test)
mse = mean_squared_error(y_test, yhat)
print('Mean Squared Error: %.3f' % mse)
# make a prediction
row = [[-0.255214, -0.306027, -1.550660, -0.974401, -0.414431, 1.103627, -0.684111, -0.788554, -0.888296, -0.977236,
-0.435426, 0.252740, 0.202236, -0.939693, 0.042733, -0.191863, -1.577503, -0.637749, 0.421209, -0.503362]]
yhat = model.predict(row)
print('Predicted: %s' % yhat[0])
Running the example gives an output like:
Mean Squared Error: 0.212
Predicted: [-366.49365558 -309.28908121 -327.604364 ]
The steps are as follows:
1. First, a synthetic multi-output regression dataset is generated using the make_regression() function. This creates a dataset with a specified number of samples (n_samples), features (n_features), targets (n_targets), and noise level (noise), with a fixed random seed (random_state) for reproducibility. The dataset is split into training and test sets using train_test_split().
2. Next, a MultiTaskLassoCV model is instantiated with the cross-validation parameter set to 5. The model is then fit on the training data using the fit() method.
3. The performance of the model is evaluated by comparing the predictions (yhat) to the actual values (y_test) using the mean squared error metric.
4. A single prediction can be made by passing a new data sample to the predict() method.
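As a variation on the steps above, the sketch below sets the hyperparameters explicitly and inspects what cross-validation chose. The alpha grid (np.logspace(-2, 2, 20)), n_jobs=-1, and max_iter=10000 are illustrative assumptions, not values from the example above. Passing multioutput='raw_values' to mean_squared_error reports one error per target instead of a single average.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import MultiTaskLassoCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# same synthetic dataset and split as in the main example
X, y = make_regression(n_samples=100, n_features=20, n_targets=3, noise=0.1, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# assumed candidate grid: 20 alphas spaced logarithmically between 0.01 and 100
alphas = np.logspace(-2, 2, 20)

# max_iter is raised only to reduce the chance of convergence warnings (assumption)
model = MultiTaskLassoCV(alphas=alphas, cv=5, n_jobs=-1, max_iter=10000)
model.fit(X_train, y_train)

# alpha_ holds the regularization strength selected by cross-validation
chosen_alpha = model.alpha_
print('Chosen alpha: %.4f' % chosen_alpha)

# one mean squared error per target rather than a single averaged value
per_target_mse = mean_squared_error(y_test, model.predict(X_test), multioutput='raw_values')
print('Per-target MSE: %s' % per_target_mse)
```

If the chosen alpha sits at either end of the grid, widening the grid in that direction is usually worth trying.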
This example demonstrates how to set up and use a MultiTaskLassoCV model for multi-target regression in scikit-learn.
The model predicts all outputs jointly and is most useful when the outputs are related and are expected to depend on the same subset of features, because its penalty encourages sparse coefficients with a sparsity pattern shared across targets. Once fit, the model can be used to make predictions on new data.
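The sparsity of a fitted model can be checked from its coef_ attribute, which has shape (n_targets, n_features). The following is a minimal sketch that reuses the synthetic dataset from the example and counts how many features end up with a zero coefficient for every target. The variable names are illustrative, and the count depends on the data and on the alpha that cross-validation selects.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import MultiTaskLassoCV

# same synthetic dataset as in the main example
X, y = make_regression(n_samples=100, n_features=20, n_targets=3, noise=0.1, random_state=1)

model = MultiTaskLassoCV(cv=5).fit(X, y)

# coefficient matrix: one row per target, one column per feature
coef = model.coef_
print('Coefficient matrix shape: %s' % (coef.shape,))

# a feature counts as dropped when its coefficient is zero for every target
dropped_mask = np.all(coef == 0, axis=0)
n_dropped = int(dropped_mask.sum())
print('Features dropped for all targets: %d of %d' % (n_dropped, coef.shape[1]))

# indices of the features that are still used by the model
selected = np.flatnonzero(~dropped_mask)
print('Selected feature indices: %s' % selected)
```

Because the penalty treats each feature's coefficients across targets as a group, a feature tends to be kept or dropped for all targets together rather than target by target.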