
Configure LinearRegression "positive" Parameter

When set to True, the positive parameter in scikit-learn's LinearRegression constrains every model coefficient to be non-negative. The constraint applies to the coefficients only, not to the intercept.

This is useful when domain knowledge says that each feature can only increase the target or leave it unchanged, so that a negative coefficient would not be meaningful or interpretable.

By default, positive is set to False, allowing coefficients to be either positive or negative.
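
Before the full comparison, a small sketch on hypothetical data may help show what the constraint does. The data here (a single feature with a strictly decreasing relationship, y = -3x - 2) is an assumption chosen for illustration and does not come from any real dataset. With positive=True the slope should be clamped to zero, while the intercept remains free to be negative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: y decreases as x increases (y = -3x - 2)
X = np.arange(10, dtype=float).reshape(-1, 1)
y = -3.0 * X.ravel() - 2.0

# The default setting is positive=False
default_positive = LinearRegression().get_params()["positive"]

# Unconstrained fit recovers the negative slope
free = LinearRegression().fit(X, y)

# Constrained fit cannot use a negative slope, so the coefficient is pushed to zero
nonneg = LinearRegression(positive=True).fit(X, y)

print("Default positive:", default_positive)
print("Unconstrained coef / intercept:", free.coef_, free.intercept_)
print("Constrained coef / intercept:  ", nonneg.coef_, nonneg.intercept_)
```

In this sketch the constrained model falls back to predicting the mean of y through its intercept, which is negative. That is consistent with the constraint applying to coef_ only.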

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Generate synthetic dataset
X, y = make_regression(n_samples=100, n_features=5, n_informative=3, noise=10, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different 'positive' values
lr_unconstrained = LinearRegression(positive=False)
lr_unconstrained.fit(X_train, y_train)

lr_constrained = LinearRegression(positive=True)
lr_constrained.fit(X_train, y_train)

# Evaluate models
y_pred_unconstrained = lr_unconstrained.predict(X_test)
y_pred_constrained = lr_constrained.predict(X_test)

mse_unconstrained = mean_squared_error(y_test, y_pred_unconstrained)
mse_constrained = mean_squared_error(y_test, y_pred_constrained)

print(f"Unconstrained MSE: {mse_unconstrained:.2f}")
print(f"Constrained MSE: {mse_constrained:.2f}")

print("\nUnconstrained Coefficients:")
print(lr_unconstrained.coef_)

print("\nConstrained Coefficients:")
print(lr_constrained.coef_)

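Running the example gives output like the following (exact values may differ slightly across library versions). Note that the two small negative coefficients of the unconstrained model become exactly zero in the constrained model. The lower test MSE of the constrained model is specific to this dataset and is not guaranteed in general.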

Unconstrained MSE: 127.72
Constrained MSE: 119.50

Unconstrained Coefficients:
[57.20237595 35.281705   -0.73589378 63.00489133 -1.19808211]

Constrained Coefficients:
[57.21270229 35.18974713  0.         63.15067422  0.        ]
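
The printed coefficients suggest that the constraint clamps negative coefficients to zero. A minimal sketch to check this programmatically, reusing the same dataset and split as the example, is shown here. The variable names (free, nonneg, zeroed) are chosen for illustration. It also compares training R² scores: because the constrained model searches a subset of the unconstrained solutions, it should not fit the training data better than the unconstrained one.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Same synthetic dataset and split as in the main example
X, y = make_regression(n_samples=100, n_features=5, n_informative=3, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

free = LinearRegression(positive=False).fit(X_train, y_train)
nonneg = LinearRegression(positive=True).fit(X_train, y_train)

# Every constrained coefficient should be >= 0
all_nonneg = bool(np.all(nonneg.coef_ >= 0))

# Indices of features whose coefficient was set to zero by the constraint
zeroed = np.flatnonzero(nonneg.coef_ == 0)

# Training fit: the constrained model cannot beat the unconstrained one here
train_r2_free = free.score(X_train, y_train)
train_r2_nonneg = nonneg.score(X_train, y_train)

print("All constrained coefficients non-negative:", all_nonneg)
print("Features zeroed by the constraint:", zeroed)
print(f"Train R^2 unconstrained: {train_r2_free:.4f}")
print(f"Train R^2 constrained:   {train_r2_nonneg:.4f}")
```

Any difference on the test set can go either way, so test MSE alone is not a reason to prefer one setting over the other.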

The key steps in this example are:

  1. Generate a synthetic regression dataset with five features, three of which are informative
  2. Split the data into train and test sets
  3. Train LinearRegression models with positive=False and positive=True
  4. Evaluate the mean squared error (MSE) of each model on the test set
  5. Print the model coefficients to show the effect of the positive parameter
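
The scikit-learn user guide describes positive=True as applying non-negative least squares (NNLS). As a rough cross-check, the sketch below compares the fitted coefficients with scipy.optimize.nnls. It assumes fit_intercept=False so that both solvers address the same problem without centering, and it fits on the full synthetic dataset rather than on a train split.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Same synthetic dataset as in the main example
X, y = make_regression(n_samples=100, n_features=5, n_informative=3, noise=10, random_state=42)

# Assumption: no intercept, so the model solves min ||Xw - y|| subject to w >= 0
model = LinearRegression(positive=True, fit_intercept=False).fit(X, y)

# Direct non-negative least squares solve for comparison
nnls_coef, residual_norm = nnls(X, y)

coefs_match = np.allclose(model.coef_, nnls_coef)

print("LinearRegression coefficients:", model.coef_)
print("scipy nnls coefficients:      ", nnls_coef)
print("Coefficients match:", coefs_match)
```

With an intercept, the comparison would need the features and target to be centered first, which this sketch leaves out.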

Some tips and heuristics for using positive:

Issues to consider:
