Configure LinearRegression "n_jobs" Parameter

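The n_jobs parameter in scikit-learn’s LinearRegression sets the number of jobs to use for the computation when fitting the model. According to the scikit-learn documentation, it only provides a speedup on sufficiently large problems that have more than one target (n_targets > 1) and either a sparse X or positive=True. For a single-target fit on dense data, changing n_jobs should have little or no effect on training time.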

LinearRegression is an ordinary least squares linear regression model. It fits a linear model to minimize the residual sum of squares between the observed targets and the predictions.
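To make the least squares objective concrete, here is a minimal sketch that compares LinearRegression against a direct least squares solve with NumPy. The small dataset and the variable names (X_aug, beta, rss) are illustrative assumptions, not part of the main example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Small synthetic problem so the comparison runs quickly
X, y = make_regression(n_samples=200, n_features=5, noise=0.5, random_state=0)

# Fit with scikit-learn (an intercept is fitted by default)
lr = LinearRegression().fit(X, y)

# Solve the same least squares problem directly: append a column of ones
# so the last coefficient plays the role of the intercept
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
beta, *_ = np.linalg.lstsq(X_aug, y, rcond=None)

# Residual sum of squares of the fitted model
rss = float(np.sum((y - lr.predict(X)) ** 2))

print("coefficients match:", np.allclose(lr.coef_, beta[:-1]))
print("intercept matches:", np.isclose(lr.intercept_, beta[-1], atol=1e-6))
print(f"RSS: {rss:.3f}")
```

Both approaches minimize the same residual sum of squares, so the coefficients should agree up to floating-point tolerance.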

When parallelism applies, scikit-learn uses joblib to split the fit into one sub-problem per target and distribute those sub-problems across up to n_jobs workers, which the operating system can schedule on separate cores.

The default value for n_jobs is None, which means 1 (no parallelism) unless the call is made inside a joblib.parallel_backend context. Setting n_jobs to -1 uses all available processors.
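A quick way to confirm the default on an installed version is to inspect the estimator's parameters. The sketch below assumes a recent scikit-learn release, where the constructor default for n_jobs is None, and uses joblib.cpu_count() as a rough indication of how many workers n_jobs=-1 could use. The variable names are illustrative.

```python
import joblib
from sklearn.linear_model import LinearRegression

# Inspect the constructor default (None on recent scikit-learn releases)
default_model = LinearRegression()
print("default n_jobs:", default_model.get_params()["n_jobs"])

# n_jobs=-1 asks joblib for all processors; cpu_count() reports how many
# joblib believes are available on this machine
all_cores_model = LinearRegression(n_jobs=-1)
print("n_jobs=-1 requested; joblib.cpu_count() =", joblib.cpu_count())
```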

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import time

# Generate synthetic dataset
X, y = make_regression(n_samples=10000, n_features=1000, noise=0.5, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train with different n_jobs values
n_jobs_values = [1, 2, 3, 4, -1]
fit_times = []

for n in n_jobs_values:
    # perf_counter is a monotonic, high-resolution clock suited to timing
    start = time.perf_counter()
    lr = LinearRegression(n_jobs=n)
    lr.fit(X_train, y_train)
    end = time.perf_counter()
    fit_time = end - start
    fit_times.append(fit_time)
    print(f"n_jobs={n}, Fit Time: {fit_time:.3f} seconds")

Running the example gives output like the following (exact timings depend on hardware and system load):

n_jobs=1, Fit Time: 0.938 seconds
n_jobs=2, Fit Time: 0.837 seconds
n_jobs=3, Fit Time: 0.830 seconds
n_jobs=4, Fit Time: 1.307 seconds
n_jobs=-1, Fit Time: 1.978 seconds
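The timings above do not fall steadily as n_jobs grows, and a single run per setting is sensitive to system load. One way to get steadier numbers is to repeat each fit and report the median. The helper median_fit_time below is a hypothetical name introduced for this sketch, and the dataset is smaller than in the main example so the repeats finish quickly.

```python
import statistics
import time

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression


def median_fit_time(n_jobs, X, y, repeats=5):
    """Hypothetical helper: median wall-clock fit time over several repeats."""
    times = []
    for _ in range(repeats):
        model = LinearRegression(n_jobs=n_jobs)
        start = time.perf_counter()
        model.fit(X, y)
        times.append(time.perf_counter() - start)
    return statistics.median(times)


# Smaller dataset than the main example so the repeats finish quickly
X_small, y_small = make_regression(
    n_samples=2000, n_features=100, noise=0.5, random_state=42
)

for n in [1, 2, -1]:
    t = median_fit_time(n, X_small, y_small)
    print(f"n_jobs={n}, median fit time: {t:.4f} seconds")
```

If the medians are close across settings, the differences seen in a single run are more likely measurement noise than an effect of n_jobs.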

The key steps in this example are:

  1. Generate a large synthetic regression dataset with noise
  2. Split the data into train and test sets
  3. Train LinearRegression models with different n_jobs values
  4. Compare the model fit times for each n_jobs setting
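The steps above use a single target and dense data. The scikit-learn documentation states that n_jobs only helps when there are several targets and X is sparse or positive=True. A minimal sketch of that configuration, assuming a small multi-target dataset and positive=True (both chosen here for illustration), might look like this:

```python
import time

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Multi-target problem: Y_multi has one column per target
X_multi, Y_multi = make_regression(
    n_samples=500, n_features=20, n_targets=4, noise=0.5, random_state=42
)

results = {}
for n in [1, 2]:
    # positive=True constrains coefficients to be non-negative; with several
    # targets, one sub-problem is solved per target
    model = LinearRegression(positive=True, n_jobs=n)
    start = time.perf_counter()
    model.fit(X_multi, Y_multi)
    elapsed = time.perf_counter() - start
    results[n] = model.coef_
    print(f"n_jobs={n}, Fit Time: {elapsed:.3f} seconds")

# The fitted coefficients should not depend on n_jobs
print("same coefficients:", np.allclose(results[1], results[2]))
```

On a problem this small, worker start-up overhead will likely dominate, so the sketch shows the configuration rather than a guaranteed speedup.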

Some tips and heuristics for setting n_jobs:

Issues to consider:


