
Configure LinearDiscriminantAnalysis "covariance_estimator" Parameter

The covariance_estimator parameter in scikit-learn’s LinearDiscriminantAnalysis allows you to specify a custom method for estimating class covariance matrices.

Linear Discriminant Analysis (LDA) is a classification algorithm that can also project data onto a lower-dimensional space that maximizes class separability. It assumes that every class is Gaussian and that all classes share the same covariance matrix.

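The covariance_estimator parameter determines how these class covariance matrices are estimated. By default (covariance_estimator=None), LDA uses the empirical covariance, optionally regularized through the separate shrinkage parameter. The empirical estimate becomes ill-conditioned as the number of features approaches the number of samples, so a regularized estimator can improve performance with high-dimensional data or small sample sizes.

Any object with a fit method and a covariance_ attribute can be passed, such as the estimators in sklearn.covariance. Common choices are ShrunkCovariance, which applies a fixed shrinkage coefficient, and LedoitWolf, which chooses the coefficient from the data. The parameter works only with the 'lsqr' and 'eigen' solvers, and shrinkage must be left as None when it is set.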

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.covariance import ShrunkCovariance, LedoitWolf
from sklearn.metrics import accuracy_score

# Generate synthetic dataset
X, y = make_classification(n_samples=1000, n_features=20, n_classes=3,
                           n_informative=10, random_state=42)

# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train LDA models with different covariance estimators
estimators = [
    ('Default', None),
    ('Shrinkage', ShrunkCovariance(shrinkage=0.5)),
    ('Ledoit-Wolf', LedoitWolf())
]

for name, estimator in estimators:
    # covariance_estimator requires the 'lsqr' or 'eigen' solver, not the default 'svd'
    lda = LinearDiscriminantAnalysis(solver='lsqr', covariance_estimator=estimator)
    lda.fit(X_train, y_train)
    y_pred = lda.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    print(f"{name} estimator accuracy: {accuracy:.3f}")

Running the example gives an output like:

Default estimator accuracy: 0.740
Shrinkage estimator accuracy: 0.735
Ledoit-Wolf estimator accuracy: 0.740

The key steps in this example are:

  1. Generate a synthetic multi-class dataset suitable for LDA
  2. Split the data into train and test sets
  3. Create LDA models with different covariance estimators
  4. Train the models and evaluate their accuracy on the test set
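
With 800 training samples and 20 features the empirical covariance is already well estimated, which may explain why the three scores above are nearly identical. The sketch below is one way to probe the small-sample case instead. The dataset shape (60 samples, 100 features, 2 classes), the use of OAS, and the names X_small, y_small, candidates and cv_scores are illustrative assumptions, not part of the original example. It uses cross-validation rather than a single split because one split is a noisy basis for comparison at this sample size.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.covariance import LedoitWolf, OAS
from sklearn.model_selection import cross_val_score

# Illustrative assumption: fewer samples (60) than features (100)
X_small, y_small = make_classification(n_samples=60, n_features=100,
                                       n_informative=10, n_classes=2,
                                       random_state=0)

# Estimators to compare; None falls back to the empirical covariance
candidates = {
    'Default': None,
    'Ledoit-Wolf': LedoitWolf(),
    'OAS': OAS(),
}

cv_scores = {}
for name, estimator in candidates.items():
    lda = LinearDiscriminantAnalysis(solver='lsqr', covariance_estimator=estimator)
    # 5-fold cross-validated accuracy instead of a single train/test split
    scores = cross_val_score(lda, X_small, y_small, cv=5)
    cv_scores[name] = scores.mean()
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")
```

The scores depend on the data and the random seed, so treat any gap between estimators as something to verify on your own dataset.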

Some tips for using covariance_estimator:
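
One tip is to tune a fixed shrinkage coefficient by cross-validation rather than hard-coding a value such as 0.5. The sketch below assumes the same synthetic dataset as the main example and an arbitrary grid of candidate values. It relies on scikit-learn's nested parameter naming, so the shrinkage of the inner ShrunkCovariance is addressed as covariance_estimator__shrinkage.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.covariance import ShrunkCovariance
from sklearn.model_selection import GridSearchCV

# Same synthetic dataset as the main example
X, y = make_classification(n_samples=1000, n_features=20, n_classes=3,
                           n_informative=10, random_state=42)

lda = LinearDiscriminantAnalysis(solver='lsqr',
                                 covariance_estimator=ShrunkCovariance())

# Nested parameter name: <LDA parameter>__<inner estimator parameter>
# The candidate values are an arbitrary illustrative grid
param_grid = {'covariance_estimator__shrinkage': [0.0, 0.1, 0.3, 0.5, 0.9]}

search = GridSearchCV(lda, param_grid, cv=5, scoring='accuracy')
search.fit(X, y)

print("Best shrinkage:", search.best_params_['covariance_estimator__shrinkage'])
print(f"Best mean CV accuracy: {search.best_score_:.3f}")
```

If a data-driven coefficient is acceptable, LedoitWolf or OAS avoid the search entirely.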

Issues to consider:
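
One issue is that covariance_estimator cannot be combined with every other setting. The sketch below assumes a recent scikit-learn release (the parameter was added in 0.24) and checks which combinations fit successfully. The configs and outcomes dictionaries are hypothetical helpers for this illustration. The exact exception type and message may differ between versions, so the code catches both ValueError and NotImplementedError.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.covariance import LedoitWolf

X, y = make_classification(n_samples=200, n_features=20, n_classes=3,
                           n_informative=10, random_state=42)

# Parameter combinations to try; labels are for display only
configs = {
    'svd + estimator': dict(solver='svd', covariance_estimator=LedoitWolf()),
    'lsqr + shrinkage + estimator': dict(solver='lsqr', shrinkage='auto',
                                         covariance_estimator=LedoitWolf()),
    'lsqr + estimator': dict(solver='lsqr', covariance_estimator=LedoitWolf()),
    'eigen + estimator': dict(solver='eigen', covariance_estimator=LedoitWolf()),
}

outcomes = {}
for label, params in configs.items():
    try:
        LinearDiscriminantAnalysis(**params).fit(X, y)
        outcomes[label] = 'ok'
    except (ValueError, NotImplementedError) as exc:
        # Unsupported combinations are rejected when fit is called
        outcomes[label] = f'rejected ({type(exc).__name__})'
    print(f"{label}: {outcomes[label]}")
```

Under those assumptions, only the 'lsqr' and 'eigen' configurations without shrinkage should fit. The errors surface at fit time, not when the model is constructed.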


