
Scikit-Learn EllipticEnvelope Model

Elliptic Envelope is an outlier detection algorithm that assumes the inliers follow a roughly Gaussian distribution. It fits a robust covariance estimate to the dataset, which defines an ellipse around the central mass of the data, and flags points whose Mahalanobis distance from that robust center is unusually large.
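To make the idea of a robust covariance estimate concrete, here is a minimal sketch that fits the model and inspects the estimated center, covariance, and squared Mahalanobis distances. The data, the variable names, and the contamination and random_state values are illustrative assumptions, not part of the original example.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.RandomState(0)

# illustrative data (an assumption): 200 Gaussian inliers plus 10 distant points
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = rng.normal(loc=8.0, scale=0.5, size=(10, 2))
X = np.vstack([inliers, outliers])

model = EllipticEnvelope(contamination=0.05, random_state=0).fit(X)

# robust estimates of the center and covariance of the data
print("location:", model.location_)
print("covariance:\n", model.covariance_)

# squared Mahalanobis distance of every point from the robust center
d2 = model.mahalanobis(X)
print("mean squared distance, inliers:", d2[:200].mean())
print("mean squared distance, distant points:", d2[200:].mean())
```

The distant points should show a much larger squared distance than the inliers, which is the quantity the model thresholds when it labels outliers.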

The key hyperparameters of EllipticEnvelope include contamination (the proportion of outliers in the data) and support_fraction (the proportion of points to be included in the support of the raw MCD estimate).
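As a rough illustration of how contamination behaves, the sketch below fits the model with several values and counts the points labeled as outliers. The specific values, including support_fraction=0.9 and random_state=42, are arbitrary choices for demonstration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.covariance import EllipticEnvelope

# a single Gaussian-like cluster with no injected outliers
X, _ = make_blobs(n_samples=300, centers=1, cluster_std=1.0, random_state=42)

flagged = {}
for contamination in (0.01, 0.05, 0.1):
    model = EllipticEnvelope(
        contamination=contamination,
        support_fraction=0.9,  # arbitrary illustrative value
        random_state=42,
    )
    labels = model.fit_predict(X)  # +1 for inliers, -1 for outliers
    flagged[contamination] = int(np.sum(labels == -1))
    print(f"contamination={contamination}: {flagged[contamination]} points flagged")
```

Because contamination sets the threshold on the training scores, the model flags roughly that fraction of the training data even when the data contains no real outliers, so the value should reflect a genuine expectation about the dataset.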

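The algorithm suits unsupervised outlier and anomaly detection when the inliers form a single, roughly Gaussian cluster. It is less reliable on multimodal or strongly non-Gaussian data and in high-dimensional settings; the scikit-learn documentation recommends working with n_samples > n_features ** 2.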

from sklearn.datasets import make_blobs
from sklearn.covariance import EllipticEnvelope
import matplotlib.pyplot as plt
import numpy as np

# generate synthetic dataset with outliers
X, _ = make_blobs(n_samples=300, centers=1, cluster_std=1.0, random_state=42)
X_outliers = X[::10] + 5  # introduce some outliers
X = np.concatenate([X, X_outliers], axis=0)

# create EllipticEnvelope model
model = EllipticEnvelope(contamination=0.1, random_state=42)  # fixed seed for reproducible fits

# fit model
model.fit(X)

# predict labels: +1 for inliers, -1 for outliers
yhat = model.predict(X)

# plot the data and the outliers
plt.scatter(X[:, 0], X[:, 1], c=yhat, cmap='coolwarm', edgecolor='k', s=20)
plt.title('Elliptic Envelope Outlier Detection')
plt.show()

Running the example gives an output like:

[Figure: scatter plot titled 'Elliptic Envelope Outlier Detection', with points colored by predicted label]

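The steps are as follows:

1. Generate a synthetic dataset with make_blobs and add outliers by shifting every tenth point by 5 in each dimension.
2. Create an EllipticEnvelope model with contamination set to 0.1.
3. Fit the model on the data.
4. Predict a label for each point, where -1 marks an outlier and +1 an inlier.
5. Plot the data, colored by predicted label.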

This example demonstrates how to quickly set up and use an EllipticEnvelope model for outlier detection tasks, showcasing the algorithm’s ability to identify anomalies in a dataset effectively.
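As a follow-up, the sketch below rebuilds the same synthetic data and checks the predictions numerically instead of visually. The random_state on the model and the variable names are assumptions added for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.covariance import EllipticEnvelope

# same synthetic data as the example above: 300 points plus 30 shifted copies
X, _ = make_blobs(n_samples=300, centers=1, cluster_std=1.0, random_state=42)
X_outliers = X[::10] + 5
X = np.concatenate([X, X_outliers], axis=0)

model = EllipticEnvelope(contamination=0.1, random_state=42).fit(X)
yhat = model.predict(X)  # +1 for inliers, -1 for outliers

n_outliers = int(np.sum(yhat == -1))
print(f"flagged {n_outliers} of {len(X)} points")

# the injected outliers are the last 30 rows
n_shifted_flagged = int(np.sum(yhat[300:] == -1))
print(f"{n_shifted_flagged} of the 30 shifted points were flagged")

# decision_function gives a signed score; negative values are outliers
scores = model.decision_function(X)
print("lowest score:", scores.min())
```

The decision_function scores are useful when a ranking of points is more helpful than a hard label, for example when reviewing the most anomalous points first.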


