Scikit-Learn SkewedChi2Sampler for Data Preparation

SkewedChi2Sampler is a feature map for the chi-squared kernel, used to approximate the chi-squared kernel, which is beneficial in non-linear feature mapping.

The key hyperparameters include skewedness, which adjusts the skew of the kernel, and n_components, which determines the number of features.

This transformation is suitable for datasets where chi-squared kernel approximation can enhance kernel-based learning algorithms.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.kernel_approximation import SkewedChi2Sampler
import matplotlib.pyplot as plt
import numpy as np

# generate synthetic dataset
X, y = make_classification(n_samples=100, n_features=5, random_state=1)
X = np.abs(X)

# apply SkewedChi2Sampler
scs = SkewedChi2Sampler(skewedness=0.5, n_components=2)
X_transformed = scs.fit_transform(X)

# plot the transformed data
plt.scatter(X_transformed[:, 0], X_transformed[:, 1], c=y, cmap='viridis')
plt.title('Transformed Data using SkewedChi2Sampler')
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.show()

Running the example gives an output like:

Scikit-Learn SkewedChi2Sampler plot

Generate a synthetic classification dataset using make_classification(). The dataset contains 100 samples with 5 features.
Instantiate SkewedChi2Sampler with skewedness=0.5 and n_components=2. This sets up the feature mapper to use a chi-squared kernel approximation with the specified skewness and number of output features.
Transform the dataset using fit_transform() method of SkewedChi2Sampler.
Plot the transformed data using matplotlib to visualize the effect of the transformation.

This example demonstrates how to use SkewedChi2Sampler to apply chi-squared kernel approximation to a dataset, showing how the transformation affects the data in a 2D plot.

See Also