SKLearner Home | About | Contact | Examples

Scikit-Learn TruncatedSVD Model

TruncatedSVD is a dimensionality reduction algorithm that performs singular value decomposition on the input data matrix. It is commonly used to reduce the number of features in datasets, particularly those with a large number of features like text data and sparse matrices.

The key hyperparameters of TruncatedSVD include n_components, which specifies the number of singular values and vectors to compute, and algorithm, which determines the algorithm to use for computing the SVD. Common values for algorithm are 'randomized' and 'arpack'.

The algorithm is appropriate for dimensionality reduction tasks.

from sklearn.decomposition import TruncatedSVD
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# generate a high-dimensional dataset
X, y = make_classification(n_samples=100, n_features=50, n_classes=2, random_state=1)

# create and fit the model
svd = TruncatedSVD(n_components=2, random_state=1)
X_reduced = svd.fit_transform(X)

# plot the reduced dataset
plt.scatter(X_reduced[:, 0], X_reduced[:, 1], c=y)
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.title('2D projection using TruncatedSVD')
plt.show()

Running the example gives an output like:

Scikit-Learn TruncatedSVD

The steps are as follows:

  1. Generate Data: A synthetic high-dimensional dataset is generated using make_classification() with 100 samples and 50 features.

  2. Instantiate Model: A TruncatedSVD model is created with n_components=2 to reduce the data to 2 dimensions.

  3. Fit and Transform: The model is fit on the dataset and then used to transform the data, reducing its dimensionality.

  4. Visualize: The reduced data is plotted in a scatter plot, showing the projection of the data into 2D space.

This example demonstrates how to use TruncatedSVD for dimensionality reduction, transforming a high-dimensional dataset into a lower-dimensional space for visualization or further processing.



See Also