Scikit-Learn NMF Model

Non-Negative Matrix Factorization (NMF) is a dimensionality reduction technique that approximates a non-negative matrix X as the product of two smaller non-negative matrices, W and H. It is particularly useful for data whose values are naturally non-negative, such as word counts in text mining and pixel intensities in image processing.
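To make the factorization concrete, here is a minimal sketch. The small random matrix, the seed, and the settings init='nndsvda' and max_iter=1000 are illustrative assumptions, not values from this article's example. It factors X into W and H, then checks that both factors are non-negative and that their product approximates X.

```python
import numpy as np
from sklearn.decomposition import NMF

# Small non-negative matrix; the values are arbitrary and only for illustration.
rng = np.random.RandomState(0)
X = rng.rand(6, 4)

# Factor X (6 x 4) into W (6 x 2) and H (2 x 4).
model = NMF(n_components=2, init='nndsvda', max_iter=1000, random_state=0)
W = model.fit_transform(X)   # per-sample weights
H = model.components_        # per-component feature loadings

# The product W @ H is a rank-2, non-negative approximation of X.
X_approx = W @ H
frobenius_error = np.linalg.norm(X - X_approx)

print(W.shape, H.shape)
print((W >= 0).all(), (H >= 0).all())
print(frobenius_error, model.reconstruction_err_)
```

With the default Frobenius loss, the printed error should match the model's reconstruction_err_ attribute up to floating-point precision.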

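The key hyperparameters of NMF include n_components (the number of components to keep), init (the initialization method, for example 'random' or 'nndsvd'), and solver (the optimization algorithm, either 'cd' for coordinate descent or 'mu' for multiplicative update).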
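One way to get a feel for these hyperparameters is to fit the same data under a few settings and compare the reconstruction error. The sketch below is only an illustration: the random data, n_components=3, max_iter=500, and the particular init/solver combinations are assumptions chosen for the demo, and the resulting errors will differ on other data.

```python
import numpy as np
from sklearn.decomposition import NMF

# Arbitrary non-negative data, used only to compare settings.
rng = np.random.RandomState(1)
X = rng.rand(50, 8)

results = {}
for init in ('random', 'nndsvda'):
    for solver in ('cd', 'mu'):
        model = NMF(n_components=3, init=init, solver=solver,
                    max_iter=500, random_state=1)
        model.fit(X)
        # reconstruction_err_ measures how far W @ H is from X after fitting.
        results[(init, solver)] = model.reconstruction_err_

for (init, solver), err in results.items():
    print(f"init={init!r:10} solver={solver!r:5} error={err:.4f}")
```

Lower error means a closer approximation, but it says nothing by itself about how interpretable the components are.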

NMF is an unsupervised method, so it needs no target labels. It is suitable for dimensionality reduction tasks on non-negative data.

from sklearn.datasets import make_classification
from sklearn.decomposition import NMF
import matplotlib.pyplot as plt
import numpy as np

# generate synthetic dataset (labels are discarded; NMF is unsupervised)
X, _ = make_classification(n_samples=100, n_features=10, random_state=1)
# NMF requires non-negative input, so take absolute values
X = np.abs(X)

# fit the model and project the data onto 2 components
model = NMF(n_components=2, init='random', random_state=1)
X_transformed = model.fit_transform(X)

# plot the transformed dataset
plt.scatter(X_transformed[:, 0], X_transformed[:, 1])
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.title('NMF Transformed Data')
plt.show()

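Running the example produces a scatter plot like the following:

[Figure: Scikit-Learn NMF, scatter plot of the transformed data (Component 1 vs. Component 2)]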

The steps are as follows:

First, a synthetic dataset is generated using the make_classification() function. This creates a dataset with a specified number of samples (n_samples) and features (n_features), and a fixed random seed (random_state) for reproducibility. The class labels are discarded because NMF is unsupervised, and np.abs() is applied because NMF only accepts non-negative input.
Next, an NMF model is instantiated with n_components set to 2, using random initialization (init='random'). The model is then fit on the dataset using the fit_transform() method, which performs the decomposition and returns the data projected onto the two components.
Finally, the transformed dataset is plotted using matplotlib to visualize the data in the reduced two-dimensional space.
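The example calls np.abs() on the data before fitting. The sketch below illustrates why some such step is needed: the raw make_classification() output contains negative values, and fitting NMF on it raises a ValueError. It also shows MinMaxScaler as one alternative way to obtain non-negative input; that alternative is an assumption added here for comparison and is not what this article's example uses.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import NMF
from sklearn.preprocessing import MinMaxScaler

# Same synthetic data as the example, but without np.abs().
X_raw, _ = make_classification(n_samples=100, n_features=10, random_state=1)
has_negatives = bool(X_raw.min() < 0)
print("contains negative values:", has_negatives)

# NMF rejects input that contains negative values.
model = NMF(n_components=2, init='random', random_state=1)
try:
    model.fit(X_raw)
    raised = False
except ValueError as err:
    raised = True
    print("NMF refused the data:", err)

# Alternative (assumption, not from the article): rescale each feature to [0, 1].
X_scaled = MinMaxScaler().fit_transform(X_raw)
X_scaled_transformed = NMF(n_components=2, init='random', max_iter=1000,
                           random_state=1).fit_transform(X_scaled)
print(X_scaled_transformed.shape)
```

Taking absolute values and rescaling are different transformations, so the two approaches will generally produce different components.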

This example demonstrates the basic steps to apply NMF for dimensionality reduction using scikit-learn. The transformed data can then be used for further analysis or visualization.
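As one possible form of further analysis, a fitted model can project samples it has not seen before. The following is a minimal sketch under stated assumptions: the 80/20 split, max_iter=1000, and the variable names are illustrative choices, not part of the original example. It fits NMF on a training portion, transforms the held-out portion, and rebuilds an approximation of the held-out data from the learned components.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import NMF
from sklearn.model_selection import train_test_split

# Same synthetic, non-negative data as in the example.
X, _ = make_classification(n_samples=100, n_features=10, random_state=1)
X = np.abs(X)

# Hold out 20% of the samples (illustrative split).
X_train, X_test = train_test_split(X, test_size=0.2, random_state=1)

# Learn the components on the training data only.
model = NMF(n_components=2, init='random', max_iter=1000, random_state=1)
W_train = model.fit_transform(X_train)

# Project unseen samples onto the learned components.
W_test = model.transform(X_test)

# Approximate the held-out data from its 2-component representation.
X_test_approx = W_test @ model.components_
test_error = np.linalg.norm(X_test - X_test_approx)

print(W_train.shape, W_test.shape)
print("held-out reconstruction error:", test_error)
```

Comparing the held-out error across several values of n_components is one simple way to judge how many components the data needs.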

See Also