Non-Negative Matrix Factorization (NMF) is a dimensionality reduction technique that decomposes a matrix into non-negative factors. It is particularly useful for data with non-negative values, such as text mining and image processing.
The key hyperparameters of NMF include n_components (number of components), init (initialization method), and solver (optimization algorithm).
NMF is suitable for dimensionality reduction tasks.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.decomposition import NMF
import matplotlib.pyplot as plt
import numpy as np
# generate synthetic dataset
X, _ = make_classification(n_samples=100, n_features=10, random_state=1)
X = np.abs(X)
# fit the model
model = NMF(n_components=2, init='random', random_state=1)
X_transformed = model.fit_transform(X)
# plot the transformed dataset
plt.scatter(X_transformed[:, 0], X_transformed[:, 1])
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.title('NMF Transformed Data')
plt.show()
Running the example gives an output like:

The steps are as follows:
First, a synthetic dataset is generated using the
make_classification()function. This creates a dataset with a specified number of samples (n_samples) and features (n_features), and a fixed random seed (random_state) for reproducibility.Next, an
NMFmodel is instantiated withn_componentsset to 2, using random initialization (init='random'). The model is then fit on the dataset using thefit_transform()method to perform the decomposition.The transformed dataset is plotted using
matplotlibto visualize the data in the reduced-dimensional space.
This example demonstrates the basic steps to apply NMF for dimensionality reduction using scikit-learn. The transformed data can then be used for further analysis or visualization.