MaxAbsScaler is a data preprocessing technique that scales each feature by its maximum absolute value. It is especially useful for sparse data, i.e., data with many zeros. This scaler preserves the sparsity of the data and maps each feature into the range [-1, 1], making it suitable for preparing data for machine learning algorithms.
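To see the sparsity-preserving behavior concretely, here is a minimal sketch using a small hand-made scipy.sparse matrix (not part of the main example below). Because MaxAbsScaler only divides by a per-column constant, zero entries stay zero and the number of stored nonzeros is unchanged:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.preprocessing import MaxAbsScaler

# a small sparse matrix: most entries are zero
X = csr_matrix([[ 4.0, 0.0, -2.0],
                [ 0.0, 5.0,  0.0],
                [-8.0, 0.0,  1.0]])

scaler = MaxAbsScaler()
X_scaled = scaler.fit_transform(X)

# each column is divided by its max absolute value (8, 5, and 2 here),
# so the result lies in [-1, 1] and zeros remain zeros
print(X_scaled.toarray())
print("stored nonzeros before:", X.nnz, "after:", X_scaled.nnz)
```

Note that the scaler accepts the sparse matrix directly without densifying it, which is the main advantage over scalers that center the data (centering would destroy sparsity).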
from sklearn.datasets import make_classification
from sklearn.preprocessing import MaxAbsScaler
from sklearn.model_selection import train_test_split
import numpy as np
# generate synthetic dataset
X, y = make_classification(n_samples=100, n_features=5, random_state=1)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
# create and fit MaxAbsScaler
scaler = MaxAbsScaler()
scaler.fit(X_train)
# transform the train and test sets
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
# show a sample of the original and scaled data
print("Original data sample:", X_train[0])
print("Scaled data sample:", X_train_scaled[0])
Running the example gives an output like:
Original data sample: [ 0.9825172 0.58591043 -0.17816707 0.57699061 0.33847597]
Scaled data sample: [ 0.57929303 0.1933151 -0.06399996 0.25563568 0.1066192 ]
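Under the hood, the transform simply divides each feature column by its maximum absolute value computed on the training set (the fitted scaler exposes these values as the max_abs_ attribute). Assuming the same dataset and split as above, this can be checked directly:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MaxAbsScaler

# same synthetic dataset and split as in the example above
X, y = make_classification(n_samples=100, n_features=5, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

scaler = MaxAbsScaler().fit(X_train)

# manual equivalent: divide each column by its training-set max absolute value
manual = X_train / np.abs(X_train).max(axis=0)
print(np.allclose(scaler.transform(X_train), manual))
```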
The steps are as follows:

1. Generate a synthetic dataset using make_classification() with specified features and a fixed random seed for reproducibility.
2. Split the dataset into training and test sets using train_test_split().
3. Instantiate MaxAbsScaler and fit it on the training data using the fit() method.
4. Transform both the training and test sets using the transform() method of the scaler.
5. Display a sample of the original and scaled data to illustrate the effect of the scaling.
This example demonstrates how to apply MaxAbsScaler to a dataset, preserving sparsity and scaling features within the range of [-1, 1].
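Because the scaling is a simple per-column division, it is also reversible: the fitted scaler's inverse_transform() maps scaled values back to the original space. A short round-trip check on a small hand-made array:

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

X = np.array([[ 1.0, -2.0],
              [ 3.0,  4.0],
              [-6.0,  0.5]])

scaler = MaxAbsScaler()
X_scaled = scaler.fit_transform(X)

# inverse_transform multiplies by the stored per-column maxima, undoing the scaling
X_back = scaler.inverse_transform(X_scaled)
print(np.allclose(X_back, X))
```

This is useful when model outputs or inspected values need to be reported in the original feature units.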