The Labeled Faces in the Wild (LFW) pairs dataset is used for evaluating face verification algorithms.
This dataset contains pairs of images with labels indicating whether the pairs match or not.
Key arguments include subset, which specifies the portion of the dataset to load, and color, which determines whether the images are loaded in color.
Face verification is a binary classification problem: given two images, predict whether they show the same person. Algorithms like Support Vector Machines (SVMs) and Convolutional Neural Networks (CNNs) are often applied.
from sklearn.datasets import fetch_lfw_pairs
# Fetch the dataset
dataset = fetch_lfw_pairs(subset='train', color=True)
# Display dataset shape and types
print(f"Dataset shape: {dataset.pairs.shape}")
print(f"Image pair shape: {dataset.pairs[0].shape}")
print(f"Labels shape: {dataset.target.shape}")
# Show summary statistics
print(f"Number of pairs: {len(dataset.pairs)}")
print(f"Number of positive pairs: {sum(dataset.target == 1)}")
print(f"Number of negative pairs: {sum(dataset.target == 0)}")
# Display first few values of the dataset
print(f"First pair images shapes: {dataset.pairs[0][0].shape}, {dataset.pairs[0][1].shape}")
print(f"First pair label: {dataset.target[0]}")
Running the example gives an output like:
Dataset shape: (2200, 2, 62, 47, 3)
Image pair shape: (2, 62, 47, 3)
Labels shape: (2200,)
Number of pairs: 2200
Number of positive pairs: 1100
Number of negative pairs: 1100
First pair images shapes: (62, 47, 3), (62, 47, 3)
First pair label: 1
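The trailing 3 in the shapes above is the RGB channel axis kept by color=True. As a minimal sketch of what the grayscale setting would look like, the scikit-learn documentation describes color=False as averaging the three channels into a single gray level, so the resulting layout can be reproduced on a synthetic batch without downloading anything. The helper name to_gray_pairs and the random stand-in data are illustrative assumptions, not part of the library.

```python
import numpy as np


def to_gray_pairs(pairs):
    """Average the last (RGB) axis of an (n_pairs, 2, h, w, 3) array.

    Hypothetical helper for illustration only; it mimics the channel
    averaging described for color=False.
    """
    pairs = np.asarray(pairs)
    if pairs.ndim != 5 or pairs.shape[-1] != 3:
        raise ValueError("expected shape (n_pairs, 2, height, width, 3)")
    return pairs.mean(axis=-1)


# Synthetic stand-in with the same per-pair layout as the color output above
rng = np.random.default_rng(0)
fake_pairs = rng.random((4, 2, 62, 47, 3))

gray_pairs = to_gray_pairs(fake_pairs)
print(f"Color pairs shape: {fake_pairs.shape}")
print(f"Gray pairs shape: {gray_pairs.shape}")
```

Dropping the channel axis shrinks each image by a factor of three, which can matter when the flattened pixels are fed to a classical model.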
Import the fetch_lfw_pairs function from sklearn.datasets:
- This function allows loading the LFW pairs dataset directly from the scikit-learn library.
Fetch the dataset using fetch_lfw_pairs():
- Use subset='train' to load the training portion of the dataset.
- Use color=True to load images in color.
Print the dataset shape and types:
- Access the shape of pairs using dataset.pairs.shape.
- Show the shape of a single image pair using dataset.pairs[0].shape.
- Show the shape of labels using dataset.target.shape.
Display summary statistics:
- Print the number of pairs using len(dataset.pairs).
- Show the number of positive and negative pairs using sum(dataset.target == 1) and sum(dataset.target == 0), respectively.
Display the first few values of the dataset:
- Print the shapes of the first pair images using dataset.pairs[0][0].shape and dataset.pairs[0][1].shape.
- Print the label of the first pair using dataset.target[0].
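The exploration steps above can be collected into one small reusable helper. The following is a sketch under stated assumptions: summarize_pairs is a hypothetical name, not a scikit-learn function, and it is demonstrated on a tiny all-zeros array so that it runs without downloading the dataset. It uses np.count_nonzero, which counts matches in a vectorized way rather than iterating with the built-in sum.

```python
import numpy as np


def summarize_pairs(pairs, target):
    """Return basic shape and class-balance facts for an image-pairs dataset.

    Hypothetical helper: expects pairs shaped (n_pairs, 2, ...) and a
    1-D target array of 0/1 labels with one entry per pair.
    """
    pairs = np.asarray(pairs)
    target = np.asarray(target)
    if len(pairs) != len(target):
        raise ValueError("pairs and target must have the same length")
    return {
        "n_pairs": len(pairs),
        "pair_shape": pairs.shape[1:],
        "image_shape": pairs.shape[2:],
        "n_positive": int(np.count_nonzero(target == 1)),
        "n_negative": int(np.count_nonzero(target == 0)),
    }


# Tiny synthetic stand-in with the same layout as the color pairs above
demo_pairs = np.zeros((6, 2, 62, 47, 3), dtype=np.float32)
demo_target = np.array([1, 1, 1, 0, 0, 0])

summary = summarize_pairs(demo_pairs, demo_target)
for key, value in summary.items():
    print(f"{key}: {value}")
```

With the real data, the call would be summarize_pairs(dataset.pairs, dataset.target), where dataset is the object returned by fetch_lfw_pairs() in the example.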
This example demonstrates how to quickly load and explore the LFW pairs dataset using scikit-learn’s fetch_lfw_pairs() function, allowing you to inspect the data’s shape, the balance of positive and negative pairs, and the first few entries. This sets the stage for further preprocessing and application of face verification algorithms.
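As one possible next step, the sketch below shows how image pairs might be turned into feature vectors for an SVM. It is an illustrative baseline, not a recommended pipeline: the absolute pixel difference between the two images is an assumed feature choice, pair_features is a hypothetical helper, and the data is synthetic (matching pairs are noisy copies, non-matching pairs are independent random images) so the code runs without downloading LFW. The accuracy it prints says nothing about performance on real faces.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def pair_features(pairs):
    """Flattened absolute difference between the two images of each pair.

    Hypothetical helper; works for grayscale (n, 2, h, w) or color
    (n, 2, h, w, 3) arrays.
    """
    pairs = np.asarray(pairs, dtype=np.float64)
    diff = np.abs(pairs[:, 0] - pairs[:, 1])
    return diff.reshape(len(pairs), -1)


# Synthetic stand-in data: small 8x8 grayscale "images"
rng = np.random.default_rng(42)
n_each, height, width = 100, 8, 8
first = rng.random((2 * n_each, height, width))
second = np.empty_like(first)
# Matching pairs: the second image is a slightly noisy copy of the first
second[:n_each] = first[:n_each] + 0.05 * rng.standard_normal((n_each, height, width))
# Non-matching pairs: the second image is unrelated to the first
second[n_each:] = rng.random((n_each, height, width))

pairs = np.stack([first, second], axis=1)        # shape (200, 2, 8, 8)
target = np.array([1] * n_each + [0] * n_each)   # 1 = match, 0 = no match

X = pair_features(pairs)
X_train, X_test, y_train, y_test = train_test_split(
    X, target, test_size=0.25, random_state=0, stratify=target
)

clf = SVC(kernel="linear").fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"Held-out accuracy on synthetic pairs: {accuracy:.2f}")
```

To try the same idea on the real data, dataset.pairs and dataset.target from the example could be passed in place of the synthetic arrays, ideally fitting on subset='train' and scoring on subset='test'.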