
Scikit-Learn fetch_olivetti_faces() Dataset

The Olivetti Faces dataset consists of 400 grayscale face images (40 subjects, 10 images each, 64×64 pixels) and is commonly used for facial recognition and image classification tasks.

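Key arguments of fetch_olivetti_faces() include return_X_y, which returns the data as a (data, target) tuple instead of a Bunch object, and shuffle, which randomizes the order of the samples. Shuffling is off by default, so images of the same subject are grouped together.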
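As a minimal sketch of these arguments, the helper below requests the data as an (X, y) tuple and shuffles it reproducibly. The name load_faces and the value 42 are hypothetical choices for illustration; random_state is a further parameter of fetch_olivetti_faces() that fixes the shuffle order. The first call downloads the data, so it needs network access.

```python
from sklearn.datasets import fetch_olivetti_faces


def load_faces(shuffle=True, random_state=42):
    """Return the Olivetti faces as an (X, y) tuple.

    Hypothetical convenience wrapper; the default values are illustrative.
    """
    # return_X_y=True yields (data, target) instead of a Bunch object
    X, y = fetch_olivetti_faces(
        return_X_y=True,
        shuffle=shuffle,
        random_state=random_state,
    )
    return X, y
```

Calling X, y = load_faces() should give X with shape (400, 4096) and y with shape (400,), with the rows in shuffled order.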

This is a multi-class classification problem (one class per subject) to which algorithms such as Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), and Convolutional Neural Networks (CNNs) are often applied.

from sklearn.datasets import fetch_olivetti_faces
import matplotlib.pyplot as plt

# Fetch the dataset
dataset = fetch_olivetti_faces()

# Display the data and target shapes
print(f"Dataset shape: {dataset.data.shape}")
print(f"Target shape: {dataset.target.shape}")

# Show the unique target labels (one per subject)
print(f"Unique targets: {set(dataset.target)}")

# Print the pixel arrays of the first five images
print(f"First few images:\n{dataset.images[:5]}")

# Plot example images
fig, axes = plt.subplots(1, 5, figsize=(10, 2.5))
for i, ax in enumerate(axes):
    ax.imshow(dataset.images[i], cmap='gray')
    ax.axis('off')
plt.show()

Running the example gives an output like:

scikit_learn_data
Dataset shape: (400, 4096)
Target shape: (400,)
Unique targets: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39}
First few images:
[[[0.30991736 0.3677686  0.41735536 ... 0.37190083 0.3305785  0.30578512]
  [0.3429752  0.40495867 0.43801653 ... 0.37190083 0.338843   0.3140496 ]
  [0.3429752  0.41735536 0.45041323 ... 0.38016528 0.338843   0.29752067]
  ...
  [0.21487603 0.20661157 0.2231405  ... 0.15289256 0.16528925 0.17355372]
  [0.20247933 0.2107438  0.2107438  ... 0.14876033 0.16115703 0.16528925]
  [0.20247933 0.20661157 0.20247933 ... 0.15289256 0.16115703 0.1570248 ]]

 [[0.45454547 0.47107437 0.5123967  ... 0.19008264 0.18595041 0.18595041]
  [0.446281   0.48347107 0.5206612  ... 0.21487603 0.2107438  0.2107438 ]
  [0.49586776 0.5165289  0.53305787 ... 0.20247933 0.20661157 0.20661157]
  ...
  [0.77272725 0.78099173 0.7933884  ... 0.1446281  0.1446281  0.1446281 ]
  [0.77272725 0.7768595  0.7892562  ... 0.13636364 0.13636364 0.13636364]
  [0.7644628  0.7892562  0.78099173 ... 0.15289256 0.15289256 0.15289256]]

 [[0.3181818  0.40082645 0.49173555 ... 0.40082645 0.3553719  0.30991736]
  [0.30991736 0.3966942  0.47933885 ... 0.40495867 0.37603307 0.30165288]
  [0.26859504 0.34710744 0.45454547 ... 0.3966942  0.37190083 0.30991736]
  ...
  [0.1322314  0.09917355 0.08264463 ... 0.13636364 0.14876033 0.15289256]
  [0.11570248 0.09504132 0.0785124  ... 0.1446281  0.1446281  0.1570248 ]
  [0.11157025 0.09090909 0.0785124  ... 0.14049587 0.14876033 0.15289256]]

 [[0.1983471  0.19421488 0.19421488 ... 0.58264464 0.5123967  0.45867768]
  [0.21900827 0.21900827 0.21487603 ... 0.5661157  0.5123967  0.45041323]
  [0.23966943 0.23966943 0.23966943 ... 0.59090906 0.5        0.46280992]
  ...
  [0.13636364 0.14049587 0.16115703 ... 0.76033056 0.7644628  0.7355372 ]
  [0.14876033 0.14876033 0.14876033 ... 0.76033056 0.75619835 0.74380165]
  [0.14876033 0.14876033 0.14876033 ... 0.75206614 0.75206614 0.73966944]]

 [[0.5        0.54545456 0.58264464 ... 0.2231405  0.2231405  0.2231405 ]
  [0.47933885 0.5123967  0.58264464 ... 0.20247933 0.20247933 0.20247933]
  [0.49173555 0.5413223  0.59504133 ... 0.21487603 0.21487603 0.21487603]
  ...
  [0.4752066  0.41735536 0.40082645 ... 0.19421488 0.19421488 0.19421488]
  [0.4752066  0.44214877 0.41735536 ... 0.16528925 0.16528925 0.16528925]
  [0.4876033  0.446281   0.4338843  ... 0.17768595 0.17355372 0.17355372]]]

[Figure: Scikit-Learn fetch_olivetti_faces() plot, showing the first five face images in grayscale]

The steps are as follows:

  1. Import the fetch_olivetti_faces function from sklearn.datasets and matplotlib.pyplot for plotting:

    • This function downloads the Olivetti Faces dataset on first use and caches it locally (by default under ~/scikit_learn_data), so later calls load it from disk.
    • Use matplotlib.pyplot to visualize the images.
  2. Fetch the dataset using fetch_olivetti_faces():

    • Load the dataset with default parameters.
  3. Print the dataset shape and target shape:

    • Access the shape of the data using dataset.data.shape.
    • Show the shape of the target labels using dataset.target.shape.
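  4. Display the target labels:

    • Show the unique target labels using set(dataset.target); there are 40, one per subject.
  5. Display the first few images of the dataset:

    • Print the pixel arrays of the first five images using dataset.images[:5]; each is a 64×64 array of values between 0 and 1.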
  6. Plot example images:

    • Use matplotlib to plot a few example images from the dataset to visualize the data.
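The shape and label checks in steps 3 and 4 can be extended slightly, as in the sketch below. It flattens each image into one row (the same layout as dataset.data) and counts the images per subject with np.unique. The helper name summarize_faces is hypothetical, and the example runs on a small synthetic stand-in so that it needs no download.

```python
import numpy as np


def summarize_faces(images, target):
    """Return (flattened data, per-subject image counts) for a face dataset."""
    # Flatten each 2-D image into one row; for Olivetti this mirrors dataset.data
    flat = images.reshape(len(images), -1)
    # Count how many images belong to each subject label
    labels, counts = np.unique(target, return_counts=True)
    return flat, dict(zip(labels.tolist(), counts.tolist()))


# Synthetic stand-in with the same layout: 6 images of 64x64 pixels, 3 subjects
rng = np.random.default_rng(0)
images = rng.random((6, 64, 64), dtype=np.float32)
target = np.array([0, 0, 1, 1, 2, 2])

flat, per_subject = summarize_faces(images, target)
print(flat.shape)    # (6, 4096)
print(per_subject)   # {0: 2, 1: 2, 2: 2}
```

With the real data, summarize_faces(dataset.images, dataset.target) should return an array of shape (400, 4096) and a count of 10 images for each of the 40 subjects.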

This example demonstrates how to quickly load and explore the Olivetti Faces dataset using scikit-learn’s fetch_olivetti_faces() function: you can inspect the data’s shape, examine the target labels, and visualize some example images. This sets the stage for further preprocessing and the application of classification algorithms.
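As one possible next step, the sketch below compares a k-NN and a linear SVM classifier on a stratified train/test split. The helper name compare_classifiers, the 25% test size, and the model settings are illustrative assumptions rather than recommended values, and the demonstration uses synthetic clustered data in place of the faces so that it runs without a download.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def compare_classifiers(X, y, test_size=0.25, random_state=0):
    """Fit a k-NN and a linear SVM on a stratified split; return test accuracies."""
    # stratify=y keeps every subject represented in both train and test sets
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )
    models = {
        "k-NN": KNeighborsClassifier(n_neighbors=1),
        "SVM": SVC(kernel="linear"),
    }
    return {
        name: model.fit(X_train, y_train).score(X_test, y_test)
        for name, model in models.items()
    }


# Synthetic stand-in: 4 "subjects", 10 samples each, 64 features per sample
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(4, 64))
y = np.repeat(np.arange(4), 10)
X = centers[y] + rng.normal(scale=0.5, size=(40, 64))

scores = compare_classifiers(X, y)
print(scores)
```

To try it on the faces, pass dataset.data and dataset.target in place of the synthetic X and y. With only 10 images per subject, the stratified split matters: it prevents a subject from ending up entirely in the test set.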


