PCA（Principal Component Analysis）作为一种数学矩阵分析方法，能够约简多元数据集，在许多领域有重要应用，本篇介绍 PCA 应用于人脸识别。

问题

给定一组具有 $m$ 个样本的 $n$ 维数据 ${a_1, a_2, \cdots, a_m }$，如何找到特殊的较少方向使得数据在该方向上的投影方差最大。方差大说明在新空间中数据更容易分离，更方便判断类别等。

理论

将每个样本写成列向量，得到矩阵：

$$ A = \bigl( \begin{smallmatrix} a_1 & a_2 & \cdots & a_m \end{smallmatrix} \bigr) $$

为了方便，我们先对矩阵 $A$ 沿着行方向求取平均值，然后将 $A$ 的每一行减去对应的均值。这样做能够约简 $A$，方便计算。

寻找主方向 $u$，使得 $A$ 在该方向的投影方差尽量大，方差最大的方向就是主方向。即

$$ Var(A \cdot u) = (Au)^T(Au) = u^T A^T Au. $$

为目标函数。一般地，我们选择单位方向 $u$，所以有 $u^Tu=1$.

利用 Lagrange 乘子法求解如下方程：

$$ L(u) = u^TA^TAu + \lambda(u^Tu-1). $$

对单位方向 $u$ 求方向导数，并令其为0，得到

$$ A^TAu = -\lambda u. $$

从上式可以看出，PCA 选择的方向其实就是方阵 $A^TA$ 的特征向量方向，按照特征值从大到小排序得到对应的特征向量方向，就是 PCA 选择的按照重要度排序的方向。

PCA 人脸识别

利用 PCA 进行人脸识别。将人脸图像转化为灰度图像，对所有人脸图像进行 PCA 投影到新空间（一般称为脸空间）。对每一个人的人脸图像计算投影在脸空间中的平均坐标。当新人脸识别时，先投影到脸空间，然后计算投影坐标距离已知的人脸平均坐标的最小值，该最小值对应的人脸即为识别到的人脸。

步骤：
0. 准备人脸数据集，假设每个人有 $k+1$ 副人脸图像，共 $m$ 个人；

把每一幅图像转化为灰度图（图像宽、高分别是 $w,h$），然后拉长为一维列向量，分别取每个人的人脸图像 $k$ 副，剩余 1 副留作验证。**依次(同一个人的脸图像挨着)**作为列向量组成一个矩阵 $A_{r\times c} = \bigl( \begin{smallmatrix} a_1 & a_2 & \cdots & a_{km} \end{smallmatrix} \bigr), r = w*h, c=k*m$ ，每一个 $a_i, i=1,2,\cdots, km$ 为一个人脸列向量；
对矩阵 $A_{r\times c}$ 的每一行求取平均值，得到的均值向量即为 $a_{mean}$，然后每一行减去对应的均值。即对 $A_{r\times c}$ 取中心化，这里仍记作 $A_{r \times c}$；
计算协方差矩阵，这里约简为均值相乘： $\text{Corr}_{c\times c} = A^T_{r \times c} \cdot A_{r \times c}$；
计算矩阵 $\text{Corr}_{c\times c}$ 的特征值向量 $eigvalue_{c\times 1}$ 和特征向量矩阵 $EigVec_{c\times c}$，假设每一列为一个特征向量；
对特征值按照从大到小排序，并筛选前 90%（为阈值。前 $n$ 个特征值和正好占总特征向量和的比超过 90%，根据情况选择该阈值）的特征值对应的特征向量矩阵 $EigVec^{thres}_{c \times n} = \bigl(\begin{smallmatrix} EigVec^1_{c\times 1} & EigVec^2_{c \times 1} & \cdots & EigVec^n_{c \times 1} \end{smallmatrix} \bigr)$ ；
计算特征脸矩阵 $B_{r \times n} = A_{r\times c} \cdot EigVec^{thres}_{c\times n} = \bigl( \begin{smallmatrix} A_{r\times c}\cdot EigVec^1_{c\times 1} & A_{r\times c} \cdot EigVec^2_{c\times 1} & \cdots & A_{r\times c} \cdot EigVec^n_{c\times 1} \end{smallmatrix} \bigr)$ ，
每一列 $A_{r \times c} \cdots EigVec^i_{c \times 1}, i = 1, 2, \cdots, n$ ，对应投影后空间的基向量，这里称为投影空间为脸空间，共有 $n$ 个基向量。
将矩阵 $A_{r \times c}$ 投影到脸空间 $B_{r \times n}$ 得到脸特征矩阵，即 $$P_{c \times n} = A^T_{r\times c} \cdot B_{r\times n} = \left[ \begin{array}{cc|c} p_1 \\ p_2 \\ \vdots \\ p_n \end{array} \right] $$ ，其中每一行对应一个脸向量在脸空间中的坐标。因为前面我们是依次（同一个人的脸图像挨着）构建矩阵 $A_{r\times c}$ 的，所以，我们可以对脸特征矩阵 $P_{c \times n}, c=k*m$ 的每 $k$ 行计算平均，得到 $m$ 个人在的脸在脸空间中的位置坐标（领域），记作 $ P_{mean} = \bigl(\begin{smallmatrix} p^1_{mean} & p^2_{mean} & \cdots & p^m_{mean} \end{smallmatrix} \bigr) $ ，每个 $p^i_{mean}$ 都是 $1\times n$ 维的；
对于新来的人脸图像进行识别。首先将人脸图像灰度化，然后拉长为一维列向量 $b_{r\times 1}$，(对应位置)减去均值向量 $a_{mean}$，得到向量 $c_{r\times 1} = b_{r\times 1} - a_{mean}$，然后将向量 $c_{r\times 1}$ 投影到脸空间 $B_{r\times n}$，即 $p_{1\times n} = c^T_{r\times 1} \cdot B_{r\times n}$。最后计算 $p_{1\times n}$ 距离 $p^i_{mean}$ 哪个最近，就说明是那个人的脸。

实现

环境配置

如下命令在终端中执行(以 Ubuntu 18.04 为例)：

# install miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
/bin/bash Miniconda3-latest-Linux-x86_64.sh -b -p ~/miniconda
conda init
source ~/.bashrc

# install conda env maths-ai
conda create -n maths-ai python=3.11 -y
conda activate maths-ai

# install common packages
pip install ipykernel ipywidgets
python -m ipykernel install --user --name maths-ai --display-name maths-ai
pip install opencv-python pillow numpy matplotlib scipy sympy
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118

安装完成后，打开 jupyter 继续执行如下命令

下载数据

访问链接 github，
直接下载:
git clone https://github.com/hsahuja111/Face-Recognition-using-Principal-Component-Analysis.git

下载后图像存放在 ATnT 中。

参考链接：

数据检查

1
2
3

import os

from PIL import Image

1	data_path = "/disk1/datasets/maths-ai/face-recognition/ATnT/"

batch = 9  # 只使用前9张进行PCA投影到脸空间，用最后一张验证是否能识别成功
face_d = {}
for file in os.listdir(data_path):
    file_path = os.path.join(data_path, file)
    if os.path.isdir(file_path):
        face_d[file] = sorted(os.listdir(file_path), key=lambda x: int(x.split(".")[0]))

1	"共多少种人脸？", len(face_d)

('共多少种人脸？', 40)

加载包

import os

import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

构建矩阵

w = 92
h = 112


def img_to_vector(img_path):
    """
    把图像转化为灰度图，然后转化为一维向量
    """
    img = cv2.imread(img_path)
    # img = cv2.resize(img, (256, 256))
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img_gray.reshape(-1)


keys_face = sorted(list(face_d.keys()))
name = keys_face[0]
img_path = os.path.join(data_path, name, face_d[name][0])

img_vector = img_to_vector(img_path=img_path)

plt.imshow(img_vector.reshape(h, w), cmap="gray")
plt.title(name)
plt.show()

png

A = None

for k in keys_face:
    v = face_d[k]
    for i in range(batch):
        img_path = os.path.join(data_path, k, v[i])
        img_vector = img_to_vector(img_path).reshape(-1, 1)
        if A is None:
            A = img_vector
        else:
            A = np.hstack((A, img_vector))
A.shape

(10304, 360)

1	avg_face_vector = A.mean(axis=1)

avg_face = avg_face_vector.reshape(h, w)
plt.imshow(avg_face, cmap="gray")
plt.title("avg face")
plt.show()

png

1	A_homo = A - avg_face_vector.reshape(-1, 1)

fig, ax = plt.subplots(2, 5)
plt.rc("font", size=13)
# fig.tight_layout()
fig.set_figwidth(25)
fig.set_figheight(10)
for i in range(10):
    ind = i * batch + 1
    ai = A_homo[:, ind]
    name = keys_face[i]
    ai_image = ai.reshape(h, w)
    r = i // 5
    c = i % 5
    ax[r][c].imshow(ai_image, cmap="gray")
    ax[r][c].set_title(f"{name}")
    ax[r][c].xaxis.set_major_locator(plt.NullLocator())
    ax[r][c].yaxis.set_major_locator(plt.NullLocator())
plt.show()

png

计算协方差阵的特征向量，选择主特征

1
2
3

Cov_A = np.matmul(A_homo.transpose(), A_homo)
# Cov_A = np.cov(A_homo)
Cov_A.shape

(360, 360)

1	eig_values, eig_vectors = np.linalg.eig(Cov_A)

1
2
3

# 特征值从大到小排序
eig_values_sorted_index = np.argsort(-eig_values)
eig_values_sorted = np.sort(eig_values)[::-1]

1 2	plt.plot(range(len(eig_values_sorted)), list(eig_values_sorted)) plt.show()

png

1 2	# 每一列是一个特征向量，对列按照特征值从大到小排序 eig_vectors_sorted_by_values = eig_vectors[:, eig_values_sorted_index]

# 取前 threshold 个特征值，使得它们的和占总特征值和的90%
eig_acc = 0
total_eig = np.sum(eig_values)
for i, eig in enumerate(eig_values_sorted):
    eig_acc += eig
    if eig_acc / total_eig >= 0.9:
        threshold = i
        break
threshold

1
2
3

# 取前 threshold 个特征值作为主特征向量
eig_vectors_selected = eig_vectors_sorted_by_values[:, :threshold]
eig_vectors_selected.shape

(360, 103)

计算特征脸

1	A_homo.shape

(10304, 360)

1
2
3

# 计算特征脸矩阵，每一列是投影空间中特征脸基向量
B = np.matmul(A_homo, eig_vectors_selected)
B.shape

(10304, 103)

# 画出前10个特征脸基向量图像
fig, ax = plt.subplots(2, 5)
plt.rc("font", size=13)
# fig.tight_layout()
fig.set_figwidth(25)
fig.set_figheight(10)
for i in range(10):
    eig_face = B[:, i]
    eig_value = eig_values_sorted[i]
    eig_face_image = eig_face.reshape(h, w)
    r = i // 5
    c = i % 5
    ax[r][c].imshow(eig_face_image, cmap="gray")
    ax[r][c].set_title(f"{i}-th eig-face")
    ax[r][c].xaxis.set_major_locator(plt.NullLocator())
    ax[r][c].yaxis.set_major_locator(plt.NullLocator())
plt.show()

png

计算投影空间

# 把所有脸特征投影到脸空间中，计算每个脸在投影空间中的坐标
# 每一行对应一个人脸在投影空间（脸空间）中的坐标位置
P = np.matmul(A.transpose(), B)
P.shape

(360, 103)

识别人脸

# 取最后一副没有参与PCA分析的人脸图，用于验证是否能够正确识别人脸
name = keys_face[2]  # 随便取第3个人
img_name = face_d[name][-1]  # 取最后一副图像验证
img_name

'10.pgm'

# 把待验证人脸图像转化为灰度图，进而拉平为向量，最后减去平均脸向量
img_path = os.path.join(data_path, name, img_name)
img_vector = img_to_vector(img_path=img_path)
img_vector_norm = img_vector - avg_face_vector
img_vector_norm = img_vector_norm.reshape(-1, 1)
img_vector_norm.shape

(10304, 1)

1
2
3

# 查看当前待识别人脸图像
plt.imshow(img_vector_norm.reshape(h, w), cmap="gray")
plt.show()

png

B.shape

(10304, 103)

1	img_vector_norm.shape

(10304, 1)

1
2
3

# 把待识别人脸向量投影到脸空间，计算得到其在脸空间中的坐标
project_vector = np.matmul(img_vector_norm.reshape(1, -1), B)
project_vector.shape

(1, 103)

P.shape

(360, 103)

# 对每个投影后的脸特征坐标按照人脸类别（不同人）计算平均值（这里因为创建数据A时是按照不同人依次叠加的，所以同一人的特征是紧挨着的），
# 作为当前人的在投影空间（脸空间）中的唯一特征（坐标）
P_avg = []
for i in range(P.shape[1] // batch):
    P_slice = P[i * batch : (i + 1) * batch, :]
    P_avg.append(P_slice.mean(axis=0))
len(P_avg)

1	P_avg[0].shape

(103,)

1
2
3

# 计算待识别人脸特征与脸空间中哪个人脸最接近，最接近的人脸作为预测或识别的人脸name
ind = np.argmin(np.sum((np.array(P_avg) - project_vector.reshape(1, -1)) ** 2, axis=1))
ind

1
2
3

# 预测或识别的人名
pred_name = keys_face[ind]
pred_name

's11'

# 画出真实人脸和识别的人，判断是否是同一个人
img = cv2.imread(img_path)  # 取的待识别的图像
pred_img = cv2.imread(
    os.path.join(data_path, pred_name, face_d[pred_name][0])
)  # 识别出的人对应的第1副图像

fig, ax = plt.subplots(1, 2)
ax[0].imshow(img[..., ::-1])
ax[0].set_title(f"truth name:{name}")
ax[1].imshow(pred_img[..., ::-1])
ax[1].set_title(f"predict name:{pred_name}")
plt.show()