cv_vgg19_facial-expression-recognition_fer is a facial expression recognition model based on VGG19-BN. It recognizes 7 basic expressions: angry, disgust, fear, happy, sad, surprise, and neutral.

Directory structure:

cv_vgg19_facial-expression-recognition_fer-ascend/
├── inference.py            # Inference test script
├── log.txt                 # Test log
├── README.md               # This document
├── test_image.jpg          # Test image
├── inference_result.json   # Inference results
└── precision_result.json   # Precision test results

Enter the container and load the CANN environment:

docker exec -it test-modelagent bash
source /usr/local/Ascend/ascend-toolkit/set_env.sh

The model files are located under /data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer/:

iic/cv_vgg19_facial-expression-recognition_fer/pytorch_model.pt - model weights (including checkpoint metadata)

Install the dependencies:

pip install opencv-python numpy torch torch_npu

Run the inference script for facial expression recognition:

cd /data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer-ascend/
python3 inference.py

Run the precision test to verify that the NPU results are consistent with the CPU results:

cd /data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer-ascend/
python3 inference.py precision_test

Precision test results:

| Metric | Measured | Threshold | Status |
|---|---|---|---|
| Max relative error | 0.0204% | < 1.00% | PASS |
| Expression prediction match | True | True | PASS |
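
The two checks in the table can be sketched as follows. This is a minimal illustration, not the code from inference.py: the function name `compare_outputs`, the sample vectors, and the definition of relative error (largest absolute difference divided by the largest absolute reference value) are all assumptions, since the script's exact formula is not shown in this document.

```python
LABEL_NAMES = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def compare_outputs(cpu_out, npu_out, labels, rel_threshold=0.01):
    """Compare two output vectors (hypothetical helper, assumed error definition)."""
    if len(cpu_out) != len(npu_out):
        raise ValueError("outputs must have the same length")
    # Largest element-wise absolute difference.
    max_abs_err = max(abs(c - n) for c, n in zip(cpu_out, npu_out))
    # Assumed definition: scale by the largest absolute CPU (reference) value.
    ref_scale = max(abs(c) for c in cpu_out)
    max_rel_err = max_abs_err / ref_scale if ref_scale > 0 else 0.0
    # Index of the largest value on each device, mapped to a label.
    cpu_label = labels[max(range(len(cpu_out)), key=lambda i: cpu_out[i])]
    npu_label = labels[max(range(len(npu_out)), key=lambda i: npu_out[i])]
    return {
        "max_abs_err": max_abs_err,
        "max_rel_err": max_rel_err,
        "cpu_label": cpu_label,
        "npu_label": npu_label,
        "passed": max_rel_err < rel_threshold and cpu_label == npu_label,
    }


# Made-up values for illustration only, not measured outputs.
cpu = [0.10, 0.02, 0.05, 0.18, 0.21, 0.20, 0.24]
npu = [0.1001, 0.02, 0.05, 0.18, 0.2099, 0.20, 0.24]
result = compare_outputs(cpu, npu, LABEL_NAMES)
print(result)
```

The default `rel_threshold=0.01` mirrors the 1.00% threshold in the table. A test passes only when both the error bound and the label match hold.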

Performance test:

| Item | Value |
|---|---|
| CPU inference time | 1.6579s |
| NPU inference time | 3.9834s |
| Test image | fer.jpg (767×920, height × width) |
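
Inference result:

| Input image | Predicted expression | Confidence |
|---|---|---|
| fer.jpg | neutral | 23.50% |

Expression probability distribution:

| Class | Name | Probability |
|---|---|---|
| 0 | angry | 10.55% |
| 1 | disgust | 2.36% |
| 2 | fear | 5.34% |
| 3 | happy | 17.58% |
| 4 | sad | 20.74% |
| 5 | surprise | 19.96% |
| 6 | neutral | 23.50% |

Result: CPU and NPU both predict neutral with a probability of 23.50%. The two agree to the displayed precision (max relative error 0.0204%).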
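
To make the readout of the table above concrete, the sketch below ranks the reported probabilities and prints the top two classes with the margin between them. The helper name `top_k` is chosen for this example and is not part of inference.py.

```python
LABEL_NAMES = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
# Probabilities copied from the table above (percent values divided by 100).
PROBS = [0.1055, 0.0236, 0.0534, 0.1758, 0.2074, 0.1996, 0.2350]


def top_k(probs, labels, k=2):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


(best_label, best_p), (second_label, second_p) = top_k(PROBS, LABEL_NAMES)
margin = best_p - second_p
print(f"top-1: {best_label} ({best_p:.2%}), "
      f"top-2: {second_label} ({second_p:.2%}), margin: {margin:.2%}")
```

For this image, neutral leads the runner-up (sad) by less than three percentage points, so the prediction should be read as a weak preference rather than a confident classification.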

Log output from the precision test run:

============================================================
cv_vgg19_facial-expression-recognition_fer - Ascend NPU Inference
Output: /data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer-ascend
============================================================
Mode: PRECISION TEST
NPU available: True
Device: npu:0
============================================================
Loading Model and Test Image
============================================================
Model loaded from: /data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer/iic/cv_vgg19_facial-expression-recognition_fer/pytorch_model.pt
Using cached test image: /data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer-ascend/test_image.jpg
Test image shape: (767, 920, 3)
Input tensor shape: torch.Size([1, 3, 224, 224])
============================================================
Running CPU Inference
============================================================
CPU inference time: 1.6579s
CPU emotion: neutral (0.2350)
============================================================
Running NPU Inference
============================================================
NPU inference time: 3.9834s
NPU emotion: neutral (0.2350)
Speedup: 0.42x
============================================================
Precision Test Results
============================================================
Max absolute error: 3.141165e-04
Max relative error: 2.038815e-04 (0.0204%)
Emotion prediction match: True (neutral vs neutral)
Precision test: PASS
Test image saved: /data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer-ascend/test_image.jpg
============================================================
Test Complete!
============================================================

Model architecture:

| Component | Output channels | Description |
|---|---|---|
| features.0-6 | 64 | 2×(Conv3x3 + BN + ReLU) + MaxPool |
| features.7-13 | 128 | 2×(Conv3x3 + BN + ReLU) + MaxPool |
| features.14-26 | 256 | 4×(Conv3x3 + BN + ReLU) + MaxPool |
| features.27-39 | 512 | 4×(Conv3x3 + BN + ReLU) + MaxPool |
| features.40-52 | 512 | 4×(Conv3x3 + BN + ReLU) + MaxPool |
| classifier | 7 | Linear(512→7) → Softmax |
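
As a rough check on the model size implied by the table, the sketch below counts feature modules and parameters. It assumes the standard VGG19 layer configuration, 3×3 convolutions with bias, BatchNorm with a learnable scale and shift per channel, and the single Linear(512→7) classifier listed above. `VGG19_CFG` and `count_parameters` are names chosen for this example, and the real checkpoint may differ.

```python
# Assumed VGG19 configuration: numbers are conv output channels, "M" is MaxPool.
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def count_parameters(cfg, in_channels=3, num_classes=7):
    """Return (number of feature modules, total learnable parameters)."""
    modules = 0
    params = 0
    for item in cfg:
        if item == "M":
            modules += 1  # MaxPool has no learnable parameters
            continue
        out_channels = item
        # 3x3 convolution: weights plus one bias per output channel.
        params += in_channels * out_channels * 3 * 3 + out_channels
        # BatchNorm: one scale and one shift per channel.
        params += 2 * out_channels
        modules += 3  # Conv + BN + ReLU
        in_channels = out_channels
    # Final linear classifier: weights plus biases.
    params += in_channels * num_classes + num_classes
    return modules, params


feature_modules, total_params = count_parameters(VGG19_CFG)
print(f"feature modules: {feature_modules}, parameters: {total_params:,}")
```

Under these assumptions the feature extractor has 53 modules (indices 0 to 52) and the whole model has about 20 million parameters.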

Quick-start example:

import cv2
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_npu  # noqa: F401  (registers the "npu" device with PyTorch)

MODEL_WEIGHT = "/data/ysws/agentsp/5-19-1/cv_vgg19_facial-expression-recognition_fer/iic/cv_vgg19_facial-expression-recognition_fer/pytorch_model.pt"
LABEL_NAMES = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
# ImageNet channel means and standard deviations, in RGB order
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_image(img, target_size=224):
    """Convert an RGB HxWx3 image to a normalized 1x3xHxW float32 array."""
    img = cv2.resize(img, (target_size, target_size))
    img = img.astype(np.float32) / 255.0
    img = (img - MEAN) / STD
    img = img.transpose(2, 0, 1)  # HWC -> CHW
    img = np.expand_dims(img, axis=0)  # add the batch dimension
    return img


img = cv2.imread("test.jpg")  # OpenCV loads images in BGR order
if img is None:
    raise FileNotFoundError("test.jpg could not be read")
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_tensor = torch.from_numpy(preprocess_image(img_rgb)).float()

# Load the model (you need to implement VGG19BN + FC yourself)
model = ...  # see inference.py
model = model.to("npu:0")
model.eval()

with torch.no_grad():
    output = model(img_tensor.to("npu:0"))
    probs = F.softmax(output, dim=1)
    prob, cls = probs.max(dim=1)

print(f"Emotion: {LABEL_NAMES[cls.item()]}, Confidence: {prob.item():.4f}")

FAQ:

A: Check that the NPU driver is installed correctly, and make sure the CANN environment variables have been sourced.

A: A numerical difference of 0.01-0.5% between NPU and CPU results is normal, because the two devices use different computational precision.
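
A: NPU inference can be slower than CPU inference for this model. For VGG19 (~20M parameters), the NPU startup overhead (model compilation, data transfer) may exceed the actual compute time. This is normal. On larger models, the NPU is expected to show a significant speedup.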
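
One way to separate one-time startup cost from steady-state latency is to run a few untimed warm-up iterations before measuring. The helper below is a generic sketch, not taken from inference.py: `benchmark`, `warmup`, `iters`, and the `sync` callback are hypothetical names. On an NPU, `sync` would typically be a device synchronization call such as `torch.npu.synchronize` (provided by torch_npu), so that asynchronous kernels finish before the timer stops. Here a dummy workload stands in for the model.

```python
import time


def benchmark(fn, warmup=3, iters=10, sync=None):
    """Return the average seconds per call of fn, excluding warm-up runs."""
    for _ in range(warmup):
        fn()  # untimed: absorbs compilation and first-transfer overhead
    if sync is not None:
        sync()  # make sure warm-up work has finished before timing starts
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()  # wait for asynchronous device work before stopping the timer
    return (time.perf_counter() - start) / iters


calls = []


def dummy_inference():
    """Stand-in workload; replace with a real model call in practice."""
    calls.append(1)
    return sum(i * i for i in range(1000))


avg_seconds = benchmark(dummy_inference, warmup=3, iters=10)
print(f"average latency: {avg_seconds * 1e3:.3f} ms over 10 timed runs")
```

Comparing this average against a single cold run shows how much of the measured time is startup overhead rather than computation.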
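
A: The model outputs a probability distribution over 7 classes, and the confidence depends on the quality of the input image. The more pronounced the facial expression, the higher the probability of the correct class. In the demo image, neutral has the highest probability (23.50%), but the distribution is fairly flat, which indicates that the model does not strongly favor any single expression for the person in the image.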
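
How flat the distribution is can be quantified with normalized entropy, which is 0 when all probability sits on one class and 1 for a uniform distribution. The sketch below applies it to the probabilities reported above. The function name `normalized_entropy` is chosen for this example, and the metric is an illustration rather than something the inference script reports.

```python
import math

# Probabilities from the demo image (percent values divided by 100).
PROBS = [0.1055, 0.0236, 0.0534, 0.1758, 0.2074, 0.1996, 0.2350]


def normalized_entropy(probs):
    """Shannon entropy divided by log(number of classes), in the range [0, 1]."""
    n = len(probs)
    if n < 2:
        raise ValueError("need at least two classes")
    total = sum(probs)
    # Renormalize to absorb rounding in the reported percentages.
    normalized = [p / total for p in probs]
    # Zero-probability classes contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in normalized if p > 0)
    return entropy / math.log(n)


flatness = normalized_entropy(PROBS)
print(f"normalized entropy: {flatness:.3f}")
```

For the demo image this comes out at roughly 0.9, close to uniform, which is consistent with the low top-1 confidence.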

This project is released under the MIT License.