cv_resnet_image-quality-assessment-mos_youtubeUGC is a ResNet18-based no-reference image quality assessment model. It predicts a MOS (Mean Opinion Score) for UGC (User Generated Content) images. The output MOS lies in [0, 1]; higher values indicate better image quality.

Directory layout:

    cv_resnet_image-quality-assessment-mos_youtubeUGC-ascend/
    ├── inference.py            # inference test script
    ├── log.txt                 # test log
    ├── README.md               # this document
    ├── test_image.jpg          # test image
    ├── inference_result.json   # inference results
    └── precision_result.json   # precision test results

Enter the container and load the CANN environment:

    docker exec -it test-modelagent bash
    source /usr/local/Ascend/ascend-toolkit/set_env.sh

The model files are located under /data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC/:

    iic/cv_resnet_image-quality-assessment-mos_youtubeUGC/pytorch_model.pt - model weights (about 11.7M parameters)

Install the Python dependencies:

    pip install opencv-python numpy torch torch_npu

Run the inference script to assess image quality:

    cd /data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC-ascend/
    python3 inference.py

Run the precision test to verify that the NPU results are consistent with the CPU results:

    cd /data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC-ascend/
    python3 inference.py precision_test

Precision results:

| Metric | Measured | Threshold | Status |
|---|---|---|---|
| Relative error | 0.0863% | < 1.00% | PASS |
| MOS score error | 0.0002 | < 0.01 | PASS |
| Score match | True | True | PASS |

Timing:

| Item | Value |
|---|---|
| CPU inference time | 0.2193s |
| NPU inference time | 4.1346s |
| Test image | demo-0.png (540×960) |

MOS output:

| Input image | MOS score | Quality rating |
|---|---|---|
| demo-0.png | 0.1830 | Poor (MOS < 0.5) |

Result: CPU MOS = 0.1830, NPU MOS = 0.1829, relative error 0.0863%. The two scores agree within the test thresholds.
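
The precision metrics can be recomputed from the two scores. The following is a minimal sketch, assuming the relative error is defined as |CPU - NPU| / |CPU| and reusing the thresholds from the table (1% relative, 0.01 absolute). The function name `compare_scores` and its return layout are hypothetical and are not taken from inference.py.

```python
def compare_scores(cpu_score, npu_score, rel_tol=0.01, abs_tol=0.01):
    """Compare a CPU and an NPU MOS score (hypothetical helper, not from inference.py)."""
    abs_err = abs(cpu_score - npu_score)
    # Assumed definition: error relative to the CPU reference score.
    rel_err = abs_err / max(abs(cpu_score), 1e-12)
    return {
        "abs_error": abs_err,
        "rel_error": rel_err,
        "score_match": abs_err < abs_tol,
        "passed": rel_err < rel_tol and abs_err < abs_tol,
    }


# Scores as reported in the results, rounded to four decimals.
result = compare_scores(0.1830, 0.1829)
print(result)
```

Because the reported scores are rounded to four decimals, the errors computed from them differ slightly from the values in the log, which are computed from the unrounded outputs.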

Log output of the precision test:

    ============================================================
    cv_resnet_image-quality-assessment-mos_youtubeUGC - Ascend NPU Inference
    Output: /data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC-ascend
    ============================================================
    Mode: PRECISION TEST
    NPU available: True
    Device: npu:0
    ============================================================
    Loading Model and Test Image
    ============================================================
    Model loaded from: /data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC/iic/cv_resnet_image-quality-assessment-mos_youtubeUGC/pytorch_model.pt
    Using cached test image: /data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC-ascend/test_image.jpg
    Test image shape: (540, 960, 3)
    Input tensor shape: torch.Size([1, 3, 224, 224])
    ============================================================
    Running CPU Inference
    ============================================================
    CPU inference time: 0.2193s
    CPU MOS score: 0.1830
    ============================================================
    Running NPU Inference
    ============================================================
    NPU inference time: 4.1346s
    NPU MOS score: 0.1829
    Speedup: 0.05x
    ============================================================
    Precision Test Results
    ============================================================
    Max absolute error: 1.579076e-04
    Max relative error: 8.626611e-04 (0.0863%)
    Score match (|diff|<0.01): True (0.1830 vs 0.1829)
    Precision test: PASS
    Test image saved: /data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC-ascend/test_image.jpg
    ============================================================
    Test Complete!
    ============================================================

Model structure (ResNet18 backbone with a two-layer FC head):

| Component | Output channels | Notes |
|---|---|---|
| conv1 | 64 | 7×7, stride 2, padding 3 |
| layer1 | 64 | 2 BasicBlocks, stride 1 |
| layer2 | 128 | 2 BasicBlocks, stride 2 |
| layer3 | 256 | 2 BasicBlocks, stride 2 |
| layer4 | 512 | 2 BasicBlocks, stride 2 |
| fc0 | 1024 | Linear(512→1024) |
| fc1 | 1 | Linear(1024→1) → Sigmoid |
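
As a sanity check on the table, the learnable parameter count can be worked out by hand. The sketch below assumes the usual torchvision-style ResNet18 layout, which this document does not spell out: bias-free convolutions, a BatchNorm layer after every convolution (including a bn1 after conv1), and a 1×1 projection shortcut in the first block of layer2 to layer4. BatchNorm running statistics are not counted, and all helper names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    # Convolution without bias (assumption): c_in * c_out * k * k weights.
    return c_in * c_out * k * k


def bn_params(c):
    # BatchNorm: one learnable scale and one learnable shift per channel.
    return 2 * c


def basic_block_params(c_in, c_out, downsample):
    # Two 3x3 convolutions, each followed by BatchNorm.
    n = conv_params(c_in, c_out, 3) + bn_params(c_out)
    n += conv_params(c_out, c_out, 3) + bn_params(c_out)
    if downsample:
        # 1x1 projection shortcut (assumed whenever the channel count changes).
        n += conv_params(c_in, c_out, 1) + bn_params(c_out)
    return n


def linear_params(f_in, f_out):
    # Fully connected layer with bias.
    return f_in * f_out + f_out


def count_params():
    total = conv_params(3, 64, 7) + bn_params(64)  # conv1 + bn1
    c_in = 64
    for c_out in (64, 128, 256, 512):  # layer1 .. layer4, two blocks each
        total += basic_block_params(c_in, c_out, downsample=(c_in != c_out))
        total += basic_block_params(c_out, c_out, downsample=False)
        c_in = c_out
    total += linear_params(512, 1024)  # fc0
    total += linear_params(1024, 1)    # fc1
    return total


total = count_params()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the total comes to 11,702,849, which is consistent with the figure of about 11.7M quoted for the model.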

Usage example:

    import cv2
    import numpy as np
    import torch
    import torch_npu  # registers the "npu" device with PyTorch
    import torch.nn as nn
    import torch.nn.functional as F

    MODEL_WEIGHT = "/data/ysws/agentsp/5-19-1/cv_resnet_image-quality-assessment-mos_youtubeUGC/iic/cv_resnet_image-quality-assessment-mos_youtubeUGC/pytorch_model.pt"
    # ImageNet per-channel mean and standard deviation (RGB order)
    MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

    def preprocess_image(img, target_size=224):
        """Resize an RGB HxWx3 image and return a normalized 1x3xHxW float32 array."""
        img = cv2.resize(img, (target_size, target_size))
        img = img.astype(np.float32) / 255.0  # scale to [0, 1]
        img = (img - MEAN) / STD              # per-channel normalization
        img = img.transpose(2, 0, 1)          # HWC -> CHW
        img = np.expand_dims(img, axis=0)     # add batch dimension
        return img

    img = cv2.imread("test.jpg")
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    img_tensor = torch.from_numpy(preprocess_image(img_rgb)).float()

    # Load the model (you need to implement ResNet18 + FC head yourself)
    model = ...  # see inference.py
    model = model.to("npu:0")
    model.eval()
    with torch.no_grad():
        mos_score = model(img_tensor.to("npu:0"))
    print(f"MOS score: {mos_score.item():.4f}")  # 0.0 ~ 1.0, higher is better

A: Check that the NPU driver is installed correctly and that the CANN environment variables have been sourced. A numerical deviation of 0.01-0.5% between NPU and CPU results is normal, because the two devices compute with different numerical precision.
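A: For a small model (about 11.7M parameters), the NPU startup overhead (model compilation, data transfer) can exceed the actual compute time, which is why the NPU run is slower than the CPU run here. This is expected. Larger models are where the NPU is expected to show a significant speedup.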
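
To see how much of a measured time is one-time startup cost, the first call can be timed separately from later calls. The following is a minimal sketch of that idea using a stand-in workload; `time_call` and `fake_inference` are hypothetical names, and nothing here is taken from inference.py. For a real model, `fn` would wrap the forward pass, and `sync` would be a device synchronization function (for example `torch.npu.synchronize`, assuming torch_npu exposes it the way `torch.cuda` does).

```python
import time


def time_call(fn, warmup=1, runs=10, sync=None):
    """Return (first_call_seconds, mean_steady_state_seconds) for fn()."""
    def timed():
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for asynchronous device work before stopping the clock
        return time.perf_counter() - start

    first = timed()  # includes any one-time startup cost
    for _ in range(max(warmup - 1, 0)):
        timed()      # additional warm-up calls, results discarded
    steady = [timed() for _ in range(runs)]
    return first, sum(steady) / len(steady)


# Stand-in workload whose first call is artificially slow.
state = {"warm": False}


def fake_inference():
    if not state["warm"]:
        time.sleep(0.05)  # simulated compilation / data transfer
        state["warm"] = True
    time.sleep(0.001)     # simulated steady-state compute


first, steady = time_call(fake_inference, warmup=1, runs=5)
print(f"first call: {first:.4f}s, steady state: {steady:.4f}s")
```

If the steady-state time is much lower than the first call, the single-run figure mostly reflects startup overhead rather than compute speed.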
A: The MOS (Mean Opinion Score) lies in [0, 1]; higher values indicate better image quality. As a rule of thumb, MOS > 0.5 is considered good quality and MOS < 0.5 poor quality.
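
The 0.5 rule of thumb can be turned into a small helper. This is only a sketch: the name `quality_rating` is hypothetical, and treating a score of exactly 0.5 as "Poor" is an assumption, since the rule does not say which side the boundary belongs to.

```python
def quality_rating(mos, threshold=0.5):
    """Map a MOS in [0, 1] to a coarse rating (hypothetical helper)."""
    if not 0.0 <= mos <= 1.0:
        raise ValueError(f"MOS must be in [0, 1], got {mos}")
    # Assumption: a score exactly at the threshold is rated "Poor".
    return "Good" if mos > threshold else "Poor"


print(quality_rating(0.1830))  # Poor
```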
This project is licensed under the Apache-2.0 License.