DamoYOLO-T (TinyNAS) is a COCO 80-class object detection model, adapted here for inference on Huawei Ascend NPUs.
# 1. Download the model
pip install modelscope
modelscope download --model iic/cv_tinynas_object-detection_damoyolo-t
# 2. Install dependencies
pip install torch torchvision opencv-python numpy
# 3. Run everything with one command
bash run_all.sh
# CPU inference
python3 inference.py --device cpu --image assets/test_bag.jpg
# NPU inference
python3 inference.py --device npu --image assets/test_bag.jpg
# Full comparison (CPU vs NPU, accuracy, and performance benchmark)
python3 inference.py --device all --benchmark --image assets/test_bag.jpg

| Parameter | Type | Default | Description |
|---|---|---|---|
| --image | str | assets/test_bag.jpg | Input image path |
| --device | str | all | cpu / npu / all |
| --benchmark | flag | False | Run the performance benchmark |
| --output | str | auto | Path of the result JSON |
| --score_threshold | float | 0.3 | Score threshold for visualization |
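
The parameter table maps naturally onto Python's `argparse`. The sketch below is only an assumption about how such a parser could look; it is not copied from `inference.py`, and `build_parser` is a hypothetical name. The defaults are taken from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the parameter table (not the real inference.py)."""
    parser = argparse.ArgumentParser(description="DamoYOLO-T CPU/NPU inference")
    parser.add_argument("--image", type=str, default="assets/test_bag.jpg",
                        help="input image path")
    parser.add_argument("--device", type=str, default="all",
                        choices=["cpu", "npu", "all"],
                        help="device(s) to run on")
    # A flag: absent means False, present means True.
    parser.add_argument("--benchmark", action="store_true",
                        help="run the performance benchmark")
    parser.add_argument("--output", type=str, default="auto",
                        help="path of the result JSON")
    parser.add_argument("--score_threshold", type=float, default=0.3,
                        help="score threshold for visualization")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is easy to try.
args = build_parser().parse_args(["--device", "npu", "--benchmark"])
print(args.device, args.benchmark, args.score_threshold)
```

Passing an explicit list to `parse_args` keeps the example independent of the command line; a real script would call `parse_args()` with no arguments.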
Verified on assets/test_bag.jpg (480×640) with a confidence threshold of 0.6 and an NMS threshold of 0.7.
| Metric | CPU | NPU | Difference |
|---|---|---|---|
| Number of detections | 14 | 14 | Identical ✅ |
| Label match | sports ball, mouse, clock, cell phone, laptop, remote, baseball bat, surfboard, airplane, stop sign, traffic light | Same as CPU; two adjacent pairs swap rank (10/11 and 12/13) | Semantically consistent ✅ |
| Highest confidence | 0.772 | 0.776 | +0.004 |
| Maximum score difference | - | - | 0.005 ✅ |
| bbox coordinates (top-1) | (195.9,135.8,208.7,148.9) | (195.8,135.8,208.7,149.0) | Sub-pixel |
| Label consistency | - | - | 100% ✅ |
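Conclusion: CPU and NPU detection results are closely aligned. Score differences are ≤ 0.005, bbox coordinates differ at the sub-pixel level, and labels are 100% consistent.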
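
To make the two thresholds used in this verification concrete, here is a minimal sketch of score filtering followed by greedy per-class NMS. It is a generic illustration under stated assumptions, not the post-processing in `inference.py`: the function names, the `>=` comparison, and the per-class grouping are all assumptions, and the second laptop box and the low-score clock entry are made up for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
Detection = Tuple[str, float, Box]        # (label, score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets: List[Detection], conf_thr: float = 0.6,
                   nms_thr: float = 0.7) -> List[Detection]:
    """Drop low-score detections, then suppress same-label boxes that overlap a kept one."""
    candidates = sorted((d for d in dets if d[1] >= conf_thr),
                        key=lambda d: d[1], reverse=True)
    kept: List[Detection] = []
    for label, score, box in candidates:
        # Keep the box unless a higher-scoring box of the same label overlaps it too much.
        if all(label != k[0] or iou(box, k[2]) <= nms_thr for k in kept):
            kept.append((label, score, box))
    return kept


dets = [
    ("laptop", 0.703, (158.3, 104.0, 492.5, 397.6)),
    ("laptop", 0.650, (160.0, 105.0, 490.0, 395.0)),  # made-up near-duplicate
    ("mouse", 0.747, (188.1, 212.6, 205.3, 232.3)),
    ("clock", 0.400, (188.1, 212.6, 205.3, 232.3)),   # made-up low-score entry
]
kept = filter_and_nms(dets)
print([(label, score) for label, score, _ in kept])
```

In this example the clock entry falls below the 0.6 confidence threshold, and the second laptop box is suppressed because its IoU with the higher-scoring laptop box exceeds 0.7.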
After 10 warmup runs, inference is run 50 times in a row and the following statistics are collected:
| Metric | CPU (Arm) | NPU (Ascend910B) | Speedup |
|---|---|---|---|
| Mean | 634.37 ms | 24.93 ms | 25.4x 🚀 |
| Median | 624.63 ms | 14.81 ms | 42.2x |
| Min | 590.19 ms | 14.18 ms | 41.6x |
| P90 | 679.36 ms | 54.81 ms | 12.4x |
| P99 | 687.56 ms | 55.14 ms | 12.5x |
The median single-frame speedup is 42x, with a median NPU latency of only 14.8 ms (including image preprocessing, post-processing, and NMS).
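
The benchmark procedure described above (warm up, time a fixed number of runs, summarize, divide) can be sketched as follows. This is a minimal illustration, not the benchmark code in `inference.py`: `benchmark`, `percentile`, and `speedup` are hypothetical helpers, the nearest-rank percentile definition is an assumption, and the optional `sync` hook stands in for a device synchronization call (for example `torch.npu.synchronize` if `torch_npu` is used), since the source does not say how timing is synchronized.

```python
import math
import statistics
import time
from typing import Callable, Dict, List, Optional


def percentile(sorted_ms: List[float], q: float) -> float:
    """Nearest-rank percentile of an ascending list, with q in (0, 100]."""
    rank = max(1, math.ceil(q * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def benchmark(run_once: Callable[[], object], warmup: int = 10, runs: int = 50,
              sync: Optional[Callable[[], None]] = None) -> Dict[str, float]:
    """Time run_once() in milliseconds after discarding warmup iterations."""
    for _ in range(warmup):
        run_once()
    if sync is not None:
        sync()  # make sure warmup work has finished before timing starts
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        if sync is not None:
            sync()  # wait for asynchronous device work before stopping the clock
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "mean": statistics.fmean(samples),
        "median": statistics.median(samples),
        "min": samples[0],
        "p90": percentile(samples, 90),
        "p99": percentile(samples, 99),
    }


def speedup(cpu_ms: float, npu_ms: float) -> float:
    """Ratio of CPU latency to NPU latency."""
    return cpu_ms / npu_ms


# A stand-in workload; a real run would call the model's forward pass here.
stats = benchmark(lambda: sum(range(1000)), warmup=2, runs=20)
print(sorted(stats))

# Median speedup from the table: 624.63 ms / 14.81 ms.
print(round(speedup(624.63, 14.81), 1))  # 42.2
```

The gap between the NPU median (14.81 ms) and P90 (54.81 ms) shows why the median is the more representative figure for single-frame latency; the mean is pulled up by a minority of slow runs.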
Inference logs (logs/verify_result.json) for the input image assets/test_bag.jpg (480×640):
=== CPU inference log ===
[ 1] sports ball score=0.772 bbox=(195.9,135.8,208.7,148.9)
[ 2] mouse score=0.747 bbox=(188.1,212.6,205.3,232.3)
[ 3] clock score=0.728 bbox=(188.1,212.6,205.3,232.3)
[ 4] cell phone score=0.710 bbox=(188.1,212.6,205.3,232.3)
[ 5] laptop score=0.703 bbox=(158.3,104.0,492.5,397.6)
[ 6] remote score=0.700 bbox=(188.1,212.6,205.3,232.3)
[ 7] sports ball score=0.659 bbox=(188.1,212.6,205.3,232.3)
[ 8] baseball bat score=0.654 bbox=(397.9, 47.5,456.7,104.5)
[ 9] surfboard score=0.645 bbox=(196.4, 46.7,255.5,103.8)
[10] airplane score=0.638 bbox=(195.5, 47.2,255.4,103.4)
[11] stop sign score=0.636 bbox=(188.1,212.6,205.3,232.3)
[12] sports ball score=0.624 bbox=(389.7,189.8,401.2,202.7)
[13] traffic light score=0.621 bbox=(188.1,212.6,205.3,232.3)
[14] mouse score=0.606 bbox=(195.9,135.8,208.7,148.9)
=== NPU inference log ===
[ 1] sports ball score=0.776 bbox=(195.8,135.8,208.7,149.0)
[ 2] mouse score=0.751 bbox=(188.2,212.6,205.3,232.3)
[ 3] clock score=0.731 bbox=(188.2,212.6,205.3,232.3)
[ 4] cell phone score=0.712 bbox=(188.2,212.6,205.3,232.3)
[ 5] laptop score=0.704 bbox=(158.2,104.0,492.5,397.7)
[ 6] remote score=0.703 bbox=(188.2,212.6,205.3,232.3)
[ 7] sports ball score=0.664 bbox=(188.2,212.6,205.3,232.3)
[ 8] baseball bat score=0.654 bbox=(397.9, 47.5,456.7,104.5)
[ 9] surfboard score=0.645 bbox=(196.4, 46.7,255.5,103.8)
[10] stop sign score=0.642 bbox=(188.2,212.6,205.3,232.3)
[11] airplane score=0.637 bbox=(195.5, 47.2,255.4,103.4)
[12] traffic light score=0.626 bbox=(188.2,212.6,205.3,232.3)
[13] sports ball score=0.625 bbox=(389.7,189.8,401.2,202.7)
[14] mouse score=0.610 bbox=(195.8,135.8,208.7,149.0)
=== Performance benchmark log (warmup=10, runs=50) ===
CPU: mean=620.13ms median=614.86ms min=593.38ms p90=652.78ms
NPU: mean= 25.86ms median= 15.57ms min= 15.31ms p90= 53.34ms
Speedup (median): 39.5x
=== Accuracy verification log ===
detections: CPU=14 NPU=14 ✅
scores: max_diff=0.005 ✅
bbox shape: match ✅
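labels: 100% match ✅

CPU and NPU detection results are highly consistent: all 14 detections carry the same labels (14/14), score differences are ≤ 0.005, and bbox differences are sub-pixel. Data source: logs/verify_result.json.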
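
The checks in the accuracy verification log (detection count, score difference, label agreement) can be reproduced with a few lines of Python. The sketch below is an assumption about how such a comparison might work, not the verification code behind `logs/verify_result.json`: `compare` and `scores_by_label` are hypothetical helpers, and pairing detections by label (rather than by rank) is a design choice made here so that rank swaps between devices do not count as mismatches. The scores are the top three entries from the logs above; the NPU order is shuffled on purpose.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

ScoredLabel = Tuple[str, float]  # (label, score)


def scores_by_label(dets: List[ScoredLabel]) -> Dict[str, List[float]]:
    """Group scores per label, highest first, so ranks within a label line up."""
    groups: Dict[str, List[float]] = defaultdict(list)
    for label, score in dets:
        groups[label].append(score)
    return {label: sorted(scores, reverse=True) for label, scores in groups.items()}


def compare(cpu: List[ScoredLabel], npu: List[ScoredLabel]) -> Dict[str, object]:
    """Compare two detection lists while ignoring their rank order."""
    cpu_groups, npu_groups = scores_by_label(cpu), scores_by_label(npu)
    diffs = [abs(a - b)
             for label in cpu_groups if label in npu_groups
             for a, b in zip(cpu_groups[label], npu_groups[label])]
    return {
        "same_count": len(cpu) == len(npu),
        "same_labels": Counter(l for l, _ in cpu) == Counter(l for l, _ in npu),
        "max_score_diff": max(diffs, default=0.0),
    }


cpu = [("sports ball", 0.772), ("mouse", 0.747), ("clock", 0.728)]
npu = [("sports ball", 0.776), ("clock", 0.731), ("mouse", 0.751)]  # order shuffled
result = compare(cpu, npu)
print(result["same_count"], result["same_labels"], round(result["max_score_diff"], 3))
```

Comparing label multisets with `Counter` treats repeated labels (such as the three `sports ball` detections in the full log) correctly, which a plain set comparison would not.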
damoyolo-detection/
├── inference.py # Main inference script
├── run_all.sh # One-command run script
├── VERIFICATION_REPORT.md # Verification report
├── assets/ # Test images
│ └── test_bag.jpg
├── damo/ # Model source code
│ ├── base_models/
│ │ ├── backbones/ # TinyNAS backbone
│ │ ├── necks/ # GiraffeNeckV2
│ │ └── heads/ # ZeroHead
│ └── utils/ # Utility functions
└── logs/ # Run logs and results