swinv2_base_window12to24_192to384 is a variant in the Swin Transformer V2 family. It is built on the shifted window attention mechanism and is intended for image classification.
Task: Image Classification
Input shape: [batch_size, 3, 384, 384]
Data type: float32
Output shape: [batch_size, 1000]

| Dependency | Version |
|---|---|
| Python | 3.11+ |
| PyTorch | 2.9.0 |
| torch_npu | 2.9.0.post1 |
| timm | 1.0.27 |
| ascend-cann | 8.5.1 |
| numpy | ≥1.20.0 |
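
The model runs on PyTorch and timm, and inference on the Huawei Ascend Ascend910 NPU is enabled through torch_npu. The weights are downloaded from ModelScope, and the model runs on the NPU without any changes to its architecture.
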
# Install dependencies
pip install torch torch_npu timm numpy modelscope safetensors -i https://mirrors.aliyun.com/pypi/simple/
# Check NPU status
npu-smi info
# CPU + NPU inference
python3 inference.py --device all --num_iters 3
# CPU-only inference
python3 inference.py --device cpu --num_iters 3
# NPU-only inference
python3 inference.py --device npu --num_iters 3

| Device | Average latency (ms) | Speedup |
|---|---|---|
| CPU | 3177.7 | 1x |
| NPU | 64.2 | 49x |

NPU inference is about 49x faster than CPU inference (64.2 ms versus 3177.7 ms on average).
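
Averages like the ones in the table above are typically obtained by timing a few forward passes and dividing the two means. The following is a minimal sketch of that measurement, not the code in inference.py: `average_latency_ms`, `speedup`, and the warm-up count are hypothetical names and choices made for illustration. On a real NPU the device would also need to be synchronized before the clock is read, which this sketch leaves out.

```python
import time
from typing import Callable


def average_latency_ms(run_once: Callable[[], object],
                       num_iters: int = 3,
                       warmup: int = 1) -> float:
    """Return the mean wall-clock time of run_once() in milliseconds."""
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from the average
    start = time.perf_counter()
    for _ in range(num_iters):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / num_iters


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """Ratio of the baseline latency to the accelerated latency."""
    return baseline_ms / accelerated_ms


# Latencies reported in the table above.
cpu_ms, npu_ms = 3177.7, 64.2
print(f"speedup: {speedup(cpu_ms, npu_ms):.1f}x")  # prints "speedup: 49.5x"
```

In practice `run_once` would wrap a single forward pass of the model on the chosen device.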

python3 inference.py --device all --num_iters 3
python3 compare_cpu_npu.py

Accuracy test procedure:
Input shape: [1, 3, 384, 384]

| Metric | Value |
|---|---|
| Max Absolute Error (logit) | 0.08914217 |
| Mean Absolute Error (logit) | 2.34416891e-02 |
| Relative L2 Error (logit) | 6.49592727e-02 |
| Cosine Similarity | 0.99789101 |
| Top-1 Agreement | 0.00% |
| Top-5 Overlap | 80.00% |
| Max Prob Diff (softmax) | 0.00010171 |
| Mean Prob Diff (softmax) | 2.35444040e-05 |
| Top-1 Prob Diff (softmax) | 2.15225155e-05 |
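
NPU and CPU inference results differ by less than 1% in softmax probability:
Max Prob Diff = 0.0102%, Mean Prob Diff = 0.0024%

In the logged run reproduced at the end of this README, the NPU and CPU predictions agree at the top-1 level (Top-1 Agreement of 100%) and the mean softmax probability difference is very small, which meets the accuracy requirement for Ascend NPU deployment. The metrics table above, however, records a Top-1 Agreement of 0.00% and a Top-5 Overlap of 80.00%, so its figures do not match that log and appear to come from a different run.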
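
The metric names in the table suggest standard element-wise and ranking comparisons between the CPU and NPU logit vectors. As a minimal sketch, assuming the usual definitions, they could be computed for a single sample as shown here. The source of compare_cpu_npu.py is not shown in this README, so the function names, dictionary keys, and exact formulas (in particular the top-1 probability difference) are assumptions rather than the script's actual implementation.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list[float]:
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(values: Sequence[float], k: int) -> list[int]:
    """Indices of the k largest values, largest first."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


def compare_logits(cpu: Sequence[float], npu: Sequence[float], k: int = 5) -> dict[str, float]:
    """Compare two logit vectors; the CPU vector is treated as the reference."""
    diffs = [abs(a - b) for a, b in zip(cpu, npu)]
    norm_cpu = math.sqrt(sum(a * a for a in cpu))
    norm_npu = math.sqrt(sum(b * b for b in npu))
    dot = sum(a * b for a, b in zip(cpu, npu))
    p_cpu, p_npu = softmax(cpu), softmax(npu)
    prob_diffs = [abs(a - b) for a, b in zip(p_cpu, p_npu)]
    top_cpu, top_npu = top_k(cpu, k), top_k(npu, k)
    best = top_cpu[0]  # class predicted by the reference (assumed definition)
    return {
        "max_abs_error": max(diffs),
        "mean_abs_error": sum(diffs) / len(diffs),
        "relative_l2_error": math.sqrt(sum(d * d for d in diffs)) / norm_cpu,
        "cosine_similarity": dot / (norm_cpu * norm_npu),
        "top1_agreement": float(top_cpu[0] == top_npu[0]),
        "topk_overlap": len(set(top_cpu) & set(top_npu)) / k,
        "max_prob_diff": max(prob_diffs),
        "mean_prob_diff": sum(prob_diffs) / len(prob_diffs),
        "top1_prob_diff": abs(p_cpu[best] - p_npu[best]),
    }


# Toy 6-class example; the real model outputs 1000 logits per image.
cpu_logits = [2.0, 1.0, 0.5, -1.0, 0.0, 3.0]
npu_logits = [2.01, 0.99, 0.5, -1.0, 0.02, 2.98]
for name, value in compare_logits(cpu_logits, npu_logits).items():
    print(f"{name}: {value:.6f}")
```

Multiplying a probability difference by 100 gives the percentage form used in the summary lines, for example 0.00010171 becomes 0.0102%.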

| Metric | CPU | NPU |
|---|---|---|
| Average inference latency (ms) | 3177.7 | 64.2 |
| Speedup | 1x | 49x |

(Timing captured during README generation)

============================================================
CPU output shape: (1, 1000)
NPU output shape: (1, 1000)
=======================================================
Metric Value
-------------------------------------------------------
Max Absolute Error (logit) 0.00364870
Mean Absolute Error (logit) 8.87857983e-04
Relative L2 Error (logit) 2.39387550e-03
Cosine Similarity 0.99999803
Top-1 Agreement 100.00 %
Top-5 Overlap 100.00 %
Max Prob Diff (softmax) 0.00000274
Mean Prob Diff (softmax) 6.85376335e-07
Top-1 Prob Diff (softmax) 2.20618676e-06
=======================================================
Conclusion: NPU and CPU inference results differ by less than 1% ✓
(Max Prob Diff = 0.0003%, Mean = 0.0001%)
=======================================================
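
This repository is intended only for validating the Ascend NPU adaptation. Copyright of the model belongs to its original authors.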