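swinv2_small_window16_256 is a Swin Transformer V2 variant built on shifted window attention and intended for image classification.

Task: image classification.

The model takes a float32 input of shape [batch_size, 3, 256, 256] and produces an output of shape [batch_size, 1000].

| Dependency | Version |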
|---|---|
| Python | 3.11+ |
| PyTorch | 2.9.0 |
| torch_npu | 2.9.0.post1 |
| timm | 1.0.27 |
| ascend-cann | 8.5.1 |
| numpy | ≥1.20.0 |
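The model runs on PyTorch and timm, and is adapted for inference on Huawei Ascend910 NPUs through torch_npu. Weights are downloaded from ModelScope, and the model runs on the NPU without any changes to its architecture.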
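The adaptation described above can be sketched roughly as follows. This is a minimal illustration, not the repository's inference.py: `pick_device` and `run_once` are hypothetical helper names, the model is built with random weights (`pretrained=False`) because the ModelScope checkpoint location is not given here, and it assumes the timm model name is `swinv2_small_window16_256`.

```python
import importlib.util


def pick_device(prefer_npu: bool = True) -> str:
    """Return "npu" if torch_npu is installed and an NPU is visible, else "cpu"."""
    if prefer_npu and importlib.util.find_spec("torch_npu") is not None:
        import torch
        import torch_npu  # noqa: F401  (importing it registers the "npu" device)

        if torch.npu.is_available():
            return "npu"
    return "cpu"


def run_once(device: str, batch_size: int = 1):
    """Build the model and run a single forward pass on the given device."""
    import timm
    import torch

    # Assumption: random weights. Loading the ModelScope checkpoint is omitted
    # because its path is not stated in this README.
    model = timm.create_model("swinv2_small_window16_256", pretrained=False)
    model = model.eval().to(device)

    # Random input with the shape and dtype listed above.
    x = torch.randn(batch_size, 3, 256, 256, dtype=torch.float32, device=device)
    with torch.no_grad():
        logits = model(x)
    return logits.cpu()
```

Calling `run_once(pick_device())` should give a tensor whose shape matches the [batch_size, 1000] output listed above. The model code itself is unchanged between CPU and NPU; only the device string differs.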
# Install dependencies
pip install torch torch_npu timm numpy modelscope safetensors -i https://mirrors.aliyun.com/pypi/simple/
# Check NPU status
npu-smi info

# CPU + NPU inference
python3 inference.py --device all --num_iters 3
# CPU-only inference
python3 inference.py --device cpu --num_iters 3
# NPU-only inference
python3 inference.py --device npu --num_iters 3

| Device | Average latency (ms) | Speedup |
|---|---|---|
| CPU | 658.1 | 1x |
| NPU | 24.7 | 27x |
NPU inference is about 27x faster than CPU inference (24.7 ms vs. 658.1 ms on average).
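For reference, an average latency and speedup of this kind could be measured along the following lines. This is a sketch under assumptions: `average_latency_ms` and `speedup` are hypothetical helpers, the warm-up count is arbitrary, and the timing method actually used by inference.py is not documented here.

```python
import time
from typing import Callable, Optional


def average_latency_ms(
    fn: Callable[[], object],
    num_iters: int = 3,
    warmup: int = 1,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Average wall-clock time of fn() in milliseconds over num_iters calls.

    `sync` is an optional callable that blocks until queued device work has
    finished. On an NPU this would typically be torch.npu.synchronize
    (an assumption about the setup); on CPU it can be left as None.
    """
    for _ in range(warmup):
        fn()  # warm-up runs are not timed
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(num_iters):
        fn()
    if sync is not None:
        sync()  # make sure asynchronous work is included in the measurement
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / num_iters


def speedup(cpu_ms: float, npu_ms: float) -> float:
    """Ratio of CPU latency to NPU latency."""
    return cpu_ms / npu_ms


# Speedup implied by the latencies reported in the table.
print(f"{speedup(658.1, 24.7):.1f}x")
```

With the reported figures the ratio is roughly 26.6, which the table rounds to 27x.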
python3 inference.py --device all --num_iters 3
python3 compare_cpu_npu.py

Accuracy test procedure:

Test input shape: [1, 3, 256, 256]

| Metric | Value |
|---|---|
| Max Absolute Error (logit) | 0.13054974 |
| Mean Absolute Error (logit) | 2.50372700e-02 |
| Relative L2 Error (logit) | 2.71129876e-01 |
| Cosine Similarity | 0.96294427 |
| Top-1 Agreement | 100.00% |
| Top-5 Overlap | 80.00% |
| Max Prob Diff (softmax) | 0.00012304 |
| Mean Prob Diff (softmax) | 2.51628990e-05 |
| Top-1 Prob Diff (softmax) | 4.69471561e-05 |
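The NPU and CPU results differ by less than 1% in softmax probability:

Max Prob Diff = 0.0123%, Mean Prob Diff = 0.0025%

The NPU and CPU predictions agree at the class level (Top-1 agreement is 100%), and the mean softmax probability difference is very small, which meets the accuracy requirements for Ascend NPU deployment. The raw logits differ more visibly (relative L2 error of about 0.27, Top-5 overlap of 80%).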
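The metrics in the table can be computed along these lines. This sketch uses only the standard library so it runs anywhere; the same formulas carry over to NumPy arrays. The metric definitions are assumptions, since compare_cpu_npu.py is not shown: relative L2 error is taken as ||npu - cpu|| / ||cpu||, and Top-5 overlap as the shared fraction of the two top-5 index sets. `compare_logits` is a hypothetical name.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(values, k):
    """Indices of the k largest values, largest first."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


def compare_logits(cpu, npu, k=5):
    """Compare two logit vectors of equal length (assumes cpu is not all zeros)."""
    diffs = [abs(a - b) for a, b in zip(cpu, npu)]
    cpu_norm = math.sqrt(sum(a * a for a in cpu))
    npu_norm = math.sqrt(sum(b * b for b in npu))
    dot = sum(a * b for a, b in zip(cpu, npu))

    p_cpu, p_npu = softmax(cpu), softmax(npu)
    prob_diffs = [abs(a - b) for a, b in zip(p_cpu, p_npu)]
    top1_cpu = top_k(cpu, 1)[0]

    return {
        "max_abs_err": max(diffs),
        "mean_abs_err": sum(diffs) / len(diffs),
        "rel_l2_err": math.sqrt(sum(d * d for d in diffs)) / cpu_norm,
        "cosine": dot / (cpu_norm * npu_norm),
        "top1_agree": top1_cpu == top_k(npu, 1)[0],
        "topk_overlap": len(set(top_k(cpu, k)) & set(top_k(npu, k))) / k,
        "max_prob_diff": max(prob_diffs),
        "mean_prob_diff": sum(prob_diffs) / len(prob_diffs),
        # Probability difference at the class the CPU run ranks first.
        "top1_prob_diff": prob_diffs[top1_cpu],
    }


# Toy example with made-up logits, only to show the call.
cpu_logits = [1.0, 2.0, 3.0, 0.5, 0.1, -1.0]
npu_logits = [1.02, 1.97, 3.01, 0.48, 0.12, -1.03]
for name, value in compare_logits(cpu_logits, npu_logits).items():
    print(name, value)
```

Under these definitions a Top-5 overlap of 80% means four of the five top-ranked classes coincide. The percentages quoted above are the softmax differences multiplied by 100 (0.00012304 becomes 0.0123%).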
| Metric | CPU | NPU |
|---|---|---|
| Average inference latency (ms) | 658.1 | 24.7 |
| Speedup | 1x | 27x |
(Timing captured during README generation)

============================================================
CPU output shape: (1, 1000)
NPU output shape: (1, 1000)
=======================================================
Metric Value
-------------------------------------------------------
Max Absolute Error (logit) 0.00188507
Mean Absolute Error (logit) 4.49564017e-04
Relative L2 Error (logit) 4.86491621e-03
Cosine Similarity 0.99998844
Top-1 Agreement 100.00 %
Top-5 Overlap 100.00 %
Max Prob Diff (softmax) 0.00000208
Mean Prob Diff (softmax) 4.56068562e-07
Top-1 Prob Diff (softmax) 6.10947609e-07
=======================================================
Conclusion: NPU and CPU inference results differ by < 1% ✓
(Max Prob Diff = 0.0002%, Mean = 0.0000%)
=======================================================
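This repository is intended solely for Ascend NPU adaptation and verification. Copyright of the model belongs to its original authors.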