timm/convnext_xlarge.fb_in22k_ft_in1k on Ascend NPU

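1. Introduction

This project adapts timm/convnext_xlarge.fb_in22k_ft_in1k to the Ascend NPU (Ascend910). The weights are downloaded to local storage with ModelScope and loaded through timm.create_model(pretrained=False). Inference verification, a CPU/NPU accuracy consistency check, and a performance benchmark are then run on the NPU.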

2. Verification Environment

Item	Version/Model
NPU	Ascend910
CANN	8.5.1
torch	PyTorch (with torch_npu)
timm	latest

Detailed environment information is in logs/env_check.log.
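
The table above lists timm as "latest" and torch without a version number, so recording the exact installed versions makes the results easier to reproduce. The following is a minimal sketch of such a check. It is not the project's actual environment-check script (which is not shown here), and the distribution names passed to it are assumptions.

```python
from importlib import metadata


def collect_versions(packages):
    """Return {package: version string}, or 'not installed' when missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # ASSUMPTION: distribution names; adjust them to match requirements.txt.
    packages = ["torch", "torch_npu", "timm", "modelscope"]
    for name, version in collect_versions(packages).items():
        print(f"{name}: {version}")
```

Redirecting this output to a log file keeps the versions alongside the other logs.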

3. Running Inference

pip install -r requirements.txt
python inference.py

Inference results (NPU Top-5):

Top-1: class_680 (0.0073)
Top-2: class_700 (0.0051)
Top-3: class_549 (0.0050)
Top-4: class_551 (0.0046)
Top-5: class_405 (0.0046)

The log is saved to logs/inference.log.
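
A Top-5 listing like the one above is typically obtained by applying softmax to the model's output logits and keeping the five largest probabilities. The sketch below shows that step in plain Python so it runs without torch. The function names and demo logits are illustrative and not taken from inference.py; only the class_<index> label format mirrors the output above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=5):
    """Return [(label, probability)] for the k most probable classes."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(f"class_{i}", probs[i]) for i in order[:k]]


if __name__ == "__main__":
    demo_logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0]  # made-up values
    for rank, (label, prob) in enumerate(top_k(demo_logits, k=3), start=1):
        print(f"Top-{rank}: {label} ({prob:.4f})")
```

For reference, a uniform distribution over 1000 classes gives each class a probability of 0.001, so a Top-1 probability of 0.0073 is only a few times the uniform value. That is plausibly a consequence of running on a placeholder image rather than a real photograph.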

4. Accuracy Verification

CPU/NPU consistency was verified on a single test image:

Metric	Value
max_abs_error	0.014928
mean_abs_error	0.003129
relative_error	0.6983%
cosine_similarity	0.999987
threshold	1.0%
Result	PASS
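
The metrics in the table can be computed from the CPU and NPU output vectors along the following lines. This is a sketch under stated assumptions: the source does not define relative_error or say which metric the 1.0% threshold applies to, so here relative error is taken as mean absolute error divided by the mean absolute value of the CPU reference, and the threshold is applied to that value. Plain Python lists are used so the example runs without torch.

```python
import math


def consistency_metrics(ref, test, threshold=0.01):
    """Compare a reference vector (e.g. CPU logits) with a test vector (e.g. NPU logits)."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")

    diffs = [abs(r - t) for r, t in zip(ref, test)]
    max_abs = max(diffs)
    mean_abs = sum(diffs) / len(diffs)

    # ASSUMPTION: relative error = mean |ref - test| / mean |ref|.
    mean_ref = sum(abs(r) for r in ref) / len(ref)
    relative = mean_abs / mean_ref if mean_ref > 0 else float("inf")

    dot = sum(r * t for r, t in zip(ref, test))
    norms = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(t * t for t in test))
    cosine = dot / norms if norms > 0 else float("nan")

    return {
        "max_abs_error": max_abs,
        "mean_abs_error": mean_abs,
        "relative_error": relative,
        "cosine_similarity": cosine,
        # ASSUMPTION: PASS means relative error does not exceed the threshold.
        "passed": relative <= threshold,
    }
```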

CPU Top-1: class_680
NPU Top-1: class_680
CPU Top-5: class_680, class_700, class_549, class_405, class_551
NPU Top-5: class_680, class_700, class_549, class_551, class_405
Top-1 match: True
Top-5 match: True
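
The CPU and NPU Top-5 lists above contain the same five classes, but class_405 and class_551 swap positions 4 and 5, and the Top-5 match is still reported as True. That is consistent with an order-insensitive comparison. Whether the accuracy script actually compares sets is an assumption, and the helper names below are illustrative.

```python
def top1_match(a, b):
    """True when both rankings agree on the first class."""
    return a[0] == b[0]


def topk_match(a, b, ordered=False):
    """Compare two top-k label lists, by set membership unless ordered=True."""
    if ordered:
        return list(a) == list(b)
    return set(a) == set(b)


cpu_top5 = ["class_680", "class_700", "class_549", "class_405", "class_551"]
npu_top5 = ["class_680", "class_700", "class_549", "class_551", "class_405"]

print(top1_match(cpu_top5, npu_top5))                # True
print(topk_match(cpu_top5, npu_top5))                # True
print(topk_match(cpu_top5, npu_top5, ordered=True))  # False
```

The NPU probabilities for class_551 and class_405 are both 0.0046 at four decimal places, so a swap between two nearly tied classes is a plausible effect of small numerical differences between devices.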

5. Performance Reference

Metric	Value
avg latency	25.14 ms
min latency	25.09 ms
max latency	25.20 ms
p50 latency	25.15 ms
p90 latency	25.20 ms
p95 latency	25.20 ms
throughput	39.77 images/sec

The log is saved to logs/benchmark.log.
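
The statistics in the table can be derived from a list of per-iteration latencies, as sketched below. The warmup and iteration counts, the nearest-rank percentile method, and a batch size of 1 are assumptions; the project's actual benchmark settings are not stated here. On an NPU the timed call needs a device synchronization, which is passed in as the optional sync callable (for example torch.npu.synchronize when torch_npu is installed).

```python
import math
import time


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (pct in 0..100)."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(latencies_ms, batch_size=1):
    """Aggregate per-iteration latencies (ms) into the reported metrics."""
    ordered = sorted(latencies_ms)
    avg = sum(ordered) / len(ordered)
    return {
        "avg": avg,
        "min": ordered[0],
        "max": ordered[-1],
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "throughput": batch_size * 1000.0 / avg,  # images per second
    }


def benchmark(fn, warmup=10, iters=100, sync=None):
    """Time fn() and return per-call latencies in milliseconds."""
    for _ in range(warmup):  # warmup runs are not timed
        fn()
    if sync:
        sync()
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync:
            sync()  # wait for asynchronous device work before stopping the clock
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies
```

As a consistency check on the table, assuming a batch size of 1: 1000 / 25.14 ms is about 39.78 images/sec, which agrees with the reported 39.77 images/sec up to rounding of the average latency.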

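6. Notes on Accuracy Evaluation

This project includes only a single-image smoke consistency check, not an evaluation on the full official ImageNet validation set. See Section 4 for the detailed metrics.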

7. Self-Verification Screenshot

See screenshots/self_verification.png.

8. Log Files

logs/env_check.log: environment check
logs/inference.log: inference results
logs/accuracy.log: accuracy consistency
logs/benchmark.log: performance benchmark

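9. Notes

Weights are downloaded with ModelScope snapshot_download; automatic download from HuggingFace must not be used.
Local weights are loaded with timm.create_model(..., pretrained=False).
The test image is a placeholder (generated when the network is unreachable); replace it with a real image for actual deployment.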
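
The first two notes describe a download-then-load flow, which might look roughly like the sketch below. Several details are assumptions rather than facts from this project: the ModelScope model ID, the candidate weight file names, the npu:0 device string, and the helper names find_checkpoint and load_model. The checkpoint_path argument of timm.create_model is used to point timm at the local file.

```python
from pathlib import Path

# ASSUMPTION: candidate weight file names; check the downloaded directory.
CANDIDATE_FILES = ("model.safetensors", "pytorch_model.bin")


def find_checkpoint(model_dir):
    """Return the first candidate weight file present in model_dir."""
    for name in CANDIDATE_FILES:
        path = Path(model_dir) / name
        if path.is_file():
            return path
    raise FileNotFoundError(f"no weight file found in {model_dir}")


def load_model(model_id="timm/convnext_xlarge.fb_in22k_ft_in1k", device="npu:0"):
    """Download weights via ModelScope and build the timm model from the local file."""
    # Imported lazily so find_checkpoint can be used without these packages.
    import timm
    import torch_npu  # noqa: F401  (registers the "npu" device with torch)
    from modelscope import snapshot_download

    # ASSUMPTION: the ModelScope model ID matches the timm model name.
    model_dir = snapshot_download(model_id)  # returns the local directory
    model = timm.create_model(
        "convnext_xlarge.fb_in22k_ft_in1k",
        pretrained=False,  # keeps timm from fetching weights itself
        checkpoint_path=str(find_checkpoint(model_dir)),
    )
    return model.eval().to(device)
```

Keeping pretrained=False and passing an explicit local path means that no weights are fetched by timm at model-creation time.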

10. Tags

#NPU #Ascend #Ascend910 #image-classification
