timm/repvit_m1_5.dist_450e_in1k Ascend NPU Adaptation

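1. Model Information

Model name: timm/repvit_m1_5.dist_450e_in1k
Model source: Hugging Face / timm
Model type: ImageNet-1k image classification model
Architecture: RepViT-M1.5
Number of output classes: 1000
Top-K: 5
Inference framework: PyTorch + timm + torch-npu
Target device: Ascend NPU
Adaptation goal: verify image classification inference for timm/repvit_m1_5.dist_450e_in1k on Ascend NPU and compare the outputs numerically against CPU inference.

This project is an entry for Track 1 of the Ascend Model-Agent model adaptation competition. Built on PyTorch, timm, and torch-npu, it covers model loading, test image generation, CPU inference, NPU inference, Top-5 classification output, a CPU/NPU output consistency comparison, and the collection of verification materials.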

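2. Project Description

repvit_m1_5.dist_450e_in1k is an ImageNet-1k image classification model from the timm model library. Given one RGB image, the model outputs a 1000-dimensional vector of ImageNet classification logits, from which the Top-5 classification results are derived.

The adaptation work covers:

Installing dependencies in the Ascend NPU Notebook environment;
Loading repvit_m1_5.dist_450e_in1k from the Hugging Face / timm model weights;
Generating the test image test.jpg;
Running image classification inference on CPU and on NPU;
Saving the logits, probabilities, and Top-5 classification results for both CPU and NPU;
Comparing the CPU and NPU outputs numerically;
Measuring the average inference latency on CPU and NPU;
Saving logs, screenshots, the adaptation report, and result files for the Track 1 model adaptation submission.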

3. Project Structure

.
├── README.md
├── adaptation_report.md
├── download_model.sh
├── make_test_image.py
├── inference.py
├── compare_cpu_npu.py
├── make_report.py
├── requirements.txt
├── test.jpg
├── hf_model/
│   ├── config.json
│   └── model.safetensors
├── cpu_result.json
├── cpu_result.txt
├── cpu_infer.log
├── npu_result.json
├── npu_result.txt
├── npu_infer.log
├── compare_result.txt
├── compare_metrics.json
├── compare.log
├── fusion_result.json
├── npu_env.txt
└── screenshots/
    ├── npu_env.png
    ├── npu_result.png
    └── compare_result.png

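Where:

README.md: project documentation;
adaptation_report.md: Ascend NPU adaptation report;
download_model.sh: model download script;
make_test_image.py: test image generation script;
inference.py: CPU/NPU inference script;
compare_cpu_npu.py: CPU/NPU output error comparison script;
make_report.py: report or result collation script;
requirements.txt: Python dependency file;
test.jpg: test image;
hf_model/config.json: model configuration file;
hf_model/model.safetensors: model weights file;
cpu_result.json: structured CPU inference results;
cpu_result.txt: CPU inference log;
cpu_infer.log: supplementary CPU inference log;
npu_result.json: structured NPU inference results;
npu_result.txt: NPU inference log;
npu_infer.log: supplementary NPU inference log;
compare_result.txt: CPU/NPU comparison results as text;
compare_metrics.json: structured CPU/NPU comparison metrics;
compare.log: CPU/NPU comparison run log;
fusion_result.json: aggregated results file;
npu_env.txt: NPU environment information;
screenshots/: verification screenshots.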

4. Environment Check

Run the following commands in the Ascend NPU Notebook to check the runtime environment:

npu-smi info
python --version
python - <<'PY'
import torch
import timm

print("torch:", torch.__version__)
print("timm:", timm.__version__)

try:
    import torch_npu
    print("torch_npu import success")
    print("npu available:", torch.npu.is_available())
except Exception as e:
    print("torch_npu import failed:", repr(e))
PY

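The environment information is saved to:

npu_env.txt

The environment check screenshot is saved to:

screenshots/npu_env.png

This screenshot shows that an Ascend NPU is present in the runtime environment and records the NPU model, its running status, and the Python version.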

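5. Model Loading

This project uses the Hugging Face / timm model:

timm/repvit_m1_5.dist_450e_in1k

The local model files are located at:

hf_model/config.json
hf_model/model.safetensors

The inference log shows:

checkpoint: hf_model/model.safetensors

This confirms that model loading and inference in this verification used the local Hugging Face weights file.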
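
As a rough illustration of the loading step, the sketch below builds the model with timm and reads the local weights file. It is not taken from inference.py: the helper names pick_device and load_model are hypothetical, the CPU fallback is an assumed policy, and passing checkpoint_path with pretrained=False is only one plausible way to load hf_model/model.safetensors.

```python
import importlib.util


def pick_device(requested: str) -> str:
    """Map a --device style value to a torch device string.

    Falls back to "cpu" when torch_npu is not installed (an assumed policy).
    """
    if requested == "npu" and importlib.util.find_spec("torch_npu") is not None:
        return "npu:0"
    return "cpu"


def load_model(checkpoint: str = "hf_model/model.safetensors", device: str = "cpu"):
    """Create RepViT-M1.5 and load local weights (hypothetical helper)."""
    import timm  # lazy import so pick_device works without timm installed

    if device.startswith("npu"):
        import torch_npu  # noqa: F401  (importing it registers the "npu" backend)

    model = timm.create_model(
        "repvit_m1_5.dist_450e_in1k",
        pretrained=False,            # no download; weights come from the local file
        checkpoint_path=checkpoint,
    )
    return model.eval().to(device)


if __name__ == "__main__":
    print("selected device:", pick_device("npu"))
```

With timm and the weights in place, a call such as load_model(device=pick_device("npu")) would return the model in eval mode on the selected device.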

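6. Test Input

This project uses the test image:

test.jpg

The inference flow is:

Read the test image;
Preprocess the image with timm;
Build the model input tensor;
Feed it to the RepViT-M1.5 image classification model;
Obtain the 1000-dimensional ImageNet classification logits;
Compute the softmax probabilities;
Output the Top-5 classification results.

The input tensor shape is: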

[1, 3, 224, 224]
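
The last three steps of the flow above (logits, softmax probabilities, Top-5) can be sketched in plain Python as follows. This is an illustration only: softmax and top_k are hypothetical helpers, the logits are made up, and the actual script presumably works on PyTorch tensors (for example with torch.softmax and torch.topk).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=5):
    """Return (class_index, probability, logit) tuples for the k largest logits."""
    probs = softmax(logits)
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    return [(i, probs[i], logits[i]) for i in order]


# Toy 4-class example (made-up logits, not model output).
toy_logits = [0.5, 2.0, -1.0, 1.0]
for rank, (idx, prob, logit) in enumerate(top_k(toy_logits, k=3), start=1):
    print(f"rank {rank}: class_index={idx}, prob={prob:.8f}, logit={logit:.8f}")
```

Subtracting the largest logit before exponentiating leaves the probabilities unchanged and avoids overflow for large logits.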

7. CPU Inference

Run:

python inference.py --device cpu --image test.jpg --output cpu_result.json 2>&1 | tee cpu_result.txt

CPU inference output files:

cpu_result.json
cpu_result.txt
cpu_infer.log

Summary of the CPU inference log:

model: timm/repvit_m1_5.dist_450e_in1k
device: cpu
torch: 2.9.0+cpu
timm: 1.0.27
checkpoint: hf_model/model.safetensors
input_shape: [1, 3, 224, 224]
output_shape: [1, 1000]
avg_latency_ms: 467.8799

CPU Top-5 output:

rank 1: class_index=111, prob=0.14002095, logit=7.00545454
rank 2: class_index=644, prob=0.13577750, logit=6.97467995
rank 3: class_index=409, prob=0.04388503, logit=5.84523582
rank 4: class_index=818, prob=0.02927318, logit=5.44033432
rank 5: class_index=326, prob=0.02859800, logit=5.41699934
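
The avg_latency_ms field in the log summary is an average inference time. How inference.py measures it is not shown in this README, so the following is only a sketch of one common approach: discard a few warm-up runs, then average several timed runs. The function name average_latency_ms, the warm-up and run counts, and the optional sync hook are all assumptions.

```python
import time


def average_latency_ms(run_once, warmup=3, runs=10, sync=None):
    """Average wall-clock time of run_once() in milliseconds.

    sync is an optional callable that blocks until queued device work is done;
    on an NPU it could be torch.npu.synchronize (an assumption, not confirmed here).
    """
    for _ in range(warmup):          # warm-up runs are excluded from timing
        run_once()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    if sync is not None:
        sync()                       # include outstanding device work in the timing
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


# Stand-in workload: a trivial callable instead of a model forward pass.
calls = []
latency = average_latency_ms(lambda: calls.append(1), warmup=2, runs=5)
print(f"avg_latency_ms: {latency:.4f}")
```

On an asynchronous device, timing without a synchronization point can measure only the time to queue the work, which is why the sketch accepts a sync callable.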

8. NPU Inference

Run:

python inference.py --device npu --image test.jpg --output npu_result.json 2>&1 | tee npu_result.txt

NPU inference output files:

npu_result.json
npu_result.txt
npu_infer.log

Summary of the NPU inference log:

model: timm/repvit_m1_5.dist_450e_in1k
device: npu
torch: 2.9.0+cpu
timm: 1.0.27
checkpoint: hf_model/model.safetensors
input_shape: [1, 3, 224, 224]
output_shape: [1, 1000]
avg_latency_ms: 34.2422

NPU Top-5 output:

rank 1: class_index=111, prob=0.13956892, logit=7.01538944
rank 2: class_index=644, prob=0.13759053, logit=7.00111294
rank 3: class_index=409, prob=0.04359126, logit=5.85168743
rank 4: class_index=818, prob=0.02909385, logit=5.44735765
rank 5: class_index=326, prob=0.02884225, logit=5.43867207

The NPU inference result screenshot is saved to:

screenshots/npu_result.png

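9. CPU/NPU Error Comparison

Run:

python compare_cpu_npu.py 2>&1 | tee compare_result.txt

Or:

python compare_cpu_npu.py 2>&1 | tee compare.log

The comparison script reads the CPU and NPU logits and probability outputs and reports:

CPU Top-5;
NPU Top-5;
Whether the CPU and NPU Top-1 match;
Whether the CPU and NPU Top-5 order matches;
Whether the CPU and NPU Top-5 sets match;
Maximum absolute error of the logits;
Mean absolute error of the logits;
Cosine similarity of the logits;
Maximum absolute error of the probabilities;
Mean absolute error of the probabilities;
Cosine similarity of the probabilities;
Average CPU and NPU inference latency;
Whether the suggested pass check is met (suggested_pass).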
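
The numerical and Top-K metrics listed above have standard definitions, sketched below in plain Python. This is not the code of compare_cpu_npu.py: the function names are hypothetical and the toy vectors are made up to stand in for the 1000-dimensional outputs.

```python
import math


def compare_vectors(cpu, npu):
    """Max/mean absolute error and cosine similarity of two equal-length lists."""
    if len(cpu) != len(npu):
        raise ValueError("vectors must have the same length")
    diffs = [abs(a - b) for a, b in zip(cpu, npu)]
    dot = sum(a * b for a, b in zip(cpu, npu))
    norm = math.sqrt(sum(a * a for a in cpu)) * math.sqrt(sum(b * b for b in npu))
    return {
        "max_abs": max(diffs),
        "mean_abs": sum(diffs) / len(diffs),
        "cosine_similarity": dot / norm,
    }


def top_k_indices(values, k=5):
    """Indices of the k largest values, largest first."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


def top_k_consistency(cpu, npu, k=5):
    """Compare the Top-1 class, the Top-k order, and the Top-k set."""
    cpu_top, npu_top = top_k_indices(cpu, k), top_k_indices(npu, k)
    return {
        "top1_same": cpu_top[0] == npu_top[0],
        "topk_order_same": cpu_top == npu_top,
        "topk_set_same": set(cpu_top) == set(npu_top),
    }


# Toy vectors (made up) in which the two leading classes swap places.
cpu_logits = [0.1, 2.0, 1.5, -0.3]
npu_logits = [0.1, 1.4, 1.5, -0.3]
metrics = compare_vectors(cpu_logits, npu_logits)
consistency = top_k_consistency(cpu_logits, npu_logits, k=2)
print(metrics)
print(consistency)
```

The toy case shows why order and set are reported separately: the Top-2 set is identical on both sides even though the order, and therefore the Top-1 class, differs.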

The comparison results are saved to:

compare_result.txt
compare_metrics.json
compare.log
fusion_result.json

10. Self-Verification Results

The CPU/NPU error results for this timm/repvit_m1_5.dist_450e_in1k adaptation are as follows:

Metric	Result
Number of classes	`1000`
Top-K	`5`
CPU Top-5	`[111, 644, 409, 818, 326]`
NPU Top-5	`[111, 644, 409, 818, 326]`
CPU/NPU Top-1 match	`True`
CPU/NPU Top-5 order match	`True`
CPU/NPU Top-5 set match	`True`
Logits maximum absolute error	`0.0747933388`
Logits mean absolute error	`0.0074124314`
Logits cosine similarity	`0.9999870135`
Probability maximum absolute error	`0.0018130243`
Probability mean absolute error	`0.0000107833`
Probability cosine similarity	`0.9999324711`
CPU average inference latency	`467.8799 ms`
NPU average inference latency	`34.2422 ms`
Suggested pass	`True`

The corresponding contents of compare_result.txt are:

CPU/NPU comparison for Ascend NPU adaptation

model: timm/repvit_m1_5.dist_450e_in1k
cpu_result: cpu_result.json
npu_result: npu_result.json
num_classes: 1000
top_k: 5

Top-k consistency:
  CPU top5: [111, 644, 409, 818, 326]
  NPU top5: [111, 644, 409, 818, 326]
  top1_same: True
  top5_order_same: True
  top5_set_same: True

Numerical metrics:
  logit_max_abs: 0.0747933388
  logit_mean_abs: 0.0074124314
  logit_cosine_similarity: 0.9999870135
  probability_max_abs: 0.0018130243
  probability_mean_abs: 0.0000107833
  probability_cosine_similarity: 0.9999324711

Latency:
  CPU avg latency ms: 467.8799
  NPU avg latency ms: 34.2422

suggested_pass: True

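These results show that the CPU and NPU agree on the Top-1 class, on the Top-5 order, and on the Top-5 set. The logits have a maximum absolute error of 0.0747933388 and a cosine similarity of 0.9999870135; the probabilities have a maximum absolute error of 0.0018130243 and a cosine similarity of 0.9999324711. In this test the average inference latency was 34.2422 ms on NPU against 467.8799 ms on CPU, about 13.7 times faster. The comparison reports suggested_pass: True, so the Ascend NPU inference verification for timm/repvit_m1_5.dist_450e_in1k is considered passed.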
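
As a quick sanity check on the reported figures, the arithmetic below recomputes the latency ratio and confirms that the logged probabilities are consistent with the logged logits. It uses only numbers printed in this README. Reading the class 644 shift as the location of the largest probability error is an inference from the matching values, not something the logs state.

```python
import math

# Values copied from the CPU and NPU logs in this README.
cpu_ms, npu_ms = 467.8799, 34.2422
top2 = {
    "cpu": {"prob": (0.14002095, 0.13577750), "logit": (7.00545454, 6.97467995)},
    "npu": {"prob": (0.13956892, 0.13759053), "logit": (7.01538944, 7.00111294)},
}

# Latency ratio for this single test image.
speedup = cpu_ms / npu_ms
print(f"speedup: {speedup:.2f}x")

# Softmax implies p1 / p2 == exp(logit1 - logit2) for any two classes.
ratio_gap = {}
for device, row in top2.items():
    prob_ratio = row["prob"][0] / row["prob"][1]
    logit_ratio = math.exp(row["logit"][0] - row["logit"][1])
    ratio_gap[device] = abs(prob_ratio - logit_ratio)
    print(f"{device}: prob ratio={prob_ratio:.6f}, exp(logit diff)={logit_ratio:.6f}")

# Probability shift of class 644 (rank 2) between the two devices.
class_644_shift = abs(top2["npu"]["prob"][1] - top2["cpu"]["prob"][1])
print(f"class 644 probability shift: {class_644_shift:.8f}")
```

The class 644 shift agrees with the reported probability_max_abs of 0.0018130243 to within the rounding of the logged values, which suggests that the largest probability deviation occurs at the runner-up class.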

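11. Verification Screenshots

11.1 NPU Environment Screenshot

screenshots/npu_env.png

This screenshot shows the Ascend NPU Notebook environment, the output of npu-smi info, and the Python version.

11.2 NPU Inference Result Screenshot

screenshots/npu_result.png

This screenshot shows the NPU inference log, including the model name, device, input tensor shape, output tensor shape, average inference latency, and Top-5 classification results.

11.3 CPU/NPU Error Comparison Screenshot

screenshots/compare_result.png

This screenshot shows the CPU/NPU output comparison, including Top-K consistency, logits error, probability error, cosine similarity, inference latency, and the suggested pass conclusion.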

12. Run Logs and Submission Materials

The submission materials for this project are:

README.md
adaptation_report.md
download_model.sh
make_test_image.py
inference.py
compare_cpu_npu.py
make_report.py
requirements.txt
test.jpg
hf_model/config.json
hf_model/model.safetensors
cpu_result.json
cpu_result.txt
cpu_infer.log
npu_result.json
npu_result.txt
npu_infer.log
compare_result.txt
compare_metrics.json
compare.log
fusion_result.json
npu_env.txt
screenshots/npu_env.png
screenshots/npu_result.png
screenshots/compare_result.png

For the adaptation report, see:

adaptation_report.md

For the CPU inference logs, see:

cpu_result.txt
cpu_infer.log

For the NPU inference logs, see:

npu_result.txt
npu_infer.log

For the CPU/NPU error comparison results, see:

compare_result.txt
compare_metrics.json
compare.log

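13. Adaptation Notes

The adaptation work in this project consists of:

Installing dependencies in the Ascend NPU environment;
Loading the repvit_m1_5.dist_450e_in1k model with timm;
Using the local Hugging Face weights file hf_model/model.safetensors;
Writing the test image generation script;
Writing a single inference script shared by CPU and NPU;
Supporting test image input and timm image preprocessing;
Saving structured classification results for CPU and NPU;
Writing the CPU/NPU logits and probability error comparison script;
Computing Top-1 consistency, Top-5 order consistency, Top-5 set consistency, logits error, probability error, and cosine similarity;
Measuring the average inference latency on CPU and NPU;
Producing logs, screenshots, the adaptation report, and the comparison results.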

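14. Conclusion

This project completed the adaptation and verification of image classification inference for timm/repvit_m1_5.dist_450e_in1k on Ascend NPU.

The results show that NPU inference completes model loading, image preprocessing, and Top-5 classification output without errors. The CPU and NPU agree on the Top-1 class and on the Top-5 order; the logits cosine similarity is 0.9999870135, the probability maximum absolute error is 0.0018130243, the probability cosine similarity is 0.9999324711, and the suggested pass check is met. In this test the average NPU inference latency (34.2422 ms) was also lower than the CPU latency (467.8799 ms). The project can serve as the submission material for Track 1 model adaptation.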