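timm/seresnext50_32x4d.racm_in1k Ascend NPU Adaptation

1. Model Information

Model name: timm/seresnext50_32x4d.racm_in1k
Model source: Hugging Face / timm
Model type: ImageNet-1k image classification model
Architecture: SE-ResNeXt50 32x4d
Weight format: model.safetensors
Number of output classes: 1000
Inference framework: PyTorch + timm + torch-npu
Target device: Ascend NPU
Adaptation goal: run image classification inference with timm/seresnext50_32x4d.racm_in1k on Ascend NPU and compare the outputs against CPU inference.

This project verifies the inference pipeline of timm/seresnext50_32x4d.racm_in1k on Ascend NPU. It builds the SE-ResNeXt50 32x4d image classification model with timm, loads the Hugging Face pretrained weights, and covers test image generation, CPU inference, NPU inference, Top-K classification output, CPU/NPU output error comparison, and collection of the verification materials.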

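2. Project Description

seresnext50_32x4d.racm_in1k is an image classification model from the timm model library. Its architecture is SE-ResNeXt50 32x4d and it targets the ImageNet-1k classification task. This project adapts the model to an Ascend NPU Notebook environment and checks that it completes a normal forward pass on the NPU.

The adaptation covers:

installing dependencies in the Ascend NPU Notebook environment;
downloading the weights from Hugging Face and loading the local model.safetensors file;
generating the test image test.jpg;
running image classification inference on both CPU and NPU;
saving the CPU and NPU Top-K classification results;
comparing the CPU and NPU output logits;
saving logs, screenshots, and the adaptation report for the Track 1 model adaptation verification submission.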

3. Project Structure

.
├── README.md
├── adaptation_report.md
├── download_model.sh
├── make_test_image.py
├── inference.py
├── compare_cpu_npu.py
├── test_seresnext50.py
├── requirements.txt
├── test.jpg
├── cpu_result.json
├── cpu_result.txt
├── npu_result.json
├── npu_result.txt
├── compare_result.txt
├── run.log
├── outputs/
│   ├── demo_input.png
│   └── top5_result.json
└── screenshots/
    ├── npu_env.png
    ├── npu_result.png
    └── compare_result.png

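Where:

download_model.sh: model download script;
make_test_image.py: test image generation script;
inference.py: CPU/NPU inference script;
compare_cpu_npu.py: CPU/NPU output error comparison script;
test_seresnext50.py: early single-run inference verification script;
cpu_result.json: structured CPU inference result;
cpu_result.txt: CPU inference log;
npu_result.json: structured NPU inference result;
npu_result.txt: NPU inference log;
compare_result.txt: CPU/NPU error comparison result;
run.log: full run log;
adaptation_report.md: adaptation report;
screenshots/: verification screenshots.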

4. Environment Check

Run the following commands in the Ascend NPU Notebook to check the runtime environment:

npu-smi info
python --version
python - <<'PY'
import torch
import timm

print("torch:", torch.__version__)
print("timm:", timm.__version__)

try:
    import torch_npu
    print("torch_npu import success")
    print("npu available:", torch.npu.is_available())
except Exception as e:
    print("torch_npu import failed:", repr(e))
PY

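The environment check screenshot is saved as:

screenshots/npu_env.png

The screenshot shows that an Ascend NPU is present in the runtime environment and records the NPU model, its running status, and the Python version.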

5. Model Download

Run:

bash download_model.sh

The model comes from Hugging Face:

timm/seresnext50_32x4d.racm_in1k

The weight file is saved to:

./model/model.safetensors

The inference log for this run shows:

Loaded local safetensors weight: model/model.safetensors
Missing keys: 0, unexpected keys: 0
pretrained_loaded: true

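This shows that the run loaded the Hugging Face pretrained weights successfully and that the weight keys match the local timm model structure: no keys were missing and none were unexpected.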
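The key check reported in the log can be reproduced with a short loader. The following is a minimal sketch, not the project's inference.py: the function names `summarize_keys` and `load_local_weights` are hypothetical, and it assumes that timm and safetensors are installed and that the weights sit at ./model/model.safetensors as described above.

```python
from pathlib import Path


def summarize_keys(missing, unexpected):
    """Format the key check in the same shape as the log line above."""
    return f"Missing keys: {len(missing)}, unexpected keys: {len(unexpected)}"


def load_local_weights(model_dir="./model", arch="seresnext50_32x4d"):
    """Build the timm architecture and load model.safetensors from model_dir."""
    # Heavy imports stay local so summarize_keys works without them.
    import timm
    from safetensors.torch import load_file

    weight_path = Path(model_dir) / "model.safetensors"
    # pretrained=False builds the structure only; weights come from the local file.
    model = timm.create_model(arch, pretrained=False, num_classes=1000)
    state_dict = load_file(str(weight_path))
    # strict=False reports mismatched keys instead of raising on them.
    result = model.load_state_dict(state_dict, strict=False)
    print(f"Loaded local safetensors weight: {weight_path}")
    print(summarize_keys(result.missing_keys, result.unexpected_keys))
    return model.eval()
```

Loading with `strict=False` and then inspecting the two key lists keeps a partial mismatch visible in the log rather than stopping the run, which is one plausible way to arrive at a line such as "Missing keys: 0, unexpected keys: 0".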

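6. Test Input

The project uses the test image:

test.jpg

The inference flow is:

read the test image;
preprocess the image with timm;
build the model input tensor;
feed it to the SE-ResNeXt50 32x4d model;
produce the 1000-dimensional ImageNet classification logits;
compute softmax probabilities;
output the Top-5 classification results.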
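The last two steps of this flow, softmax and Top-5 selection, can be sketched in plain Python. This is an illustration only: the helper names `softmax` and `top_k` are hypothetical, the toy logits are not model output, and the real script presumably operates on torch tensors.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=5):
    """Return the k most probable classes as label_id / label / score records."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # "class_<id>" mirrors the placeholder labels used in the result files.
    return [{"label_id": i, "label": f"class_{i}", "score": probs[i]} for i in order]


if __name__ == "__main__":
    toy_logits = [0.1, 2.0, -1.0, 0.5]  # toy values, not model output
    for row in top_k(toy_logits, k=2):
        print(row)
```

Each record has the same three fields as the entries in cpu_result.json and npu_result.json, so the scores are probabilities over all 1000 classes rather than raw logits.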

7. CPU Inference

Run:

python inference.py --device cpu --model ./model --image test.jpg --output cpu_result.json 2>&1 | tee cpu_result.txt

CPU inference output files:

cpu_result.json
cpu_result.txt
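The command above passes four flags to inference.py. A minimal sketch of a matching command-line interface follows; the flag names come from the command, while the function name `build_parser`, the default values, and the help texts are assumptions and may differ from the actual script.

```python
import argparse


def build_parser():
    """CLI sketch using the four flags that appear in the inference command."""
    parser = argparse.ArgumentParser(description="CPU/NPU classification inference")
    parser.add_argument("--device", choices=["cpu", "npu"], default="cpu",
                        help="where to run the forward pass")
    parser.add_argument("--model", default="./model",
                        help="directory holding model.safetensors")
    parser.add_argument("--image", default="test.jpg", help="input image path")
    parser.add_argument("--output", default="result.json",
                        help="where to write the structured result")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Keeping one script with a `--device` switch means the CPU and NPU runs share the same preprocessing and output code, so any difference in the results comes from the device alone.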

CPU inference log summary:

Loading model: ./model
Model ID: timm/seresnext50_32x4d.racm_in1k
Architecture: seresnext50_32x4d
Using device: cpu
torch: 2.9.0+cpu
timm: 1.0.27
Loaded local safetensors weight: model/model.safetensors
Missing keys: 0, unexpected keys: 0
Input image: test.jpg
Input tensor shape: [1, 3, 224, 224]
Input tensor device: cpu

CPU Top-5 output:

[
  {
    "label_id": 551,
    "label": "class_551",
    "score": 0.08635853230953217
  },
  {
    "label_id": 535,
    "label": "class_535",
    "score": 0.05465596541762352
  },
  {
    "label_id": 746,
    "label": "class_746",
    "score": 0.04685424640774727
  },
  {
    "label_id": 686,
    "label": "class_686",
    "score": 0.03751659393310547
  },
  {
    "label_id": 758,
    "label": "class_758",
    "score": 0.03158804774284363
  }
]

8. NPU Inference

Run:

python inference.py --device npu --model ./model --image test.jpg --output npu_result.json 2>&1 | tee npu_result.txt

NPU inference output files:

npu_result.json
npu_result.txt

NPU inference log summary:

Loading model: ./model
Model ID: timm/seresnext50_32x4d.racm_in1k
Architecture: seresnext50_32x4d
Using device: npu:0
torch: 2.9.0+cpu
timm: 1.0.27
Loaded local safetensors weight: model/model.safetensors
Missing keys: 0, unexpected keys: 0
Input image: test.jpg
Input tensor shape: [1, 3, 224, 224]
Input tensor device: npu:0

NPU Top-5 output:

[
  {
    "label_id": 551,
    "label": "class_551",
    "score": 0.08633249253034592
  },
  {
    "label_id": 535,
    "label": "class_535",
    "score": 0.05474941432476044
  },
  {
    "label_id": 746,
    "label": "class_746",
    "score": 0.04703562334179878
  },
  {
    "label_id": 686,
    "label": "class_686",
    "score": 0.037461500614881516
  },
  {
    "label_id": 758,
    "label": "class_758",
    "score": 0.03154263272881508
  }
]

The NPU inference result screenshot is saved as:

screenshots/npu_result.png

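The CANN owner warning and the following message in the log:

path string is NULLpath string is NULL

are informational messages from the torch-npu / CANN environment. They do not affect the preceding model loading, NPU inference, or result saving; this run produced the NPU classification output successfully.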
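In the NPU log, `--device npu` appears as the device string `npu:0`. One way such a mapping could be written is sketched below. The function name `pick_device` is hypothetical, and the sketch only checks that the torch_npu package is installed; a real script would also import torch_npu and call `torch.npu.is_available()`, as in the environment check in section 4.

```python
import importlib.util


def pick_device(requested):
    """Map a --device value to a torch device string."""
    if requested == "cpu":
        return "cpu"
    if requested == "npu":
        # find_spec only checks that the package is installed; it does not import it.
        if importlib.util.find_spec("torch_npu") is None:
            raise RuntimeError("torch_npu is not installed; cannot use --device npu")
        return "npu:0"  # first NPU card, as in the log line "Using device: npu:0"
    raise ValueError(f"unsupported device: {requested!r}")
```

Failing loudly when torch_npu is missing, rather than falling back to CPU, avoids recording a CPU run as an NPU result.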

9. CPU/NPU Error Comparison

Run:

python compare_cpu_npu.py --cpu cpu_result.json --npu npu_result.json 2>&1 | tee compare_result.txt

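The comparison script reads the CPU and NPU logits outputs and computes:

whether the CPU and NPU Top-1 predictions match;
whether the CPU and NPU Top-K sets match;
the CPU Top-K labels;
the NPU Top-K labels;
the maximum absolute error;
the mean absolute error;
the maximum relative error;
the mean relative error;
whether the result passes the configured threshold.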
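The metrics listed above can be computed as in the following sketch. It is not the project's compare_cpu_npu.py: the function name `compare_logits` is hypothetical, and two details are assumptions because the README does not state them, namely that relative error is measured against the CPU value (with a small `eps` to avoid division by zero) and that the pass check uses the mean relative error.

```python
def compare_logits(cpu, npu, k=5, threshold=0.01, eps=1e-12):
    """Compare two equal-length logit lists and return summary metrics."""
    if len(cpu) != len(npu):
        raise ValueError("logit vectors must have the same length")
    abs_diff = [abs(a - b) for a, b in zip(cpu, npu)]
    # Assumed definition: relative error is measured against the CPU value.
    rel_diff = [d / (abs(a) + eps) for d, a in zip(abs_diff, cpu)]

    def topk(values):
        # Indices of the k largest logits, largest first.
        return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]

    cpu_topk, npu_topk = topk(cpu), topk(npu)
    mean_rel = sum(rel_diff) / len(rel_diff)
    return {
        "same_top1": cpu_topk[0] == npu_topk[0],
        "same_topk_set": set(cpu_topk) == set(npu_topk),
        "cpu_topk": cpu_topk,
        "npu_topk": npu_topk,
        "max_abs_diff": max(abs_diff),
        "mean_abs_diff": sum(abs_diff) / len(abs_diff),
        "max_rel_diff": max(rel_diff),
        "mean_rel_diff": mean_rel,
        "threshold": threshold,
        # Assumption: the pass check uses the mean relative error.
        "passed": mean_rel <= threshold,
    }
```

Relative error grows quickly for logits close to zero, so the maximum relative error can be much larger than the mean even when the absolute differences are small.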

The comparison result is saved as:

compare_result.txt

10. Self-Verification Results

The CPU/NPU error results of this adaptation verification for timm/seresnext50_32x4d.racm_in1k are as follows:

Metric	Result
CPU Top-1	`class_551`
NPU Top-1	`class_551`
CPU/NPU Top-1 match	`true`
CPU/NPU Top-K sets match	`true`
CPU Top-K	`[551, 535, 746, 686, 758]`
NPU Top-K	`[551, 535, 746, 686, 758]`
Maximum absolute error	`0.0049665868282318115`
Mean absolute error	`0.000835765793453902`
Maximum relative error	`0.13039982318878174`
Mean relative error	`0.0008824918186292052`
Threshold	`0.01`
Passed threshold check	`true`

The corresponding content of compare_result.txt is:

{
  "cpu_result": "cpu_result.json",
  "npu_result": "npu_result.json",
  "same_top1": true,
  "same_topk_set": true,
  "cpu_topk": [
    551,
    535,
    746,
    686,
    758
  ],
  "npu_topk": [
    551,
    535,
    746,
    686,
    758
  ],
  "max_abs_diff": 0.0049665868282318115,
  "mean_abs_diff": 0.000835765793453902,
  "max_rel_diff": 0.13039982318878174,
  "mean_rel_diff": 0.0008824918186292052,
  "threshold": 0.01,
  "passed": true
}

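Based on these results, CPU and NPU both predict class_551 as Top-1 and their Top-K sets are identical. The mean relative error is 0.0008824918186292052 and the maximum absolute error is 0.0049665868282318115, both below the configured threshold of 0.01; the maximum relative error, 0.13039982318878174, is the only reported error metric above it. compare_result.txt reports passed: true, so this Ascend NPU inference verification of timm/seresnext50_32x4d.racm_in1k passes.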
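As a quick arithmetic check, the four error metrics copied from compare_result.txt can each be compared with the 0.01 threshold. This sketch only restates the reported numbers; it does not show which metric compare_cpu_npu.py itself uses for its `passed` field, which the README does not specify.

```python
# Values copied from compare_result.txt above.
reported = {
    "max_abs_diff": 0.0049665868282318115,
    "mean_abs_diff": 0.000835765793453902,
    "max_rel_diff": 0.13039982318878174,
    "mean_rel_diff": 0.0008824918186292052,
}
threshold = 0.01

# For each metric, record whether it is at or below the threshold.
within = {name: value <= threshold for name, value in reported.items()}

for name, ok in within.items():
    status = "within" if ok else "above"
    print(f"{name}: {reported[name]:.6f} -> {status} threshold")
```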

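11. Verification Screenshots

11.1 NPU Environment Screenshot

Screenshot: screenshots/npu_env.png

This screenshot shows the Ascend NPU Notebook environment, the output of npu-smi info, and the Python version.

11.2 NPU Inference Result Screenshot

Screenshot: screenshots/npu_result.png

This screenshot shows the NPU inference log, including weight loading, the running device, the test image, the input tensor shape, the output tensor shape, and the Top-5 classification output.

11.3 CPU/NPU Error Comparison Screenshot

Screenshot: screenshots/compare_result.png

This screenshot shows the CPU/NPU output error comparison, including Top-1 consistency, Top-K consistency, maximum absolute error, mean absolute error, maximum relative error, and mean relative error.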

12. Run Logs and Submission Materials

The submission includes:

README.md
adaptation_report.md
download_model.sh
make_test_image.py
inference.py
compare_cpu_npu.py
test_seresnext50.py
requirements.txt
test.jpg
cpu_result.json
cpu_result.txt
npu_result.json
npu_result.txt
compare_result.txt
run.log
outputs/demo_input.png
outputs/top5_result.json
screenshots/npu_env.png
screenshots/npu_result.png
screenshots/compare_result.png

The full run log is in:

run.log

The CPU inference log is in:

cpu_result.txt

The NPU inference log is in:

npu_result.txt

The CPU/NPU error comparison result is in:

compare_result.txt

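13. Adaptation Notes

The adaptation work includes:

installing dependencies in the Ascend NPU environment;
writing the Hugging Face model download script;
downloading and loading the local model.safetensors pretrained weights;
building the seresnext50_32x4d model structure with timm;
writing the test image generation script;
writing a unified CPU/NPU inference script;
supporting test image input and timm image preprocessing;
saving structured classification results for CPU and NPU;
writing the CPU/NPU logits error comparison script;
computing Top-1 consistency, Top-K consistency, maximum absolute error, mean absolute error, maximum relative error, and mean relative error;
producing the full logs, screenshots, and adaptation report.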

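14. Conclusion

This project completed the adaptation and verification of image classification inference with timm/seresnext50_32x4d.racm_in1k on Ascend NPU.

The results show that NPU inference completes pretrained weight loading, image preprocessing, and Top-5 classification output normally. CPU and NPU both predict class_551 as Top-1, their Top-K sets are identical, and the mean relative error of 0.0008824918186292052 is below the configured threshold of 0.01, so the verification passes. The project can serve as submission material for Track 1 model adaptation.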