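smp-hub/mit_b1.imagenet Ascend NPU Adaptation

1. Model Information

Model name: smp-hub/mit_b1.imagenet
Model source: Hugging Face / smp-hub
Model type: image feature extraction model / encoder
Encoder name: mit_b1
Encoder weights: imagenet
Inference framework: PyTorch + segmentation-models-pytorch + torch-npu
Target device: Ascend NPU
Adaptation goal: verify forward inference of smp-hub/mit_b1.imagenet on Ascend NPU and compare its multi-scale features against CPU inference results.

This project targets Track 1 of the Ascend Model-Agent model adaptation competition and verifies the adaptation of smp-hub/mit_b1.imagenet on Ascend NPU. The model is mainly used as the MiT-B1 encoder in segmentation-models-pytorch and outputs multi-scale image features. Built on PyTorch, segmentation-models-pytorch, and torch-npu, the project covers model loading, test image generation, CPU inference, NPU inference, saving the multi-scale feature outputs, a CPU/NPU output consistency comparison, and collection of the verification materials.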

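2. Project Description

smp-hub/mit_b1.imagenet is the ImageNet-pretrained mit_b1 encoder weight set published on Hugging Face under smp-hub. It is mainly used for feature extraction in downstream tasks such as semantic segmentation. This project feeds a fixed test image, test.jpg, through the encoder on both CPU and Ascend NPU and checks whether the multi-scale features from the two devices agree.

The adaptation work focuses on:

installing dependencies in the Ascend NPU Notebook environment;
downloading smp-hub/mit_b1.imagenet from Hugging Face and loading it;
generating the test image test.jpg;
running the encoder forward pass on CPU and on NPU;
saving the multi-scale feature outputs from CPU and NPU;
skipping the input pass-through output and the empty feature output so that only valid encoder features are compared;
comparing the valid CPU and NPU feature vectors for numerical error;
saving logs, screenshots, and the adaptation report for the Track 1 submission.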

3. Project Structure

.
├── README.md
├── adaptation_report.md
├── download_model.sh
├── make_test_image.py
├── inference.py
├── compare_cpu_npu.py
├── requirements.txt
├── test.jpg
├── cpu_result.json
├── cpu_result.txt
├── cpu_infer.log
├── npu_result.json
├── npu_result.txt
├── npu_infer.log
├── compare_result.txt
├── fusion_result.json
└── run.log

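Where:

download_model.sh: model download script;
make_test_image.py: test image generation script;
inference.py: CPU/NPU inference script;
compare_cpu_npu.py: CPU/NPU feature error comparison script;
cpu_result.json: structured CPU inference result;
cpu_result.txt: CPU inference log (console output captured with tee);
cpu_infer.log: supplementary CPU inference log;
npu_result.json: structured NPU inference result;
npu_result.txt: NPU inference log (console output captured with tee);
npu_infer.log: supplementary NPU inference log;
compare_result.txt: CPU/NPU error comparison result;
fusion_result.json: aggregated result file;
run.log: complete run log;
adaptation_report.md: adaptation report.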

4. Environment Check

Run the following commands in the Ascend NPU Notebook to check the runtime environment:

npu-smi info
python --version
python - <<'PY'
import torch
print("torch:", torch.__version__)

try:
    import torch_npu
    print("torch_npu import success")
    print("npu available:", torch.npu.is_available())
except Exception as e:
    print("torch_npu import failed:", repr(e))
PY

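Save the environment check screenshot as:

screenshots/npu_env.png

This screenshot shows that an Ascend NPU is present in the runtime environment and records the NPU model, its running status, and the Python version.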

5. Model Download

Run:

bash download_model.sh

The model is hosted on Hugging Face:

smp-hub/mit_b1.imagenet

The model is loaded with the following settings:

model: smp-hub/mit_b1.imagenet
encoder_name: mit_b1
encoder_weights: imagenet

Once the model files are downloaded, the inference script loads the MiT-B1 encoder and runs the forward pass on CPU or NPU.

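6. Test Input

The project uses the test image:

test.jpg

The inference pipeline is:

read the test image;
apply the preprocessing the model requires;
build the input tensor;
feed it to the MiT-B1 encoder;
collect the multi-scale feature outputs;
skip the input pass-through output and the empty feature output;
flatten and concatenate the valid multi-scale features for the error comparison.

The input tensor shape is: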

[1, 3, 224, 224]

7. CPU Inference

Run:

python inference.py --device cpu --image test.jpg --output cpu_result.json 2>&1 | tee cpu_result.txt

CPU inference output files:

cpu_result.json
cpu_result.txt
cpu_infer.log

Summary of the CPU inference log:

model: smp-hub/mit_b1.imagenet
model_repo: smp-hub/mit_b1.imagenet
encoder_name: mit_b1
encoder_weights: imagenet
device: cpu
image: test.jpg
input_shape: [1, 3, 224, 224]
elapsed_seconds: 0.318371

Number of raw outputs on CPU:

num_outputs: 6

Shapes of all CPU outputs:

output[0]: [1, 3, 224, 224]
output[1]: [1, 0, 112, 112]
output[2]: [1, 64, 56, 56]
output[3]: [1, 128, 28, 28]
output[4]: [1, 320, 14, 14]
output[5]: [1, 512, 7, 7]

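output[0] is the input passed through unchanged and output[1] is an empty feature (its channel dimension is 0), so both are skipped in the comparison: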

skipped_output_indices: [0, 1]
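
The skip rule can be written as a small filter over the output shapes. The sketch below is an assumption about how the rule could be expressed, not the code in inference.py: `select_valid_outputs` is a hypothetical helper that drops any output whose shape equals the input shape (the pass-through) or that contains no elements (the empty feature).

```python
from math import prod


def select_valid_outputs(input_shape, output_shapes):
    """Return (kept_indices, skipped_indices) for a list of output shapes.

    Hypothetical helper: an output is skipped when it has the same shape as
    the input (treated as a pass-through) or contains no elements.
    """
    kept, skipped = [], []
    for index, shape in enumerate(output_shapes):
        is_passthrough = list(shape) == list(input_shape)
        is_empty = prod(shape) == 0
        if is_passthrough or is_empty:
            skipped.append(index)
        else:
            kept.append(index)
    return kept, skipped


# Shapes copied from the inference log in this section.
input_shape = [1, 3, 224, 224]
output_shapes = [
    [1, 3, 224, 224],
    [1, 0, 112, 112],
    [1, 64, 56, 56],
    [1, 128, 28, 28],
    [1, 320, 14, 14],
    [1, 512, 7, 7],
]

kept, skipped = select_valid_outputs(input_shape, output_shapes)
print("skipped_output_indices:", skipped)
print("kept_output_indices:", kept)
```

Matching on shape alone is a heuristic. A stricter check would also compare output[0] with the input tensor element by element.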

Shapes of the CPU multi-scale features that actually take part in the comparison:

compared_output[0]: [1, 64, 56, 56]
compared_output[1]: [1, 128, 28, 28]
compared_output[2]: [1, 320, 14, 14]
compared_output[3]: [1, 512, 7, 7]

Length of the concatenated CPU feature vector:

feature_vector_length: 388864
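
The length follows directly from the four compared shapes. The short check below assumes each feature map is flattened and the results are concatenated in order; the variable names are illustrative.

```python
from math import prod

# Shapes of the four valid encoder features listed above.
compared_shapes = [
    [1, 64, 56, 56],
    [1, 128, 28, 28],
    [1, 320, 14, 14],
    [1, 512, 7, 7],
]

# Element count of each flattened feature map, then the concatenated total.
per_stage = [prod(shape) for shape in compared_shapes]
feature_vector_length = sum(per_stage)

print(per_stage)              # [200704, 100352, 62720, 25088]
print(feature_vector_length)  # 388864
```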

CPU feature statistics:

shape: [388864]
numel: 388864
min: -10.98579502
max: 8.91627693
mean: -0.01094125
std: 0.54648173
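
Statistics of this kind can be computed from the flattened feature vector as sketched below. `feature_stats` is a hypothetical helper, and the README does not say whether the reported std is the population or the sample standard deviation, so the sketch exposes both through an `unbiased` flag.

```python
import math


def feature_stats(values, unbiased=False):
    """Summarize a flat list of floats: shape, numel, min, max, mean, std.

    With unbiased=False the population standard deviation is used
    (divide by n); with unbiased=True the sample version (divide by n - 1).
    """
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    denom = n - 1 if unbiased else n
    return {
        "shape": [n],
        "numel": n,
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": math.sqrt(squared_dev / denom),
    }


# Tiny illustrative vector, not data from the actual run.
stats = feature_stats([1.0, 2.0, 3.0, 4.0])
for key, value in stats.items():
    print(f"{key}: {value}")
```

For a vector of 388864 elements the two std variants differ only far beyond the printed precision, so the choice does not affect the values shown in the log.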

8. NPU Inference

Run:

python inference.py --device npu --image test.jpg --output npu_result.json 2>&1 | tee npu_result.txt

NPU inference output files:

npu_result.json
npu_result.txt
npu_infer.log

Summary of the NPU inference log:

model: smp-hub/mit_b1.imagenet
model_repo: smp-hub/mit_b1.imagenet
encoder_name: mit_b1
encoder_weights: imagenet
device: npu
image: test.jpg
input_shape: [1, 3, 224, 224]
elapsed_seconds: 17.324281
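
The log does not say whether elapsed_seconds covers a single first call or an average over several runs. If the two need to be told apart, a first call can be timed separately from later calls, as in the sketch below. `time_first_and_steady` and `dummy_forward` are hypothetical names; this is not the timing code in inference.py.

```python
import time


def time_first_and_steady(fn, repeats=3):
    """Time the first call to fn separately from the mean of later calls."""
    start = time.perf_counter()
    fn()
    first_seconds = time.perf_counter() - start

    later = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        later.append(time.perf_counter() - start)
    return first_seconds, sum(later) / len(later)


def dummy_forward():
    # Stand-in workload; in practice this would wrap the encoder forward pass.
    return sum(i * i for i in range(10_000))


first_seconds, steady_seconds = time_first_and_steady(dummy_forward)
print(f"first_call_seconds: {first_seconds:.6f}")
print(f"steady_state_seconds: {steady_seconds:.6f}")
```

On a device that executes asynchronously, the wrapped callable should also wait for the device to finish before returning; otherwise the measured time covers only the launch.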

Number of raw outputs on NPU:

num_outputs: 6

Shapes of all NPU outputs:

output[0]: [1, 3, 224, 224]
output[1]: [1, 0, 112, 112]
output[2]: [1, 64, 56, 56]
output[3]: [1, 128, 28, 28]
output[4]: [1, 320, 14, 14]
output[5]: [1, 512, 7, 7]

Shapes of the NPU multi-scale features that actually take part in the comparison:

compared_output[0]: [1, 64, 56, 56]
compared_output[1]: [1, 128, 28, 28]
compared_output[2]: [1, 320, 14, 14]
compared_output[3]: [1, 512, 7, 7]

Length of the concatenated NPU feature vector:

feature_vector_length: 388864

NPU feature statistics:

shape: [388864]
numel: 388864
min: -10.99114037
max: 8.90920639
mean: -0.01094007
std: 0.54658347

Save the NPU inference result screenshot as:

screenshots/npu_result.png

9. CPU/NPU Error Comparison

Run:

python compare_cpu_npu.py --cpu cpu_result.json --npu npu_result.json 2>&1 | tee compare_result.txt

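The comparison script reads the valid multi-scale feature outputs from CPU and NPU and reports:

CPU input shape;
NPU input shape;
CPU valid feature output shapes;
NPU valid feature output shapes;
length of the concatenated feature vector;
maximum absolute error;
mean absolute error;
RMSE;
maximum relative error;
mean relative error;
cosine similarity;
whether the verification passed.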
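
The error metrics in this list can be sketched in plain Python as follows. This is an illustration, not the contents of compare_cpu_npu.py: `compare_vectors` is a hypothetical helper, and the element-wise relative error |cpu - npu| / (|cpu| + eps) used here is an assumption, since the README does not state how the script defines relative error or which threshold decides `passed`.

```python
import math


def compare_vectors(cpu, npu, eps=1e-12):
    """Compute error metrics between two equal-length flat float lists."""
    if len(cpu) != len(npu):
        raise ValueError("feature vectors must have the same length")

    n = len(cpu)
    abs_err = [abs(c - d) for c, d in zip(cpu, npu)]
    # Assumed definition: element-wise error relative to the CPU value.
    rel_err = [e / (abs(c) + eps) for e, c in zip(abs_err, cpu)]

    dot = sum(c * d for c, d in zip(cpu, npu))
    norm_cpu = math.sqrt(sum(c * c for c in cpu))
    norm_npu = math.sqrt(sum(d * d for d in npu))

    return {
        "max_abs_error": max(abs_err),
        "mean_abs_error": sum(abs_err) / n,
        "rmse": math.sqrt(sum(e * e for e in abs_err) / n),
        "max_relative_error": max(rel_err),
        "mean_relative_error": sum(rel_err) / n,
        "cosine_similarity": dot / (norm_cpu * norm_npu + eps),
    }


# Tiny illustrative vectors, not data from the actual run.
metrics = compare_vectors([1.0, 2.0, -4.0], [1.0, 2.5, -4.0])
for name, value in metrics.items():
    print(f"{name}: {value:.10f}")
```

With real data, the two inputs would be the flattened, concatenated valid features loaded from cpu_result.json and npu_result.json.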

The comparison result is saved as:

compare_result.txt

10. Self-Verification Results

The CPU/NPU error results for this smp-hub/mit_b1.imagenet adaptation are:

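| Metric | Result |
| --- | --- |
| CPU input shape | `[1, 3, 224, 224]` |
| NPU input shape | `[1, 3, 224, 224]` |
| CPU valid output shapes | `[[1, 64, 56, 56], [1, 128, 28, 28], [1, 320, 14, 14], [1, 512, 7, 7]]` |
| NPU valid output shapes | `[[1, 64, 56, 56], [1, 128, 28, 28], [1, 320, 14, 14], [1, 512, 7, 7]]` |
| Feature vector length | `388864` |
| Maximum absolute error | `0.0224874020` |
| Mean absolute error | `0.0005130929` |
| RMSE | `0.0011662909` |
| Maximum relative error | `0.0020469526` |
| Mean relative error | `0.0019247755` |
| Cosine similarity | `0.9999977413` |
| Verification passed | `True` |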

The corresponding compare_result.txt reads:

CPU/NPU comparison result
==================================================
model: smp-hub/mit_b1.imagenet
encoder_name: mit_b1
encoder_weights: imagenet
cpu_result: cpu_result.json
npu_result: npu_result.json

cpu_input_shape: [1, 3, 224, 224]
npu_input_shape: [1, 3, 224, 224]

cpu_compared_output_shapes: [[1, 64, 56, 56], [1, 128, 28, 28], [1, 320, 14, 14], [1, 512, 7, 7]]
npu_compared_output_shapes: [[1, 64, 56, 56], [1, 128, 28, 28], [1, 320, 14, 14], [1, 512, 7, 7]]

feature_vector_shape: (388864,)

max_abs_error: 0.0224874020
mean_abs_error: 0.0005130929
rmse: 0.0011662909
max_relative_error: 0.0020469526
mean_relative_error: 0.0019247755
cosine_similarity: 0.9999977413

passed: True

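These results show that the valid multi-scale features from CPU and NPU have identical shapes and that both concatenated feature vectors have length 388864. The maximum absolute error is 0.0224874020, the mean absolute error is 0.0005130929, and the cosine similarity is 0.9999977413; the script reports passed: True. The Ascend NPU inference verification of smp-hub/mit_b1.imagenet therefore passes.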

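11. Verification Screenshots

11.1 NPU Environment Screenshot

screenshots/npu_env.png

This screenshot shows the Ascend NPU Notebook environment, the npu-smi info output, and the Python version.

11.2 NPU Inference Result Screenshot

screenshots/npu_result.png

This screenshot shows the NPU inference log, including the model name, device, input image, input tensor shape, multi-scale output shapes, and feature statistics.

11.3 CPU/NPU Error Comparison Screenshot

screenshots/compare_result.png

This screenshot shows the CPU/NPU multi-scale feature error comparison, including the valid output shapes, maximum absolute error, mean absolute error, RMSE, relative errors, and cosine similarity.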

12. Run Logs and Submission Materials

The submission consists of:

README.md
adaptation_report.md
download_model.sh
make_test_image.py
inference.py
compare_cpu_npu.py
requirements.txt
test.jpg
cpu_result.json
cpu_result.txt
cpu_infer.log
npu_result.json
npu_result.txt
npu_infer.log
compare_result.txt
fusion_result.json
run.log
screenshots/npu_env.png
screenshots/npu_result.png
screenshots/compare_result.png

The complete run log is in:

run.log

The CPU inference logs are in:

cpu_result.txt
cpu_infer.log

The NPU inference logs are in:

npu_result.txt
npu_infer.log

The CPU/NPU error comparison result is in:

compare_result.txt

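13. Adaptation Notes

The adaptation work in this project covers:

installing dependencies in the Ascend NPU environment;
writing the Hugging Face / smp-hub model download script;
loading the smp-hub/mit_b1.imagenet model;
writing the test image generation script;
writing a single inference script shared by CPU and NPU;
supporting test image input and the model's image preprocessing;
saving the multi-scale encoder feature outputs from CPU and NPU;
skipping the input pass-through output and the empty feature output so that only valid encoder features are compared;
writing the CPU/NPU feature error comparison script;
computing maximum absolute error, mean absolute error, RMSE, relative error, and cosine similarity;
producing the complete logs, screenshots, and adaptation report.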

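14. Conclusion

This project completed the adaptation and verification of encoder feature extraction inference for smp-hub/mit_b1.imagenet on Ascend NPU.

The results show that NPU inference completes model loading, image preprocessing, and multi-scale feature output without errors. The valid CPU and NPU outputs have identical shapes, both feature vectors have length 388864, the mean absolute error is 0.0005130929, the cosine similarity is 0.9999977413, and the verification passes. The project can serve as the Track 1 model adaptation submission.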