timm/tf_efficientnet_lite2.in1k on Ascend NPU

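1. Introduction

This project adapts the timm/tf_efficientnet_lite2.in1k image classification model from ModelScope to run on the Huawei Ascend NPU (Ascend910). The model is based on the EfficientNet architecture, takes 260x260 input, and outputs classification probabilities over the 1000 ImageNet classes.

Original model: https://modelscope.cn/models/timm/tf_efficientnet_lite2.in1k
Parameters: 6,092,072
Task: image classification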

2. Verification environment

NPU: Ascend910
CANN: 8.5.1
PyTorch: 2.x
torch_npu: available
timm: 1.0.27
modelscope: 1.35.3

The environment check log is in logs/env_check.log.
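
As an illustration of how the Python package versions listed above could be collected, here is a minimal sketch using only the standard library. The helper names `installed_version` and `environment_report` are hypothetical and are not taken from this project's scripts. The CANN version and the NPU model are not Python packages, so they would have to be read by other means (for example `npu-smi info` for the device).

```python
from importlib import metadata

# Packages listed in the verification environment above.
PACKAGES = ("torch", "torch_npu", "timm", "modelscope")


def installed_version(package):
    """Return the installed version string, or None when the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=PACKAGES):
    """Map each package name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


for name, version in environment_report().items():
    print(f"{name}: {version or 'not installed'}")
```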

3. Running inference

pip install -r requirements.txt
python inference.py

Inference result (NPU):

=== NPU Inference Result ===
Input shape: [1, 3, 260, 260]
Output shape: [1, 1000]
Top-5 predictions:
  1. class 978: 0.1030
  2. class 975: 0.0805
  3. class 538: 0.0475
  4. class 972: 0.0458
  5. class 668: 0.0428

Raw logits (first 10): [0.26419690251350403, -0.4944930374622345, -0.4740123748779297, -1.1688908338546753, -0.6446986794471741, -0.29545584321022034, -1.731345295906067, -0.727383553981781, -1.3421813249588013, -1.7245378494262695]

The log is saved to logs/inference.log.
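
The "Top-5 predictions" values are probabilities, while the "Raw logits" line shows unnormalized scores. A minimal sketch of how one is typically derived from the other, assuming a standard softmax followed by a sort (the source does not show what inference.py actually does); the function names and the toy logits are illustrative only:

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities; the max is subtracted for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy 4-class logits (illustrative values, not taken from the model).
example_logits = [2.0, 1.0, 0.1, -1.0]
example_probs = softmax(example_logits)
for rank, (cls, prob) in enumerate(top_k(example_probs, k=3), start=1):
    print(f"  {rank}. class {cls}: {prob:.4f}")
```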

4. Accuracy verification

CPU and NPU outputs were compared for consistency on a single test input:

Metric	Value
max_abs_error	0.008945
mean_abs_error	0.001646
relative_error	0.1967%
cosine_similarity	0.999999
threshold	1.0%
Result	PASS
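
The metrics in the table can be computed from two output vectors as in the sketch below. The source does not define relative_error or say which metric the threshold applies to, so two assumptions are made here: relative_error is the L2 norm of the difference divided by the L2 norm of the CPU reference, and the check passes when that ratio is below the threshold. The project's own script may use different definitions.

```python
import math


def consistency_metrics(reference, candidate):
    """Compare two equal-length output vectors (e.g. CPU vs NPU logits)."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [abs(c - r) for r, c in zip(reference, candidate)]
    ref_norm = math.sqrt(sum(r * r for r in reference))
    cand_norm = math.sqrt(sum(c * c for c in candidate))
    if ref_norm == 0.0 or cand_norm == 0.0:
        raise ValueError("zero-norm vector; relative error is undefined")
    diff_norm = math.sqrt(sum(d * d for d in diffs))
    dot = sum(r * c for r, c in zip(reference, candidate))
    return {
        "max_abs_error": max(diffs),
        "mean_abs_error": sum(diffs) / len(diffs),
        # Assumed definition: ||candidate - reference||_2 / ||reference||_2
        "relative_error": diff_norm / ref_norm,
        "cosine_similarity": dot / (ref_norm * cand_norm),
    }


def passes(metrics, threshold=0.01):
    """Assumed pass rule: relative error below the threshold (1.0% by default)."""
    return metrics["relative_error"] < threshold


# Toy vectors standing in for CPU and NPU outputs (not real model data).
cpu_out = [1.0, 2.0, 2.0]
npu_out = [1.0, 2.0, 2.003]
metrics = consistency_metrics(cpu_out, npu_out)
print(metrics, "PASS" if passes(metrics) else "FAIL")
```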

5. Performance reference

Single-input inference performance on the Ascend910 NPU (input size 1x3x260x260):

Metric	Value
avg_latency	8.835 ms
min_latency	7.636 ms
max_latency	10.015 ms
p50_latency	9.821 ms
p90_latency	10.015 ms
p95_latency	10.015 ms
throughput	113.19 infer/s

The log is saved to logs/benchmark.log.
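
The reported throughput is consistent with 1000 / avg_latency (1000 / 8.835 ≈ 113.19 infer/s). The sketch below shows one way such a summary could be produced. It assumes a nearest-rank percentile and a hypothetical `benchmark` helper, neither of which is confirmed by the source. When timing a real NPU model, the device would typically need to be synchronized (for example with `torch.npu.synchronize()`) before each clock read, because kernels run asynchronously.

```python
import math
import time


def benchmark(fn, warmup=3, iters=10):
    """Call fn() repeatedly and return per-call latencies in milliseconds."""
    for _ in range(warmup):
        fn()
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list, with pct in (0, 100]."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(latencies_ms):
    """Compute the same fields as the table above from a list of latencies."""
    ordered = sorted(latencies_ms)
    avg = sum(ordered) / len(ordered)
    return {
        "avg_latency": avg,
        "min_latency": ordered[0],
        "max_latency": ordered[-1],
        "p50_latency": percentile(ordered, 50),
        "p90_latency": percentile(ordered, 90),
        "p95_latency": percentile(ordered, 95),
        "throughput": 1000.0 / avg,  # inferences per second
    }


# Toy latencies in milliseconds (not measured on any device).
stats = summarize([8.0, 10.0, 9.0, 12.0, 11.0])
print(stats)
```

With only a handful of timed iterations, nearest-rank p90 and p95 both fall on the largest sample, which is one possible reason for p90, p95, and max being identical in the table.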

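6. Notes on accuracy evaluation

This project includes only a single-input smoke consistency check, not an evaluation on the official full validation set. See Section 4 for the detailed metrics.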

7. Self-verification screenshot

See screenshots/self_verification.png and screenshots/self_verification.txt.

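8. Log files

File	Contents
`logs/env_check.log`	NPU environment check results
`logs/paths.txt`	Record of model download paths
`logs/inference.log`	NPU inference output
`logs/accuracy.log`	CPU-NPU accuracy consistency check
`logs/benchmark.log`	NPU performance benchmark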

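9. Notes

Weights are downloaded with ModelScope snapshot_download and loaded from the local copy; nothing depends on automatic downloads from HuggingFace.
The inference script loads the local weights with pretrained=False plus load_state_dict.
Do not commit weight files (.bin, .safetensors, .pth, etc.) to the Git repository.
The timm data preprocessing configuration is resolved automatically via timm.data.resolve_model_data_config.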
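
The loading flow described in these notes could look roughly like the sketch below. It is not the project's inference.py. The weight file names in `CANDIDATE_WEIGHT_FILES`, the helper names, and the `npu:0` device string are assumptions; check the downloaded snapshot for the actual file name. Heavy imports are done inside the function so the module can be imported on a machine without the NPU stack.

```python
from pathlib import Path

# Assumed weight file names; the actual snapshot may use a different one.
CANDIDATE_WEIGHT_FILES = ("model.safetensors", "pytorch_model.bin")


def find_weight_file(model_dir):
    """Return the first candidate weight file found in model_dir, or None."""
    for name in CANDIDATE_WEIGHT_FILES:
        path = Path(model_dir) / name
        if path.is_file():
            return path
    return None


def load_local_model(model_id="timm/tf_efficientnet_lite2.in1k", device="npu:0"):
    """Download via ModelScope, build the architecture, load local weights."""
    import torch
    import torch_npu  # noqa: F401  (registers the "npu" device with PyTorch)
    import timm
    from modelscope import snapshot_download

    model_dir = snapshot_download(model_id)
    weight_file = find_weight_file(model_dir)
    if weight_file is None:
        raise FileNotFoundError(f"no known weight file in {model_dir}")

    if weight_file.suffix == ".safetensors":
        from safetensors.torch import load_file
        state_dict = load_file(str(weight_file))
    else:
        state_dict = torch.load(weight_file, map_location="cpu")

    # pretrained=False keeps timm from fetching weights on its own.
    model = timm.create_model("tf_efficientnet_lite2", pretrained=False)
    model.load_state_dict(state_dict)
    model = model.eval().to(device)

    # Preprocessing settings are resolved from the model's own config.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)
    return model, transform
```

On a machine with the NPU stack installed, `model, transform = load_local_model()` would return the model in eval mode on the device together with the matching preprocessing pipeline.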

10. Tags

#NPU #Ascend #Ascend910 #timm #EfficientNet #ImageClassification
