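lamhalobotnet50ts_256 is an image classification model from timm (PyTorch Image Models), pretrained on the ImageNet-1K dataset. It combines Lambda layers, Halo layers, and the BotNet architecture, and was trained on 256x256 inputs.
It is a standard PyTorch image classification model: the pretrained weights are loaded through the timm library, and the model runs directly on an Ascend910 NPU.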
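
Install the dependencies (modelscope is included because the download step below imports it):

    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch torch_npu timm Pillow numpy safetensors modelscope

Download the model weights:

    from modelscope.hub.snapshot_download import snapshot_download
    model_dir = snapshot_download('timm/lamhalobotnet50ts_256.a1h_in1k')

Inference example:

    import torch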
    import torch_npu  # registers the 'npu' device with PyTorch
    from timm import create_model
    from safetensors.torch import load_file
    from PIL import Image
    from timm.data import create_transform, resolve_data_config

    # Build the architecture without downloading weights, then load the local checkpoint.
    model = create_model('lamhalobotnet50ts_256.a1h_in1k', pretrained=False)
    model.eval()
    state_dict = load_file('model.safetensors')  # adjust the path if the file lives in the downloaded model directory
    model.load_state_dict(state_dict, strict=False)
    model = model.to('npu:0')

    # Preprocess the image with the model's own mean, std, and interpolation settings.
    img = Image.open('test.jpg').convert('RGB')
    cfg = resolve_data_config({}, model=model)  # reuse the loaded model instead of building a second one
    transform = create_transform(input_size=256, is_training=False,
                                 mean=cfg.get('mean'), std=cfg.get('std'),
                                 interpolation=cfg.get('interpolation', 'bicubic'))
    input_tensor = transform(img).unsqueeze(0).to('npu:0')

    # Run inference and print the five most probable classes.
    with torch.no_grad():
        output = model(input_tensor)
    probs = torch.nn.functional.softmax(output[0].cpu(), dim=0)
    top5 = torch.topk(probs, k=5)
    for i in range(5):
        print(f'Top {i+1}: class={top5.indices[i].item()}, prob={top5.values[i].item():.6f}')

CPU inference:

    python3 inference.py --device cpu

NPU inference:

    python3 inference.py --device npu

CPU/NPU accuracy comparison:

    python3 compare_cpu_npu.py

Sample NPU inference output:

    Loading lamhalobotnet50ts_256 on npu:0...
    === lamhalobotnet50ts_256 Inference on npu:0 ===
    Inference time: 0.2185s
    Top 1: class=22, prob=0.093029
    Top 2: class=21, prob=0.092194
    Top 3: class=23, prob=0.059201
    Top 4: class=128, prob=0.047318
    Top 5: class=127, prob=0.045120
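
The commands above pass a `--device` flag, while the inline example hard-codes `'npu:0'`. The source of `inference.py` is not shown here, so the following is only a minimal sketch of how such a flag might be mapped to a PyTorch device string. The function name `parse_device` and the choice of `npu` as the default are assumptions made for illustration.

```python
import argparse


def parse_device(argv=None):
    """Parse a --device flag and return a torch-style device string."""
    parser = argparse.ArgumentParser(description="Run inference on CPU or NPU")
    parser.add_argument("--device", choices=["cpu", "npu"], default="npu",
                        help="target device for inference")
    args = parser.parse_args(argv)
    # Assumption: a single NPU is addressed as 'npu:0', as in the inline example.
    return "npu:0" if args.device == "npu" else "cpu"
```

The returned string could then replace the literal `'npu:0'` in both `.to(...)` calls, so the same script serves the CPU and NPU runs.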

Accuracy metrics (CPU vs. NPU):

| Metric | Value |
|---|---|
| MAE (mean absolute error) | 0.00062322 |
| MaxAbsErr (maximum absolute error) | 0.00296104 |
| Cosine similarity | 0.99999972 |
| Mean relative error | 0.387615% |
| Top-1 prediction match | Yes (CPU=22, NPU=22) |
| Top-5 overlap | 5/5 |
| Max probability difference | 0.004033% |
| Top-1 probability relative error | 0.017959% |
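
The source of `compare_cpu_npu.py` is not reproduced here, so the exact definitions behind these metrics are unknown. As a rough sketch, metrics of this kind could be computed as below, using plain Python lists in place of tensors. The function names, the `eps` guard, and the choice to take relative error against the CPU value are all assumptions.

```python
import math


def compare_outputs(cpu, npu, eps=1e-12):
    """Compare two equal-length lists of floats (e.g. logits or probabilities)."""
    if len(cpu) != len(npu) or not cpu:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_err = [abs(c - n) for c, n in zip(cpu, npu)]
    dot = sum(c * n for c, n in zip(cpu, npu))
    norm_cpu = math.sqrt(sum(c * c for c in cpu))
    norm_npu = math.sqrt(sum(n * n for n in npu))
    return {
        "mae": sum(abs_err) / len(abs_err),
        "max_abs_err": max(abs_err),
        "cosine_similarity": dot / (norm_cpu * norm_npu + eps),
        # Relative error is taken against the CPU value; eps avoids division by zero.
        "mean_rel_err": sum(e / (abs(c) + eps) for e, c in zip(abs_err, cpu)) / len(abs_err),
    }


def top_k(values, k):
    """Return the indices of the k largest values, largest first."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


def top_k_agreement(cpu, npu, k=5):
    """Return (top-1 match, number of shared top-k indices)."""
    cpu_top, npu_top = top_k(cpu, k), top_k(npu, k)
    return cpu_top[0] == npu_top[0], len(set(cpu_top) & set(npu_top))
```

With real model outputs, `cpu` and `npu` would be the flattened output vectors of the two runs, for example obtained with `output[0].cpu().tolist()`.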

Top-5 probability comparison:

| Class | CPU Prob | NPU Prob | Difference |
|---|---|---|---|
| 21 | 0.092180 | 0.092194 | 0.00001395 |
| 22 | 0.093012 | 0.093029 | 0.00001644 |
| 23 | 0.059165 | 0.059201 | 0.00003644 |
| 127 | 0.045081 | 0.045120 | 0.00003870 |
| 128 | 0.047325 | 0.047318 | 0.00000708 |
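
The maximum probability difference between NPU and CPU inference is 0.0040%, which meets the requirement that the accuracy error stay below 1%.
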
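
The pass/fail check can be illustrated with the rounded top-5 probabilities from the table above. This sketch assumes that the reported percentage is the absolute probability gap multiplied by 100 (percentage points). Because the table values are rounded and cover only five classes, the result need not equal the reported 0.004033% exactly. The names `max_prob_diff_percent` and `THRESHOLD_PERCENT` are illustrative.

```python
# Rounded top-5 probabilities copied from the table above: class -> (CPU, NPU).
top5 = {
    21: (0.092180, 0.092194),
    22: (0.093012, 0.093029),
    23: (0.059165, 0.059201),
    127: (0.045081, 0.045120),
    128: (0.047325, 0.047318),
}


def max_prob_diff_percent(pairs):
    """Largest |CPU - NPU| probability gap, expressed in percentage points."""
    return max(abs(cpu - npu) for cpu, npu in pairs.values()) * 100.0


THRESHOLD_PERCENT = 1.0  # the "below 1%" requirement stated above
diff = max_prob_diff_percent(top5)
print(f"max top-5 difference: {diff:.4f}% (pass={diff < THRESHOLD_PERCENT})")
```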

Inference performance:

| Device | Inference time | Speedup |
|---|---|---|
| CPU | 0.3023s | 1.00x (baseline) |
| NPU (Ascend910) | 0.2042s | 1.48x |
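
The speedup column is the CPU time divided by the NPU time. How the times themselves were measured is not stated, so the timing helper below is only one plausible approach (warm-up runs followed by an averaged wall-clock measurement). The helper names are assumptions.

```python
import time


def time_call(fn, warmup=1, repeats=5):
    """Return the mean wall-clock seconds per call of fn() after warm-up runs."""
    for _ in range(warmup):
        fn()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Times taken from the table above.
print(f"{speedup(0.3023, 0.2042):.2f}x")  # prints 1.48x
```

On an accelerator that executes asynchronously, the timed callable would also need to wait for the device to finish before returning, otherwise the measurement covers only the kernel launch.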

The following log shows the key information from a successful NPU inference run:

    Input shape: torch.Size([1, 3, 256, 256])
    Top-1 Match: True (CPU=22, NPU=22)
    Top-5 Overlap: 5/5
    --- Top-5 Probability Comparison ---
    Top-1 Probability Relative Error: 0.017959%
    Top-1 Prediction: MATCH (CPU=22, NPU=22)