test_convnext2.r160_in1k-NPU

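Model Introduction

test_convnext2.r160_in1k is a lightweight ConvNeXt image classification model with 0.48M parameters and a 160x160 input size. It is intended mainly for testing and validating the ConvNeXt architecture, and it was trained on the ImageNet-1k dataset.

This repository provides the adaptation code and inference results for running the model on the Huawei Ascend NPU (Ascend910).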

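Model Information

Property	Value
Model name	test_convnext2.r160_in1k
Original model source	ModelScope
Task type	Image Classification
Model architecture	ConvNeXt
Framework	PyTorch + timm
Input format	RGB image, 160×160
Output format	Classification logits over the 1000 ImageNet classes
Parameters	0.48M
Dataset	ImageNet-1k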

Dependencies

Component	Version
Python	3.11.x
PyTorch	2.9.0
torch_npu	2.9.0.post1
timm	1.0.27
modelscope	1.35.3
CANN	8.5.1
NPU	Ascend910 (64GB)

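NPU Adaptation Notes

This is a standard timm ConvNeXt image classification model, so it runs on the Ascend NPU without any changes to the model code. The adaptation consists of the following steps:

Download the model weights through ModelScope
Load the model with timm and run inference
Move the model to the NPU device
Compare the CPU and NPU inference results to validate accuracy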

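Environment Setup

Install dependencies

pip install -r requirements.txt

Download the model

The model has already been downloaded automatically to a local directory through ModelScope.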

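Inference Commands

Run inference

python3 inference.py

This script runs inference on both the CPU and the NPU, then prints the Top-5 classification results and the inference time.

CPU/NPU accuracy comparison

python3 compare_cpu_npu.py

This script compares the CPU and NPU inference results. It reports the logits difference, the probability difference, Top-5 agreement, and cosine similarity.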
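
The source of compare_cpu_npu.py is not shown here, so the following is only a minimal sketch of how the listed metrics could be computed. It uses plain Python lists rather than tensors so it runs without an NPU. The function name `compare_outputs` and its return keys are hypothetical and are not taken from the repository.

```python
import math


def compare_outputs(cpu_logits, npu_logits, k=5):
    """Compare two equal-length logit vectors (hypothetical helper)."""
    if len(cpu_logits) != len(npu_logits):
        raise ValueError("logit vectors must have the same length")

    # Element-wise absolute differences.
    diffs = [abs(a - b) for a, b in zip(cpu_logits, npu_logits)]

    # Cosine similarity between the two vectors.
    dot = sum(a * b for a, b in zip(cpu_logits, npu_logits))
    norm_cpu = math.sqrt(sum(a * a for a in cpu_logits))
    norm_npu = math.sqrt(sum(b * b for b in npu_logits))

    def top_k(values):
        order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        return order[:k]

    # Rank-by-rank agreement of the top-k class indices.
    cpu_top, npu_top = top_k(cpu_logits), top_k(npu_logits)
    matches = sum(1 for c, n in zip(cpu_top, npu_top) if c == n)

    return {
        "max_abs_diff": max(diffs),
        "mean_abs_diff": sum(diffs) / len(diffs),
        "cosine_similarity": dot / (norm_cpu * norm_npu),
        "topk_agreement": matches / len(cpu_top),
    }


metrics = compare_outputs(
    [2.0, 1.0, 0.5, -1.0, 0.0, 3.0],
    [2.001, 0.999, 0.5, -1.0, 0.0, 3.002],
)
print(metrics)
```

The same function can be applied to softmax probabilities instead of logits to obtain the probability differences.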

Inference Results

Classification results

Inference on the test image test/test_owl.jpg:

Rank	CPU class index	CPU probability	NPU class index	NPU probability	Match
1	24	40.47%	24	40.40%	✓
2	82	5.93%	82	5.95%	✓
3	81	3.65%	81	3.66%	✓
4	316	2.81%	316	2.82%	✓
5	377	2.12%	377	2.12%	✓
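
For readers who want to see how a table like this is derived from raw logits, here is a small sketch in pure Python. It applies a numerically stable softmax and then selects the k largest probabilities. The helper names `softmax` and `top_k` and the sample logits are illustrative assumptions, not code or values from this repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return (class_index, probability) pairs for the k largest entries."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order[:k]]


# Made-up logits for a 6-class toy problem.
logits = [0.1, 2.0, -1.0, 0.5, 1.2, 0.0]
for rank, (idx, p) in enumerate(top_k(softmax(logits)), start=1):
    print(f"{rank}\t{idx}\t{p:.2%}")
```

With a real model, the logits would be the 1000-element output vector for one image.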

Inference time

Device	Time (ms/sample)	Speedup
CPU	6.27	1.00x
NPU	1.44	4.36x
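
The README does not state how these timings were collected (warm-up runs, iteration count, synchronization). The sketch below shows one common way to measure per-sample latency and derive a speedup. The names `time_per_sample_ms` and `speedup`, the default warm-up and iteration counts, and the optional `sync` callback are all assumptions. On an NPU, a device synchronization call such as `torch.npu.synchronize` would typically be passed as `sync`, so that asynchronous kernels finish before the timer is read.

```python
import time


def time_per_sample_ms(run_once, warmup=10, iters=100, sync=None):
    """Average latency of run_once() in milliseconds (hypothetical helper)."""
    for _ in range(warmup):  # warm-up runs are not timed
        run_once()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    if sync is not None:  # wait for asynchronous device work to finish
        sync()
    return (time.perf_counter() - start) * 1000.0 / iters


def speedup(baseline_ms, accelerated_ms):
    """Ratio of baseline latency to accelerated latency."""
    return baseline_ms / accelerated_ms


# Speedup recomputed from the rounded values in the table above.
print(f"{speedup(6.27, 1.44):.2f}x")
```

Recomputing from the rounded table values gives about 4.35x. The reported 4.36x presumably comes from unrounded timings.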

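CPU/NPU Accuracy Test Results

Test method

Use the same test image (224×224, resized to 160×160)
Run model inference on the CPU and on the NPU separately
Compare the logits, the softmax probabilities, and the classification results

Accuracy metrics

Metric	Value
Max absolute logits difference	0.00443959
Mean absolute logits difference	0.00072614
Max absolute probability difference	0.071%
Mean absolute probability difference	0.000%
Top-5 agreement	100% (5/5)
Cosine similarity (logits)	99.999991%
Max relative error on significant logits	0.135%
Mean relative error on significant logits	0.030%

Conclusion

The relative error between NPU and CPU inference is below 1% (the maximum relative error on significant logits is 0.135%) and all Top-5 predictions match, so the accuracy validation passes.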
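
The README does not define which logits count as "significant". One plausible reading, sketched below, is to compute the relative error only for logits whose CPU magnitude is at least some threshold, so that near-zero logits do not inflate the ratio. The function name `significant_relative_errors` and the default threshold of 1.0 are assumptions made for illustration.

```python
def significant_relative_errors(cpu_logits, npu_logits, threshold=1.0):
    """Return (max, mean) relative error over logits with |cpu| >= threshold.

    Returns None when no logit meets the threshold. The threshold value
    is an assumption; the README does not specify one.
    """
    errors = [
        abs(a - b) / abs(a)
        for a, b in zip(cpu_logits, npu_logits)
        if abs(a) >= threshold
    ]
    if not errors:
        return None
    return max(errors), sum(errors) / len(errors)


# The second logit is tiny, so its large relative error is ignored.
result = significant_relative_errors([2.0, 0.001, -4.0], [2.002, 0.002, -4.002])
print(result)
```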

Simulated terminal output screenshot

[Screenshot: inference output]

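Deployment and Inference

Loading the model with timm

import timm
import torch
from PIL import Image

try:
    import torch_npu  # noqa: F401  (registers the 'npu' device with PyTorch)
except ImportError:
    torch_npu = None  # CPU-only environment

# Load the model
model = timm.create_model('test_convnext2.r160_in1k', pretrained=True)
model = model.eval()

# Build the preprocessing pipeline from the model's data config
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

# Load an image and add a batch dimension
img = Image.open('test.jpg').convert('RGB')
input_tensor = transforms(img).unsqueeze(0)

# CPU inference
with torch.no_grad():
    output = model(input_tensor)

# NPU inference (model.to() moves the module in place, so run the CPU pass first)
if hasattr(torch, 'npu') and torch.npu.is_available():
    model_npu = model.to('npu')
    input_npu = input_tensor.to('npu')
    with torch.no_grad():
        output_npu = model_npu(input_npu)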

License

Apache-2.0
