putaoshu/nlp_structbert_sentiment-classification_chinese-tiny-fast on Ascend NPU

1. Introduction

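This document records the adaptation and validation results for putaoshu/nlp_structbert_sentiment-classification_chinese-tiny-fast on Huawei Ascend NPU.

The model is a StructBERT-based Chinese sentiment classifier that assigns each input text to one of two classes, positive or negative.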

Model type: text-classification
Architecture: StructBERT (BERT-based)
Language: Chinese
Number of labels: 2 (negative, positive)
Model configuration: hidden_size=256, num_hidden_layers=4, num_attention_heads=4
Inference stack: PyTorch + transformers + torch_npu

2. Validation Environment

Component	Version
`NPU`	Ascend 910
`torch`	2.5.1
`torch-npu`	2.9.0.post1+gitee7ba04
`transformers`	4.57.6
`Python`	3.11.14

NPU count: 2 logical devices
Model path: ~/putaoshu/nlp_structbert_sentiment-classification_chinese-tiny-fast/model/putaoshu/nlp_structbert_sentiment-classification_chinese-tiny-fast/

3. Inference

Run the provided inference script directly:

cd /opt/atomgit/putaoshu/nlp_structbert_sentiment-classification_chinese-tiny-fast
python3 inference.py

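The inference script automatically performs the following steps:

Loads the model configuration and weights (remapping the encoder. prefix to bert. and the head.classifier. prefix to classifier.)
Moves the model to the NPU device
Runs sentiment classification on the test texts
Prints the predicted labels and probabilities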
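
The prefix remapping in the first step can be sketched as follows. The code of inference.py is not reproduced in this document, so this is only a minimal illustration: the helper name `remap_state_dict_keys`, the `PREFIX_MAP` table, and the example key names are assumptions, and plain strings stand in for tensors.

```python
# Hypothetical helper; the actual logic in inference.py is not shown in this document.
PREFIX_MAP = {
    "encoder.": "bert.",
    "head.classifier.": "classifier.",
}


def remap_state_dict_keys(state_dict):
    """Return a new dict whose keys use the bert. / classifier. prefixes."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in PREFIX_MAP.items():
            if key.startswith(old_prefix):
                # Swap only the leading prefix; the rest of the name is kept.
                new_key = new_prefix + key[len(old_prefix):]
                break
        remapped[new_key] = value
    return remapped


if __name__ == "__main__":
    # Placeholder values stand in for tensors; the key names are illustrative.
    example = {
        "encoder.embeddings.word_embeddings.weight": "tensor-0",
        "head.classifier.weight": "tensor-1",
        "head.classifier.bias": "tensor-2",
    }
    for name in remap_state_dict_keys(example):
        print(name)
```

In a real script the remapped dictionary would typically be passed to `load_state_dict` on a `BertForSequenceClassification` instance before the model is moved to the NPU. Keys that match neither prefix are passed through unchanged.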

4. Smoke Test

python3 inference.py

Example of the expected output (正面 = positive, 负面 = negative):

Using device: npu:0
NPU available: True

Text: 这个产品非常好用，我很喜欢！
Prediction: 正面 (label: 1)
Probabilities: 负面=0.0095, 正面=0.9905

Text: 质量太差了，完全不推荐购买。
Prediction: 负面 (label: 0)
Probabilities: 负面=0.9875, 正面=0.0125
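
The probabilities in this output are presumably obtained by applying softmax to the two logits produced by the classifier. The sketch below shows that step in plain Python. The function names and the sample logits are hypothetical and are not taken from inference.py or from the real model.

```python
import math

# Label order follows the output above: label 0 is 负面, label 1 is 正面.
LABELS = ["负面", "正面"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def format_prediction(logits):
    """Build the Prediction / Probabilities lines from two raw logits."""
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    pairs = ", ".join(f"{name}={p:.4f}" for name, p in zip(LABELS, probs))
    return f"Prediction: {LABELS[label]} (label: {label})\nProbabilities: {pairs}"


if __name__ == "__main__":
    # Hypothetical logits, chosen only to show the formatting.
    print(format_prediction([-2.0, 2.5]))
```

Subtracting the largest logit before exponentiating leaves the result unchanged and avoids overflow for large values.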

5. Performance Reference

Batch Size	Mean Latency (ms)	P50 (ms)	P90 (ms)	P99 (ms)	Throughput (samples/sec)
1	2.680	2.684	2.730	2.759	373.14
4	2.601	2.603	2.635	2.650	1537.61
8	2.639	2.621	2.643	3.284	3031.83
16	2.629	2.638	2.659	2.665	6086.98
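
The throughput column is approximately the batch size divided by the mean latency; for batch size 1, 1 / 0.002680 s is about 373 samples/sec. The benchmark script itself is not included in this document, so the sketch below only illustrates how such columns could be derived from a list of per-batch timings. The function names, the nearest-rank percentile method, and the synthetic timings are assumptions.

```python
import math
import statistics


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list (0 < pct <= 100)."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize_latencies(latencies_ms, batch_size):
    """Summarize per-batch latencies (ms) into the columns used in the table."""
    ordered = sorted(latencies_ms)
    mean_ms = statistics.fmean(ordered)
    return {
        "mean_ms": mean_ms,
        "p50_ms": percentile(ordered, 50),
        "p90_ms": percentile(ordered, 90),
        "p99_ms": percentile(ordered, 99),
        # samples per second = samples per batch / seconds per batch
        "throughput": batch_size / (mean_ms / 1000.0),
    }


if __name__ == "__main__":
    # Synthetic timings for illustration only, not measured on an NPU.
    fake = [2.60, 2.70, 2.65, 2.68, 2.72, 2.66, 2.69, 2.71, 2.67, 2.70]
    for name, value in summarize_latencies(fake, batch_size=4).items():
        print(f"{name}: {value:.3f}")
```

When timing real NPU inference, device execution is asynchronous, so a call such as `torch.npu.synchronize()` before reading the clock is usually needed. Whether the original benchmark does this is not stated here.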

6. Accuracy Evaluation

Validation method: run the same inputs on CPU and on NPU and compare the output probabilities.

Metric	Value
Test samples	10
Max probability difference	0.000146 (0.0146%)
Error threshold	< 1%
Passed	Yes
Prediction agreement	10/10 (100%)

CPU and NPU predictions agree on all 10 samples, and the maximum probability difference is only 0.0146%, well below the 1% accuracy threshold.
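
A comparison of this kind can be sketched as below, assuming the per-sample class probabilities from both devices have already been collected as plain Python lists. The function name `compare_probabilities` and the returned field names are hypothetical; the 0.01 default mirrors the 1% threshold in the table.

```python
def compare_probabilities(cpu_probs, npu_probs, threshold=0.01):
    """Compare per-sample class probabilities from two devices.

    cpu_probs / npu_probs: lists of [p_negative, p_positive] rows.
    threshold: maximum allowed absolute probability difference (0.01 = 1%).
    """
    if not cpu_probs or len(cpu_probs) != len(npu_probs):
        raise ValueError("both runs must cover the same non-empty sample set")

    max_diff = 0.0
    matches = 0
    for cpu_row, npu_row in zip(cpu_probs, npu_probs):
        # Largest absolute gap across the classes of this sample.
        diff = max(abs(c - n) for c, n in zip(cpu_row, npu_row))
        max_diff = max(max_diff, diff)
        # Predicted label is the index of the highest probability.
        cpu_label = max(range(len(cpu_row)), key=cpu_row.__getitem__)
        npu_label = max(range(len(npu_row)), key=npu_row.__getitem__)
        matches += cpu_label == npu_label

    return {
        "samples": len(cpu_probs),
        "max_diff": max_diff,
        "agreement": matches / len(cpu_probs),
        "passed": max_diff < threshold,
    }


if __name__ == "__main__":
    # Made-up probabilities, not the values measured in this document.
    cpu = [[0.0100, 0.9900], [0.9870, 0.0130]]
    npu = [[0.0101, 0.9899], [0.9871, 0.0129]]
    print(compare_probabilities(cpu, npu))
```

Reporting the maximum difference and the label agreement separately is useful because small numeric drift can be acceptable while a flipped label is not.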

7. Notes

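The model weights were downloaded from ModelScope under the path putaoshu/nlp_structbert_sentiment-classification_chinese-tiny-fast
At load time, the encoder. prefix in the weight names must be remapped to bert. and the head.classifier. prefix to classifier. for compatibility with HuggingFace BertForSequenceClassification
Inference requires torch_npu and a matching PyTorch version
This is a small model (4 Transformer layers) with fast inference; single-sample latency is typically < 5 ms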
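
Since inference depends on torch_npu being installed alongside a matching PyTorch build, a script can check for it before choosing a device. The sketch below is one possible way to do this. The helper name `pick_device` and the fallback to CPU are assumptions and are not taken from inference.py.

```python
def pick_device():
    """Return "npu:0" when torch_npu is usable, otherwise fall back to "cpu"."""
    try:
        import torch
        import torch_npu  # noqa: F401  (importing registers the torch.npu backend)
    except (ImportError, OSError, RuntimeError):
        # torch or torch_npu is missing, or the Ascend runtime failed to load.
        return "cpu"
    return "npu:0" if torch.npu.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Using device: {pick_device()}")
```

On a correctly configured Ascend host this should print the same device line as the smoke test output; elsewhere it reports cpu instead of failing at import time.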