langlangzi/bert-sentiment-Classification-remark on Ascend NPU

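1. Introduction

This document records the adaptation and validation results for langlangzi/bert-sentiment-Classification-remark on Huawei Ascend NPU. The model is fine-tuned from the bert-base-chinese pretrained model for sentiment classification of short-video comments, and it supports three sentiment labels: positive, negative, and neutral.

Model type: Sentiment Classification
Base model: bert-base-chinese
Inference stack: PyTorch + Transformers + torch_npu
Label mapping: 0->positive, 1->negative, 2->neutral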

2. Validation Environment

Component	Version
NPU	Ascend910
CANN	25.5.2
PyTorch	2.9.0+cpu
torch_npu	2.9.0.post1+gitee7ba04
Transformers	4.44.2
Python	3.11.14

NPU count: 2 logical devices
Model path: /opt/atomgit/langlangzi/bert-sentiment-Classification-remark/model/langlangzi/bert-sentiment-Classification-remark

3. Model Inference

Inference script inference.py:

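import torch
import torch_npu  # importing torch_npu registers the 'npu' device with PyTorch
from transformers import BertForSequenceClassification, BertTokenizer

# Local model directory: the model path listed in section 2
model_dir = '/opt/atomgit/langlangzi/bert-sentiment-Classification-remark/model/langlangzi/bert-sentiment-Classification-remark'

# Use the first NPU if one is available, otherwise fall back to CPU
device = torch.device('npu:0' if torch.npu.is_available() else 'cpu')
tokenizer = BertTokenizer.from_pretrained(model_dir)
model = BertForSequenceClassification.from_pretrained(model_dir)
model.to(device)
model.eval()  # switch off dropout for inference

def predict_sentiment(text):
    # Tokenize one text, truncating to BERT's 512-token limit
    encoding = tokenizer(text, return_tensors='pt', truncation=True, padding=True, max_length=512)
    input_ids = encoding['input_ids'].to(device)
    attention_mask = encoding['attention_mask'].to(device)
    with torch.no_grad():  # gradients are not needed for inference
        output = model(input_ids, attention_mask=attention_mask)
        logits = output.logits
        # Index of the highest-scoring class for the single input row
        predicted_label = torch.argmax(logits, dim=1).item()
    label_map = {0: 'positive', 1: 'negative', 2: 'neutral'}
    return label_map[predicted_label]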
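
The argmax step above returns only a label. A minimal sketch of how a confidence score could be reported alongside it follows. It uses plain Python so it runs without an NPU; the helper name label_with_confidence and the sample logits are hypothetical and are not part of inference.py.

```python
import math

# Label mapping as documented for this model.
LABEL_MAP = {0: "positive", 1: "negative", 2: "neutral"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Return (label, probability) for one row of logits (hypothetical helper)."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return LABEL_MAP[index], probs[index]


if __name__ == "__main__":
    # Made-up logits for illustration only, not real model output.
    label, prob = label_with_confidence([2.0, 0.5, -1.0])
    print(label, round(prob, 3))
```

With the real model, one row of logits could presumably be obtained as output.logits[0].tolist() inside predict_sentiment and passed to this helper.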

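4. Smoke Validation

Basic inference test (assumes predict_sentiment from inference.py is in scope):

# Inputs stay in Chinese because the model is fine-tuned from bert-base-chinese
test_texts = [
    "今天天气真好，心情非常愉快！",  # "The weather is lovely today, I'm in a great mood!"
    "这个产品质量太差了，非常不满意。",  # "This product's quality is terrible, I'm very dissatisfied."
    "一般般吧，没什么特别的感觉。",  # "It's so-so, nothing special."
]
for text in test_texts:
    label = predict_sentiment(text)
    print(f"Text: {text} -> Sentiment: {label}")

Validation results:

The model loads onto the Ascend NPU without errors
Inference results are correct, and all three sentiment classes are output as expected
NPU inference results match CPU inference results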
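
The NPU/CPU agreement check can be sketched as below. This is an illustration under assumptions, not the script used for the validation: smoke_check and the two stub predictors are hypothetical, and in practice each predictor would wrap predict_sentiment with the model placed on the respective device.

```python
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def smoke_check(predict_a, predict_b, texts):
    """Run both predictors on every text and collect any problems found.

    A problem is either a label outside ALLOWED_LABELS or a disagreement
    between the two predictors. An empty list means the check passed.
    """
    problems = []
    for text in texts:
        a, b = predict_a(text), predict_b(text)
        if a not in ALLOWED_LABELS or b not in ALLOWED_LABELS:
            problems.append((text, a, b, "invalid label"))
        elif a != b:
            problems.append((text, a, b, "mismatch"))
    return problems


if __name__ == "__main__":
    # Stub predictors standing in for the NPU and CPU models (hypothetical).
    def npu_stub(text):
        return "positive" if "好" in text else "neutral"

    def cpu_stub(text):
        return "positive" if "好" in text else "neutral"

    issues = smoke_check(npu_stub, cpu_stub, ["今天天气真好", "一般般吧"])
    print("problems found:", len(issues))
```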

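5. Performance Reference

Test conditions: 10 samples, measured after warm-up, with torch.npu.synchronize() called around each timed run so that the timings are accurate.

Metric	Value
Average latency	7.86 ms
Throughput	127.24 samples/sec
First-inference latency	12.47 ms (includes warm-up)
Steady-state latency	~7-8 ms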
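
A minimal sketch of a timing loop matching the stated conditions (warm-up first, synchronization around each timed call) is shown below. The benchmark function and its parameters are hypothetical and the actual measurement script may differ; on Ascend, the sync argument would be torch.npu.synchronize and run_once would call predict_sentiment.

```python
import time


def benchmark(run_once, samples, warmup=3, sync=None):
    """Time run_once(sample) for each sample after a warm-up phase.

    sync is an optional zero-argument callable that blocks until queued
    device work has finished (assumed to be torch.npu.synchronize on Ascend).
    """
    def wait():
        if sync is not None:
            sync()

    # Warm-up runs are executed but excluded from the measurements.
    for sample in samples[:warmup]:
        run_once(sample)
    wait()

    latencies_ms = []
    for sample in samples:
        wait()  # drain pending device work before starting the clock
        start = time.perf_counter()
        run_once(sample)
        wait()  # make sure the result is ready before stopping the clock
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = sum(latencies_ms) / len(latencies_ms)
    throughput = 1000.0 / mean_ms if mean_ms > 0 else float("inf")
    return {
        "mean_ms": mean_ms,
        "throughput_per_sec": throughput,
        "latencies_ms": latencies_ms,
    }


if __name__ == "__main__":
    # CPU-only stand-in workload so the sketch runs anywhere.
    stats = benchmark(lambda s: sum(range(10_000)), list(range(10)))
    print(f"mean latency: {stats['mean_ms']:.3f} ms")
```

With one sample per call, throughput is the reciprocal of the mean latency: 1000 / 7.86 ms is approximately 127 samples/sec, which agrees with the table.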

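6. Accuracy Evaluation

Accuracy consistency is assessed by comparing the maximum difference between the logits produced by NPU inference and by CPU inference.

Metric	Value
NPU accuracy	100.00%
CPU accuracy	100.00%
Maximum logits difference	0.002763
Verdict	Pass (difference < 1%)

The evaluation source code and logs are in the eval/ directory.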
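
The comparison can be sketched in plain Python as follows. This is not the code in eval/; the function names, the sample logits, and the absolute tolerance of 0.01 (one reading of the "< 1%" criterion) are assumptions made for illustration.

```python
def max_abs_diff(logits_a, logits_b):
    """Largest element-wise absolute difference between two logit matrices."""
    return max(
        abs(x - y)
        for row_a, row_b in zip(logits_a, logits_b)
        for x, y in zip(row_a, row_b)
    )


def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits, gold):
    """Fraction of rows whose argmax equals the gold label index."""
    hits = sum(1 for row, g in zip(logits, gold) if argmax(row) == g)
    return hits / len(gold)


if __name__ == "__main__":
    # Made-up logits for two samples; real values would come from the model.
    npu_logits = [[2.0, 0.1, -1.0], [-0.5, 1.5, 0.0]]
    cpu_logits = [[2.001, 0.1, -1.0], [-0.5, 1.498, 0.0]]
    gold = [0, 1]

    tolerance = 0.01  # assumed absolute threshold, see the note above
    diff = max_abs_diff(npu_logits, cpu_logits)
    print("NPU accuracy:", accuracy(npu_logits, gold))
    print("CPU accuracy:", accuracy(cpu_logits, gold))
    print("within tolerance:", diff < tolerance)
```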

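7. Notes

The model is downloaded with the ModelScope SDK; the first download takes about 3-5 minutes
For inference, call model.eval() and run the forward pass inside a torch.no_grad() context
The first inference carries a compilation and warm-up overhead (about 12.5 ms in the measurement above); subsequent inferences settle at 7-8 ms
Use torch.npu.synchronize() when timing so that measurements are accurate
This model is not suited to vLLM serving; it runs as native PyTorch inference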
