roberta-base-go_emotions on Ascend NPU

1. Introduction

This document records the migration, accuracy evaluation, and performance validation of the SamLowe/roberta-base-go_emotions emotion classification model on Ascend NPU (Ascend 910B3).

The model is RoBERTa-base fine-tuned on the GoEmotions dataset. It performs multi-label classification over 28 fine-grained emotions: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, neutral.

2. Validation Environment

Component	Version
`torch`	`2.8.0`
`torch_npu`	`2.8.0.post4`
`transformers`	`5.8.1`
`CANN`	`8.5.1`

NPU: 8 × Ascend 910B3
Accuracy baseline: CPU (x86, PyTorch 2.8.0)

3. Deployment and Usage

3.1 Environment Setup

conda create -n roberta-base-go_emotions python=3.11 -y
conda activate roberta-base-go_emotions
pip install torch==2.8.0 torch_npu==2.8.0.post4 \
    -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install transformers numpy \
    -i https://pypi.tuna.tsinghua.edu.cn/simple

3.2 Running the Inference Script

python inference.py --text "I am so happy today!"
python inference.py --batch_file texts.txt --threshold 0.3

Programmatic interface:

from inference import GoEmotionsClassifier
clf = GoEmotionsClassifier(model_path="./roberta-base-go_emotions", device="npu")
results, probs = clf.predict(["I am happy!", "This is sad."])

4. Smoke Test

python inference.py --text "I feel great!" --device npu

Expected output: a list of emotion labels sorted by probability in descending order.

5. Performance Reference

Test conditions: 23 emotion texts, batch_size=16.

Metric	Value
CPU throughput	`45.9` texts/s
NPU throughput	`478.9` texts/s
NPU speedup over CPU	`10.4` ×
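Throughput figures such as the ones above are typically obtained by timing batched calls after a warm-up run. The following is a minimal sketch of that procedure, not the script used for this report: `measure_throughput` and `dummy_predict` are hypothetical names, and the stand-in predictor only exists so the sketch runs without a model or an NPU.

```python
import time
from typing import Callable, Sequence


def measure_throughput(
    predict: Callable[[Sequence[str]], object],
    texts: Sequence[str],
    batch_size: int = 16,
    warmup_runs: int = 1,
) -> float:
    """Return texts per second for `predict`, excluding warm-up runs."""
    if not texts:
        raise ValueError("texts must not be empty")
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    # Warm-up: the first NPU call may trigger operator compilation,
    # so it is run outside the timed region.
    for _ in range(warmup_runs):
        predict(batches[0])

    start = time.perf_counter()
    for batch in batches:
        predict(batch)
    # With a real NPU model, synchronize the device before reading the clock
    # (for example with torch.npu.synchronize()), because kernels run asynchronously.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(texts) / elapsed


# Hypothetical stand-in predictor (assumption: replace with the real classifier).
def dummy_predict(batch: Sequence[str]) -> list:
    return [len(text) for text in batch]


sample_texts = [f"sample text {i}" for i in range(23)]
throughput = measure_throughput(dummy_predict, sample_texts, batch_size=16)
print(f"{throughput:.1f} texts/s")
```

Passing the same function a CPU-backed and an NPU-backed predictor, then dividing the two results, gives a speedup ratio comparable to the one in the table.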

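6. Accuracy Evaluation

6.1 Evaluation Method

Run inference on the same 23 emotion texts on both CPU and NPU, then compare the 28-dimensional probability vectors:

MAE: element-wise mean absolute error between the probability vectors
Cosine similarity: directional agreement between the probability vectors
Top-1 / Top-3 agreement: whether the highest-probability label and the three highest-probability labels match between CPU and NPU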
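The comparison metrics listed above can be sketched in a few lines. This is an illustrative version using only the standard library, not the evaluation script behind this report; the function names and the toy 4-dimensional vectors are assumptions, and "Top-3 overlap" is taken here to mean the fraction of the top three labels shared by both devices.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean_absolute_error(a: Vector, b: Vector) -> float:
    """Element-wise mean absolute error between two probability vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two probability vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(probs: Vector, k: int) -> List[int]:
    """Indices of the k largest probabilities, highest first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


def compare(cpu_probs: Sequence[Vector], npu_probs: Sequence[Vector]) -> Dict[str, float]:
    """Average the per-sample metrics over all CPU/NPU vector pairs."""
    pairs = list(zip(cpu_probs, npu_probs))
    n = len(pairs)
    return {
        "mae": sum(mean_absolute_error(c, p) for c, p in pairs) / n,
        "cosine": sum(cosine_similarity(c, p) for c, p in pairs) / n,
        "max_abs_error": max(abs(x - y) for c, p in pairs for x, y in zip(c, p)),
        "top1_agreement": sum(top_k(c, 1) == top_k(p, 1) for c, p in pairs) / n,
        "top3_overlap": sum(
            len(set(top_k(c, 3)) & set(top_k(p, 3))) / 3 for c, p in pairs
        ) / n,
    }


# Made-up vectors for illustration (the real model produces 28 values per text).
cpu = [[0.90, 0.05, 0.30, 0.01], [0.10, 0.70, 0.20, 0.60]]
npu = [[0.899, 0.051, 0.30, 0.01], [0.10, 0.701, 0.20, 0.599]]
report = compare(cpu, npu)
print(report)
```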

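6.2 Evaluation Results

Metric	Value
Mean cosine similarity	`0.999999`
MAE	`0.000059`
Maximum probability error	`0.002828`
Accuracy error rate	`0.0001%`
Top-1 agreement	`100.0%`
Top-3 overlap	`100.0%`

Conclusion: the accuracy error rate is 0.0001% and the Top-1 labels agree on every sample, so the evaluation passes.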

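7. Migration Notes

7.1 Model Architecture

roberta-base-go_emotions is built on the RoBERTa architecture:

Backbone: RobertaModel (12 Transformer layers, hidden size 768; RoBERTa is a robustly optimized variant of BERT)
Classifier head: RoBERTa classification head ending in a linear projection (768 → 28), one logit per emotion label
Activation: sigmoid (multi-label classification; each class gets an independent probability, unlike softmax's mutually exclusive distribution)
Tokenizer: BPE (Byte-Pair Encoding), using merges.txt + vocab.json
Parameters: 124.7M (standard RoBERTa-base size)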
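To make the sigmoid-versus-softmax distinction above concrete, here is a small standard-library sketch. The three logits are made-up values, not model output; the point is only that sigmoid scores each label independently, so several labels can be highly probable at once, while softmax forces the scores to sum to 1.

```python
import math
from typing import List, Sequence


def sigmoid(logits: Sequence[float]) -> List[float]:
    """Independent per-label probabilities (multi-label)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax(logits: Sequence[float]) -> List[float]:
    """Mutually exclusive distribution that sums to 1 (single-label)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.5, -3.0]  # hypothetical logits for three labels
sig = sigmoid(logits)
soft = softmax(logits)

print("sigmoid:", [round(p, 4) for p in sig], "sum =", round(sum(sig), 4))
print("softmax:", [round(p, 4) for p in soft], "sum =", round(sum(soft), 4))
```

With sigmoid, the first two labels both score above 0.5 and would both be reported; with softmax they compete for the same probability mass.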

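7.2 Adaptation Points

Load the model with AutoModelForSequenceClassification.from_pretrained()
Move it to the NPU in one step with model.to("npu:0")
Convert the classification logits into 28 independent probabilities with torch.sigmoid()
Filter out low-confidence emotions with threshold (default 0.3); several labels can be triggered at once
AutoTokenizer tokenizes on the CPU and the resulting tensors are moved to the NPU; outputs are returned via .cpu().numpy()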

7.3 Key Code

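import torch
import torch_npu  # registers the "npu" device with PyTorch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hub ID; a local directory such as "./roberta-base-go_emotions" also works
MODEL = "SamLowe/roberta-base-go_emotions"

model = AutoModelForSequenceClassification.from_pretrained(MODEL).to("npu:0")
model.eval()
tokenizer = AutoTokenizer.from_pretrained(MODEL)

# Tokenize on the CPU, then move the tensors to the NPU
inputs = tokenizer("I am so happy!", return_tensors="pt", truncation=True, max_length=512)
inputs = {k: v.to("npu:0") for k, v in inputs.items()}

with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.sigmoid(logits)[0].cpu()  # 28 independent probabilities

# Keep the emotions whose probability exceeds the threshold
labels = model.config.id2label
emotions = {labels[i]: float(p) for i, p in enumerate(probs) if p > 0.3}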

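8. Notes

Multi-label classification: unlike softmax-based mutually exclusive classification, this model applies sigmoid, so a single text can trigger several emotions at once. Use the threshold parameter to set the confidence cutoff for output labels.
Input truncation: max_length=512; tokens beyond that limit are truncated. The RoBERTa tokenizer has a vocabulary of about 50k entries, which covers common English text.
NPU warm-up: the first inference triggers operator compilation (about 3-5 seconds), so in production run one warm-up call before serving traffic.
Threshold tuning: the default is threshold=0.3 and can be adjusted per use case. Lowering it increases recall (more labels); raising it increases precision.
28 emotions: covers the full GoEmotions label set, including neutral for text with no evident emotion.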
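The effect of the threshold can be seen with a small sketch. `select_labels` is a hypothetical helper, not part of inference.py, and the four labels and probabilities are made up for illustration; it keeps labels whose probability exceeds the threshold and sorts them in descending order.

```python
from typing import List, Sequence, Tuple


def select_labels(
    probs: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.3,
) -> List[Tuple[str, float]]:
    """Return (label, probability) pairs above the threshold, highest first."""
    pairs = [(label, p) for label, p in zip(labels, probs) if p > threshold]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


labels = ["gratitude", "joy", "neutral", "excitement"]
probs = [0.12, 0.82, 0.05, 0.41]  # made-up probabilities for one text

for threshold in (0.1, 0.3, 0.5):
    selected = select_labels(probs, labels, threshold)
    print(threshold, [label for label, _ in selected])
```

A lower threshold returns more labels per text (higher recall), and a higher one returns fewer (higher precision), which is the trade-off described in the notes above.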