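intent_classification Ascend NPU Deployment Guide

Project Overview

intent_classification is a DistilBERT-based user intent classification model, fine-tuned to classify text into 15 user intent categories. It was trained for 100 epochs on a dataset of under 50k examples and reaches 99.87% accuracy. Typical uses are customer service chatbots, virtual assistants, and recommendation systems.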

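Features

  • Inference acceleration on Ascend NPU
  • CPU vs NPU precision comparison test (predicted labels match; max relative error 0.0397%)
  • 15 user intent classes
  • About 21x speedup over CPU (21.40x measured in the precision test)
  • High accuracy (99.87%)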

Requirements

  • Hardware: Huawei Ascend 910 series NPU
  • CANN: 8.0.RC1 or later
  • PyTorch: 2.0+ with torch_npu
  • Docker: container named test-modelagent
  • transformers: 4.31+
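
The version floors above can be checked from inside the container before running anything else. The following is a minimal sketch, not part of the project: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are made up for illustration, and it only covers the two pip packages with a numeric floor (torch and transformers). It does not check CANN or whether the installed torch_npu build matches the torch version.

```python
from importlib import metadata

# Minimum versions taken from the requirements list above.
MINIMUMS = {"torch": (2, 0), "transformers": (4, 31)}


def parse_version(text: str) -> tuple:
    """Turn '2.1.0+cpu' or '4.31.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple) -> bool:
    """Compare an installed version string against a (major, minor) floor."""
    return parse_version(installed) >= minimum


def check_environment() -> dict:
    """Map each package to True/False, or None when it is not installed."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    print(check_environment())
```

A `None` entry means the package is missing; `False` means it is installed but older than the floor listed above.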

Directory Structure

intent_classification-ascend/
├── inference.py          # inference test script
├── log.txt               # test log
├── README.md             # this document
├── test_sample.txt       # test sample
├── inference_result.json # inference results
└── precision_result.json # precision test results

Deployment Steps

1. Enter the container

docker exec -it test-modelagent bash

2. Set environment variables

source /usr/local/Ascend/ascend-toolkit/set_env.sh

3. Prepare the model files

The model files are located in /data/ysws/agentsp/5-16/intent_classification/Falconsai/intent_classification/:

  • model.safetensors - model weights (about 268 MB)
  • config.json - model configuration
  • vocab.txt - vocabulary
  • tokenizer.json / tokenizer_config.json - tokenizer files
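
Before loading the model, it can help to confirm that every file listed above is actually present. This is a small sketch using only the standard library; `missing_model_files` is a hypothetical helper name, and the file list simply mirrors the bullets above.

```python
from pathlib import Path

# File names taken from the list above; MODEL_DIR is the path this guide uses.
REQUIRED_FILES = (
    "model.safetensors",
    "config.json",
    "vocab.txt",
    "tokenizer.json",
    "tokenizer_config.json",
)
MODEL_DIR = "/data/ysws/agentsp/5-16/intent_classification/Falconsai/intent_classification"


def missing_model_files(model_dir: str) -> list:
    """Return the required file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files(MODEL_DIR)
    print("all files present" if not missing else f"missing: {missing}")
```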

4. Install dependencies

pip install transformers torch_npu -i https://pypi.huaweicloud.com/repository/pypi/simple/
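
After installation, a quick check that PyTorch can see the NPU avoids confusing failures later. The sketch below assumes that importing torch_npu registers the `torch.npu` namespace and that `torch.npu.is_available()` reports device visibility; `npu_available` is a name chosen for this example. It returns False rather than raising when either package is missing.

```python
def npu_available() -> bool:
    """Return True only if torch and torch_npu import and an NPU is visible."""
    try:
        import torch
        import torch_npu  # noqa: F401  (importing registers the torch.npu backend)
    except ImportError:
        return False
    return bool(torch.npu.is_available())


if __name__ == "__main__":
    print("NPU available:", npu_available())
```

If this prints False inside the container, re-run the `source .../set_env.sh` step in the same shell before retrying.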

Usage

Method 1: Normal Inference Mode

Run the inference script for intent classification:

cd /data/ysws/agentsp/5-16/intent_classification-ascend/

python3 inference.py --mode inference

Method 2: Precision Test Mode (CPU vs NPU)

Run the precision comparison test to verify that NPU results are consistent with CPU results:

cd /data/ysws/agentsp/5-16/intent_classification-ascend/

python3 inference.py --mode precision_test

Method 3: Full Test (Inference + Precision)

cd /data/ysws/agentsp/5-16/intent_classification-ascend/

python3 inference.py --mode all

Command-Line Arguments

| Argument | Description | Default |
|----------|-------------|---------|
| --mode | Test mode: inference, precision_test, or all | all |
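
The source of inference.py is not reproduced in this guide, so the following is only a sketch of how a `--mode` flag with these three values and this default could be declared with `argparse`. `build_parser` is a hypothetical name, not necessarily what the script uses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the single --mode flag described in the table above."""
    parser = argparse.ArgumentParser(description="Intent classification NPU test")
    parser.add_argument(
        "--mode",
        choices=["inference", "precision_test", "all"],
        default="all",
        help="which test to run (default: all)",
    )
    return parser


# Parsing an explicit argument list, as if the script were called with
# `--mode precision_test`.
args = build_parser().parse_args(["--mode", "precision_test"])
print(f"mode = {args.mode}")
```

Using `choices` makes argparse reject any value other than the three listed, and omitting the flag falls back to `all`.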

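Test Verification

Precision Test Results

| Metric | Measured | Threshold | Status |
|--------|----------|-----------|--------|
| Max relative error | 0.0397% | < 1.00% | PASS |
| Max absolute error | 8.92e-03 | - | - |
| CPU inference time | 0.220 s | - | - |
| NPU inference time | 0.010 s | - | - |
| Speedup | 21.40x | > 1x | PASS |
| Predicted label consistency | Identical | - | PASS |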
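
The guide does not show how inference.py computes the two error metrics, so the sketch below uses one common definition as an assumption: the absolute error is the element-wise difference between CPU and NPU logits, and the relative error divides that difference by the magnitude of the CPU value. The function names and the sample logits are illustrative only, not values from the test run.

```python
def max_abs_error(reference, candidate):
    """Largest element-wise absolute difference between two logit lists."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


def max_rel_error(reference, candidate, eps=1e-12):
    """Largest element-wise difference relative to the reference magnitude.

    eps guards against division by zero when a reference logit is 0.
    """
    return max(abs(r - c) / (abs(r) + eps) for r, c in zip(reference, candidate))


def passes(reference, candidate, threshold=0.01):
    """Apply the 1% relative-error threshold used in the table above."""
    return max_rel_error(reference, candidate) < threshold


# Hypothetical logits for a single sentence (not taken from the test run).
cpu_logits = [1.0, -2.0, 4.0]
npu_logits = [1.001, -2.0, 3.998]
print(max_abs_error(cpu_logits, npu_logits))
print(max_rel_error(cpu_logits, npu_logits))
print(passes(cpu_logits, npu_logits))
```

With tensors, the same quantities would be computed after moving the NPU logits back to the CPU, for example with `(cpu - npu.cpu()).abs().max()`.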

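Performance Data

| Operation | Time |
|-----------|------|
| NPU inference (3 sentences) | 0.243 s |
| Precision test, CPU | 0.220 s |
| Precision test, NPU | 0.010 s |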
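
The two NPU figures above differ by more than an order of magnitude, and the log does not say whether the first one includes one-time warm-up cost. A timing helper that separates warm-up calls from measured calls makes such numbers easier to compare. This is a sketch with hypothetical names (`time_call`, `speedup`); on an NPU, one would typically pass `torch.npu.synchronize` as `sync`, assuming torch_npu is imported.

```python
import time


def time_call(fn, warmup=1, repeats=5, sync=None):
    """Return the mean wall-clock seconds per call of fn().

    sync is an optional callable that blocks until device work has finished.
    Without it, asynchronous kernels may still be running when the clock stops.
    """
    for _ in range(warmup):
        fn()
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync:
        sync()
    return (time.perf_counter() - start) / repeats


def speedup(cpu_seconds, npu_seconds):
    """Ratio of CPU time to NPU time."""
    return cpu_seconds / npu_seconds


if __name__ == "__main__":
    mean = time_call(lambda: sum(range(10_000)), warmup=2, repeats=10)
    print(f"mean time per call: {mean:.6f}s")
```

Note that the rounded table values give 0.220 / 0.010 = 22, while the log reports 21.40x; the logged ratio was presumably computed from unrounded timings.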

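Classification Result Examples

| Input text | Predicted class |
|------------|-----------------|
| "I ordered from you 2 weeks ago and its still not here." | appointment |
| "I need to bring in my daughter for a checkup." | appointment |
| "How can I recover my password?" | recover password |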

Test Log

============================================================
Intent Classification NPU Test
Model: Falconsai/intent_classification
Output: /data/ysws/agentsp/5-16/intent_classification-ascend
============================================================

============================================================
Intent Classification Inference Test (NPU)
============================================================
Device: npu:0
Model: /data/ysws/agentsp/5-16/intent_classification/Falconsai/intent_classification
Loading tokenizer...
Loading model...
Loading weights: 100%|██████████| 104/104 [00:00<00:00, 5028.45it/s]
Model loaded successfully
Input texts: ['I ordered from you 2 weeks ago and its still not here.', 'I need to bring in my daughter for a checkup.', 'How can I recover my password?']
Input shape: torch.Size([3, 15])
Logits shape: torch.Size([3, 15])
Predictions: [14, 14, 12]
Labels: ['appointment', 'appointment', 'recover password']
Inference time: 0.243s

Inference result saved to /data/ysws/agentsp/5-16/intent_classification-ascend/inference_result.json

============================================================
Precision Test (CPU vs NPU)
============================================================
Using device: npu:0
Loading tokenizer...
Loading model on CPU...
Loading weights: 100%|██████████| 104/104 [00:00<00:00, 4492.03it/s]
Loading model on npu:0...
Loading weights: 100%|██████████| 104/104 [00:00<00:00, 4535.75it/s]
Running inference on CPU...
Running inference on NPU...
CPU inference time: 0.220s
NPU inference time: 0.010s
Speedup: 21.40x
Max absolute error: 8.916855e-03
Max relative error: 0.0397% (threshold: 1.0%)
CPU predictions: [14]
NPU predictions: [14]
Predictions match: True
Status: PASS

Precision result saved to /data/ysws/agentsp/5-16/intent_classification-ascend/precision_result.json

============================================================
Creating Test Sample
============================================================
Saved test sample: /data/ysws/agentsp/5-16/intent_classification-ascend/test_sample.txt

============================================================
Test Complete!
============================================================

Python API Usage Examples

Basic Intent Classification

import torch
import torch_npu  # registers the "npu" device with PyTorch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_DIR = "/data/ysws/agentsp/5-16/intent_classification/Falconsai/intent_classification"

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR)

model = model.to("npu:0").eval()

texts = ["I ordered from you 2 weeks ago and its still not here."]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
inputs = {k: v.to("npu:0") for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predictions = torch.argmax(logits, dim=-1)
labels = [model.config.id2label[p.item()] for p in predictions]
print(labels)

Batch Processing

# Reuses `tokenizer` and `model` from the basic example above.
texts = [
    "I ordered from you 2 weeks ago and its still not here.",
    "How can I recover my password?",
    "I want to cancel my subscription."
]

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
inputs = {k: v.to("npu:0") for k, v in inputs.items()}

with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predictions = torch.argmax(logits, dim=-1)
labels = [model.config.id2label[p.item()] for p in predictions]

for text, label in zip(texts, labels):
    print(f"{text} -> {label}")
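
The examples above keep only the arg-max label. When a confidence score or the runner-up classes are useful, the logits can be passed through a softmax first. The sketch below does this in plain Python on a list of floats, so it runs without an NPU; `softmax` and `top_k` are names chosen for this example, and the result should match `torch.softmax(logits, dim=-1)` up to floating-point error.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical three-class logits, for illustration only.
    demo_labels = {0: "cancellation", 1: "ordering", 2: "shipping"}
    for label, prob in top_k([0.5, 2.0, 1.0], demo_labels, k=2):
        print(f"{label}: {prob:.3f}")
```

With the model from the examples above, a call might look like `top_k(logits[0].tolist(), model.config.id2label)`.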

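Model Architecture

  • Architecture: DistilBertForSequenceClassification
  • Encoder layers: 6
  • Hidden size: 768
  • Attention heads: 12
  • Feed-forward size: 3072
  • Vocabulary size: 30522
  • Number of classes: 15

| Intent ID | Intent name |
|-----------|-------------|
| 0 | cancellation |
| 1 | ordering |
| 2 | shipping |
| 3 | invoicing |
| 4 | billing and payment |
| 5 | returns and refunds |
| 6 | complaints and feedback |
| 7 | speak to person |
| 8 | edit account |
| 9 | delete account |
| 10 | delivery information |
| 11 | subscription |
| 12 | recover password |
| 13 | registration problems |
| 14 | appointment |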
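
The mapping above is what turns the raw prediction IDs in the test log into label strings. As a small illustration, the table can be written out as a dictionary and applied to the logged predictions `[14, 14, 12]`. In real code the mapping comes from `model.config.id2label`; the hard-coded copy here exists only to keep the sketch self-contained.

```python
# Copy of the intent table above, keyed by intent ID.
ID2LABEL = {
    0: "cancellation",
    1: "ordering",
    2: "shipping",
    3: "invoicing",
    4: "billing and payment",
    5: "returns and refunds",
    6: "complaints and feedback",
    7: "speak to person",
    8: "edit account",
    9: "delete account",
    10: "delivery information",
    11: "subscription",
    12: "recover password",
    13: "registration problems",
    14: "appointment",
}


def decode(prediction_ids):
    """Translate a list of predicted intent IDs into label names."""
    return [ID2LABEL[i] for i in prediction_ids]


# Predictions taken from the test log in this guide.
print(decode([14, 14, 12]))
```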

Inference Parameter Configuration

Key parameters extracted from config.json:

{
  "model_type": "distilbert",
  "n_layers": 6,
  "n_heads": 12,
  "dim": 768,
  "hidden_dim": 3072,
  "vocab_size": 30522,
  "max_position_embeddings": 512,
  "attention_dropout": 0.1,
  "dropout": 0.1
}
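
A couple of derived quantities follow directly from these values, such as the per-head dimension (`dim / n_heads`) and the feed-forward expansion factor (`hidden_dim / dim`). The sketch below parses the same JSON and computes them. `derived_sizes` is a hypothetical helper, and in practice the text would be read from config.json in the model directory instead of an inline string.

```python
import json

# Same key parameters as shown above, embedded so the sketch is self-contained.
CONFIG_TEXT = """
{
  "model_type": "distilbert",
  "n_layers": 6,
  "n_heads": 12,
  "dim": 768,
  "hidden_dim": 3072,
  "vocab_size": 30522,
  "max_position_embeddings": 512,
  "attention_dropout": 0.1,
  "dropout": 0.1
}
"""


def derived_sizes(config):
    """Compute per-head width and feed-forward expansion from the config."""
    if config["dim"] % config["n_heads"] != 0:
        raise ValueError("dim must be divisible by n_heads")
    return {
        "head_dim": config["dim"] // config["n_heads"],
        "ffn_expansion": config["hidden_dim"] / config["dim"],
    }


config = json.loads(CONFIG_TEXT)
print(derived_sizes(config))
```

The `max_position_embeddings` value of 512 is the upper bound on sequence length; the API examples in this guide truncate to `max_length=128`, which stays well inside it.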

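FAQ

Q: The precision test fails. What should I check?

A: Check that the NPU driver is installed correctly. In the reference run, the CPU and NPU predicted labels matched and the max relative error was about 0.04%, well under the 1% threshold.

Q: How can I speed up classification?

A: Batching can significantly improve throughput. In the precision test, NPU inference took 0.010 s versus 0.220 s on CPU.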
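
One way to apply batching to a long list of texts is to split it into fixed-size chunks and classify each chunk in a single forward pass. The following is a sketch with hypothetical names: `classify_batch` stands for any function that wraps the tokenizer and model calls from the batch example above and returns one label per input text. The default batch size of 32 is an arbitrary starting point, not a tuned value.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def classify_in_batches(texts, classify_batch, batch_size=32):
    """Run classify_batch over texts chunk by chunk, preserving input order."""
    labels = []
    for batch in chunked(texts, batch_size):
        labels.extend(classify_batch(batch))
    return labels


if __name__ == "__main__":
    # Stand-in classifier so the sketch runs without a model.
    fake = lambda batch: [f"label-for:{t}" for t in batch]
    print(classify_in_batches(["a", "b", "c"], fake, batch_size=2))
```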

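Q: Which languages are supported?

A: This model is intended specifically for intent classification of English text. For other languages, look for a suitable model on the HuggingFace model hub.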

References

  • Original model: https://huggingface.co/Falconsai/intent_classification
  • DistilBERT paper: https://arxiv.org/abs/1910.01108
  • HuggingFace Transformers: https://huggingface.co/transformers

License

This project is released under the Apache-2.0 license.