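intent_classification is a DistilBERT-based user intent classification model, fine-tuned to classify text into 15 user intent categories. The model was trained for 100 epochs on a dataset of fewer than 50k samples and reached 99.87% accuracy. It can be used in customer service chatbots, virtual assistants, and recommendation systems.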

    intent_classification-ascend/
    ├── inference.py            # Inference test script
    ├── log.txt                 # Test log
    ├── README.md               # This document
    ├── test_sample.txt         # Test sample
    ├── inference_result.json   # Inference results
    └── precision_result.json   # Precision test results

Enter the container and load the Ascend toolkit environment:

    docker exec -it test-modelagent bash
    source /usr/local/Ascend/ascend-toolkit/set_env.sh

The model files are located under `/data/ysws/agentsp/5-16/intent_classification/Falconsai/intent_classification/`.

Install the dependencies:

    pip install transformers torch_npu -i https://pypi.huaweicloud.com/repository/pypi/simple/

Run the inference script for intent classification:

    cd /data/ysws/agentsp/5-16/intent_classification-ascend/
    python3 inference.py --mode inference

Run the precision comparison test to verify that the NPU results are consistent with the CPU results:

    cd /data/ysws/agentsp/5-16/intent_classification-ascend/
    python3 inference.py --mode precision_test

Run all tests:

    cd /data/ysws/agentsp/5-16/intent_classification-ascend/
    python3 inference.py --mode all

| Parameter | Description | Default |
|---|---|---|
| `--mode` | Test mode: `inference`, `precision_test`, or `all` | `all` |
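
The source of `inference.py` is not reproduced in this README, so the following is only a minimal sketch of how a `--mode` option with these three values and this default could be declared with `argparse`. The helper names `build_parser` and `stages_for`, and the assumption that `all` simply runs both stages in order, are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented --mode option."""
    parser = argparse.ArgumentParser(description="Intent classification NPU test")
    parser.add_argument(
        "--mode",
        choices=["inference", "precision_test", "all"],
        default="all",
        help="Test mode: inference, precision_test, or all",
    )
    return parser


def stages_for(mode: str) -> list:
    """Return the stages to run; 'all' is assumed to mean both, in order."""
    if mode == "all":
        return ["inference", "precision_test"]
    return [mode]


# Parse an explicit argument list instead of sys.argv to keep the sketch self-contained.
args = build_parser().parse_args(["--mode", "precision_test"])
print(stages_for(args.mode))
```

Restricting `--mode` with `choices` makes `argparse` reject unknown modes with a usage message instead of failing later in the script.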
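
Precision test results (CPU vs NPU):

| Metric | Measured | Threshold | Status |
|---|---|---|---|
| Max relative error | 0.0397% | < 1.00% | PASS |
| Max absolute error | 8.92e-03 | - | - |
| CPU inference time | 0.220s | - | - |
| NPU inference time | 0.010s | - | - |
| Speedup | 21.40x | > 1x | PASS |
| Predicted label agreement | Identical | - | PASS |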
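
The README does not state how the relative error is defined. As a rough sketch, the metrics in the table could be computed as follows, assuming the relative error is the maximum absolute difference divided by the largest CPU logit magnitude. That normalisation, the function names, and the toy logits are assumptions, not the script's actual implementation.

```python
def compare_logits(cpu_logits, npu_logits, threshold_pct=1.0):
    """Compare two equally sized, flat lists of logits."""
    diffs = [abs(c - n) for c, n in zip(cpu_logits, npu_logits)]
    max_abs_err = max(diffs)
    # Assumption: normalise by the largest CPU logit magnitude (must be non-zero).
    scale = max(abs(c) for c in cpu_logits)
    max_rel_err_pct = 100.0 * max_abs_err / scale
    return {
        "max_abs_err": max_abs_err,
        "max_rel_err_pct": max_rel_err_pct,
        "passed": max_rel_err_pct < threshold_pct,
    }


def speedup(cpu_seconds, npu_seconds):
    """Ratio of CPU time to NPU time; values above 1 mean the NPU is faster."""
    return cpu_seconds / npu_seconds


# Toy values for illustration only, not the model's real logits.
result = compare_logits([2.0, -1.0, 10.0], [2.0, -1.0, 10.01])
print(result["passed"], speedup(0.220, 0.010))
```

With the rounded timings from the table the ratio comes out at about 22x; the reported 21.40x was presumably computed from the unrounded timings.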

Timing summary:

| Operation | Time |
|---|---|
| NPU inference (3 sentences) | 0.243s |
| Precision test, CPU | 0.220s |
| Precision test, NPU | 0.010s |

Sample predictions:

| Input text | Predicted intent |
|---|---|
| "I ordered from you 2 weeks ago and its still not here." | appointment |
| "I need to bring in my daughter for a checkup." | appointment |
| "How can I recover my password?" | recover password |

Test log:

    ============================================================
    Intent Classification NPU Test
    Model: Falconsai/intent_classification
    Output: /data/ysws/agentsp/5-16/intent_classification-ascend
    ============================================================
    ============================================================
    Intent Classification Inference Test (NPU)
    ============================================================
    Device: npu:0
    Model: /data/ysws/agentsp/5-16/intent_classification/Falconsai/intent_classification
    Loading tokenizer...
    Loading model...
    Loading weights: 100%|██████████| 104/104 [00:00<00:00, 5028.45it/s]
    Model loaded successfully
    Input texts: ['I ordered from you 2 weeks ago and its still not here.', 'I need to bring in my daughter for a checkup.', 'How can I recover my password?']
    Input shape: torch.Size([3, 15])
    Logits shape: torch.Size([3, 15])
    Predictions: [14, 14, 12]
    Labels: ['appointment', 'appointment', 'recover password']
    Inference time: 0.243s
    Inference result saved to /data/ysws/agentsp/5-16/intent_classification-ascend/inference_result.json
    ============================================================
    Precision Test (CPU vs NPU)
    ============================================================
    Using device: npu:0
    Loading tokenizer...
    Loading model on CPU...
    Loading weights: 100%|██████████| 104/104 [00:00<00:00, 4492.03it/s]
    Loading model on npu:0...
    Loading weights: 100%|██████████| 104/104 [00:00<00:00, 4535.75it/s]
    Running inference on CPU...
    Running inference on NPU...
    CPU inference time: 0.220s
    NPU inference time: 0.010s
    Speedup: 21.40x
    Max absolute error: 8.916855e-03
    Max relative error: 0.0397% (threshold: 1.0%)
    CPU predictions: [14]
    NPU predictions: [14]
    Predictions match: True
    Status: PASS
    Precision result saved to /data/ysws/agentsp/5-16/intent_classification-ascend/precision_result.json
    ============================================================
    Creating Test Sample
    ============================================================
    Saved test sample: /data/ysws/agentsp/5-16/intent_classification-ascend/test_sample.txt
    ============================================================
    Test Complete!
    ============================================================

Single-text inference:

    import torch
    import torch_npu  # noqa: F401  (registers the "npu" device with PyTorch)
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_DIR = "/data/ysws/agentsp/5-16/intent_classification/Falconsai/intent_classification"

    # Load the tokenizer and model, then move the model to the NPU in eval mode.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
    model = model.to("npu:0").eval()

    texts = ["I ordered from you 2 weeks ago and its still not here."]
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
    inputs = {k: v.to("npu:0") for k, v in inputs.items()}

    # Run the forward pass without tracking gradients.
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits

    # Pick the highest-scoring class per input and map it to its label name.
    predictions = torch.argmax(logits, dim=-1)
    labels = [model.config.id2label[p.item()] for p in predictions]
    print(labels)

Batch inference:

    # Reuses `tokenizer` and `model` from the previous example.
    texts = [
        "I ordered from you 2 weeks ago and its still not here.",
        "How can I recover my password?",
        "I want to cancel my subscription."
    ]
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
    inputs = {k: v.to("npu:0") for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits

    predictions = torch.argmax(logits, dim=-1)
    labels = [model.config.id2label[p.item()] for p in predictions]
    for text, label in zip(texts, labels):
        print(f"{text} -> {label}")

Intent labels:

| Intent ID | Intent name |
|---|---|
| 0 | cancellation |
| 1 | ordering |
| 2 | shipping |
| 3 | invoicing |
| 4 | billing and payment |
| 5 | returns and refunds |
| 6 | complaints and feedback |
| 7 | speak to person |
| 8 | edit account |
| 9 | delete account |
| 10 | delivery information |
| 11 | subscription |
| 12 | recover password |
| 13 | registration problems |
| 14 | appointment |
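
The inference code in this README reads label names from `model.config.id2label`. As a small standalone illustration that needs no model, the table above can be written as a plain dictionary and used to decode the prediction IDs shown in the test log. The names `ID2LABEL` and `decode` are introduced here for illustration only.

```python
# Copy of the intent table above as a plain mapping (illustrative only).
ID2LABEL = {
    0: "cancellation",
    1: "ordering",
    2: "shipping",
    3: "invoicing",
    4: "billing and payment",
    5: "returns and refunds",
    6: "complaints and feedback",
    7: "speak to person",
    8: "edit account",
    9: "delete account",
    10: "delivery information",
    11: "subscription",
    12: "recover password",
    13: "registration problems",
    14: "appointment",
}


def decode(prediction_ids):
    """Map a list of integer class IDs to their intent names."""
    return [ID2LABEL[i] for i in prediction_ids]


# Prediction IDs taken from the test log: [14, 14, 12]
print(decode([14, 14, 12]))
```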
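
Key parameters extracted from `config.json`:

    {
        "model_type": "distilbert",
        "n_layers": 6,
        "n_heads": 12,
        "dim": 768,
        "hidden_dim": 3072,
        "vocab_size": 30522,
        "max_position_embeddings": 512,
        "attention_dropout": 0.1,
        "dropout": 0.1
    }

A: Check that the NPU driver is installed correctly. In the precision test, the DistilBERT model produced the same predicted labels on CPU and NPU, with a maximum relative error of about 0.04%.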

A: Batching can significantly increase throughput. In the test above, NPU inference took 0.010s compared with 0.220s on CPU.
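
As a minimal sketch of the batching idea, a long list of texts can be split into fixed-size chunks, with each chunk passed through the tokenizer and model in one call. The helper name `batched` and the batch size of 2 are arbitrary choices for illustration; a suitable batch size depends on the device memory and sequence length.

```python
from typing import Iterator, List


def batched(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size texts."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


texts = [
    "I ordered from you 2 weeks ago and its still not here.",
    "How can I recover my password?",
    "I want to cancel my subscription.",
]

for batch in batched(texts, batch_size=2):
    # Each batch would be tokenized and run through the model in one forward pass,
    # as in the batch inference example in this README.
    print(len(batch), batch)
```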
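
A: This model is intended specifically for intent classification of English text. For other languages, look for a suitable model on the HuggingFace model hub.

This project is licensed under the Apache-2.0 license.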