boltuix_EntityBERT

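1. Introduction

This document records the quick-deployment steps and verification results for boltuix_EntityBERT on an Ascend NPU (Ascend910).

boltuix_EntityBERT is a named entity recognition (NER) model built on the HuggingFace transformers framework. It identifies entities of specific types in text (for example diseases, drugs, and genes).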

2. Verification Environment

Component	Version
`torch`	`2.5.1`
`torch_npu`	`2.5.1`
`transformers`	`>=4.48.0`
`CANN`	`8.5.RC1`

NPU: Ascend910 (single card)
Inference framework: PyTorch + transformers

3. Quick Deployment

3.1 Environment Setup

pip install torch==2.5.1 torch_npu==2.5.1 "transformers>=4.48.0"
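After installation, it can help to confirm that the Python packages match the versions listed in section 2. The following is a minimal sketch that uses only the standard library; the helper name `installed_versions` is hypothetical, and CANN is not a pip package, so it is not covered by this check.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: installed version string, or None if missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the environment table in section 2.
    for name, ver in installed_versions(["torch", "torch_npu", "transformers"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

A missing `torch_npu` entry means inference will run on the CPU rather than the NPU.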

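3.2 Inference Code

import torch
from transformers import AutoTokenizer, BertForTokenClassification

# torch_npu registers the "npu" backend; fall back to CPU when it is not installed.
try:
    import torch_npu  # noqa: F401
    use_npu = torch.npu.is_available()
except ImportError:
    use_npu = False

device = torch.device("npu:0" if use_npu else "cpu")
model_name = "boltuix_EntityBERT"

# Load the tokenizer and the token-classification model, then move the model to the device.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = BertForTokenClassification.from_pretrained(model_name, trust_remote_code=True)
model = model.to(device).eval()

# Tokenize the input batch and move every tensor to the same device as the model.
texts = ["The patient was diagnosed with diabetes."]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128)
inputs = {k: v.to(device) for k, v in inputs.items()}

# Forward pass without gradient tracking.
with torch.no_grad():
    outputs = model(**inputs)
print(f"Output logits shape: {outputs.logits.shape}")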
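The script above stops at the raw logits. Turning them into entities usually means taking the argmax label for each token and merging adjacent tokens of the same entity. The sketch below assumes a BIO label scheme (`B-X`, `I-X`, `O`) and WordPiece tokens with a `##` continuation prefix; the actual label set of boltuix_EntityBERT is not stated in this document, so the `DISEASE` labels and the helper names are hypothetical.

```python
def join_wordpieces(pieces):
    """Join WordPiece tokens, gluing '##' continuations onto the previous piece."""
    text = ""
    for piece in pieces:
        if piece.startswith("##"):
            text += piece[2:]
        else:
            text += (" " if text else "") + piece
    return text


def group_entities(tokens, label_ids, id2label):
    """Merge per-token BIO labels into a list of (entity_type, text) tuples."""
    entities = []
    current_type, current_tokens = None, []

    def flush():
        nonlocal current_type, current_tokens
        if current_tokens:
            entities.append((current_type, join_wordpieces(current_tokens)))
        current_type, current_tokens = None, []

    for token, label_id in zip(tokens, label_ids):
        label = id2label[label_id]
        if label.startswith("B-"):
            flush()  # a new entity starts here
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            current_tokens.append(token)  # continue the open entity
        else:
            flush()  # "O", or an "I-" tag with no matching open entity
    flush()
    return entities


if __name__ == "__main__":
    # Hypothetical label map and predictions, for illustration only.
    id2label = {0: "O", 1: "B-DISEASE", 2: "I-DISEASE"}
    tokens = ["[CLS]", "the", "patient", "was", "diagnosed", "with", "diabetes", ".", "[SEP]"]
    label_ids = [0, 0, 0, 0, 0, 0, 1, 0, 0]
    print(group_entities(tokens, label_ids, id2label))
```

With the inference script, the inputs would come from `outputs.logits.argmax(dim=-1)[0].tolist()`, `tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])`, and `model.config.id2label`.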

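4. Smoke Test

Save the code from section 3.2 as inference.py, then run:

python3 inference.py

Verification results:

Model loaded onto npu:0 successfully
Inference completed without errors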

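5. Accuracy Evaluation

NPU output is compared against CPU output to verify numerical consistency (relative logits error < 1%).

Metric	Value
Top-1 agreement	54/54
Max Logit Diff Ratio	0.043%
Avg KL Divergence	0.000000
Conclusion	PASS

Accuracy requirement: 100% Top-1 agreement and relative logits error < 1%.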
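The comparison can be sketched as follows. This document does not define the three metrics precisely, so the definitions here are assumptions: Top-1 agreement counts rows whose argmax label matches, the diff ratio is the largest absolute logit difference divided by the largest absolute reference logit, and the KL divergence is computed between the softmax distributions of each row. The function works on plain nested lists (for example `logits[0].cpu().tolist()`), so it needs only the standard library.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def compare_logits(ref, test, max_ratio=0.01):
    """Compare two equally shaped lists of logit rows (reference vs. test)."""
    agree, max_abs_diff, max_abs_ref, kl_sum = 0, 0.0, 0.0, 0.0
    for ref_row, test_row in zip(ref, test):
        # Top-1 agreement: same argmax label in both rows.
        if ref_row.index(max(ref_row)) == test_row.index(max(test_row)):
            agree += 1
        for r, t in zip(ref_row, test_row):
            max_abs_diff = max(max_abs_diff, abs(r - t))
            max_abs_ref = max(max_abs_ref, abs(r))
        # KL(ref || test) on the softmax distributions.
        log_p, log_q = log_softmax(ref_row), log_softmax(test_row)
        kl_sum += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    rows = len(ref)
    ratio = max_abs_diff / max_abs_ref if max_abs_ref > 0 else 0.0
    return {
        "top1_agree": agree,
        "rows": rows,
        "max_diff_ratio": ratio,
        "avg_kl": kl_sum / rows,
        # Pass rule from this section: 100% Top-1 agreement and ratio < 1%.
        "passed": agree == rows and ratio < max_ratio,
    }


if __name__ == "__main__":
    cpu_logits = [[2.0, 0.0, -1.0], [0.5, 1.5, 0.0]]
    npu_logits = [[2.001, 0.0, -1.0], [0.5, 1.499, 0.0]]
    print(compare_logits(cpu_logits, npu_logits))
```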

6. Performance Reference

Test conditions: FP32 / batch=8 / max_length=128 / warmup=5 / timed=50 runs, single Ascend910 card.

Metric	Value
Average inference time	`3.23 ms`
Timed runs	`50`
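A timing loop matching the stated conditions (5 warmup calls, 50 timed calls) might look like the sketch below. The exact script used for the numbers above is not included in this document, so the `benchmark` helper and its `sync` parameter are assumptions. Device execution is asynchronous, which is why the sketch accepts an optional synchronization callback to call before the clock is read.

```python
import time


def benchmark(fn, warmup=5, runs=50, sync=None):
    """Return the mean latency of fn() in milliseconds over `runs` timed calls."""
    for _ in range(warmup):
        fn()  # warmup calls are excluded from the measurement
    if sync is not None:
        sync()  # make sure warmup work has finished before timing starts
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    if sync is not None:
        sync()  # wait for queued device work before stopping the clock
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


if __name__ == "__main__":
    # Stand-in workload; replace with the model forward pass.
    mean_ms = benchmark(lambda: sum(range(10_000)))
    print(f"Average time per call: {mean_ms:.3f} ms")
```

With the inference script from section 3.2, a call could be `benchmark(lambda: model(**inputs), sync=torch.npu.synchronize)` inside a `torch.no_grad()` block, with `inputs` built from a batch of 8 texts.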

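7. Notes

The accuracy check measures NPU vs CPU numerical consistency (relative logits error < 1%), not the model's task accuracy.
Warnings related to trust_remote_code, if any, do not affect inference results.
FP16 mixed-precision inference is supported for better performance.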
