
opus-mt-sem-en Ascend NPU Deployment Guide

Project Overview

opus-mt-sem-en is a multilingual translation model from the Helsinki-NLP MarianMT family that translates Semitic languages into English. It is built on the Transformer architecture, with a 6-layer encoder and a 6-layer decoder.

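Features

Inference acceleration on Ascend NPU
CPU vs. NPU precision comparison test (translations match 100% on the 3 test sentences)
Translation from Semitic languages into English
Compatible with HuggingFace transformers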

Requirements

Hardware: Huawei Ascend 910 series NPU
CANN: 8.0.RC1 or later
PyTorch: 2.0+ with torch_npu
Docker: container named test-modelagent
transformers: 4.8+

Directory Structure

opus-mt-sem-en-ascend/
├── inference.py          # inference and test script
├── log.txt               # test log
├── README.md             # this document
├── precision_result.json # precision test results
└── test_sentences.txt    # test sentences

Deployment Steps

1. Enter the container

docker exec -it test-modelagent bash

2. Set the environment variables

source /usr/local/Ascend/ascend-toolkit/set_env.sh

3. Prepare the model files

The model files are located in the /data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-sem-en/ directory.

4. Install dependencies

pip install transformers torch_npu
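
After the steps above, a quick sanity check can confirm that the packages are importable and that the CANN environment script appears to have been sourced. The following is a minimal sketch, not part of this project: the function name `check_environment` is hypothetical, and the assumption that `set_env.sh` exports `ASCEND_TOOLKIT_HOME` may not hold for every CANN release.

```python
import importlib.util
import os

# Packages this guide relies on.
REQUIRED_PACKAGES = ("torch", "torch_npu", "transformers")
# Assumption: set_env.sh exports this variable; adjust for your CANN version.
CANN_ENV_VAR = "ASCEND_TOOLKIT_HOME"


def check_environment():
    """Report which packages are importable and whether the CANN variable is set."""
    report = {
        pkg: importlib.util.find_spec(pkg) is not None for pkg in REQUIRED_PACKAGES
    }
    report[CANN_ENV_VAR] = CANN_ENV_VAR in os.environ
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Any entry reported as MISSING points back to step 2 (environment variables) or step 4 (dependencies).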

Usage

Option 1: Standard inference mode

Run the inference script to translate:

cd /data/ysws/agentsp/5-20-1/opus-mt-sem-en-ascend/

# Use the default test sentences
python3 inference.py

Option 2: Precision test mode (CPU vs. NPU)

Run the precision comparison test to verify that the NPU results match the CPU results:

cd /data/ysws/agentsp/5-20-1/opus-mt-sem-en-ascend/

# Run the full precision test
python3 inference.py --precision_test
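
The source of inference.py is not reproduced in this document, so the following is only a sketch of how a CPU vs. NPU translation comparison could be scored, assuming an exact string match per sentence. The function name `match_rate` is hypothetical; the sample strings are taken from the test log.

```python
def match_rate(cpu_outputs, npu_outputs):
    """Return the fraction of sentences whose CPU and NPU translations are identical."""
    if len(cpu_outputs) != len(npu_outputs):
        raise ValueError("CPU and NPU output lists must have the same length")
    if not cpu_outputs:
        raise ValueError("no outputs to compare")
    matches = sum(cpu == npu for cpu, npu in zip(cpu_outputs, npu_outputs))
    return matches / len(cpu_outputs)


# Translations reported for the three precision-test sentences.
cpu = ["Dobry den, would you mate?", "Jacke jee pocasi?", "We're not schledanou!"]
npu = ["Dobry den, would you mate?", "Jacke jee pocasi?", "We're not schledanou!"]

rate = match_rate(cpu, npu)
print(f"match rate: {rate:.0%}")  # match rate: 100%
```

An exact string match is a strict criterion: it passes only when beam search selects the same tokens on both devices, which is what the 100% threshold in the results expresses.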

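Test Verification

Precision Test Results

Metric	Measured	Threshold	Status
Translation match rate	100%	100%	PASS
NPU speedup	13.56x	> 1x	PASS
Mean CPU inference time per sentence (3 sentences)	2.56s	-	-
Mean NPU inference time per sentence (3 sentences)	0.19s	-	-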
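
The speedup figure follows directly from the per-sentence timings in the test log. As a worked check, the arithmetic can be reproduced like this (the variable names are illustrative):

```python
from statistics import mean

# Per-sentence latencies in seconds, copied from the precision-test log.
cpu_times = [3.1759, 2.1074, 2.3885]
npu_times = [0.2323, 0.1572, 0.1763]

mean_cpu = mean(cpu_times)
mean_npu = mean(npu_times)
speedup = mean_cpu / mean_npu  # ratio of mean latencies

print(f"mean CPU: {mean_cpu:.4f}s")  # mean CPU: 2.5573s
print(f"mean NPU: {mean_npu:.4f}s")  # mean NPU: 0.1886s
print(f"speedup:  {speedup:.2f}x")   # speedup:  13.56x
```

With only three sentences, the ratio is indicative rather than a benchmark.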

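Performance Data

Operation	Time
Total NPU translation time for 8 sentences	1.52s
Mean NPU inference time per sentence	0.190s
Mean CPU inference time per sentence	2.56s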

Sample Inference Results

Source	Translation
Dobry den, jak se mate?	Dobry den, would you mate?
Jake je pocasi?	Jacke jee pocasi?
Dekuji mockrat.	Mocks.
Na shledanou!	We're not schledanou!

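Result: CPU and NPU outputs are identical. Note that the sample inputs are Czech written without diacritics, which is not a Semitic language, so the translations themselves are not meaningful; the samples only demonstrate that the two devices produce the same output.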

Test Log

Full test log:

============================================================
opus-mt-sem-en Ascend NPU 部署测试
============================================================
MODEL_DIR: /data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-sem-en
OUTPUT_DIR: /data/ysws/agentsp/5-20-1/opus-mt-sem-en-ascend
Mode: precision_test

============================================================
创建测试样本
============================================================
测试句子已保存到: /data/ysws/agentsp/5-20-1/opus-mt-sem-en-ascend/test_sentences.txt
共 8 句

============================================================
opus-mt-sem-en NPU 推理测试
============================================================
Device: npu:0
Model loaded successfully!

测试句子数量: 8
  [1] Dobry den, jak se mate?
  [2] Jake je pocasi?
  [3] Dekuji mockrat.
  [4] Na shledanou!
  [5] Kolik to stoji?
  [6] Nevim.
  [7] Jsem student.
  [8] Good morning!

开始翻译 (device: npu:0)...

翻译结果:
  [1] 原文: Dobry den, jak se mate?
      译文: Dobry den, would you mate?
  [2] 原文: Jake je pocasi?
      译文: Jacke jee pocasi?
  [3] 原文: Dekuji mockrat.
      译文: Mocks.
  [4] 原文: Na shledanou!
      译文: We're not schledanou!
  [5] 原文: Kolik to stoji?
      译文: Collick tostories?
  [6] 原文: Nevim.
      译文: Nevim.
  [7] 原文: Jsem student.
      译文: Name a student.
  [8] 原文: Good morning!
      译文: Good Morning!

总耗时: 1.5194s
平均每句: 0.1899s

============================================================
opus-mt-sem-en 精度测试 (CPU vs NPU)
============================================================
Device: npu:0

加载 CPU 模型...
CPU 模型加载完成

加载 NPU 模型...
NPU 模型加载完成

测试句子数量: 3

--- 句子 1 ---
原文: Dobry den, jak se mate?
CPU 译文: Dobry den, would you mate?
CPU 耗时: 3.1759s
NPU 译文: Dobry den, would you mate?
NPU 耗时: 0.2323s
译文匹配: True

--- 句子 2 ---
原文: Jake je pocasi?
CPU 译文: Jacke jee pocasi?
CPU 耗时: 2.1074s
NPU 译文: Jacke jee pocasi?
NPU 耗时: 0.1572s
译文匹配: True

--- 句子 3 ---
原文: Na shledanou!
CPU 译文: We're not schledanou!
CPU 耗时: 2.3885s
NPU 译文: We're not schledanou!
NPU 耗时: 0.1763s
译文匹配: True

============================================================
精度测试结果汇总
============================================================
译文完全匹配: PASS
平均 CPU 推理时间: 2.5573s
平均 NPU 推理时间: 0.1886s
NPU 加速比: 13.56x

精度阈值: 1.0%
译文匹配率: PASS

总体状态: PASS

============================================================
测试完成!
============================================================

Python API Usage Example

Basic translation

import torch
import torch_npu  # registers the "npu" device with PyTorch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_DIR = "/data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-sem-en"

# Load the tokenizer and model from the local directory, then move the model to the NPU.
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIR)
model = model.to("npu:0")
model.eval()

# Tokenize the batch and move every tensor to the same device as the model.
texts = ["Dobry den, jak se mate?"]
inputs = tokenizer(texts, return_tensors="pt", padding=True)
inputs = {k: v.to("npu:0") for k, v in inputs.items()}

with torch.no_grad():
    # Pass attention_mask together with input_ids so padded positions are ignored in batches.
    gen_ids = model.generate(**inputs, max_length=100, num_beams=4, early_stopping=True)
    translations = tokenizer.batch_decode(gen_ids, skip_special_tokens=True)

print(translations)  # ['Dobry den, would you mate?']

Model Information

Architecture: MarianMT (Transformer)
Encoder: 6 Transformer layers
Decoder: 6 Transformer layers
Language direction: Semitic -> English

Inference Parameters

Parameter	Value
max_length	100
num_beams	4
early_stopping	True
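
To keep these decoding settings in one place, they can be bundled into a dictionary and passed to `generate`. This is a sketch only: `GENERATION_KWARGS` and `translate` are hypothetical names, not part of inference.py, and the tokenizer and model are assumed to be loaded as in the basic translation example.

```python
# Decoding settings copied from the parameter table.
GENERATION_KWARGS = {"max_length": 100, "num_beams": 4, "early_stopping": True}


def translate(texts, tokenizer, model, device="npu:0", **overrides):
    """Translate a batch of sentences; keyword overrides replace individual settings."""
    kwargs = {**GENERATION_KWARGS, **overrides}
    # The tokenizer output carries input_ids and attention_mask; move both to the device.
    inputs = tokenizer(texts, return_tensors="pt", padding=True).to(device)
    gen_ids = model.generate(**inputs, **kwargs)
    return tokenizer.batch_decode(gen_ids, skip_special_tokens=True)
```

For example, `translate(texts, tokenizer, model, num_beams=1)` would switch to greedy decoding while keeping the other settings.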

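FAQ

Q: What if the precision test fails?

A: Check that the NPU driver is installed correctly and that the CANN environment script has been sourced. CPU and NPU outputs should match exactly.

Q: Why do the translations look odd?

A: This is expected when the input is not in a supported source language. opus-mt-sem-en translates Semitic languages into English, so the input should be in a Semitic language.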
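
One way to narrow down a failure is to check whether PyTorch can see an NPU at all. The sketch below uses the hypothetical helper name `npu_status` and assumes torch_npu exposes `torch.npu.is_available()` and `torch.npu.device_count()`, mirroring the `torch.cuda` API:

```python
def npu_status():
    """Return a short description of whether an Ascend NPU is usable from PyTorch."""
    try:
        import torch
        import torch_npu  # noqa: F401  (importing registers the "npu" device)
    except Exception as exc:  # missing package or missing CANN shared libraries
        return f"import failed: {exc}"
    if not torch.npu.is_available():
        return "torch_npu imported, but no NPU is visible"
    return f"{torch.npu.device_count()} NPU(s) visible"


print(npu_status())
```

An import failure usually points to the dependencies or to an unsourced CANN environment, while "no NPU is visible" points to the driver or the container's device mapping.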
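
A rough pre-check on the input script can catch obviously unsupported input before it reaches the model. The following heuristic is an illustration only: `uses_semitic_script` and the prefix list are hypothetical and are not the model's official language list. It misses Semitic languages written in Latin script (such as Maltese) and accepts non-Semitic languages written in Arabic script (such as Persian).

```python
import unicodedata

# Unicode character-name prefixes for scripts used by some Semitic languages.
# Assumption: illustrative list, not derived from the model card.
SEMITIC_SCRIPT_PREFIXES = ("ARABIC", "HEBREW", "ETHIOPIC", "SYRIAC")


def uses_semitic_script(text):
    """Return True if most letters in `text` belong to one of the scripts above."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    hits = sum(
        unicodedata.name(ch, "").startswith(SEMITIC_SCRIPT_PREFIXES) for ch in letters
    )
    return hits / len(letters) > 0.5


print(uses_semitic_script("مرحبا"))                    # True
print(uses_semitic_script("Dobry den, jak se mate?"))  # False
```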

References

Original model: https://huggingface.co/Helsinki-NLP/opus-mt-sem-en
MarianMT: https://huggingface.co/transformers/model_doc/marian.html
HuggingFace Transformers: https://huggingface.co/transformers

License

This project is licensed under the Apache-2.0 license.