
opus-mt-ROMANCE-en Ascend NPU Deployment Guide

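Project Overview

opus-mt-ROMANCE-en is a multilingual machine translation model developed by Helsinki-NLP that translates Romance languages into English. Supported source languages include French, Spanish, Italian, Portuguese, and Romanian, with 40+ languages in total. It is a MarianMT model built on the Transformer architecture, with approximately 220M parameters.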

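Features

Inference acceleration on Ascend NPU
CPU vs NPU precision comparison test (translations are identical)
Multilingual translation (Romance languages → English)
Beam search decoding
Compatible with HuggingFace transformers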

Requirements

Hardware: Huawei Ascend 910 series NPU
CANN: 8.0.RC1 or later
PyTorch: 2.0+ with torch_npu
transformers: 4.8+

Directory Structure

opus-mt-ROMANCE-en-ascend/
├── inference.py          # Inference test script
├── log.txt               # Test log
├── README.md             # This document
├── test_sentences.txt    # Test sentences
└── precision_result.json # Precision test results

Deployment Steps

1. Enter the container

docker exec -it test-modelagent bash

2. Set environment variables

source /usr/local/Ascend/ascend-toolkit/set_env.sh
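
A quick way to confirm that the environment script took effect in the current shell is to check for variables it is expected to export. The sketch below is only illustrative: the variable names `ASCEND_TOOLKIT_HOME` and `ASCEND_HOME_PATH` are assumptions about what `set_env.sh` exports and may differ between CANN versions, and `missing_env_vars` is a hypothetical helper, not part of this project.

```python
import os

# Assumed variable names: CANN's set_env.sh is expected to export these,
# but the exact set can differ between CANN versions.
EXPECTED_VARS = ("ASCEND_TOOLKIT_HOME", "ASCEND_HOME_PATH")


def missing_env_vars(env=None, names=EXPECTED_VARS):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Not set (was set_env.sh sourced in this shell?):", ", ".join(missing))
    else:
        print("Ascend toolkit environment variables are set.")
```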

3. Prepare the model files

The model files are located in /data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-ROMANCE-en/:

pytorch_model.bin - PyTorch model weights
config.json - Model configuration
tokenizer_config.json - Tokenizer configuration
vocab.json - Vocabulary
source.spm / target.spm - SentencePiece models
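
Before loading the model, it can help to confirm that every file in the list above is actually present. The following is a minimal sketch using only the standard library. The file names come from the list above, while `missing_model_files` is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path

# File names taken from the model file list in this guide.
REQUIRED_FILES = (
    "pytorch_model.bin",
    "config.json",
    "tokenizer_config.json",
    "vocab.json",
    "source.spm",
    "target.spm",
)


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the required file names that do not exist under `model_dir`."""
    root = Path(model_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    model_dir = "/data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-ROMANCE-en"
    missing = missing_model_files(model_dir)
    print("missing files:", missing if missing else "none")
```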

4. Install dependencies

pip install transformers torch_npu sacremoses sentencepiece
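
After installation, the installed versions can be compared by eye against the requirements listed earlier. This sketch reports versions through `importlib.metadata` without importing the packages themselves. The package list is an assumption: `torch` and `sentencepiece` are included because the model runs on PyTorch and the tokenizer loads `.spm` files, and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata

# Package names follow this guide; torch and sentencepiece are added as assumptions.
PACKAGES = ("torch", "torch_npu", "transformers", "sacremoses", "sentencepiece")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```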

Usage

Option 1: Standard inference mode

Run the inference script to translate the test sentences:

cd /data/ysws/agentsp/5-20-1/opus-mt-ROMANCE-en-ascend/

python3 inference.py

Option 2: Precision test mode (CPU vs NPU)

cd /data/ysws/agentsp/5-20-1/opus-mt-ROMANCE-en-ascend/

python3 inference.py --precision_test
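
The source of inference.py is not reproduced in this guide, so the following is only a guess at how the two modes might be selected from the command line. The `--precision_test` flag is taken from the command above. The function names and the `"inference"` mode label are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with a single boolean flag, mirroring the command above."""
    parser = argparse.ArgumentParser(
        description="opus-mt-ROMANCE-en inference on Ascend NPU (illustrative sketch)"
    )
    parser.add_argument(
        "--precision_test",
        action="store_true",
        help="compare CPU and NPU translations instead of running plain inference",
    )
    return parser


def select_mode(argv=None):
    """Return the mode name implied by the given argument list."""
    args = build_parser().parse_args(argv)
    return "precision_test" if args.precision_test else "inference"


# Passing an explicit argument list keeps the example independent of sys.argv.
print("Mode:", select_mode(["--precision_test"]))
```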

Test Verification

Precision test results

Metric	Measured	Threshold	Status
Translation match rate	100%	100%	PASS
NPU speedup	12.41x	-	Significant speedup
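
The match rate in the table can be read as the fraction of sentences whose CPU and NPU translations are exactly equal as strings. Here is a minimal sketch of that comparison, assuming exact string equality is the criterion. `exact_match_rate` is a hypothetical helper, and the sample translations are the three CPU/NPU pairs reported in this guide's test log.

```python
def exact_match_rate(reference, candidate):
    """Return the fraction of positions where the two lists hold equal strings."""
    if len(reference) != len(candidate):
        raise ValueError("both lists must contain the same number of translations")
    if not reference:
        raise ValueError("at least one translation is required")
    matches = sum(ref == cand for ref, cand in zip(reference, candidate))
    return matches / len(reference)


cpu_out = ["Hi, how are you today?", "Good morning, sir.", "Where's the library?"]
npu_out = ["Hi, how are you today?", "Good morning, sir.", "Where's the library?"]

rate = exact_match_rate(cpu_out, npu_out)
# A threshold of 100% means every translation must match exactly.
print(f"match rate: {rate:.0%}", "PASS" if rate >= 1.0 else "FAIL")
```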

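Performance data

Operation	Time
Average CPU inference time (per sentence)	1.5386s
Average NPU inference time (per sentence)	0.1240s
NPU speedup	12.41x
Total time for batch translation of 8 sentences	1.2453s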
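
The averages and the speedup above can be recomputed from the per-sentence timings in the test log. The sketch below assumes the speedup is defined as mean CPU time divided by mean NPU time. Because the log rounds each timing to four decimals, the ratio recomputed from those values comes out at about 12.40, marginally below the reported 12.41x, which matches the ratio of the rounded averages (1.5386 / 0.1240).

```python
from statistics import mean

# Per-sentence timings in seconds, copied from the precision test log.
cpu_times = [1.7299, 1.2436, 1.6423]
npu_times = [0.1535, 0.0924, 0.1262]


def speedup(baseline_times, accelerated_times):
    """Ratio of mean baseline time to mean accelerated time."""
    return mean(baseline_times) / mean(accelerated_times)


print(f"mean CPU: {mean(cpu_times):.4f}s")
print(f"mean NPU: {mean(npu_times):.4f}s")
print(f"speedup:  {speedup(cpu_times, npu_times):.2f}x")
```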

Sample inference results

Input sentence	Output translation
Hola, como estas hoy?	Hi, how are you today?
Buenos dias, senor.	Good morning, sir.
Donde esta la biblioteca?	Where's the library?
Gracias por tu ayuda.	Thanks for your help.

Result: the CPU and NPU produce identical translations, and the NPU is about 12.41x faster than the CPU.

Test Log

The full test log is saved in log.txt.

Full test log

============================================================
opus-mt-ROMANCE-en Ascend NPU deployment test
============================================================
MODEL_DIR: /data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-ROMANCE-en
OUTPUT_DIR: /data/ysws/agentsp/5-20-1/opus-mt-ROMANCE-en-ascend
Mode: precision_test

============================================================
Creating test samples
============================================================
Test sentences saved to: /data/ysws/agentsp/5-20-1/opus-mt-ROMANCE-en-ascend/test_sentences.txt
8 sentences in total

============================================================
opus-mt-ROMANCE-en NPU inference test
============================================================
Device: npu:0
Model loaded successfully!

Number of test sentences: 8
  [1] Hola, como estas hoy?
  [2] Buenos dias, senor.
  [3] Me llamo Juan.
  [4] Donde esta la biblioteca?
  [5] Gracias por tu ayuda.
  [6] Que tal el weather?
  [7] Hasta manana!
  [8] Te quiero mucho.

Starting translation (device: npu:0)...

Translation results:
  [1] Source: Hola, como estas hoy?
      Translation: Hi, how are you today?
  [2] Source: Buenos dias, senor.
      Translation: Good morning, sir.
  [3] Source: Me llamo Juan.
      Translation: My name is Juan.
  [4] Source: Donde esta la biblioteca?
      Translation: Where's the library?
  [5] Source: Gracias por tu ayuda.
      Translation: Thanks for your help.
  [6] Source: Que tal el weather?
      Translation: How about the weather?
  [7] Source: Hasta manana!
      Translation: See you tomorrow!
  [8] Source: Te quiero mucho.
      Translation: I love you so much.

Total time: 1.2453s
Average per sentence: 0.1557s

============================================================
opus-mt-ROMANCE-en precision test (CPU vs NPU)
============================================================
Device: npu:0

Loading CPU model...
CPU model loaded

Loading NPU model...
NPU model loaded

Number of test sentences: 3

--- Sentence 1 ---
Source: Hola, como estas hoy?
CPU translation: Hi, how are you today?
CPU time: 1.7299s
NPU translation: Hi, how are you today?
NPU time: 0.1535s
Translation match: True

--- Sentence 2 ---
Source: Buenos dias, senor.
CPU translation: Good morning, sir.
CPU time: 1.2436s
NPU translation: Good morning, sir.
NPU time: 0.0924s
Translation match: True

--- Sentence 3 ---
Source: Donde esta la biblioteca?
CPU translation: Where's the library?
CPU time: 1.6423s
NPU translation: Where's the library?
NPU time: 0.1262s
Translation match: True

============================================================
Precision test summary
============================================================
Exact translation match: PASS
Average CPU inference time: 1.5386s
Average NPU inference time: 0.1240s
NPU speedup: 12.41x

Precision threshold: 1.0%
Translation match rate: PASS

Overall status: PASS

============================================================
Test complete!
============================================================

Python API Usage Example

import torch
import torch_npu  # noqa: F401  (importing this registers the "npu" device with PyTorch)
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_DIR = "/data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-ROMANCE-en"
DEVICE = "npu:0"

# Load the tokenizer and model from the local directory, then move the model to the NPU.
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIR)
model = model.to(DEVICE)
model.eval()

# Tokenize the source sentences and move the tensors to the same device as the model.
texts = ["Hola, como estas hoy?"]
inputs = tokenizer(texts, return_tensors="pt", padding=True)
inputs = {k: v.to(DEVICE) for k, v in inputs.items()}

# Decode with beam search; no gradients are needed for inference.
with torch.no_grad():
    gen_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=100,
        num_beams=4,
        early_stopping=True,
    )

translations = tokenizer.batch_decode(gen_ids, skip_special_tokens=True)
print(translations)  # ['Hi, how are you today?']
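
To reproduce per-sentence timings like those in the test log, the call being measured can be wrapped in a small timer. The sketch below is one possible approach, not the method used by inference.py. `timed` is a hypothetical helper, and the optional `sync` callback is where a device synchronization call (assumed here to be `torch.npu.synchronize` from torch_npu) could be passed so that asynchronous device work is included in the measurement. A stand-in workload is used so the example runs without an NPU.

```python
import time


def timed(fn, *args, sync=None, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds).

    If `sync` is given, it is called before the clock starts and again before
    it stops, so pending asynchronous device work is counted.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if sync is not None:
        sync()
    return result, time.perf_counter() - start


# Stand-in workload; on an NPU one would pass the translation call instead and,
# as an assumption, sync=torch.npu.synchronize.
result, seconds = timed(sum, range(1_000_000))
print(f"result={result}, elapsed={seconds:.4f}s")
```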

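Model Architecture

Architecture: MarianMT (Transformer encoder-decoder)
Encoder: 6 Transformer layers
Decoder: 6 Transformer layers
Hidden size: 512
Attention heads: 8
Parameters: ~220M
Source languages: 40+ Romance languages
Target language: English (en)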
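
A rough parameter count can be derived from the architecture listed above. The sketch below is an estimate under stated assumptions, not a measurement: the vocabulary size (65001) and feed-forward dimension (2048) are not given in this guide and should be read from config.json, and a single shared token-embedding matrix is assumed, with positional embeddings and the output bias ignored. The result can be compared with the parameter figure listed above and, more reliably, with `sum(p.numel() for p in model.parameters())` on the loaded model. A large gap would suggest rechecking the listed figure.

```python
def attention_params(d_model):
    # Query, key, value and output projections, each with a weight and a bias.
    return 4 * (d_model * d_model + d_model)


def feed_forward_params(d_model, ffn_dim):
    # Two linear layers (d_model -> ffn_dim -> d_model), each with a bias.
    return (d_model * ffn_dim + ffn_dim) + (ffn_dim * d_model + d_model)


def layer_norm_params(d_model):
    # One scale and one shift vector.
    return 2 * d_model


def estimate_params(vocab_size, d_model, ffn_dim, encoder_layers, decoder_layers):
    """Estimate the parameter count of a Transformer encoder-decoder."""
    encoder_layer = (
        attention_params(d_model)
        + feed_forward_params(d_model, ffn_dim)
        + 2 * layer_norm_params(d_model)
    )
    # A decoder layer adds a cross-attention block and a third layer norm.
    decoder_layer = (
        2 * attention_params(d_model)
        + feed_forward_params(d_model, ffn_dim)
        + 3 * layer_norm_params(d_model)
    )
    embeddings = vocab_size * d_model  # assumed shared by encoder, decoder and output
    return embeddings + encoder_layers * encoder_layer + decoder_layers * decoder_layer


# vocab_size and ffn_dim are assumptions; the other values come from the list above.
estimate = estimate_params(
    vocab_size=65001, d_model=512, ffn_dim=2048, encoder_layers=6, decoder_layers=6
)
print(f"estimated parameters: {estimate / 1e6:.1f}M")
```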

References

Original model: https://huggingface.co/Helsinki-NLP/opus-mt-ROMANCE-en
Helsinki-NLP: https://github.com/Helsinki-NLP
HuggingFace Transformers: https://huggingface.co/transformers

License

This project is licensed under the Apache-2.0 license.