
opus-mt-mul-en Ascend NPU Deployment Guide

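Project Overview

opus-mt-mul-en is a multilingual machine translation model developed by Helsinki-NLP that translates from many source languages (mul) into English (en). Supported source languages include French, Spanish, German, Italian, Portuguese, Romanian, Czech, and Dutch, among others. It is a MarianMT model built on the Transformer architecture, with roughly 220M parameters.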

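Features

Inference acceleration on Ascend NPU
CPU vs. NPU precision comparison test (translations are identical)
Multilingual translation (multiple languages → English)
Beam search decoding
Compatible with HuggingFace transformers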

Requirements

Hardware: Huawei Ascend 910 series NPU
CANN: 8.0.RC1 or later
PyTorch: 2.0+ with torch_npu
transformers: 4.8+

Directory Structure

opus-mt-mul-en-ascend/
├── inference.py          # inference test script
├── log.txt               # test log
├── README.md             # this document
├── test_sentences.txt    # test sentences
└── precision_result.json # precision test results

Deployment Steps

1. Enter the container

docker exec -it test-modelagent bash

2. Set environment variables

source /usr/local/Ascend/ascend-toolkit/set_env.sh

3. Install dependencies

pip install transformers torch_npu sacremoses sentencepiece
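
After installing, it can help to confirm which packages are actually present in the container before running the script. The following is a minimal sketch using only the standard library. The package list is an assumption: "torch" and "sentencepiece" are added here because torch_npu builds on torch and Marian tokenizers are SentencePiece-based, and the helper name installed_versions is hypothetical.

```python
from importlib import metadata

# Packages from the install step, plus "torch" and "sentencepiece" as
# assumptions (see the note above this block).
REQUIRED = ["torch", "torch_npu", "transformers", "sacremoses", "sentencepiece"]


def installed_versions(packages=REQUIRED):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

The printed versions can then be compared by eye against the requirements listed earlier (PyTorch 2.0+, transformers 4.8+).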

Usage

Method 1: Normal Inference Mode

cd /data/ysws/agentsp/5-20-1/opus-mt-mul-en-ascend/

python3 inference.py

Method 2: Precision Test Mode (CPU vs NPU)

cd /data/ysws/agentsp/5-20-1/opus-mt-mul-en-ascend/

python3 inference.py --precision_test
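
The two invocations differ only in the --precision_test flag. How inference.py handles it internally is not shown in this guide; the sketch below is one plausible way to wire such a flag with argparse. The function names and the "inference" mode label are hypothetical; "precision_test" matches the Mode: line in the test log.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --precision_test switches to CPU-vs-NPU mode."""
    parser = argparse.ArgumentParser(description="opus-mt-mul-en inference test")
    parser.add_argument(
        "--precision_test",
        action="store_true",
        help="compare CPU and NPU translations instead of running plain inference",
    )
    return parser.parse_args(argv)


def select_mode(args):
    # Hypothetical mode labels: "precision_test" appears in the log,
    # "inference" is assumed for the default run.
    return "precision_test" if args.precision_test else "inference"


print(select_mode(parse_args([])))                    # inference
print(select_mode(parse_args(["--precision_test"])))  # precision_test
```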

Test Validation

Precision Test Results

Metric	Measured	Threshold	Status
Translation match rate	100%	100%	PASS
NPU speedup	12.06x	-	Significant speedup
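
The match rate in the table is an exact string comparison between CPU and NPU translations. A minimal sketch of that check follows, using the three sentence pairs from the precision test log; the function name match_rate is hypothetical and the actual script may compute it differently.

```python
def match_rate(cpu_outputs, npu_outputs):
    """Fraction of sentences whose CPU and NPU translations are identical strings."""
    if len(cpu_outputs) != len(npu_outputs):
        raise ValueError("both lists must contain one translation per sentence")
    if not cpu_outputs:
        raise ValueError("need at least one sentence")
    matches = sum(c == n for c, n in zip(cpu_outputs, npu_outputs))
    return matches / len(cpu_outputs)


# Translations reported in the precision test log.
cpu = ["Hello, how are you?", "Hey, how are these today?", "Hello, how are you?"]
npu = ["Hello, how are you?", "Hey, how are these today?", "Hello, how are you?"]

rate = match_rate(cpu, npu)
print(f"match rate: {rate:.0%}", "PASS" if rate >= 1.0 else "FAIL")  # match rate: 100% PASS
```

Comparing decoded strings rather than logits is a coarse check: small numerical differences between devices go unnoticed unless they change the chosen tokens.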

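Performance Data

Operation	Time
Average CPU inference time (single sentence)	1.6945s
Average NPU inference time (single sentence)	0.1405s
NPU speedup	12.06x
Total time to translate 8 sentences in a batch	1.3045s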
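
The derived figures follow directly from the reported timings. As a worked check, the speedup is the ratio of the two averages, and the per-sentence time of the 8-sentence NPU run is its total divided by 8:

```python
avg_cpu_s = 1.6945      # average CPU time per sentence (precision test)
avg_npu_s = 0.1405      # average NPU time per sentence (precision test)
batch_total_s = 1.3045  # total time for the 8-sentence NPU run
batch_size = 8

speedup = avg_cpu_s / avg_npu_s
per_sentence_s = batch_total_s / batch_size

print(f"speedup: {speedup:.2f}x")                      # speedup: 12.06x
print(f"per sentence (8-sentence run): {per_sentence_s:.4f}s")  # 0.1631s
```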

Sample Inference Results

Input sentence	Output translation
Bonjour, comment allez-vous?	Hello, how are you?
Hola, como estas hoy?	Hey, how are these today?
Guten Tag, wie geht es Ihnen?	Hello, how are you?
Buongiorno, come stai?	Good morning, how are you?

Result: for the three sentences in the precision test, CPU and NPU produce identical translations, and the NPU is about 12.06x faster than the CPU.
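
Timings like these depend on where the clock is stopped, because accelerator work can be queued asynchronously. The sketch below shows one way to time a call with an optional synchronization hook. It is an illustration, not the method used by inference.py; the helper name timed and the sync parameter are hypothetical.

```python
import time


def timed(fn, *args, sync=None, **kwargs):
    """Run fn once and return (result, elapsed seconds).

    sync is an optional zero-argument callable that blocks until queued
    device work has finished; it is called before starting and before
    stopping the clock.
    """
    if sync is not None:
        sync()  # drain work queued before the timed call
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if sync is not None:
        sync()  # wait for the timed call's device work to finish
    return result, time.perf_counter() - start


# CPU-only demonstration with a stand-in workload.
result, elapsed = timed(sum, range(1_000_000))
print(result, f"{elapsed:.4f}s")
```

On an NPU, passing sync=torch.npu.synchronize (available once torch_npu is imported) would be the natural choice, by analogy with torch.cuda.synchronize.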

Full Test Log

============================================================
opus-mt-mul-en Ascend NPU deployment test
============================================================
MODEL_DIR: /data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-mul-en
OUTPUT_DIR: /data/ysws/agentsp/5-20-1/opus-mt-mul-en-ascend
Mode: precision_test

============================================================
Creating test samples
============================================================
Test sentences saved to: /data/ysws/agentsp/5-20-1/opus-mt-mul-en-ascend/test_sentences.txt
8 sentences in total

============================================================
opus-mt-mul-en NPU inference test
============================================================
Device: npu:0
Model loaded successfully!

Number of test sentences: 8
  [1] Bonjour, comment allez-vous?
  [2] Hola, como estas hoy?
  [3] Guten Tag, wie geht es Ihnen?
  [4] Buongiorno, come stai?
  [5] Bom dia, como vai?
  [6] Buna ziua, cum esti?
  [7] Dobry den, jak se mate?
  [8] Goedendag, hoe gaat het?

Starting translation (device: npu:0)...

Translations:
  [1] Source: Bonjour, comment allez-vous?
      Translation: Hello, how are you?
  [2] Source: Hola, como estas hoy?
      Translation: Hey, how are these today?
  [3] Source: Guten Tag, wie geht es Ihnen?
      Translation: Hello, how are you?
  [4] Source: Buongiorno, come stai?
      Translation: Good morning, how are you?
  [5] Source: Bom dia, como vai?
      Translation: Hello, how's it going?
  [6] Source: Buna ziua, cum esti?
      Translation: Hello, how are you?
  [7] Source: Dobry den, jak se mate?
      Translation: Hello, how's it going?
  [8] Source: Goedendag, hoe gaat het?
      Translation: Good afternoon, how's it going?

Total time: 1.3045s
Average per sentence: 0.1631s

============================================================
opus-mt-mul-en precision test (CPU vs NPU)
============================================================
Device: npu:0

Loading CPU model...
CPU model loaded

Loading NPU model...
NPU model loaded

Number of test sentences: 3

--- Sentence 1 ---
Source: Bonjour, comment allez-vous?
CPU translation: Hello, how are you?
CPU time: 1.6567s
NPU translation: Hello, how are you?
NPU time: 0.1460s
Translations match: True

--- Sentence 2 ---
Source: Hola, como estas hoy?
CPU translation: Hey, how are these today?
CPU time: 1.8231s
NPU translation: Hey, how are these today?
NPU time: 0.1444s
Translations match: True

--- Sentence 3 ---
Source: Guten Tag, wie geht es Ihnen?
CPU translation: Hello, how are you?
CPU time: 1.6035s
NPU translation: Hello, how are you?
NPU time: 0.1311s
Translations match: True

============================================================
Precision test summary
============================================================
Exact translation match: PASS
Average CPU inference time: 1.6945s
Average NPU inference time: 0.1405s
NPU speedup: 12.06x

Precision threshold: 1.0%
Translation match rate: PASS

Overall status: PASS

============================================================
Test complete!
============================================================

Python API Example

import torch
import torch_npu  # noqa: F401  (importing it registers the "npu" device with PyTorch)
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_DIR = "/data/ysws/agentsp/5-20-1/Helsinki-NLP/opus-mt-mul-en"
DEVICE = "npu:0"

# Load the tokenizer and model from the local directory, then move the model to the NPU.
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIR)
model = model.to(DEVICE)
model.eval()

# Tokenize the source sentences and move the input tensors to the same device as the model.
texts = ["Bonjour, comment allez-vous?"]
inputs = tokenizer(texts, return_tensors="pt", padding=True)
inputs = {k: v.to(DEVICE) for k, v in inputs.items()}

# Generate with beam search; no_grad avoids tracking gradients during inference.
with torch.no_grad():
    gen_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=100,
        num_beams=4,
        early_stopping=True,
    )

translations = tokenizer.batch_decode(gen_ids, skip_special_tokens=True)
print(translations)  # ['Hello, how are you?']
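
The example above hard-codes "npu:0". For scripts that should also run on a machine without an NPU, one option is to choose the device at runtime. The following is a small sketch of that idea; the helper name pick_device is hypothetical, and falling back to the CPU on any import or initialization error is a design assumption rather than something this project's script is known to do.

```python
def pick_device(preferred="npu:0"):
    """Return `preferred` if torch_npu loads and reports an NPU, otherwise "cpu"."""
    try:
        import torch
        import torch_npu  # noqa: F401  (registers the NPU backend with PyTorch)
        available = torch.npu.is_available()
    except Exception:
        # Covers a missing package as well as a broken CANN environment
        # (for example, set_env.sh not sourced).
        return "cpu"
    return preferred if available else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The returned string can be passed to model.to(...) and to the input tensors in place of the literal "npu:0".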

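Model Architecture

Architecture: MarianMT (Transformer encoder-decoder)
Encoder: 6 Transformer layers
Decoder: 6 Transformer layers
Hidden size: 512
Attention heads: 8
Parameters: ~220M
Source language: multilingual (mul)
Target language: English (en)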
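
The parameter count listed above can be checked against the loaded checkpoint. A minimal sketch follows; the helper names are hypothetical, and the function works on any object that exposes a PyTorch-style parameters() method.

```python
def count_parameters(model):
    """Total number of elements across all parameter tensors of a module."""
    return sum(p.numel() for p in model.parameters())


def in_millions(n):
    """Format a raw count as millions, e.g. 1500000 -> '1.5M'."""
    return f"{n / 1e6:.1f}M"


# Usage with the model from this guide (requires torch and transformers):
#   from transformers import AutoModelForSeq2SeqLM
#   model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIR)
#   print(in_millions(count_parameters(model)))
```

PyTorch's parameters() yields each shared tensor once, so tied embeddings are not double-counted by this helper.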

References

Original model: https://huggingface.co/Helsinki-NLP/opus-mt-mul-en
Helsinki-NLP: https://github.com/Helsinki-NLP
HuggingFace Transformers: https://huggingface.co/transformers

License

This project is released under the Apache-2.0 license.