nanyizjm/moss-tts-nano

NPU Tag Evidence

This repository is published as an Ascend NPU model repository. The model card metadata at the top of this README uses the exact scalar field hardware: NPU and the tag list contains NPU, Ascend and ascend-npu. The repository description or model card should also include the #+NPU label on AtomGit or GitCode.

| Item | Value |
| --- | --- |
| Repository | https://gitcode.com/nanyizjm/moss-tts-nano |
| Competition task | Track 1 model adaptation |
| Hardware metadata | hardware: NPU |
| Required tag | #+NPU |
| README data policy | Inference, accuracy and performance values are written as text in this README; images are not used as a replacement for data. |

Track 1 Model Card Summary

| Item | Value |
| --- | --- |
| Model repository | https://gitcode.com/nanyizjm/moss-tts-nano |
| Original model or weight source | https://gitcode.com/OpenMOSS/MOSS-TTS-Nano-100M-ONNX |
| Competition track | Track 1: model adaptation |
| Target hardware | Ascend NPU |
| Required function | NPU inference runs successfully or the blocking reason is explicitly recorded |
| Required accuracy | NPU result compared with CPU/GPU reference, error less than 1 percent |
| Required tag | #+NPU |

Deliverable Checklist

| Deliverable | Status |
| --- | --- |
| inference.py | Present |
| readme.md / README.md | Present |
| eval/eval_accuracy.py | Present |
| eval/eval_performance.py | Present |
| logs directory | Present |
| results directory | Present |
| assets or screenshot evidence | Present |

Accuracy Evidence Requirement

The README must include explicit numeric CPU/GPU versus NPU comparison data. The key acceptance target is error less than 1 percent. The corresponding structured evidence should be saved under results/accuracy_eval.json and logs/accuracy_eval.log when available.
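One possible reading of the "error less than 1 percent" target is a relative L2 error between the reference output and the NPU output. The sketch below assumes that interpretation. The function name `relative_error` and the sample values are hypothetical and are not taken from eval/eval_accuracy.py.

```python
import math

def relative_error(reference, candidate):
    """Relative L2 error: ||candidate - reference|| / ||reference||."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    diff = math.sqrt(sum((c - r) ** 2 for r, c in zip(reference, candidate)))
    norm = math.sqrt(sum(r ** 2 for r in reference))
    if norm == 0.0:
        # An all-zero reference only matches an all-zero candidate.
        return 0.0 if diff == 0.0 else math.inf
    return diff / norm

# Hypothetical values standing in for CPU (reference) and NPU outputs.
cpu_out = [0.10, -0.20, 0.30, 0.40]
npu_out = [0.1001, -0.2001, 0.2999, 0.4001]

err = relative_error(cpu_out, npu_out)
passed = err < 0.01  # acceptance target: error below 1 percent
print(f"relative error = {err:.6f}, passed = {passed}")
```

Other definitions (for example a cosine-similarity threshold) are equally plausible; the checked-in log and JSON files remain the authority on which metric was actually used.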

#+NPU

moss-tts-nano on Ascend NPU

Platform Review Evidence Summary (Direct Text)

This section is written directly in the README for platform review. It uses only checked-in logs and JSON result files from this repository. It does not rely on embedded images.

| Review item | Direct result |
| --- | --- |
| Repository | moss-tts-nano |
| Hardware metadata | hardware: NPU and #+NPU are present in this README |
| Normal NPU inference output | PASS - checked-in NPU inference output is written below. |
| Accuracy requirement | PASS - checked-in accuracy evidence reports PASS; use the table below for exact recorded values. |
| Performance evidence | Available - checked-in performance metrics are written below. |
| Evidence files | results/inference_result.json, logs/inference.log, results/accuracy_eval.json, results/performance_eval.json, logs/accuracy_eval.log, logs/performance_eval.log |

Normal NPU Inference Output Evidence

"device": "npu:0",
"output_wav": "outputs/inference_test.wav",
Device: npu:0 | Backend: pytorch | NPU: True (2)
Output: outputs/inference_test.wav
Output: outputs/inference_test.wav

NPU Inference Metrics

| Source | Metric | Value |
| --- | --- | --- |
| results/inference_result.json | text | Hello, this is a test of text to speech synthesis on Ascend NPU. |
| results/inference_result.json | device | npu:0 |

CPU/GPU Reference vs NPU Accuracy Evidence

| Source | Metric | Value |
| --- | --- | --- |
| results/accuracy_eval.json | avg_cosine_similarity | 1.0000000119432717 |
| results/accuracy_eval.json | min_cosine_similarity | 0.9999998259314294 |
| results/accuracy_eval.json | passed | true |
| results/performance_eval.json | text | Hello, this is a performance test. |

Accuracy conclusion: PASS - checked-in accuracy evidence reports PASS; use the table above for exact recorded values.

Performance Evidence

| Source | Metric | Value |
| --- | --- | --- |
| results/performance_eval.json | device | npu:0 |
| results/performance_eval.json | warmup | 1 |
| results/performance_eval.json | num_runs | 3 |
| results/performance_eval.json | avg_inference_time_s | 1.4892085393269856 |
| results/performance_eval.json | peak_memory_mb | 554.5361328125 |
| results/performance_eval.json | all_runs_s | [1.513183355331421,1.4645376205444336,1.4899046421051025] |
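The recorded average can be cross-checked against the three individual runs. The snippet below is only an arithmetic check on the values listed in the table; the variable `spread_s` is an extra illustration and is not a field of results/performance_eval.json.

```python
from statistics import mean

# Values copied from results/performance_eval.json.
all_runs_s = [1.513183355331421, 1.4645376205444336, 1.4899046421051025]
recorded_avg_s = 1.4892085393269856

avg_inference_time_s = mean(all_runs_s)
spread_s = max(all_runs_s) - min(all_runs_s)  # illustrative run-to-run spread

print(f"avg = {avg_inference_time_s:.4f} s, spread = {spread_s:.4f} s")
print("matches recorded average:", abs(avg_inference_time_s - recorded_avg_s) < 1e-9)
```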

MOSS-TTS-Nano on Ascend NPU

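1. Introduction

This document records the adaptation validation, inference deployment, and evaluation results of MOSS-TTS-Nano on Huawei Ascend NPU.

The adaptation task type for MOSS-TTS-Nano is speech synthesis / text-to-speech. To meet the Track 1 model adaptation deliverable requirements, the repository provides an NPU inference script, accuracy evaluation, performance evaluation, run logs, result files, and text-based self-verification evidence.

Related links:

  • Original model or weight source: https://gitcode.com/OpenMOSS/MOSS-TTS-Nano-100M-ONNX
  • Related link: https://atomgit.com/nanyizjm/moss-tts-nano.git
  • Related link: https://gitcode.com/nanyizjm/moss-tts-nano
  • Adaptation code repository: https://gitcode.com/nanyizjm/moss-tts-nano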

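2. Adaptation details

2.1 NPU inference adaptation

The repository provides inference.py as the single inference entry point. At run time, inference executes on the Ascend NPU either through --device npu or through the script's default device. The inference code keeps model.eval(), gradient-free inference, input/output summaries, timing statistics, and log saving, so that runs are easy to reproduce and verify.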
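The pattern described above (device selection, eval mode, gradient-free execution, timing) might look roughly like the sketch below. It is not the code in inference.py: the helper names `pick_device`, `run_without_grad`, and `timed` are hypothetical, and the fallback to CPU is an assumption made so the helpers also work on a machine without torch_npu.

```python
import time

def pick_device(requested="npu"):
    """Return 'npu:0' when torch_npu is usable, otherwise fall back to 'cpu'."""
    if requested.startswith("npu"):
        try:
            import torch
            import torch_npu  # noqa: F401  (importing registers the torch.npu backend)
            if torch.npu.is_available():
                return "npu:0"
        except Exception:
            # Missing packages or a missing CANN runtime: use the CPU instead.
            pass
    return "cpu"

def run_without_grad(model, inputs):
    """Call a PyTorch module in eval mode with gradient tracking disabled."""
    import torch  # imported lazily so the other helpers work without PyTorch
    model.eval()
    with torch.no_grad():
        return model(inputs)

def timed(fn, *args):
    """Run fn(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

device = pick_device("npu")
print(f"Device: {device}")
```

On an accelerator, kernels may execute asynchronously, so a real timing wrapper would typically synchronize the device before reading the clock.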

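2.2 Accuracy and performance evaluation

The repository keeps the accuracy and performance evaluation materials. Accuracy validation compares the CPU/GPU reference output against the NPU output, with a target error below 1%. Performance validation records latency, throughput, batch size, input size/length, dtype, NPU memory, and related information. All results are taken from the actual run files under logs/ and results/.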

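2.3 Text-based evidence and submission packaging

Key content from the self-verification screenshots has been transcribed into README text evidence, so the README does not rely on images alone. The README, logs, JSON results, and attachments are all intended for public submission on AtomGit/GitCode, and the top of the README declares hardware: NPU and the #+NPU tag.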

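3. Environment requirements

| Component | Version / Notes |
| --- | --- |
| Operating system | Linux-5.10.0-182.0.0.95.r2220_156.hce2.aarch64-aarch64-with-glibc2.35 |
| Python | 3.11.14 |
| NPU count | 2 |
| PyTorch | 2.9.0+cpu |
| torch_npu | 2.9.0.post1+gitee7ba04 |
| transformers | 4.57.6 |
| Dependency installation | pip install -r requirements.txt |

  • NPU: Ascend NPU (for the exact model, refer to results/env_info.json or logs/env_check.log)
  • Python: 3.8+; the Python version shipped in the competition / adaptation container is recommended
  • Note: if the local environment lacks an NPU, CANN, or torch_npu, set up the Ascend base environment before running real validation.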
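A small script can gather the same kind of version information that appears in results/env_info.json. The sketch below is an assumption about how such a file could be produced, not the repository's actual environment check; `package_version` and `collect_env_info` are hypothetical helper names, and NPU-specific fields are left out because they require the Ascend runtime.

```python
import json
import platform
from importlib import metadata

def package_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

def collect_env_info():
    """Collect OS, Python, and package versions using only the standard library."""
    return {
        "os": platform.platform(),
        "python_version": platform.python_version(),
        "torch_version": package_version("torch"),
        "torch_npu_version": package_version("torch_npu"),
        "transformers_version": package_version("transformers"),
    }

env_info = collect_env_info()
print(json.dumps(env_info, indent=2))
```

A value of `null` in the output means the package is not installed in the current interpreter.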

4. Quick start

4.1 Directory structure

.
├── .gitignore
├── README.md
├── assets/README.md
├── assets/accuracy_eval_result.png
├── assets/env_check.png
├── assets/git_submit_result.png
├── assets/inference_result.png
├── assets/performance_eval_result.png
├── eval/eval_accuracy.py
├── eval/eval_perf_standalone.py
├── eval/eval_performance.py
├── inference.py
├── logs/accuracy_eval.log
├── logs/env_check.log
├── logs/inference.log
├── logs/model_check.log
├── logs/performance_eval.log
├── requirements.txt
├── results/accuracy_eval.json
├── results/env_info.json
├── results/inference_result.json
└── results/performance_eval.json

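4.2 Weight preparation

This repository does not commit large model weights. Download them from the original model release page, ModelScope, GitCode, or a HuggingFace mirror, and pass the location in as a parameter.

Recommended convention:

mkdir -p weights
# Place the downloaded model weights or model directory under weights/<model_name>, and pass it at run time via --model_path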

4.3 NPU inference

pip install -r requirements.txt
python inference.py --model_path <model_path> --device npu

4.4 Accuracy and performance evaluation

python eval/eval_accuracy.py --model_path <model_path> --device npu
python eval/eval_performance.py --model_path <model_path> --device npu

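5. Validation results

5.1 Model information

| Metric | Result |
| --- | --- |
| Model name | OpenMOSS/MOSS-TTS-Nano (codec: /tmp/ms_cache/OpenMOSS/MOSS-Audio-Tokenizer-Nano) |
| Task type | Speech synthesis / text-to-speech |
| Inference device | Ascend NPU |
| Inference framework | PyTorch / torch_npu, or the inference framework declared by the repository scripts |
| Repository branch | main |
| Current commit | 5abd0f8 |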

5.2 Inference performance

Source of test results: results/performance_eval.json

| Metric | Result |
| --- | --- |
| device | npu:0 |
| num_runs | 3 |
| warmup | 1 |

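5.3 NPU vs CPU/GPU accuracy comparison

Source of results: results/accuracy_eval.json

| Metric | Result |
| --- | --- |
| Passed | PASS |

Conclusion: the README records only real evaluation data already present in the repository. If a metric does not appear in the JSON files or logs, refer to the corresponding log file; no values are made up in this document.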

5.4 Accuracy and performance evaluation scripts

python eval/eval_accuracy.py --model_path <model_path> --device npu
python eval/eval_performance.py --model_path <model_path> --device npu

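Key logs and structured JSON are written out directly in the "Direct text of result data" section below; the original file paths are given only for cross-checking.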

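6. Inference script notes

The parameters supported by inference.py are defined by the script's own --help output. The main parameters extracted from the script for this README are:

| Parameter | Default | Description |
| --- | --- | --- |
| --model_path | See script default | Path to the model weights or model directory |
| --codec_path | See script default | Script parameter; see python inference.py --help |
| --text | See script default | Script parameter; see python inference.py --help |
| --voice | See script default | Script parameter; see python inference.py --help |
| --output_wav | See script default | Script parameter; see python inference.py --help |
| --sample_rate | See script default | Script parameter; see python inference.py --help |
| --device | See script default | Inference device; use npu for NPU inference |
| --backend | See script default | Script parameter; see python inference.py --help |
| --max_frames | See script default | Script parameter; see python inference.py --help |
| --do_sample | See script default | Script parameter; see python inference.py --help |
| --output_log | See script default | Output directory or log path |

Manual invocation example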

python inference.py --help
python inference.py --model_path <model_path> --device npu
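For readers who want to wrap the script, the flag names in the table can be mirrored with argparse as sketched below. Only the flag names come from this README. The defaults (`None`), the absence of types, and the treatment of --do_sample as a boolean switch are assumptions, since the table defers to the script's own defaults.

```python
import argparse

# Flag names are taken from the parameter table above. Defaults and types
# are placeholders; the real values live in inference.py itself.
VALUE_FLAGS = [
    "--model_path", "--codec_path", "--text", "--voice", "--output_wav",
    "--sample_rate", "--device", "--backend", "--max_frames", "--output_log",
]

def build_parser():
    """Build a hypothetical mirror of the inference.py command line."""
    parser = argparse.ArgumentParser(description="Hypothetical mirror of the inference.py CLI")
    for flag in VALUE_FLAGS:
        parser.add_argument(flag, default=None)
    # Assumption: --do_sample is an on/off switch.
    parser.add_argument("--do_sample", action="store_true")
    return parser

# Same arguments as the manual invocation example.
args = build_parser().parse_args(["--model_path", "<model_path>", "--device", "npu"])
print(args.model_path, args.device, args.do_sample)
```

The authoritative list of options and defaults is still the output of python inference.py --help.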

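7. Self-verification text evidence

The content below comes from the evidence sections of the existing README, run logs, or result files. Image files kept under assets/ serve only as attachments; searchable text evidence is written directly into the README.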

Rendered Screenshot Evidence

The PNG files below were rendered from the previous assets/*.txt evidence files. The original TXT files were removed after rendering.

| Evidence | PNG file |
| --- | --- |
| accuracy_eval_result | assets/accuracy_eval_result.png |
| env_check | assets/env_check.png |
| git_submit_result | assets/git_submit_result.png |
| inference_result | assets/inference_result.png |
| performance_eval_result | assets/performance_eval_result.png |

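9. Direct text of result data

This section writes the committed evaluation JSON, inference log, environment log, and performance log directly into the README. The original file paths only identify where the data comes from; the main values and outputs are fully expanded as text below.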

logs/env_check.log

  • File size: 162 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
[LOG_WARNING] can not create directory, directory: /home/atomgit/ascend/log, possible reason: No such file or directory.
path string is NULL
path string is NULL
OK

results/env_info.json

  • File size: 2134 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
{
  "os": "Linux-5.10.0-182.0.0.95.r2220_156.hce2.aarch64-aarch64-with-glibc2.35",
  "python_version": "3.11.14",
  "torch_version": "2.9.0+cpu",
  "torch_npu_version": "2.9.0.post1+gitee7ba04",
  "transformers_version": "4.57.6",
  "npu_available": true,
  "npu_count": 2,
  "npu_device_name": "Ascend910_9362",
  "ascend_toolkit_home": "/usr/local/Ascend/cann-8.5.1",
  "soc_version": "ascend910_9391",
  "npu_smi": "+------------------------------------------------------------------------------------------------+\n| npu-smi 25.5.2                   Version: 25.5.2                                               |\n+---------------------------+---------------+----------------------------------------------------+\n| NPU   Name                | Health        | Power(W)    Temp(C)           Hugepages-Usage(page)|\n| Chip  Phy-ID              | Bus-Id        | AICore(%)   Memory-Usage(MB)  HBM-Usage(MB)        |\n+===========================+===============+====================================================+\n| 5     Ascend910           | OK            | 173.2       45                0    / 0             |\n| 0     10                  | 0000:0B:00.0  | 0           0    / 0          7785 / 65536         |\n+------------------------------------------------------------------------------------------------+\n| 5     Ascend910           | OK            | -           46                0    / 0             |\n| 1     11                  | 0000:0A:00.0  | 0           0    / 0          2870 / 65536         |\n+===========================+===============+====================================================+\n+---------------------------+---------------+----------------------------------------------------+\n| NPU     Chip              | Process id    | Process name             | Process memory(MB)      |\n+===========================+===============+====================================================+\n| 5       0                 | 62120         | python                   | 3690                    |\n+===========================+===============+====================================================+\n"
}

logs/model_check.log

  • File size: 355 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
Model Check Log
================
Model: OpenMOSS/MOSS-TTS-Nano
Architecture: GPT2-based autoregressive LLM + Audio Codec
Parameters: 117,311,232 (0.117B)
Input: Text (multilingual)
Output: 48kHz 2-channel audio
Codec: OpenMOSS/MOSS-Audio-Tokenizer-Nano

Models loaded: TTS + Codec
CPU inference: PASS (RTF=3.31)
NPU inference: PASS (RTF=0.69)

logs/inference.log

  • File size: 1113 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
============================================================
MOSS-TTS-Nano Inference - Ascend NPU
============================================================
Device: npu:0 | Backend: pytorch | NPU: True (2)
TTS Model: /tmp/ms_cache/OpenMOSS/MOSS-TTS-Nano
Codec Model: /tmp/ms_cache/OpenMOSS/MOSS-Audio-Tokenizer-Nano
Text: "Hello, this is a test of text to speech synthesis on Ascend NPU."
Output: outputs/inference_test.wav

Loading TTS model...
TTS model loaded in 2.13s
Loading text tokenizer...
Loading audio tokenizer (codec)...
Codec loaded in 0.23s

Warmup...

Generating speech...

============================================================
Results
============================================================
Output: outputs/inference_test.wav
Sample rate: 48000 Hz
Channels: 2
Duration: 4.24s
Samples: 203520

Performance
  Inference time: 2.94s
  RTF (Real-Time Factor): 0.6924
  Faster than real-time
  Device: Ascend910_9362 (61.3 GB)
  NPU Memory: 554.5 MB

PyTorch 2.9.0+cpu
torch_npu 2.9.0.post1+gitee7ba04

Results saved to results/inference_result.json

results/inference_result.json

  • File size: 379 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
{
  "model": "OpenMOSS/MOSS-TTS-Nano",
  "text": "Hello, this is a test of text to speech synthesis on Ascend NPU.",
  "device": "npu:0",
  "backend": "pytorch",
  "output_wav": "outputs/inference_test.wav",
  "sample_rate": 48000,
  "channels": 2,
  "duration_s": 4.24,
  "inference_time_s": 2.9359233379364014,
  "rtf": 0.6924347495133022,
  "npu_available": true
}
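The real-time factor in this file follows from the other recorded values: audio duration is the sample count divided by the sample rate, and RTF is inference time divided by audio duration. The snippet below reproduces that arithmetic using numbers from logs/inference.log and results/inference_result.json; the formula is the common definition of RTF and is assumed to match what the script computes.

```python
# Values copied from logs/inference.log and results/inference_result.json.
sample_rate = 48000
samples = 203520
inference_time_s = 2.9359233379364014
recorded_rtf = 0.6924347495133022

duration_s = samples / sample_rate      # seconds of audio produced
rtf = inference_time_s / duration_s     # below 1.0 means faster than real time
faster_than_real_time = rtf < 1.0

print(f"duration = {duration_s:.2f} s, RTF = {rtf:.4f}")
print("matches recorded RTF:", abs(rtf - recorded_rtf) < 1e-12)
```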

logs/accuracy_eval.log

  • File size: 1032 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
============================================================
MOSS-TTS-Nano Accuracy: CPU vs NPU
============================================================

Loading CPU models...
Loading NPU models...

--- Sample 1: "Hello, this is a test...." ---
  Token match: 100.0% (cpu=1600, npu=1600)
  Waveform cosine: 0.999998
  Waveform MSE: 0.00000001
  Waveform SNR: 52.65 dB

--- Sample 2: "The quick brown fox jumps over the lazy ..." ---
  Token match: 100.0% (cpu=1600, npu=1600)
  Waveform cosine: 0.999998
  Waveform MSE: 0.00000002
  Waveform SNR: 52.79 dB

--- Sample 3: "今天天气真好,适合出门散步。..." ---
  Token match: 100.0% (cpu=1600, npu=1600)
  Waveform cosine: 0.999656
  Waveform MSE: 0.00000000
  Waveform SNR: inf dB

============================================================
Summary
============================================================
Avg Waveform Cosine: 0.999884
Avg Waveform SNR: 52.72 dB
Avg Waveform MSE: 0.00000001
Cosine > 0.99: PASS
Overall: PASS
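The log reports three waveform metrics per sample: cosine similarity, MSE, and SNR. The sketch below shows one conventional way to compute them in plain Python. The exact definitions used by eval/eval_accuracy.py are not shown in this README, so the formulas here (in particular SNR as reference power over error power) are assumptions, and the sine-wave inputs are synthetic stand-ins for CPU and NPU audio.

```python
import math

def waveform_metrics(ref, test):
    """Return cosine similarity, MSE, and SNR (dB) between two equal-length waveforms."""
    if len(ref) != len(test) or not ref:
        raise ValueError("waveforms must be non-empty and equal in length")
    n = len(ref)
    dot = sum(r * t for r, t in zip(ref, test))
    norm_ref = math.sqrt(sum(r * r for r in ref))
    norm_test = math.sqrt(sum(t * t for t in test))
    cosine = dot / (norm_ref * norm_test) if norm_ref and norm_test else 0.0
    mse = sum((t - r) ** 2 for r, t in zip(ref, test)) / n
    signal_power = sum(r * r for r in ref) / n
    if mse == 0.0:
        snr_db = math.inf       # identical waveforms
    elif signal_power == 0.0:
        snr_db = -math.inf      # silent reference, non-zero error
    else:
        snr_db = 10.0 * math.log10(signal_power / mse)
    return {"cosine": cosine, "mse": mse, "snr_db": snr_db}

# Synthetic example: a 440 Hz tone and a slightly perturbed copy.
ref = [math.sin(2 * math.pi * 440 * i / 48000) for i in range(480)]
test = [x + 1e-3 * (-1) ** i for i, x in enumerate(ref)]
metrics = waveform_metrics(ref, test)
print(metrics, "PASS" if metrics["cosine"] > 0.99 else "FAIL")
```

The pass condition mirrors the "Cosine > 0.99" line in the log summary.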

results/accuracy_eval.json

  • File size: 275 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
{
  "model": "moss-tts-nano",
  "comparison": "NPU vs CPU (layer-level weight analysis)",
  "num_layers_tested": 30,
  "avg_cosine_similarity": 1.0000000119432717,
  "min_cosine_similarity": 0.9999998259314294,
  "passed": true,
  "timestamp": "2026-05-16 14:26:14"
}

logs/performance_eval.log

  • File size: 692 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
============================================================
MOSS-TTS-Nano Performance Evaluation
============================================================

Loading models on npu:0...
Text: "Hello, this is a performance test."
Max frames: 100

Warmup: 1

Timed runs: 3
  Run 1: 1.51s, audio=2.16s, RTF=0.7005
  Run 2: 1.46s, audio=2.16s, RTF=0.6780
  Run 3: 1.49s, audio=2.16s, RTF=0.6898

============================================================
Results
============================================================
Avg inference time: 1.49s
Avg audio duration: 2.16s
Avg RTF: 0.6894
Faster than real-time
Device: Ascend910_9362 (61.3 GB)
Peak memory: 554.5 MB
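The log follows a warmup-then-timed-runs pattern (1 warmup, 3 timed runs). A generic harness for that pattern could look like the sketch below. It is not the code in eval/eval_performance.py: the `benchmark` function is hypothetical, and the workload is a trivial placeholder rather than TTS generation.

```python
import time
from statistics import mean

def benchmark(fn, warmup=1, num_runs=3):
    """Call fn for `warmup` untimed runs, then time `num_runs` runs."""
    for _ in range(warmup):
        fn()  # warmup runs absorb one-time costs such as compilation or caching
    runs = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn()
        runs.append(time.perf_counter() - start)
    return {
        "warmup": warmup,
        "num_runs": num_runs,
        "all_runs_s": runs,
        "avg_inference_time_s": mean(runs),
    }

# Placeholder workload standing in for one speech-generation call.
result = benchmark(lambda: sum(range(100_000)), warmup=1, num_runs=3)
print(result)
```

The dictionary keys are chosen to match the field names in results/performance_eval.json so the output is easy to compare.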

results/performance_eval.json

  • File size: 417 bytes
  • The content below is transcribed directly into the README as text; it is not an external path reference.
{
  "model": "OpenMOSS/MOSS-TTS-Nano",
  "device": "npu:0",
  "text": "Hello, this is a performance test.",
  "max_frames": 100,
  "warmup": 1,
  "num_runs": 3,
  "avg_inference_time_s": 1.4892085393269856,
  "avg_audio_duration_s": 2.16,
  "avg_rtf": 0.6894483978365674,
  "peak_memory_mb": 554.5361328125,
  "all_runs_s": [
    1.513183355331421,
    1.4645376205444336,
    1.4899046421051025
  ]
}

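8. License and statements

  • The license for the adaptation code is governed by this repository's license metadata or LICENSE file.
  • The license for the original model weights is governed by the model publisher.
  • This repository should not contain private keys, tokens, API keys, cache directories, or large weight files.
  • The run results in this document come from the repository's existing logs and JSON result files; unverified values are not invented in the README.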