Ascend-SACT/sherpa-onnx-kokoro

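sherpa-onnx-kokoro Model Deployment Guide

Chapter 1 Model Overview

sherpa-onnx is an open-source speech processing library built on ONNX Runtime. It provides high-performance, cross-platform, on-device solutions for speech recognition (ASR), speech synthesis (TTS), and a range of other speech tasks.

The Kokoro TTS model supports multilingual synthesis and uses voice-mixing techniques to generate natural cross-lingual speech in languages including Chinese, English, and Spanish. The model relies on lexicon files (such as lexicon-us-en.txt) for pronunciation rules and on a voice feature file (voices.bin) for speaker characteristics. The voice style is switched by changing the speaker ID (for example, sid=18).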

Chapter 2 Runtime Environment

Hardware

Device model       NPU configuration
Atlas 800I A2      8*64G
Atlas 800T A2      8*64G

Software versions

Software       Version
python         3.11
sherpa-onnx    master
CANN           8.3.rc2
HDK            25.2.3

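Chapter 3 Image Preparation

This model does not depend on a general-purpose inference framework; it only requires the Ascend environment. It is therefore sufficient to install the driver and HDK on the physical host and to use an image that includes CANN.

Building the NPU version of sherpa-onnx on openEuler runs into problems, so an Ubuntu-based image is required.

  • Pull the image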

    docker pull quay.io/ascend/cann:8.3.rc2-910-ubuntu22.04-py3.11
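  • Resource requirement: a single 910B card is sufficient

  • Run the container (c1855ae355cb below is a local image ID; replace it with the ID or name of the image on your host):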

    docker run -itd --name sherpa-onnx-kokoro \
    --net=host \
    --privileged=true \
    --shm-size=1g \
    --device=/dev/davinci0 \
    --device=/dev/davinci_manager \
    --device=/dev/devmm_svm \
    --device=/dev/hisi_hdc \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
    -v /usr/local/sbin/:/usr/local/sbin/ \
    -v /var/log/npu/slog/:/var/log/npu/slog \
    -v /var/log/npu/profiling/:/var/log/npu/profiling \
    -v /var/log/npu/dump/:/var/log/npu/dump \
    -v /var/log/npu/:/usr/slog \
    -v /models:/models \
    -v /etc/hccn.conf:/etc/hccn.conf \
    c1855ae355cb /bin/bash
  • Enter the container:

    docker exec -it sherpa-onnx-kokoro /bin/bash
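
Once inside the container, it can be useful to confirm that the device nodes passed with --device are actually visible. The helper below is a minimal sketch; the function name missing_paths is hypothetical, and the device list is simply copied from the docker run command above.

```shell
# Print every path from the argument list that does not exist.
# An empty output means all listed paths are present.
missing_paths() {
  for p in "$@"; do
    [ -e "$p" ] || printf '%s\n' "$p"
  done
}

# Example call inside the container (device nodes from the docker run flags):
# missing_paths /dev/davinci0 /dev/davinci_manager /dev/devmm_svm /dev/hisi_hdc
```

If the call prints nothing, every device node was mapped; any printed path points to a --device flag that did not take effect.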

Chapter 4 Model Weight Download

Weight download URLs
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-int8-multi-lang-v1_1.tar.bz2
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2

https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-int8-multi-lang-v1_0.tar.bz2
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2
Download the model weights
mkdir -p /root/sherpa-onnx-kokoro
cd /root/sherpa-onnx-kokoro
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2
tar xf kokoro-multi-lang-v1_1.tar.bz2
rm kokoro-multi-lang-v1_1.tar.bz2
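
After extraction, a quick check that the archive contains the files used later in this guide can catch an incomplete download early. This is a sketch only: the function name check_kokoro_dir is hypothetical, and the file list is taken from the sherpa-onnx-offline-tts flags used in Chapter 6, not from an official manifest.

```shell
# Check that an extracted Kokoro model directory contains the entries
# referenced by the sherpa-onnx-offline-tts command line in this guide.
# Prints each missing entry and returns non-zero if anything is missing.
check_kokoro_dir() {
  ckd_dir=$1
  ckd_status=0
  for ckd_f in model.onnx voices.bin tokens.txt \
               lexicon-us-en.txt lexicon-zh.txt espeak-ng-data; do
    if [ ! -e "$ckd_dir/$ckd_f" ]; then
      printf 'missing: %s\n' "$ckd_dir/$ckd_f"
      ckd_status=1
    fi
  done
  return "$ckd_status"
}

# Example call:
# check_kokoro_dir /root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1
```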

Chapter 5 Inference Framework Installation

Confirm that CANN is installed
root@next-gen-kaldi:~$ source /usr/local/Ascend/ascend-toolkit/set_env.sh
root@next-gen-kaldi:~$ echo $ASCEND_TOOLKIT_HOME
/usr/local/Ascend/ascend-toolkit/latest
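
The manual check above can also be scripted so that a build script stops early when the CANN environment has not been sourced. A minimal sketch, assuming only that set_env.sh exports ASCEND_TOOLKIT_HOME as shown; the function name cann_env_ok is hypothetical.

```shell
# Succeed only if ASCEND_TOOLKIT_HOME is set and points to an existing directory.
cann_env_ok() {
  [ -n "${ASCEND_TOOLKIT_HOME:-}" ] && [ -d "$ASCEND_TOOLKIT_HOME" ]
}

# Example call:
# cann_env_ok || echo "run: source /usr/local/Ascend/ascend-toolkit/set_env.sh"
```
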
Build sherpa-onnx
cd /root/sherpa-onnx-kokoro
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build
cd build
cmake -DSHERPA_ONNX_ENABLE_ASCEND_NPU=ON -DBUILD_SHARED_LIBS=ON -Wno-dev -Wno-deprecated .. 

make

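If downloading eigen-3.4.0.tar.gz fails during the cmake step, download the file manually and upload it to the corresponding path.

After make completes, run

ldd ./bin/sherpa-onnx-offline

and confirm that libascendcl.so resolves to the library in the CANN toolkit: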

libascendcl.so => /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so (0x0000ffff94105000)
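
The same confirmation can be automated by filtering the ldd output. The sketch below reads ldd output on standard input; the function name ascendcl_linked is hypothetical, and it relies only on the common ldd convention of printing "not found" for unresolved libraries.

```shell
# Read `ldd` output on stdin and succeed only if libascendcl.so is listed
# and is not reported as "not found".
ascendcl_linked() {
  awk '/libascendcl\.so/ { seen = 1; if ($0 ~ /not found/) bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}

# Example call from the build directory:
# ldd ./bin/sherpa-onnx-offline | ascendcl_linked && echo "libascendcl.so resolved"
```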

Build and install the sherpa-onnx Python package
cd /root/sherpa-onnx-kokoro/sherpa-onnx
python setup.py install
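
To confirm that the installation is visible to the interpreter, one option is to ask Python whether the module can be located. This is a sketch under two assumptions: the interpreter is available as python3, and the package is imported under the name sherpa_onnx. The function name py_module_available is hypothetical.

```shell
# Succeed if the given Python module can be located by the interpreter.
py_module_available() {
  python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}

# Example call:
# py_module_available sherpa_onnx && echo "sherpa_onnx is importable"
```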

Chapter 6 Model Verification

Script verification
cd /root/sherpa-onnx-kokoro/sherpa-onnx

build/bin/sherpa-onnx-offline-tts \
  --kokoro-model=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/model.onnx \
  --kokoro-voices=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/voices.bin \
  --kokoro-tokens=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/tokens.txt \
  --kokoro-data-dir=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/espeak-ng-data \
  --kokoro-lexicon=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-us-en.txt,/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-zh.txt \
  --num-threads=2 \
  --sid=2 \
  --output-filename="./kokoro-multi-lang-v1_1-output.wav" \
  "Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."

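After the command finishes, a basic sanity check is to confirm that the output file is non-empty and carries a RIFF/WAVE header. The sketch below assumes the tool writes a standard WAV container; the function name is_wav is hypothetical, and the check does not validate the audio content itself.

```shell
# Succeed if the file is non-empty and starts with a RIFF/WAVE header:
# bytes 0-3 are "RIFF" and bytes 8-11 are "WAVE".
is_wav() {
  [ -s "$1" ] || return 1
  [ "$(dd if="$1" bs=1 count=4 2>/dev/null)" = "RIFF" ] &&
  [ "$(dd if="$1" bs=1 skip=8 count=4 2>/dev/null)" = "WAVE" ]
}

# Example call:
# is_wav ./kokoro-multi-lang-v1_1-output.wav && echo "output looks like a WAV file"
```
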
Chapter 7 Performance Testing

Run the performance test with the following shell script:
for t in 1 2 3 4; do
 build/bin/sherpa-onnx-offline-tts \
   --num-threads=$t \
   --kokoro-model=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/model.onnx \
   --kokoro-voices=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/voices.bin \
   --kokoro-tokens=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/tokens.txt \
   --kokoro-data-dir=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/espeak-ng-data \
   --kokoro-lexicon=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-us-en.txt,/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-zh.txt \
   --tts-rule-fsts=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/date-zh.fst,/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/number-zh.fst \
   --sid=1 \
   --output-filename="./kokoro-multi-lang-v1_1-output.wav" \
   "你好吗?Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
done
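
The loop above repeats the model directory in every flag. One way to shorten it is to generate the Kokoro flags from a single directory variable. This is a sketch only: the function name kokoro_flags and the variable MODEL_DIR are hypothetical, and the flags themselves are the ones already used in this guide.

```shell
# Print the Kokoro-related flags for a given model directory, one per line.
kokoro_flags() {
  kf_dir=$1
  printf '%s\n' \
    "--kokoro-model=$kf_dir/model.onnx" \
    "--kokoro-voices=$kf_dir/voices.bin" \
    "--kokoro-tokens=$kf_dir/tokens.txt" \
    "--kokoro-data-dir=$kf_dir/espeak-ng-data" \
    "--kokoro-lexicon=$kf_dir/lexicon-us-en.txt,$kf_dir/lexicon-zh.txt"
}

# Example call (the output is word-split, so the path must not contain spaces):
# MODEL_DIR=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1
# build/bin/sherpa-onnx-offline-tts $(kokoro_flags "$MODEL_DIR") \
#   --num-threads=2 --sid=1 --output-filename=./out.wav "text to synthesize"
```
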
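  • Performance data published in the official documentation:

RTF on Raspberry Pi 4 Model B Rev 1.5

num_threads    1        2        3        4
RTF            7.635    4.470    3.430    3.191

  • Performance on a single 910B2 64G card (three repeated runs)

num_threads    1        2        3        4
RTF            3.476    1.998    1.536    1.197
RTF            3.543    1.840    1.308    1.093
RTF            3.315    1.824    1.350    1.127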
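
The real-time factor (RTF) is commonly defined as the synthesis time divided by the duration of the generated audio, so values above 1 mean synthesis is slower than playback. To summarize the three 910B2 runs, the sketch below averages each column with awk. The function name avg_rtf is hypothetical; the input rows are the measurements listed above.

```shell
# Average each column of whitespace-separated RTF rows read from stdin.
# Output: one "num_threads mean" pair per line, mean rounded to 3 decimals.
avg_rtf() {
  awk '{ for (i = 1; i <= NF; i++) sum[i] += $i; if (NF > n) n = NF }
       END { for (i = 1; i <= n; i++) printf "%d %.3f\n", i, sum[i] / NR }'
}

# The three 910B2 runs; columns correspond to num_threads 1 to 4.
avg_rtf <<'EOF'
3.476 1.998 1.536 1.197
3.543 1.840 1.308 1.093
3.315 1.824 1.350 1.127
EOF
```

For these rows the means are 3.445, 1.887, 1.398, and 1.139 for 1, 2, 3, and 4 threads respectively.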

Chapter 8 Official References

  • Official Kokoro model deployment guide
https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html
  • sherpa-onnx Ascend build and installation guide
https://k2-fsa.github.io/sherpa/onnx/ascend/install.html