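sherpa-onnx is an open-source speech processing library built on ONNX Runtime. It focuses on high-performance, cross-platform, on-device solutions for speech recognition (ASR), speech synthesis (TTS), and a range of other speech tasks.
The Kokoro TTS model supports multilingual synthesis and produces natural cross-lingual speech by blending voices. Supported languages include Chinese, English, and Spanish, among others. The model relies on lexicon files (such as lexicon-us-en.txt) and voice feature data (voices.bin) to manage pronunciation rules, and the voice style can be switched by changing the speaker ID (for example, sid=18).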
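The lexicon files, voices.bin, and the speaker ID described above are passed to the sherpa-onnx-offline-tts binary as command-line flags. The following is a minimal sketch of assembling that command in Python. The function name `build_kokoro_command` is hypothetical, and the directory layout is assumed to match the extracted kokoro-multi-lang-v1_1 archive; only flags that appear in this guide are used.

```python
import shlex
from pathlib import PurePosixPath


def build_kokoro_command(model_dir, text, sid=18, num_threads=2,
                         output="./output.wav",
                         binary="build/bin/sherpa-onnx-offline-tts"):
    """Return the argument list for one Kokoro synthesis run.

    Hypothetical helper: assumes model_dir contains model.onnx, voices.bin,
    tokens.txt, espeak-ng-data and the two lexicon files used in this guide.
    """
    model_dir = PurePosixPath(model_dir)
    # Several lexicons are passed as one comma-separated value.
    lexicons = ",".join(
        str(model_dir / name)
        for name in ("lexicon-us-en.txt", "lexicon-zh.txt")
    )
    return [
        binary,
        f"--kokoro-model={model_dir / 'model.onnx'}",
        f"--kokoro-voices={model_dir / 'voices.bin'}",
        f"--kokoro-tokens={model_dir / 'tokens.txt'}",
        f"--kokoro-data-dir={model_dir / 'espeak-ng-data'}",
        f"--kokoro-lexicon={lexicons}",
        f"--num-threads={num_threads}",
        f"--sid={sid}",  # speaker ID selects the voice style
        f"--output-filename={output}",
        text,
    ]


if __name__ == "__main__":
    cmd = build_kokoro_command(
        "/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1",
        "Friends fell out often because life was changing so fast.",
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning a list rather than a single string lets the result be passed straight to `subprocess.run` without shell quoting issues.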
Hardware
| Device model | NPU configuration |
|---|---|
| Atlas 800I A2 | 8 × 64 GB |
| Atlas 800T A2 | 8 × 64 GB |
Software versions
| Software | Version |
|---|---|
| python | 3.11 |
| sherpa-onnx | master |
| CANN | 8.3.rc2 |
| HDK | 25.2.3 |
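The model does not depend on a general-purpose inference framework; it only needs the Ascend environment. It is therefore enough to install the driver and HDK on the host and to use an image that ships CANN.
Building the NPU version of sherpa-onnx on openEuler currently runs into problems, so use an Ubuntu-based image.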
Pull the image:
docker pull quay.io/ascend/cann:8.3.rc2-910-ubuntu22.04-py3.11
Resources required: a single 910B card is sufficient.
Run the container:
docker run -itd --name sherpa-onnx-kokoro \
--net=host \
--privileged=true \
--shm-size=1g \
--device=/dev/davinci0 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
-v /usr/local/sbin/:/usr/local/sbin/ \
-v /var/log/npu/slog/:/var/log/npu/slog \
-v /var/log/npu/profiling/:/var/log/npu/profiling \
-v /var/log/npu/dump/:/var/log/npu/dump \
-v /var/log/npu/:/usr/slog \
-v /models:/models \
-v /etc/hccn.conf:/etc/hccn.conf \
c1855ae355cb /bin/bash
Enter the container:
docker exec -it sherpa-onnx-kokoro /bin/bash
Available model packages:
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-int8-multi-lang-v1_1.tar.bz2
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-int8-multi-lang-v1_0.tar.bz2
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2
Download and extract the model:
cd /root/sherpa-onnx-kokoro
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_1.tar.bz2
tar xf kokoro-multi-lang-v1_1.tar.bz2
rm kokoro-multi-lang-v1_1.tar.bz2
Load the CANN environment:
root@next-gen-kaldi:~$ source /usr/local/Ascend/ascend-toolkit/set_env.sh
root@next-gen-kaldi:~$ echo $ASCEND_TOOLKIT_HOME
/usr/local/Ascend/ascend-toolkit/latest
Build sherpa-onnx with Ascend NPU support:
cd /root/sherpa-onnx-kokoro
git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build
cd build
cmake -DSHERPA_ONNX_ENABLE_ASCEND_NPU=ON -DBUILD_SHARED_LIBS=ON -Wno-dev -Wno-deprecated ..
make
If cmake fails while downloading eigen-3.4.0.tar.gz, download the file manually and place it in the corresponding path.
After make finishes, run
ldd ./bin/sherpa-onnx-offline
and confirm that libascendcl.so is resolved:
libascendcl.so => /usr/local/Ascend/ascend-toolkit/latest/lib64/libascendcl.so (0x0000ffff94105000)
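The linkage check above can also be scripted. The following is a minimal sketch that parses `ldd` output and returns the resolved path of a library. The helper names `linked_library_path` and `run_ldd` are hypothetical, and the parser assumes the usual `name => path (address)` line shape shown in the output above.

```python
import subprocess


def linked_library_path(ldd_output, library):
    """Return the path that `library` resolves to in ldd output, or None."""
    for line in ldd_output.splitlines():
        parts = line.split()
        # Expected shape: <name> => <path> (<address>)
        if len(parts) >= 3 and parts[0] == library and parts[1] == "=>":
            # ldd prints "not found" instead of a path when resolution fails.
            if parts[2] == "not":
                return None
            return parts[2]
    return None


def run_ldd(binary):
    """Run ldd on a binary and return its standard output as text."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Assumes the script is started from the build directory.
    path = linked_library_path(
        run_ldd("./bin/sherpa-onnx-offline"), "libascendcl.so"
    )
    if path is None:
        raise SystemExit("libascendcl.so is not linked; check the CANN setup")
    print(f"libascendcl.so resolves to {path}")
```

If the function returns None, the most likely cause is that set_env.sh was not sourced before building or running.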
Install the Python package:
cd /root/sherpa-onnx-kokoro/sherpa-onnx
python setup.py install
Run a synthesis test:
cd /root/sherpa-onnx-kokoro/sherpa-onnx
build/bin/sherpa-onnx-offline-tts \
--kokoro-model=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/model.onnx \
--kokoro-voices=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/voices.bin \
--kokoro-tokens=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/tokens.txt \
--kokoro-data-dir=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/espeak-ng-data \
--kokoro-lexicon=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-us-en.txt,/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-zh.txt \
--num-threads=2 \
--sid=2 \
--output-filename="./kokoro-multi-lang-v1_1-output.wav" \
"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
Measure the effect of the thread count:
for t in 1 2 3 4; do
build/bin/sherpa-onnx-offline-tts \
--num-threads=$t \
--kokoro-model=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/model.onnx \
--kokoro-voices=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/voices.bin \
--kokoro-tokens=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/tokens.txt \
--kokoro-data-dir=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/espeak-ng-data \
--kokoro-lexicon=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-us-en.txt,/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/lexicon-zh.txt \
--tts-rule-fsts=/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/date-zh.fst,/root/sherpa-onnx-kokoro/kokoro-multi-lang-v1_1/number-zh.fst \
--sid=1 \
--output-filename="./kokoro-multi-lang-v1_1-output.wav" \
"你好吗?Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone."
done
RTF on a Raspberry Pi 4 Model B Rev 1.5:
| num_threads | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| RTF | 7.635 | 4.470 | 3.430 | 3.191 |
| num_threads | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| RTF | 3.476 | 1.998 | 1.536 | 1.197 |
| RTF | 3.543 | 1.840 | 1.308 | 1.093 |
| RTF | 3.315 | 1.824 | 1.350 | 1.127 |
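The real-time factor (RTF) is the synthesis time divided by the duration of the generated audio, so values above 1 mean that synthesis takes longer than the audio it produces. The following is a minimal sketch of that definition, together with the speedup over a single thread computed from the Raspberry Pi 4 figures in the first table. The function names are hypothetical.

```python
def real_time_factor(elapsed_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return elapsed_seconds / audio_seconds


def speedups(rtf_by_threads):
    """Speedup of each thread count relative to the single-thread RTF."""
    baseline = rtf_by_threads[1]
    return {threads: baseline / rtf for threads, rtf in rtf_by_threads.items()}


# RTF values copied from the Raspberry Pi 4 Model B Rev 1.5 table.
RASPBERRY_PI_4_RTF = {1: 7.635, 2: 4.470, 3: 3.430, 4: 3.191}

if __name__ == "__main__":
    for threads, factor in speedups(RASPBERRY_PI_4_RTF).items():
        print(f"num_threads={threads}: {factor:.2f}x faster than 1 thread")
```

The same `speedups` call can be applied to any other row of measurements, as long as it includes a single-thread entry.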
https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html
https://k2-fsa.github.io/sherpa/onnx/ascend/install.html