| Item | Version / Details |
|---|---|
| Device | Atlas 800I A2 (8x 910B3) |
| Image | quay.io/ascend/vllm-ascend:v0.13.0 |
| Python | 3.11.14 |
| PyTorch | 2.8.0+cpu |
| torch_npu | 2.8.0.post2 |
| safetensors | 0.7.0 |
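
To confirm that a container matches the versions in the table above, a small check along these lines could be run inside it. This is a minimal sketch: the distribution names (`torch`, `torch_npu`, `safetensors`) are assumed to match what `pip` reports in the image, and `find_mismatches` is a hypothetical helper, not part of `inference.py`.

```python
from importlib import metadata

# Versions copied from the environment table; the distribution names are assumptions.
EXPECTED = {
    "torch": "2.8.0+cpu",
    "torch_npu": "2.8.0.post2",
    "safetensors": "0.7.0",
}


def find_mismatches(expected, lookup=metadata.version):
    """Return {name: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Usage inside the container:
#   for name, (wanted, installed) in find_mismatches(EXPECTED).items():
#       print(f"{name}: expected {wanted}, found {installed}")
```

An empty result means the installed packages match the table exactly.
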

/data/
├── Kokoro-82M-bf16/                      # original model weights
│   ├── kokoro-v1_0.safetensors           # main model weights (82M parameters)
│   ├── config.json
│   ├── voices/                           # voice files
│   │   ├── af_heart.safetensors
│   │   ├── af_alloy.safetensors
│   │   └── ... (more voices)
│   └── samples/                          # sample audio
│       ├── HEARME.wav
│       ├── af_heart_0.wav
│       └── ...
└── Kokoro-82M-bf16-ascend/               # adaptation code
    ├── inference.py                      # inference script (includes performance and precision tests)
    ├── README.md                         # this document
    ├── log.txt                           # run log
    └── samples/                          # test audio (7 files)
        ├── HEARME.wav
        └── af_heart_0.wav ~ af_heart_5.wav

docker rm -f test-modelagent
docker run -itd --privileged --name=test-modelagent --net=host --shm-size=500g \
--device=/dev/davinci0 \
--device=/dev/davinci1 \
--device=/dev/davinci2 \
--device=/dev/davinci3 \
--device=/dev/davinci4 \
--device=/dev/davinci5 \
--device=/dev/davinci6 \
--device=/dev/davinci7 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
-v /usr/local/sbin/:/usr/local/sbin/ \
-v /var/log/npu/slog/:/var/log/npu/slog \
-v /var/log/npu/profiling/:/var/log/npu/profiling \
-v /var/log/npu/dump/:/var/log/npu/dump \
-v /var/log/npu/:/usr/slog \
-v /etc/hccn.conf:/etc/hccn.conf \
-v /data:/data \
quay.io/ascend/vllm-ascend:v0.13.0 \
bash

The container ships with the required dependencies preinstalled; no additional installation is needed.

docker exec test-modelagent bash -c "source /usr/local/Ascend/ascend-toolkit/set_env.sh && cd /data/ysws/agentsp/Kokoro-82M-bf16-ascend && python3 inference.py 2>&1 | tee log.txt"

| Parameter | Description | Default |
|---|---|---|
| --model_path | Path to the model weights | /data/ysws/agentsp/Kokoro-82M-bf16 |
| --voice_path | Path to the voice file | af_heart.safetensors |
| --device | Device to run on | npu:0 |
| --no_precision_test | Skip the precision test | False |
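
The parameter table above could be expressed with `argparse` roughly as follows. This is a sketch only: the source of `inference.py` is not shown here, so `build_parser` and the help strings are assumptions. Only the flag names and defaults come from the table.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        description="Kokoro-82M-bf16 TTS inference on Ascend NPU"
    )
    parser.add_argument(
        "--model_path",
        default="/data/ysws/agentsp/Kokoro-82M-bf16",
        help="path to the model weights",
    )
    parser.add_argument(
        "--voice_path",
        default="af_heart.safetensors",
        help="path to the voice file",
    )
    parser.add_argument("--device", default="npu:0", help="device to run on")
    # A store_true flag defaults to False, matching the table.
    parser.add_argument(
        "--no_precision_test",
        action="store_true",
        help="skip the precision test",
    )
    return parser
```

For example, `build_parser().parse_args(["--device", "npu:1", "--no_precision_test"])` would select the second NPU and skip the CPU comparison.
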
2026-05-11 05:33:05,429 - INFO - ============================================================
2026-05-11 05:33:05,430 - INFO - Kokoro-82M-bf16 TTS Ascend NPU Inference
2026-05-11 05:33:05,430 - INFO - ============================================================
2026-05-11 05:33:05,430 - INFO - Model path: /data/ysws/agentsp/Kokoro-82M-bf16
2026-05-11 05:33:05,462 - INFO - Loaded state dict with 548 keys
2026-05-11 05:33:05,462 - INFO - Voice shape: torch.Size([510, 1, 256])
2026-05-11 05:33:05,462 - INFO - Model loaded on device: npu:0
2026-05-11 05:33:05,462 - INFO - ----------------------------------------
2026-05-11 05:33:05,462 - INFO - Starting precision test...
2026-05-11 05:33:05,490 - INFO - CPU computation done in 0.03s
2026-05-11 05:33:07,561 - INFO - NPU inference done in 2.07s
2026-05-11 05:33:07,562 - INFO - Precision Comparison: CPU vs NPU
2026-05-11 05:33:07,562 - INFO - Max errors: sum=6.10e-05, mean=1.49e-08, std=1.49e-08
2026-05-11 05:33:07,562 - INFO - PASS: NPU precision within 1% of CPU
2026-05-11 05:33:07,563 - INFO - Performance Summary:
2026-05-11 05:33:07,563 - INFO - CPU computation time: 0.0276s
2026-05-11 05:33:07,563 - INFO - NPU inference time: 2.0710s
2026-05-11 05:33:07,563 - INFO - Speedup: 0.01x
2026-05-11 05:33:07,564 - INFO - PRECISION TEST PASSED
2026-05-11 05:33:07,564 - INFO - Inference completed successfully!

| Metric | Max error | Threshold | Result |
|---|---|---|---|
| Sum error | 6.10e-05 | < 1e-3 | ✅ PASS |
| Mean error | 1.49e-08 | < 1e-5 | ✅ PASS |
| Std error | 1.49e-08 | < 1e-5 | ✅ PASS |

Conclusion: the NPU results differ from the CPU reference by less than 1%, so the precision test passes.
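
A check of this kind can be sketched in plain Python. The thresholds below are taken from the table. Everything else is an assumption: `compare_stats` and `passed` are hypothetical names, and the sketch compares one pair of flat value lists, whereas the log's "Max errors" wording suggests the real script takes a maximum over several tensors.

```python
from math import fsum
from statistics import fmean, pstdev

# Absolute-error thresholds from the precision table.
THRESHOLDS = {"sum": 1e-3, "mean": 1e-5, "std": 1e-5}

# Summary statistics computed on each side before comparing.
STATS = {"sum": fsum, "mean": fmean, "std": pstdev}


def compare_stats(cpu_values, npu_values, thresholds=THRESHOLDS):
    """Return {stat: (abs_error, within_threshold)} for sum, mean and std."""
    report = {}
    for name, stat in STATS.items():
        error = abs(stat(cpu_values) - stat(npu_values))
        report[name] = (error, error < thresholds[name])
    return report


def passed(report):
    """True only if every statistic is within its threshold."""
    return all(ok for _, ok in report.values())
```

Comparing summary statistics is cheap but coarse: two outputs can agree in sum, mean and standard deviation while differing element by element, so an element-wise maximum error would be a stricter complement.
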
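
| Metric | CPU | NPU |
|---|---|---|
| Compute time | 0.0276s | 2.0710s |

In this run the NPU path took about 75 times longer than the CPU path, which matches the 0.01x speedup reported in the log.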
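
The figures above come from a single timed call per device. A first call on an accelerator can include one-time start-up cost, although the log does not say whether that applies here. A timing helper with warm-up runs, sketched below, would separate steady-state latency from any such cost. `time_call` is a hypothetical helper, not part of `inference.py`. On an asynchronous device the timed function should also wait for the device to finish before returning (with `torch_npu`, presumably via `torch.npu.synchronize()`).

```python
import time
from statistics import fmean


def time_call(fn, warmup=2, repeats=5):
    """Time `fn` after discarding `warmup` untimed runs.

    Returns the mean and minimum wall-clock time over `repeats` timed runs.
    """
    for _ in range(warmup):
        fn()  # untimed: absorbs any one-time initialization

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    return {"mean_s": fmean(samples), "min_s": min(samples), "runs": len(samples)}
```

Reporting the steady-state mean alongside the cold first-call time makes it clear which of the two a speedup figure refers to.
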
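
Kokoro-82M is an open-source TTS (text-to-speech) model with 82 million parameters.
It supports multiple voices. The prefix of each voice file name indicates the accent or language and the gender:

- `af_*`: American female
- `am_*`: American male
- `bf_*`: British female
- `bm_*`: British male
- `zf_*`: Chinese female
- `jf_*`: Japanese female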
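
The naming convention can be decoded with a small lookup. This is a minimal sketch: `describe_voice` is a hypothetical helper, and the table contains only the six prefixes listed in this document, so any other prefix is rejected rather than guessed.

```python
from pathlib import Path

# Only the prefixes documented in this README.
PREFIXES = {
    "af": ("American", "female"),
    "am": ("American", "male"),
    "bf": ("British", "female"),
    "bm": ("British", "male"),
    "zf": ("Chinese", "female"),
    "jf": ("Japanese", "female"),
}


def describe_voice(filename):
    """Split a voice file name such as 'af_heart.safetensors' into
    (accent_or_language, gender, voice_name)."""
    stem = Path(filename).stem
    prefix, separator, name = stem.partition("_")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unrecognized voice file name: {filename!r}")
    accent, gender = PREFIXES[prefix]
    return accent, gender, name
```

For example, `describe_voice("af_heart.safetensors")` yields `("American", "female", "heart")`.
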