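A multi-command voice wake-word (keyword spotting) model for mobile devices, built on the SANM architecture from the FunASR framework, with about 3.28M parameters.
Supported command words:
NPU adaptation notes: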
pip install "funasr>=1.1.7" modelscope soundfile
# Note: ffmpeg is not required; it is bypassed through a soundfile monkey-patch.

Original weights source: https://modelscope.cn/models/iic/speech_sanm_kws_phone-xiaoyun-commands-online

modelscope download --model iic/speech_sanm_kws_phone-xiaoyun-commands-online --local_dir ./weights

python3 inference.py --audio assets/kws_xiaoyunxiaoyun.wav --keywords "小云小云" --device npu

Example output:
[INFO] 使用NPU设备: Ascend910_9362
[INFO] 模型路径: weights
[INFO] 输入音频: assets/kws_xiaoyunxiaoyun.wav
[INFO] 关键词: 小云小云
[INFO] 推理设备: npu
[INFO] 模型加载耗时: 3.138s
[INFO] 推理耗时: 0.287s
========== 推理结果 ==========
[{'key': 'kws_xiaoyunxiaoyun', 'text': 'detected 小云小云 0.965860'}]
==============================

python3 inference.py --audio assets/kws_xiaoyunxiaoyun.wav --keywords "小云小云" --device cpu

python3 benchmark.py --device npu --runs 20

NPU performance results:

| Configuration | Avg (ms) | P50 (ms) | P90 (ms) | P99 (ms) | QPS | RTF |
|---|---|---|---|---|---|---|
| Default configuration [4,8,4] | 1618.23 | 1608.18 | 1676.20 | 1709.04 | 0.62 | 0.357 |
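
The columns in the table above can be derived from a list of per-run latencies. The following is a minimal sketch, not the logic of `benchmark.py`: the function names, the nearest-rank percentile rule, and the definitions QPS = 1000 / average latency in ms and RTF = processing time / audio duration are all assumptions. The QPS definition is at least consistent with the table, since 1000 / 1618.23 rounds to 0.62.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list (assumed rule)."""
    if not sorted_values:
        raise ValueError("need at least one latency sample")
    rank = max(1, math.ceil(q / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(latencies_ms, audio_seconds):
    """Aggregate per-run latencies (ms) into the metrics shown in the table."""
    ordered = sorted(latencies_ms)
    avg_ms = statistics.fmean(ordered)
    return {
        "avg_ms": avg_ms,
        "p50_ms": percentile(ordered, 50),
        "p90_ms": percentile(ordered, 90),
        "p99_ms": percentile(ordered, 99),
        # Assumption: one request at a time, so throughput is 1 / latency.
        "qps": 1000.0 / avg_ms,
        # Assumption: RTF is processing time divided by audio duration.
        "rtf": (avg_ms / 1000.0) / audio_seconds,
    }


if __name__ == "__main__":
    # Hypothetical latencies and audio length, for illustration only.
    stats = summarize([100.0, 200.0, 300.0, 400.0], audio_seconds=2.0)
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

An RTF below 1 means the audio is processed faster than real time.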
python3 accuracy.py

Evaluation results:

| Keyword | CPU Status | CPU Score | NPU Status | NPU Score | Score Diff | Match |
|---|---|---|---|---|---|---|
| 小云小云 | detected | 0.965860 | detected | 0.965860 | 0.000000 | PASS |
| 播放音乐 | rejected | 0.000000 | rejected | 0.000000 | 0.000000 | PASS |
| 暂停播放 | rejected | 0.000000 | rejected | 0.000000 | 0.000000 | PASS |
| 增大音量 | rejected | 0.000000 | rejected | 0.000000 | 0.000000 | PASS |
| 减小音量 | rejected | 0.000000 | rejected | 0.000000 | 0.000000 | PASS |
Conclusion: NPU and CPU inference results are identical, the maximum score difference is 0, and the accuracy evaluation is a PASS.
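
The per-keyword comparison in the table can be illustrated with a short sketch that parses result strings of the form shown in the example output (`detected 小云小云 0.965860`) and checks CPU against NPU. This is not the code of `accuracy.py`: the helper names, the treatment of any non-`detected` text as a rejection with score 0, and the `tolerance` value are assumptions made for illustration.

```python
import re

# Matches texts such as "detected 小云小云 0.965860".
DETECTED_PATTERN = re.compile(r"^detected\s+(\S+)\s+([0-9]*\.?[0-9]+)$")


def parse_kws_text(text):
    """Return (status, score) parsed from a KWS result text."""
    match = DETECTED_PATTERN.match(text.strip())
    if match:
        return "detected", float(match.group(2))
    # Assumption: any other text is treated as a rejection with score 0.
    return "rejected", 0.0


def compare(cpu_text, npu_text, tolerance=1e-4):
    """Compare CPU and NPU results; tolerance is a hypothetical threshold."""
    cpu_status, cpu_score = parse_kws_text(cpu_text)
    npu_status, npu_score = parse_kws_text(npu_text)
    diff = abs(cpu_score - npu_score)
    passed = cpu_status == npu_status and diff <= tolerance
    return {
        "cpu": (cpu_status, cpu_score),
        "npu": (npu_status, npu_score),
        "score_diff": diff,
        "match": "PASS" if passed else "FAIL",
    }


if __name__ == "__main__":
    row = compare("detected 小云小云 0.965860", "detected 小云小云 0.965860")
    print(row["match"], f"{row['score_diff']:.6f}")
```

A row passes only when both devices agree on the status and the scores differ by no more than the tolerance.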
.
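├── inference.py # NPU inference script
├── benchmark.py # Performance benchmark script
├── accuracy.py # Accuracy evaluation script
├── README.md # Deployment documentation
├── assets/ # Test audio
│ └── kws_xiaoyunxiaoyun.wav
├── weights/ # Model weights (download required)
└── output/ # Output results

@inproceedings{Gao2020SANMME,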
title={SAN-M: Memory Equipped Self-Attention for End-to-End Speech Recognition},
author={Zhifu Gao and Shiliang Zhang and Ming Lei and Ian Mcloughlin},
booktitle={Interspeech},
year={2020}
}