pip install modelscope
python -c "from modelscope import snapshot_download; snapshot_download('iic/speech_UniASR-large_asr_2pass-zh-cn-16k-common-vocab8358-tensorflow1-offline')"

Input audio format: 16 kHz, mono, WAV.
Audio is loaded with model_utils.load_wav(), which supports a three-tier fallback across torchaudio/soundfile/wave.
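The three-tier fallback can be pictured with a minimal sketch. This is not the code in model_utils.py: the function name `load_wav_sketch`, the return type (a list of mono float samples plus the sample rate), the try order, and the 16-bit-only limit of the last tier are all assumptions made for illustration.

```python
import struct
import wave


def load_wav_sketch(path):
    """Hypothetical loader: return (mono float samples, sample_rate)."""
    # Tier 1: torchaudio, if it is installed and can decode the file.
    try:
        import torchaudio

        waveform, sample_rate = torchaudio.load(path)  # [channels, frames]
        return waveform[0].tolist(), int(sample_rate)
    except Exception:
        pass  # fall through to the next backend

    # Tier 2: soundfile.
    try:
        import soundfile

        data, sample_rate = soundfile.read(path, dtype="float32", always_2d=True)
        return data[:, 0].tolist(), int(sample_rate)  # [frames, channels]
    except Exception:
        pass  # fall through to the standard library

    # Tier 3: standard-library wave module (16-bit PCM only in this sketch).
    with wave.open(path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only decodes 16-bit PCM WAV files")
        channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Keep the first channel and scale int16 values to floats.
    return [value / 32768.0 for value in ints[::channels]], sample_rate
```

Catching broad exceptions in the first two tiers means a missing package and a backend that fails to decode are treated the same way, so the last tier always gets a chance on a plain PCM WAV file.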
pip install -r requirements.txt
python inference.py

Model: iic/speech_UniASR-large_asr_2pass-zh-cn-16k-common-vocab8358-tensorflow1-offline
Audio: assets/test.wav
NPU transcription: 欢迎大家来体验达摩院推出的语音识别模型

Logs are saved to logs/inference.log.
| Metric | Value |
|---|---|
| max_abs_error | 0.023046 |
| mean_abs_error | 0.000784 |
| relative_error | 0.0170% |
| cosine_similarity | 1.000000 |
| threshold | 1.0% |
| Result | PASS |

The CPU-NPU relative error is 0.0170%, well below the 1.0% threshold, so the accuracy consistency check passes.
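For readers who want to see how metrics of this kind can be computed, here is a minimal pure-Python sketch. It does not reproduce eval_consistency.py: the function name `consistency_metrics` is hypothetical, and the definition of relative error used here (L2 norm of the difference divided by L2 norm of the CPU reference) is an assumption, since the README does not state which definition the script uses.

```python
import math


def consistency_metrics(cpu_out, npu_out, threshold=0.01):
    """Compare two equal-length float sequences (CPU reference vs. NPU)."""
    if len(cpu_out) != len(npu_out) or not cpu_out:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [abs(n - c) for c, n in zip(cpu_out, npu_out)]
    cpu_norm = math.sqrt(sum(c * c for c in cpu_out))
    npu_norm = math.sqrt(sum(n * n for n in npu_out))
    diff_norm = math.sqrt(sum(d * d for d in diffs))
    dot = sum(c * n for c, n in zip(cpu_out, npu_out))
    # Assumed definition: ||npu - cpu||_2 / ||cpu||_2.
    relative_error = diff_norm / cpu_norm if cpu_norm else float("inf")
    cosine = dot / (cpu_norm * npu_norm) if cpu_norm and npu_norm else 0.0
    return {
        "max_abs_error": max(diffs),
        "mean_abs_error": sum(diffs) / len(diffs),
        "relative_error": relative_error,
        "cosine_similarity": cosine,
        "passed": relative_error < threshold,  # 0.01 corresponds to 1.0%
    }
```

The default `threshold=0.01` mirrors the 1.0% threshold in the table; the pass rule (strictly below the threshold) is likewise an assumption.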
| Metric | Value |
|---|---|
| Mean latency | 1652.08 ms |
| Min latency | 1613.83 ms |
| Max latency | 1791.25 ms |
| P50 | 1639.75 ms |
| P90 | 1682.72 ms |
| P95 | 1736.99 ms |
| Audio duration | 5.55 s |
| RTF | 0.2978 |
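The real-time factor (RTF) is processing time divided by audio duration, so values below 1 mean faster than real time. The sketch below shows one way to derive the table's statistics from a list of per-run latencies. It is an illustration, not the contents of benchmark.py: the names `percentile` and `summarize_latency` are hypothetical, and the percentile method (linear interpolation between closest ranks) and the use of mean latency for RTF are assumptions.

```python
import statistics


def percentile(sorted_values, pct):
    """Percentile by linear interpolation between closest ranks."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = (len(sorted_values) - 1) * pct / 100.0
    low = int(rank)
    high = min(low + 1, len(sorted_values) - 1)
    spread = sorted_values[high] - sorted_values[low]
    return sorted_values[low] + spread * (rank - low)


def summarize_latency(latencies_ms, audio_seconds):
    """Summarize per-run latencies (ms) for a clip of the given length (s)."""
    ordered = sorted(latencies_ms)
    mean_ms = statistics.mean(ordered)
    return {
        "mean_ms": mean_ms,
        "min_ms": ordered[0],
        "max_ms": ordered[-1],
        "p50_ms": percentile(ordered, 50),
        "p90_ms": percentile(ordered, 90),
        "p95_ms": percentile(ordered, 95),
        # Assumed: RTF uses the mean latency, converted to seconds.
        "rtf": (mean_ms / 1000.0) / audio_seconds,
    }
```

As a sanity check on the reported figures, 1652.08 ms divided by 5.55 s gives roughly 0.298, which agrees with the listed RTF of 0.2978 once rounding of the audio duration is taken into account.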
.
├── assets/
│ └── test.wav
├── logs/
│ ├── inference.log
│ ├── eval_consistency.log
│ └── benchmark.log
├── screenshots/
│ ├── self_verification.txt
│ └── self_verification.png
├── model_utils.py
├── inference.py
├── eval_consistency.py
├── benchmark.py
├── requirements.txt
├── .gitignore
└── README.md

pip install -r requirements.txt
python inference.py
python eval_consistency.py
python benchmark.py

#NPU #Ascend #ASR #UniASR #FunASR #SpeechRecognition