pip install modelscope
python -c "from modelscope import snapshot_download; snapshot_download('dengcunqin/speech_paraformer-large_asr_nat-zh-cantonese-en-16k-vocab8501-online')"

Input audio format: 16 kHz, mono, WAV.
Load audio with model_utils.load_wav(), which supports a three-tier fallback: torchaudio, soundfile, then wave.
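
The three-tier fallback can be sketched as below. This is a minimal illustration, not the code in model_utils.py: the function name `load_wav_sketch`, its return type, and the choice to keep only the first channel are assumptions made for the example.

```python
import struct
import wave


def load_wav_sketch(path):
    """Return (samples, sample_rate), samples being floats in [-1, 1].

    Hypothetical stand-in for model_utils.load_wav(): tries torchaudio,
    then soundfile, then the standard-library wave module.
    """
    # Tier 1: torchaudio (optional dependency).
    try:
        import torchaudio
        waveform, sr = torchaudio.load(path)  # tensor of shape (channels, frames)
        return waveform[0].tolist(), sr
    except Exception:
        pass  # not installed or backend failure: fall through

    # Tier 2: soundfile (optional dependency).
    try:
        import soundfile as sf
        data, sr = sf.read(path, dtype="float32")  # (frames,) or (frames, channels)
        if data.ndim > 1:
            data = data[:, 0]  # ASSUMPTION: keep the first channel only
        return data.tolist(), sr
    except Exception:
        pass

    # Tier 3: standard library, 16-bit PCM only.
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("wave fallback handles 16-bit PCM only")
        sr = wf.getframerate()
        n_channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Samples are interleaved by channel; step by n_channels to keep channel 0.
    return [s / 32768.0 for s in ints[::n_channels]], sr
```

Importing the optional packages inside each `try` keeps the function usable on machines where only the standard library is available.
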
pip install -r requirements.txt
python inference.py

Model: dengcunqin/speech_paraformer-large_asr_nat-zh-cantonese-en-16k-vocab8501-online
Audio: assets/test.wav
NPU transcription: 欢迎大家来体体验达摩院院输出的语语音别识别

Logs are saved to logs/inference.log.

| Metric | Value |
|---|---|
| max_abs_error | 0.000134 |
| mean_abs_error | 0.000004 |
| relative_error | 0.0090% |
| cosine_similarity | 1.000000 |
| threshold | 1.0% |
| Result | PASS |
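
The metrics in this table can be computed along the following lines. This is a sketch under stated assumptions rather than the logic of eval_consistency.py: the table does not define relative_error, so the example assumes it is the mean absolute error divided by the mean absolute reference value, and it assumes PASS means that value is at or below the threshold. The function name `consistency_metrics` is hypothetical.

```python
import math


def consistency_metrics(reference, candidate, threshold=0.01):
    """Compare two equal-length sequences of floats element by element."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("inputs must be non-empty and of equal length")

    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    max_abs = max(diffs)
    mean_abs = sum(diffs) / len(diffs)

    # ASSUMPTION: relative error = mean |diff| / mean |reference|.
    ref_scale = sum(abs(r) for r in reference) / len(reference)
    if ref_scale > 0:
        relative = mean_abs / ref_scale
    else:
        relative = 0.0 if mean_abs == 0 else math.inf

    # Cosine similarity between the two vectors.
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm = math.sqrt(sum(r * r for r in reference)) * math.sqrt(
        sum(c * c for c in candidate)
    )
    cosine = dot / norm if norm > 0 else 0.0

    return {
        "max_abs_error": max_abs,
        "mean_abs_error": mean_abs,
        "relative_error": relative,
        "cosine_similarity": cosine,
        # ASSUMPTION: the pass criterion gates on relative error only.
        "passed": relative <= threshold,
    }
```

With `threshold=0.01` (1.0%), a relative error of 0.0090% as reported above would pass under this criterion.
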

| Metric | Value |
|---|---|
| Mean latency | 858.12 ms |
| Min latency | 833.70 ms |
| Max latency | 894.06 ms |
| P50 | 853.82 ms |
| P90 | 885.47 ms |
| P95 | 889.77 ms |
| Audio duration | 5.55 s |
| RTF | 0.1547 |
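
As a rough illustration of how such a summary can be derived from per-run timings, the sketch below computes the same statistics. It assumes RTF (real-time factor) is mean latency divided by audio duration, which is consistent with the table (858.12 ms / 5.55 s is about 0.155), and it assumes linearly interpolated percentiles; benchmark.py may differ on both points. The names `percentile` and `summarize` are hypothetical.

```python
import math


def percentile(values, q):
    """q-th percentile (0 to 100), linearly interpolated between ranks."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def summarize(latencies_ms, audio_seconds):
    """Summarize per-run latencies (milliseconds) for one audio clip."""
    mean_ms = sum(latencies_ms) / len(latencies_ms)
    return {
        "mean_ms": mean_ms,
        "min_ms": min(latencies_ms),
        "max_ms": max(latencies_ms),
        "p50_ms": percentile(latencies_ms, 50),
        "p90_ms": percentile(latencies_ms, 90),
        "p95_ms": percentile(latencies_ms, 95),
        # ASSUMPTION: RTF = mean processing time / audio duration.
        "rtf": (mean_ms / 1000.0) / audio_seconds,
    }
```

An RTF below 1 means the audio is processed faster than real time.
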
.
├── assets/
│   └── test.wav
├── logs/
│   ├── inference.log
│   ├── eval_consistency.log
│   ├── benchmark.log
│   └── env_check.log
├── screenshots/
│   ├── self_verification.txt
│   └── self_verification.png
├── model_utils.py
├── inference.py
├── eval_consistency.py
├── benchmark.py
├── capture_cpu.py
├── capture_npu.py
├── requirements.txt
├── .gitignore
└── README.md

pip install -r requirements.txt
python inference.py
python eval_consistency.py
python benchmark.py

#NPU #Ascend #ASR #Paraformer #FunASR