pip install modelscope
python -c "from modelscope import snapshot_download; snapshot_download('dengcunqin/speech_seaco_paraformer_large_asr_nat-zh-cantonese-en-16k-common-vocab11666-pytorch')"

Input audio format: 16 kHz, mono, WAV.
Audio is loaded with model_utils.load_wav(), which provides a three-tier fallback across torchaudio, soundfile, and wave.

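
The three-tier fallback described above could look roughly like the sketch below. This is not the code in model_utils.py: the name `load_wav_sketch`, the return type (a list of floats plus the sample rate), the mono downmix, and the restriction of the last tier to 16-bit PCM are all assumptions made for illustration.

```python
import array
import sys
import wave


def load_wav_sketch(path):
    """Hypothetical loader: return (samples, sample_rate), samples in [-1, 1]."""
    # Tier 1: torchaudio, if it is installed and able to decode the file.
    try:
        import torchaudio

        waveform, sample_rate = torchaudio.load(path)  # (channels, frames)
        return waveform.mean(dim=0).tolist(), sample_rate
    except Exception:
        pass  # fall through to the next backend

    # Tier 2: soundfile.
    try:
        import soundfile as sf

        data, sample_rate = sf.read(path, dtype="float32", always_2d=True)
        return data.mean(axis=1).tolist(), sample_rate  # (frames, channels)
    except Exception:
        pass  # fall through to the standard library

    # Tier 3: the standard-library wave module (16-bit PCM only in this sketch).
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        sample_rate = wf.getframerate()
        channels = wf.getnchannels()
        pcm = array.array("h", wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    # Average the interleaved channels and scale int16 to [-1, 1).
    samples = [
        sum(pcm[i:i + channels]) / channels / 32768.0
        for i in range(0, len(pcm), channels)
    ]
    return samples, sample_rate
```

Each tier is wrapped in its own try block, so a missing package and a decode failure are handled the same way: the next backend is tried, and only the last tier raises.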
pip install -r requirements.txt
python inference.py

Model: dengcunqin/speech_seaco_paraformer_large_asr_nat-zh-cantonese-en-16k-common-vocab11666-pytorch
Audio: assets/test.wav
NPU transcription: 有无人知道金钟添马街系点去㗎

Logs are saved to logs/inference.log.

| Metric | Value |
|---|---|
| max_abs_error | 0.000022 |
| mean_abs_error | 0.000002 |
| relative_error | 0.0034% |
| cosine_similarity | 1.000000 |
| threshold | 1.0% |
| Result | PASS |

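
The metrics in the table above can be computed from two output vectors along the following lines. This is a sketch, not the contents of eval_consistency.py: the function name `compare_outputs` is hypothetical, and the definition of relative_error (sum of absolute differences divided by the sum of absolute reference values) and the rule that PASS means relative_error below the threshold are assumptions.

```python
import math


def compare_outputs(reference, candidate, threshold=0.01):
    """Hypothetical consistency check between two equally long float vectors."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("inputs must be non-empty and of equal length")

    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    max_abs_error = max(diffs)
    mean_abs_error = sum(diffs) / len(diffs)

    # Assumed definition: total absolute error relative to total magnitude.
    ref_magnitude = sum(abs(r) for r in reference)
    if ref_magnitude:
        relative_error = sum(diffs) / ref_magnitude
    else:
        relative_error = 0.0 if max_abs_error == 0 else math.inf

    # Cosine similarity compares direction only and ignores overall scale.
    dot = sum(r * c for r, c in zip(reference, candidate))
    norms = math.sqrt(sum(r * r for r in reference)) * math.sqrt(
        sum(c * c for c in candidate)
    )
    cosine_similarity = dot / norms if norms else 0.0

    return {
        "max_abs_error": max_abs_error,
        "mean_abs_error": mean_abs_error,
        "relative_error": relative_error,
        "cosine_similarity": cosine_similarity,
        "passed": relative_error < threshold,  # assumed pass rule
    }
```

Cosine similarity alone cannot catch a uniformly scaled output (a vector and its double have similarity 1), which is one reason to report it alongside the absolute and relative errors.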

| Metric | Value |
|---|---|
| Mean latency | 84.56 ms |
| Min latency | 83.93 ms |
| Max latency | 85.89 ms |
| P50 | 84.30 ms |
| P90 | 85.53 ms |
| P95 | 85.71 ms |
| Audio duration | 6.46 s |
| RTF | 0.0131 |

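
The RTF in the table is consistent with mean latency divided by audio duration: 0.08456 s / 6.46 s ≈ 0.0131. The sketch below summarizes a list of per-run latencies the same way. It is not the code in benchmark.py: the function names are hypothetical, and the percentile method (linear interpolation between neighboring ranks) is an assumption, since the method used for P50/P90/P95 is not stated.

```python
import statistics


def percentile(sorted_values, q):
    """Percentile q (0-100) by linear interpolation between closest ranks."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def summarize_latency(latencies_ms, audio_seconds):
    """Hypothetical summary of per-run latencies (ms) for one audio clip."""
    ordered = sorted(latencies_ms)
    mean_ms = statistics.fmean(ordered)
    return {
        "mean_ms": mean_ms,
        "min_ms": ordered[0],
        "max_ms": ordered[-1],
        "p50_ms": percentile(ordered, 50),
        "p90_ms": percentile(ordered, 90),
        "p95_ms": percentile(ordered, 95),
        # Real-time factor: processing time divided by audio duration.
        "rtf": (mean_ms / 1000.0) / audio_seconds,
    }
```

An RTF below 1 means the clip is processed faster than real time; at 0.0131, one second of audio takes roughly 13 ms to transcribe.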
.
├── assets/
│ └── test.wav
├── logs/
│ ├── env_check.log
│ ├── inference.log
│ ├── eval_consistency.log
│ └── benchmark.log
├── screenshots/
│ └── self_verification.txt
├── model_utils.py
├── inference.py
├── eval_consistency.py
├── benchmark.py
├── requirements.txt
├── .gitignore
└── README.md

pip install -r requirements.txt
python inference.py
python eval_consistency.py
python benchmark.py

#NPU #Ascend #ASR #FunASR #Paraformer #Cantonese #Chinese #English