from modelscope import snapshot_download
# Download the model snapshot from ModelScope; model_dir is the local directory it was saved to.
model_dir = snapshot_download("iic/speech_paraformer-large-vad-punc_asr_nat-en-16k-common-vocab10020")
Audio preprocessing
Input format: WAV, 16 kHz, mono
Preprocessing: audio is loaded with load_wav() and resampled to 16 kHz
Loading uses a three-tier fallback: torchaudio / soundfile / wave
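
The repository's actual load_wav() is not shown here, so the following is only a minimal sketch of how such a three-tier fallback might look. It assumes the readers are tried in the order listed (torchaudio, then soundfile, then the standard-library wave module), that the wave tier only needs to handle 16-bit PCM, and that downmixing to mono is done by averaging channels. The helper names (_read_torchaudio, _read_soundfile, _read_wave, _resample_linear) are hypothetical, and the linear-interpolation resampler is a simple stand-in rather than whatever resampling method the real script uses.

```python
import wave

import numpy as np

TARGET_SR = 16000  # sample rate expected by the model, per the input format above


def _read_torchaudio(path):
    import torchaudio  # optional dependency, imported lazily
    wav, sr = torchaudio.load(path)  # tensor shaped (channels, frames)
    return wav.numpy().T, sr


def _read_soundfile(path):
    import soundfile as sf  # optional dependency, imported lazily
    data, sr = sf.read(path, dtype="float32", always_2d=True)  # (frames, channels)
    return data, sr


def _read_wave(path):
    # Standard-library reader; this sketch assumes 16-bit PCM only.
    with wave.open(path, "rb") as f:
        if f.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        sr = f.getframerate()
        channels = f.getnchannels()
        raw = f.readframes(f.getnframes())
    data = np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0
    return data.reshape(-1, channels), sr


def _resample_linear(x, src_sr, dst_sr):
    # Simple linear interpolation; a stand-in for a proper band-limited resampler.
    if src_sr == dst_sr or len(x) == 0:
        return x
    n_out = int(round(len(x) * dst_sr / src_sr))
    src_t = np.arange(len(x)) / src_sr
    dst_t = np.arange(n_out) / dst_sr
    return np.interp(dst_t, src_t, x).astype(np.float32)


def load_wav(path, target_sr=TARGET_SR):
    """Return (mono float32 samples, target_sr), trying each reader in turn."""
    last_err = None
    for reader in (_read_torchaudio, _read_soundfile, _read_wave):
        try:
            data, sr = reader(path)
            break
        except Exception as err:  # missing package or unsupported file
            last_err = err
    else:
        raise RuntimeError(f"could not read {path}") from last_err
    mono = np.asarray(data, dtype=np.float32).mean(axis=1)  # average channels
    return _resample_linear(mono, sr, target_sr), target_sr
```

Calling load_wav("sample.wav") (the file name is a placeholder) would return a one-dimensional float32 array at 16 kHz regardless of which reader succeeded. Catching a broad Exception per tier keeps the fallback working both when a package is missing and when it is installed but cannot decode the file.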
NPU inference command
python inference.py
NPU inference output
refuse horace vo kingdom ibrahim horace re architectural identities kingdom float peck splendor against rubbed hainanese unequal retention sheriffng consist inquired assemble vo