speech_paraformer-large_asr_nat-zh-cn-16k-aishell1-vocab8404-pytorch NPU Adaptation

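Model Information

Item	Details
Model name	`iic/speech_paraformer-large_asr_nat-zh-cn-16k-aishell1-vocab8404-pytorch`
Task type	Automatic speech recognition (ASR)
Architecture	Paraformer-Large (SANMEncoder + CifPredictorV2)
Framework	FunASR 1.3.1
Source	ModelScope (DAMO Academy)
Language	Chinese
Sample rate	16 kHz
Training data	AISHELL-1
Features	Non-streaming offline inference; large model for high-accuracy recognition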

Environment

Item	Version
NPU	Ascend910_9362
CANN	8.5.1
Python	3.11.14
torch	2.x
torch_npu	2.9.0
FunASR	1.3.1

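Model Download

# Download the model snapshot from ModelScope; returns the local directory path
from modelscope import snapshot_download
model_dir = snapshot_download("iic/speech_paraformer-large_asr_nat-zh-cn-16k-aishell1-vocab8404-pytorch")

Note: this model ships with the legacy generic-asr configuration format. configuration.json must be converted to the new funasr format, and the component names in config.yaml must be corrected to the expected casing (e.g. sanm → SANMEncoder). See the loading logic in model_utils.py for details.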
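The name correction described in the note can be sketched as a small dictionary rewrite. This is an illustration only, not the actual logic in model_utils.py: the function `normalize_component_names`, the assumed `encoder` key, and the flat config layout are hypothetical, and the only mapping taken from this README is `sanm → SANMEncoder`.

```python
# Illustrative sketch; NOT the project's model_utils.py.
# Only the "sanm" -> "SANMEncoder" pair comes from this README.
# Extend the table with the names your own config.yaml actually uses.
LEGACY_TO_FUNASR = {"sanm": "SANMEncoder"}

# Keys whose values name a component class (assumed config layout).
COMPONENT_KEYS = ("encoder",)


def normalize_component_names(config, mapping=LEGACY_TO_FUNASR, keys=COMPONENT_KEYS):
    """Return a shallow copy of `config` with legacy component names replaced."""
    fixed = dict(config)
    for key in keys:
        value = fixed.get(key)
        # Rewrite only exact matches; unknown names are left untouched.
        if isinstance(value, str) and value in mapping:
            fixed[key] = mapping[value]
    return fixed


legacy = {"encoder": "sanm", "other_key": "unchanged"}
print(normalize_component_names(legacy)["encoder"])  # SANMEncoder
```

Returning a copy rather than editing in place keeps the downloaded config available for comparison if the converted one fails to load.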

Audio Preprocessing

Input format: WAV, 16 kHz, mono
Preprocessing: load_wav() loads the file and resamples it to 16 kHz
Three-tier loading fallback: torchaudio / soundfile / wave

NPU Inference Command

python inference.py

NPU Inference Output

大家来体验达摩院推出的语音识别识别模型

CPU-NPU Accuracy Consistency Results

Metric	Value
max_abs_error	0.000136
mean_abs_error	0.000003
relative_error	0.0513%
cosine_similarity	0.99999999992
threshold	1.0%
Result	PASS
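For readers who want to reproduce metrics of this kind, the sketch below computes them for two flattened output vectors. It is not the code in eval_consistency.py. In particular, the README does not define relative_error, so the L2-norm ratio used here is an assumption, as is the rule that PASS means relative_error is below the threshold; the function name `consistency_metrics` is hypothetical.

```python
import math


def consistency_metrics(cpu_out, npu_out, threshold=0.01):
    """Compare two equal-length flat sequences of floats (CPU vs. NPU outputs)."""
    if len(cpu_out) != len(npu_out) or not cpu_out:
        raise ValueError("inputs must be non-empty and of equal length")

    diffs = [abs(c - n) for c, n in zip(cpu_out, npu_out)]
    cpu_norm = math.sqrt(sum(c * c for c in cpu_out))
    npu_norm = math.sqrt(sum(n * n for n in npu_out))
    diff_norm = math.sqrt(sum(d * d for d in diffs))
    dot = sum(c * n for c, n in zip(cpu_out, npu_out))

    # ASSUMPTION: relative error = ||cpu - npu||_2 / ||cpu||_2
    if cpu_norm == 0.0:
        relative_error = 0.0 if diff_norm == 0.0 else float("inf")
    else:
        relative_error = diff_norm / cpu_norm

    # Cosine similarity is undefined when either vector is all zeros.
    if cpu_norm == 0.0 or npu_norm == 0.0:
        cosine = float("nan")
    else:
        cosine = dot / (cpu_norm * npu_norm)

    return {
        "max_abs_error": max(diffs),
        "mean_abs_error": sum(diffs) / len(diffs),
        "relative_error": relative_error,
        "cosine_similarity": cosine,
        # ASSUMPTION: the check passes when relative error is under the threshold.
        "passed": relative_error < threshold,
    }
```

With `threshold=0.01` (1.0%), a relative error of 0.0513% as reported in the table would pass under this rule.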

Benchmark Results

Metric	Value
avg_latency_ms	462.80
min_latency_ms	455.68
max_latency_ms	472.25
p50_latency_ms	460.72
p90_latency_ms	468.04
p95_latency_ms	470.15
audio_duration_sec	5.55
real_time_factor	0.0834
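The real-time factor is consistent with average latency divided by audio duration: 0.4628 s / 5.55 s ≈ 0.0834. The sketch below shows one way such a summary could be computed from a list of per-run latencies. It is not the code in benchmark.py; the names `percentile` and `summarize` and the linear-interpolation percentile rule are assumptions.

```python
import statistics


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an ascending list, q in [0, 100]."""
    if not sorted_vals:
        raise ValueError("no samples")
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def summarize(latencies_ms, audio_duration_sec):
    """Aggregate per-run latencies (milliseconds) into the table's metrics."""
    vals = sorted(latencies_ms)
    avg = statistics.fmean(vals)
    return {
        "avg_latency_ms": avg,
        "min_latency_ms": vals[0],
        "max_latency_ms": vals[-1],
        "p50_latency_ms": percentile(vals, 50),
        "p90_latency_ms": percentile(vals, 90),
        "p95_latency_ms": percentile(vals, 95),
        "audio_duration_sec": audio_duration_sec,
        # RTF: processing time per second of audio (lower is faster).
        "real_time_factor": (avg / 1000.0) / audio_duration_sec,
    }


# Reproduce the reported RTF from the table's own numbers.
print(round((462.80 / 1000.0) / 5.55, 4))  # 0.0834
```

An RTF of 0.0834 means roughly 12 seconds of audio are processed per second of wall-clock time on this input.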

Project Structure

iic-speech_paraformer-large_asr_nat-zh-cn-16k-aishell1-vocab8404-pytorch-NPU/
├── assets/
│   └── test.wav
├── logs/
│   ├── env_check.log
│   ├── inference.log
│   ├── eval_consistency.log
│   └── benchmark.log
├── screenshots/
│   └── self_verification.png
├── models/
├── model_utils.py
├── inference.py
├── eval_consistency.py
├── benchmark.py
├── requirements.txt
├── .gitignore
└── README.md

Usage

# Install dependencies
pip install -r requirements.txt

# NPU inference
python inference.py

# CPU-NPU consistency check
python eval_consistency.py

# Performance benchmark
python benchmark.py

