| Component | Version | Environment setup guide |
|---|---|---|
| CANN | 8.2.RC1 | - |
| Python | 3.10.12 | - |
| torch | 2.8.0 | - |
| torch_npu | 2.8.0rc1 | - |
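
Before going further, it can help to confirm that the Python packages in the table are present at the listed versions. The following is a minimal sketch, not part of the original guide: the helper names `installed_version` and `find_mismatches` are hypothetical, the package names are assumed to match their pip distribution names, and a local build suffix such as `+cpu` is ignored when comparing. CANN is not a Python package, so it is not checked here.

```python
from importlib import metadata
import platform

# Versions copied from the table above (assumption: the pip distribution
# names are identical to the names shown in the table).
EXPECTED = {
    "torch": "2.8.0",
    "torch_npu": "2.8.0rc1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every package that differs."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Drop a local build suffix such as "+cpu" before comparing.
        base = found.split("+")[0] if found is not None else None
        if base != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    print("Python", platform.python_version(), "(table lists 3.10.12)")
    for package, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty result from `find_mismatches(EXPECTED)` means the checked packages match the table; a `found` value of `None` means the package is not installed.
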

MiDashengLM-7B requires vLLM 0.10.2 or later.

Run the following commands:

    pip config set global.extra-index-url "https://download.pytorch.org/whl/cpu/ https://mirrors.huaweicloud.com/ascend/repos/pypi"
    pip install vllm==0.10.2
    pip install vllm-ascend==0.10.2rc1
    pip install vllm[audio]
    pip install torchaudio==2.8.0

The MiDashengLM model code and the Torch Audio library it uses are not yet fully adapted to Ascend, so the code needs the following modifications.

Set the data type of `window_fn` to `torch.float32` when it is initialized.

/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/midashenglm.py:339

    def _init_front_end(self, config):
        with set_default_torch_dtype(torch.float32):
            # Build the Hann window explicitly in float32.
            window_fn = lambda win_len: torch.hann_window(win_len, dtype=torch.float32)
            self.front_end = nn.Sequential(
                audio_transforms.MelSpectrogram(
                    f_min=config.f_min,
                    f_max=config.f_max,
                    center=config.center,
                    win_length=config.win_length,
                    hop_length=config.hop_length,
                    sample_rate=config.sample_rate,
                    n_fft=config.n_fft,
                    n_mels=config.n_mels,
                    window_fn=window_fn,
                ),
                audio_transforms.AmplitudeToDB(top_db=120),
            )

Cast `x` to `torch.float32`, because `torch.stft` does not support DT_BFLOAT16.

/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/midashenglm.py

    def forward(
        self,
        x: torch.Tensor,
        x_length: Optional[torch.Tensor] = None,
    ) -> tuple[torch.Tensor, Optional[torch.Tensor]]:
        # torch.stft does not support bfloat16, so run the front end in float32.
        x = x.to(torch.float32)
        x = self.front_end(x)
        # Cast back to the dtype used by the rest of the encoder.
        x = x.to(self.time_pos_embed.dtype)
        target_length_in_patches = self.target_length // 4
        x = x.unsqueeze(1)
        x = torch.permute(x, (0, 2, 1, 3))
        x = self.init_bn(x)
        x = torch.permute(x, (0, 2, 1, 3))
        x = self.patch_embed(x)
        t = x.shape[-1]

Add support for taking the absolute value of complex tensors.

/usr/local/lib/python3.10/dist-packages/torchaudio/functional/functional.py:145

        spec_f = spec_f.reshape(shape[:-1] + spec_f.shape[-2:])
        if window_norm:
            spec_f /= window.pow(2.0).sum().sqrt()
        if power is not None:
            if not spec_f.is_complex():
                # Real input: abs() can be used directly.
                if power == 1.0:
                    return spec_f.abs()
                return spec_f.abs().pow(power)
            else:
                # Complex input: compute the magnitude from the real and
                # imaginary parts instead of calling abs().
                real_part = spec_f.real
                imag_part = spec_f.imag
                abs_tensor = torch.hypot(real_part, imag_part)
                if power == 1.0:
                    return abs_tensor
                return abs_tensor.pow(power)
        return spec_f

Download the model weights:

    modelscope download --model midasheng/midashenglm-7b --local_dir ./MiDashengLM-7B

Start the service with bfloat16 precision.


    vllm serve models/MiDashengLM-7B-bf16 \
        --served-model-name midashenglm-7b-bf16 \
        --tensor-parallel-size 1 \
        --max_model_len 4096 \
        --trust-remote-code \
        --dtype bfloat16 \
        --enforce-eager \
        --port 8106

Send a test request to the service:

    curl http://127.0.0.1:8106/v1/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $INF_API_KEY" \
        -d '{
            "model": "midashenglm-7b-bf16",
            "messages": [
                {"role": "system", "content": "You are a helpful language and speech assistant."},
                {"role": "user", "content": [
                    {"type": "audio_url", "audio_url": {"url": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2.5-Omni/cough.wav"}},
                    {"type": "text", "text": "Caption the audio."}
                ]}
            ],
            "temperature": 0.7,
            "max_tokens": 2048
        }'

The MiDashengLM-7B model is sensitive to the temperature parameter. For recognition-type evaluations, temperature can be set to 0.
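
For scripted evaluations, the same request can be issued from Python with temperature fixed at 0. The following is a minimal sketch using only the standard library; the helper names `build_payload` and `send` are hypothetical, and the host, port, and served model name are assumed to match the serve command in this guide.

```python
import json
import urllib.request


def build_payload(audio_url, prompt, temperature=0.0, max_tokens=2048):
    """Build the chat-completions request body with one audio input."""
    return {
        "model": "midashenglm-7b-bf16",
        "messages": [
            {"role": "system",
             "content": "You are a helpful language and speech assistant."},
            {"role": "user", "content": [
                {"type": "audio_url", "audio_url": {"url": audio_url}},
                {"type": "text", "text": prompt},
            ]},
        ],
        # 0 by default, for recognition-type evaluations.
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def send(payload, base_url="http://127.0.0.1:8106", api_key=""):
    """POST the payload to the server and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `send(build_payload(audio_url, "Caption the audio."))` returns the response as a dictionary; in the OpenAI-compatible format served by vLLM, the generated text is found under `choices[0]["message"]["content"]`.
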