
AI-ModelScope_stella-large-zh-v2

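1. Introduction

This document records the quick deployment and verification results for AI-ModelScope_stella-large-zh-v2 on an Ascend NPU (Ascend910) environment. It is a text embedding model built on the HuggingFace transformers framework.

Related links:

  • Weights download (ModelScope): https://modelscope.cn/models/AI-ModelScope_stella-large-zh-v2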

2. Quick Deployment

2.1 Environment Setup

pip install transformers torch
pip install torch-npu  # Ascend NPU only; the version must match the installed torch and CANN

2.2 Inference Code

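import torch
from transformers import AutoTokenizer, AutoModel

try:
    import torch_npu  # noqa: F401  (Ascend adapter; registers the torch.npu backend)
    use_npu = torch.npu.is_available()
except ImportError:
    use_npu = False

# Fall back to CPU when no NPU is available.
device = torch.device("npu:0" if use_npu else "cpu")

# Local directory or model ID; trust_remote_code lets model-provided code run.
model_path = "AI-ModelScope_stella-large-zh-v2"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
model = model.to(device).eval()

texts = ["Hello world"]
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model(**inputs)

# Use the first-token ([CLS]) hidden state as the sentence embedding.
embeddings = outputs.last_hidden_state[:, 0, :]
print(f"Embedding shape: {embeddings.shape}")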
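
The inference code above takes the first-token ([CLS]) vector as the sentence embedding. Many sentence-embedding models are instead trained with attention-masked mean pooling, so which pooling this model expects should be confirmed against its upstream model card. The following is a minimal sketch of the mean-pooling alternative, assuming NumPy is installed and that the hidden states and attention mask have already been moved to the CPU. The helper names mean_pool and l2_normalize are illustrative and not part of transformers.

```python
import numpy as np


def mean_pool(last_hidden: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors over non-padding positions.

    last_hidden:    (batch, seq_len, hidden) float array
    attention_mask: (batch, seq_len) array of 0/1
    """
    mask = attention_mask[..., None].astype(last_hidden.dtype)
    summed = (last_hidden * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against all-padding rows
    return summed / counts


def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


# Toy data: the third token of the first sentence is padding and must be ignored.
hidden = np.array([[[1.0, 0.0], [3.0, 0.0], [100.0, 100.0]],
                   [[0.0, 2.0], [0.0, 2.0], [0.0, 2.0]]])
mask = np.array([[1, 1, 0], [1, 1, 1]])
emb = l2_normalize(mean_pool(hidden, mask))
print(emb @ emb.T)  # pairwise cosine similarity matrix
```

With the model outputs from the inference code, the inputs would be outputs.last_hidden_state.cpu().numpy() and inputs["attention_mask"].cpu().numpy().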

3. Accuracy Evaluation

Numerical consistency of logits between NPU and CPU:

Metric                 Value
Top-1 consistency      4/4
Max Logit Diff Ratio   0.002291
Avg KL Divergence      0.0
Conclusion             PASS
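
The source does not state how these metrics are computed. The sketch below shows one plausible way to obtain them, under assumed definitions: Top-1 consistency counts rows whose argmax matches, Max Logit Diff Ratio is the largest absolute difference divided by the largest absolute reference value, and Avg KL Divergence is the mean KL divergence between the row-wise softmax distributions. The function name compare_logits and the 1% default tolerance (taken from the note in section 5) are illustrative.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def compare_logits(ref, test, rel_tol: float = 0.01) -> dict:
    """Compare reference (e.g. CPU) and test (e.g. NPU) outputs of shape (n, d)."""
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)

    # Assumed definition: number of rows whose argmax agrees.
    top1_matches = int((ref.argmax(-1) == test.argmax(-1)).sum())

    # Assumed definition: worst absolute error relative to the largest reference magnitude.
    max_diff_ratio = float(np.abs(ref - test).max() / (np.abs(ref).max() + 1e-12))

    # Assumed definition: mean KL(softmax(ref) || softmax(test)) over rows.
    p, q = softmax(ref), softmax(test)
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)

    return {
        "top1": f"{top1_matches}/{ref.shape[0]}",
        "max_logit_diff_ratio": max_diff_ratio,
        "avg_kl": float(kl.mean()),
        "passed": top1_matches == ref.shape[0] and max_diff_ratio < rel_tol,
    }


cpu_out = np.array([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
npu_out = cpu_out + 0.001  # stand-in for a small device-specific deviation
print(compare_logits(cpu_out, npu_out))
```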

4. Performance Reference

Metric                   Value
Hardware                 Ascend 910B
Average inference time   6.95 ms
Test conditions          batch=8, max_length=128
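
The source does not describe how the average inference time was measured. A common approach is to run a few warm-up iterations, then average wall-clock time over repeated calls, synchronizing the device before stopping the clock because accelerator kernels are launched asynchronously. The sketch below illustrates that approach using only the standard library; benchmark_ms and its warmup, iters, and sync parameters are hypothetical names, not the author's benchmark script.

```python
import time
from statistics import mean
from typing import Callable, Optional


def benchmark_ms(fn: Callable[[], object], warmup: int = 5, iters: int = 20,
                 sync: Optional[Callable[[], None]] = None) -> float:
    """Return the mean wall-clock latency of fn() in milliseconds."""
    # Warm-up runs absorb one-time costs such as kernel compilation and caching.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    timings = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for queued device work before stopping the clock
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


# CPU-only demonstration with a trivial workload.
print(f"{benchmark_ms(lambda: sum(range(10_000))):.3f} ms")
```

On an NPU, the call would look something like benchmark_ms(lambda: model(**inputs), sync=torch.npu.synchronize) inside a torch.no_grad() block, assuming torch_npu is installed and exposes synchronize in the same way torch.cuda does.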

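5. Notes

The accuracy check verifies numerical consistency between NPU and CPU outputs (relative error of logits < 1%); it does not measure the model's task accuracy.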
