ModelScope:Trelis/multi-qa-MiniLM-L6-dot-v1-ft-pairs | GitCode:Trelis/multi-qa-MiniLM-L6-dot-v1-ft-pairs

| Metric | Value |
|---|---|
| Top-1 agreement | N/A (embedding model) |
| Max Logit Diff Ratio | 0.000975 |
| Avg KL Divergence | 0 |
| Weight size | 86.7 MB |
| Device | Ascend 910B NPU |
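
The table does not state how Max Logit Diff Ratio and Avg KL Divergence are defined. As a rough illustration only, the sketch below assumes the ratio is the largest absolute element-wise difference between a reference output and the NPU output, divided by the largest absolute reference value, and that the KL divergence is taken between softmax-normalized outputs. The function names and sample vectors are hypothetical and are not taken from `eval/run_accuracy.py`.

```python
import math

def max_diff_ratio(reference, candidate):
    """Largest absolute element-wise difference, relative to the largest
    absolute reference value (assumed definition)."""
    max_diff = max(abs(r - c) for r, c in zip(reference, candidate))
    max_ref = max(abs(r) for r in reference)
    return max_diff / max_ref if max_ref > 0 else 0.0

def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(reference, candidate, eps=1e-12):
    """KL(softmax(reference) || softmax(candidate)) (assumed definition)."""
    p = softmax(reference)
    q = softmax(candidate)
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Made-up vectors standing in for one reference (CPU) row and one NPU row.
cpu_row = [1.0, -2.0, 4.0]
npu_row = [1.0, -2.0, 3.996]
ratio = max_diff_ratio(cpu_row, npu_row)
kl = kl_divergence(cpu_row, npu_row)
print(f"max diff ratio: {ratio:.6f}, KL: {kl:.3e}")
```

Under these assumed definitions, a ratio near 0.001 means the two outputs differ by about 0.1% of the reference's dynamic range, and a KL divergence that rounds to 0 means the normalized outputs are practically indistinguishable.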
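
    import torch
    import torch_npu  # registers the "npu" device with PyTorch (assumes the Ascend adapter is installed)
    from transformers import AutoTokenizer, BertModel

    MODEL_DIR = "/opt/atomgit/~/output/Trelis_multi-qa-MiniLM-L6-dot-v1-ft-pairs/model/Trelis/multi-qa-MiniLM-L6-dot-v1-ft-pairs"

    # Load the tokenizer and the BERT encoder, then move the model to the first NPU.
    tok = AutoTokenizer.from_pretrained(MODEL_DIR, trust_remote_code=True)
    model = BertModel.from_pretrained(MODEL_DIR, trust_remote_code=True).to("npu:0").eval()

    # Tokenize a sample sentence ("测试文本" means "test text") and move the tensors to the NPU.
    inputs = tok(["测试文本"], return_tensors="pt", padding=True, truncation=True, max_length=128)
    inputs = {k: v.to("npu:0") for k, v in inputs.items()}

    # Run a forward pass without tracking gradients.
    with torch.no_grad():
        outputs = model(**inputs)

    # Print the output shape: (batch, sequence length, hidden size) for last_hidden_state.
    print(outputs.last_hidden_state.shape if hasattr(outputs, "last_hidden_state") else outputs.logits.shape)

- `inference.py`: Ascend NPU inference script
- `eval/run_accuracy.py`: accuracy verification script
- `eval/run_performance.py`: performance test script
- `eval/accuracy.json`: accuracy verification results
- `eval/performance.json`: performance test results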
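
The inference snippet only prints the shape of `last_hidden_state`. To turn that tensor into sentence embeddings and compare them, a pooling step and a scoring function are needed. The sketch below assumes CLS pooling and dot-product scoring, which the `dot-v1` part of the model name suggests but this README does not confirm; check the checkpoint's pooling configuration before relying on it. It uses NumPy on a random array standing in for `outputs.last_hidden_state.cpu().numpy()`, and the hidden size of 384 is likewise an assumption. The helper names are hypothetical.

```python
import numpy as np

def cls_pool(last_hidden_state: np.ndarray) -> np.ndarray:
    """Take the first ([CLS]) token vector of each sequence:
    (batch, seq_len, hidden) -> (batch, hidden). Assumed pooling method."""
    return last_hidden_state[:, 0, :]

def dot_scores(query_embs: np.ndarray, doc_embs: np.ndarray) -> np.ndarray:
    """Dot-product similarity: (n_queries, hidden) x (n_docs, hidden)
    -> (n_queries, n_docs). Higher means more similar."""
    return query_embs @ doc_embs.T

# Random stand-in for outputs.last_hidden_state.cpu().numpy():
# 3 sequences, 8 tokens each, hidden size 384 (assumed).
rng = np.random.default_rng(0)
hidden_states = rng.standard_normal((3, 8, 384)).astype(np.float32)

embs = cls_pool(hidden_states)            # one vector per input text
scores = dot_scores(embs[:1], embs[1:])   # first text as query, the rest as documents
print(embs.shape, scores.shape)
```

With a real model, the first row would typically be the encoded question and the remaining rows the candidate passages; the passage with the highest score is the best match under this scoring assumption.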