| Component | Version |
|---|---|
| PyTorch | 2.9.0 |
| torch_npu | 2.9.0.post1 |
| Ascend CANN | 8.5.1 |
| NPU | Ascend 910B4 (29.5GB) |
| Transformers | 4.57.6 |

| File | Description |
|---|---|
| inference.py | Single-image inference + benchmark mode |
| accuracy_run.py | CPU vs NPU accuracy validation |
| accuracy_run_perf.py | NPU performance benchmark |
| result.json | Sample inference result |
| accuracy_report.json | Accuracy validation report |
| perf_report.json | Performance benchmark report |

Single-image inference:

    python3 inference.py \
        --model_path /path/to/siglip2-base-patch16-512 \
        --image /path/to/image.jpg \
        --output result.json

Benchmark mode:

    python3 inference.py \
        --model_path /path/to/siglip2-base-patch16-512 \
        --benchmark \
        --warmup 3 \
        --iterations 10

Or use the dedicated benchmark script:

    python3 accuracy_run_perf.py /path/to/siglip2-base-patch16-512 10 perf_report.json

CPU vs NPU accuracy validation:

    python3 accuracy_run.py \
        --model_path /path/to/siglip2-base-patch16-512 \
        --output accuracy_report.json

| Metric | Value |
|---|---|
| Avg Latency | 14.59 ms |
| Median Latency | 14.28 ms |
| P90 Latency | 14.38 ms |
| P99 Latency | 21.96 ms |
| Min Latency | 14.18 ms |
| Max Latency | 29.07 ms |
| Throughput | 68.55 img/s |

| Metric | Value |
|---|---|
| CPU vs NPU Prediction Match | ✅ 5/5 |
| Max Relative Error | < 0.1% |
| Cosine Similarity | 1.0 |
| Status | ✅ PASS |

Note: This is the base model without fine-tuning. The classifier head (LABEL_0, LABEL_1) is randomly initialized. Fine-tuning is required for downstream tasks.

Based on the existing evaluation data, the cosine-similarity accuracy error between CPU and NPU is 0.0%, which meets the requirement of less than 1%.
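The accuracy figures above (cosine similarity, maximum relative error, pass/fail against a 1% threshold) could be derived along the following lines. This is a minimal pure-Python sketch, not the code in `accuracy_run.py`: the function names, the definition of the error as `(1 - cosine similarity) * 100`, and the sample vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def max_relative_error(reference, candidate, eps=1e-12):
    """Largest element-wise |candidate - reference| / |reference|."""
    if len(reference) != len(candidate):
        raise ValueError("vectors must have the same length")
    # eps avoids division by zero when a reference value is exactly 0.
    return max(abs(c - r) / (abs(r) + eps) for r, c in zip(reference, candidate))


def compare_outputs(cpu_values, npu_values, tolerance_pct=1.0):
    """Hypothetical helper: compare CPU (reference) and NPU outputs."""
    cos = cosine_similarity(cpu_values, npu_values)
    # Assumed definition of the accuracy error reported in percent.
    cosine_error_pct = (1.0 - cos) * 100.0
    return {
        "cosine_similarity": cos,
        "cosine_error_pct": cosine_error_pct,
        "max_relative_error": max_relative_error(cpu_values, npu_values),
        "passed": cosine_error_pct < tolerance_pct,
    }


if __name__ == "__main__":
    # Made-up values for illustration only, not measured model outputs.
    cpu = [0.25, -1.5, 3.0]
    npu = [0.2501, -1.4998, 3.0003]
    print(compare_outputs(cpu, npu))
```

In practice the two input vectors would be the flattened model outputs from the CPU run and the NPU run on the same image.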
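This repository provides complete inference scripts that support both CPU and NPU inference:

    # NPU inference
    python3 inference.py --device npu
    # CPU inference
    python3 inference.py --device cpu

After inference completes, the script prints the inference result and elapsed time, indicating that the model ran successfully on the NPU.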
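For readers who want to reproduce latency figures like those in the performance table, the sketch below shows one way per-run timings could be collected and summarized. It is not taken from `inference.py` or `accuracy_run_perf.py`: `benchmark`, `summarize_latencies`, and `percentile` are hypothetical names, the percentile uses linear interpolation (the repository may use a different method), and throughput is assumed to be `1000 / average latency in ms` for batch size 1, which agrees with the reported 14.59 ms and 68.55 img/s up to rounding.

```python
import statistics
import time


def percentile(sorted_values, pct):
    """Linear-interpolation percentile of an already sorted list (pct in 0..100)."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def summarize_latencies(latencies_ms):
    """Turn a list of per-run latencies (ms) into the metrics used in the table."""
    ordered = sorted(latencies_ms)
    mean_ms = statistics.mean(ordered)
    return {
        "avg_ms": mean_ms,
        "median_ms": statistics.median(ordered),
        "p90_ms": percentile(ordered, 90),
        "p99_ms": percentile(ordered, 99),
        "min_ms": ordered[0],
        "max_ms": ordered[-1],
        # Assumes one image per run (batch size 1).
        "throughput_img_per_s": 1000.0 / mean_ms,
    }


def benchmark(run_once, warmup=3, iterations=10):
    """Time a callable: discard warmup runs, then record each timed run in ms.

    On an accelerator, run_once should block until the device has finished
    (for example by synchronizing), otherwise only the launch time is measured.
    """
    for _ in range(warmup):
        run_once()
    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize_latencies(latencies_ms)


if __name__ == "__main__":
    # Stand-in workload; a real run would call the model's forward pass here.
    print(benchmark(lambda: sum(range(100_000)), warmup=3, iterations=10))
```

The `warmup` and `iterations` defaults mirror the `--warmup 3` and `--iterations 10` values used in the benchmark command; warmup runs are excluded because the first calls typically include one-time setup cost.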