| Item | Version / Value |
|---|---|
| Device | Atlas 800I A2 (8x 910B3) |
| Image | quay.io/ascend/vllm-ascend:v0.13.0 |
| Python | 3.11.14 |
| PyTorch | 2.8.0+cpu |
| torch_npu | 2.8.0.post2 |
| transformers | 4.57.6 |
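
To confirm that a container matches the table above, the installed versions can be read from package metadata. This is a minimal sketch using only the standard library. It assumes the distribution names are `torch`, `torch_npu`, and `transformers`; the helper names `installed_versions` and `mismatches` are illustrative and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the environment table above.
EXPECTED = {
    "torch": "2.8.0+cpu",
    "torch_npu": "2.8.0.post2",
    "transformers": "4.57.6",
}


def installed_versions(packages):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


def mismatches(expected, found):
    """Return {name: (expected, found)} for every package that differs."""
    return {
        name: (want, found.get(name))
        for name, want in expected.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    diff = mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, got) in diff.items():
        print(f"{name}: expected {want}, found {got}")
```

Running it inside the container prints one line per package whose version differs from the table, and nothing when everything matches.
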

/data/
├── dinov3-vitb16-pretrain-lvd1689m/          # model weights
│   ├── model.safetensors
│   ├── config.json
│   ├── preprocessor_config.json
│   └── ...
└── dinov3-vitb16-pretrain-lvd1689m-ascend/   # adaptation code
    ├── inference.py                          # inference script
    └── README.md                             # this document

docker rm -f test-dinov3
docker run -itd --privileged --name=test-dinov3 --net=host --shm-size=500g \
--device=/dev/davinci0 \
--device=/dev/davinci1 \
--device=/dev/davinci2 \
--device=/dev/davinci3 \
--device=/dev/davinci4 \
--device=/dev/davinci5 \
--device=/dev/davinci6 \
--device=/dev/davinci7 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
-v /usr/local/sbin/:/usr/local/sbin/ \
-v /var/log/npu/slog/:/var/log/npu/slog \
-v /var/log/npu/profiling/:/var/log/npu/profiling \
-v /var/log/npu/dump/:/var/log/npu/dump \
-v /var/log/npu/:/usr/slog \
-v /etc/hccn.conf:/etc/hccn.conf \
-v /data:/data \
quay.io/ascend/vllm-ascend:v0.13.0 \
bash

docker exec test-dinov3 bash -c "pip3 install safetensors pillow -q -i https://repo.huaweicloud.com/repository/pypi/simple/"

docker exec test-dinov3 bash -c "cd /data/ysws/agentsp/dinov3-vitb16-pretrain-lvd1689m-ascend && python inference.py \
--model_path /data/ysws/agentsp/dinov3-vitb16-pretrain-lvd1689m \
--precision_test \
2>&1 | tee /data/ysws/agentsp/dinov3-vitb16-pretrain-lvd1689m-ascend/log.txt"

docker exec test-dinov3 bash -c "cd /data/ysws/agentsp/dinov3-vitb16-pretrain-lvd1689m-ascend && python inference.py \
--model_path /data/ysws/agentsp/dinov3-vitb16-pretrain-lvd1689m \
--image_path /tmp/test_image.jpg \
--device npu:0 \
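2>&1 | tee /data/ysws/agentsp/dinov3-vitb16-pretrain-lvd1689m-ascend/log.txt"

| Argument | Description | Default |
|---|---|---|
| --model_path | Path to the model weights | Required |
| --image_path | Path to the image to run inference on | Required (not needed for the precision test) |
| --precision_test | Run the precision test | False |
| --device | Device to run on | npu:0 |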
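
The argument table above maps naturally onto `argparse`. The sketch below shows one way the interface could be declared. It is an assumption about how `inference.py` parses its options, not a copy of the script, and `build_parser` and `parse_args` are hypothetical names.

```python
import argparse


def build_parser():
    """Declare the four options listed in the argument table."""
    parser = argparse.ArgumentParser(
        description="DINOv3-ViT-B16 inference on Ascend NPU"
    )
    parser.add_argument("--model_path", required=True,
                        help="path to the model weights")
    parser.add_argument("--image_path", default=None,
                        help="path to the image to run inference on")
    parser.add_argument("--precision_test", action="store_true",
                        help="run the CPU vs NPU precision test")
    parser.add_argument("--device", default="npu:0",
                        help="device to run on")
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # --image_path may be omitted only when --precision_test is given.
    if not args.precision_test and args.image_path is None:
        parser.error("--image_path is required unless --precision_test is set")
    return args
```

Making `--image_path` conditionally required after parsing, rather than with `required=True`, is what allows the precision test command to run without an image.
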
========================================================
Precision Comparison: CPU vs NPU
========================================================
Max errors: sum=9.16e-05, mean=1.19e-07, std=2.98e-08
PASS: NPU precision within thresholds
========================================================
PRECISION TEST PASSED
========================================================

| Metric | Threshold | Measured | Status |
|---|---|---|---|
| max_error_sum | < 1e-3 | 9.16e-05 | ✅ PASS |
| max_error_mean | < 1e-5 | 1.19e-07 | ✅ PASS |
| max_error_std | < 1e-5 | 2.98e-08 | ✅ PASS |
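
The pass/fail decision in the table above reduces to a strict comparison of each metric against its threshold. The following sketch reproduces that check with the thresholds and measured values from the table. How `inference.py` actually computes the three error values is not shown in this README, so `check_precision` is a hypothetical helper that covers only the comparison step.

```python
# Thresholds and measured values copied from the table above.
THRESHOLDS = {"sum": 1e-3, "mean": 1e-5, "std": 1e-5}
MEASURED = {"sum": 9.16e-05, "mean": 1.19e-07, "std": 2.98e-08}


def check_precision(errors, thresholds=THRESHOLDS):
    """Return (passed, failures).

    A metric passes only if it is strictly below its threshold. Writing the
    test as `not value < limit` also flags NaN values as failures.
    """
    failures = {
        name: (errors[name], limit)
        for name, limit in thresholds.items()
        if not errors[name] < limit
    }
    return (not failures, failures)


passed, failures = check_precision(MEASURED)
print("PASS" if passed else f"FAIL: {failures}")
```
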

2026-05-11 07:13:14,968 - INFO - DINOv3-ViT-B16 Ascend NPU inference
2026-05-11 07:13:17,549 - INFO - Model loaded onto device: npu:0
2026-05-11 07:13:17,549 - INFO - Starting warm-up...
2026-05-11 07:13:17,860 - INFO - Warm-up complete
2026-05-11 07:13:17,860 - INFO - Starting inference...
2026-05-11 07:13:17,862 - INFO - Inference time: 0.0178s
2026-05-11 07:13:17,862 - INFO - Pooled output shape: torch.Size([1, 768])
2026-05-11 07:13:17,862 - INFO - Hidden state shape: torch.Size([1, 201, 768])
2026-05-11 07:13:17,862 - INFO - Inference completed successfully!

torch.npu.is_available() returns True, and all 8 NPUs are working normally.
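
The hidden-state shape `[1, 201, 768]` in the log is consistent with a ViT-B/16 that prepends one class token and four register tokens to the patch sequence. The sketch below works through that arithmetic. It assumes a 224x224 input after preprocessing and four register tokens. Neither value is stated in this README, so both should be checked against `preprocessor_config.json` and `config.json`. The function name `sequence_length` is illustrative.

```python
def sequence_length(height, width, patch_size=16, num_register_tokens=4):
    """Tokens per image: 1 class token + register tokens + one token per patch."""
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    num_patches = (height // patch_size) * (width // patch_size)
    return 1 + num_register_tokens + num_patches


print(sequence_length(224, 224))  # 1 + 4 + 14 * 14 = 201
```

Under the same assumptions, the pooled output `[1, 768]` has one 768-dimensional vector per image, matching the hidden size of the ViT-B backbone.
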