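This document records the deployment and validation results for dinov3-vitl16-pretrain-lvd1689m on Ascend NPUs.
DINOv3 ViT-Large (ViT-L/16) is a vision foundation model that outputs image features for downstream tasks.
Related links: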

| Component | Version |
|---|---|
| Image | quay.io/ascend/vllm-ascend:v0.13.0 |
| Python | 3.11.14 |
| PyTorch | 2.8.0+cpu |
| torch_npu | 2.8.0.post2 |
| transformers | 4.57.6 |

Hardware: Atlas 800I A2 (8x 910B3)

Model path: /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m

| Item | Value |
|---|---|
| Architecture | DINOv3ViTModel |
| Parameters | ~300M |
| Hidden size | 1024 |
| Layers | 24 |
| Attention heads | 16 |
| Patch size | 16 |
| Input size | 224x224 |
| Output pooler_output | (1, 1024) |
| Output last_hidden_state | (1, 201, 1024) |
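
The sequence length of 201 in last_hidden_state can be cross-checked against the other rows of the table: a 224x224 input split into 16x16 patches gives 14 x 14 = 196 patch tokens, plus one class token. The sketch below assumes the remaining four tokens are DINOv3 register tokens, which the table does not list. The helper name `expected_sequence_length` is hypothetical.

```python
def expected_sequence_length(image_size: int, patch_size: int,
                             num_register_tokens: int = 4) -> int:
    """Return patch tokens + 1 class token + register tokens for a square input."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # One class token is always added; num_register_tokens=4 is an assumption
    # that is not stated in the configuration table.
    return num_patches + 1 + num_register_tokens


print(expected_sequence_length(224, 16))  # 196 + 1 + 4 = 201
```

Under the same assumption, a different input resolution changes only the patch-token term, so the second dimension of last_hidden_state scales with the square of the input side length.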

Start the container:

    docker rm -f test-dinov3
    docker run -itd --privileged --name=test-dinov3 --net=host --shm-size=500g \
    --device=/dev/davinci0 \
    --device=/dev/davinci1 \
    --device=/dev/davinci2 \
    --device=/dev/davinci3 \
    --device=/dev/davinci4 \
    --device=/dev/davinci5 \
    --device=/dev/davinci6 \
    --device=/dev/davinci7 \
    --device=/dev/davinci_manager \
    --device=/dev/devmm_svm \
    --device=/dev/hisi_hdc \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
    -v /usr/local/sbin/:/usr/local/sbin/ \
    -v /var/log/npu/slog/:/var/log/npu/slog \
    -v /var/log/npu/profiling/:/var/log/npu/profiling \
    -v /var/log/npu/dump/:/var/log/npu/dump \
    -v /var/log/npu/:/usr/slog \
    -v /etc/hccn.conf:/etc/hccn.conf \
    -v /data:/data \
    quay.io/ascend/vllm-ascend:v0.13.0 \
    bash

Install the dependencies:

    docker exec test-dinov3 bash -c "pip3 install safetensors pillow -q -i https://repo.huaweicloud.com/repository/pypi/simple/"

Run inference:

    docker exec test-dinov3 bash -c "source /usr/local/Ascend/ascend-toolkit/set_env.sh && \
    cd /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m/ascend_adapt && \
    python3 inference.py \
    --model_path /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m \
    --image_path /tmp/test_image.jpg \
    --device npu:0 \
    2>&1 | tee log.txt"

| Parameter | Description | Default |
|---|---|---|
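| --model_path | Path to the model weights | required |
| --image_path | Path to the image to run inference on | required |
| --warm_image_path | Path to the warm-up image | same as --image_path |
| --device | Device to run on | npu:0 |
| --fp16 | Run inference in FP16 | True |
| --fp32 | Run inference in FP32 | False |
| --no_warmup | Skip the warm-up stage | False |
| --precision_test | Run the precision test | False |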
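
The argument table can be read as a small `argparse` interface. The following is a minimal sketch of such a parser, not the actual contents of inference.py: the function names `build_parser` and `resolve_args`, the `dtype` attribute, and the rule that `--fp32` overrides the FP16 default are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the argument table (illustrative only)."""
    parser = argparse.ArgumentParser(description="DINOv3 inference CLI sketch")
    parser.add_argument("--model_path", required=True, help="Path to the model weights")
    parser.add_argument("--image_path", required=True, help="Image to run inference on")
    parser.add_argument("--warm_image_path", default=None,
                        help="Warm-up image; falls back to --image_path")
    parser.add_argument("--device", default="npu:0", help="Device to run on")
    parser.add_argument("--fp16", action="store_true", default=True,
                        help="Run in FP16 (default)")
    parser.add_argument("--fp32", action="store_true", help="Run in FP32")
    parser.add_argument("--no_warmup", action="store_true", help="Skip the warm-up stage")
    parser.add_argument("--precision_test", action="store_true",
                        help="Run the precision test")
    return parser


def resolve_args(argv):
    """Parse argv and apply the defaults described in the table."""
    args = build_parser().parse_args(argv)
    if args.warm_image_path is None:
        args.warm_image_path = args.image_path
    # Assumption: --fp32 takes precedence over the FP16 default.
    args.dtype = "float32" if args.fp32 else "float16"
    return args


args = resolve_args(["--model_path", "/path/to/model",
                     "--image_path", "/tmp/test_image.jpg"])
print(args.device, args.dtype, args.warm_image_path)
```

With only the two required arguments, this sketch resolves to device npu:0, FP16, and a warm-up image equal to the inference image, matching the defaults in the table.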

Run the precision test:

    docker exec test-modelagent bash -c "source /usr/local/Ascend/ascend-toolkit/set_env.sh && \
    cd /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m-ascend && \
    python3 inference.py \
    --model_path /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m \
    --image_path /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m-ascend/推理截图.png \
    --precision_test"

| Metric | Measured | Threshold | Status |
|---|---|---|---|
| Max Error (sum) | 0.00e+00 | < 1e-3 | PASS |
| Max Error (mean) | 0.00e+00 | < 1e-5 | PASS |
| Max Error (std) | 0.00e+00 | < 1e-5 | PASS |
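
The precision table reports the maximum error of three reductions (sum, mean, std) against fixed thresholds. How inference.py computes these values is not shown here, so the following is only a sketch of one plausible reading: reduce each reference tensor and each device tensor to the three statistics, take the largest absolute difference per statistic, and compare it with the threshold. Plain Python lists stand in for tensors, the sample standard deviation is an arbitrary choice, and the names `summarize` and `precision_report` are hypothetical.

```python
import math
import statistics

# Thresholds copied from the precision table.
THRESHOLDS = {"sum": 1e-3, "mean": 1e-5, "std": 1e-5}


def summarize(values):
    """Reduce a flat list of floats to the three compared statistics."""
    return {
        "sum": math.fsum(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample std; an assumption
    }


def precision_report(reference_tensors, device_tensors):
    """Return {metric: (max_abs_error, 'PASS' or 'FAIL')} over all tensor pairs."""
    max_err = {name: 0.0 for name in THRESHOLDS}
    for ref, dev in zip(reference_tensors, device_tensors):
        ref_stats, dev_stats = summarize(ref), summarize(dev)
        for name in THRESHOLDS:
            err = abs(ref_stats[name] - dev_stats[name])
            max_err[name] = max(max_err[name], err)
    return {
        name: (err, "PASS" if err < THRESHOLDS[name] else "FAIL")
        for name, err in max_err.items()
    }


# 20 stand-in "tensors", mirroring the tensor count used in the test.
tensors = [[0.1 * i, 0.2 * i, 0.3 * i] for i in range(20)]
for metric, (err, status) in precision_report(tensors, tensors).items():
    print(f"Max Error ({metric}): {err:.2e} {status}")
```

Identical inputs produce a zero error for every metric, which is the pattern seen in the table. A perturbation of 0.01 in a single element already exceeds the 1e-3 threshold on the sum.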

| Operation | Time |
|---|---|
| CPU reference computation (20 tensors) | 0.1837 s |
| NPU inference (20 tensors) | 0.2294 s |

Inference log (messages translated from Chinese):

    2026-05-09 07:15:18,244 - INFO - ============================================================
    2026-05-09 07:15:18,249 - INFO - DINOv3-ViT-L16 Ascend NPU inference
    2026-05-09 07:15:18,249 - INFO - ============================================================
    2026-05-09 07:15:18,249 - INFO - Model path: /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m
    2026-05-09 07:15:18,249 - INFO - Image path: /tmp/test_image.jpg
    2026-05-09 07:15:18,249 - INFO - Device: npu:0
    2026-05-09 07:15:18,249 - INFO - Precision: FP16
    2026-05-09 07:15:18,249 - INFO - Loading model: /data/ysws/agentsp/dinov3-vitl16-pretrain-lvd1689m
    2026-05-09 07:15:21,087 - INFO - Model loaded on device: npu:0
    2026-05-09 07:15:21,087 - INFO - Model dtype: torch.float16
    2026-05-09 07:15:21,087 - INFO - ----------------------------------------
    2026-05-09 07:15:21,087 - INFO - Starting warm-up...
    2026-05-09 07:15:26,263 - INFO - Warm-up complete
    2026-05-09 07:15:26,264 - INFO - ----------------------------------------
    2026-05-09 07:15:26,264 - INFO - Starting inference...
    2026-05-09 07:15:28,350 - INFO - ----------------------------------------
    2026-05-09 07:15:28,351 - INFO - Inference time: 0.0346 s
    2026-05-09 07:15:28,351 - INFO - Pooler output shape: torch.Size([1, 1024])
    2026-05-09 07:15:28,351 - INFO - Hidden state shape: torch.Size([1, 201, 1024])
    2026-05-09 07:15:28,352 - INFO - ============================================================
    2026-05-09 07:15:28,352 - INFO - Inference completed successfully!
    2026-05-09 07:15:28,352 - INFO - ============================================================

| Metric | Value |
|---|---|
| Inference time (FP16) | 0.0346 s/image |
| Warm-up time (FP16) | 5.18 s |
| Memory usage | ~60 GB (FP16) |
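
The warm-up figure in the table can be reproduced from the log timestamps 07:15:21,087 (warm-up start) and 07:15:26,263 (warm-up end), and the per-image latency can be converted into a rough throughput. The sketch below does both. It assumes one image per forward pass with no batching or pipelining, so the throughput is an estimate rather than a measured value, and the helper name `seconds_between` is hypothetical.

```python
from datetime import datetime

# Timestamp layout used by the log lines (comma before the milliseconds).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Warm-up start and end timestamps taken from the log.
warmup_seconds = seconds_between("2026-05-09 07:15:21,087",
                                 "2026-05-09 07:15:26,263")

# Reported FP16 latency per image; throughput assumes sequential single-image calls.
images_per_second = 1.0 / 0.0346

print(f"warm-up: {warmup_seconds:.2f} s")
print(f"throughput: {images_per_second:.1f} images/s")
```

This yields a warm-up time of 5.18 s, consistent with the table, and an estimated throughput of about 28.9 images per second on a single NPU.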