| Item | Version / Details |
|---|---|
| Device | Atlas 800I A2 (8x 910B3) |
| Image | quay.io/ascend/vllm-ascend:v0.13.0 |
| Python | 3.11.14 |
| PyTorch | 2.8.0+cpu |
| torch_npu | 2.8.0.post2 |
| transformers | 4.57.6 |
| einops | 0.8.2 |
| safetensors | 0.7.0 |
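
To confirm that a container matches the table above, the installed package versions can be compared against it. The following is a minimal sketch, not part of the original scripts: `EXPECTED`, `installed_version`, and `find_mismatches` are hypothetical names, and it assumes the distribution names match the names used in the table.

```python
from importlib import metadata

# Versions copied from the environment table above.
# Assumption: the distribution names match the table entries.
EXPECTED = {
    "torch": "2.8.0+cpu",
    "torch_npu": "2.8.0.post2",
    "transformers": "4.57.6",
    "einops": "0.8.2",
    "safetensors": "0.7.0",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to an (expected, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

An empty result means every listed package is installed at the documented version.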

/data/
├── nomic-embed-vision-v1.5/              # original model weights
│   ├── model.safetensors
│   ├── config.json
│   ├── preprocessor_config.json
│   ├── README.md
│   ├── nomic_ai/                         # adaptation code
│   │   ├── modeling_nomic_vision.py      # NomicVisionModel implementation
│   │   └── __init__.py
│   └── nomic_ai/                         # original author's model code
│       ├── modeling_hf_nomic_bert.py
│       ├── configuration_hf_nomic_bert.py
│       └── __init__.py
└── nomic-embed-vision-v1.5-ascend/       # adaptation code
    ├── inference.py                      # inference script
    ├── README.md                         # this document
    └── log.txt                           # run log

docker rm -f test-modelagent
docker run -itd --privileged --name=test-modelagent --net=host --shm-size=500g \
--device=/dev/davinci0 \
--device=/dev/davinci1 \
--device=/dev/davinci2 \
--device=/dev/davinci3 \
--device=/dev/davinci4 \
--device=/dev/davinci5 \
--device=/dev/davinci6 \
--device=/dev/davinci7 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons/ \
-v /usr/local/sbin/:/usr/local/sbin/ \
-v /var/log/npu/slog/:/var/log/npu/slog \
-v /var/log/npu/profiling/:/var/log/npu/profiling \
-v /var/log/npu/dump/:/var/log/npu/dump \
-v /var/log/npu/:/usr/slog \
-v /etc/hccn.conf:/etc/hccn.conf \
-v /data:/data \
quay.io/ascend/vllm-ascend:v0.13.0 \
bash

The container ships with the required dependencies preinstalled; no additional installation is needed.

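
The `docker run` command above passes one `--device` flag per NPU plus three management devices. When only a subset of the cards should be exposed to the container, the flag list can be generated rather than edited by hand. This is a hypothetical helper (`device_flags` is not part of the original scripts); the device paths are the ones used in the command above.

```python
# Management devices taken from the docker run command above.
MANAGEMENT_DEVICES = ["/dev/davinci_manager", "/dev/devmm_svm", "/dev/hisi_hdc"]


def device_flags(npu_ids):
    """Build the --device flags for the given NPU indices plus the management devices."""
    flags = [f"--device=/dev/davinci{i}" for i in npu_ids]
    flags += [f"--device={path}" for path in MANAGEMENT_DEVICES]
    return flags


if __name__ == "__main__":
    # Reproduces the eight-card layout used in this document.
    print(" \\\n".join(device_flags(range(8))))
```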
docker exec test-modelagent bash -c "source /usr/local/Ascend/ascend-toolkit/set_env.sh && cd /data/ysws/agentsp/nomic-embed-vision-v1.5-ascend && python3 inference.py --image_path /tmp/test_image.jpg 2>&1 | tee log.txt"

| Parameter | Description | Default |
|---|---|---|
| --model_path | Path to the model weights | /data/ysws/agentsp/nomic-embed-vision-v1.5 |
| --image_path | Path to the image to run inference on | required |
| --warm_image_path | Path to the warmup image | same as --image_path |
| --device | Device to run on | npu:0 |
| --no_warmup | Skip the warmup phase | False |
| --fp16 | Run inference in FP16 | True |
| --fp32 | Run inference in FP32 | False |
| --precision_test | Run the precision test | False |

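
The parameter table can be read as an `argparse` definition. The sketch below is a reconstruction from the table only, not the actual contents of `inference.py`; the helper names `build_parser` and `parse_args` and the derived `precision` attribute are hypothetical, and it assumes that `--fp32` overrides the FP16 default.

```python
import argparse

DEFAULT_MODEL_PATH = "/data/ysws/agentsp/nomic-embed-vision-v1.5"


def build_parser():
    """Declare the flags listed in the parameter table with their documented defaults."""
    parser = argparse.ArgumentParser(
        description="Nomic-Embed-Vision-V1.5 Ascend NPU inference"
    )
    parser.add_argument("--model_path", default=DEFAULT_MODEL_PATH)
    parser.add_argument("--image_path", required=True)
    parser.add_argument("--warm_image_path", default=None)
    parser.add_argument("--device", default="npu:0")
    parser.add_argument("--no_warmup", action="store_true")
    parser.add_argument("--fp16", action="store_true", default=True)
    parser.add_argument("--fp32", action="store_true")
    parser.add_argument("--precision_test", action="store_true")
    return parser


def parse_args(argv=None):
    """Parse arguments and resolve the defaults that depend on other flags."""
    args = build_parser().parse_args(argv)
    # The warmup image falls back to the inference image.
    if args.warm_image_path is None:
        args.warm_image_path = args.image_path
    # Assumption: --fp32 takes precedence over the FP16 default.
    args.precision = "fp32" if args.fp32 else "fp16"
    return args
```

With only `--image_path` supplied, this yields `npu:0`, FP16, and a warmup image equal to the inference image, which matches the defaults in the table.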
docker exec test-modelagent bash -c "source /usr/local/Ascend/ascend-toolkit/set_env.sh && \
cd /data/ysws/agentsp/nomic-embed-vision-v1.5-ascend && \
python3 inference.py --model_path /data/ysws/agentsp/nomic-embed-vision-v1.5 --image_path /data/ysws/agentsp/nomic-embed-vision-v1.5-ascend/推理截图.png --precision_test"

| Metric | Measured | Threshold | Status |
|---|---|---|---|
| Max Error (sum) | 1.22e-04 | < 1e-3 | PASS |
| Max Error (mean) | 1.19e-07 | < 1e-5 | PASS |
| Max Error (std) | 2.98e-08 | < 1e-5 | PASS |

| Operation | Time |
|---|---|
| CPU reference computation (20 tensors) | 0.0486s |
| NPU inference (20 tensors) | 0.2445s |

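
One plausible reading of the precision table is that, for each of the 20 tensors, a summary statistic is computed on the CPU reference and on the NPU result, and the largest absolute difference per statistic is compared with its threshold. The sketch below illustrates that reading with plain Python lists standing in for tensors. It is an assumption about the method rather than the code in `inference.py`; the names `max_errors` and `check` are hypothetical, and the population standard deviation is used, which may differ from the original.

```python
import math
import statistics

# Thresholds copied from the precision table above.
THRESHOLDS = {"sum": 1e-3, "mean": 1e-5, "std": 1e-5}

# Summary statistics compared between the reference and candidate outputs.
STATS = {
    "sum": math.fsum,
    "mean": statistics.fmean,
    "std": statistics.pstdev,  # assumption: population standard deviation
}


def max_errors(reference, candidate):
    """Largest absolute difference of each statistic across paired tensors."""
    errors = {name: 0.0 for name in STATS}
    for ref, cand in zip(reference, candidate):
        for name, stat in STATS.items():
            errors[name] = max(errors[name], abs(stat(ref) - stat(cand)))
    return errors


def check(reference, candidate, thresholds=THRESHOLDS):
    """Return {statistic: (max_error, passed)} against the given thresholds."""
    errors = max_errors(reference, candidate)
    return {name: (err, err < thresholds[name]) for name, err in errors.items()}
```

Here `reference` and `candidate` are equal-length lists of flattened tensors, for example the CPU and NPU outputs converted with `.flatten().tolist()`.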
2026-05-11 02:25:05,949 - INFO - ============================================================
2026-05-11 02:25:05,949 - INFO - Nomic-Embed-Vision-V1.5 Ascend NPU Inference
2026-05-11 02:25:05,949 - INFO - ============================================================
2026-05-11 02:25:05,949 - INFO - Model path: /data/ysws/agentsp/nomic-embed-vision-v1.5
2026-05-11 02:25:05,949 - INFO - Image path: /tmp/test_image.jpg
2026-05-11 02:25:05,949 - INFO - Device: npu:0
2026-05-11 02:25:05,949 - INFO - Precision: FP16
2026-05-11 02:25:05,950 - INFO - Loading model from: /data/ysws/agentsp/nomic-embed-vision-v1.5
2026-05-11 02:25:08,931 - INFO - Model loaded on device: npu:0
2026-05-11 02:25:08,972 - INFO - ----------------------------------------
2026-05-11 02:25:08,972 - INFO - Warming up...
2026-05-11 02:25:09,348 - INFO - Warmup done
2026-05-11 02:25:09,349 - INFO - ----------------------------------------
2026-05-11 02:25:09,349 - INFO - Running inference...
2026-05-11 02:25:09,600 - INFO - ----------------------------------------
2026-05-11 02:25:09,600 - INFO - Inference time: 0.0094s
2026-05-11 02:25:09,600 - INFO - Pooler output shape: torch.Size([1, 768])
2026-05-11 02:25:09,600 - INFO - Last hidden state shape: torch.Size([1, 197, 768])
2026-05-11 02:25:09,602 - INFO - Pooler output (first 5): [-0.00960932 -0.04391192 -0.01039853 0.06418511 -0.02451999]
2026-05-11 02:25:09,602 - INFO - ============================================================
2026-05-11 02:25:09,602 - INFO - Inference completed successfully!
2026-05-11 02:25:09,602 - INFO - ============================================================

| Configuration | Single-image inference time |
|---|---|
| NPU after warmup | ~9.4ms |
| NPU first inference | ~250ms |
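
The gap between the first inference and the warmed-up inference is why a warmup pass is run before timing. A minimal sketch of that measurement pattern follows; `time_call` is a hypothetical helper, not the code in `inference.py`. The optional `sync` argument is an assumption: on an asynchronous device a synchronization call (for example `torch.npu.synchronize`) would normally be passed so the clock is read only after the work has finished.

```python
import time


def time_call(fn, warmup_runs=1, timed_runs=1, sync=None):
    """Run fn untimed for warmup, then return the duration of each timed run in seconds."""
    for _ in range(warmup_runs):
        fn()
    if sync is not None:
        sync()  # make sure warmup work has completed before timing starts

    durations = []
    for _ in range(timed_runs):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for the device before reading the clock
        durations.append(time.perf_counter() - start)
    return durations
```

Calling it with `warmup_runs=0` would measure the cold first inference, and with `warmup_runs=1` the warmed-up figure.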

Nomic-Embed-Vision-V1.5 is a high-performance vision embedding model that shares the same embedding space as nomic-embed-text-v1.5.

- pooler_output: the L2-normalized CLS token embedding, shape (1, 768)
- last_hidden_state: the hidden states of all tokens, shape (1, 197, 768)
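
The relationship between the two outputs can be sketched as follows: take the CLS token from `last_hidden_state` and scale it to unit length. The example uses plain Python lists instead of tensors. It assumes the CLS token is at index 0, which would be consistent with 197 tokens being one CLS token plus 196 image patches; the function names are hypothetical.

```python
import math


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(math.fsum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]


def pooled_embedding(last_hidden_state):
    """Return the normalized CLS embedding for one image.

    last_hidden_state is a list of token vectors; assumption: CLS is token 0.
    """
    return l2_normalize(last_hidden_state[0])


def cosine_similarity(a, b):
    """Cosine similarity, i.e. the dot product of the L2-normalized vectors."""
    return math.fsum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))
```

Because the embeddings are unit length, cosine similarity reduces to a dot product, which is how an image embedding would typically be compared with a nomic-embed-text-v1.5 text embedding in the shared space.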