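ViTDet is an object detection approach built on the Vision Transformer. The configuration adapted here is:
configs/npu_real/vitdet_mask-rcnn_vit-b-mae_npu_real_overfit.py
Typical scenario:
#2.1 Download the image
docker pull quay.io/ascend/cann:8.5.0
Use a Docker image that supports Ascend NPUs. Start the container with: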
docker run -it -u root -d --net=host \
--privileged \
--ipc=host \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/sbin:/usr/local/sbin \
-v /usr/local/Ascend/driver/tools/hccn_tool:/usr/local/Ascend/driver/tools/hccn_tool \
--name transfer_npu \
quay.io/ascend/cann:8.5.0 \
/bin/bash

| Component | Version |
|---|---|
| Python | 3.11.13 |
| torch | 2.8.0+cpu |
| torch_npu | 2.8.0 |
| torchvision | 0.23.0 |
| CANN | 8.5.0 |
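
Once the installation steps are complete, the installed packages can be compared against the table above. The following is a minimal sketch and not part of the ViTDet project: the helper names `installed_versions` and `find_mismatches` are hypothetical, local build suffixes such as `+cpu` are ignored when comparing, and the distribution names passed to `importlib.metadata` are assumed to match the names in the table.

```python
from importlib import metadata

# Expected versions, copied from the table above.
# Python and CANN are not pip packages, so they are left out.
EXPECTED = {"torch": "2.8.0+cpu", "torch_npu": "2.8.0", "torchvision": "0.23.0"}


def base_version(version):
    """Drop a local build suffix such as '+cpu'."""
    return version.split("+", 1)[0]


def installed_versions(names):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, installed):
    """Return {name: (expected, installed)} for every package that differs."""
    mismatches = {}
    for name, want in expected.items():
        have = installed.get(name)
        if have is None or base_version(have) != base_version(want):
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    report = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for name, (want, have) in report.items():
        print(f"{name}: expected {want}, found {have}")
```

An empty report means the three packages agree with the table. Any line printed points at a package worth reinstalling before building mmcv.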
cd ~
git clone https://github.com/open-mmlab/mmdetection.git
Run the following first:
cd ~/mmdetection
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
pip install 'torch_npu==2.11.0rc1' -i https://mirrors.aliyun.com/pypi/simple
pip install PyYAML -i https://mirrors.aliyun.com/pypi/simple
pip install -U openmim -i https://mirrors.aliyun.com/pypi/simple
pip install 'setuptools==60.2.0' -i https://mirrors.aliyun.com/pypi/simple
pip install 'torch_npu==2.8.0' -i https://mirrors.aliyun.com/pypi/simple
pip install 'torchvision==0.23.0' -i https://mirrors.aliyun.com/pypi/simple
pip install -U psutil decorator -i https://mirrors.aliyun.com/pypi/simple
pip install onnx -i https://mirrors.aliyun.com/pypi/simple
apt-get update
apt-get install -y libxcb1 libxrender1 libxext6 libglib2.0-0
apt-get install -y libgl1-mesa-glx libglib2.0-0
mmcv must be modified to support the NPU, so build and install it from source with the following scripts:
unzip vit.zip -d ~/mmdetection/
cd ~/mmdetection/projects/ViTDet
MMCV_WITH_OPS=1 FORCE_NPU=1 bash -x tools/build_mmcv210_npu_wheel.sh
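VERIFY_NPU=1 bash tools/install_mmcv210_npu_wheel.sh
Because mmcv's NPU support is still incomplete, mmcv patches are also required. The related files implement the corresponding computations on the NPU.
The figure below indicates a successful installation.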
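
Independently of the install script's own verification, a quick import check can confirm that the compiled extension loads. This is a minimal sketch under stated assumptions: `try_import` is a hypothetical helper, and `torch_npu` is imported first on the assumption that the extension needs it loaded beforehand. The module names `mmcv._ext` and `mmcv.ops` are the ones this guide refers to.

```python
import importlib


def try_import(module_name):
    """Return (True, '') if the module imports, else (False, error summary)."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or OSError from a failed library load
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""


if __name__ == "__main__":
    # Order matters only under the assumption stated above.
    for name in ("torch_npu", "mmcv._ext", "mmcv.ops"):
        ok, error = try_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED ' + error}")
```

A failure on `mmcv._ext` while `torch_npu` imports cleanly would suggest that the wheel was built against a different `torch_npu` than the one installed.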

First, apply the patch file:
cd ~/mmdetection
git apply patch/0001_vitdet_ascend_tracked_changes.patch
Then verify that the NPU-related changes appear in the training script.

mkdir -p dataset/coco
cd dataset/coco
# Download the training set, validation set, and annotation files
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
# Unpack
unzip train2017.zip
unzip val2017.zip
unzip annotations_trainval2017.zip
Place the dataset in the following directory layout so that later training can find it:
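
The three archives unpack into `train2017/`, `val2017/`, and `annotations/`. A minimal sketch of a layout check follows; `missing_coco_entries` is a hypothetical helper, and the root `dataset/coco` is assumed to be relative to the directory where the `mkdir` above was run. Only the instance annotation files are checked, on the assumption that Mask R-CNN training needs those.

```python
from pathlib import Path

# Entries produced by unpacking the three archives downloaded above.
REQUIRED = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root):
    """Return the required entries that do not exist under root."""
    base = Path(root).expanduser()
    return [entry for entry in REQUIRED if not (base / entry).exists()]


if __name__ == "__main__":
    missing = missing_coco_entries("dataset/coco")
    print("layout ok" if not missing else f"missing: {missing}")
```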

wget https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth
Single-card training
cd ~/mmdetection/projects/ViTDet
export ASCEND_RT_VISIBLE_DEVICES=1
bash tools/run_vitdet_real_npu_100e.sh
ASCEND_RT_VISIBLE_DEVICES selects which NPU card is used for training.
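
To make the effect of the variable concrete, the sketch below parses its value and lists the resulting cards. It rests on an assumption that should be confirmed against the CANN documentation: that the runtime renumbers the visible cards from 0 inside the process, in the same way `CUDA_VISIBLE_DEVICES` does. Both helper names are hypothetical.

```python
import os


def parse_visible_devices(value):
    """Turn '6,7' into [6, 7]; empty entries are ignored."""
    return [int(token) for token in value.split(",") if token.strip()]


def logical_to_physical(value):
    """Map in-process device index to physical card ID (assumed renumbering from 0)."""
    return dict(enumerate(parse_visible_devices(value)))


if __name__ == "__main__":
    visible = os.environ.get("ASCEND_RT_VISIBLE_DEVICES", "")
    cards = parse_visible_devices(visible)
    print(f"{len(cards)} visible card(s): {logical_to_physical(visible)}")
```

For a distributed launch, the process count passed to the launcher would be expected to equal the number of cards listed in the variable.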
Multi-card training
cd ~/mmdetection
export ASCEND_RT_VISIBLE_DEVICES=6,7
bash tools/dist_train_npu.sh projects/ViTDet/configs/vitdet_mask-rcnn_vit-b-mae_npu_real_overfit.py 2
Training curve:

Exporting the whole model to a single OM file with atc fails outright, so the export is split into several OM files.
cd ~/mmdetection/projects/ViTDet
CHECKPOINT=~/mmdetection/work_dirs/vitdet_mask-rcnn_vit-b-mae_npu_real_overfit/epoch_2.pth RUN_INFER=0 bash tools/run_vitdet_staged_export_and_infer.sh
The figure below indicates a successful OM export.
The exported OM files can be found under the artifacts directory.
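
A short script can confirm that each stage produced an OM file. This is a minimal sketch: `find_om_files` and `missing_stages` are hypothetical helpers, and the stage names `backbone`, `rpn`, and `roi` are taken from the export paths used later in this guide, assuming each OM file sits in a directory named after its stage.

```python
from pathlib import Path

# Stage names taken from the export paths used in this guide.
STAGES = ("backbone", "rpn", "roi")


def find_om_files(root):
    """Return every *.om file under root, sorted by path."""
    base = Path(root).expanduser()
    if not base.is_dir():
        return []
    return sorted(p for p in base.rglob("*.om") if p.is_file())


def missing_stages(root, stages=STAGES):
    """Return the stages that have no *.om file in a directory of the same name."""
    present = {p.parent.name for p in find_om_files(root)}
    return [stage for stage in stages if stage not in present]


if __name__ == "__main__":
    root = "artifacts"  # run from ~/mmdetection/projects/ViTDet
    for path in find_om_files(root):
        print(path)
    print("missing stages:", missing_stages(root))
```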

cd ~/mmdetection/projects/ViTDet
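CHECKPOINT=~/mmdetection/work_dirs/vitdet_mask-rcnn_vit-b-mae_npu_real_overfit/epoch_2.pth RUN_EXPORT=0 RUN_ATC=0 bash tools/run_vitdet_staged_export_and_infer.sh
Because the OM model is split into three files, a comparison between OM inference and PyTorch inference was added to verify that the split does not reduce accuracy.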
python3 tools/infer_vitdet_om_staged.py \
--config /root/mmdetection/projects/ViTDet/configs/vitdet_mask-rcnn_vit-b-mae_npu_real_overfit.py \
--checkpoint /root/mmdetection/work_dirs/vitdet_mask-rcnn_vit-b-mae_npu_real_overfit/epoch_2.pth \
--img /root/mmdetection/tools/data/coco/val2017/000000037777.jpg \
--backbone-meta /root/mmdetection/projects/ViTDet/artifacts/staged_910B3_pipeline/export/backbone/export_meta.json \
--rpn-meta /root/mmdetection/projects/ViTDet/artifacts/staged_910B3_pipeline/export/rpn/export_meta.json \
--roi-meta /root/mmdetection/projects/ViTDet/artifacts/staged_910B3_pipeline/export/roi/export_meta.json \
--backbone-om /root/mmdetection/projects/ViTDet/artifacts/staged_910B3_pipeline/atc/backbone/vitdet_backbone_export_linux_aarch64.om \
--rpn-om /root/mmdetection/projects/ViTDet/artifacts/staged_910B3_pipeline/atc/rpn/vitdet_rpn_export.om \
--roi-om /root/mmdetection/projects/ViTDet/artifacts/staged_910B3_pipeline/atc/roi/vitdet_roi_export_linux_aarch64.om \
--device-id 0 \
--score-thr 0.3 \
--compare-with-pytorch \
--out-dir artifacts/staged_910B3_pipeline/infer
The accuracy comparison is broadly in line with expectations.
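
The counts in the comparison report come from two filters applied in order: `keep_top1_per_label` deduplication, then the score threshold. The sketch below illustrates that bookkeeping and is not the code of `infer_vitdet_om_staged.py`. The `Detection` type and the helper functions are hypothetical, and the threshold is assumed to be inclusive.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    label: int
    score: float


def keep_top1_per_label(detections):
    """Keep only the highest-scoring detection for each label."""
    best = {}
    for det in detections:
        if det.label not in best or det.score > best[det.label].score:
            best[det.label] = det
    return sorted(best.values(), key=lambda det: det.score, reverse=True)


def summarize(detections, score_thr):
    """Deduplicate first, then apply the score threshold (assumed inclusive)."""
    deduped = keep_top1_per_label(detections)
    kept = [det for det in deduped if det.score >= score_thr]
    return {
        "raw_count": len(detections),
        "after_dedup": len(deduped),
        "after_score_thr": len(kept),
        "dropped_by_dedup": len(detections) - len(deduped),
        "dropped_by_score_thr": len(deduped) - len(kept),
    }


def delta(om_summary, pytorch_summary):
    """OM minus PyTorch for the three headline counts."""
    keys = ("raw_count", "after_dedup", "after_score_thr")
    return {key: om_summary[key] - pytorch_summary[key] for key in keys}


if __name__ == "__main__":
    # Toy inputs for illustration only, not measured results.
    om = summarize([Detection(62, 0.34), Detection(62, 0.20), Detection(1, 0.10)], 0.3)
    pt = summarize([Detection(62, 0.35), Detection(1, 0.12)], 0.3)
    print(om)
    print(delta(om, pt))
```

In the report that follows, the same relationships hold: `raw_count` minus `after_dedup` gives `dropped_by_dedup`, `after_dedup` minus `after_score_thr` gives `dropped_by_score_thr`, and each `delta` entry is the OM count minus the PyTorch count.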
"compare_with_pytorch": {
"enabled": true,
"pytorch_device": "cpu",
"dedup_policy": "keep_top1_per_label",
"score_thr": 0.3,
"om": {
"raw_count": 49,
"after_dedup": 19,
"after_score_thr": 1,
"dropped_by_dedup": 30,
"dropped_by_score_thr": 18,
"score_stats": {
"max": 0.3389612138271332,
"mean": 0.3389612138271332
}
},
"pytorch": {
"raw_count": 50,
"after_dedup": 16,
"after_score_thr": 1,
"dropped_by_dedup": 34,
"dropped_by_score_thr": 15,
"score_stats": {
"max": 0.34656473994255066,
"mean": 0.34656473994255066
},
"final_detections_topk": [
{
"bbox": [
301.6426086425781,
75.15811920166016,
348.95098876953125,
225.3976287841797
],
"score": 0.34656473994255066,
"label": 62
}
]
},
"delta": {
"raw_count": -1,
"after_dedup": 3,
"after_score_thr": 0
}
}
Inference stage timings (ms):
Per-stage (ms): 157.746 | 1.162 | 85.468 | 59.338
Total: 303.714 ms
Throughput: 3.29 img/s
Inference visualization:

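mmcv 2.2.0 is incompatible with the current MMDetection
[...] mmcv < 2.2.0.
[...] mmcv 2.1.0.
mmcv._ext import failure (NPU ABI mismatch)
mmcv._ext / mmcv.ops fail to import at runtime.
Caused by a torch_npu ABI change.
[...] tools/mmcv210_npu_patches/*).
mmcv._ext imports successfully.
The minimal nms / RoIAlign operator smoke test passes.
[...] set_env.sh > npu-smi > user > default.
[...] Ascend*.
[...] num_detections=0, so the result is unusable.
[...] staged_9391_pipeline_fix version.
[...] 75, deduplicated to 6 under the business rule (only the highest-scoring box per label is kept); the visualization looks normal.
Changes to open-source code (existing files modified):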
mmdet/utils/dist_utils.py
tools/train.py
projects/ViTDet/vitdet/vit.py
New scripts and configuration (reusable assets):
projects/ViTDet/tools/build_mmcv210_npu_wheel.sh
projects/ViTDet/tools/install_mmcv210_npu_wheel.sh
projects/ViTDet/tools/run_vitdet_real_npu_100e.sh
projects/ViTDet/tools/run_vitdet_9391_staged_export_and_infer.sh
projects/ViTDet/tools/infer_vitdet_om_staged.py
projects/ViTDet/tools/export_vitdet_split_artifacts.py
projects/ViTDet/configs/npu_real/vitdet_mask-rcnn_vit-b-mae_npu_real_overfit.py