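Mono3D is used here as an umbrella term for monocular image-based 3D detection methods. Its mainstream representative is SMOKE (Single-Stage Monocular 3D Object Detection via Keypoint Estimation), an efficient, simple, and practical monocular 3D object detection model for autonomous driving, proposed in 2020 by researchers from ZongMu Tech and Eindhoven University of Technology (paper link). It is designed for front-view monocular camera scenarios and aims to predict the 3D bounding boxes (position, size, orientation) of vehicles and other objects accurately and in real time from a single RGB image.
This document describes how to port SMOKE, the representative Mono3D model, to the Ascend platform, and gives guidance on performance optimization.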
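
To make the "position, size, orientation" parameterization concrete, the following is a minimal sketch of how those three quantities determine the eight corners of a 3D box. It assumes the KITTI label conventions (location is the center of the bottom face, dimensions are height, width, length, and the yaw angle rotates about the camera y axis); the function name `box3d_corners` is illustrative and is not taken from the SMOKE code base.

```python
import math


def box3d_corners(location, dimensions, rotation_y):
    """Return the 8 corners (x, y, z) of a 3D box in camera coordinates.

    Assumed KITTI conventions (not taken from the SMOKE source):
    x points right, y points down, z points forward; `location` is the
    center of the bottom face; `dimensions` is (height, width, length);
    `rotation_y` is the yaw angle around the camera y axis in radians.
    """
    x, y, z = location
    h, w, l = dimensions
    c, s = math.cos(rotation_y), math.sin(rotation_y)
    footprint = ((l / 2, w / 2), (l / 2, -w / 2), (-l / 2, -w / 2), (-l / 2, w / 2))
    corners = []
    for dy in (0.0, -h):  # bottom face first, then the top face (y points down)
        for dx, dz in footprint:
            # Rotate the offset about the y axis, then translate to the box location.
            corners.append((x + c * dx + s * dz, y + dy, z - s * dx + c * dz))
    return corners


if __name__ == "__main__":
    # A car-sized box 10 m in front of the camera, facing along the x axis.
    for corner in box3d_corners((0.0, 1.5, 10.0), (1.5, 1.6, 4.0), 0.0):
        print(corner)
```

With a yaw of 0 the box length lies along the camera x axis; with a yaw of π/2 it lies along the z axis.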
| Component | Version |
|---|---|
| Python | 3.11 |
| torch | 2.5.1 |
| torch_npu | 2.5.1 |
| torchvision | 0.20.1 |

| Device model | NPU configuration |
|---|---|
| Atlas 800T A3 | 1 card |
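
Before going further, it can help to confirm that the Python environment matches the versions listed above. The following is a minimal sketch using only the standard library. The helper name `find_version_mismatches` is illustrative, and the comparison is deliberately strict, so a post-release such as `2.5.1.post1` would also be reported.

```python
from importlib import metadata

# Versions taken from the component table above.
EXPECTED_VERSIONS = {
    "torch": "2.5.1",
    "torch_npu": "2.5.1",
    "torchvision": "0.20.1",
}


def find_version_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, found)} for packages that differ or are missing."""
    mismatches = {}
    for name, want in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build suffixes such as "+cpu" when comparing.
        if found is None or found.split("+")[0] != want:
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    for name, (want, found) in find_version_mismatches(EXPECTED_VERSIONS).items():
        print(f"{name}: expected {want}, found {found}")
```

An empty result means every listed package is installed at the expected version.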
Image address: base images matched to each Ascend Cloud version

| Machine type | Image name | Image address |
|---|---|---|
| 910C | PyTorch 2.5 container image | Internal address: registry-cbu.huawei.com/atelier/pytorch_ascend:pytorch_2.5.1-cann_8.2.rc1-py_3.11-hce_2.0.2503-aarch64-snt9b23-20250729103313-3a25129 Public address: swr.cn-southwest-2.myhuaweicloud.com/atelier/pytorch_ascend:pytorch_2.5.1-cann_8.2.rc1-py_3.11-hce_2.0.2503-aarch64-snt9b23-20250729103313-3a25129 |
docker pull registry-cbu.huawei.com/atelier/pytorch_ascend:pytorch_2.5.1-cann_8.2.rc1-py_3.11-hce_2.0.2503-aarch64-snt9b23-20250729103313-3a25129
docker run -itd \
-u root \
--privileged \
--device=/dev/davinci0 \
--device=/dev/davinci1 \
--device=/dev/davinci2 \
--device=/dev/davinci3 \
--device=/dev/davinci4 \
--device=/dev/davinci5 \
--device=/dev/davinci6 \
--device=/dev/davinci7 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
-v /usr/local/sbin/npu-smi:/usr/local/sbin/npu-smi \
-v /usr/local/dcmi:/usr/local/dcmi \
-v /etc/ascend_install.info:/etc/ascend_install.info \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
-v /usr/bin/hccn_tool:/usr/bin/hccn_tool \
-v /etc/hccn.conf:/etc/hccn.conf \
-v /home/t00578192:/data \
--shm-size 1024g \
--network=host \
--name mono3d \
2c19dd762161 \
/bin/bash
docker exec -it mono3d bash
conda create -n mono3d --clone PyTorch-2.5.1
conda activate mono3d
pip install yacs
pip install scikit-image
pip install tqdm
cd /data
git clone https://github.com/lzccccc/SMOKE

Use the official KITTI 3D object dataset to train and test the model.

cd /data/SMOKE
mkdir datasets
cd datasets
mkdir kitti
# Download the images, calibration parameters, and labels, then extract them into kitti/
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip
unzip data_object_image_2.zip -d kitti
unzip data_object_label_2.zip -d kitti
unzip data_object_calib.zip -d kitti
# Create ImageSets
cd kitti/training
mkdir ImageSets
cd ImageSets
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/test.txt --no-check-certificate --content-disposition -O test.txt
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/train.txt --no-check-certificate --content-disposition -O train.txt
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/val.txt --no-check-certificate --content-disposition -O val.txt
wget -c https://raw.githubusercontent.com/traveller59/second.pytorch/master/second/data/ImageSets/trainval.txt --no-check-certificate --content-disposition -O trainval.txt

The final dataset directory structure is as follows:
kitti
│──training
│    ├──calib
│    ├──label_2
│    ├──image_2
│    └──ImageSets
└──testing
     ├──calib
     ├──image_2
     └──ImageSets

# Modify tools/plain_train_net.py: insert the automatic adaptation code at the top of the file
import torch_npu
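from torch_npu.contrib import transfer_to_npu

The ext extension module of SMOKE depends heavily on CUDA and cannot be installed on an NPU; the build fails with the error "cuda is not available". One option is to comment out the ext extension module before installation.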
# Modify setup.py: comment out the compilation of _ext
setup(
    name="smoke",
    version="0.1",
    author="lzccccc",
    url="https://github.com/lzccccc/SMOKE",
    description="Single-Stage Monocular 3D Object Detection via Keypoint Estimation",
    packages=find_packages(exclude=("configs", "tests",)),
    # ext_modules=get_extensions(),
    # cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
)

Analysis shows that the parameters of torch_npu.contrib.module.DCNv2 match those of DCN in smoke/layers/dcn_v2.py, so the DCNv2 implementation in torch_npu can be used directly as a replacement.
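# The original line "from smoke import _ext as _backend" and the DCN-related code after it can all be commented out
# from smoke import _ext as _backend
...
# The new code replaces it with the DCNv2 implementation from torch_npu
import torch_npu
from torch_npu.contrib.module import DCNv2 as DCN

Then build and install SMOKE:
python setup.py build develop

The download link for the weight file required for training in the original code, http://dl.yf.io/dla/models/imagenet/dla34-ba72cf86.pth, is no longer valid, so it must be replaced with a working link. Alternatively, download the file and change the configuration to point to the local path.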
# Modify the configuration in smoke/config/paths_catalog.py
class ModelCatalog():
    IMAGENET_MODELS = {
        # "DLA34": "http://dl.yf.io/dla/models/imagenet/dla34-ba72cf86.pth"
        "DLA34": "https://storage.openvinotoolkit.org/repositories/open_model_zoo/public/2022.1/dla-34/dla34-ba72cf86.pth"
    }
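
If the weight file is downloaded to a local path instead, its integrity can be checked before training. The sketch below assumes the file name follows the common PyTorch convention in which the suffix after the last hyphen (here `ba72cf86`) is the leading hex digits of the file's SHA-256 digest. That convention is an assumption, not something stated in this guide, and the helper name `sha256_prefix_matches` is illustrative.

```python
import hashlib
import re
from pathlib import Path


def sha256_prefix_matches(path):
    """Check a file against the hash prefix embedded in its name.

    Assumes a name such as "dla34-ba72cf86.pth", where the part after the
    last hyphen is the start of the file's SHA-256 hex digest.
    """
    path = Path(path)
    match = re.search(r"-([0-9a-f]{8,})\.[^.]+$", path.name)
    if match is None:
        raise ValueError(f"no hash prefix found in file name: {path.name}")
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest().startswith(match.group(1))
```

A call such as `sha256_prefix_matches("/data/weights/dla34-ba72cf86.pth")` (a hypothetical local path) returns `False` when the download is truncated or corrupted, provided the naming assumption holds.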

Start training with the single-card training command:
python tools/plain_train_net.py --config-file "configs/smoke_gn_vector.yaml"
Training on the NPU succeeded. The measured per-iteration training performance is 2.3072 s/it.
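
For comparing later optimizations against this baseline, a seconds-per-iteration figure can also be measured independently of the training log. The following is a minimal sketch of such a timer; the class name `IterationTimer` and the warm-up default are illustrative choices and do not come from the SMOKE repository. On an asynchronous accelerator the device should be synchronized before each `step()` call, otherwise the wall-clock interval may not reflect the actual compute time.

```python
import time


class IterationTimer:
    """Average wall-clock seconds between consecutive step() calls.

    The first `warmup` intervals are skipped, since early iterations often
    include one-off costs such as operator compilation and cache warm-up.
    """

    def __init__(self, warmup=5, clock=time.perf_counter):
        self._warmup = warmup
        self._clock = clock
        self._last = None
        self._seen = 0
        self._count = 0
        self._total = 0.0

    def step(self):
        """Call once at the end of every training iteration."""
        now = self._clock()
        if self._last is not None:
            self._seen += 1
            if self._seen > self._warmup:
                self._total += now - self._last
                self._count += 1
        self._last = now

    @property
    def seconds_per_iteration(self):
        """Mean interval after warm-up, or None if nothing has been measured yet."""
        return self._total / self._count if self._count else None
```

In a training loop, create one `IterationTimer` before the loop, call `step()` at the end of each iteration, and read `seconds_per_iteration` periodically.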