MindYOLO Toolkit

MindYOLO is a MindSpore-based toolkit for object detection with the YOLO family of models.

The compatible versions are as follows:

mindyolo	mindspore
0.5	2.5.0

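1. Supported models

2. Installation

2.1 Install third-party dependencies with pip

mindspore == 2.5.0
numpy >= 1.17.0
pyyaml >= 5.3
openmpi 4.0.3 (for distributed mode)

Run the following command to install the third-party Python packages: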

pip install -r requirements.txt
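
After installation, it can help to confirm that the installed packages match the version list above. The following is a minimal sketch, not part of MindYOLO: the helper names (parse_version, satisfies, check_environment) are hypothetical, the version comparison is deliberately simple (numeric components only), and it assumes the packages are registered under the distribution names mindspore, numpy, and pyyaml.

```python
from importlib import metadata

# Requirements copied from the list above: (distribution name, operator, version).
REQUIREMENTS = [
    ("mindspore", "==", "2.5.0"),
    ("numpy", ">=", "1.17.0"),
    ("pyyaml", ">=", "5.3"),
]


def parse_version(text):
    """Turn '2.5.0' into (2, 5, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings with '==' or '>='."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that 5.3 compares equal to 5.3.0
    b += (0,) * (width - len(b))
    return a == b if op == "==" else a >= b


def check_environment():
    """Return a {package: status} report for the requirements above."""
    report = {}
    for name, op, required in REQUIREMENTS:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if satisfies(installed, op, required):
            report[name] = "ok"
        else:
            report[name] = f"found {installed}, need {op} {required}"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

openmpi is a system-level dependency rather than a Python package, so this sketch does not check it.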

⚠️ Note: the current version supports only Ascend hardware; GPUs are not supported yet.

2.2 Install the MindYOLO package with pip

pip install mindyolo

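For more details, see INSTALLATION.

3. Quick start

3.1 Inference with a pretrained model

Step 1: choose a model and its configuration file from the model zoo list, for example ./configs/yolov7/yolov7.yaml.
Step 2: download the corresponding pretrained weight file from the model zoo list.

Step 3: run:

# NPU (default)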
python demo/predict.py --config ./configs/yolov7/yolov7.yaml --weight=/path_to_ckpt/WEIGHT.ckpt --image_path /path_to_image/IMAGE.jpg

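The results are saved in the ./detect_results directory.

For details on the command-line arguments, run python demo/predict.py -h or read its source code.

3.2 Training and evaluation

3.2.1 Dataset preparation

Prepare your dataset in YOLO format. To train on the COCO2017 dataset, prepare it as described by yolov5 or darknet.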

coco/
    train2017.txt
    val2017.txt
    annotations/
        instances_train2017.json
        instances_val2017.json
    images/
        train2017/
            00000001.jpg  # image files that are mentioned in the corresponding train/val2017.txt
            ...
        val2017/
            ...
    labels/
        train2017/
            00000001.txt  # label files that are mentioned in the corresponding train/val2017.txt
            ...
        val2017/
            ...
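
Before training, a quick consistency check of this layout can catch missing files early. The sketch below is an illustration under stated assumptions, not a MindYOLO utility: it assumes each line of train2017.txt or val2017.txt is an image path relative to the dataset root (for example ./images/train2017/00000001.jpg) and that the label for an image sits at the same relative path with images replaced by labels and the extension replaced by .txt. The function names are hypothetical.

```python
from pathlib import Path


def label_path_for(image_rel_path):
    """Map images/<split>/<name>.jpg to labels/<split>/<name>.txt."""
    parts = list(Path(image_rel_path).parts)
    index = parts.index("images")  # raises ValueError if the path has no images/ part
    parts[index] = "labels"
    return Path(*parts).with_suffix(".txt")


def find_missing(root, list_file):
    """Return (missing_images, missing_labels) for the paths listed in list_file."""
    root = Path(root)
    missing_images, missing_labels = [], []
    for line in (root / list_file).read_text().splitlines():
        rel_path = line.strip()
        if not rel_path:
            continue  # skip blank lines
        image = root / rel_path
        label = root / label_path_for(rel_path)
        if not image.is_file():
            missing_images.append(str(image))
        if not label.is_file():
            missing_labels.append(str(label))
    return missing_images, missing_labels


if __name__ == "__main__":
    images, labels = find_missing("./coco", "train2017.txt")
    print(f"{len(images)} missing images, {len(labels)} missing labels")
```

A missing label file is not always an error, since an image without objects may legitimately have no labels; treat the report as a list of items to inspect.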

3.2.2 Training

Start training (single card):

python train.py --config ./configs/yolov7/yolov7.yaml

Multi-card distributed training, using 8 cards as an example:

msrun --worker_num=8 --local_worker_num=8 --bind_core=True --log_dir=./yolov7_log python train.py --config ./configs/yolov7/yolov7.yaml  --is_parallel True

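Note: the default hyperparameters are tuned for 8-card training on the COCO2017 dataset. Adjust them as needed for single-card training or for other datasets.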
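
To see why the card count matters, recall that the total batch size is per_batch_size multiplied by the number of devices. The sketch below works through that arithmetic for the yolov7 setting in the benchmark tables (16 * 8). The learning-rate adjustment it shows is the common linear-scaling heuristic, which is an assumption here and not a rule documented by MindYOLO, and the base learning rate of 0.01 is used only as an illustrative value. Both function names are hypothetical.

```python
def total_batch_size(per_batch_size, device_num):
    """Total batch size = per-card batch size times the number of cards."""
    return per_batch_size * device_num


def linearly_scaled_lr(base_lr, base_total_batch, new_total_batch):
    """Linear-scaling heuristic (an assumption, not a MindYOLO rule):
    scale the learning rate in proportion to the total batch size."""
    return base_lr * new_total_batch / base_total_batch


default_total = total_batch_size(16, 8)  # 8-card default: 128
single_total = total_batch_size(16, 1)   # same config on one card: 16

print("8-card total batch size:", default_total)
print("1-card total batch size:", single_total)
print("heuristic 1-card lr:", linearly_scaled_lr(0.01, default_total, single_total))
```

Treat the scaled value only as a starting point for tuning; the right setting depends on the model and dataset.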

3.2.3 Evaluation

Evaluate model accuracy (single card):

python test.py --config ./configs/yolov7/yolov7.yaml --weight /path_to_ckpt/WEIGHT.ckpt

Evaluate model accuracy with multi-card distributed execution:

msrun --worker_num=8 --local_worker_num=8 --bind_core=True --log_dir=./yolov7_log python test.py --config ./configs/yolov7/yolov7.yaml --weight /path_to_ckpt/WEIGHT.ckpt --is_parallel True

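3.2.4 Deployment

See MindYOLO Deployment and Inference for details.

4. Training on a custom dataset

This section fine-tunes a model with MindYOLO on the SHWD dataset (safety helmet detection).

4.1 Version compatibility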

mindspore	ascend driver	firmware	cann toolkit/kernel
2.5.0	24.1.0	7.5.0.3.220	8.0.0.beta1

4.2 Dataset format conversion

4.2.1 Dataset format description

The SHWD dataset uses VOC-format annotations. Its directory layout is as follows:

ROOT_DIR
├── Annotations
│        ├── 000000.xml
│        └── 000002.xml
├── ImageSets
│       └── Main
│             ├── test.txt
│             ├── train.txt
│             ├── trainval.txt
│             └── val.txt
└── JPEGImages
        ├── 000000.jpg
        └── 000002.jpg

The xml files in the Annotations folder hold the annotations for each image. Their main content looks like this:

<annotation>
  <folder>JPEGImages</folder>
  <filename>000377.jpg</filename>
  <path>F:\baidu\VOC2028\JPEGImages\000377.jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>750</width>
    <height>558</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>hat</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>142</xmin>
      <ymin>388</ymin>
      <xmax>177</xmax>
      <ymax>426</ymax>
    </bndbox>
  </object>
</annotation>

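An annotation can contain multiple object elements. In each object, name is the class name, and xmin, ymin, xmax, and ymax are the coordinates of the top-left and bottom-right corners of the bounding box.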
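
The fields described above can be read with Python's standard xml.etree.ElementTree module. This is a minimal sketch rather than the code used by convert_shwd2yolo.py; the function name parse_voc_annotation and the returned structure are hypothetical, and the sample is a shortened version of the annotation shown above.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<annotation>
  <filename>000377.jpg</filename>
  <size>
    <width>750</width>
    <height>558</height>
    <depth>3</depth>
  </size>
  <object>
    <name>hat</name>
    <bndbox>
      <xmin>142</xmin>
      <ymin>388</ymin>
      <xmax>177</xmax>
      <ymax>426</ymax>
    </bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return (width, height, objects) parsed from one VOC annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Some VOC files store coordinates as floats, so go through float first.
        bbox = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({"name": obj.findtext("name"), "bbox": bbox})
    return width, height, objects


if __name__ == "__main__":
    print(parse_voc_annotation(SAMPLE))
```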

MindYOLO expects datasets in YOLO format. For details, see how the official yolov5 repository or darknet prepares datasets. An example layout:

coco/
    train2017.txt
    val2017.txt
    annotations/
        instances_train2017.json
        instances_val2017.json
    images/
        train2017/
            00000001.jpg  # image files that are mentioned in the corresponding train/val2017.txt
            ...
        val2017/
            ...
    labels/
        train2017/
            00000001.txt  # label files that are mentioned in the corresponding train/val2017.txt
            ...
        val2017/
            ...

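4.2.2 Dataset format conversion

MindYOLO uses the image file name as the image_id during validation, so each image name must be purely numeric rather than an arbitrary string, and the images therefore need to be renamed. Converting the SHWD dataset involves the following steps:

Copy the images to the corresponding path and rename them
Write the relative path of each image into the corresponding txt file in the root directory
Parse the xml files and generate the matching txt label files under the corresponding path
For the validation set, also generate the final json file

See convert_shwd2yolo.py for the full implementation. Run it as follows: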

python examples/finetune_SHWD/convert_shwd2yolo.py --root_dir /path_to_shwd/SHWD

This command leaves the original dataset unchanged and generates the YOLO-format SHWD dataset in a directory at the same level.
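
The core of the label conversion is turning a VOC corner box into a YOLO label line. The sketch below illustrates that step only and is not taken from convert_shwd2yolo.py. It assumes the usual yolov5/darknet label convention (class_id x_center y_center width height, all normalized to the image size) and assumes class ids follow the order of the names list used in the SHWD yaml config (person, hat). The function names are hypothetical.

```python
# Class order assumed to match the names list in the SHWD yaml config.
CLASS_NAMES = ["person", "hat"]


def voc_box_to_yolo(bbox, width, height):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalized
    (x_center, y_center, box_width, box_height)."""
    xmin, ymin, xmax, ymax = bbox
    x_center = (xmin + xmax) / 2.0 / width
    y_center = (ymin + ymax) / 2.0 / height
    box_width = (xmax - xmin) / width
    box_height = (ymax - ymin) / height
    return x_center, y_center, box_width, box_height


def yolo_label_line(name, bbox, width, height):
    """Build one line of a YOLO txt label file."""
    class_id = CLASS_NAMES.index(name)  # raises ValueError for unknown classes
    values = voc_box_to_yolo(bbox, width, height)
    return " ".join([str(class_id)] + [f"{value:.6f}" for value in values])


def has_numeric_name(file_name):
    """Check that the file stem is purely numeric, as image_id requires."""
    stem = file_name.rsplit(".", 1)[0]
    return stem.isdigit()


if __name__ == "__main__":
    # The hat box from the sample annotation, in a 750 x 558 image.
    print(yolo_label_line("hat", (142, 388, 177, 426), 750, 558))
    print(has_numeric_name("000377.jpg"))
```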

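4.2.3 Writing the yaml configuration file

The configuration file mainly holds the parameters for the dataset, data augmentation, loss function, optimizer, and model structure. Because MindYOLO provides a yaml inheritance mechanism, you only need to write the parameters you want to change into yolov7-tiny_shwd.yaml; parameters that can be reused or need no change are inherited from the yaml file of an existing model. Its content is as follows: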

__BASE__: [
  '../../configs/yolov7/yolov7-tiny.yaml',
]

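per_batch_size: 16 # batch size per card; total batch size = per_batch_size * device_num
img_size: 640 # image sizes
weight: ./yolov7-tiny_pretrain.ckpt
strict_load: False # whether to load ckpt parameters strictly (default True); if False and the number of classes differs, the weights of the last classifier layer are dropped
log_interval: 10 # print the loss every log_interval iterations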

data:
  dataset_name: shwd
  train_set: ./SHWD/train.txt # path to the actual training data
  val_set: ./SHWD/val.txt
  test_set: ./SHWD/val.txt
  nc: 2 # number of classes
  # class names
  names: [ 'person',  'hat' ] # name of each class

optimizer:
  lr_init: 0.001  # initial learning rate

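Notes:

__BASE__ is a list of paths to the yaml files being inherited; more than one yaml file can be inherited
per_batch_size is the batch size on each card
img_size is the image size used during data processing
weight is the path to the pretrained model file
strict_load controls strict checkpoint loading; when set to False, parameters whose shapes do not match are discarded
log_interval is the logging interval
data holds the dataset-related parameters
- dataset_name is the name of the custom dataset
- train_set is the path to the training dataset
- val_set is the path to the validation dataset
- test_set is the path to the test dataset
- nc is the number of classes
- names lists the class names
optimizer holds the optimizer-related parameters
- lr_init is the initial learning rate after warm_up; here it is 10 times smaller than the default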

For the parameter inheritance relationships and parameter descriptions, see configuration_CN.md.
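
Conceptually, inheritance means that values in the child yaml override values from the base yaml while everything else is kept. The following sketch shows one way such a merge can work on plain dictionaries. It is not MindYOLO's actual loader: it assumes nested sections such as data and optimizer are merged key by key, and the base values shown (including warmup_epochs) are illustrative placeholders, not the real contents of yolov7-tiny.yaml.

```python
import copy


def merge_config(base, override):
    """Recursively merge override into a copy of base.

    Nested dictionaries are merged key by key; any other value in
    override replaces the value from base.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative stand-in for the inherited base config (values are assumptions).
base_config = {
    "per_batch_size": 16,
    "data": {"nc": 80},
    "optimizer": {"lr_init": 0.01, "warmup_epochs": 3},
}

# Only the parameters that differ for SHWD, as in yolov7-tiny_shwd.yaml.
shwd_config = {
    "strict_load": False,
    "data": {"nc": 2, "names": ["person", "hat"]},
    "optimizer": {"lr_init": 0.001},
}

final_config = merge_config(base_config, shwd_config)
print(final_config["optimizer"])
print(final_config["data"])
```

In this sketch the optimizer section keeps warmup_epochs from the base while lr_init takes the overridden value, which is the behavior the inheritance mechanism is meant to provide.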

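4.2.4 Downloading the pretrained model

You can use any model from the MindYOLO model zoo list as the pretrained model for a custom dataset. The pretrained models already reach good accuracy on COCO. Compared with training from scratch, loading a pretrained model generally converges faster and reaches higher final accuracy, and it helps to some extent to avoid problems such as vanishing or exploding gradients caused by poor initialization.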

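The number of classes in a custom dataset usually differs from COCO. In MindYOLO, the structure of each model's detection head depends on the number of classes, so loading a pretrained model directly may fail because of shape mismatches. Set strict_load to False in the yaml configuration file; MindYOLO then automatically discards the parameters whose shapes do not match and raises a warning that those module parameters were not loaded.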
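
The idea behind non-strict loading can be illustrated by comparing parameter shapes before loading. The sketch below works on plain dictionaries that map parameter names to shapes; it is not MindYOLO's loading code, and the parameter names and shapes are hypothetical. The head shapes use the anchor-based convention of 3 * (nc + 5) output channels, which gives 255 for the 80 COCO classes and 21 for the 2 SHWD classes.

```python
def split_by_shape(checkpoint_shapes, model_shapes):
    """Separate checkpoint parameters into those that fit the model
    and those that must be skipped (missing in the model or shape mismatch)."""
    loadable, skipped = {}, []
    for name, shape in checkpoint_shapes.items():
        if model_shapes.get(name) == shape:
            loadable[name] = shape
        else:
            skipped.append(name)
    return loadable, skipped


# Hypothetical parameter names and shapes, for illustration only.
coco_checkpoint = {
    "backbone.conv1.weight": (32, 3, 3, 3),
    "head.out.weight": (255, 128, 1, 1),  # 3 * (80 + 5) output channels
    "head.out.bias": (255,),
}
shwd_model = {
    "backbone.conv1.weight": (32, 3, 3, 3),
    "head.out.weight": (21, 128, 1, 1),   # 3 * (2 + 5) output channels
    "head.out.bias": (21,),
}

loadable, skipped = split_by_shape(coco_checkpoint, shwd_model)
print("loaded:", sorted(loadable))
for name in skipped:
    print(f"warning: {name} not loaded (shape mismatch)")
```

The skipped head parameters keep their fresh initialization and are learned during fine-tuning, while the backbone starts from the pretrained weights.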

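4.2.5 Model fine-tuning

During fine-tuning, start by training with the default configuration. If the results are unsatisfactory, consider adjusting the following parameters:

Lower the learning rate so that the loss does not struggle to converge
Adjust per_batch_size according to actual device memory usage; a larger per_batch_size usually gives a more accurate gradient estimate
Adjust epochs according to whether the loss has converged
Adjust anchor according to the actual object sizes

Because the SHWD training set has only about 6,000 images, the yolov7-tiny model is used for training.

Distributed training on multiple cards, using 8 cards as an example: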

msrun --worker_num=8 --local_worker_num=8 --bind_core=True --log_dir=./yolov7-tiny_log python train.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml --is_parallel True

Fine-tune the model on a single card:

python train.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml

Experimental results (for reference only): training on the SHWD dataset with the default yolov7-tiny parameters reaches an AP50 of 87.0. Changing lr_init from 0.01 to 0.001 raises AP50 to 90.5.

4.2.6 Visualized inference

Use demo/predict.py for visualized inference. Run it as follows:

python demo/predict.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml --weight=/path_to_ckpt/WEIGHT.ckpt --image_path /path_to_image/IMAGE.jpg

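The inference result is as follows:

4.2.7 More examples

5. Performance benchmarks and model zoo accuracy

5.1 Object detection

Performance tested on Ascend 910 (8p) in graph mode

Name	Scale	Batch size	Image size	Dataset	Box mAP (%)	Params	Config	Download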
YOLOv8	N	16 * 8	640	MS COCO 2017	37.2	3.2M	yaml	weights
YOLOv8	S	16 * 8	640	MS COCO 2017	44.6	11.2M	yaml	weights
YOLOv8	M	16 * 8	640	MS COCO 2017	50.5	25.9M	yaml	weights
YOLOv8	L	16 * 8	640	MS COCO 2017	52.8	43.7M	yaml	weights
YOLOv8	X	16 * 8	640	MS COCO 2017	53.7	68.2M	yaml	weights
YOLOv7	Tiny	16 * 8	640	MS COCO 2017	37.5	6.2M	yaml	weights
YOLOv7	L	16 * 8	640	MS COCO 2017	50.8	36.9M	yaml	weights
YOLOv7	X	12 * 8	640	MS COCO 2017	52.4	71.3M	yaml	weights
YOLOv5	N	32 * 8	640	MS COCO 2017	27.3	1.9M	yaml	weights
YOLOv5	S	32 * 8	640	MS COCO 2017	37.6	7.2M	yaml	weights
YOLOv5	M	32 * 8	640	MS COCO 2017	44.9	21.2M	yaml	weights
YOLOv5	L	32 * 8	640	MS COCO 2017	48.5	46.5M	yaml	weights
YOLOv5	X	16 * 8	640	MS COCO 2017	50.5	86.7M	yaml	weights
YOLOv4	CSPDarknet53	16 * 8	608	MS COCO 2017	45.4	27.6M	yaml	weights
YOLOv4	CSPDarknet53(silu)	16 * 8	608	MS COCO 2017	45.8	27.6M	yaml	weights
YOLOv3	Darknet53	16 * 8	640	MS COCO 2017	45.5	61.9M	yaml	weights
YOLOX	N	8 * 8	416	MS COCO 2017	24.1	0.9M	yaml	weights
YOLOX	Tiny	8 * 8	416	MS COCO 2017	33.3	5.1M	yaml	weights
YOLOX	S	8 * 8	640	MS COCO 2017	40.7	9.0M	yaml	weights
YOLOX	M	8 * 8	640	MS COCO 2017	46.7	25.3M	yaml	weights
YOLOX	L	8 * 8	640	MS COCO 2017	49.2	54.2M	yaml	weights
YOLOX	X	8 * 8	640	MS COCO 2017	51.6	99.1M	yaml	weights
YOLOX	Darknet53	8 * 8	640	MS COCO 2017	47.7	63.7M	yaml	weights

Performance tested on Ascend 910* (8p)

Name	Scale	Batch size	Image size	Dataset	Box mAP (%)	ms/step	Params	Config	Download
YOLOv10	N	32 * 8	640	MS COCO 2017	38.3	513.63	2.8M	yaml	weights
YOLOv10	S	32 * 8	640	MS COCO 2017	45.7	503.38	8.2M	yaml	weights
YOLOv10	M	32 * 8	640	MS COCO 2017	50.7	560.81	16.6M	yaml	weights
YOLOv10	B	32 * 8	640	MS COCO 2017	52.0	695.69	20.6M	yaml	weights
YOLOv10	L	32 * 8	640	MS COCO 2017	52.6	782.61	25.9M	yaml	weights
YOLOv10	X	20 * 8	640	MS COCO 2017	53.7	650.63	31.8M	yaml	weights
YOLOv9	T	16 * 8	640	MS COCO 2017	37.3	350	2.0M	yaml	weights
YOLOv9	S	16 * 8	640	MS COCO 2017	46.3	377	7.1M	yaml	weights
YOLOv9	M	16 * 8	640	MS COCO 2017	51.4	499	20.0M	yaml	weights
YOLOv9	C	16 * 8	640	MS COCO 2017	52.6	627	25.3M	yaml	weights
YOLOv9	E	16 * 8	640	MS COCO 2017	55.1	826	57.3M	yaml	weights
YOLOv8	N	16 * 8	640	MS COCO 2017	37.3	373.55	3.2M	yaml	weights
YOLOv8	S	16 * 8	640	MS COCO 2017	44.7	365.53	11.2M	yaml	weights
YOLOv7	Tiny	16 * 8	640	MS COCO 2017	37.5	496.21	6.2M	yaml	weights
YOLOv5	N	32 * 8	640	MS COCO 2017	27.4	736.08	1.9M	yaml	weights
YOLOv5	S	32 * 8	640	MS COCO 2017	37.6	787.34	7.2M	yaml	weights
YOLOv5	N6	32 * 8	1280	MS COCO 2017	35.7	1543.35	3.5M	yaml	weights
YOLOv5	S6	32 * 8	1280	MS COCO 2017	44.4	1514.98	13.6M	yaml	weights
YOLOv5	M6	32 * 8	1280	MS COCO 2017	51.1	1769.17	38.5M	yaml	weights
YOLOv5	L6	16 * 8	1280	MS COCO 2017	53.6	894.65	82.9M	yaml	weights
YOLOv5	X6	8 * 8	1280	MS COCO 2017	54.4	864.43	140.9M	yaml	weights
YOLOv4	CSPDarknet53	16 * 8	608	MS COCO 2017	46.1	337.25	27.6M	yaml	weights
YOLOv3	Darknet53	16 * 8	640	MS COCO 2017	46.6	396.60	61.9M	yaml	weights
YOLOX	S	8 * 8	640	MS COCO 2017	41.0	242.15	9.0M	yaml	weights

5.2 Segmentation

Performance tested on Ascend 910 (8p) in graph mode

Name	Scale	Batch size	Image size	Dataset	Box mAP (%)	Mask mAP (%)	Params	Config	Download
YOLOv8-seg	X	16 * 8	640	MS COCO 2017	52.5	42.9	71.8M	yaml	weights

Notes

Box mAP: the accuracy reported in the tables is measured on the validation set.

For more results, see Benchmark Results.

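6. Notes

⚠️ The current version is built on MindSpore's graph mode with static shapes. For more information, see the official MindSpore documentation.

7. How to contribute

We appreciate all contributions, including issue reports and PRs, that help make MindYOLO better.

See CONTRIBUTING.md for the contribution guidelines.

8. License

MindYOLO is released under the Apache License 2.0.

9. Acknowledgement

MindYOLO is an open-source project that welcomes any contribution and feedback. We hope that, by providing a flexible and standardized toolkit, this toolbox and benchmark can serve the growing research community in reproducing existing methods and developing new real-time object detection methods.

10. Citation

If you find this project useful in your research, please consider citing: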

@misc{MindSpore Object Detection YOLO 2023,
    title={{MindSpore Object Detection YOLO}:MindSpore Object Detection YOLO Toolbox and Benchmark},
    author={MindSpore YOLO Contributors},
    howpublished = {\url{https://github.com/mindspore-lab/mindyolo}},
    year={2023}
}