add finetune tutorial #124

yuedongli1 · 2023-05-30T11:31:42Z

Thank you for your contribution to the MindYOLO repo.
Before submitting this PR, please make sure:

You have read the Contributing Guidelines on pull requests
Your code builds clean without any errors or warnings
You are using approved terminology
You have added unit tests

Motivation

(Write your motivation for proposed changes here.)

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

zhanghuiyao · 2023-06-02T07:23:10Z

tutorials/custom_dataset_finetune.md

@@ -0,0 +1,122 @@
+# Getting started with finetuning on a custom dataset


file_name -> custom_dataset.md

zhanghuiyao · 2023-06-02T07:24:12Z

mindyolo/utils/trainer_factory.py

-        def modify_dataset_columns(image, labels, img_files):
-            return image, labels
-
-        loader = self.dataloader.map(
-            modify_dataset_columns,
-            input_columns=["image", "labels", "img_files"],
-            output_columns=["image", "labels"],
-            column_order=["image", "labels"],
-        )
-


This has already been fixed in the 2.0 adaptation PR.

zhanghuiyao · 2023-06-02T07:28:45Z

mindyolo/utils/shwd2yolo.py

@@ -0,0 +1,242 @@
+import os


Move this out to mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py

Also write a mindyolo/examples/finetune_SHWD/finetune_shwd.py and a README.md

zhanghuiyao · 2023-06-02T07:38:41Z

tutorials/custom_dataset_finetune.md

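+### Example
+
+The following uses the Safety Helmet Wearing Dataset (SHWD) as an example to walk through the main steps of finetuning MindYOLO on a custom dataset.


Add a link to examples/finetune_SHWD/README.md here.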

zhanghuiyao · 2023-06-06T01:50:45Z

tutorials/dataset_format.md

+
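+This document describes the dataset format used by the MindYOLO toolkit.
+
+MindYOLO trains models on data in yolo format and validates them on data in coco format via the coco api. To read a custom dataset through the api that MindYOLO provides, the training set must therefore be converted to yolo format and the validation set to coco format.


Just say it is yolo format, then show the concrete layout as a whole below.

This file is the tutorial for custom datasets: dataset_format.md -> custom_dataset.md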

zhanghuiyao · 2023-06-06T01:59:46Z

examples/finetune_SHWD/README.md

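+#### Model training
+
+Because the SHWD dataset has only 7000+ images, yolov7-tiny is chosen for training on it. The [model file](https://github.com/mindspore-lab/mindyolo/blob/master/MODEL_ZOO.md) that MindYOLO provides, trained on the coco dataset, can be downloaded as the pretrained model. Because the coco dataset has 80 object classes and SHWD has only two, the final head layer of the model file must be removed. For the detailed training procedure, see [GETTING_STARTED.md](https://github.com/mindspore-lab/mindyolo/blob/master/GETTING_STARTED.md)
+
+Training yolov7-tiny on SHWD with the default parameters provided by MindYOLO reaches an ap50 of 87.0; changing lr_init from 0.01 to 0.001 raises ap50 to 89.2, above the best ap50 of 88.5 reported in the official SHWD repository.


This should use finetune_shwd.py as the entry point; describe the exact procedure and commands clearly.

What does "remove the final head layer of the model" mean? I don't see a corresponding change in the model definition file.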

zhanghuiyao · 2023-06-06T02:01:57Z

examples/finetune_SHWD/shwd.yaml

+data:
+  dataset_name: shwd
+
+  train_set: ./SHWD/train.txt 
+  val_set: ./SHWD/val.txt
+
+  nc: 2
+
+  # class names
+  names: [ 'person',  'hat' ]
+
+  train_transforms: []
+  test_transforms: []


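Could this inherit directly from the outer ./configs/yolov7/yolov7-tiny.yaml and override only the changed parameters in shwd.yaml? Then delete "hyp/yolov7-tiny.yaml" from this directory.

Add a newline at the end of the file.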
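
The inheritance suggested here amounts to a recursive dictionary merge: load the base config, then let the dataset-specific file override only the keys that differ. The sketch below illustrates that idea and is not MindYOLO's actual config loader; `merge_config` and the `optimizer` section are hypothetical names, while the `data` values and the `lr_init` default of 0.01 come from this PR.

```python
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Return a new dict in which `override` replaces or extends `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections are merged key by key, not replaced wholesale.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-in for what the base yolov7-tiny.yaml would provide.
base_cfg = {
    "data": {"dataset_name": "coco", "nc": 80},
    "optimizer": {"lr_init": 0.01},
}

# Only the SHWD-specific differences (values from shwd.yaml in this PR).
shwd_cfg = {
    "data": {
        "dataset_name": "shwd",
        "train_set": "./SHWD/train.txt",
        "val_set": "./SHWD/val.txt",
        "nc": 2,
        "names": ["person", "hat"],
    },
}

cfg = merge_config(base_cfg, shwd_cfg)
print(cfg["data"]["nc"], cfg["optimizer"]["lr_init"])
```

With this shape, the SHWD file stays short and any later change to the base hyperparameters is picked up automatically.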

zhanghuiyao · 2023-06-06T02:03:39Z

examples/finetune_SHWD/README.md

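+Because MindYOLO uses the image file name as the image_id during validation, image names must be numeric rather than arbitrary strings, so the images also need to be renamed. Converting the SHWD dataset format involves the following steps; for the detailed implementation, see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py).
+* Copy the images to the corresponding paths and rename them
+* Write each image's relative path into the corresponding txt file in the root directory
+* Parse the xml files and generate the corresponding txt annotation files under the corresponding paths
+* For the validation set, also generate the final json file


link -> convert_shwd2yolo.py

Describe the conversion process clearly, including how to run convert_shwd2yolo.py and what form of dataset it produces.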
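
To make the xml-to-txt step concrete, here is a minimal sketch of converting one VOC-style annotation into yolo label lines (`class cx cy w h`, all normalized to the image size). It is an illustration under assumptions, not the code in convert_shwd2yolo.py: the sample xml, the `voc_to_yolo_lines` helper, and the use of the class order from shwd.yaml are all hypothetical choices.

```python
import xml.etree.ElementTree as ET
from typing import List

CLASS_NAMES = ["person", "hat"]  # order assumed from shwd.yaml in this PR

# Hypothetical, trimmed-down VOC annotation for a 400x500 image.
SAMPLE_XML = """
<annotation>
  <size><width>400</width><height>500</height></size>
  <object>
    <name>hat</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>250</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_yolo_lines(xml_text: str) -> List[str]:
    """Convert every <object> in a VOC xml string into a yolo label line."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.findtext("width"))
    img_h = float(size.findtext("height"))

    lines = []
    for obj in root.iter("object"):
        cls_id = CLASS_NAMES.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # yolo stores the box center and size as fractions of the image size.
        cx = (xmin + xmax) / 2 / img_w
        cy = (ymin + ymax) / 2 / img_h
        bw = (xmax - xmin) / img_w
        bh = (ymax - ymin) / img_h
        lines.append(f"{cls_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")
    return lines


yolo_lines = voc_to_yolo_lines(SAMPLE_XML)
print(yolo_lines)
```

For the sample above this yields a single line, `1 0.500000 0.300000 0.500000 0.400000`, which would be written to the txt file that shares its numeric stem with the renamed image.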

zhanghuiyao · 2023-06-06T03:04:08Z

examples/finetune_SHWD/yolov7-tiny.yaml

@@ -0,0 +1,123 @@
+__BASE__: [


file name -> yolov7-tiny_shwd.yaml

zhanghuiyao · 2023-06-06T03:33:09Z

examples/finetune_SHWD/README.md

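+* Parse the xml files and generate the corresponding txt annotation files under the corresponding paths
+* For the validation set, also generate the final json file
+
+For the detailed implementation, see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py). Run it as follows: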


link -> convert_shwd2yolo.py

zhanghuiyao · 2023-06-06T03:34:36Z

examples/finetune_SHWD/README.md

+For the detailed implementation, see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py). Run it as follows:
+
+  ```shell
+  cd mindyolo


Remove "cd mindyolo"; assume the working directory is the mindyolo project root by default.

zhanghuiyao · 2023-06-06T03:35:52Z

examples/finetune_SHWD/README.md

+
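+#### Converting the pretrained model file
+
+Because the SHWD dataset has only 7000+ images, yolov7-tiny is chosen for training on it. The [model file](https://github.com/mindspore-lab/mindyolo/blob/master/MODEL_ZOO.md) that MindYOLO provides, trained on the coco dataset, can be downloaded as the pretrained model. Because the coco dataset has 80 object classes and SHWD has only two, the final head layer of the pretrained model file must be removed; see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/examples/finetune_SHWD/convert_yolov7_headless.py).


link -> convert_yolov7_headless.py

Give a run command.

Add to the explanation that the head depends on nc.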
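
The dependence on nc can be shown with a little arithmetic. In an anchor-based YOLO detection head, each anchor predicts 4 box values, 1 objectness score, and nc class scores, so the output channel count changes with nc and the pretrained head weights no longer fit. The sketch below assumes 3 anchors per scale, which is the common YOLOv7 setup but is not stated in this thread; `head_out_channels` is a hypothetical helper.

```python
def head_out_channels(nc: int, anchors_per_scale: int = 3) -> int:
    """Output channels of one detection-head conv in an anchor-based YOLO."""
    # Per anchor: 4 box values + 1 objectness score + nc class scores.
    return anchors_per_scale * (nc + 5)


coco_channels = head_out_channels(80)  # coco: 80 classes
shwd_channels = head_out_channels(2)   # SHWD: 2 classes
print(coco_channels, shwd_channels)
```

Under that assumption the coco head has 255 output channels per scale and the SHWD head has 21, which is why the head parameters have to be dropped from the checkpoint while the rest of the network can be reused.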

zhanghuiyao · 2023-06-06T03:36:50Z

examples/finetune_SHWD/README.md

+
+  ```shell
+  cd mindyolo
+  python examples/finetune_SHWD/convert_shwd2yolo.py --root_dir ROOT_DIR


Replace ROOT_DIR with a concrete path, e.g. /path_to_hswd/HSWD

zhanghuiyao · 2023-06-06T03:37:10Z

examples/finetune_SHWD/README.md

+* Distributed training on multiple NPUs/GPUs, taking 8 devices as an example:
+
+  ```shell
+  cd mindyolo


delete "cd mindyolo"

zhanghuiyao · 2023-06-06T03:37:15Z

examples/finetune_SHWD/README.md

+* Training on a single NPU/GPU/CPU:
+
+  ```shell
+  cd mindyolo


delete "cd mindyolo"

zhanghuiyao · 2023-06-06T03:38:11Z

examples/finetune_SHWD/README.md

+
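+Because the SHWD dataset has only 7000+ images, yolov7-tiny is chosen for training on it. The [model file](https://github.com/mindspore-lab/mindyolo/blob/master/MODEL_ZOO.md) that MindYOLO provides, trained on the coco dataset, can be downloaded as the pretrained model. Because the coco dataset has 80 object classes and SHWD has only two, the final head layer of the pretrained model file must be removed; see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/examples/finetune_SHWD/convert_yolov7_headless.py).
+
+#### Model training


Model training -> Model finetuning (Finetune), would that be a little better?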

zhanghuiyao · 2023-06-06T03:42:35Z

examples/finetune_SHWD/README.md

+  python examples/finetune_SHWD/finetune_shwd.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml 
+  ```
+
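+Training yolov7-tiny on SHWD with the default parameters provided by MindYOLO reaches an ap50 of 87.0; changing lr_init from 0.01 to 0.001 raises ap50 to 89.2, above the best ap50 of 88.5 reported in the official SHWD repository.


Add a Note at the beginning.

Would it be better to change
"Training yolov7-tiny on SHWD with the default parameters provided by MindYOLO reaches xxx"
to
"Training on SHWD directly with the default yolov7-tiny coco parameters reaches an AP50 of 87.0"?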

zhanghuiyao · 2023-06-06T03:43:58Z

examples/finetune_SHWD/convert_yolov7_headless.py

@@ -0,0 +1,15 @@
+import mindspore as ms


file name -> convert_yolov7-tiny_pretrain_ckpt.py, would that be clearer?

zhanghuiyao · 2023-06-06T03:48:16Z

tutorials/custom_dataset.md

@@ -0,0 +1,76 @@
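+# Dataset format overview


A single overall directory description of the yolo format is enough here; below that, explain converting a custom dataset to yolo format for training, and point to the SHWD example.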

zhanghuiyao · 2023-06-06T06:40:39Z

examples/finetune_SHWD/README.md

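+* Parse the xml files and generate the corresponding txt annotation files under the corresponding paths
+* For the validation set, also generate the final json file
+
+For the detailed implementation, see [convert_shwd2yolo.py](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py). Run it as follows: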


zhanghuiyao · 2023-06-06T06:44:04Z

examples/finetune_SHWD/README.md

+  python examples/finetune_SHWD/finetune_shwd.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml 
+  ```
+
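+*Note: training on SHWD directly with the default yolov7-tiny coco parameters reaches an AP50 of 87.0. Changing lr_init from 0.01 to 0.001 raises ap50 to 89.2, above the best ap50 of 88.5 reported in the official SHWD repository.*


Drop the final "above xxx" clause.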

zhanghuiyao · 2023-06-06T06:45:20Z

examples/finetune_SHWD/README.md

+
+#### Model finetuning (Finetune)
+
+For a brief outline of the training flow, see [finetune_shwd.py](https://github.com/mindspore-lab/mindyolo/blob/master/examples/finetune_SHWD/finetune_shwd.py)


zhanghuiyao · 2023-06-06T06:48:56Z

tutorials/custom_dataset.md

+A dataset in the format MindYOLO expects has the following layout:
+```
+            ROOT_DIR
+                ├── val.txt
+                ├── train.txt
+                ├── annotations
+                │        └── instances_val2017.json
+                ├── images
+                │     ├── train
+                │     │     ├── 00000001.jpg
+                │     │     └── 00000002.jpg
+                │     └── val
+                │          ├── 00006563.jpg
+                │          └── 00006564.jpg
+                └── labels
+                      └── train
+                            ├── 00000001.txt
+                            └── 00000002.txt
+```


After this, use the directory tree together with concrete files to explain the annotation format and meaning of each file.
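
As one way to document the label files, the sketch below reads a single line from a file such as labels/train/00000001.txt and maps it back to pixel coordinates. It assumes the usual yolo convention of `class cx cy w h` with values normalized to the image size; `parse_yolo_label` and the sample values are hypothetical.

```python
from typing import Tuple


def parse_yolo_label(
    line: str, img_w: int, img_h: int
) -> Tuple[int, float, float, float, float]:
    """Turn 'class cx cy w h' (normalized) into (class, xmin, ymin, xmax, ymax) in pixels."""
    cls_id, cx, cy, bw, bh = line.split()
    # Scale the normalized center and size back to pixels.
    cx, bw = float(cx) * img_w, float(bw) * img_w
    cy, bh = float(cy) * img_h, float(bh) * img_h
    return int(cls_id), cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2


# Hypothetical label line for a 400x500 image.
label = parse_yolo_label("1 0.5 0.3 0.5 0.4", img_w=400, img_h=500)
print(label)
```

Per the README text in this PR, train.txt and val.txt list the relative path of each image, one per line, and the validation json carries the coco-format annotations.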

CaitinZhao · 2023-06-06T09:19:40Z

examples/finetune_SHWD/convert_shwd2yolo.py

+    val_txt_yolo.close()
+
+    json_file = os.path.join(new_dir, 'annotations', 'instances_val2017.json')
+    json.dump(coco, open(json_file, 'w'))


The tutorial does not explain the key code; add comments to the key parts of the code.
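
As a sketch of the kind of commented code that would help here, the following builds a minimal coco-style dictionary and writes it with a context manager so the file handle is closed explicitly. It is illustrative only: `build_coco`, the zero-based category ids, and the sample box are assumptions, and the real script may populate more fields.

```python
import json
import os
import tempfile

CLASS_NAMES = ["person", "hat"]  # order assumed from shwd.yaml in this PR


def build_coco(images, boxes):
    """images: (file_name, width, height); boxes: (file_name, cls_id, xmin, ymin, xmax, ymax)."""
    coco = {"images": [], "annotations": [], "categories": []}
    for cls_id, name in enumerate(CLASS_NAMES):
        coco["categories"].append({"id": cls_id, "name": name})
    for file_name, width, height in images:
        # The numeric file stem doubles as the image_id, hence the renaming step.
        image_id = int(os.path.splitext(file_name)[0])
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
    for ann_id, (file_name, cls_id, xmin, ymin, xmax, ymax) in enumerate(boxes, start=1):
        bw, bh = xmax - xmin, ymax - ymin
        coco["annotations"].append({
            "id": ann_id,
            "image_id": int(os.path.splitext(file_name)[0]),
            "category_id": cls_id,
            "bbox": [xmin, ymin, bw, bh],  # coco boxes are [x, y, width, height]
            "area": bw * bh,
            "iscrowd": 0,
        })
    return coco


coco = build_coco(
    images=[("00006563.jpg", 400, 500)],
    boxes=[("00006563.jpg", 1, 100, 50, 300, 250)],
)

# Written to a temporary directory here; the PR writes to annotations/instances_val2017.json.
with tempfile.TemporaryDirectory() as tmp_dir:
    json_file = os.path.join(tmp_dir, "instances_val2017.json")
    with open(json_file, "w") as f:
        json.dump(coco, f)
    with open(json_file) as f:
        reloaded = json.load(f)
```

Using `with open(...)` instead of passing `open(json_file, 'w')` straight into `json.dump` avoids relying on garbage collection to flush and close the file.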

CaitinZhao · 2023-06-06T09:20:07Z

examples/finetune_SHWD/convert_yolov7-tiny_pretrain_ckpt.py

+    new_ckpt = []
+    param_dict = ms.load_checkpoint(ori_weight)
+    for k, v in param_dict.items():
+        if '77' in k:


Why does 77 need to be dropped?
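
One plausible reading, not confirmed in this thread, is that 77 is the index of the detection head layer in the model definition, so its parameters are the ones whose shape depends on nc. The sketch below shows the filtering idea on a plain dictionary; the parameter names are hypothetical and `drop_head_params` is not part of MindYOLO. It matches the index as a whole dotted segment rather than as a substring, which avoids accidentally dropping a name that merely contains "77".

```python
def drop_head_params(param_dict: dict, head_index: str = "77") -> dict:
    """Keep every parameter except those that belong to layer `head_index`."""
    kept = {}
    for name, value in param_dict.items():
        # Compare whole dotted segments, so "177" or "770" would not match "77".
        if head_index in name.split("."):
            continue
        kept[name] = value
    return kept


# Hypothetical parameter names standing in for a loaded checkpoint.
fake_ckpt = {
    "model.model.0.conv.weight": "backbone",
    "model.model.76.conv.weight": "neck",
    "model.model.77.m.0.weight": "head",
    "model.model.77.m.0.bias": "head",
    "model.model.177.conv.weight": "not the head",
}

headless = drop_head_params(fake_ckpt)
print(sorted(headless))
```

A comment in the conversion script stating which layer the index refers to would answer the question for future readers.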

CaitinZhao · 2023-06-06T09:20:39Z

examples/finetune_SHWD/finetune_shwd.py

+if __name__ == "__main__":
+    parser = get_parser_train()
+    args = parse_args(parser)
+    train_shwd(args)


Why add a separate file? It is very similar to the training script and seems somewhat redundant.

yuedongli1 added the documentation (Improvements or additions to documentation), inside-test (issues raised by internal developers), and rfc (feature-request issues) labels May 30, 2023

yuedongli1 added this to the mindyolo-0.1 milestone May 30, 2023

yuedongli1 requested review from zhanghuiyao and CaitinZhao May 30, 2023 11:31

yuedongli1 self-assigned this May 30, 2023

yuedongli1 linked an issue May 30, 2023 that may be closed by this pull request

[New Feature] mindyolo public documentation #4

Closed

zhanghuiyao reviewed Jun 2, 2023

yuedongli1 closed this Jun 6, 2023

yuedongli1 reopened this Jun 6, 2023

yuedongli1 changed the title ~~add finetune tutorial;revise trainer_factory(project)~~ add finetune tutorial Jun 6, 2023

yuedongli1 requested a review from zhanghuiyao June 6, 2023 01:35

zhanghuiyao reviewed Jun 6, 2023

yuedongli1 requested a review from zhanghuiyao June 6, 2023 02:56

zhanghuiyao reviewed Jun 6, 2023

yuedongli1 requested a review from zhanghuiyao June 6, 2023 03:08

zhanghuiyao reviewed Jun 6, 2023

yuedongli1 requested a review from zhanghuiyao June 6, 2023 06:09

zhanghuiyao reviewed Jun 6, 2023

yuedongli1 requested a review from zhanghuiyao June 6, 2023 07:10

zhanghuiyao approved these changes Jun 6, 2023

zhanghuiyao previously approved these changes Jun 6, 2023

CaitinZhao reviewed Jun 6, 2023

yuedongli1 dismissed zhanghuiyao’s stale review via bea27f3 June 6, 2023 09:52

yuedongli1 requested review from CaitinZhao and zhanghuiyao June 6, 2023 09:59

add finetune tutorial

519abee

CaitinZhao approved these changes Jun 6, 2023

zhanghuiyao approved these changes Jun 6, 2023

zhanghuiyao merged commit a2b2129 into mindspore-lab:master Jun 6, 2023
