add finetune tutorial #124
tutorials/custom_dataset_finetune.md
Outdated
@@ -0,0 +1,122 @@
# Getting started with finetuning on a custom dataset
file_name -> custom_dataset.md
mindyolo/utils/trainer_factory.py
Outdated
def modify_dataset_columns(image, labels, img_files):
    return image, labels

loader = self.dataloader.map(
    modify_dataset_columns,
    input_columns=["image", "labels", "img_files"],
    output_columns=["image", "labels"],
    column_order=["image", "labels"],
)
This has already been fixed in the 2.0 adaptation PR.
mindyolo/utils/shwd2yolo.py
Outdated
@@ -0,0 +1,242 @@
import os
- Move this to the outer level: mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py
- Write a mindyolo/examples/finetune_SHWD/finetune_shwd.py and a README.md
tutorials/custom_dataset_finetune.md
Outdated
### Example

The following uses the Safety Helmet Wearing Dataset (SHWD) as an example to walk through the main steps of finetuning on a custom dataset with MindYOLO.
Link to examples/finetune_SHWD/README.md
tutorials/dataset_format.md
Outdated
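This document describes the dataset format used by the MindYOLO toolkit.

MindYOLO trains models on data in yolo format and validates them on data in coco format through the coco api. To read a custom dataset with the api provided by MindYOLO, the training set must therefore be converted to yolo format and the validation set to coco format.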
- Just state that it is yolo format, then give the concrete layout as a whole below;
- This file is the tutorial for custom datasets: dataset_format.md -> custom_dataset.md
examples/finetune_SHWD/README.md
Outdated
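#### Model training

Because the SHWD dataset contains only 7000+ images, yolov7-tiny is chosen for training on it. You can download the [model file](https://github.com/mindspore-lab/mindyolo/blob/master/MODEL_ZOO.md) that MindYOLO provides, trained on the coco dataset, as the pretrained model. Because the coco dataset has 80 object classes and the SHWD dataset has only two, the last head layer must be removed from the model file. For the detailed training procedure, see [GETTING_STARTED.md](https://github.com/mindspore-lab/mindyolo/blob/master/GETTING_STARTED.md)

Training yolov7-tiny on the SHWD dataset with the default parameters provided by MindYOLO reaches an ap50 of 87.0; changing the lr_init parameter from 0.01 to 0.001 reaches an ap50 of 89.2, higher than the best ap50 of 88.5 reported by the official SHWD repository.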
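- This should use finetune_shwd.py as the entry point; describe the exact execution flow and commands clearly;
- What does "remove the last head layer from the model" mean? I don't see a corresponding change in the model definition file.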
examples/finetune_SHWD/shwd.yaml
Outdated
data:
  dataset_name: shwd

  train_set: ./SHWD/train.txt
  val_set: ./SHWD/val.txt

  nc: 2

  # class names
  names: [ 'person', 'hat' ]

  train_transforms: []
  test_transforms: []
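- Could this inherit directly from the outer ./configs/yolov7/yolov7-tiny.yaml and override only the changed parameters in shwd.yaml? Then delete "hyp/yolov7-tiny.yaml" from this directory.
- Add a blank line at the end of the file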
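To illustrate the "inherit, then override only what changed" idea in the comment above, here is a minimal sketch of a recursive config merge. It is not MindYOLO's actual config loader; `merge_config`, the sample dictionaries, and the placement of `lr_init` under an `optimizer` key are assumptions made for illustration.

```python
import copy


def merge_config(base, override):
    """Return a new dict: `base` with the values from `override` merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing wholesale.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical base config standing in for the outer yolov7-tiny.yaml.
base_cfg = {"data": {"dataset_name": "coco", "nc": 80}, "optimizer": {"lr_init": 0.01}}
# Only the dataset-specific fields are restated for SHWD.
shwd_override = {"data": {"dataset_name": "shwd", "nc": 2, "names": ["person", "hat"]}}

cfg = merge_config(base_cfg, shwd_override)
print(cfg["data"]["nc"], cfg["optimizer"]["lr_init"])
```

With this kind of merge, the example-specific yaml only needs the `data` block and any hyperparameters that differ from the base file.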
examples/finetune_SHWD/README.md
Outdated
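Because MindYOLO uses the image name as the image_id during validation, image names must be numeric rather than arbitrary strings, so the images also need to be renamed. Converting the SHWD dataset format involves the following steps; see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py) for the detailed implementation.
* Copy the images to the corresponding paths and rename them
* Write each image's relative path into the corresponding txt file in the root directory
* Parse the xml files and generate the corresponding txt annotation files in the corresponding paths
* For the validation set, also generate the final json file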
- link -> convert_shwd2yolo.py
- Describe the conversion flow clearly, including how to run convert_shwd2yolo.py and what form of dataset it produces
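As a rough illustration of the "parse the xml file, write a txt annotation" step, the sketch below converts one annotation into yolo label lines. It assumes Pascal VOC-style xml (`size`, `object`, `bndbox` tags) and takes the class order from the `names` list in the quoted shwd.yaml; `voc_xml_to_yolo_lines` and `SAMPLE_XML` are hypothetical and are not taken from convert_shwd2yolo.py.

```python
import xml.etree.ElementTree as ET

# Class order assumed to follow the `names` list in the quoted shwd.yaml.
CLASS_NAMES = ["person", "hat"]


def voc_xml_to_yolo_lines(xml_text, class_names=CLASS_NAMES):
    """Convert one VOC-style xml annotation into a list of yolo label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in class_names:
            continue  # skip classes that are not part of the training set
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        # yolo line: class x_center y_center width height, all normalized to [0, 1]
        xc = (xmin + xmax) / 2 / img_w
        yc = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{class_names.index(name)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


SAMPLE_XML = """<annotation>
  <size><width>200</width><height>100</height></size>
  <object><name>hat</name>
    <bndbox><xmin>50</xmin><ymin>20</ymin><xmax>150</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

print(voc_xml_to_yolo_lines(SAMPLE_XML))
```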
@@ -0,0 +1,123 @@
__BASE__: [
file name -> yolov7-tiny_shwd.yaml
examples/finetune_SHWD/README.md
Outdated
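* Parse the xml files and generate the corresponding txt annotation files in the corresponding paths
* For the validation set, also generate the final json file

See the [code](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py) for the detailed implementation. Run it as follows: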
link -> convert_shwd2yolo.py
examples/finetune_SHWD/README.md
Outdated
See the [code](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py) for the detailed implementation. Run it as follows:

```shell
cd mindyolo
Remove "cd mindyolo"; assume the working directory is the mindyolo project directory
examples/finetune_SHWD/README.md
Outdated
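#### Converting the pretrained model file

Because the SHWD dataset contains only 7000+ images, yolov7-tiny is chosen for training on it. You can download the [model file](https://github.com/mindspore-lab/mindyolo/blob/master/MODEL_ZOO.md) that MindYOLO provides, trained on the coco dataset, as the pretrained model. Because the coco dataset has 80 object classes and the SHWD dataset has only two, the last head layer must be removed from the pretrained model file; see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/examples/finetune_SHWD/convert_yolov7_headless.py).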
- link -> convert_yolov7_headless.py
- Give a run command
- Add to the explanation that the head depends on nc
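To make the dependence of the head on nc concrete, here is a small worked calculation. It assumes the usual anchor-based YOLO head, where each anchor predicts 4 box values, 1 objectness score, and nc class scores, with 3 anchors per detection scale; `head_out_channels` is a hypothetical helper, not a MindYOLO function.

```python
def head_out_channels(nc, num_anchors=3):
    """Output channels per detection scale for an anchor-based YOLO head."""
    # Per anchor: 4 box coordinates + 1 objectness score + nc class scores.
    return num_anchors * (nc + 5)


coco_channels = head_out_channels(80)
shwd_channels = head_out_channels(2)
print(coco_channels, shwd_channels)  # 255 21
```

Under these assumptions the head weights trained on coco have a different shape from those needed for SHWD, so they cannot be loaded as-is, while the remaining layers do not depend on nc.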
examples/finetune_SHWD/README.md
Outdated
```shell
cd mindyolo
python examples/finetune_SHWD/convert_shwd2yolo.py --root_dir ROOT_DIR
Replace ROOT_DIR with a concrete path, e.g. /path_to_hswd/HSWD
examples/finetune_SHWD/README.md
Outdated
* Run distributed training on multiple NPU/GPU devices, using 8 devices as an example:

```shell
cd mindyolo
delete "cd mindyolo"
examples/finetune_SHWD/README.md
Outdated
* Train the model on a single NPU/GPU/CPU device:

```shell
cd mindyolo
delete "cd mindyolo"
examples/finetune_SHWD/README.md
Outdated
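Because the SHWD dataset contains only 7000+ images, yolov7-tiny is chosen for training on it. You can download the [model file](https://github.com/mindspore-lab/mindyolo/blob/master/MODEL_ZOO.md) that MindYOLO provides, trained on the coco dataset, as the pretrained model. Because the coco dataset has 80 object classes and the SHWD dataset has only two, the last head layer must be removed from the pretrained model file; see the [code](https://github.com/mindspore-lab/mindyolo/blob/master/examples/finetune_SHWD/convert_yolov7_headless.py).

#### Model training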
Model training -> Model finetuning (Finetune): would that be a bit better?
examples/finetune_SHWD/README.md
Outdated
python examples/finetune_SHWD/finetune_shwd.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml
```

Training yolov7-tiny on the SHWD dataset with the default parameters provided by MindYOLO reaches an ap50 of 87.0; changing the lr_init parameter from 0.01 to 0.001 reaches an ap50 of 89.2, higher than the best ap50 of 88.5 reported by the official SHWD repository.
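- Add a Note at the beginning
- "Training yolov7-tiny on the SHWD dataset with the default parameters provided by MindYOLO reaches xxx"
would it be better changed to
"Training on the SHWD dataset directly with the default yolov7-tiny coco parameters achieves an AP50 of 87.0"
?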
@@ -0,0 +1,15 @@
import mindspore as ms
file name -> conver_yolov7-tiny_pretrain_ckpt.py: would that be clearer?
@@ -0,0 +1,76 @@
# Introduction to the dataset format
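Just give one overall directory description of the yolo format, then explain below that a custom dataset is converted to yolo format for training; refer to hswd as the example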
examples/finetune_SHWD/README.md
Outdated
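* Parse the xml files and generate the corresponding txt annotation files in the corresponding paths
* For the validation set, also generate the final json file

See [convert_shwd2yolo.py](https://github.com/mindspore-lab/mindyolo/blob/master/mindyolo/examples/finetune_SHWD/convert_shwd2yolo.py) for the detailed implementation. Run it as follows: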
The link has not been updated
examples/finetune_SHWD/README.md
Outdated
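python examples/finetune_SHWD/finetune_shwd.py --config ./examples/finetune_SHWD/yolov7-tiny_shwd.yaml
```

*Note: Training on the SHWD dataset directly with the default yolov7-tiny coco parameters achieves an AP50 of 87.0. Changing the lr_init parameter from 0.01 to 0.001 reaches an ap50 of 89.2, higher than the best ap50 of 88.5 reported by the official SHWD repository.*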
Drop the trailing "higher than xxx" sentence;
examples/finetune_SHWD/README.md
Outdated
#### Model finetuning (Finetune)

For a brief version of the training flow, see [finetune_shwd.py](https://github.com/mindspore-lab/mindyolo/blob/master/examples/finetune_SHWD/finetune_shwd.py)
The link has not been updated
A dataset format suitable for MindYOLO has the following form:
```
ROOT_DIR
├── val.txt
├── train.txt
├── annotations
│   └── instances_val2017.json
├── images
│   ├── train
│   │   ├── 00000001.jpg
│   │   └── 00000002.jpg
│   └── val
│       ├── 00006563.jpg
│       └── 00006564.jpg
└── labels
    └── train
        ├── 00000001.txt
        └── 00000002.txt
```
After this, you could follow this directory layout and use concrete files to explain the annotation format and meaning of each file;
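As one way to explain the files in that layout, the sketch below shows how an entry in train.txt could map to its label file and how one label line decodes to a pixel box. It assumes the common yolo convention of one `class x_center y_center width height` line per object with normalized coordinates; both helper functions are hypothetical and not part of MindYOLO.

```python
import os


def image_path_to_label_path(image_path):
    """Map images/<split>/<name>.jpg to labels/<split>/<name>.txt."""
    stem, _ = os.path.splitext(image_path)
    return stem.replace("images", "labels", 1) + ".txt"


def yolo_line_to_pixel_box(line, img_w, img_h):
    """Decode one yolo label line into (class_index, (xmin, ymin, xmax, ymax)) in pixels."""
    cls, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return int(cls), (xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


print(image_path_to_label_path("./images/train/00000001.jpg"))
print(yolo_line_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
```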
val_txt_yolo.close()

json_file = os.path.join(new_dir, 'annotations', 'instances_val2017.json')
json.dump(coco, open(json_file, 'w'))
The tutorial does not explain the key code, so comments need to be added to the key parts of the code
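For context on what the `coco` dictionary written here typically contains, below is a minimal sketch of a COCO-style annotation structure. It is not the code from this PR: `build_coco_dict` and its input tuple layout are hypothetical, and reusing the class index as `category_id` is an assumption that may not match what MindYOLO's evaluation expects.

```python
import json


def build_coco_dict(class_names, images):
    """Build a COCO-style dict.

    `images` is a list of (file_name, width, height, boxes), where each box is
    (class_index, xmin, ymin, xmax, ymax) in pixels.
    """
    coco = {"images": [], "annotations": [], "categories": []}
    for idx, name in enumerate(class_names):
        coco["categories"].append({"id": idx, "name": name})

    ann_id = 0
    for file_name, width, height, boxes in images:
        # The numeric file name doubles as the image_id, hence the renaming step.
        image_id = int(file_name.split(".")[0])
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for cls, xmin, ymin, xmax, ymax in boxes:
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cls,
                "bbox": [xmin, ymin, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


coco = build_coco_dict(["person", "hat"], [("00006563.jpg", 200, 100, [(1, 50, 20, 150, 60)])])
print(json.dumps(coco["annotations"][0]))
```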
new_ckpt = []
param_dict = ms.load_checkpoint(ori_weight)
for k, v in param_dict.items():
    if '77' in k:
Why does 77 need to be dropped?
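A plausible reading, not confirmed in this thread, is that layer index 77 is the detection head in the yolov7-tiny definition, whose shapes depend on nc. The sketch below shows that kind of filter on a plain dict, without MindSpore; the parameter names and `drop_head_params` are hypothetical.

```python
def drop_head_params(param_dict, head_marker="77"):
    """Keep every parameter whose name does not contain the head marker."""
    return {k: v for k, v in param_dict.items() if head_marker not in k}


# Hypothetical parameter names; real checkpoint keys may look different.
toy_params = {
    "model.model.0.conv.weight": "w0",
    "model.model.76.conv.weight": "w76",
    "model.model.77.m.0.weight": "head_w",
    "model.model.77.m.0.bias": "head_b",
}

kept = drop_head_params(toy_params)
print(sorted(kept))
```

Note that a bare substring test such as `'77' in k` would also match any other name that happens to contain "77", so matching a delimited form like `.77.` and adding a comment naming the layer would make the intent clearer.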
if __name__ == "__main__":
    parser = get_parser_train()
    args = parse_args(parser)
    train_shwd(args)
Why add a separate file? It is very similar to the training file and seems somewhat redundant
Thank you for your contribution to the MindYOLO repo.
Before submitting this PR, please make sure:
Motivation
(Write your motivation for proposed changes here.)
Test Plan
(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)
Related Issues and PRs
(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)