This tutorial shows how to customize the workflow and hooks when running the project with your own settings.
Workflow is a list of (phase, duration) to specify the running order and duration. The meaning of "duration" depends on the runner's type.
For example, we use the epoch-based runner by default, so the "duration" means how many epochs the phase is executed in a cycle. Usually we only want to execute the training phase, in which case the following config is enough.
workflow = [('train', 1)]
Sometimes we may want to check some metrics (e.g. loss, accuracy) of the model on the validation set. In that case, we can set the workflow as
[('train', 1), ('val', 1)]
so that training and validation each run for one epoch, alternating.
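As a rough illustration of how an epoch-based runner walks through a workflow, the sketch below expands a workflow into the sequence of phases it would execute. The helper `expand_workflow` is hypothetical and not part of MMCV; it assumes, as a simplification, that only the `train` phase counts towards `max_epochs`.

```python
def expand_workflow(workflow, max_epochs):
    """Return the ordered list of phases a toy epoch-based loop would run.

    Assumption: only 'train' epochs count towards ``max_epochs``.
    """
    if not any(phase == 'train' and duration > 0 for phase, duration in workflow):
        raise ValueError("workflow needs a 'train' phase with a positive duration")

    schedule = []
    trained_epochs = 0
    while trained_epochs < max_epochs:
        for phase, duration in workflow:
            for _ in range(duration):
                if phase == 'train':
                    if trained_epochs >= max_epochs:
                        break  # training budget is used up
                    trained_epochs += 1
                schedule.append(phase)
    return schedule


print(expand_workflow([('train', 1)], max_epochs=3))
print(expand_workflow([('train', 1), ('val', 1)], max_epochs=2))
```

With the second workflow, each training epoch is followed by one validation epoch until the training budget is reached.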
By default, we recommend using `EvalHook` to run evaluation after each training epoch.
The hook mechanism is widely used in the OpenMMLab open-source algorithm libraries. With hooks inserted into the `Runner`, the entire life cycle of the training process can be managed easily. You can learn more about hooks in the related article.
Hooks only work after being registered into the runner. At present, hooks are mainly divided into two categories:
- default training hooks
These hooks are registered by the runner by default. They generally fulfill basic functions and have a default priority that you do not need to modify.
- custom hooks
The custom hooks are registered through custom_hooks. Generally, they are hooks with enhanced functions. The priority needs to be specified in the configuration file. If you do not specify the priority of the hook, it will be set to 'NORMAL' by default.
Priority list
Level | Value |
---|---|
HIGHEST | 0 |
VERY_HIGH | 10 |
HIGH | 30 |
ABOVE_NORMAL | 40 |
NORMAL(default) | 50 |
BELOW_NORMAL | 60 |
LOW | 70 |
VERY_LOW | 90 |
LOWEST | 100 |
The priority determines the execution order of the hooks. Before training, the log will print out the execution order of the hooks at each stage to facilitate debugging.
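To make the ordering concrete, here is a minimal sketch that sorts hooks by the priority values from the table above. It assumes that a lower value runs first and that hooks sharing a priority keep their registration order. The hook names and the `execution_order` helper are hypothetical, and the code is a stand-in for illustration rather than the runner's actual implementation.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Priority levels copied from the table above."""
    HIGHEST = 0
    VERY_HIGH = 10
    HIGH = 30
    ABOVE_NORMAL = 40
    NORMAL = 50
    BELOW_NORMAL = 60
    LOW = 70
    VERY_LOW = 90
    LOWEST = 100


def execution_order(hooks):
    """Sort (name, priority_name) pairs by priority value.

    ``sorted`` is stable, so hooks with equal priority keep the order
    in which they were listed.
    """
    ordered = sorted(hooks, key=lambda item: Priority[item[1]])
    return [name for name, _ in ordered]


# Hypothetical hooks, listed in registration order.
registered = [
    ('LoggingHook', 'VERY_LOW'),
    ('MyHook', 'NORMAL'),
    ('SchedulerHook', 'VERY_HIGH'),
    ('AnotherHook', 'NORMAL'),
]
print(execution_order(registered))
```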
Some common hooks are not registered through `custom_hooks`. They are:
Hooks | Priority |
---|---|
LrUpdaterHook | VERY_HIGH (10) |
MomentumUpdaterHook | HIGH (30) |
OptimizerHook | ABOVE_NORMAL (40) |
CheckpointHook | NORMAL (50) |
IterTimerHook | LOW (70) |
EvalHook | LOW (70) |
LoggerHook(s) | VERY_LOW (90) |
`OptimizerHook`, `MomentumUpdaterHook` and `LrUpdaterHook` have been introduced in schedule strategy. `IterTimerHook` is used to record elapsed time and does not support modification.
Here we show how to customize `CheckpointHook`, `LoggerHooks`, and `EvalHook`.
The MMCV runner will use `checkpoint_config` to initialize `CheckpointHook`.
checkpoint_config = dict(interval=1)
We could set `max_keep_ckpts` to save only a small number of checkpoints, or decide whether to store the state dict of the optimizer with `save_optimizer`. More details of the arguments are in the `CheckpointHook` documentation.
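For instance, a config that keeps only the three most recent checkpoints and leaves the optimizer state out of the saved files might look like the following. The values are illustrative, not recommended defaults.

```python
checkpoint_config = dict(
    interval=1,            # save a checkpoint every epoch
    max_keep_ckpts=3,      # keep only the 3 most recent checkpoints
    save_optimizer=False,  # do not store the optimizer state dict
)
```

Leaving out the optimizer state makes the checkpoint files smaller, but resuming training from such a checkpoint will start with a freshly initialized optimizer.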
The `log_config` wraps multiple logger hooks and allows setting their logging interval. MMCV currently supports `TextLoggerHook`, `WandbLoggerHook`, `MlflowLoggerHook`, `NeptuneLoggerHook`, `DvcliveLoggerHook` and `TensorboardLoggerHook`.
The detailed usage can be found in the doc.
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])
The `evaluation` config will be used to initialize the `EvalHook`.
The `EvalHook` has some reserved keys, such as `interval`, `save_best` and `start`; the other arguments, such as `metric`, will be passed to `dataset.evaluate()`.
evaluation = dict(interval=1, metric='accuracy', metric_options={'topk': (1, )})
You can save the model weights whenever the best validation result is obtained by setting the parameter `save_best`:
# "auto" means automatically select the metrics to compare.
# You can also use a specific key like "accuracy_top-1".
evaluation = dict(interval=1, save_best="auto", metric='accuracy', metric_options={'topk': (1, )})
When running some large-scale experiments, you can skip the validation step at the beginning of training by setting the parameter `start` as below:
evaluation = dict(interval=1, start=200, metric='accuracy', metric_options={'topk': (1, )})
This means that evaluation is skipped before the 200th epoch. From the 200th epoch onward, evaluation is executed after each training epoch.
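The interplay of `interval` and `start` can be sketched as below. This is a simplified model for illustration, not MMCV's code: the functions `should_evaluate` and `evaluated_epochs` are hypothetical, epochs are counted from 1, and the assumption is that once `start` is reached, evaluation happens every `interval` epochs counting from `start`.

```python
def should_evaluate(epoch, interval=1, start=None):
    """Decide whether to evaluate after the given 1-based training epoch.

    Simplified assumption: without ``start``, evaluate every ``interval``
    epochs; with ``start``, skip earlier epochs and then evaluate every
    ``interval`` epochs counting from ``start``.
    """
    if start is None:
        return epoch % interval == 0
    if epoch < start:
        return False
    return (epoch - start) % interval == 0


def evaluated_epochs(max_epochs, **kwargs):
    """List the epochs after which evaluation would run."""
    return [e for e in range(1, max_epochs + 1) if should_evaluate(e, **kwargs)]


print(evaluated_epochs(6, interval=2))
print(evaluated_epochs(6, interval=1, start=4))
```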
Some hooks have already been implemented in MMCV and MMClassification.
If the hook is already implemented in MMCV, you can use it directly by modifying the config as below
mmcv_hooks = [
    dict(type='MMCVHook', a=a_value, b=b_value, priority='NORMAL')
]
For example, to use `EMAHook` with an interval of 100 iterations:
custom_hooks = [
    dict(type='EMAHook', interval=100, priority='HIGH')
]
Here we give an example of creating a new hook in MMSelfSup.
from mmcv.runner import HOOKS, Hook


@HOOKS.register_module()
class MyHook(Hook):

    def __init__(self, a, b):
        pass

    def before_run(self, runner):
        pass

    def after_run(self, runner):
        pass

    def before_epoch(self, runner):
        pass

    def after_epoch(self, runner):
        pass

    def before_iter(self, runner):
        pass

    def after_iter(self, runner):
        pass
Depending on what the hook is meant to do, you need to implement different functionality in `before_run`, `after_run`, `before_epoch`, `after_epoch`, `before_iter`, and `after_iter`.
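To see when each of these methods fires, the following self-contained sketch uses a toy runner and a hook that records every call. Both `ToyRunner` and `RecordingHook` are hypothetical stand-ins written for this example; they do not depend on MMCV and only mimic the order of the stages.

```python
class RecordingHook:
    """Toy hook that records the name of every stage it is called in."""

    def __init__(self):
        self.calls = []

    def before_run(self, runner):
        self.calls.append('before_run')

    def after_run(self, runner):
        self.calls.append('after_run')

    def before_epoch(self, runner):
        self.calls.append('before_epoch')

    def after_epoch(self, runner):
        self.calls.append('after_epoch')

    def before_iter(self, runner):
        self.calls.append('before_iter')

    def after_iter(self, runner):
        self.calls.append('after_iter')


class ToyRunner:
    """Toy training loop that calls every hook at each stage."""

    def __init__(self, hooks, max_epochs, iters_per_epoch):
        self.hooks = hooks
        self.max_epochs = max_epochs
        self.iters_per_epoch = iters_per_epoch

    def call_hook(self, stage):
        for hook in self.hooks:
            getattr(hook, stage)(self)

    def run(self):
        self.call_hook('before_run')
        for _ in range(self.max_epochs):
            self.call_hook('before_epoch')
            for _ in range(self.iters_per_epoch):
                self.call_hook('before_iter')
                # A real runner would do the forward/backward pass here.
                self.call_hook('after_iter')
            self.call_hook('after_epoch')
        self.call_hook('after_run')


hook = RecordingHook()
ToyRunner([hook], max_epochs=1, iters_per_epoch=2).run()
print(hook.calls)
```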
Then we need to ensure that `MyHook` is imported. Assuming `MyHook` is in `mmselfsup/core/hooks/my_hook.py`, there are two ways to import it:
- Modify `mmselfsup/core/hooks/__init__.py` as below
from .my_hook import MyHook
__all__ = [..., 'MyHook', ...]
- Use `custom_imports` in the config to manually import it
custom_imports = dict(imports=['mmselfsup.core.hooks.my_hook'], allow_failed_imports=False)
custom_hooks = [
    dict(type='MyHook', a=a_value, b=b_value)
]
You can also set the priority of the hook as below:
custom_hooks = [
    dict(type='MyHook', a=a_value, b=b_value, priority='ABOVE_NORMAL')
]
By default, the hook's priority is set to `NORMAL` during registration.
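The path from a `custom_hooks` entry to a hook instance can be pictured with the toy registry below. It is a simplified sketch under stated assumptions, not MMCV's registry: `ToyRegistry`, `TOY_HOOKS` and `build_custom_hooks` are hypothetical names, and the only behavior taken from the text above is that `type` selects the registered class and a missing `priority` falls back to `'NORMAL'`.

```python
class ToyRegistry:
    """Minimal name -> class mapping, standing in for a hook registry."""

    def __init__(self):
        self._classes = {}

    def register_module(self):
        def decorator(cls):
            self._classes[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the config itself is not mutated
        cls = self._classes[args.pop('type')]
        return cls(**args)


TOY_HOOKS = ToyRegistry()


@TOY_HOOKS.register_module()
class MyHook:
    def __init__(self, a, b):
        self.a = a
        self.b = b


def build_custom_hooks(custom_hooks):
    """Return (hook, priority) pairs; priority falls back to 'NORMAL'."""
    built = []
    for cfg in custom_hooks:
        cfg = dict(cfg)
        priority = cfg.pop('priority', 'NORMAL')
        built.append((TOY_HOOKS.build(cfg), priority))
    return built


custom_hooks = [
    dict(type='MyHook', a=1, b=2),
    dict(type='MyHook', a=3, b=4, priority='ABOVE_NORMAL'),
]
for hook, priority in build_custom_hooks(custom_hooks):
    print(type(hook).__name__, hook.a, hook.b, priority)
```

This also shows why the module defining the hook has to be imported first: the class is only added to the registry when the decorator runs.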