Conversation
LGTM. Pinging @piiswrong in case there are additional comments.
Not sure if we want to add this option, especially under this name.
@piiswrong Caffe and PaddlePaddle also use this name.
@szha @tornadomeet If training with use_global_stats=True, it seems all the moving_mean values are 0 and the moving_var values are 1 in the trained model. Is that right? Then batch norm effectively becomes a scale-and-shift op. In what situation should use_global_stats=True be used?
@7oud I have the same question. I think use_global_stats=True should be used when you finetune a pretrained model such as ResNet or VGG.
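For illustration, a minimal Gluon sketch of the finetuning setup described above (the network and layer sizes are made up for the example, not taken from this thread): a BatchNorm created with use_global_stats=True normalizes with the stored moving_mean/moving_var even inside a training pass.

```python
import mxnet as mx
from mxnet.gluon import nn

net = nn.HybridSequential()
net.add(nn.Conv2D(64, kernel_size=3, padding=1),
        nn.BatchNorm(use_global_stats=True),  # frozen statistics, as for finetuning
        nn.Activation('relu'))
net.initialize()

x = mx.nd.random.uniform(shape=(8, 3, 32, 32))
with mx.autograd.record():  # even in a recorded (training) pass...
    y = net(x)              # ...this BN uses the global moving statistics
```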
@thbupt @tornadomeet I found in some small dataset tasks such as segmentation (training from scratch), the inference result is worse than training when using BatchNorm without use_global_stats. Did you have similar situations? |
@7oud If you train from scratch, use_global_stats should be set to false during training and true during testing, which is the default behavior in MXNet.
@7oud Do you mean that in your small task, setting use_global_stats=True during training gives better results? If that is true, it means BN is doing no work in your task, so just remove BN from your network.
@thbupt Actually I did what you said, but the same data batch gives different outputs with forward(is_train=False) and forward(is_train=True), and the inference results are worse. So I tried training with use_global_stats=True, and then both modes give the same results.
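A Gluon analogue of this forward(is_train=...) comparison (the tiny network here is hypothetical): with a plain BatchNorm, the same batch is normalized with the batch statistics in training mode and with the moving averages in prediction mode, so some gap between the two outputs is expected.

```python
import mxnet as mx
from mxnet.gluon import nn

net = nn.HybridSequential()
net.add(nn.Conv2D(16, kernel_size=3, padding=1), nn.BatchNorm())
net.initialize()

x = mx.nd.random.uniform(shape=(8, 3, 32, 32))
with mx.autograd.train_mode():
    out_train = net(x)      # normalized with this batch's mean/var
out_test = net(x)           # default predict mode: moving averages
print((out_train - out_test).abs().mean())  # a nonzero gap is expected
```

A large gap at the end of training usually means the moving averages never converged toward the typical batch statistics, which is common with very small batches.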
@tornadomeet Is there a simple way to set use_global_stats=True for all layers when finetuning? I know one way is to set use_global_stats=True for each BN layer separately when adding nn.BatchNorm.
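One possible way to do this in bulk, as a sketch rather than an official API: Block.apply walks the network recursively, and BatchNorm keeps its operator arguments in the private _kwargs dict, so the flag can be flipped in place. Since this touches a private attribute, it may break across MXNet versions.

```python
from mxnet.gluon import nn

net = nn.HybridSequential()
net.add(nn.Conv2D(16, kernel_size=3), nn.BatchNorm(),
        nn.Conv2D(32, kernel_size=3), nn.BatchNorm())

def freeze_bn_stats(block):
    if isinstance(block, nn.BatchNorm):
        block._kwargs['use_global_stats'] = True  # private attribute, version-dependent

net.apply(freeze_bn_stats)  # visits every child block recursively
```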
@tornadomeet it seems that, but I cannot give the conclusion, bcz the dataset is too small to giving truth |
@7oud What is your batch size? BN seems to prefer large batch sizes.
@7oud The correct way to use BN when training from scratch is to set use_global_stats=False during training; just make sure inference runs with is_train=False.
@thbupt The batch size in training is 8, and in inference it is usually 1.
@7oud I think 8 is too small for BN; you can try a larger batch size like 16 or 32.
In Gluon, do we need to set the training/prediction flag manually for BatchNorm? In #3871, it said that the flag determines whether BN uses the mini-batch statistics or the global moving statistics.
@jonbakerfish that flag is automatically set by module or autograd.record. It can be queried via autograd.is_training and overridden with autograd.train_mode/predict_mode when using autograd. |
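A short sketch of those flags in action: record() implies training mode by default, and train_mode()/predict_mode() override the current mode.

```python
from mxnet import autograd

print(autograd.is_training())          # False outside any scope

with autograd.record():                # records and sets train_mode=True by default
    print(autograd.is_training())      # True
    with autograd.predict_mode():      # override within the recorded scope
        print(autograd.is_training())  # False

with autograd.train_mode():            # training-mode forward without recording
    print(autograd.is_training())      # True
```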
Description
#9419
Checklist
Essentials
Passed code style checking (make lint)