
Default values of \beta (args.reg_weight) and \gamma (args.prior_weight) are wrong #6

Closed
killawhale2 opened this issue Jul 17, 2020 · 3 comments


@killawhale2

In the paper, \beta and \gamma are described as being set to 10 and 0.2, respectively. However, the released code uses 0 as the default value for both of them, which essentially means the code will be run without the proposed additional loss terms. I would appreciate it if the authors could clarify how \beta and \gamma were actually used during training.

@Wuziyi616
Contributor

Hi! Thank you for your attention to our work! Yes, you are right: the default parameters in args are not the ones we describe in the paper. That is because training with the "Sparse Object Prior" term is quite slow (I have to use a for-loop to iterate over the batch input to calculate s_i log(s_i)). It would therefore be prohibitive to train BiDet with this term from scratch (it may take 3-4 days to one week).
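To give a sense of why this term is slow, here is a rough sketch of the kind of per-image loop I mean (the function and tensor layout here are hypothetical, not the actual BiDet code):

```python
import torch

def sparse_object_prior(scores_per_image):
    """Sketch of accumulating the s_i * log(s_i) term with a Python loop.

    scores_per_image: hypothetical list of 1-D tensors of confidence
    scores s_i in (0, 1], one tensor per image in the batch. The
    per-image Python loop (instead of a single batched op) is what
    makes training with this term slow.
    """
    loss = 0.0
    for s in scores_per_image:
        loss = loss + (s * torch.log(s + 1e-12)).sum()
    return loss / max(len(scores_per_image), 1)
```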

Besides, we found that training with only the original loc and cls losses at the beginning, and adding the "Sparse Object Prior" and I(X; F) regularization terms later, gives equally good performance. Specifically, we first train with \beta=0 and \gamma=0 using learning_rate=1e-3 for several epochs, and then set args.reg_weight=0.1 and \gamma=0.2 (args.prior_weight=0.2) at the first lr decay step (that's what the code here does).
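A minimal sketch of that two-stage schedule (all names and step counts below are placeholders, not the actual identifiers in train_bidet_ssd.py):

```python
# Stand-ins for the real loss terms; placeholders only.
def loc_loss(step): return 1.0
def cls_loss(step): return 1.0
def mi_xf_loss(step): return 1.0    # the I(X; F) regularization term
def prior_loss(step): return 1.0    # the "Sparse Object Prior" term

max_steps, first_lr_decay_step = 1000, 600   # arbitrary placeholder values
lr = 1e-3
reg_weight, prior_weight = 0.0, 0.0          # both extra terms start disabled

for step in range(max_steps):
    if step == first_lr_decay_step:
        lr *= 0.1                             # first learning-rate decay
        reg_weight, prior_weight = 0.1, 0.2   # switch the extra terms on here
    total_loss = (loc_loss(step) + cls_loss(step)
                  + reg_weight * mi_xf_loss(step)
                  + prior_weight * prior_loss(step))
```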

Also, I have to point out that args.reg_weight is actually not the same as \beta in the IB formula. As you can see, args.reg_weight is multiplied with I(X; F) in the code, while \beta is multiplied with I(F; C, L) in our IB formula. So here args.reg_weight = \frac{1}{\beta}, which is why I set args.reg_weight=0.1 instead of 10 in the code (0.1 = \frac{1}{10}).
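Written out (my reading of the relationship, using the terms above and assuming \beta > 0):

\min_F \; I(X; F) - \beta I(F; C, L) \quad\Longleftrightarrow\quad \min_F \; \frac{1}{\beta} I(X; F) - I(F; C, L)

so with \beta = 10 from the paper, the coefficient the code puts on I(X; F) is args.reg_weight = \frac{1}{\beta} = 0.1.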

Hope my answer can help you!

@killawhale2
Author

Thank you for your quick reply. I do have to mention that there is still a bug in the uploaded code: although args.reg_weight and args.prior_weight are set to non-zero values some iterations later, the predicates used in calculating the respective losses are the global variables REGULARIZATION_LOSS_WEIGHT and PRIOR_LOSS_WEIGHT, which are set to 0 and never changed. Although this is quite a straightforward fix, it makes me think that the uploaded code wasn't actually tested to see whether it reproduces the results.
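For reference, the kind of fix I mean, just as a sketch (I haven't matched the exact structure of train_bidet_ssd.py, and switch_iter is a placeholder for however the script detects the first lr decay step):

```python
# Sketch: update the module-level weights when the schedule kicks in,
# instead of leaving them at their initial value of 0 forever.
REGULARIZATION_LOSS_WEIGHT = 0.0
PRIOR_LOSS_WEIGHT = 0.0

def maybe_enable_extra_losses(iteration, args, switch_iter):
    """Turn the extra loss terms on once `iteration` reaches `switch_iter`."""
    global REGULARIZATION_LOSS_WEIGHT, PRIOR_LOSS_WEIGHT
    if iteration >= switch_iter:
        REGULARIZATION_LOSS_WEIGHT = args.reg_weight   # e.g. 0.1
        PRIOR_LOSS_WEIGHT = args.prior_weight          # e.g. 0.2
```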

Also, after fixing the predicates that use the global variables, I get a division-by-zero error on L283 in train_bidet_ssd.py. It seems that loss_count is set to zero and never incremented in the for-loop that calculates the prior loss. It would be really helpful if someone could look into this.

@Wuziyi616
Contributor

Wuziyi616 commented Jul 22, 2020

Thank you for your careful inspection of our code. For the first question, yes, I have to admit that it is indeed a bug. When I tested the code before uploading it to GitHub, I only ran it from a pre-trained weight and set args.reg_weight=0.1 & args.prior_weight=0.2 on the command line, so I didn't discover the problem you mentioned. Really sorry for that.

For the second problem, loss_count is actually incremented in the for-loop here. I guess the "division by zero" problem may be because the model hasn't been trained enough for the prior loss calculation. As you can see, there are many conditions under which the code skips the loss_count += 1 operation, such as when no objects of the gt classes are detected. Therefore, if you use prior_loss right after random initialization of the model, it's quite possible that loss_count will be 0. A quick fix is to replace loss_p /= loss_count with loss_p /= (loss_count + 1e-6).
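Roughly what I mean, as a sketch (placeholder names, not the exact code from train_bidet_ssd.py):

```python
# Sketch: average the prior loss only over images that contributed to it,
# guarding against the case where every image was skipped.
per_image_priors = [0.5, None, 0.3]   # toy example; None = skipped image
loss_p = 0.0
loss_count = 0
for p in per_image_priors:
    if p is None:        # e.g. none of the gt classes were detected
        continue         # nothing added, loss_count not incremented
    loss_p += p
    loss_count += 1
loss_p /= (loss_count + 1e-6)   # avoids division by zero early in training
```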

Again, thank you for your time on our work and code. I really apologize for the bugs, and I will definitely test the code more carefully next time.
