Why normalize the bounding box targets? #365

Open
FangShancheng opened this issue Oct 5, 2016 · 5 comments

Comments


FangShancheng commented Oct 5, 2016

  1. (Resolved) Faster R-CNN works well on my own dataset with the following config:

    1. __C.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED = True
    2. __C.TRAIN.BBOX_NORMALIZE_MEANS = (0.0, 0.0, 0.0, 0.0)
    3. __C.TRAIN.BBOX_NORMALIZE_STDS = (0.1, 0.1, 0.2, 0.2)

    It also trains with BBOX_NORMALIZE_TARGETS_PRECOMPUTED = False, just with slightly worse results.
    But as soon as I made even a small modification to the network (such as the shape of the anchors or the architecture of the detection network), I got wrong results. Setting BBOX_NORMALIZE_TARGETS_PRECOMPUTED = False gave correct but mediocre results, and I also tried BBOX_NORMALIZE_STDS values fitted to my dataset.

    Answer: The reason I got wrong results is that I used the default Caffe snapshot instead of the snapshot function implemented in train.py (there are many similar issues, e.g. strange detection result depend on bbox_pred #186).

  2. (Unresolved) I do not understand why Faster R-CNN normalizes the bounding box targets. How does it work, and how do I use it correctly?

    In proposal_target_layer.py we normalize the bounding box targets:

    if cfg.TRAIN.BBOX_NORMALIZE_TARGETS_PRECOMPUTED:
        # Optionally normalize targets by a precomputed mean and stdev
        targets = ((targets - np.array(cfg.TRAIN.BBOX_NORMALIZE_MEANS))
                   / np.array(cfg.TRAIN.BBOX_NORMALIZE_STDS))
    

    while in the snapshot() method of train.py we unnormalize the weights of the bbox_pred layer:

    # scale and shift with bbox reg unnormalization; then save snapshot
    net.params['bbox_pred'][0].data[...] = \
        (net.params['bbox_pred'][0].data *
         self.bbox_stds[:, np.newaxis])     # weights: scale each row by its std
    net.params['bbox_pred'][1].data[...] = \
        (net.params['bbox_pred'][1].data *
         self.bbox_stds + self.bbox_means)  # biases: scale by stds, shift by means
    

    Why does this work? Could someone give an explanation or point to the related papers? (A sketch of the round trip follows this comment.)
    Thanks for any help.
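
For later readers, here is a minimal NumPy sketch of the round trip the two snippets above implement. This is an illustration, not the repository's code: the shapes and values are hypothetical, and it uses the math convention t = x @ w + b, whereas the Caffe layer stores weights output-by-input (hence the self.bbox_stds[:, np.newaxis] in train.py).

    import numpy as np

    # Round trip: train against normalized targets, then fold the
    # normalization back into the last layer at snapshot time.
    rng = np.random.default_rng(0)

    means = np.zeros(4)                     # cfg.TRAIN.BBOX_NORMALIZE_MEANS
    stds = np.array([0.1, 0.1, 0.2, 0.2])   # cfg.TRAIN.BBOX_NORMALIZE_STDS

    x = rng.standard_normal(8)       # feature vector (hypothetical size)
    w = rng.standard_normal((8, 4))  # bbox_pred weights, learned against
    b = rng.standard_normal(4)       # normalized targets

    t_norm = x @ w + b               # network output: normalized targets

    # Option A: unnormalize the prediction at test time.
    t_a = t_norm * stds + means

    # Option B (what snapshot() does): bake std/mean into the parameters,
    # so the saved network emits unnormalized targets directly.
    w_un = w * stds                  # scale each output column by its std
    b_un = b * stds + means          # scale the bias, then shift by the mean
    t_b = x @ w_un + b_un

    assert np.allclose(t_a, t_b)     # both paths give the same prediction

Training against roughly zero-mean, unit-variance targets keeps the regression loss well scaled, and baking the statistics back into bbox_pred means the deployed model needs no knowledge of the training-set statistics.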

@GeorgiAngelov

Good question. I would also like the answer to that...

@FangShancheng (Author)

The reason I got wrong results is that I used the default Caffe snapshot instead of the snapshot function implemented in train.py.

Still, I cannot understand why normalizing the bounding box targets and then unnormalizing the weights of the bbox_pred layer works.

@FangShancheng (Author)

Question 2:
If we unnormalize the target: t_un = t * std + mean = (x*w + b) * std + mean = x*w*std + b*std + mean.
If we unnormalize the weights: t_un = x * w_un + b_un = x * (w*std + mean) + (b*std) = x*w*std + b*std + **x*mean**.
Here x is the feature vector, w is the weights, b is the bias, t is the target, w_un is the unnormalized weight, and b_un is the unnormalized bias.

Does this mean that if mean is not (0, 0, 0, 0), the code in snapshot() cannot work correctly?
@rbgirshick @GeorgiAngelov
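
A quick numeric check of the two factorizations above, with throwaway values (a minimal sketch; note that snapshot() folds the mean into the bias, net.params['bbox_pred'][1], not into the weights):

    import numpy as np

    x = np.array([1.0, 2.0, 3.0])          # feature vector
    w = np.array([[0.5], [-0.2], [0.1]])   # weights
    b = np.array([0.3])                    # bias
    std, mean = 0.2, 0.1

    t = x @ w + b                          # normalized prediction

    via_target = t * std + mean                   # unnormalize the target
    via_bias = x @ (w * std) + (b * std + mean)   # mean folded into the bias
    via_weights = x @ (w * std + mean) + b * std  # mean folded into the weights

    print(np.allclose(via_target, via_bias))     # True
    print(np.allclose(via_target, via_weights))  # False: extra x*mean term

So the extra x*mean term only appears if the mean were folded into the weights; since snapshot() adds the mean to the bias, nonzero means still round-trip correctly.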

@Edwardmark

@FangShancheng Good question, have you figured out why?


yashkant commented Feb 2, 2019

If I understand correctly, in the Fast R-CNN paper (page 3, right column) the authors mention that they normalize the bounding box regression targets; I think it helps with setting the lambda parameter to 1 in the multi-task loss.
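
For context, the multi-task objective there is L = L_cls + lambda * L_loc. A rough sketch of why the target scale matters for lambda = 1 (hypothetical numbers; smooth_l1 follows the paper's definition):

    import numpy as np

    # Smooth L1 as defined in Fast R-CNN.
    def smooth_l1(x):
        ax = np.abs(x)
        return np.where(ax < 1, 0.5 * x ** 2, ax - 0.5)

    stds = np.array([0.1, 0.1, 0.2, 0.2])
    raw = np.array([0.05, -0.02, 0.1, 0.03])  # raw targets are small numbers
    normed = raw / stds                       # roughly unit-variance targets

    pred = np.zeros(4)                    # an uninformative prediction
    print(smooth_l1(pred - raw).sum())    # ~0.007: dwarfed by a typical L_cls
    print(smooth_l1(pred - normed).sum()) # ~0.28: comparable scale, lambda=1 works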
