Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaning up Ops Boxes and Losses 🧹 #5979

Closed
wants to merge 14 commits into from

Conversation

oke-aditya
Copy link
Contributor

@oke-aditya oke-aditya commented May 9, 2022

Fixes #5976
Work in Progress. Opened PR to see if I keep passing tests :)

@oke-aditya oke-aditya mentioned this pull request May 10, 2022
@oke-aditya oke-aditya changed the title WIP Cleaning up Ops Boxes and Losses WIP Cleaning up Ops Boxes and Losses 🧹 May 11, 2022
Copy link
Contributor

@datumbox datumbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I shouldn't be snooping around on draft PRs. I've added couple of comments but feel free to ignore if you think it's too early. 😄

torchvision/ops/ciou_loss.py Outdated Show resolved Hide resolved
@@ -12,6 +13,9 @@ def complete_box_iou_loss(
) -> torch.Tensor:

"""
# Original Implementation from
https://github.com/facebookresearch/detectron2/blob/main/detectron2/layers/losses.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave this as a comment on the source. Long unrendered URLs are not particularly helpful for the documentation. We should make the same change on diou.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sphinx agrees with you 😃

image

Let me re-wrtie

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap. That looks ugly and it's on multiple places. If you want bring a separate quick PR that moves attributions on the main part of the methods to avoid the issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One option is to embed the link e.g.

Original Implementation from Detectron2

But the docstring should at least start by describing what the object is, even if it's very obvious from its name already. So I would suggest to write something like

"""Complete Box IoU Loss.

Implementation is adapted from `Detectron2 <https://github.com/facebookresearch/detectron2/blob/main/detectron2/layers/losses.py>`__.

...

Copy link
Contributor Author

@oke-aditya oke-aditya May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

That's how it's currently over main branch. I would suggest adding 3 words in end
implementation adapted from Detectron2.

Copy link
Member

@NicolasHug NicolasHug May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great description but the preview is a bit long @oke-aditya . It's best to skip a line after the first sentence (there's even a PEP for that) to keep the preview is short and to-the-point.

Multi-line docstrings consist of a summary line just like a one-line docstring, followed by a blank line, followed by a more elaborate description. The summary line may be used by automatic indexing tools; it is important that it fits on one line and is separated from the rest of the docstring by a blank line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be tackling this in seperate PR anyways to unify all stuff. I feel we need bit more revamp for Ops docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely provide attribution to other projects when we use code from them. Let's keep the reference on the code as a comment on the main body of the method similar to what we do in most places instead of placing a link. We can review this on the future and make changes in a coordinated manner.

@@ -67,3 +67,33 @@ def split_normalization_params(
else:
other_params.extend(p for p in module.parameters() if p.requires_grad)
return norm_params, other_params


def _upcast(t: Tensor) -> Tensor:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that previous code used different versions of upcast. More specifically boxes permitted integers while losses didn't.

@oke-aditya oke-aditya requested a review from datumbox May 11, 2022 10:46
@oke-aditya oke-aditya marked this pull request as ready for review May 11, 2022 10:46
@oke-aditya oke-aditya changed the title WIP Cleaning up Ops Boxes and Losses 🧹 Cleaning up Ops Boxes and Losses 🧹 May 11, 2022
@oke-aditya
Copy link
Contributor Author

oke-aditya commented May 12, 2022

@pmeier Since the test_ops.py file is growing very big. 1.5k+ lines. I think it's time to take out common items. So I have created a new test file for losses. Should I do the same for IoU ?

Also cc @NicolasHug

# Protects from numerical overflows in multiplications by upcasting to the equivalent higher type
if t.dtype not in (torch.float32, torch.float64):
return t.float()
return t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the same _upcast as _utils. It doesn't support integers and converts everything to floats. Could you please review all the places where the giou_loss._upcast() was used and ensure the output will be a float? Basically box ops are OK to maintain things as integers but not losses.

Copy link
Contributor Author

@oke-aditya oke-aditya May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I do understand that I have not used the _upcast_if_not_float. It's intentional, :) Since I would like to know why

box ops are OK to maintain things as integers but not losses.

Copy link
Contributor

@datumbox datumbox May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _upcast on boxes was introduced for operators that estimate information related to a box. Things like the box area for instance. If you put integer boxes and you ask for the area of a box, you kind of expecting you receive the value in integers (that's debatable but that's how the operator worked). The issue was that if you used too small of a precision, the area estimation would overflow. So this method upcasts math operations to a space where it's safe to do multiplications without risking overflowing for most applications.

On the losses side, I'm not aware of any application that does things on integer space. Not only that but doing reduction == "mean" will break things. So we need to be careful to definitely not support ints in losses. Similar care might be needed for some box operators. Area might still make sense to return as integer but I'm not 100% sure if that's the case with all the IoU metrics we deal here.

So I think it's important prior merging this PR, to make an explicit decision of what has to support integers and what doesn't, handle it appropriately and add tests and xfails to ensure we are not breaking anything.

from common_utils import cpu_and_gpu
from torchvision import ops


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making a class, inheriting and the calling method is also possible. For now is this fine?

assert_empty_loss(ops.distance_box_iou_loss, dtype, device)


class TestFocalLoss:
Copy link
Contributor Author

@oke-aditya oke-aditya May 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are testing losses here, I felt to add this here . To avoid confusion between files.



def get_boxes(dtype, device):
box1 = torch.tensor([-1, -1, 1, 1], dtype=dtype, device=device)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not super happy with this choice of box. Since this is actually invalid input

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the concern.

Detectron2 used the same set of boxes. see this.

I think we should use valid input boxes.

That being said, should we also check if the input boxes have non-negative values?
What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot assert here as it will lead to cuda call and cause trouble.

I'm not sure if we can use torch._assert_async either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, ig this situation is similar to #5776 (comment)

@datumbox
Copy link
Contributor

@oke-aditya Shall we convert to draft and ping us when you are happy to review again?

@oke-aditya
Copy link
Contributor Author

Yep. :)

@datumbox datumbox marked this pull request as draft May 13, 2022 12:24
@oke-aditya
Copy link
Contributor Author

Not sure if I simplified the IoU tests but at least I reduced the code complexity from previous tests. I basically did not want to override a class that handles both the box_area and the IoUs. So I wrote separate test for box_area and then used a class to simplify IoU tests. i feel this keeps code bit readable and clear.

Still need to get rid of _generate().

One way is to use pytest.fixture or other is to use a few global variables.

@oke-aditya oke-aditya marked this pull request as ready for review May 16, 2022 12:32
@oke-aditya
Copy link
Contributor Author

oke-aditya commented May 16, 2022

@datumbox @NicolasHug as discussed with @pmeier . I will split this PR into 2 Parts. The first part will cleanup the code. The second part will cleanup the tests 😃 This will make merging easier.

@oke-aditya oke-aditya closed this May 16, 2022
@oke-aditya oke-aditya mentioned this pull request May 16, 2022
@oke-aditya oke-aditya deleted the cleanup_ops branch June 9, 2022 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clean up Ops Box and Losses implementations
5 participants