Revamp transforms doc #7859
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7859

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 New Failures, 1 Unrelated Failure
As of commit 7f6a39d with merge base 2c44eba:

NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following job failed but was present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.
    # Inplace operations on datapoints like ``obj.add_()`` will preserve the type of
    # ``obj``. However, the **returned** value of inplace operations will be a pure
    # tensor:
This is more of a drive-by. Turns out those inplace
operations aren't really exceptions at all.
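For reference, a minimal sketch of the behavior those quoted lines describe (assuming the beta `torchvision.datapoints` namespace this doc targets; not code from the PR):

```python
import torch
from torchvision import datapoints

img = datapoints.Image(torch.rand(3, 8, 8))

out = img.add_(1)  # in-place op on a datapoint

print(isinstance(img, datapoints.Image))  # True: img itself keeps its datapoint type
print(isinstance(out, datapoints.Image))  # False: the *returned* value is a pure tensor
```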
@@ -5,242 +5,449 @@ Transforming and augmenting images

.. currentmodule:: torchvision.transforms

Torchvision supports common computer vision transformations in the
All the text below is mostly new. There was very little copy/pasting. Mostly re-writing from scratch. Best to ignore the diff when reviewing.
    v2.Resize
    v2.ScaleJitter
    v2.RandomShortestSize
    v2.RandomResize
    RandomCrop

Functionals
I opted to separate the functionals into [subsubsubsub]sections like that. The alternative is to put e.g. `resize` just below `Resize`, but I found that to hurt readability of the rendered docs because it creates tables that are too long, and it becomes hard to figure out which transforms are actually supported.

(I didn't change the way we document the v1 stuff)
@NicolasHug great work rewriting transforms.rst!
LGTM, thanks a lot!
Some minor comments inline. Otherwise, I stumbled over this:
vision/torchvision/transforms/v2/_type_conversion.py, lines 46 to 47 in 2c44eba:

    class ToPILImage(Transform):
        """[BETA] Convert a tensor or an ndarray to PIL Image - this does not scale values.
This does not come from this PR, but might be worth fixing here.
For the most common case, i.e. tensor input and omitting the `mode` parameter, we do in fact scale the values:
vision/torchvision/transforms/functional.py, lines 289 to 291 in 2c44eba:

    if isinstance(pic, torch.Tensor):
        if pic.is_floating_point() and mode != "F":
            pic = pic.mul(255).byte()
Meaning, this transform will fail for users that do `ToPILImage(my_float_image_tensor * 255)`.
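To make the pitfall concrete, a minimal sketch (not from the PR; the tensor names are placeholders):

```python
import torch
from torchvision.transforms import ToPILImage

to_pil = ToPILImage()

# Common case: a float tensor in [0, 1] is multiplied by 255 and cast to uint8
# internally, so the values *are* scaled despite what the docstring says.
img = torch.rand(3, 32, 32)
pil_ok = to_pil(img)

# Pitfall: a float tensor already in [0, 255] gets multiplied by 255 a second
# time before the uint8 cast, so the resulting PIL image no longer matches the
# original values.
pil_bad = to_pil(img * 255)
```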
Anyway, that is not really important. Thanks Nicolas for the major rewrite!
    Transforms tend to be sensitive to the input strides / memory layout. Some
    transforms will be faster with channels-first images while others prefer
    channels-last. You may want to experiment a bit if you're chasing the very
Since we had user confusion before, should we point out here that we are talking about the memory layout and not the actual shape of the tensor?
addressed in 7f6a39d
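For what it's worth, the distinction is purely about strides, not shape; a plain-PyTorch sketch (not part of the PR):

```python
import torch

img = torch.rand(1, 3, 224, 224)                     # NCHW shape, channels-first layout
img_cl = img.to(memory_format=torch.channels_last)   # same shape, channels-last layout

print(img.shape == img_cl.shape)  # True: the shape is unchanged
print(img.stride())               # (150528, 50176, 224, 1)
print(img_cl.stride())            # (150528, 1, 672, 3) -> only the memory layout differs
```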
    parametrization. The ``get_params()`` class method of the transforms class
    can be used to perform parameter sampling when using the functional APIs.
A public `get_params` is only available for BC for v1 transforms. Should we advertise this here?
It is still the only way to sample parameters, right? I don't think we just added it for BC, we added it for feature completeness.
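For illustration, the pattern those doc lines refer to, sketched with the v1 names (the v2 functionals follow the same scheme):

```python
import torch
from torchvision import transforms
import torchvision.transforms.functional as F

img = torch.rand(3, 128, 128)
mask = torch.randint(0, 2, (1, 128, 128))

# Sample the crop parameters once...
i, j, h, w = transforms.RandomCrop.get_params(img, output_size=(64, 64))

# ...then apply the exact same crop to both the image and its mask.
img_crop = F.crop(img, i, j, h, w)
mask_crop = F.crop(mask, i, j, h, w)
```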
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Summary: (Note: this ignores all push blocking failures!)
Reviewed By: matteobettini
Differential Revision: D48900378
fbshipit-source-id: 2746d77f5c3a01223a98a20ba12f217b89d9652b
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Will follow up with changes to the gallery examples when this is done.
cc @vfdev-5