Introduce DeprecatedTypeProperties class #17991
Conversation
Differential Revision: D14443117 Differential Version: 75327518
Differential Revision: D14443117 Differential Version: 75511212
Differential Revision: D14443117 Differential Version: 75608133
Differential Revision: D14443117 Differential Version: 75660088
Differential Revision: D14443117 Differential Version: 75688678
Differential Revision: D14443117 Differential Version: 77000356
Differential Revision: D14443117 Differential Version: 77001327
Can you remind me again how we are solving the BC problem, where the return type name from
aten/src/ATen/Dispatch.h
Outdated
@@ -43,8 +43,8 @@ inline at::ScalarType scalar_type(at::ScalarType s) {

  C10_DEPRECATED_MESSAGE("passing at::Type to an AT_DISPATCH macro is deprecated, " \
The deprecation message isn't even accurate anymore lol
@@ -88,7 +88,8 @@ void SparseTensorImpl::set_indices_and_values_unsafe(const Tensor& indices, cons
  AT_CHECK(!indices.is_sparse(), "expected indices to be a dense tensor, but got indices of layout ", indices.layout());
  AT_CHECK(!values.is_sparse(), "expected values to be a dense tensor, but got values of layout ", values.layout());
  AT_CHECK(values.type().toSparse() == legacyTensorType(*this), "values type must match sparse tensor type");
  AT_CHECK(values.device().type() == device().type(), "device type of values (", values.device().type(), ") must match device type of device().type()", device().type(), ")");
  AT_CHECK(values.scalar_type() == typeMetaToScalarType(dtype()), "dtype of values (", values.scalar_type(), ") must match dtype of sparse tensor (", typeMetaToScalarType(dtype()), ")");
I don't understand, why don't you just compare dtype here?
Because the old comparison is comparing Type, which is more than just dtype.
Sure, and that's why you compared device(). But for the second one, why not values.dtype() == dtype()?
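A minimal sketch of the simplification being suggested, assuming the surrounding SparseTensorImpl context from the diff above (AT_CHECK, values, dtype()); it compares the dtypes directly instead of converting both sides to ScalarType first:

```cpp
// Hedged sketch, not the code in this PR: compare the dtypes directly.
AT_CHECK(values.dtype() == dtype(), "values dtype must match sparse tensor dtype");
```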
@@ -82,7 +85,7 @@ inline LongTensor flatten_indices(const Tensor& indices, IntArrayRef full_size,
    indices_mult_cpu_vec[i] = mult;
    mult *= full_size[i];
  }
  auto indices_mult_cpu = indices.type().cpu()
  auto indices_mult_cpu = indices.dispatch_type().cpu()
Do we have a non-Type/TypeProperties replacement for tensorFromBlob? Could this site be replaced with it?
Not yet.
aten/src/ATen/core/Formatting.cpp
Outdated
@@ -238,8 +242,7 @@ std::ostream& print(std::ostream& stream, const Tensor & tensor_, int64_t linesi
    stream << "size:\n" << tensor_.sizes() << "\n";
    stream << "]";
  } else {
    Type& cpudouble = tensor_.type().toBackend(Backend::CPU).toScalarType(kDouble);
    Tensor tensor = tensor_.toType(cpudouble).contiguous();
    Tensor tensor = tensor_.toType(Backend::CPU, kDouble).contiguous();
More idiomatic is tensor_.to(kCPU, kDouble)
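A minimal sketch of the suggested form, assuming the Formatting.cpp context from the diff above (tensor_ in scope):

```cpp
// Hedged sketch of the more idiomatic call suggested above.
Tensor tensor = tensor_.to(kCPU, kDouble).contiguous();
```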
aten/src/ATen/core/TypeProperties.h
Outdated
// serves as a replacement return value for Tensor::type(). Previously,
// Tensor::type() returned Type&, but we are changing Type to not be
// dtype-specific.
class TypeProperties {
I can't remember, how much did we bikeshed this name :) cc @gchanan
aten/src/ATen/core/TypeProperties.h
Outdated
// dtype-specific.
class TypeProperties {
 public:
  TypeProperties(Backend backend=Backend::Undefined, ScalarType scalar_type=ScalarType::Undefined)
Hot take: we shouldn't make these arguments optional. They should be mandatory, and then we should provide a default constructor that's undefined everything.
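A quick sketch of that suggestion, using the member names implied by the diff above (assumed, since the full class isn't shown here):

```cpp
// Hedged sketch: mandatory arguments plus an explicit "undefined" default constructor.
class TypeProperties {
 public:
  TypeProperties() : backend_(Backend::Undefined), scalar_type_(ScalarType::Undefined) {}
  TypeProperties(Backend backend, ScalarType scalar_type)
      : backend_(backend), scalar_type_(scalar_type) {}

 private:
  Backend backend_;
  ScalarType scalar_type_;
};
```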
aten/src/ATen/core/TypeProperties.h
Outdated
  return backend_ != Backend::Undefined && scalar_type_ != ScalarType::Undefined;
}

TypeProperties& operator=(const TypeProperties& other) {
C++ should be able to infer this definition automatically. No need to specify it.
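In other words (a hedged sketch, not the PR's code): since the members are plainly copyable, the implicitly generated copy assignment already does the memberwise copy, so the operator can be dropped or spelled as defaulted:

```cpp
// Either omit operator= entirely, or make the intent explicit:
TypeProperties& operator=(const TypeProperties& other) = default;
```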
aten/src/ATen/native/Indexing.cpp
Outdated
@@ -202,7 +201,7 @@ static Tensor computeLinearIndex(const Tensor & src, TensorList indices) {
  if (indices[i].defined()) {
    // Cast index to the longType matching src's backend
    // This allows us to support ie indexing a cuda tensor with a cpu tensor
    Tensor index = (wrapIndexOnce(indices[i], i, src.size(i)) * strides[i]).toType(longType);
    Tensor index = (wrapIndexOnce(indices[i], i, src.size(i)) * strides[i]).toType(kLong);
Just to(kLong)? :)
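A minimal sketch of that suggestion, assuming the same Indexing.cpp context as the diff above:

```cpp
// Hedged sketch: Tensor::to(ScalarType) instead of toType.
Tensor index = (wrapIndexOnce(indices[i], i, src.size(i)) * strides[i]).to(kLong);
```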
@@ -370,8 +370,10 @@ Tensor dot(const Tensor& self, const Tensor& tensor) {

Tensor& dot_out(Tensor& result, const Tensor& self, const Tensor& tensor) {
  result.resize_({});
  AT_CHECK(result.scalar_type() == self.scalar_type(),
           "result dtype ", result.scalar_type(), " does not match self dtype ", self.scalar_type());
  // dispatching through type ensures we don't allow mismatched types.
Comment's out of date now? (Or the CHECK above is not needed?)
@@ -364,7 +364,7 @@ Tensor ctc_loss(const Tensor& log_probs, const Tensor& targets, IntArrayRef inpu
    }
  }
  if (reduction == Reduction::Mean) {
    auto target_lengths_t = at::tensor(target_lengths, res.options().device(at::Device(at::Device::Type::CPU)).dtype(kLong)).toType(res.type());
    auto target_lengths_t = at::tensor(target_lengths, res.options());
Err... really? I'd be chuffed if this is semantics preserving, but it doesn't look obviously so to me?
Yeah, it's good; there are a lot of instances of redundant stuff in this file.
@@ -12,7 +12,7 @@ Tensor pin_memory(const Tensor& self) {
    AT_ERROR("cannot pin '", self.type().toString(), "' only CPU memory can be pinned");
  }
  auto* allocator = detail::getCUDAHooks().getPinnedMemoryAllocator();
  auto tensor = self.type().tensorWithAllocator(self.sizes(), self.strides(), allocator);
  auto tensor = self.dispatch_type().tensorWithAllocator(self.sizes(), self.strides(), allocator);
Another one where it would be nice to stop using the factory function on type.
Can't be done yet, but in the future :)
// log_probs: input_len x batch_size x num_labels
// targets [int64]: batch_size x target_length OR sum(target_lengths)
CheckedFrom c = "ctc_loss_gpu";
using target_t = typename std::conditional<target_scalar_type == kInt, int, int64_t>::type;
auto targets = targets_.toType(log_probs.type().toScalarType(target_scalar_type)); // to log_probs cuda if it isn't there already
Are you sure?!
Yeah, this is the reason for the weird test I changed (that you commented on).
aten/src/ATen/native/cuda/LossCTC.cu
Outdated
@@ -225,7 +224,7 @@ std::tuple<Tensor, Tensor> ctc_loss_gpu_template(const Tensor& log_probs, const

  auto target_lengths_t = at::tensor(target_lengths, targets.options().dtype(kLong));
  auto input_lengths_t = at::tensor(input_lengths, targets.options().dtype(kLong));
  tg_batch_offsets = tg_batch_offsets.toType(targets.type().toScalarType(kLong));
  tg_batch_offsets = tg_batch_offsets.toBackend(Backend::CUDA);
Just to(kCUDA) then? Or even just tg_batch_offsets.cuda()?
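A minimal sketch of the two suggested alternatives, assuming the LossCTC.cu context from the diff above:

```cpp
// Hedged sketch: either of these replaces the explicit toBackend call.
tg_batch_offsets = tg_batch_offsets.to(kCUDA);
// or
tg_batch_offsets = tg_batch_offsets.cuda();
```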
@@ -4460,7 +4460,7 @@ def test_CTCLoss_lengthchecks_cpu(self):
  def test_CTCLoss_zero_infinity(self):
      target_lengths = [60, 25, 20]
      input_lengths = [50, 50, 50]
      targets = torch.randint(1, 15, (sum(target_lengths),), dtype=torch.int)
      targets = torch.randint(1, 15, (sum(target_lengths),), dtype=torch.int, device='cuda')
Wow, how did this ever work?!
@@ -83,11 +83,11 @@ Tensor & map2_(Tensor & self, const Tensor & x_, const Tensor & y_, PyObject* fn
  }
  if (x_.type() != self.type()) {
    throw TypeError("map2_: expected %s for argument 'x' (got %s)",
      self.type().toString(), x_.type().toString());
      self.type().toString().c_str(), x_.type().toString().c_str());
You're a lucky man, the temporaries' lifetime extends for the entirety of the full expression lol
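For readers outside the thread, a standalone illustration (not project code) of the lifetime rule being relied on: a temporary std::string is destroyed only at the end of the full expression, so its c_str() pointer stays valid for a call made within that expression.

```cpp
#include <cstdio>
#include <string>

// Stand-in for type().toString(); returns a temporary std::string.
std::string name() { return "CPUFloatType"; }

int main() {
  std::printf("%s\n", name().c_str());  // OK: the temporary outlives this call
  const char* p = name().c_str();       // p dangles once this statement ends
  (void)p;                              // do not use p after this point
  return 0;
}
```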
Differential Revision: D14443117 Differential Version: 77388718
What was the resolution on the BC issue? Perhaps if we want to do BC here, we should actually do BC? And rename the internal thing instead? (It seems correct to call the thing DeprecatedTypeProperties in code but we should alias
Will that help? It still won't return a reference.
Differential Revision: D14443117 Differential Version: 77464130
You can also make it return a reference -- that won't be that difficult, right? You just need a small registration mechanism for which we have a lot of prior art.
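A standalone sketch of the kind of registration mechanism being suggested (the names and enums below are assumed for illustration, not the actual ATen registry): keep one long-lived instance per (Backend, ScalarType) pair so that Tensor::type() can hand out a reference into the registry instead of a value.

```cpp
#include <array>

namespace sketch {

enum class Backend : int { CPU, CUDA, NumOptions };
enum class ScalarType : int { Float, Double, Long, NumOptions };

struct DeprecatedTypeProperties {
  Backend backend;
  ScalarType scalar_type;
};

// One instance per (Backend, ScalarType) pair, built eagerly up front.
class DeprecatedTypePropertiesRegistry {
 public:
  DeprecatedTypePropertiesRegistry() {
    for (int b = 0; b < static_cast<int>(Backend::NumOptions); ++b) {
      for (int s = 0; s < static_cast<int>(ScalarType::NumOptions); ++s) {
        registry_[b][s] = DeprecatedTypeProperties{
            static_cast<Backend>(b), static_cast<ScalarType>(s)};
      }
    }
  }

  DeprecatedTypeProperties& get(Backend b, ScalarType s) {
    return registry_[static_cast<int>(b)][static_cast<int>(s)];
  }

 private:
  std::array<std::array<DeprecatedTypeProperties,
                        static_cast<int>(ScalarType::NumOptions)>,
             static_cast<int>(Backend::NumOptions)>
      registry_;
};

// A process-wide instance outlives all callers, so returning a reference is safe.
inline DeprecatedTypePropertiesRegistry& globalRegistry() {
  static DeprecatedTypePropertiesRegistry r;
  return r;
}

}  // namespace sketch
```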
Differential Revision: D14443117 Differential Version: 77644959
Differential Revision: D14443117 Differential Version: 77671735
Okay. I'll change it to return a reference. I'll do the renaming change in another PR because there will be a lot of changes.
New changes LGTM. Hope we can kill this soon lol.
Differential Revision: D14443117 Differential Version: 77808612
Differential Revision: D14443117 Differential Version: 77838952
Differential Revision: D14443117 Differential Version: 78000628
@pytorchbot retest this please
Differential Revision: D14443117 Differential Version: 78016243
Differential Revision: D14443117 Differential Version: 78021893
Differential Revision: D14443117 Differential Version: 78048258
Differential Revision: D14443117 Differential Version: 78065247
This pull request has been merged in c705d9e.
Summary:
Pull Request resolved: pytorch/pytorch#17991

Changes:
- Breaks BC: Tensor::type() now returns DeprecatedTypeProperties& rather than Type&.
- Added DeprecatedTypeProperties; it serves as a temporary replacement for Type as the return value of Tensor::type(). This contributes to making Type just for dispatch purposes so that we can make it dtype agnostic.
- Tensor::dispatch_type() now returns Type& like Tensor::type() used to.
- Changed callsites of Tensor::type() appropriately.

Reviewed By: ezyang
Differential Revision: D14443117
fbshipit-source-id: 239ccb7a09626279a71d1a37f8f82e7f57bf7d9e
// This class specifies a Backend and a ScalarType. Currently, it primarily
// serves as a replacement return value for Tensor::type(). Previously,
// Tensor::type() returned Type&, but we are changing Type to not be
I think this comment is focusing on the wrong thing. The main point is that type served as two things:
- our dispatch mechanism
- a 'user-level' API for getting backend/ScalarType and a few other ops

Since we want to have freedom to change our dispatch mechanism, we want to keep these separate. You can then add what you wrote above.
FWIW, we only used it as (2) in c10d, since we care about grouping by type etc.
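A standalone sketch (assumed, simplified types; not the ATen headers) of the separation described above: the dispatch object and the user-level properties object become distinct things.

```cpp
namespace sketch {

enum class Backend { CPU, CUDA, Undefined };
enum class ScalarType { Float, Long, Undefined };

// User-level view: answers "what backend / what dtype is this tensor?".
// This is the role DeprecatedTypeProperties keeps for Tensor::type() callers.
struct DeprecatedTypeProperties {
  Backend backend;
  ScalarType scalar_type;
};

// Dispatch-level object: how operations get routed. Keeping it separate leaves
// the project free to make it dtype-agnostic (or replace it) later.
struct Type {
  Backend backend;
  // ... dispatch machinery lives here, no ScalarType once Type is dtype-agnostic ...
};

}  // namespace sketch
```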
Stack:
:black_circle: #17991 Introduce DeprecatedTypeProperties class 💚
:white_circle: #17601 Store ScalarType and Backend instead of Type in TensorIterator 💚
:white_circle: #17786 Pass ScalarType separately from Type in python constructors 💚
Changes:
- Breaks BC: Tensor::type() now returns DeprecatedTypeProperties& rather than Type&.
- Added DeprecatedTypeProperties; it serves as a temporary replacement for Type as the return value of Tensor::type(). This contributes to making Type just for dispatch purposes so that we can make it dtype agnostic.
- Tensor::dispatch_type() now returns Type& like Tensor::type() used to.
- Changed callsites of Tensor::type() appropriately.
Differential Revision: D14443117