
Fix empty tensor for unique_dim #19000

Closed · wants to merge 25 commits

Conversation

@codexetreme (Contributor)

Fixes: #18408

cc: @zasdfgbnm

@zasdfgbnm (Collaborator) left a comment

Hi @codexetreme, could you please complete this PR first, i.e. add a unit test and apply the same check on the GPU side? By adding a unit test, you can mostly find problems yourself and fix them without having to ask for help; the error messages in CircleCI should be fairly helpful. Also, please resolve the merge conflicts.

@codexetreme (Contributor, Author) left a comment

I have fixed the merge conflicts and the other irregularities you mentioned. I am having a bit of difficulty writing the tests, though: where exactly are they supposed to go? There are many test folders. Should I add the tests to test_torch.py or to some other test folder?

@zasdfgbnm (Collaborator)

@codexetreme Sorry for the late reply. Yes, the test belongs in test_torch.py; just search for test_unique_dim in that file. I will check your work later, but in the meantime, could you resolve the merge conflicts again?
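For reference, the test being asked for might look roughly like this (a minimal sketch only; the tensor names and exact assertions are illustrative, assuming the post-fix behavior discussed in this thread, not the code that ultimately landed in test_torch.py):

import torch

def test_unique_dim_empty(self):
    # Well-formed empty input: only the dimension being uniqued has size 0.
    x_empty = torch.empty(5, 0)
    out, inverse, counts = torch.unique(
        x_empty, return_inverse=True, return_counts=True, dim=1)
    # There are no slices along dim 1, so all three outputs are empty.
    self.assertEqual(torch.Size([5, 0]), out.shape)
    self.assertEqual(torch.Size([0]), inverse.shape)
    self.assertEqual(torch.Size([0]), counts.shape)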

@zasdfgbnm self-assigned this on Apr 11, 2019
@zasdfgbnm (Collaborator)

BTW, I am also working on unique, so my own changes may frequently cause merge conflicts here. You might prefer to continue your work and resolve the conflicts at the end, which could be less trouble. It's your choice.

@zasdfgbnm (Collaborator)

@codexetreme Also, whenever you see build failures from CI, could you please check where the error is and try to fix it? :)

@codexetreme (Contributor, Author) commented Apr 11, 2019

Alright, I am looking into the CI logs... I'll write the tests and try to fix the CI bugs.

> You might prefer to continue your work and resolve the conflicts at the end, which could be less trouble.

I will do this...

@codexetreme (Contributor, Author)

In some of the CI logs, the reported error is:

Reading package lists...
Building dependency tree...
Reading state information...
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
nvidia-docker2 : Depends: nvidia-container-runtime (= 2.0.0+docker18.06.2-1) but 2.0.0+docker18.06.2-2 is to be installed
E: Unable to correct problems, you have held broken packages.
Exited with code 100

Does this mean that the containers were not instantiated properly, or that the code in the commit is broken? And what should be done about these failures?

@zasdfgbnm (Collaborator)

@codexetreme For this kind of failure, you can comment "@pytorchbot retest this please".

@codexetreme (Contributor, Author)

@pytorchbot retest this please

@zasdfgbnm (Collaborator)

@codexetreme The bot is currently broken. You may try again next week.

@zasdfgbnm (Collaborator)

@pytorchbot retest this please

@codexetreme (Contributor, Author)

The same issue persists with the tests. What should be done?

@zasdfgbnm (Collaborator)

@codexetreme I think you need to resolve the conflicts and merge with the latest master branch now; the CircleCI configuration must have been updated.

@codexetreme (Contributor, Author) commented Apr 18, 2019

Sorry for the late reply (uni assignments take up time :( ). Could you tell me what the return_counts bool is for? My code is failing because I am not returning the third tensor that should hold the counts. As I understand it, if the input tensor is empty, should I just return an empty counts tensor as well (since there are no elements, the counts would be empty anyway)?

@zasdfgbnm (Collaborator)

@codexetreme Yes
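Concretely, the behavior being agreed on here is something like the following (a sketch of the expected post-fix semantics, not output taken from the PR):

import torch

x = torch.randn(5, 0)  # empty along dim 1
out, counts = torch.unique(x, return_counts=True, dim=1)
print(out.shape)     # torch.Size([5, 0]) -- no unique slices
print(counts.shape)  # torch.Size([0])   -- no elements, so counts is empty too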

@codexetreme (Contributor, Author) commented Apr 23, 2019

In the Unique.cu file, input_flat is declared twice in unique_dim_cuda_template, and this gives the error:

Apr 23 07:48:27 /var/lib/jenkins/workspace/aten/src/ATen/native/cuda/Unique.cu(168): error: "input_flat" has already been declared in the current scope

A simple fix is to just rename it at these usages, since I don't see these particular lines being used anywhere else:

Tensor input_flat = self.transpose(dim, 0);
auto orig_sizes = input_flat.sizes().vec();
input_flat = input_flat.contiguous().view({input_flat.size(0), -1});

@codexetreme (Contributor, Author)

I have made the changes you requested and changed the logic so that it now handles the self.size(dim) != 0 case as well. Could you please check and let me know whether it is right?

@zasdfgbnm (Collaborator) left a comment

It looks much better now. Could you please also fix the linter failures in Travis CI?

./test/test_torch.py:10963:17: E128 continuation line under-indented for visual indent
./test/test_torch.py:10964:17: E128 continuation line under-indented for visual indent
./test/test_torch.py:10965:17: E128 continuation line under-indented for visual indent
./test/test_torch.py:10966:17: E128 continuation line under-indented for visual indent
./test/test_torch.py:10969:17: E128 continuation line under-indented for visual indent
./test/test_torch.py:10970:17: E128 continuation line under-indented for visual indent
./test/test_torch.py:10971:17: E128 continuation line under-indented for visual indent
./test/test_torch.py:10972:17: E128 continuation line under-indented for visual indent
The command "flake8" exited with 1.
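For context, E128 fires when a continuation line is not aligned with the opening delimiter that established the visual indent, and the fix is alignment, roughly like this (illustrative code only, not the actual lines from test_torch.py):

# Under-indented relative to the opening '(' -- flake8 reports E128:
out = torch.unique(x, return_inverse=True,
    return_counts=True, dim=1)

# Aligned with the opening '(' -- passes flake8:
out = torch.unique(x, return_inverse=True,
                   return_counts=True, dim=1)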

@codexetreme (Contributor, Author) commented May 5, 2019

OK, I have fixed the linting. What should I do next?

@codexetreme (Contributor, Author)

Why is that one particular test failing? The other CUDA-based tests seem to have passed...

@zasdfgbnm (Collaborator) left a comment

@codexetreme That failure is not your fault, just ignore it. The code LGTM now. @VitalyFedyunin Could you please take a look and merge?

@zasdfgbnm changed the title from "[WIP] Fix empty tensor for unique_dim" to "Fix empty tensor for unique_dim" on May 7, 2019
@codexetreme (Contributor, Author)

When will this be merged? Kinda curious...

@zasdfgbnm (Collaborator)

@pytorchbot rebase this please

@pytorchbot added the labels "module: cuda" (Related to torch.cuda, and CUDA support in general) and "module: operators" on May 12, 2019
@zasdfgbnm (Collaborator) commented May 12, 2019

Kindly pinging @VitalyFedyunin :)

Could you please take a look at this? There are no API changes, and it should be easy to review.

@VitalyFedyunin (Contributor)

Sure. I was hoping to have #20017 fixed first (it helps to write tests quickly); I guess I will use a workaround.

// check how many zero dimentions exist
auto num_zero_dims = std::count(sizes.begin(), sizes.end(), 0);

// tensor is not well formed as it has 0 sized dimentions
Contributor

nit: dimenSions, here and everywhere

@codexetreme (Contributor, Author)

fixed this :)
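For reference, the well-formed vs. ill-formed distinction implemented by the snippet above can be seen from Python roughly as follows (a sketch assuming the semantics discussed in this PR; the shapes are illustrative):

import torch

# Well formed: the dimension being uniqued is the only zero-sized one.
ok = torch.empty(5, 0)
torch.unique(ok, dim=1)    # returns an empty (5, 0) result

# Ill formed: a zero-sized dimension other than the one being uniqued.
bad = torch.empty(5, 0, 0)
try:
    torch.unique(bad, dim=1)
except RuntimeError as err:
    print("ill-formed:", err)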


# test not a well formed tensor
# Checking for runtime error, as this is the expected behaviour
self.assertRaises(
Contributor

nit: It is cleaner to write

with self.assertRaises(RuntimeError):
    torch.unique(x_ill_formed_empty,
                 return_inverse=True,
                 return_counts=True,
                 dim=1)

@codexetreme (Contributor, Author)

I have made the change as suggested :)

@facebook-github-bot (Contributor) left a comment

@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@VitalyFedyunin (Contributor)

Could you please clean up the trailing whitespace?

@facebook-github-bot (Contributor) left a comment

@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request May 29, 2019
Summary:
Fixes: #18408

cc: zasdfgbnm
Pull Request resolved: pytorch/pytorch#19000

Reviewed By: ezyang

Differential Revision: D15470136

Pulled By: VitalyFedyunin

fbshipit-source-id: daf71566b4dbdc91927d164f813b5ee8645af1a2
@facebook-github-bot (Contributor)

@VitalyFedyunin merged this pull request in 0ffd20c.

Successfully merging this pull request may close this issue:

_unique_dim does not support zero size tensor (#18408)