2 nifti stream and patch transforms #28

Merged
merged 12 commits on Jan 22, 2020

Conversation

ericspod
Member

This includes code for implementing Nifti file reading as a data stream and streams for selecting parts of images as random patches or gridwise patches (gridwise sampling or windowing). The notebook in examples illustrates their use.

The idea behind the Nifti file reading is to define a stream which accepts file names as input, caches what has been loaded, and yields the loaded files with their headers if needed. No transformations are applied to the data or header beyond converting to a canonical ordering. If we should be doing specific things to clean up the files or enforce an expectation of array dimensions, we need to discuss what those would be, either as a change to NiftiCacheReader or as further transforms applied to the data coming from it. The header is also not transformed, so we could define a further transform to do that.
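A minimal sketch of such a caching reader (hypothetical names throughout; the actual NiftiCacheReader in this PR may differ, and `loader` stands in for something like nibabel's `nib.as_closest_canonical(nib.load(path))`):

```python
from collections import OrderedDict

class CachingImageReader:
    """Sketch of a stream that loads image files named by its input,
    caching results so repeated epochs avoid re-reading from disk.
    `loader` is a callable mapping path -> (array, header)."""

    def __init__(self, loader, max_items=32):
        self.loader = loader
        self.max_items = max_items
        self._cache = OrderedDict()  # path -> (array, header), LRU order

    def _load(self, path):
        if path in self._cache:
            self._cache.move_to_end(path)  # mark as recently used
        else:
            self._cache[path] = self.loader(path)
            if len(self._cache) > self.max_items:
                self._cache.popitem(last=False)  # evict least recently used
        return self._cache[path]

    def __call__(self, paths):
        # yields (array, header) tuples, matching the PR's tuple-based IO
        for path in paths:
            yield self._load(path)
```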

The transforms in patch_streams.py select parts of input images: slices along a dimension in order, random patches, or patches in a grid. The idea is to implement these as transforms so they can be composed in a data stream. Most of the code is in separate utility functions which operate on generic input and so would be useful elsewhere. The transforms should work for data of arbitrary dimensions, although they make no assumptions about which dimension is the channel; that is something the user would specify if we don't want to build in assumptions.
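As a rough illustration of the gridwise case (a simplified sketch, not the PR's actual iter_patch: edge patches here are truncated rather than padded, and no channel dimension is assumed):

```python
import itertools
import numpy as np

def iter_grid_patches(arr, patch_size):
    """Yield views over `arr` covering it in a non-overlapping grid.
    Works for arrays of arbitrary dimension; zip truncates to the
    shorter of arr.ndim and len(patch_size)."""
    ranges = [range(0, d, p) for d, p in zip(arr.shape, patch_size)]
    for start in itertools.product(*ranges):
        slices = tuple(slice(s, s + p) for s, p in zip(start, patch_size))
        yield arr[slices]
```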

One issue all of these raise is that the inputs and outputs are in tuple format. If we want to redefine all data stream types to use dicts for IO we would have to modify these, although the change should be minimal. We may also want to define transform streams that pass some data along unmodified, e.g. Nifti header dicts; this is another design discussion to have.
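One way the pass-through could look under the current tuple format (an illustrative helper, not part of this PR):

```python
def apply_to_first(xform):
    """Wrap an array transform so it applies only to the first tuple
    member, passing the remaining members (e.g. a Nifti header dict)
    through unmodified."""
    def _wrapped(item):
        img, *rest = item
        return (xform(img), *rest)
    return _wrapped
```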

@ericspod ericspod requested a review from wyli January 16, 2020 15:17
@wyli wyli requested a review from Nic-Ma January 16, 2020 15:21
@wyli
Contributor

wyli commented Jan 20, 2020

This PR actually addresses #2, #3 and #4. @ericspod, since you are working on these, could you update this PR to use 'vanilla' PyTorch as we discussed in the weekly meeting? (It should replace the existing datastream and transform objects currently in the repo.)

@ericspod
Member Author

Yes those changes are in progress.

@ericspod
Member Author

Here is the update to address #2 #3 #4. I need to refresh the segmentation example to use this code now.

@ericspod
Member Author

Example segmentation notebook added.

@wyli
Contributor

wyli commented Jan 21, 2020

Comparing this with PR #25 (which addresses #9, the intensity transform), both agree that transforms should be callable:

(1) (from PR #25)

    def __call__(self, data):
        assert data is not None and isinstance(data, dict), 'data must be in dict format with keys.'
        for key in self.apply_keys:

(2) (from this PR)

    def __call__(self, img):
        return rescale_array(img, self.minv, self.maxv, self.dtype)

but the format of the data input to the callables differs.
I feel option (2) is the more 'vanilla' implementation for this sprint.
we could merge this PR and postpone #25 for now, and have a refactoring as an immediate next step after the first sprint. @Nic-Ma what do you think?
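If we do merge this PR first, the option (2) transforms could later be lifted to the dict format of #25 with a thin adaptor along these lines (a sketch only; class and parameter names are illustrative, not from either PR):

```python
class ApplyToKeys:
    """Adapt an array-in/array-out transform to dict-format data by
    applying it to the values of the selected keys, leaving all other
    entries untouched."""

    def __init__(self, xform, keys):
        self.xform = xform
        self.keys = keys

    def __call__(self, data):
        assert data is not None and isinstance(data, dict), \
            'data must be in dict format with keys.'
        out = dict(data)  # shallow copy so the input dict is unchanged
        for key in self.keys:
            out[key] = self.xform(out[key])
        return out
```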

@Nic-Ma
Contributor

Nic-Ma commented Jan 22, 2020

Hi @ericspod ,

Thanks for your hard work, glad to see you completed all 3 tasks so quickly!
It also looks good to me that you followed the native PyTorch way for the MVP as we discussed.
I will update my PR to remove the dict-key operations for the MVP to be compatible with yours.
Could you please also remove our previous MultiFormatTransform and DataStream related code in this PR?
I think we need to keep the code tree simple and clean for this sprint, and we can discuss adding more complicated logic in the next sprints.
What do you think?
Thanks.

@Nic-Ma Nic-Ma left a comment

Looks good to me.

@Nic-Ma
Contributor

Nic-Ma commented Jan 22, 2020

Hi @ericspod ,

Thanks for your update, it looks good to me now.

Hi @wyli ,
Could you please also help review it?
If you don't have any more comments, I will merge it ASAP.
Thanks.

        yield tuple(slice(s, s + p) for s, p in zip(position[::-1], patch_size))


    def iter_patch(arr, patch_size, start_pos=(), copy_back=True, pad_mode="wrap", **pad_opts):
Contributor

I think this needs to be extended to support dense sampling grid (overlapping patches) so that it could be used for #39

Member Author

Yes I was thinking this too for other tasks.
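A dense grid could be sketched by advancing the start positions by a stride smaller than the patch size, so that patches overlap (a hypothetical helper to illustrate the idea, not a final API):

```python
import itertools
import numpy as np

def iter_dense_patches(arr, patch_size, stride):
    """Yield patches over `arr` on a dense sampling grid: start
    positions advance by `stride`, so patches overlap whenever
    stride < patch_size. Only full-size patches are yielded."""
    ranges = [range(0, max(d - p, 0) + 1, s)
              for d, p, s in zip(arr.shape, patch_size, stride)]
    for start in itertools.product(*ranges):
        yield arr[tuple(slice(st, st + p)
                        for st, p in zip(start, patch_size))]
```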

@wyli wyli deleted the 2-nifti-stream-and-patch-transforms branch May 21, 2020 13:29
3 participants