About CAMUS dataset #4

Open
dengxl0520 opened this issue Sep 26, 2022 · 8 comments

@dengxl0520

I want to use the CAMUS dataset with this project, but I ran into a problem.
I used castor/vital/vital/data/camus/dataset_generator.py to generate the HDF5 file and got this error:

(castor) dengxiaolong@rtx2:~/code/castor/vital/vital/data/camus$ python dataset_generator.py /data/dengxiaolong/
Traceback (most recent call last):
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 353, in <module>
    main()
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 339, in main
    CrossValidationDatasetGenerator()(
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 106, in __call__
    fold_subset_patients = self.get_fold_subset_from_file(data, fold, subset_name_in_data)
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 136, in get_fold_subset_from_file
    with open(str(list_fn), "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/dengxiaolong/listSubGroups/subGroup1_training.txt'

How can I get 'subGroup1_training.txt'?
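
For context, the traceback shows the generator looking under `listSubGroups/` for one text file per fold and subset. A minimal sketch for checking which of these files a dataset directory is missing is given below. The helper name `missing_split_files` and the default of a single fold are assumptions made for illustration and are not part of vital.

```python
from pathlib import Path
from typing import List

# Subset names as they appear in the generator's file names.
SUBSETS = ("training", "validation", "testing")


def missing_split_files(data: Path, num_folds: int = 1) -> List[Path]:
    """Hypothetical helper: lists the split files expected under `data` that do not exist."""
    expected = [
        data / "listSubGroups" / f"subGroup{fold}_{subset}.txt"
        for fold in range(1, num_folds + 1)  # fold numbering starts at 1, as in the error above
        for subset in SUBSETS
    ]
    return [path for path in expected if not path.is_file()]
```

Calling `missing_split_files(Path("/data/dengxiaolong"))` would list the files the generator cannot find before it is run.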

@gungui98

@dengxl0520 do you have any update on this? I was shocked by the amount of over-engineering of their projects.

@gungui98

gungui98 commented Sep 30, 2022

@dengxl0520 I have just read through dataset_generator.py, and it looks like the team had access to a version of the CAMUS dataset with fully annotated sequences from ED to ES. That confuses me, because the public CAMUS training set does not contain these annotations.

https://github.com/vitalab/vital/blob/e5ce208b3263e781e7ed306b9f46b6e84134cc6a/vital/data/camus/dataset_generator.py#L240

@dengxl0520
Author

dengxl0520 commented Oct 2, 2022

@gungui98 About the dataset, see the paper https://arxiv.org/abs/2112.02102
It mentions a dataset called TED, which has fully annotated sequences from ED to ES.
But I still don't understand how to use this dataset...

@gungui98

gungui98 commented Oct 3, 2022

@dengxl0520 It seems they extended the original dataset to the full cycle by manual annotation. I have emailed the author to ask for the dataset but haven't received a response yet.

@dengxl0520
Author

dengxl0520 commented Oct 3, 2022

@gungui98 you can download it from here (https://humanheart-project.creatis.insa-lyon.fr/ted.html)

@gungui98

gungui98 commented Oct 3, 2022

@dengxl0520 I have successfully processed the dataset and got the h5 file. First, you have to run the script with the full-cycle option, using the dataset from the link you provided as input:
python dataset_generator.py --output ~/data/camus.h5 --sequence_type full_cycle ~/data/camus_full_cycle/TED/database/

I also skipped the k-fold part and simply split the dataset 80/10/10 into train/test/val by changing the function get_fold_subset_from_file in vital/vital/data/camus/dataset_generator.py to:

    def get_fold_subset_from_file(
            cls, data: Path, fold: int, subset: Literal["training", "validation", "testing"]
    ) -> List[str]:
        """Reads patient ids for a subset of a cross-validation configuration.

        Args:
            data: Path to the CAMUS root directory, under which the patient directories are stored.
            fold: ID of the test set for the cross-validation configuration.
            subset: Name of the subset for which to fetch patient IDs for the cross-validation configuration.

        Returns:
            IDs of the patients that are included in the subset of the fold.
        """
        # Original k-fold lookup, disabled:
        # list_fn = data / "listSubGroups" / f"subGroup{fold}_{subset}.txt"
        # # Open text file containing patient ids (one patient id by row)
        # with open(str(list_fn), "r") as f:
        #     patient_ids = [line for line in f.read().splitlines()]
        import glob

        # Every entry directly under the data directory is treated as a patient.
        # Note that glob returns full paths here, and `fold` is ignored.
        patient_ids = glob.glob(str(data / "*"))
        # patient_ids = sorted(patient_ids)
        # Fixed 80/10/10 split boundaries for train/test/val.
        train_end = int(len(patient_ids) * 0.8)
        test_end = int(len(patient_ids) * 0.9)
        if subset == "training":
            return patient_ids[:train_end]
        if subset == "testing":
            return patient_ids[train_end:test_end]
        return patient_ids[test_end:]

I will try to implement a correct, fixed k-fold split later, but this simple change was enough to get things running.
PS: I have also trained a model with this file, but using the CRISP project!
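
An alternative to editing get_fold_subset_from_file is to generate the missing split files, so that the original generator code can read them unchanged. The sketch below is only one way this might be done. The helper names `make_folds` and `write_split_files`, the seeded shuffle, and the rule for choosing each fold's validation chunk are all assumptions, not the authors' split. It also assumes that every directory directly under the dataset root is a patient directory.

```python
import random
from pathlib import Path
from typing import Dict, List


def make_folds(patient_ids: List[str], num_folds: int, seed: int = 0) -> Dict[int, Dict[str, List[str]]]:
    """Hypothetical helper: assigns patients to training/validation/testing for each fold."""
    if num_folds < 3:
        raise ValueError("need at least 3 folds to keep the three subsets disjoint")
    ids = sorted(patient_ids)  # sort first so the shuffle depends only on the seed
    random.Random(seed).shuffle(ids)
    chunks = [ids[i::num_folds] for i in range(num_folds)]
    folds = {}
    for fold in range(1, num_folds + 1):  # 1-based, matching subGroup1_training.txt
        test_idx, val_idx = fold - 1, fold % num_folds
        folds[fold] = {
            "training": sorted(
                pid for i, chunk in enumerate(chunks) if i not in (test_idx, val_idx) for pid in chunk
            ),
            "validation": sorted(chunks[val_idx]),
            "testing": sorted(chunks[test_idx]),
        }
    return folds


def write_split_files(data: Path, num_folds: int, seed: int = 0) -> None:
    """Writes listSubGroups/subGroup{fold}_{subset}.txt with one patient ID per row."""
    # Assumption: every directory directly under `data` is a patient directory.
    patient_ids = [p.name for p in data.iterdir() if p.is_dir() and p.name != "listSubGroups"]
    out_dir = data / "listSubGroups"
    out_dir.mkdir(exist_ok=True)
    for fold, subsets in make_folds(patient_ids, num_folds, seed).items():
        for subset, ids in subsets.items():
            (out_dir / f"subGroup{fold}_{subset}.txt").write_text("\n".join(ids) + "\n")
```

Each patient lands in exactly one testing subset across the folds, and the files contain patient directory names rather than full paths, which is what the commented-out reader expects.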

@dengxl0520
Author

dengxl0520 commented Oct 6, 2022

@gungui98 I modified dataset_generator.py the way you did and ran the script, but then I ran into another problem:

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 363, in <module>
    main()
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 349, in main
    CrossValidationDatasetGenerator()(
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 118, in __call__
    self._write_patient_data(dataset.create_group(patient_id))
  File "/home/dengxiaolong/code/castor/vital/vital/data/camus/dataset_generator.py", line 176, in _write_patient_data
    data_x_proc = resize_image(data_x, self.target_image_size, resample=Resampling.BILINEAR)
  File "/home/dengxiaolong/miniconda3/envs/castor/lib/python3.10/site-packages/vital/utils/image/transform.py", line 22, in resize_image
    resized_image = np.array(Image.fromarray(image).resize(size, resample=resample))
  File "/home/dengxiaolong/miniconda3/envs/castor/lib/python3.10/site-packages/PIL/Image.py", line 2955, in fromarray
    raise TypeError("Cannot handle this data type: %s, %s" % typekey) from e
TypeError: Cannot handle this data type: (1, 1, 748), |u1

@gungui98

gungui98 commented Oct 7, 2022

@dengxl0520 I'm not really sure about your problem, but this is the code that I used. The error could come from how the image data is read:

https://gist.github.com/gungui98/364e8f77930880132dee9704aca9a90d
