ENH: Adds new format GenomeDataDirectoryFormat with genome_dict function #345
Hi @misialq Also in the same action there are these checks if the files match the pathspec:
So it checks if the files end in fasta or fa. But isn't this redundant, since the validation of the format only allows files with those extensions anyway? Or am I missing something? I am asking because I copied that sample_dict function and modified it.
Hey @VinzentRisch, So for the Re: matching the pathspec: that is also a good question. Technically, it should not be required since, as you wrote, only those files are allowed. I guess that was more of a failsafe, but you could probably skip it.
Hey @VinzentRisch - so how about being a bit more generous and adding those methods to the remaining directory formats here? Loci and Genomes could also make use of those and that's what was also described in the original issue. 🙏
Hi @misialq
Hey @VinzentRisch, looks good, thanks! 🚀
@lizgehret would you like to look this through?
this all looks reasonable to me, thanks @VinzentRisch!
closes #344
For easier handling of GenesDirectoryFormat, ProteinsDirectoryFormat, LociDirectoryFormat and DNASequencesDirectoryFormat in q2-amrfinderplus, I created a new DirFormat called GenomeDataDirectoryFormat that they all inherit from. This new dirformat has a function called genome_dict.
GenesDirectoryFormat and ProteinsDirectoryFormat can contain files in per-sample directories or not. So, depending on the directory structure, the output of genome_dict will be similar to sample_dict of MultiMAGSequencesDirFmt or feature_dict of MAGSequencesDirFmt.
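To illustrate the two output shapes described above, here is a minimal standalone sketch. It is not the implementation from this PR: the free function genome_dict, the helper _fasta_files, the use of the file stem as the ID, and the rule "any subdirectory means per-sample layout" are all assumptions made for illustration. Only the fasta/fa extensions come from the discussion above.

```python
import tempfile
from pathlib import Path

# Extensions mentioned in the conversation ("fasta or fa").
FASTA_SUFFIXES = {".fasta", ".fa"}


def _fasta_files(directory):
    """Hypothetical helper: map file stem -> path for FASTA files in one directory."""
    files = sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix in FASTA_SUFFIXES
    )
    return {p.stem: str(p) for p in files}


def genome_dict(root):
    """Sketch of the two shapes (assumed logic, not the q2-types code).

    - Flat layout:       {genome_id: path}
    - Per-sample layout: {sample_id: {genome_id: path}}
    """
    root = Path(root)
    sample_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    if sample_dirs:
        # Assumption: if subdirectories exist, files directly in root are ignored.
        return {d.name: _fasta_files(d) for d in sample_dirs}
    return _fasta_files(root)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Flat layout, comparable to feature_dict.
        flat = Path(tmp) / "flat"
        flat.mkdir()
        (flat / "genome1.fasta").touch()
        (flat / "genome2.fa").touch()
        print(genome_dict(flat))

        # Per-sample layout, comparable to sample_dict.
        nested = Path(tmp) / "nested"
        (nested / "sample1").mkdir(parents=True)
        (nested / "sample1" / "genome1.fasta").touch()
        print(genome_dict(nested))
```

The flat case yields a single-level mapping, while the per-sample case nests the same mapping under each sample directory name, which is the distinction the description draws between feature_dict and sample_dict.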