Add nodes argument to genotype_matrix #1207

Open
hyanwong opened this issue Feb 17, 2021 · 5 comments
Labels
enhancement (New feature or request), Python API (Issue is about the Python API)

Comments

@hyanwong
Member

In #1202 (comment) @petrelharp suggested (and I agree) that it would be useful to add a `nodes` parameter to the `genotype_matrix()` method, not only to restrict the size of the matrix returned, but also to be able to output ancestral genotypes/haplotypes (see #1206).

If this is done, I suggest that we might want to reimplement the `haplotypes()` method to use `genotype_matrix()` rather than iterating over the `variants()`, as this would probably remove some fiddly code and also automatically provide a way of getting ancestral (string-format) haplotypes, if we ever need that.
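
To make the `haplotypes()` idea concrete, here is a minimal sketch of how string haplotypes could be derived from a genotype matrix. It is not tskit's implementation: the function name `haplotypes_from_matrix`, the `missing_char` parameter, and the plain-list inputs are hypothetical stand-ins, and the only assumption taken from tskit is that missing genotypes are coded as -1.

```python
MISSING = -1  # assumed missing-data code (tskit.MISSING_DATA is -1)


def haplotypes_from_matrix(G, alleles, missing_char="-"):
    """Hypothetical helper: turn a sites-by-nodes genotype matrix into strings.

    G[j][k] is the allele index at site j for the k-th requested node.
    alleles[j] is the tuple of allele strings for site j.
    """
    if len(G) != len(alleles):
        raise ValueError("need one allele tuple per site")
    num_nodes = len(G[0]) if G else 0
    haplotypes = []
    for k in range(num_nodes):
        chars = []
        for row, site_alleles in zip(G, alleles):
            g = row[k]
            # Missing genotypes become a placeholder character.
            chars.append(missing_char if g == MISSING else site_alleles[g])
        haplotypes.append("".join(chars))
    return haplotypes


# Made-up example data: three sites, three nodes, one missing genotype.
G = [
    [0, 1, 0],
    [1, 1, -1],
    [0, 0, 1],
]
alleles = [("A", "T"), ("C", "G"), ("G", "A")]
haps = haplotypes_from_matrix(G, alleles)
```

Because the columns are simply whichever nodes were requested, the same transposition would apply unchanged to ancestral (non-sample) nodes if the matrix included them.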

@jeromekelleher
Member

SGTM - should be a straightforward extension of the APIs in variants.

@hyanwong
Member Author

Great, thanks @jeromekelleher. I understand that you imagined it would be a bit fiddly to implement non-sample nodes together with `isolated_as_missing=True` in the variants code in C; I imagine this therefore also applies to the `_ll_tree_sequence.get_genotype_matrix()` method underlying the `genotype_matrix()` function. I'm just noting it in this issue for reference, as the ability to output ancestral haplotypes (with missing sections either side) was my original motivation for bringing this up.

@jeromekelleher
Member

I'm not sure there's actually much of a performance gain from building the genotype matrix in C; we can probably use the `variants()` method in Python, which would simplify things.
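
A rough sketch of the pure-Python route, under stated assumptions: the function only relies on each yielded object having a `genotypes` attribute, and the `Variant` namedtuple below is a hypothetical stand-in used so the example runs without tskit. The name `genotype_matrix_from_variants` is likewise made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects yielded by variants(); only the
# `genotypes` attribute is used in this sketch.
Variant = namedtuple("Variant", ["genotypes"])


def genotype_matrix_from_variants(variants):
    """Stack per-site genotype rows into a sites-by-nodes list of lists."""
    return [list(v.genotypes) for v in variants]


# Made-up data: two sites, genotypes for three requested nodes.
fake_variants = [Variant([0, 1, 0]), Variant([1, 1, -1])]
matrix = genotype_matrix_from_variants(fake_variants)
```

With tskit, the iterable would presumably come from something like `ts.variants(samples=nodes)`; whether non-sample nodes are accepted there, and how that interacts with `isolated_as_missing`, is exactly the open question in this issue.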

@petrelharp
Contributor

I think this is a duplicate of #678?

@jeromekelleher
Member

Not quite @petrelharp - there we're talking about having a `sites` argument as well, which would require random access to the trees to do properly.

@benjeffery added the enhancement and Python API labels on Apr 20, 2021
@benjeffery added this to the Python upcoming milestone on Apr 20, 2021

4 participants