-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add order argument to TreeSequence.nodes #2370
Comments
Same for youngest nodes too, I guess. I don't know if returning an iterator over node objects would be nicer, e.g.
or something like that. |
There's a trade-off here between having useful methods and having a gigantic unwieldy API. Do you think this is a very common operation? My gut instinct here is that this is simple enough that it is a user operation. |
Yes, I see that this is a bit of an edge case. One issue is that people might do np.argmax(ts.tables.nodes.time) which would only return the first of the oldest nodes (which is what I did before realising my mistake). The recommended way is (I guess, as above) a two-liner, which people are less likely to use: node_times = ts.tables.nodes.time
np.flatnonzero(node_times == np.max(node_times)) Having that automatically available somehow would stop people making that mistake. Implementing it by restricting the node iterator, as I suggested above, would mean not adding another method, at the cost of increasing the calling parameters of |
Another point is that mostly people expect the node table to have the nodes in time order (because that's what is output by msprime) but this is not guaranteed. So having a documented method to iterate over nodes in time order would be a way to ensure that this was enforced in a routine. Something like the following doesn't seem too bad an idea (although the exact semantics could probably be improved):
Then the oldest node is I have seen a fair bit of code that assumes that the samples are the first N nodes, and that the oldest node is the last one, so it might be useful to have a method that can be used in the examples without having to jump through numpy hoops. |
Adding an Getting the k nodes that are exactly as old as each other is very niche I think and not worth a library function. |
Yes, I think this is nice and general. 👍 |
Changed issue title. def nodes(self, *, order=None):
order = "id" if order is None else order
if order not in ["id", "timedesc", "timeasc"]:
raise ValueError(...)
# do stuff |
In #2470 I noted that we want to break ties by using the node ID (i.e. |
I was just after getting the oldest node(s) in the tree sequence, which I did by
node_times = ts.tables.nodes.time; oldest = np.flatnonzero(node_times == np.max(node_times))
but beginners might not know how to do this sort of thing, which involves knowing about tables. Is it worth having a few wrapper functions for things like this (although I don't know where they would belong)?The text was updated successfully, but these errors were encountered: