Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial framework for balance metrics and tests #2246

Merged
merged 1 commit into from
May 10, 2022

Conversation

jeromekelleher
Copy link
Member

This is the first step for adding tree balance metrics as discussed in #2245.

If we agree with this approach, we can follow up with some specific issues to (a) add other metrics; (b) document them; and (c) implement them in C.

@codecov
Copy link

codecov bot commented May 9, 2022

Codecov Report

Merging #2246 (cb057e7) into main (db7af18) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #2246   +/-   ##
=======================================
  Coverage   93.31%   93.31%           
=======================================
  Files          27       27           
  Lines       26025    26035   +10     
  Branches     1166     1170    +4     
=======================================
+ Hits        24284    24294   +10     
  Misses       1711     1711           
  Partials       30       30           
Flag Coverage Δ
c-tests 92.23% <ø> (ø)
lwt-tests 89.05% <ø> (ø)
python-c-tests 71.96% <10.00%> (-0.05%) ⬇️
python-tests 98.88% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
python/tskit/trees.py 98.02% <100.00%> (+0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update db7af18...cb057e7. Read the comment docs.

@jeromekelleher
Copy link
Member Author

@jeremyguez would you mind taking a quick look here to see what you think? Are there any other tree topology examples we should include here? The idea is that we'll add tests for each of the balance metrics to the classes as we add them, so we can see the what the values are for these simple cases, easily.

Copy link
Member

@benjeffery benjeffery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I wonder if we should go for defining these topologies in an examples.py though.

assert tree.sackin_index() == sackin_index_definition(tree)


class TestBalancedBinaryOdd:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's quite a bit of repeated boilerplate here for what is looping over several different topologies. If we're doing #1804 then should these trees be defined in examples.py then this file is a test that checks a dict of tree_name->value pairs as a single parameterised test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a tension here though between avoiding repetition and being able to see what you're testing. I find it really helpful to be able to just look at the example and see what it is your testing, and see what the result should be. It helps a lot to be able to fine-tune the examples for specific things that you're doing, too, rather than trying to come up with a universal set of examples that cover all possibilities.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, might not be obvious here but the idea is to put test functions for all the other metrics into these classes, so the boilder-plate fraction will go down quite a bit.

@jeremyguez
Copy link
Contributor

@jeromekelleher
The tested topologies look good enough to me. (We could also have the case of mixed arities in one tree, but I'm not sure if it matters much.)

@jeromekelleher
Copy link
Member Author

OK, great, thanks @jeremyguez. I don't think there's much need to have mixed arity in a tree, as these examples are really just to make it easy for us to see what the answer is in some representative situations. The systematic testing is done in the pytest.mark.parametrize where we compare the definition with the implementation over lots of examples.

OK, going to merge this then. Are you happy to pick this up from here @jeremyguez? (I'll do the C version of this now, so you have examples to follow there)

@jeromekelleher jeromekelleher added the AUTOMERGE-REQUESTED Ask Mergify to merge this PR label May 10, 2022
@jeremyguez
Copy link
Contributor

@jeromekelleher Yes, I started to look at #2249.

@mergify mergify bot merged commit 5bff374 into tskit-dev:main May 10, 2022
@mergify mergify bot removed the AUTOMERGE-REQUESTED Ask Mergify to merge this PR label May 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants