Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align methods for #858 #859

Merged
merged 18 commits into from
Jul 28, 2021
Merged

Align methods for #858 #859

merged 18 commits into from
Jul 28, 2021

Conversation

mortonjt
Copy link
Contributor

@mortonjt mortonjt commented Jul 26, 2021

This pull request adds alignment functionality, to allow the biom table to be aligned against metadata (represented as pd.DataFrame) and trees (represented as skbio.TreeNode).

I have made this a little more general, allowing these alignment methods to be performed on both axes (sample and observations).

Once we have agreed on the basic user interface, I will push in the remaining unittests.

Outstanding questions

  1. Should we enable an inplace option? Maybe useful for massive datasets.
  2. Should we enable a how or join flag, specifying if this is going to be an inner, outer, left or right join? Right now, I'm only implementing an inner flag since those are my only use cases. But if there is a compelling use case, I am open for discussion.

Copy link
Member

@wasade wasade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! There are a few suggestions to make the code more direct. If this is based on an existing implementation, it may be worth verifying the sorting works as expected as the prior use of sort_f I don't believe would produce a correct table

biom/table.py Outdated Show resolved Hide resolved
biom/table.py Outdated Show resolved Hide resolved
biom/table.py Outdated Show resolved Hide resolved
biom/table.py Outdated Show resolved Hide resolved
biom/table.py Outdated Show resolved Hide resolved
biom/table.py Outdated Show resolved Hide resolved
biom/table.py Outdated Show resolved Hide resolved
biom/table.py Outdated Show resolved Hide resolved
@wasade
Copy link
Member

wasade commented Jul 27, 2021

Inner is fine. Tests would be good :) Possible to address the failure items (looks like lint), and to also include a changelog note?

@mortonjt
Copy link
Contributor Author

mortonjt commented Jul 27, 2021

@wasade thank you for the thorough review - still a little struck how the tree alignment could have been off this whole time.
The code here seems to be working, but I may need to double check my other workflows once this is merged in ...

EDIT : there are some failing tests that don't appear to be relevant to this PR, for instance.

test_table.py:668: in test_from_hdf5_custom_parsers
    self.assertIn(m['BODY_SITE'], ('GUT', 'SKIN'))
E   AssertionError: b'GUT' not found in ('GUT', 'SKIN')

ChangeLog.md Outdated Show resolved Hide resolved
Copy link
Member

@wasade wasade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! As written, the skbio tests would never have ran btw. Can CI be updated as well to install skbio? I would like CI to pass for at least some of the builds in the matrix before merge, and I can adjust later some builds fail

biom/table.py Outdated Show resolved Hide resolved
biom/table.py Show resolved Hide resolved
biom/tests/test_table.py Outdated Show resolved Hide resolved
biom/tests/test_table.py Outdated Show resolved Hide resolved
biom/tests/test_table.py Show resolved Hide resolved
biom/tests/test_table.py Outdated Show resolved Hide resolved
biom/tests/test_table.py Outdated Show resolved Hide resolved
mortonjt and others added 9 commits July 27, 2021 12:58
@mortonjt
Copy link
Contributor Author

Just added in doctests - I believe the outstanding comments have been addressed.

@wasade
Copy link
Member

wasade commented Jul 28, 2021

Wonderful, thank you @mortonjt!!

@wasade wasade merged commit c3cd89f into biocore:master Jul 28, 2021
@mortonjt
Copy link
Contributor Author

mortonjt commented Jul 28, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants