Rank searching mistakenly assumes taxonomy will be the first field of feature metadata #125

Closed
fedarko opened this issue May 19, 2019 · 0 comments · Fixed by #142
Labels
bug Something isn't working

Comments

@fedarko
Collaborator

fedarko commented May 19, 2019

Rank searching breaks with the sleep apnea test dataset. We should really add more tests for this (#2).

The problem is clear from https://github.com/fedarko/rankratioviz/blob/00d8085547a96944491df5c050b850f682fa53e4/rankratioviz/support_files/js/feature_computation.js#L84

Potential solutions:

  • Check each feature metadata field for taxonomy-like content (e.g. if a field contains semicolons, try treating it as taxonomy)?
  • Maybe the Python code should look for "taxonomy" as a header in the feature metadata and somehow pass that information to the JS side of things?
    • Actually: maybe if we find "taxonomy" as a feature metadata header, we just make that the first field of the feature IDs? This would break some of the Python tests, but it'd be pretty simple / intuitive to implement.
    • And if the user tries to do a taxonomy search when no taxonomy is available (e.g. because the Python script modifies a variable in main.js that is passed to RRVDisplay), we could warn the user: "hey, we couldn't find any taxonomy in the feature metadata, so searching by tax. rank won't work". Or we could just do this in Python.

Ok, right now I like the second option better.
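
For reference, a minimal sketch of what the second option could look like on the Python side. This is not the rankratioviz code: the function name `move_taxonomy_first`, the list-of-lists table layout, and the case-insensitive header match are all assumptions made for illustration.

```python
def move_taxonomy_first(columns, rows, header="taxonomy"):
    """Move the column named `header` to the front of a small table.

    columns: list of header strings; rows: list of lists aligned with columns.
    Returns (new_columns, new_rows, found).
    """
    # Hypothetical matching rule: case-insensitive, surrounding whitespace ignored.
    target = header.strip().lower()
    matches = [i for i, name in enumerate(columns)
               if name.strip().lower() == target]
    if not matches:
        # No taxonomy column: leave the data alone and let the caller warn.
        return list(columns), [list(r) for r in rows], False
    idx = matches[0]

    def reorder(seq):
        # Put the taxonomy value first, keep the other values in order.
        return [seq[idx]] + [v for j, v in enumerate(seq) if j != idx]

    return reorder(columns), [reorder(r) for r in rows], True


columns = ["Confidence", "Taxonomy"]
rows = [["0.9", "k__Bacteria; p__Firmicutes"]]
new_columns, new_rows, found = move_taxonomy_first(columns, rows)
if not found:
    print("Couldn't find any taxonomy in the feature metadata, "
          "so searching by taxonomic rank won't work.")
```

Returning a `found` flag keeps the warning decision with the caller, so it could be raised in Python or passed along to the JS side.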

@fedarko fedarko added the bug Something isn't working label May 19, 2019
@fedarko fedarko self-assigned this May 19, 2019
fedarko added a commit that referenced this issue May 19, 2019
Still broken, tho, as described in #125.

that being said p sure that #125 is older than this branch, so
I'll probably merge this branch in soon and then finish up #125
fedarko added a commit that referenced this issue May 22, 2019
This constitutes most of the python progress on #132 (and by extension
also #125).

Next up for #132:

-Add tooltip defs in the rank plot for all feature metadata cols.
 I guess we could do this in the python side of things, in the
 rank_chart definition.

-create a list of non-ranking feature data columns (all feature data
 fields, minus ranking columns, minus Feature ID and Classification)
 to populate two <select>s with all avail. feature metadata columns.

-Make the JS searching functionality look by selected feature metadata
 field (one select for num search, one select for den search).
        -Ignore features that have "null" for a given col (i.e. no row
         in the feature md file). This might cause some confusion if any
         metadata actually has "null" as a given string, but I *think*
         this should be ok. (Should add test cases; tagging #2 and #62.)
        -Support multiple searches (e.g. with multiple taxa).
                -MAYBE support searching by taxonomic rank as current?
                -IDK.
                -Also maybe more specialized searches by field type
                 (e.g. if the field is numeric, limit to ranges).
                -Probably goes beyond the scope of #132 tho.
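
A rough sketch of the column-list and null-skipping steps described in this commit message. It assumes each feature is a plain dict and that a missing metadata row shows up as `None`; the helper names `non_ranking_columns` and `search_by_field` are hypothetical, and the real searching lives in the JS code.

```python
def non_ranking_columns(feature_columns, ranking_columns,
                        excluded=("Feature ID", "Classification")):
    """All feature data fields, minus ranking columns, minus the excluded ones."""
    skip = set(ranking_columns) | set(excluded)
    return [c for c in feature_columns if c not in skip]


def search_by_field(features, field, query):
    """Return features whose `field` value contains `query` as a substring.

    Features with no value for `field` (None here, standing in for null)
    are ignored rather than treated as non-matching text.
    """
    hits = []
    for feat in features:
        value = feat.get(field)
        if value is None:
            continue
        if query in str(value):
            hits.append(feat)
    return hits


all_columns = ["Feature ID", "Rank 0", "Rank 1", "Classification",
               "Taxonomy", "Confidence"]
searchable = non_ranking_columns(all_columns, ["Rank 0", "Rank 1"])
```

In this sketch a literal string "null" in the metadata would still be searched, since only a real `None` is skipped; whether the JS side can make the same distinction is exactly the open question noted above.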
fedarko added a commit that referenced this issue May 23, 2019
All that's needed to reestablish most of the prior search
functionality (and close out #125) will be making filterFeatures()
accept the *entire* dataset (or perhaps just a subset based on
the search type) and look through that.
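
To illustrate the idea, here is a Python sketch of a filter that takes the whole dataset plus a named field, instead of assuming taxonomy is the first metadata field. The real `filterFeatures()` is JavaScript; the signature, the two search types, and the exact-match rule for ranks below are assumptions, not the project's actual behavior.

```python
def split_ranks(text):
    """Split taxonomy-like text on semicolons and commas into trimmed ranks."""
    parts = text.replace(",", ";").split(";")
    return [p.strip() for p in parts if p.strip()]


def filter_features(features, field, query, search_type):
    """Return the features whose `field` matches `query`.

    search_type "text": `query` must appear as a substring of the value.
    search_type "rank": any rank in `query` must equal a rank in the value.
    """
    if search_type not in ("text", "rank"):
        raise ValueError(f"unknown search type: {search_type!r}")
    wanted = set(split_ranks(query)) if search_type == "rank" else set()
    hits = []
    for feat in features:
        value = feat.get(field)
        if value is None:
            # No metadata for this feature in the chosen field.
            continue
        if search_type == "text":
            matched = query in str(value)
        else:
            matched = bool(wanted & set(split_ranks(str(value))))
        if matched:
            hits.append(feat)
    return hits
```

Because the field is passed in by name, the position of the taxonomy column in the feature metadata no longer matters.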
fedarko added a commit that referenced this issue May 24, 2019
Shouldn't be *too* difficult to modify filterFeatures() to detect
the search type and then apply that, same as before.

As discussed in #132 and the various commit messages that reference
it, I'd like to eventually support "joint" queries where you can
filter on multiple criteria (e.g. "contains this text ... and
contains these taxonomic ranks ... and has a confidence greater than
..."), but that sounds like a ton of work. For now, just getting
back to the previous functionality in a bug-free state (i.e. with
issue #125 knocked out) will be good enough.

...So future issues to make after #132 and #125 are:
    -support searching by ranges on numerical feature metadata fields
    -support joint queries across multiple feature metadata fields
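
One possible shape for those two future features, sketched in Python purely as an illustration: each criterion is a small predicate, and a joint query keeps the features that satisfy all of them. The names `contains`, `in_range`, and `joint_filter` are hypothetical, and the inclusive bounds in `in_range` are an arbitrary choice.

```python
def contains(field, text):
    """Predicate: the feature's `field` value contains `text`."""
    def pred(feat):
        value = feat.get(field)
        return value is not None and text in str(value)
    return pred


def in_range(field, low=None, high=None):
    """Predicate: the feature's `field` is numeric and within [low, high]."""
    def pred(feat):
        try:
            value = float(feat.get(field))
        except (TypeError, ValueError):
            # Missing or non-numeric values never match a range search.
            return False
        if low is not None and not value >= low:
            return False
        if high is not None and not value <= high:
            return False
        return True
    return pred


def joint_filter(features, predicates):
    """Keep the features that satisfy every predicate."""
    return [f for f in features if all(p(f) for p in predicates)]
```

For example, `joint_filter(features, [contains("Taxonomy", "Firmicutes"), in_range("Confidence", low=0.8)])` would express "contains this text and has a confidence of at least 0.8".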
fedarko added a commit that referenced this issue May 26, 2019
gonna do some manual sanity checking, then we can merge this branch
in and close #125 (and probs a few other issues) :D