More sophisticated feature metadata annotation #132
Labels:
- enhancement: New feature or request
- important: Things that are critical for getting Qurro in a working/useful state
Comments
This was referenced May 21, 2019
fedarko added a commit that referenced this issue on May 22, 2019
This constitutes most of the Python progress on #132 (and by extension also #125).

Next up for #132:
- Add tooltip defs in the rank plot for all feature metadata cols. I guess we could do this on the Python side of things, in the rank_chart definition.
- Create a list of non-ranking feature data columns (all feature data fields, minus ranking columns, minus Feature ID and Classification) to populate two <select>s with all available feature metadata columns.
- Make the JS searching functionality look by selected feature metadata field (one select for the numerator search, one select for the denominator search).
- Ignore features that have "null" for a given col (i.e. no row in the feature metadata file). This might cause some confusion if any metadata actually has "null" as a given string, but I *think* this should be ok. (Should add test cases; tagging #2 and #62.)
- Support multiple searches (e.g. with multiple taxa).
  - MAYBE support searching by taxonomic rank as is currently done?
  - IDK.
- Also maybe more specialized searches by field type (e.g. if the field is numeric, limit to ranges).
  - Probably goes beyond the scope of #132 tho.
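A minimal sketch of the "non-ranking feature data columns" step from the list above. The helper name `getFeatureMetadataColumns`, the example ranking column names, and the shape of a feature row are all assumptions made for illustration; none of them are taken from Qurro's code.

```javascript
// Hypothetical helper (not Qurro's API): given one feature "row" from the
// rank plot data and the names of the ranking columns, return the remaining
// feature metadata column names.
function getFeatureMetadataColumns(featureRow, rankingColumns) {
    // Columns that should never show up as searchable metadata fields.
    const excluded = new Set([
        "Feature ID",
        "Classification",
        ...rankingColumns,
    ]);
    return Object.keys(featureRow).filter((col) => !excluded.has(col));
}

// Assumed example row; "Rank 1" and "Rank 2" stand in for ranking columns.
const exampleRow = {
    "Feature ID": "F1",
    Classification: "None",
    "Rank 1": -0.42,
    "Rank 2": 1.3,
    Taxon: "k__Bacteria; p__Firmicutes",
    Confidence: 0.98,
};

const metadataColumns = getFeatureMetadataColumns(exampleRow, [
    "Rank 1",
    "Rank 2",
]);
console.log(metadataColumns); // ["Taxon", "Confidence"]
```

The resulting array could then be used to fill both search dropdowns, so the numerator and denominator searches always offer the same set of fields.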
One TODO: use column name and/or type to do cool stuff in searching, e.g. if it's "Taxonomy" or "Taxon" then split by semicolons?
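One way the semicolon-splitting idea in this TODO might look, as a sketch only. The function names (`isTaxonomyColumn`, `splitTaxonomy`, `hasRank`) and the example taxonomy string are hypothetical.

```javascript
// Hypothetical: decide whether a column should get taxonomy-aware searching,
// based only on its name (case-insensitive).
function isTaxonomyColumn(columnName) {
    const name = columnName.trim().toLowerCase();
    return name === "taxonomy" || name === "taxon";
}

// Split a semicolon-delimited taxonomy string into trimmed, non-empty ranks.
function splitTaxonomy(taxonomy) {
    return taxonomy
        .split(";")
        .map((rank) => rank.trim())
        .filter((rank) => rank.length > 0);
}

// True only if one of the ranks is exactly equal to the query.
function hasRank(taxonomy, queryRank) {
    return splitTaxonomy(taxonomy).includes(queryRank);
}

const taxonomy = "k__Bacteria; p__Firmicutes; c__Bacilli";
console.log(splitTaxonomy(taxonomy)); // ["k__Bacteria", "p__Firmicutes", "c__Bacilli"]
console.log(hasRank(taxonomy, "p__Firmicutes")); // true
console.log(hasRank(taxonomy, "Firmicutes")); // false
```

Matching whole ranks instead of substrings means a query like "Firmicutes" would not match "p__Firmicutes" unless the prefix is included. Whether that is the desired behavior is a design decision this sketch does not settle.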
fedarko added a commit that referenced this issue on May 22, 2019
Also stores feature metadata cols in the rank plot JSON -- will be super easy to retrieve and use these in the viz interface #132
fedarko added a commit that referenced this issue on May 23, 2019
All that's needed to reestablish most of the prior search functionality (and close out #125) will be making filterFeatures() accept the *entire* dataset (or perhaps just a subset based on the search type) and look through that.
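A rough sketch of what searching the entire dataset by a selected field could look like. This is not the actual `filterFeatures()` from Qurro, whose signature is not shown here. The name `filterFeaturesSketch`, its parameters, and the assumptions that missing metadata arrives as `null` and that non-string values are skipped are all illustrative.

```javascript
// Hypothetical stand-in for a field-aware search over all feature rows.
// featureRows: array of objects, one per feature (assumed shape).
// inputText: the text the user typed.
// featureMetadataField: the column name chosen in the search dropdown.
function filterFeaturesSketch(featureRows, inputText, featureMetadataField) {
    const matchingIDs = [];
    // An empty query matches nothing, rather than everything.
    if (inputText.length === 0) {
        return matchingIDs;
    }
    for (const row of featureRows) {
        const value = row[featureMetadataField];
        // Skip features with no metadata for this field (null/undefined)
        // and, in this sketch, any non-string values such as numbers.
        if (typeof value !== "string") {
            continue;
        }
        if (value.includes(inputText)) {
            matchingIDs.push(row["Feature ID"]);
        }
    }
    return matchingIDs;
}

const featureRows = [
    { "Feature ID": "F1", Taxon: "k__Bacteria; p__Firmicutes" },
    { "Feature ID": "F2", Taxon: null },
    { "Feature ID": "F3", Taxon: "k__Archaea" },
];
console.log(filterFeaturesSketch(featureRows, "Bacteria", "Taxon")); // ["F1"]
```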
fedarko added a commit that referenced this issue on May 23, 2019
Notes:
1) Ignores numeric feature metadata values. In the future, we should support detecting these types, or at least convert them to strings before doing text searching.
2) No "rank search" equivalent yet, but users can still include semicolons in the search. Might reimplement this if featureMetadataField is "Taxon" or "Taxonomy" or something, but not urgent IMO.
3) Would be nice to add multiple queries for searching, e.g. Confidence > 0.95 AND Taxon contains text "p__Firmicutes".
4) I'd like to include at least the taxonomy in the feature lists in the textareas. It should be doable to "redo" the annotation process in JS.
fedarko added a commit that referenced this issue on May 23, 2019
Also made the search functionality now output the entire data object (i.e. the entire "row" for a given feature) instead of just the feature ID. This will let us eventually customize how selected features appear in the textareas.
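Since the search now returns whole rows, customizing how selected features appear could be as simple as the sketch below. The function names, the `" | "` separator, and the `fieldsToShow` parameter are assumptions for illustration, not part of Qurro.

```javascript
// Hypothetical formatter: build one textarea line for a feature row by
// joining its ID with any requested metadata fields that have a value.
function formatFeatureForTextarea(row, fieldsToShow) {
    const parts = [row["Feature ID"]];
    for (const field of fieldsToShow) {
        const value = row[field];
        // Leave out fields with no metadata instead of printing "null".
        if (value !== null && value !== undefined) {
            parts.push(String(value));
        }
    }
    return parts.join(" | ");
}

// One line per feature, suitable for assigning to a textarea's value.
function formatFeatureList(rows, fieldsToShow) {
    return rows
        .map((row) => formatFeatureForTextarea(row, fieldsToShow))
        .join("\n");
}

const selectedRows = [
    { "Feature ID": "F1", Taxon: "k__Bacteria; p__Firmicutes", Confidence: 0.98 },
    { "Feature ID": "F2", Taxon: null, Confidence: null },
];
console.log(formatFeatureList(selectedRows, ["Taxon"]));
// F1 | k__Bacteria; p__Firmicutes
// F2
```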
fedarko added a commit that referenced this issue on May 24, 2019
Shouldn't be *too* difficult to modify filterFeatures() to detect the search type and then apply that, same as before. As discussed in #132 and the various commit messages that reference it, I'd like to eventually support "joint" queries where you can filter on multiple criteria (e.g. "contains this text ... and contains these taxonomic ranks ... and has a confidence greater than ..."), but that sounds like a ton of work. For now, just getting back to the previous functionality in a bug-free state (i.e. with issue #125 knocked out) will be good enough.

So future issues to make after #132 and #125 are:
- Support searching by ranges on numerical feature metadata fields
- Support joint queries across multiple feature metadata fields
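To give a sense of the shape such "joint" queries might take, here is a small sketch that ANDs several criteria together. The criterion object format (`field`, `type`, `value`), the type names, and the function names are invented for this example and are not an existing Qurro interface.

```javascript
// Hypothetical: check one row against one criterion.
// Supported (assumed) types: "text", "greaterThan", "lessThan".
function matchesCriterion(row, criterion) {
    const value = row[criterion.field];
    // Rows with no metadata for this field never match.
    if (value === null || value === undefined) {
        return false;
    }
    switch (criterion.type) {
        case "text":
            // Convert to a string first so numeric values can be searched too.
            return String(value).includes(criterion.value);
        case "greaterThan":
            return typeof value === "number" && value > criterion.value;
        case "lessThan":
            return typeof value === "number" && value < criterion.value;
        default:
            throw new Error("Unknown criterion type: " + criterion.type);
    }
}

// Keep only the rows that satisfy every criterion (logical AND).
function jointFilter(rows, criteria) {
    return rows.filter((row) =>
        criteria.every((criterion) => matchesCriterion(row, criterion))
    );
}

const rows = [
    { "Feature ID": "F1", Taxon: "k__Bacteria; p__Firmicutes", Confidence: 0.98 },
    { "Feature ID": "F2", Taxon: "k__Bacteria; p__Firmicutes", Confidence: 0.8 },
    { "Feature ID": "F3", Taxon: "k__Bacteria; p__Proteobacteria", Confidence: 0.99 },
    { "Feature ID": "F4", Taxon: null, Confidence: null },
];
const matches = jointFilter(rows, [
    { field: "Confidence", type: "greaterThan", value: 0.95 },
    { field: "Taxon", type: "text", value: "p__Firmicutes" },
]);
console.log(matches.map((row) => row["Feature ID"])); // ["F1"]
```

Note that with an empty criteria list this sketch returns every row, because `every` is true for an empty array. A real interface would probably want to handle that case explicitly.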
This was referenced May 26, 2019
This would preserve the original feature IDs. Instead of annotating them by adding on a `|` next to each feature metadata field to make a really long ID, this would store feature metadata somewhere accessible by `RRVDisplay` and Vega (probably as a property of each feature in the rank plot data). Then this data would show up naturally split up in the rank plot tooltips, which would look nice.

Also, the more important benefit of this: we could also store all the feature metadata column names (plus `Feature ID`?) in another easily-accessible place. Then, in the JS, we could use `RRVDisplay.populateSelectDOM()` to add feature metadata column names to a list of search options. This would essentially let us just perform exact matching, but limited to whatever feature metadata field we care about (so no worries about having a feature ID that coincidentally has the word "Bacteria" in it, or whatever). This would address #125.

The primary downsides are 1) extra storage costs due to having to store feature metadata field names for each feature, and 2) I think this might remove the support for searching by different taxa at once in the same query that's currently available. We could get around 2) by improving the search functionality to make that more explicit (or heck, make "OR" queries doable for every feature metadata field), and 1) isn't that big of a deal (we could use column mapping if it's a huge enough problem).
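The false-positive problem described above can be illustrated with a small sketch. The feature ID, the metadata values, and the exact `" | "` separator are made up for the example. The only point is the difference between searching one long annotated ID and searching a single metadata field.

```javascript
// Old approach (as described above): metadata is appended to the feature ID,
// producing one long string. The separator format here is an assumption.
const annotatedID = ["Bacteria_isolate_7", "k__Archaea", "0.91"].join(" | ");

// Proposed approach: the ID stays intact and each metadata field is stored
// as its own property of the feature.
const featureRow = {
    "Feature ID": "Bacteria_isolate_7",
    Taxon: "k__Archaea",
    Confidence: "0.91",
};

const query = "Bacteria";

// Searching the annotated ID matches because of the ID itself, even though
// the feature's taxonomy says nothing about Bacteria.
const matchesAnnotatedID = annotatedID.includes(query);

// Searching only the chosen field avoids that coincidental match.
const matchesTaxonField = featureRow["Taxon"].includes(query);

console.log(matchesAnnotatedID); // true
console.log(matchesTaxonField); // false
```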