Support altering which ranks are used in the browser #46

Closed
fedarko opened this issue Feb 19, 2019 · 4 comments

Comments

Collaborator

fedarko commented Feb 19, 2019

For example, for the Byrd dataset, this would mean supporting something besides log(PostFlare/Flare) + K.

Ideally, this would entail storing all of the ranks for every taxon/metabolite in the rank plot JSON, and then just changing which rank (or combination of ranks) is used in the rank plot. This should also update the y-axis title of the rank plot accordingly.

In terms of actual code changes: right now RRV just gets the first rank column (since the rank_col parameter of gen_rank_plot() is set to 0 when both the standalone and Q2 scripts call it). I guess in order to fit an arbitrary number of ranks into the rank plot JSON, we'd just need to throw in multiple coefs Series. We'd have to name these in a consistent way (maybe ranks0, ranks1, ranks2, ... or coefs0, coefs1, coefs2, ...); the exact scheme doesn't really matter as long as it's consistent.
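As a rough illustration of the naming idea above, here is a minimal Python sketch that turns per-feature rank values into JSON-ready records with consistently named rank fields. The function name `build_rank_records`, the "Feature ID" key, and the use of plain dicts instead of pandas Series are all assumptions made for this example; this is not the actual gen_rank_plot() code.

```python
import json


def build_rank_records(feature_ranks, prefix="ranks"):
    """Build one record per feature, with one field per rank column.

    feature_ranks: dict mapping a feature ID to a list of rank values
    (hypothetical input shape, chosen for illustration only).
    Returns (records, rank_names).
    """
    if not feature_ranks:
        return [], []
    n_ranks = len(next(iter(feature_ranks.values())))
    # Consistent naming: ranks0, ranks1, ranks2, ...
    rank_names = ["{}{}".format(prefix, i) for i in range(n_ranks)]
    records = []
    for feature_id, values in feature_ranks.items():
        if len(values) != n_ranks:
            raise ValueError(
                "Feature {!r} has {} ranks, expected {}".format(
                    feature_id, len(values), n_ranks
                )
            )
        record = {"Feature ID": feature_id}
        record.update(zip(rank_names, values))
        records.append(record)
    return records, rank_names


records, rank_names = build_rank_records(
    {"Taxon A": [0.5, -1.2], "Taxon B": [-0.3, 2.0]}
)
print(json.dumps(records, indent=2))
```

With every rank stored under a predictable field name, the browser side only needs to know which field name to plot.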

In terms of the JS side of things, the user would be presented with a list of all available ranks in the browser (ideally with a nice description, but I don't think we're guaranteed to have that in the OrdinationResults so this might end up just being an Emperor-esque thing where you have a list of ranks and their "proportion explained" -- this would be a cool place to use a Scree plot as Emperor does). The user could then select a rank column to use, and the rank plot would thus change accordingly.

@fedarko fedarko added enhancement New feature or request important Things that are critical for getting Qurro in a working/useful state labels Feb 19, 2019
@fedarko fedarko self-assigned this Feb 19, 2019
Collaborator Author

fedarko commented Feb 19, 2019

A cool benefit of this: if we can change the ranks being used without changing the status of the various taxa/metabolites within the rank plot, we could observe how changing a rank changes the proportion of various selected taxa/metabolites.

e.g. if the user just sets numerator = all bacteria, denominator = all viruses, then it'd be cool to see how the colors in the rank plot shift (or don't) based on the ranks changing.
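To make the bacteria/viruses idea concrete, here is a small Python sketch with made-up data. The field names ("Classification", "ranks0", "ranks1"), the feature names, and all values are hypothetical. Each feature's classification stays fixed, and only the rank field used for ordering changes.

```python
# Made-up features: classifications are fixed, rank values differ per column.
features = [
    {"id": "Bacterium 1", "ranks0": 1.5, "ranks1": -0.2, "Classification": "Numerator"},
    {"id": "Bacterium 2", "ranks0": 0.7, "ranks1": 0.9, "Classification": "Numerator"},
    {"id": "Virus 1", "ranks0": -1.1, "ranks1": 0.4, "Classification": "Denominator"},
    {"id": "Virus 2", "ranks0": -0.4, "ranks1": -1.3, "Classification": "Denominator"},
]


def ordered_classifications(features, rank_field):
    """Return the classifications in left-to-right rank plot order."""
    ordered = sorted(features, key=lambda f: f[rank_field])
    return [f["Classification"] for f in ordered]


for field in ("ranks0", "ranks1"):
    print(field, ordered_classifications(features, field))
```

In this toy data the two groups are cleanly separated under ranks0 but interleaved under ranks1, which is the kind of color shift the rank plot would show.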

Collaborator Author

fedarko commented Feb 21, 2019

Another question: would it be worth devoting some time to developing a way to display sample ranks (as generated by DEICODE)? I looked through the code, and the U variable for sample ranks is passed around a lot but ultimately only really used to match a dataframe of samples with the BIOM table in process_input().

(Update from ~10 months in the future, at a point when I know that the correct terminology is "sample loadings": this is an open issue at #233)

fedarko added a commit that referenced this issue Feb 21, 2019
Since it wasn't getting used for anything after matching it with the
BIOM table's dataframe.

See #46 for a relevant discussion.
Collaborator Author

fedarko commented Feb 23, 2019

As a temporary measure, we can let the user specify a rank column to use (replacing the 0 currently in use). This is lower priority than #31, so I'm ok with that for the time being.

fedarko added a commit that referenced this issue Feb 25, 2019
This might require some tweaking regarding implications for #55.
Also the code is a bit ugly.

Anyway, I just gotta adjust the viz interface to handle this stuff
properly:

-detect all given ranks
-use signal to set up a bound input with options being all the
 given rank values, and set this signal to the y-axis field (and
 title? Maybe?)
-if not already done: on signal change (we can use
 view.addSignalListener() or whatever), re-sort the feature ranks
 in ascending order. This could be as simple as just adjusting each
 feature's x value via change().

Once that's properly hammered out, #46 will be done -- there's a lot
less complexity associated with that than there is with #31.
fedarko added a commit that referenced this issue Feb 26, 2019
fedarko added a commit that referenced this issue Feb 26, 2019
I'd still like to make these auto-sort in ascending order before
I close this issue, but this is a big step.

NOTE that as with my preliminary solution for #31 this UI code isn't
very well tested yet, so that's a huge priority: we should validate
the effectiveness of this (#2).
Collaborator Author

fedarko commented Mar 5, 2019

For sorting, there are a few options. Here are some of the ones I've read about today.

  • set a signal listener on the ranks, and when that changes, manually sort the x-values.
    • pros: simple to do, shouldn't need to mess around with the Vega spec too much or even at all
    • cons: might be inefficient due to manual sorting
  • use a Vega collect transform
    • pros: can just insert this directly into a Vega spec via the patch parameter of Vega-Embed; no need to manually do sorting
    • cons: I have literally no idea how to use this
  • use a Vega-Lite sort property
    • pros: easier than collect transform; no need for manual sorting
    • cons: still not sure how to do this. also not sure how easy it'll be to edit which rank is being sorted by -- I guess you'd look at the signal for the rank being used, but we'd be editing the Vega generated from Vega-Lite which can get hairy
  • use an Altair/Vega-Lite/Vega window transform
    • pros: doesn't look that bad; could maybe even use this to generate x values with the ranking option; if done right, could minimize the amount of JS I have to write
    • cons: still not really sure how this works tbh; not sure if creating 0-based ranks is doable easily
      • This example looks like it'd be helpful in using this. As does this one (the entire page, really).

I'll pick one and try to get this done soon.
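For reference, the window transform option might look roughly like the following sketch, written as plain Python dicts that mirror a Vega-Lite spec. The field name `x_index` and both helper functions are hypothetical. To my knowledge the `row_number` window operation starts counting at 1, so a `calculate` transform subtracts 1 to get 0-based positions. The second function emulates the same ordering in pure Python so the intended result can be checked without Vega-Lite.

```python
def sorted_rank_transforms(rank_field, x_field="x_index"):
    """Vega-Lite-style transforms assigning 0-based x positions by rank."""
    return [
        {
            # Number the rows after sorting them by the chosen rank field.
            "window": [{"op": "row_number", "as": x_field}],
            "sort": [{"field": rank_field, "order": "ascending"}],
        },
        # row_number is 1-based, so shift it down to start at 0.
        {"calculate": "datum.{} - 1".format(x_field), "as": x_field},
    ]


def emulate_x_positions(records, rank_field):
    """Pure-Python emulation: map each feature ID to its 0-based position."""
    ordered = sorted(records, key=lambda r: r[rank_field])
    return {r["Feature ID"]: i for i, r in enumerate(ordered)}


records = [
    {"Feature ID": "Taxon A", "ranks0": 0.5},
    {"Feature ID": "Taxon B", "ranks0": -0.3},
    {"Feature ID": "Taxon C", "ranks0": 2.0},
]
print(sorted_rank_transforms("ranks0"))
print(emulate_x_positions(records, "ranks0"))
```

One appeal of this approach is that switching ranks only means changing the `sort` field, with no sorting code on the JS side.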

fedarko added a commit that referenced this issue Mar 7, 2019
Not 100% satisfied with this implementation -- it feels kinda hacky.
I'll look into one of the other options (where we shove more of
the work onto altair/vega/etc) later on. For now, this definitely
works. which is cool.
fedarko added a commit that referenced this issue Mar 7, 2019
Something silly -- since I was just fetching the rank values directly
from the JSON (without Vega in the middle), these values were getting
treated as strings (so the comparison function I defined when sorting
them didn't work). This sort of thing has happened before, so fixing
this was a lot easier than then.

This emphasizes the importance of following through with #62. The less
there is to worry about, the better.
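The string-versus-number pitfall described in this commit message has a close analogue in Python, sketched below with made-up data. This is an illustration of the general problem, not the JS code that was actually fixed: values that arrive as strings sort lexicographically unless they are converted to numbers first.

```python
import json

# Made-up payload in which the rank values were serialized as strings.
raw = json.loads(
    '[{"id": "A", "rank": "10.0"},'
    ' {"id": "B", "rank": "9.5"},'
    ' {"id": "C", "rank": "-2.0"}]'
)

# Sorting the raw strings compares them character by character.
as_strings = [r["id"] for r in sorted(raw, key=lambda r: r["rank"])]

# Converting to float first gives the intended numeric order.
as_numbers = [r["id"] for r in sorted(raw, key=lambda r: float(r["rank"]))]

print(as_strings)
print(as_numbers)
```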
fedarko added a commit that referenced this issue May 17, 2019
Progress towards a way of implementing #46 without a bunch of
crazy custom sorting code in JS.

Things to do before we're done here:
-modify JS code to adjust rank plot spec in the following ways on
 rank change:
        -change encoding.y field
        -change transform[0].sort[0].field
-modify python code to completely replace rankratioviz_x with rrv_x
 (i.e. we're still using a field called rankratioviz_x, but now
 Vega-Lite is generating it for us)
-remove all the unused code that this will replace :D
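The two spec adjustments listed in this commit message (encoding.y field and transform[0].sort[0].field) can be sketched in Python as follows. The real change would live in the JS code; the function name `set_rank_field`, the example spec, and the decision to also update the y-axis title are assumptions made for illustration.

```python
import copy


def set_rank_field(spec, rank_field):
    """Return a copy of a Vega-Lite-style spec that plots and sorts by rank_field."""
    new_spec = copy.deepcopy(spec)  # leave the caller's spec untouched
    new_spec["encoding"]["y"]["field"] = rank_field
    # Assumption: the y-axis title simply mirrors the field name.
    new_spec["encoding"]["y"]["title"] = rank_field
    new_spec["transform"][0]["sort"][0]["field"] = rank_field
    return new_spec


# Hypothetical minimal rank plot spec, just detailed enough for the example.
spec = {
    "mark": "bar",
    "transform": [
        {
            "window": [{"op": "row_number", "as": "rankratioviz_x"}],
            "sort": [{"field": "ranks0", "order": "ascending"}],
        }
    ],
    "encoding": {
        "x": {"field": "rankratioviz_x", "type": "ordinal"},
        "y": {"field": "ranks0", "type": "quantitative", "title": "ranks0"},
    },
}

updated = set_rank_field(spec, "ranks1")
print(updated["encoding"]["y"], updated["transform"][0]["sort"])
```

After an update like this, the plot would be re-made from the new spec, which matches the "re-make the rank plot on the change" step mentioned in the next commit.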
fedarko added a commit that referenced this issue May 17, 2019
Just need to re-make the rank plot on the change, and then switch
around rrv_x and rankratioviz_x
fedarko added a commit that referenced this issue May 17, 2019
:D

This basically addresses #46. All that's left is removing the explicit
rankratioviz_x values from the python generate code, and I guess
figuring out why the rank plot is so far on the left now lmao.

Also should test that the rank plot JSON is being adjusted accordingly,
but I'm content to roll that under #2 -- this is basically making Vega*
do all the work for us, which is great.
fedarko added a commit that referenced this issue May 17, 2019
Vega-Lite does all the sorting for us now, thanks to #46's new
solution. So no need to do sorting in python.
fedarko added a commit that referenced this issue May 17, 2019
Now, no sorting is done based on the first rank -- Vega-Lite does all
the work for us. This saves some time, and it has the added benefit of
making the tests a bit simpler (fixing them kinda sucked tho lol).

Also, I was able to delete a lot of code by just vectorizing the
addition of the "Classification" column to rank_data.

A few other tiny changes, but IMO the only super important one left to
mention is that we explicitly create a deep copy of V in
gen_rank_plot() now (#63).

At this point, #46 is almost done. All that's left is figuring out why
the rank plot is getting pushed over to the left of the display lol