selection_interval docs describe unusable "fields" parameter #3000

Closed
dpoznik opened this issue Mar 27, 2023 · 12 comments · Fixed by #3001

@dpoznik
Contributor

dpoznik commented Mar 27, 2023

I'm not sure whether this is just a documentation bug or an actual bug, as I've only been using altair for one day.
I am using version "5.0.0rc1".

The altair.selection_interval docs include a description of a fields parameter, but attempting to use it:

import altair as alt
alt.selection_interval(fields=["iid"])

elicits:

SchemaValidationError: `IntervalSelectionConfig` has no parameter named 'fields'

#2365 suggests that this is indeed just a documentation bug. Perhaps it arose in #2908?

Thanks!

@dpoznik dpoznik added the bug label Mar 27, 2023
@joelostblom
Contributor

Thanks for reporting and welcome to the Altair community! You're correct that selection_interval should not list fields in the docstring (it seems like the signature is correct). @ChristopherDavisUCI , I think it would be enough to just delete this section, does that seem correct to you?

https://github.com/altair-viz/altair/blob/37e91ec2174fe6716c7c8372b0bbdb1800340d31/altair/vegalite/v5/api.py#L492-L494

@dpoznik
Contributor Author

dpoznik commented Mar 27, 2023

OK, thanks for confirming and for the quick response!

@ChristopherDavisUCI
Contributor

Thank you! I will take a look today. My first impression is that this fields parameter should be usable, but I'll have to look more closely.

@joelostblom
Contributor

Great, thank you. My understanding is that it is only usable for point selection but not for intervals as per https://vega.github.io/vega-lite/docs/selection.html#current-limitations so maybe the docstring from point selection was just copied over?

@ChristopherDavisUCI
Contributor

Ah, thanks @joelostblom, the reason I thought fields was usable was because it is listed here: https://vega.github.io/vega-lite/docs/selection.html#selection-props

You seem to be right though, so please go ahead with the change you mentioned in #3000 (comment).

@joelostblom
Contributor

Sounds good! @dpoznik Since you reported this initially, are you interested in contributing by removing the lines I linked above in a PR? No worries if not, I can do it too.

@dpoznik
Contributor Author

dpoznik commented Mar 28, 2023

@joelostblom, sure I'd be happy to contribute. Will do!

Could I ask a question though? Since I am brand new to this package, I'll take no offense at a response along the lines of ~"read the docs" :)

When I saw the fields parameter in the selection_interval docs, it seemed perfectly suited to my needs. Would you be willing to point me to the "right" way to accomplish the following?

I've got a dataframe, df, with columns: ["id", "x", "y", "a", "b", "c"], where "id" is an identifier, and the other columns are float measurements for the given ID. I would like to generate a scatter plot of "y" vs. "x" and, for a given interval selection on the scatter plot, view the distributions of "a", "b", and "c" as violins.

Having seen the fields parameter in the docs, my initial attempt was to melt df into melted_df with columns ["id", "category", "value"], where "category" takes values {"a", "b", "c"} and then do something like:

selection = alt.selection_interval(fields=["id"])

points = (
    alt.Chart(df)
    .mark_point(...)
    .encode(...)
    .add_params(selection)
    .properties(...)
)

violins = (
    alt.Chart(melted_df)
    .transform_density(...)
    .mark_area(...)
    .encode(...)
    .properties(...)
    .transform_filter(selection)
)

points & violins

Thanks!

@dpoznik
Contributor Author

dpoznik commented Mar 28, 2023

P.S. It works to instead melt df such that "iid", "x", and "y" are ID variables, generate the scatter plot on the subset for which category == "a" (i.e., one row per "iid"), and then generate box plots for the selection interval instead of violin plots. That should probably be fine for my purposes. I got the violins to work fine on their own but not with the selection interval.

@joelostblom
Contributor

It sounds like you were able to work out your issue, but in case you haven't seen it already, this gallery example might be helpful: https://altair-viz.github.io/gallery/selection_histogram.html (although it sounds like it might be similar to what you have already figured out with the melting into a category variable, similar to how that example uses origin).

@dpoznik
Contributor Author

dpoznik commented Mar 30, 2023

Cool, thanks. What I realized to be the case was that the interval selection would link the two charts via "x" and "y", as long as these columns were present in the dataframe from which the second chart was constructed, thereby obviating the need to specify a fields linker.

So I could transform the original dataframe in two ways:

  • df_scatter for the scatter plot, with columns:
    • "x", "y"
    • all the metadata columns for the scatter plot tooltip, fill color, etc.
  • df_box for the box plots, with columns:
    • "x" and "y", for linking to the scatter plot via the interval selection
    • ["a", "b", "c", ...] melted (into "category", and "value")
    • no metadata columns

This worked great with box plots, so I think my issue was specific to the violin plots I'd initially attempted. I haven't had a chance to debug that version though. Thanks!

@dpoznik
Contributor Author

dpoznik commented Mar 31, 2023

so I think my issue was specific to the violin plots I'd initially attempted

Just to close the loop on this... I had a chance to explore a bit further, and I figured out why the interval selection was not working with violins. The transform_filter call must occur before the transform_density call used to construct the violin plots. Makes sense :)

@joelostblom
Contributor

Ah yes that makes sense, thanks for sharing your solution!
