-
Notifications
You must be signed in to change notification settings - Fork 83
Using mapping histology groups for plotting -- Part 2 implemention (PR 1 of 4) #904
Using mapping histology groups for plotting -- Part 2 implemention (PR 1 of 4) #904
Conversation
…ansavvy/mapping-table
…ansavvy/mapping-table
…ansavvy/mapping-table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks good to me, but I had a thought that will will likely change some of the readme content if implemented:
One thing that I didn't comment on in previous PRs but I am now thinking more heavily about here is that we should not only have the display histology order standardized as well as the color. Arranging alphabetically doesn't really make sense, as we will want Other
and probably Benign
to be at the end, and there may be other ordering we want to use (largest to smallest?)
I'm guessing this should happen when reading in the data, but we should probably make display_group
a factor. My first thought was to just use `forcats::fct_relevel(display_group, "Other tumor", "Benign", after = Inf) to push those two categories to the end.
On further reflection, it may be worth adding a display_order
column to the data table to make this easier to standardize across scripts: Then you could use forcats::fct_reorder(display_group, display_order)
to make changes that would easily propagate, and can allow us to specify ordering of all factors just as easily. (Do we want to keep LGAT and HGAT next to each other, for example, or order by # of samples)
I like this idea. Will implement.
I'd say small to large makes sense with the exception that Other and Benign should be at the end? |
Large to small, but yeah, I think that is the place to start until somebody expresses a different preference. |
I'm not sure how much this is or isn't an issue, but do different data types have different sample sizes so |
@jashapiro, I added a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a suggestion for streamlining, because I was confused by the max_level stuff for longer than I want to admit.
file.path(figures_dir, "palettes", "histology_label_color_table.tsv") | ||
file.path(figures_dir, "palettes", "histology_label_color_table.tsv"), | ||
# Read in each column as a specific type - skip extra columns | ||
col_types = paste0(c("c", rep("-", 6), "f", "i", "c"), collapse = "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need this? I guess it is helpful for skipping columns, but the factor will be automatically converted when you call fct_reorder, and the order should get read in correctly as numeric. You are implicitly specifying column order, which is something I try to avoid as it can be brittle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that I'm not happy about this thing which is rigged with respect to column order. However, fct_reorder will not as the table here is read in. display_group
does not get read in as a factor which is what is causing the problem.
I can instead get rid of this col_types
line and add back a mutate
where we force it to be a factor first and then also add back a select
to get to the four columns we want. (this is what I had previously. It includes more steps but it is less brittle with respect to column order).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I've never had forcats
complain about being given a character vector. Maybe I haven't used whatever old version we currently have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched back to using a select
and as.factor()
. It may be more official "step" but it is more robust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I must be imagining things. I dropped the as.factor and it works fine. Never mind everything I said. will push that change.
analyses/chromosomal-instability/02a-plot-chr-instability-heatmaps.Rmd
Outdated
Show resolved
Hide resolved
Things have been streamlined. Thanks for those suggestions. @jashapiro |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just a few little cleanup suggestions.
Purpose/implementation Section
What scientific question is your analysis addressing?
This PR addresses Step 3 and part Step 4 discussed in #897 and #898. This PR uses the
palettes/histology_label_color_table.tsv
table that is created by the notebook added in #899What was your approach?
This PR is 1 of a few PRs where I am updating things in response to the table added in #898
Here's what's accomplished in this PR:
palettes/histology_label_color_table.tsv
palettes/histology_label_color_table.tsv
in one of the fives places (chromosomal-instability heatmaps) mentioned in Make mapping histology groups for plotting -- Part 2 implementing the use of the mapping table #898 (needed to try out this some place so I knew how to update the README instructions).palettes/histology_color_palette.tsv
which is now in the process of being retired (as described in Make mapping histology groups for plotting -- Part 2 implementing the use of the mapping table #898).Note that the
palettes/histology_color_palette.tsv
file itself is not deleted yet, I will do this in the next PR so the other modules that use this file don't break in CI yet.What GitHub issue does your pull request address?
Closes #897 and
begins to address #898
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
Yes. Take a look at the chromosomal-instability plots here to see what you think:
chromosomal-instability-heatmaps.zip
Results
What is your summary of the results?
Results are the same, the labels are changed.
Reproducibility Checklist
Documentation Checklist
README
and it is up to date. --the README in figures is now up to date.analyses/README.md
and the entry is up to date. -- it is not because its not located in the analyses folder.