This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Hex code update from adding cancer_group as display_group #1142

Merged
merged 23 commits into from
Aug 25, 2021

Conversation

kgaonkar6
Collaborator

@kgaonkar6 kgaonkar6 commented Aug 17, 2021

Purpose/implementation Section

What scientific question is your analysis addressing?

Add updated hex codes for plotting groups.

What was your approach?

Updated figures/mapping-histology-labels.Rmd to generate hex codes for cancer_group, which is extracted from harmonized_diagnosis. Additionally, we removed benign samples, non-tumors, and any group with fewer than 5 samples.

What GitHub issue does your pull request address?

#1140

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

NA

Is there anything that you want to discuss further?

NA

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

table

What is your summary of the results?

Hex codes are now available for 38 cancer_group values.

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

@kgaonkar6
Collaborator Author

CI fails because "High-grade glioma/astrocytoma" is used as a filename. That is not valid on Unix, where "/" is the path separator: the name is read as a file inside a "High-grade glioma" directory, which does not exist. We will update to display_group in #1142.
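One way to guard against this class of failure is to sanitize group labels before using them as file names. The sketch below is illustrative only: the helper name `sanitize_filename` and the choice of "-" as the replacement character are assumptions, not code from this PR.

```r
# Hypothetical helper: replace path separators so that a group label can be
# used safely as a single file name component.
sanitize_filename <- function(label, replacement = "-") {
  gsub("/", replacement, label, fixed = TRUE)
}

label <- "High-grade glioma/astrocytoma"
plot_file <- paste0(sanitize_filename(label), ".png")
print(plot_file)
```

Keeping the original label in the data and sanitizing only at the point of writing files means plot legends still show the unmodified group name.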

@jharenza
Collaborator

Just want to be sure we are keeping display group as is and adding cancer group as a separate set of hex codes. Can check in later.

@kgaonkar6
Collaborator Author

This is the description of the 2 columns in the output file:

- `display_group` - the high-level histology labels that should be used for plotting
- `hex_codes` - the direct colors that should be used for plotting
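A two-column table like this can be turned into a named vector for color lookups. The sketch below uses a made-up stand-in table; the group labels and colors are placeholders, not values from the actual output file.

```r
# Toy stand-in for the output table; the rows are made up for illustration.
palette_df <- data.frame(
  display_group = c("Group A", "Group B", "Other"),
  hex_codes = c("#ff0000", "#00aaff", "#b5b5b5"),
  stringsAsFactors = FALSE
)

# A named vector maps each label to its color.
group_colors <- setNames(palette_df$hex_codes, palette_df$display_group)

# Look up the color for each sample's label.
sample_groups <- c("Group B", "Group A", "Group B")
sample_colors <- unname(group_colors[sample_groups])
print(sample_colors)
```

A named vector of this shape can also be passed as `values` to `ggplot2::scale_color_manual()`, which matches names to group levels.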

My understanding from the comment here was that we will use cancer_group as the disease label/grouping (i.e., display_group, right?) #917 (comment).

But did you want one set of hex_codes for display_group (which in the master branch is extracted from broad_histology) and another set of hex_codes for cancer_group? We would then update the plotting scripts in oncoprint/mutsig to use the cancer_group column values instead of the display_group column values?

@jharenza
Collaborator

I started reviewing some of the figures, and I think we need to keep some at the broader display group, annotate some with cancer group, and possibly annotate some with both, so yes, hex codes for both. Sorry for the confusion: I had not dug into the figures when making that comment, but I have started making notes in the GSlides doc here and made a ticket here: #1144. For now, I would say simply add a new cancer_group with respective hex codes.

That being said, I know we were talking about not including colors for some benign/N<5 groups because those groups are too small for group analyses. However, looking at the transcriptomic reduction figure, for example, I think that we should not remove any samples from it. I think we might have to do the filtering within modules, rather than in this hex code module. Does that make sense?
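The split proposed here (a complete palette in the hex code module, with size filtering done per analysis module) could look roughly like the following. This is a minimal base R sketch with made-up data; the column name `cancer_group_hex_codes` and the `min_n` variable are assumptions for illustration.

```r
# Toy sample table; labels and counts are made up.
samples <- data.frame(
  sample_id = paste0("S", 1:8),
  cancer_group = c(rep("Group A", 6), rep("Group B", 2)),
  stringsAsFactors = FALSE
)

# The shared palette keeps every group, regardless of size.
palette_df <- data.frame(
  cancer_group = c("Group A", "Group B"),
  cancer_group_hex_codes = c("#ff0000", "#00aaff"),
  stringsAsFactors = FALSE
)

# A module that needs group-level statistics applies its own threshold.
min_n <- 5  # assumed threshold, taken from the N<5 discussion above
group_n <- table(samples$cancer_group)
keep_groups <- names(group_n)[group_n >= min_n]
module_samples <- samples[samples$cancer_group %in% keep_groups, ]

# A module that plots all samples (e.g., dimension reduction) skips the filter
# and simply joins the palette onto every sample.
all_samples <- merge(samples, palette_df, by = "cancer_group")
```

With this arrangement, changing the threshold for one analysis does not affect the colors or the sample set used by any other figure.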

@kgaonkar6
Collaborator Author

kgaonkar6 commented Aug 18, 2021

> I started reviewing some of the figures, and I think we need to keep some at the broader display group, annotate some with cancer group, and possibly annotate some with both, so yes, hex codes for both. Sorry for the confusion: I had not dug into the figures when making that comment, but I have started making notes in the GSlides doc here and made a ticket here: #1144. For now, I would say simply add a new cancer_group with respective hex codes.

Thank you for the clarification. I believe my code update in c2a9392 satisfies the requirements above.

> That being said, I know we were talking about not including colors for some benign/N<5 groups because those groups are too small for group analyses. However, looking at the transcriptomic reduction figure, for example, I think that we should not remove any samples from it. I think we might have to do the filtering within modules, rather than in this hex code module. Does that make sense?

That makes sense, I'll remove the <5 filter.

However, cancer_group is NA for Other, Benign tumor, Dysplasia/Gliosis, and Normal samples.
Should I update the hex color codes to be filtered for tumor samples, so that when cancer_group is NA we know the samples are Other, Benign tumor, or Dysplasia/Gliosis and we can assign the gray hex code?
Note that Normal samples will be removed from the hex_code assignment; I don't see a use case for Normal samples getting a cancer_group_hex_code.

Currently, display_group in master (extracted from broad_histologies) gets a gray color for benign and other tumors and black for Normal samples.
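The proposal above (restrict to tumor samples, color each cancer_group, and fall back to gray where cancer_group is NA) can be sketched in base R as follows. Everything here is illustrative: the `sample_type` column, the toy rows, and the specific gray value are assumptions, not the module's actual code.

```r
# Toy histology table; column names and values are assumptions for illustration.
histologies <- data.frame(
  sample_id = paste0("S", 1:5),
  sample_type = c("Tumor", "Tumor", "Tumor", "Tumor", "Normal"),
  cancer_group = c("Group A", "Group B", NA, "Group A", NA),
  stringsAsFactors = FALSE
)

# Keep tumor samples only, so an NA cancer_group can only mean
# Other, Benign tumor, or Dysplasia/Gliosis.
tumors <- histologies[histologies$sample_type == "Tumor", ]

# Assign one color per non-NA cancer_group (placeholder colors).
groups <- sort(unique(na.omit(tumors$cancer_group)))
group_colors <- setNames(c("#ff0000", "#00aaff")[seq_along(groups)], groups)

# Placeholder gray; the actual gray used in the module is not shown here.
na_gray <- "#b5b5b5"
tumors$cancer_group_hex_codes <- ifelse(
  is.na(tumors$cancer_group),
  na_gray,
  group_colors[tumors$cancer_group]
)
print(tumors[, c("sample_id", "cancer_group", "cancer_group_hex_codes")])
```

Filtering out Normal samples first is what makes the gray fallback unambiguous: without it, an NA cancer_group could also mean a Normal sample.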

@jharenza
Collaborator

> Should I update the hex color codes to be filtered for tumor samples, so that when cancer_group is NA we know the samples are Other, Benign tumor, or Dysplasia/Gliosis and we can assign the gray hex code?

This sounds good for consistency, and I agree with not having normals in there.

Collaborator

@jharenza jharenza left a comment

I am going to test out some other color palettes to make sure the new colors are cohesive with the display_group colors, but in the meantime, I made some other notes.

Comment on lines 208 to 211
# Keep only cancer_groups with 5 or more counts
dplyr::filter(cancer_group_n >=5,
# remove NA
!is.na(cancer_group)) %>%
Collaborator

Suggested change
- # Keep only cancer_groups with 5 or more counts
- dplyr::filter(cancer_group_n >=5,
- # remove NA
- !is.na(cancer_group)) %>%
+ # Remove NA
+ dplyr::filter(!is.na(cancer_group)) %>%

Collaborator Author

We are no longer removing samples where cancer_group is NA; we are keeping them and assigning a gray color.

kgaonkar6 and others added 8 commits August 18, 2021 23:48
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
Co-authored-by: Jo Lynne Rokita <jharenza@gmail.com>
@jharenza
Collaborator

Per Slack discussion, @kgaonkar6 will update the cancer group hex codes to:

"#ff0000",
"#f20000",
"#997373",
"#403030",
"#330700",
"#ff9180",
"#591800",
"#b2502d",
"#cca799",
"#ff6600",
"#ffb380",
"#7f5940",
"#cc6d00",
"#331b00",
"#ccb499",
"#ffaa00",
"#996600",
"#594316",
"#ffd580",
"#ffee00",
"#998f00",
"#999673",
"#303300",
"#fbffbf",
"#ccff00",
"#494d39",
"#b5d96c",
"#6a8040",
"#66ff00",
"#42a600",
"#bfffbf",
"#003307",
"#00661b",
"#00ff88",
"#86b39e",
"#00b377",
"#006652",
"#00ffee",
"#00a7b3",
"#bffbff",
"#567173",
"#00ccff",
"#003d4d",
"#00aaff",
"#267399",
"#0088ff",
"#0042a6",
"#001a40",
"#bfd9ff",
"#0044ff",
"#394973",
"#000e66",
"#bfbfff",
"#9180ff",
"#5800a6",
"#754d99",
"#aa00ff",
"#3a3040",
"#aa86b3",
"#530059",
"#ff00ee",
"#a60085",
"#330022",
"#ff80d5",
"#ff0088",
"#804062",
"#a60042",
"#590024",
"#ffbfd9",
"#ff0044",
"#990014",
"#ff8091"

@jharenza jharenza added the blocked (Blocked by factors external to this project) and don't merge labels Aug 19, 2021
@kgaonkar6
Collaborator Author

Thanks for the colors, @jharenza! I've updated the colors, and I'm using the v21 histologies file now.
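When swapping in a hand-curated palette like the one listed above, a quick sanity check can catch typos and repeated colors before they reach a figure. The following is a hypothetical base R check, not part of this PR; the function name `validate_hex_codes` is an assumption.

```r
# Hypothetical check: every entry is a 6-digit hex color and none repeat.
validate_hex_codes <- function(hex_codes) {
  well_formed <- grepl("^#[0-9A-Fa-f]{6}$", hex_codes)
  list(
    malformed = hex_codes[!well_formed],
    duplicated = unique(hex_codes[duplicated(hex_codes)])
  )
}

# A few entries from the list above, for illustration.
candidate_colors <- c("#ff0000", "#f20000", "#997373", "#403030")
result <- validate_hex_codes(candidate_colors)
stopifnot(length(result$malformed) == 0, length(result$duplicated) == 0)
```

A further check worth considering is that the palette has at least as many colors as there are groups to color.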

Collaborator

@jharenza jharenza left a comment

colors look better

@jaclyn-taroni
Member

Can I ask why this is marked with blocked and don't merge? Do we want #1157 to go in first?

@jharenza
Collaborator

The cancer groups are being updated with the new histology file (we had GNG and Ganglioglioma separately, and there were some issues with the "/" in the names).

@jharenza jharenza removed the blocked (Blocked by factors external to this project) label Aug 25, 2021