Skip to content
This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Using mapping histology groups for plotting -- Part 2 implemention (PR 2 of 4) #911

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
55 commits
Select commit Hold shift + click to select a range
e708bb1
Push the basic notebook so far
cansavvy Jan 12, 2021
d2e0fc7
Make color palette smaller and move set.seed()
cansavvy Jan 12, 2021
f571738
Polishing polishing polishing
cansavvy Jan 12, 2021
15a3651
One more formatting edit
cansavvy Jan 12, 2021
a02c616
Add to CI testing
cansavvy Jan 12, 2021
bc4935a
Merge branch 'master' into cansavvy/mapping-table
cansavvy Jan 12, 2021
5e00fe2
Fix formatting in CI test file whoops
cansavvy Jan 12, 2021
4580439
Merge remote-tracking branch 'cansavvy/cansavvy/mapping-table' into c…
cansavvy Jan 12, 2021
571e8c7
Update chromosomal-instability and figures bits
cansavvy Jan 13, 2021
959b683
Incorporate part of jashapiro review
cansavvy Jan 13, 2021
a173f91
Push a few other changes from jashapiro review
cansavvy Jan 13, 2021
a3c165b
re-run notebook
cansavvy Jan 13, 2021
81b3004
Update documentation
cansavvy Jan 13, 2021
f339b42
Merge branch 'master' into cansavvy/mapping-table
cansavvy Jan 13, 2021
566a7c6
Add back the `Normal` labels
cansavvy Jan 13, 2021
3ceaaf3
Merge remote-tracking branch 'cansavvy/cansavvy/mapping-table' into c…
cansavvy Jan 13, 2021
55819b3
Change summary_group -> display_group
cansavvy Jan 13, 2021
4e4a7f0
Switch to jharenza recomendations (also do capitalization thing)
cansavvy Jan 13, 2021
465477e
Get rid of small_groups_cutoff remnant
cansavvy Jan 13, 2021
8896f0a
Merge branch 'master' into cansavvy/mapping-table
cansavvy Jan 14, 2021
709e623
Incorporate jharenza review -- make Other and Benign gray
cansavvy Jan 14, 2021
a0603b7
Merge remote-tracking branch 'cansavvy/cansavvy/mapping-table' into c…
cansavvy Jan 14, 2021
c34f8d4
Merge branch 'cansavvy/mapping-table' into cansavvy/mapping-README
cansavvy Jan 14, 2021
6ac8a8e
Some tidyverse rearranging because hex_codes weren't actually saving …
cansavvy Jan 14, 2021
35a0700
Okay had my hex_code list/names backwards now we're good
cansavvy Jan 14, 2021
098fece
Merge branch 'cansavvy/mapping-table' into cansavvy/mapping-README
cansavvy Jan 14, 2021
4cdd4dd
Re-run chromosomal instability with the new colors
cansavvy Jan 14, 2021
eadf795
A couple more things in figures/README were out of date
cansavvy Jan 14, 2021
1110c69
Add a re-run of figures/mapping-histology-labels.Rmd to generate_figu…
cansavvy Jan 14, 2021
9a77701
Update cnv-chrom-plot to use new histology colors
cansavvy Jan 15, 2021
975fbf4
Change to mutational-signatures to new histology mappings -- re-run
cansavvy Jan 15, 2021
a479b34
Don't do tolower on all those variables!
cansavvy Jan 15, 2021
5220245
Merge branch 'cansavvy/mapping-table' into cansavvy/mapping-README
cansavvy Jan 15, 2021
d8a599a
Make normal NA
cansavvy Jan 15, 2021
3b238ec
Merge remote-tracking branch 'origin/master' into cansavvy/mapping-RE…
cansavvy Jan 19, 2021
69d7f79
Re-run chromsomal-instability heatmaps
cansavvy Jan 19, 2021
5cdb6af
Merge remote-tracking branch 'origin/master' into cansavvy/update-his…
cansavvy Jan 19, 2021
f6136ce
Re-run cn-status-heatmap.Rmd
cansavvy Jan 19, 2021
86df39f
Re-run known mutational signatures nb
cansavvy Jan 19, 2021
1317989
Add display_order column and instructions
cansavvy Jan 19, 2021
54172ad
Implement `display_order`
cansavvy Jan 19, 2021
48d0761
Also push reran notebook for cn_breaks heatmap
cansavvy Jan 19, 2021
f2764b1
Incorporate jashapiro streamlining suggestion
cansavvy Jan 19, 2021
aec0573
Missed a `dplyr::`
cansavvy Jan 19, 2021
fa67d60
Change to using select and as.factor() and re-run
cansavvy Jan 19, 2021
abff3ff
Drop as.factor and re-run
cansavvy Jan 19, 2021
05969e9
Merge branch 'cansavvy/mapping-README' into cansavvy/update-histology…
cansavvy Jan 20, 2021
451e5e2
Update cnv_status_heatmap with display_order
cansavvy Jan 20, 2021
8b99efb
Re-run known signatures
cansavvy Jan 20, 2021
e264996
Merge branch 'master' into cansavvy/update-histology-colors-1
cansavvy Jan 20, 2021
a796d4b
Merge remote-tracking branch 'origin/master' into cansavvy/update-his…
cansavvy Jan 20, 2021
490ad12
Add jashapiro function change to get rid of spaces
cansavvy Jan 20, 2021
783390b
Re-run mutational signatures so plot names are updated
cansavvy Jan 20, 2021
094dcc6
Push updated notebook html too I guess
cansavvy Jan 20, 2021
0bb8aff
Merge branch 'master' into cansavvy/update-histology-colors-1
jaclyn-taroni Jan 20, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
76 changes: 28 additions & 48 deletions analyses/cnv-chrom-plot/cn_status_heatmap.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -87,11 +87,11 @@ Import color palettes.

```{r}
# Import standard color palettes for project
histology_col_palette <- readr::read_tsv(
file.path(figure_dir, "palettes", "histology_color_palette.tsv")
) %>%
# We'll use deframe so we can use it as a recoding list
tibble::deframe()
histology_label_mapping <- readr::read_tsv(
file.path(figure_dir, "palettes", "histology_label_color_table.tsv")
) %>%
# Select just the columns we will need for plotting
dplyr::select(Kids_First_Biospecimen_ID, display_group, display_order, hex_codes)
```

Read in the divergent color palette and set it up with three colors.
Expand All @@ -114,13 +114,8 @@ divergent_col_palette <- readr::read_tsv(
# Read in metadata
metadata <-
readr::read_tsv(file.path(input_dir, "pbta-histologies.tsv"), guess_max = 10000) %>%
# Easier to deal with NA short histologies if they are labeled something different
dplyr::mutate(short_histology = as.character(tidyr::replace_na(short_histology, "none"))) %>%
# Tack on the sample color using the short_histology column and a recode
dplyr::mutate(sample_color = dplyr::recode(
short_histology,
!!!histology_col_palette
))
# Join on the colors
dplyr::left_join(histology_label_mapping, by = "Kids_First_Biospecimen_ID")
```

### Set up consensus copy number data
Expand All @@ -145,7 +140,7 @@ seg_data <- seg_data %>%
dplyr::select(
metadata,
"Kids_First_Biospecimen_ID",
"short_histology",
"display_group",
"tumor_ploidy"
),
by = c("ID" = "Kids_First_Biospecimen_ID")
Expand Down Expand Up @@ -179,7 +174,7 @@ seg_ranges <- GenomicRanges::GRanges(
end = seg_data$loc.end
),
status = seg_data$status,
histology = seg_data$short_histology,
histology = seg_data$display_group,
biospecimen = seg_data$ID
)
```
Expand Down Expand Up @@ -380,51 +375,36 @@ This annotation object strategy was originally from [chromosomal-instability](ht
histologies <-
data.frame(Kids_First_Biospecimen_ID = bin_calls_df$biospecimen_id) %>%
dplyr::inner_join(metadata %>%
dplyr::select(Kids_First_Biospecimen_ID, short_histology, sample_color)) %>%
dplyr::select(Kids_First_Biospecimen_ID, display_group, display_order, hex_codes)) %>%
# Count numbers of samples per histology group and make new variable with counts
dplyr::group_by(short_histology) %>%
dplyr::mutate(group_n = dplyr::n()) %>%
# Ungroup the data
dplyr::ungroup() %>%
# Temporarily we will put the n = 1 `short_histology` samples in the `Other` group.
#TODO: Remove lines 387 - 400 when this is split into two panels
dplyr::mutate(
short_histology = dplyr::case_when(
group_n < min_group_size ~ "Other",
TRUE ~ as.character(short_histology))) %>%
# Reapply colors to groups
dplyr::mutate(sample_color = dplyr::recode(
short_histology,
!!!histology_col_palette
)) %>%
# ReCount numbers after the Other switches
dplyr::group_by(short_histology) %>%
dplyr::group_by(display_group) %>%
dplyr::mutate(group_n = dplyr::n()) %>%
# Ungroup the data
dplyr::ungroup() %>%
# Add sample sizes
dplyr::mutate(short_histology = factor(paste0(short_histology, " (n = ", group_n, ")"))) %>%
# Put in alphabetical order
dplyr::arrange(short_histology) %>%
# ComplexHeatmap wants this.
tibble::column_to_rownames("Kids_First_Biospecimen_ID")
dplyr::mutate(display_group = factor(paste0(display_group, " (n = ", group_n, ")"))) %>%
# Reorder display_group based on display_order
dplyr::mutate(display_group = forcats::fct_reorder(display_group, display_order)) %>%
# Make sure they are in display_order
dplyr::arrange(display_order) %>%
# Store as rownames
tibble::column_to_rownames("Kids_First_Biospecimen_ID") %>%
# We don't want this to actually be displayed on the heatmap though
dplyr::select(-display_order)
```

Make a color key that's formatted for ComplexHeatmap.

```{r}
# Make color key specific to these samples
histologies_color_key_filtered <- unique(histologies$sample_color)
names(histologies_color_key_filtered) <- unique(histologies$short_histology)

# Drop this column so ComplexHeatmap isn't tempted to plot it
histologies <- dplyr::select(histologies, -sample_color, -group_n)
histologies_color_key_filtered <- unique(histologies$hex_codes)
names(histologies_color_key_filtered) <- unique(histologies$display_group)
Comment on lines +400 to +401
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about separate unique() calls preserving order... I think it should work fine, but it seems somehow dangerous to me. So I present an alternative below.

Suggested change
histologies_color_key_filtered <- unique(histologies$hex_codes)
names(histologies_color_key_filtered) <- unique(histologies$display_group)
histologies_color_key_filtered <- histologies %>%
dplyr::select(display_group, hex_codes) %>%
dplyr::distinct() %>%
dplyr::pull(hex_codes, name = display_group)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I like this better.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out the name argument for pull is not in our version of dplyr for our docker image, so I can't use the last part, but I will add a distinct to this part but in my subsequent PR of this series.


# Get coordinate start positions
hist_start <- match(names(histologies_color_key_filtered), histologies$short_histology)
hist_start <- match(names(histologies_color_key_filtered), histologies$display_group)

# Get coordinate end positions for each histology group
hist_end <- hist_start + summary(histologies$short_histology)
hist_end <- hist_start + summary(histologies$display_group)

# Get mid points of
mid_points <- floor((hist_start + hist_end) /2)
Expand All @@ -434,7 +414,7 @@ mid_points <- floor((hist_start + hist_end) /2)
# Make text labels for chromosome text
hist_text <- ComplexHeatmap::anno_mark(
at = mid_points,
labels = levels(histologies$short_histology),
labels = levels(histologies$display_group),
which = "row",
side = "right",
labels_gp = grid::gpar(cex = 0.65),
Expand All @@ -443,8 +423,8 @@ hist_text <- ComplexHeatmap::anno_mark(

# Create the Heatmap annotation object
hist_annot <- ComplexHeatmap::HeatmapAnnotation(
df = data.frame(histologies),
col = list(short_histology = histologies_color_key_filtered),
df = data.frame(histologies) %>% dplyr::select(-group_n, -hex_codes),
col = list(display_group = histologies_color_key_filtered),
which = "row",
show_annotation_name = FALSE,
show_legend = FALSE,
Expand Down Expand Up @@ -475,7 +455,7 @@ heatmap <- ComplexHeatmap::Heatmap(
bin_calls_mat,
name = "CN status",
col = color_key,
row_split = histologies$short_histology,
row_split = histologies$display_group,
cluster_columns = FALSE,
cluster_rows = FALSE,
rect_gp = grid::gpar(col = "black", lwd = .0005),
Expand Down
Loading