Skip to content
This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

V15 Fix molecular-subtype-chordoma analysis #590

Merged
merged 15 commits into from
Mar 2, 2020

Conversation

cansavvy
Copy link
Collaborator

@cansavvy cansavvy commented Mar 2, 2020

Purpose/implementation Section

There are not enough chordoma samples in the subset CircleCI files to test so I switched to the approach that was used in the other subtyping modules.

Looks like the rename steps were not working because of library load and it was not calling the dplyr::rename but another rename. So to fix I added a series of dplyr::, reran the notebook with v15 and pushed the results.

What GitHub issue does your pull request address?

#574

Reproducibility Checklist

These items were all taken care of previously.

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

@cansavvy cansavvy mentioned this pull request Mar 2, 2020
@cansavvy
Copy link
Collaborator Author

cansavvy commented Mar 2, 2020

@jaclyn-taroni I have two fix ideas for the lack of chordoma samples in subset for this notebook.

  1. Force the subset to include chordoma samples by adding another step in create-subset-files (This is the more painful route since we'd have to re-run the subsetting steps).
  2. Make this filter step in the notebook only occur when it is NOT being run in CircleCI:
    smarcb1_expression <- smarcb1_expression[, which(colnames(expression_data) %in% chordoma_samples) ]

At this point in time, we are thinking 2 would be okay, but I wanna clear it by you before I implement it.

@jaclyn-taroni
Copy link
Member

There is another option - have the first step of this module be it’s own subsetting where all resulting files are committed to the repository. That is what we do for other subtyping modules.

@cansavvy
Copy link
Collaborator Author

cansavvy commented Mar 2, 2020

There is another option - have the first step of this module be it’s own subsetting where all resulting files are committed to the repository. That is what we do for other subtyping modules.

I was just noticing this for the other subtyping modules. I will try to follow their suit then.

@cansavvy cansavvy requested a review from cbethell March 2, 2020 20:10
Copy link
Member

@jashapiro jashapiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few small things, while biggish changes are happening.

@jaclyn-taroni
Copy link
Member

fusion-summary is commented out in a6e2923 #569 because it's going to require a bigger change (because the fusion files are part of what is changing in v15) which is tracked with #578. I intended for #578 to wait until after v15 so we don't hold things up.

@jashapiro
Copy link
Member

Meant to leave that comment in the main thread...

Copy link
Contributor

@cbethell cbethell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the comments below are addressed 👍

```
This bash script the same regardless of where it is called and will first subset the data to `Chordoma` samples
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you're missing a verb here, correct?

# Subset metadata
subset_metadata <- histologies_df %>%
dplyr::filter(short_histology == "Chordoma") %>%
select(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're calling the filter function above using dplyr:: but not doing the same for select here. Is this because filter was behaving similarly to rename as you noted in your original comment? If so, then disregard this comment. If not, removing the dplyr:: altogether (except for in the case of the rename function) or adding the dplyr:: to the rest of the functions would be nice for consistency purposes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter also has a base:: function that can mess with stuff sometimes.

# remove large expression matrix that's no longer needed
rm(expression_data)
# now only the columns correspond to chordoma samples
smarcb1_expression <- smarcb1_expression[, which(colnames(subset_expression_data) %in% subset_metadata$Kids_First_Biospecimen_ID) ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line looks weird (format-wise), did you run this through a styler?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did but it doesn't do anything with really long lines. I'd rather use dplyr::filter here anyway but was trying to keep the original code here.

Copy link
Member

@jashapiro jashapiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@jashapiro
Copy link
Member

Merging before expected CI failure.

@jashapiro jashapiro merged commit 2a86c8e into AlexsLemonade:master Mar 2, 2020
@cansavvy cansavvy deleted the v15-fix-chordoma branch March 25, 2020 20:01
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants