Skip to content
This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

CNV consensus (8 of 6): Changing Step3 to use Bedtools Subtract #430

Merged
merged 13 commits into from
Jan 14, 2020

Conversation

nhatduongnn
Copy link
Contributor

Purpose/implementation Section

Increase code run time and efficiency

What GitHub issue does your pull request address?

#128

Which areas should receive a particularly close look?

The changes in the cnv_consensus.tsv. The Manta CNV order changed for one singular consensus CNV call.

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.

Copy link
Member

@jashapiro jashapiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I forgot that we had already made the intermediate files tab delimited! The one change is that I think we may as well go ahead and remove the now orphaned get_rid_bad_segments.py script from the repository (it will live on in the git history).

One other related comment: In some discussions around #392, we have decided we would very much like to preserve the cnvkit seg.mean column into this output file. I think this should be as simple as modifying the awk commands to include that column in the output (adding NA, perhaps for the other callers), modifying the first_merge bedtools command to also include that column, and finally altering restructure_column.py to include that column as well.

I don't know how much time you have left to work on this before your break ends, so let me know if you will be able to handle that modification this week, or if it should be handled by one of us.

@nhatduongnn
Copy link
Contributor Author

@jashapiro I just got rid of the get_rid_bad_segments.py file. Sad but a necessary step 😃 .

As for the discussion around #392, I just had a look and it seems that, just like you said, we will have to make adjustment to the awk command by giving cnvkit its seg.mean column and give each of the other two callers a column of NA. We then have to change first_merge and restructure_column.py. With this, the format of any CNV in columns 4, 5, and 6 would be chr:start:end:copy_number:seg.mean.

Sadly, my break ended last Monday and I have been back in school for a week now. That's why it took a while for me to make this PR. However, I would like to see my project until the end so I think I can handle the implementation of seg.mean this week and have you look over it. Will this be part (9 of 6)?

@jashapiro
Copy link
Member

Sadly, my break ended last Monday and I have been back in school for a week now. That's why it took a while for me to make this PR. However, I would like to see my project until the end so I think I can handle the implementation of seg.mean this week and have you look over it. Will this be part (9 of 6)?

Okay, I will look forward to seeing your changes. You can just submit them as a new PR: no need to keep up the numbering if you don't want to!

Copy link
Member

@jashapiro jashapiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! Thank you for your contributions!

@jaclyn-taroni jaclyn-taroni merged commit 998b84e into AlexsLemonade:master Jan 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants