-
Notifications
You must be signed in to change notification settings - Fork 83
CNV consensus (7 of 6): Remove duplicated coordinates and NULLs #416
CNV consensus (7 of 6): Remove duplicated coordinates and NULLs #416
Conversation
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
…t_calling_updated.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
…t_calling_updated.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
analyses/copy_number_consensus_call/src/scripts/remove_dup_NULL_entries.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @fingerfen. This looks good. I suggested sorting the coordinates before writing, and a minor reversion to fix, but that's about it.
Thank you for your contributions!
analyses/copy_number_consensus_call/src/scripts/remove_dup_NULL_entries.py
Outdated
Show resolved
Hide resolved
analyses/copy_number_consensus_call/src/scripts/remove_dup_NULL_entries.py
Outdated
Show resolved
Hide resolved
analyses/copy_number_consensus_call/src/scripts/remove_dup_NULL_entries.py
Outdated
Show resolved
Hide resolved
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
…L_entries.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
…L_entries.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
…L_entries.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
…L_entries.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
@jashapiro Thanks for catching the reversion. That was not on purpose. Thank you for your effort and quick responses. Also, since we are finished here, I just want to revisit the 3rd (3 of n) PR that I made. Recall you wanted to see if Bedtools Subtract could be a quicker alternative to my Python script. Do you think we should go back at this point to take a look and make changes to that? If I recall correctly, that would be the only part left to improve in this pipeline. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, with just a fix to potential sorting error to add (and a format suggestion). The sorting fix would mean rerunning the last step though to generate new output.
On the topic of updating step 3, yes, I think that is worth doing (in a separate PR). I looked back at some test code I had written, and it looks like you should be able to use:
"bedtools subtract -N -a {input.bedfile} -b {input.bad_list} -f 0.5 > {output.filtered}"
in place of your get_rid_bad_segments.py
script, assuming the input files are actually bedfiles, which should just mean replacing space delimiters with tabs in the intermediate files (making them true bedfiles rather than pseudo-bed).
analyses/copy_number_consensus_call/src/scripts/remove_dup_NULL_entries.py
Outdated
Show resolved
Hide resolved
analyses/copy_number_consensus_call/src/scripts/remove_dup_NULL_entries.py
Outdated
Show resolved
Hide resolved
…L_entries.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
…L_entries.py Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
Co-Authored-By: jashapiro <josh.shapiro@ccdatalab.org>
@fingerfen - it looks like the last addition before you merged in the master branch (7b500c4) failed on the copy number consensus step: https://circleci.com/gh/AlexsLemonade/OpenPBTA-analysis/2390 Here's the full output: build_2390_step_116_container_0.txt |
I just tested the line break fix that @jashapiro just did and it worked on my local. I am not sure why it is failing on CI at the momment. |
It didn't really fail... I just merged in master before it finished running. |
@fingerfen I am just waiting on the update to |
@fingerfen I'm guessing the last commit (f72938a) was to respond to my request for the updated version, but that one doesn't seem to be the latest... it still uses |
@jashapiro I apologize. I didn't re-run the merging step. I just re-run it and this new file should have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! We'll merge as soon as tests finish again.
Purpose/implementation Section
Remove duplicated coordinates and NULLs in the result file
What GitHub issue does your pull request address?
#128
analyses/README.md
.