TCGA Lancet data has only "calls" with t_alt_count = 0 or NA #512
Comments
Hi @cansavvy - thanks for noticing this - was this also an issue in V13? I am assuming so, since this file should not have changed. I am cc-ing @tkoganti, @migbro, and @yuankunzhu for this.
I can confirm that it was also an issue for v13:

    > lancet <- data.table::fread("data/release-v13-20200116/pbta-tcga-snv-lancet.vep.maf.gz", data.table = FALSE)
    > summary(lancet$t_alt_count)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
          0       0       0       0       0       0     622
Quick spot check on individual MAFs - seems they are all like this, so not a merge/filter issue (no filtering for these). I asked those above to check out the run, as it appears something went wrong there.
@cansavvy and @jaclyn-taroni it looks like, for some reason, all of the lancet tasks were run with the tumor/normal inputs swapped, while the correct IDs were used in VEP/VCF2MAF, which explains the strange output. @migbro is queueing these up to rerun tonight and I will ask @tkoganti to pick up VCF2MAF/merge in the morning. The strelka2/mutect2 MAF runs were run correctly. Thank you again for finding this!
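For anyone who wants to spot-check other MAFs for the same pattern, here is a minimal sketch in base R. It assumes that a tumor/normal swap upstream would leave `t_alt_count` at 0 while the alt reads land in `n_alt_count`. That is an inference from the explanation above, not something confirmed against the rerun. The toy table and the `looks_swapped` name are made up for illustration.

```r
# Toy MAF-like table; all values are invented for illustration only.
maf <- data.frame(
  Hugo_Symbol = c("TP53", "BRAF", "H3F3A", "NF1"),
  t_ref_count = c(40, 35, 50, 28),
  t_alt_count = c(0, 0, 0, NA),
  n_ref_count = c(22, 30, 18, 25),
  n_alt_count = c(12, 9, 20, NA)
)

# Assumption: a call with no alt reads in the tumor but alt reads in the
# normal is the pattern a tumor/normal swap would produce.
# Rows with NA counts are treated as "not flagged".
looks_swapped <- !is.na(maf$t_alt_count) & !is.na(maf$n_alt_count) &
  maf$t_alt_count == 0 & maf$n_alt_count > 0

# Fraction of all rows showing the swapped pattern.
frac_swapped <- mean(looks_swapped)
print(frac_swapped)
```

A fraction near 1 across a whole file would point at the run itself rather than at a merge or filter step.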
readme update for upcoming lancet MAF per issue [here](#512)
* add v14 release docs
  - update `release-notes.md`
  - update `data-files-description.md`
  - update `data-formats.md`
* Update download-data.sh: add new folder for V14 to download script
* remove intersect_cds_WXS.bed per @cansavvy [comment](#432 (comment))
* add intersect_cds_lancet.bed and description from @cansavvy [comments](#507 (comment))
* Update release-notes.md: add removal of polyA+stranded samples that were still in file in v13
* Update data-formats.md: add more information on gistic output files, to replace PR [#456](#456)
* Reorganize derived CN section and make formatting consistent
* Add links to relevant subtyping modules
* Update release-notes.md: readme update for upcoming lancet MAF per issue [here](#512)
* Update doc/release-notes.md: yup, nice catch
  Co-Authored-By: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
* Update release-notes.md: fix embryonal broad histology
  Co-authored-by: Jaclyn Taroni <jaclyn.n.taroni@gmail.com>
closed via #507
What data file(s) does this issue pertain to?
The TCGA Lancet data: `pbta-tcga-snv-lancet.vep.maf.gz` of version 14
What release are you using?
v14
Put your question or report your issue here.
I was confused because Lancet's data for TCGA was not agreeing at all with Mutect or Strelka.
I looked into the VAF distributions and saw Lancet was all zeroes, because `t_alt_count` is only zeroes and NAs. `t_alt_count`'s of 0 shouldn't be calls. Did something happen with a filtering step? It looks like it may have been filtered by `n_alt_count` > 0?
As a positive control, and by contrast, Mutect and Strelka show no 0's for `t_alt_count`.
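The comparison described above can be sketched roughly as follows. This assumes the common definition VAF = `t_alt_count` / (`t_ref_count` + `t_alt_count`), which the issue does not spell out. The `calc_vaf` helper and the toy counts are hypothetical and are not taken from the actual MAF files.

```r
# Hypothetical helper: tumor VAF as alt reads over total (ref + alt) reads.
# Returns NA when depth is zero or when either count is missing.
calc_vaf <- function(t_ref_count, t_alt_count) {
  depth <- t_ref_count + t_alt_count
  ifelse(depth > 0, t_alt_count / depth, NA_real_)
}

# Toy calls from three callers; all counts are invented for illustration.
calls <- data.frame(
  caller      = c("lancet", "lancet", "mutect2", "mutect2", "strelka2", "strelka2"),
  t_ref_count = c(40, 35, 30, 45, 60, 20),
  t_alt_count = c(0, NA, 10, 15, 20, 20)
)
calls$vaf <- calc_vaf(calls$t_ref_count, calls$t_alt_count)

# Fraction of calls per caller with zero or missing tumor alt reads.
zero_or_na <- is.na(calls$t_alt_count) | calls$t_alt_count == 0
frac_by_caller <- tapply(zero_or_na, calls$caller, mean)
print(frac_by_caller)
```

With real data, the same `tapply` summary over the three callers' MAFs would show whether only Lancet has the all-zero pattern.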