Question about different version. #353

Closed
Shenglai opened this issue Mar 3, 2024 · 9 comments

Comments

Shenglai commented Mar 3, 2024

My name is Shenglai Li, and I am reaching out from GDC regarding an issue we encountered while investigating the possibility of upgrading the version of PureCN for our new release.

We have been using PureCN version 2.2.0 in our production environment, and it has been functioning well. However, when testing upgraded versions (2.6.4 and the latest), we encountered failures with a specific capture kit (SureSelect v5). Unfortunately, due to data privacy concerns, we are unable to share the data with you for debugging purposes.

I have attached the logs from both versions (purecn.2.2.0.log and purecn.2.6.4.log) for your reference.

purecn.2.2.0.log
purecn.2.6.4.log

I apologize that the UUIDs and file names in the two logs do not match, but I am fairly sure there are a few cases that failed on the new version and passed on the older one. I tried not only version 2.6.4 but also the latest version, and I think those cases fail there as well.

Given the circumstances, we would greatly appreciate your insight on which parts of the code we should investigate independently. Any guidance or suggestions you can provide would be immensely helpful in resolving this issue.

Owner

lima1 commented Mar 3, 2024

It should not crash like that, but it looks like no variants are passing the filters. This might be related to #320. I would turn off the base quality filter and make sure this is dealt with upstream.

Owner

lima1 commented Mar 3, 2024

Also, the population allele frequency check labels only 7 variants as germline. Make sure that germline variants are not filtered out (especially in tumor/normal pairs).

Author

Shenglai commented Mar 3, 2024

I think we ran it with tumor-only samples. The upstream processing was handled by the GATK4 Mutect2 pipeline (4.2.4). I will definitely try turning off the base quality filter. FYI, the upstream workflow is https://github.com/NCI-GDC/gatk4_mutect2_cwl/blob/master/subworkflows/gatk4.2.4.1_mutect2_workflow.cwl. I believe it runs only the Mutect2 best-practices filtering (filtering alignment artifacts, etc.).

Owner

lima1 commented Mar 4, 2024

I think 4.2.4 should not suffer from the BQ issue. Can you check that the exact same sample works with old PureCN?

Author

Shenglai commented Mar 4, 2024

I'm still in the process of checking for a sample that works with the old PureCN. (Sorry for the delay. Our system is under migration.)
However, the sample I posted with only 7 variants labeled as germline fails on both versions.
purecn.2.2.0.log
purecn.2.8.1.log
Is there anything I can do to get it to pass in PureCN? Unfortunately, for this particular sample, the Mutect2 VCF contains no variants labeled as germline or panel_of_normals.
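
One way to confirm the point about missing labels is to tally the FILTER column of the Mutect2 VCF. The following is only a minimal sketch: it assumes a plain-text, tab-delimited VCF, and the function name `count_filter_labels` and the example records are hypothetical, not taken from the GDC data.

```python
from collections import Counter


def count_filter_labels(vcf_lines):
    """Count how often each FILTER label occurs across variant records."""
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines (both "##" meta and "#CHROM").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # malformed record, FILTER column missing
        # FILTER is the 7th column; multiple labels are separated by ";".
        for label in fields[6].split(";"):
            counts[label] += 1
    return counts


# Hypothetical records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.",
    "chr1\t200\t.\tC\tT\t.\tgermline;panel_of_normals\t.",
    "chr2\t300\t.\tG\tA\t.\tgermline\t.",
]
counts = count_filter_labels(example)
print(counts["germline"], counts["panel_of_normals"], counts["PASS"])
```

For a real file, the same function can be fed an open file handle (for a bgzipped VCF, `gzip.open(path, "rt")` should work). A count of zero for both `germline` and `panel_of_normals` would match what is described above.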

Owner

lima1 commented Mar 4, 2024

I assume that's a tumor/normal pair that was not run in Mutect2 with the flags to genotype germline sites? Then there is nothing we can do; PureCN needs germline calls. The runtime is not much longer and the filtering is trivial, so it wouldn't be a big change for your T/N pipeline. Alternatively, you can merge somatic and germline VCFs, but that's more of a headache. You can also skip somatic variants entirely, but this then obviously won't give you the subclonality. In samples without many CNVs, somatic mutations also provide some purity signal (like in microsatellite-high CRC).
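
As a rough illustration of what "the flags to genotype germline" could look like, here is a sketch that assembles a Mutect2 command line. The thread does not name the flags; `--genotype-germline-sites` and `--genotype-pon-sites` are assumed here to be the ones meant, and all file names and the helper `mutect2_command` are hypothetical placeholders.

```python
def mutect2_command(tumor_bam, reference, germline_resource, output_vcf):
    """Build a Mutect2 argument list that also genotypes germline and PoN sites.

    Assumption: the two --genotype-* options are the flags referred to above.
    """
    return [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,
        "--germline-resource", germline_resource,
        # Keep germline sites in the output so downstream tools can use them.
        "--genotype-germline-sites", "true",
        # Keep sites that are present in the panel of normals as well.
        "--genotype-pon-sites", "true",
        "-O", output_vcf,
    ]


# Hypothetical paths for illustration only.
cmd = mutect2_command(
    "tumor.bam", "reference.fa", "af-only-gnomad.vcf.gz", "tumor.mutect2.vcf.gz"
)
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` unchanged. A real pipeline would add its own options (normal sample, panel of normals, intervals) on top of this.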

Author

Shenglai commented Mar 4, 2024

Unfortunately, it was run through Mutect2 as tumor-only. :(

Owner

lima1 commented Mar 4, 2024

You'll figure it out (maybe from reading the GATK logs) :-) Probably some germline filtering going on somewhere.

Author

Shenglai commented Mar 6, 2024

Thanks for your input. I don't think the initial issue actually exists. At least, I can no longer find any jobs that fail on the later version but pass on the old version. I will close the issue. Thanks again for the help!

Shenglai closed this as completed Mar 6, 2024