High Memory Consumption During QC Process #198

Open
Keep-Raining opened this issue Dec 31, 2024 · 5 comments

@Keep-Raining

Keep-Raining commented Dec 31, 2024

When running the QC step, RAM usage spikes unexpectedly and the process is killed by the operating system.
I am using the polars branch of pycisTopic, and the command is below:

pycistopic qc \
    --fragments data/E11.5_part2.bed.gz \
    --regions outs2/consensus_peak_calling/E11.5_consensus_regions.bed \
    --tss outs2/qc_part2/mm10-tss_modified.bed \
    --output outs2/qc_part2/E11.5_part2

I also tried setting n_threads to 2, but that didn't help either.
My data contains about 100,000 cells, and I have ~350GB of memory.
How much memory should I request to run the QC step on large datasets?

@ghuls
Member

ghuls commented Dec 31, 2024

Do you know in which part of the QC you run out of memory?
How big is your fragments file when uncompressed?

zcat data/E11.5_part2.bed.gz | wc -c
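
For environments where `zcat` is not available, the same measurement can be sketched in Python. This is a minimal sketch, not part of pycisTopic: the function name `uncompressed_size` and the chunk size are made up for illustration. It streams the gzip file in chunks, so it needs very little memory itself.

```python
import gzip


def uncompressed_size(path, chunk_size=1 << 20):
    """Return the number of bytes in a gzip file after decompression.

    Hypothetical helper: reads the file in fixed-size chunks so the whole
    file never has to fit in memory (same result as `zcat file | wc -c`).
    """
    total = 0
    with gzip.open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:  # end of file reached
                break
            total += len(chunk)
    return total


# Example call (path taken from the command above):
# print(uncompressed_size("data/E11.5_part2.bed.gz"))
```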

@Keep-Raining
Author

> Do you know in which part of the QC you run out of memory? How big is your fragments file?
>
> zcat data/E11.5_part2.bed.gz | wc -c

I'm not sure at which step I ran out of memory. For now I have to split my files for the QC process and merge the resulting cisTopic objects later.
My fragments file is 92.84GB.
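
One way such a split could be done is sketched below. It is an assumption about the approach, not the method actually used in this thread: the function name `split_fragments_by_barcode` and the output naming are hypothetical, and the sketch assumes a tab-separated fragments file with the cell barcode in the fourth column. Barcodes are assigned to parts with a CRC32 hash, so all fragments of a cell land in the same part and per-cell QC statistics stay complete.

```python
import gzip
import zlib


def split_fragments_by_barcode(fragments_path, output_prefix, n_parts):
    """Stream a gzipped fragments file into n_parts files, one part per barcode.

    Assumes tab-separated columns (chrom, start, end, barcode, ...).
    Comment lines starting with '#' and blank lines are skipped.
    Line order within each part follows the input order.
    """
    out_paths = [f"{output_prefix}.part{i}.bed.gz" for i in range(n_parts)]
    outputs = [gzip.open(p, "wt") for p in out_paths]
    try:
        with gzip.open(fragments_path, "rt") as handle:
            for line in handle:
                if line.startswith("#") or not line.strip():
                    continue
                barcode = line.rstrip("\n").split("\t")[3]
                # Deterministic hash: the same barcode always maps to the same part.
                part = zlib.crc32(barcode.encode()) % n_parts
                outputs[part].write(line)
    finally:
        for out in outputs:
            out.close()
    return out_paths
```

Pure-Python gzip writing is slow for a file of this size, so this is meant to show the idea rather than to be the fastest option.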

@ghuls
Member

ghuls commented Jan 1, 2025

If you can paste the output of the pycistopic qc command, it should indicate at which step it failed.

@Keep-Raining
Author

> If you can paste the output of the pycistopic qc command, it should indicate at which step it failed.

I don’t remember any output besides “killed.” Perhaps it failed at the beginning? Now that I have split my data, the memory is sufficient. I will provide more information if I need to run QC on the large dataset again.
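
To avoid losing the output next time, the command could be run through a small wrapper that saves everything it prints and reports peak memory. This is a rough sketch under stated assumptions: `run_and_report` is a hypothetical helper, it relies on the Unix-only `resource` module, and `ru_maxrss` is reported in kilobytes on Linux (bytes on macOS). A return code of -9 means the process received SIGKILL, which is what the out-of-memory killer sends.

```python
import resource
import subprocess


def run_and_report(cmd, log_path):
    """Run cmd, write stdout and stderr to log_path, return (returncode, peak_rss).

    peak_rss is the largest resident set size among child processes this
    Python process has waited for so far (kilobytes on Linux).
    """
    with open(log_path, "w") as log:
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, peak_rss


# Example call, using the command from the first comment:
# code, peak = run_and_report(
#     ["pycistopic", "qc",
#      "--fragments", "data/E11.5_part2.bed.gz",
#      "--regions", "outs2/consensus_peak_calling/E11.5_consensus_regions.bed",
#      "--tss", "outs2/qc_part2/mm10-tss_modified.bed",
#      "--output", "outs2/qc_part2/E11.5_part2"],
#     "qc_stdout_stderr.txt",
# )
# print("return code:", code, "peak RSS:", peak)
```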

@ghuls
Member

ghuls commented Jan 6, 2025

The progress is written to the log file: outs2/qc_part2/E11.5_part2.pycistopic_qc.log
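
After a killed run, the last few lines of that log file show how far the QC got. A minimal sketch for reading them without loading the whole file (the helper name `tail_lines` is made up for illustration; the log format itself is not assumed):

```python
from collections import deque


def tail_lines(path, n=20):
    """Return the last n lines of a text file, without trailing newlines."""
    with open(path, "r", errors="replace") as handle:
        # deque with maxlen keeps only the most recent n lines while streaming.
        return [line.rstrip("\n") for line in deque(handle, maxlen=n)]


# Example call with the log path mentioned above:
# for line in tail_lines("outs2/qc_part2/E11.5_part2.pycistopic_qc.log"):
#     print(line)
```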
