
setting directories #23

Open
etr-asu opened this issue Feb 20, 2023 · 5 comments

etr-asu commented Feb 20, 2023

Hi,

I am trying to run the DB creation part of the workflow, and I am struggling to work out all of the locations I need to point to in order to run it locally on my system.
https://github.com/functional-dark-side/agnostos-wf/wiki#db-creation

Running this command:
snakemake --conda-frontend conda --use-conda -j 100 --config module="creation" --cluster-config config/cluster.yaml --cluster "sbatch --export=ALL -t {cluster.time} -c {threads} --ntasks-per-node {cluster.ntasks_per_node} --nodes {cluster.nodes} --cpus-per-task {cluster.cpus_per_task} --job-name {rulename}.{jobid} --partition {cluster.partition}" -R --until creation_workflow_report

With my attached YAML files (renamed to .txt so I can upload them).

Results in this error:

rule gene_prediction:
    input: /data/etrembat/agnostos_test/db_creation_data/TARA_039_041_SRF_0.1-0.22_5K_contigs.fasta
    output: /vol/cloud/agnostos_test/db_creation/gene_prediction/orf_seqs.fasta, /vol/cloud/agnostos_test/db_creation/gene_prediction/orf_partial_info.tsv
    log: logs/gene_stdout.log, logs/gene_stderr.err
    jobid: 11
    benchmark: benchmarks/gene_prediction.tsv
    reason: Missing output files: /vol/cloud/agnostos_test/db_creation/gene_prediction/orf_partial_info.tsv, /vol/cloud/agnostos_test/db_creation/gene_prediction/orf_seqs.fasta
    resources: mem_mb=1000, mem_mib=954, disk_mb=1000, disk_mib=954, tmpdir=<TBD>

I am unsure where these files should be, since the download for the creation data only contains the contigs.fasta files.

Thanks!

config_yaml.txt
config_communities_yaml.txt
cluster_yaml.txt


genomewalker commented Feb 21, 2023

Hi @etr-asu

The error seems to be related to the fact that you need to define paths that exist on your system here in config.yaml:

rdir: "/vol/cloud/agnostos_test/db_creation"
idir: "/vol/cloud/agnostos_test/db_creation_data"
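
For example, on a local system these entries might look roughly like the sketch below. The paths are only illustrative; the point is that idir must be an existing directory holding the downloaded contigs fasta files, and rdir an existing, writable directory where the results will be written:

# a minimal sketch, assuming /home/user/agnostos_test is a writable location on your machine
rdir: "/home/user/agnostos_test/db_creation"        # results are written here, e.g. gene_prediction/orf_seqs.fasta
idir: "/home/user/agnostos_test/db_creation_data"   # must contain the downloaded TARA_039_041_SRF_0.1-0.22_5K_contigs.fasta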

Antonio


etr-asu commented Feb 21, 2023

Thanks for responding! The config.yaml file now reads:

# This file should contain everything to configure the workflow on a global scale.
# In case of sample based data, it should be complemented by a samples.tsv file that contains
# one row per sample. It can be parsed easily via pandas.
wdir: "/data/etrembat/agnostos-wf/workflow"
rdir: "/data/etrembat/agnostos_test/db_creation"
idir: "/data/etrembat/agnostos_test/db_creation_data"

And I get the same error:

rule gene_prediction:
    input: /data/etrembat/agnostos_test/db_creation_data/TARA_039_041_SRF_0.1-0.22_5K_contigs.fasta
    output: /data/etrembat/agnostos_test/db_creation/gene_prediction/orf_seqs.fasta, /data/etrembat/agnostos_test/db_creation/gene_prediction/orf_partial_info.tsv
    log: logs/gene_stdout.log, logs/gene_stderr.err
    jobid: 2
    benchmark: benchmarks/gene_prediction.tsv
    reason: Missing output files: /data/etrembat/agnostos_test/db_creation/gene_prediction/orf_seqs.fasta, /data/etrembat/agnostos_test/db_creation/gene_prediction/orf_partial_info.tsv
    resources: mem_mb=1000, mem_mib=954, disk_mb=1000, disk_mib=954, tmpdir=<TBD>


genomewalker commented

Hi @etr-asu
Can you send the log files? The message you pasted doesn't show an error, only the reason why the rule is being executed.


etr-asu commented Feb 22, 2023

Hi @genomewalker, I think the issue is that the cluster.yaml file asks for more than what my university HPC allows by default (for example, I can't request a 1000-hour job). Is there a minimum set of requirements you would recommend? Or some other approach for users adapting the workflow to a shared HPC system? Thanks!


genomewalker commented Feb 24, 2023

Yes, you should adapt the cluster.yaml settings to your HPC system. The time will depend on the size of your dataset; you can use the maximum time allowed for the partition you will use, which you can check with the sinfo command.
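
For example, adapted to a shared HPC the top of cluster.yaml could look roughly like this. This is only a sketch: it assumes the usual Snakemake cluster-config layout with a __default__ entry, and the partition name and numbers are placeholders that you should replace with your own cluster's limits (the sinfo format string in the comment is one way to see them):

# a minimal sketch of cluster.yaml, assuming a standard Snakemake cluster-config layout
# the partition name and limits below are placeholders; check your system's limits with e.g.
#   sinfo -o "%P %l %c %m"   (partition, time limit, CPUs per node, memory per node)
__default__:
  partition: "general"        # replace with a partition that exists on your HPC
  time: "24:00:00"            # keep at or below the partition's TIMELIMIT (instead of 1000 hours)
  nodes: 1
  ntasks_per_node: 1
  cpus_per_task: 4            # scale to your dataset and your per-user limits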
