Singularity image trying to access Amazon S3 #1778

Closed
dmd opened this issue Sep 16, 2019 · 35 comments
Labels
container, faq, templateflow

Comments

@dmd

dmd commented Sep 16, 2019

In 1.5.0, the Singularity image fails with:

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='templateflow.s3.amazonaws.com', port=443): Max retries exceeded with url: /tpl-OASIS30ANTs/tpl-OASIS30ANTs_res-01_T1w.nii.gz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x2aaacc203198>: Failed to establish a new connection: [Errno 110] Connection timed out'))

@dmd
Author

dmd commented Sep 16, 2019

This fails because compute nodes, at our institution, do not have internet access. Assuming this is something 1.5.0 needs (this didn't happen in 1.4.1), is there a way to tell the process doing this inside the container to go via a proxy?
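
A quick way to confirm that a compute node cannot reach the template bucket directly is a plain TCP connection test. The sketch below is only an illustration and is not part of fMRIPrep; the function name `can_reach` and the 5-second default timeout are assumptions. It tests direct connectivity only, so it will also report failure on a node that can reach the host solely through a proxy.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return True if a direct TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host name and opens the socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Calling `can_reach("templateflow.s3.amazonaws.com")` on a compute node, with the host name taken from the error message above, would show whether the timeout is reproducible outside the container.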

@effigies
Member

I believe this and #1777 are duplicates of #1699, but I might be wrong...

Can you try:

export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/.cache/templateflow

@dmd
Author

dmd commented Sep 16, 2019

I'm having trouble understanding what's wanted here. Which of the following do I need?

  1. Does the container need internet access, in which case I must set SINGULARITYENV_https_proxy?

  2. Do I need to map /home/fmriprep (-B $HOME:/home/fmriprep) ?

  3. Do I need SINGULARITYENV_TEMPLATEFLOW_HOME?

Or some combination of these?

@oesteban
Member

  1. Does the container need internet access, in which case I must set SINGULARITYENV_https_proxy?

Yes, it does. However, you can pre-download everything you need and run on a node without internet access.

2. Do I need to map /home/fmriprep (-B $HOME:/home/fmriprep) ?

Although this might work, I would recommend mapping only the templateflow folder (see below).

3. Do I need SINGULARITYENV_TEMPLATEFLOW_HOME?

I think this is the easiest route to combine the answers to your three questions.

Binding a TemplateFlow home

This would be the recommendation assuming your nodes have internet access:

  1. Prepare a folder where templates will be stored (i.e., a persistent file-system, which you probably want to make available to a group of users that will run fMRIPrep). Let's say this path in the host is /share/group/TemplateFlow.
  2. Before running the singularity image, tell Singularity that you want to set up some environment variables. In particular, you'll change the templateflow home folder:
export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/.cache/templateflow
  3. When running fMRIPrep, bind the templateflow folder: -B /share/group/TemplateFlow:/home/fmriprep/.cache/templateflow.

Notes:

  • Some singularity installations auto-bind $HOME. Of those, some will update the $HOME variable and some won't. This is very problematic, and that's why I'm suggesting to bind only the templateflow folder AND setting $SINGULARITYENV_TEMPLATEFLOW_HOME.
  • Some singularity installations will not allow you to bind a directory into a path that does not exist in the image. This is why I'm suggesting /home/fmriprep/.cache/templateflow for the bind point.
  • Make sure that fMRIPrep users have permissions to access /share/group/TemplateFlow.
  • Make sure you run singularity with the flag --cleanenv so that the local $TEMPLATEFLOW_HOME does not leak into the container.
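
The binding recipe above can be sketched as a small Python launcher. This is a minimal illustration under stated assumptions: the helper name `build_singularity_call` is hypothetical, the image path and host folder passed to it are placeholders, and it uses only the `-B` and `--cleanenv` options and the `SINGULARITYENV_` prefix already mentioned in this thread.

```python
import os

# Bind point inside the image, as recommended in the steps above.
CONTAINER_TF_HOME = "/home/fmriprep/.cache/templateflow"


def build_singularity_call(image, host_templateflow, fmriprep_args,
                           singularity="singularity"):
    """Return (argv, env) for running fMRIPrep with a bound TemplateFlow home."""
    env = dict(os.environ)
    # Variables prefixed with SINGULARITYENV_ are set inside the container.
    env["SINGULARITYENV_TEMPLATEFLOW_HOME"] = CONTAINER_TF_HOME
    argv = [
        singularity, "run",
        "-B", f"{host_templateflow}:{CONTAINER_TF_HOME}",
        "--cleanenv",  # keep the host $TEMPLATEFLOW_HOME out of the container
        image,
        *fmriprep_args,
    ]
    return argv, env
```

The returned pair could then be passed to `subprocess.run(argv, env=env, check=True)`; keeping the bind point and the environment variable in one constant avoids the two drifting apart.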

When you don't have internet access from compute nodes.

You'll need to prepare the /share/group/TemplateFlow accordingly. The easiest route is using a Python 3 environment.

  1. Point templateflow to the shared folder:
export TEMPLATEFLOW_HOME=/share/group/TemplateFlow
  2. Install templateflow:
python -m pip install templateflow
  3. Fetch those templates you plan to use:
python -c "from templateflow import api; api.get('MNI152NLin2009cAsym')"
python -c "from templateflow import api; api.get('OASIS30ANTs')"
  4. Follow the previous guidelines (when internet access is available), and make sure you run fMRIPrep with the flag --notrack to avoid further internet access attempts.
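
The fetch step can also be written as one short Python function instead of several `python -c` calls. This is a sketch, not an official recipe: the name `prefetch_templates` and the injectable `fetch` argument are assumptions added for illustration, and the environment variable is set before the import on the assumption that templateflow reads `TEMPLATEFLOW_HOME` when it is imported.

```python
import os


def prefetch_templates(names, templateflow_home, fetch=None):
    """Download each named template into templateflow_home.

    `fetch` defaults to templateflow.api.get; it can be replaced with a
    stub so the function can be exercised without network access.
    """
    # Mirrors the `export TEMPLATEFLOW_HOME=...` step in the list above.
    os.environ["TEMPLATEFLOW_HOME"] = str(templateflow_home)
    if fetch is None:
        from templateflow import api  # requires `pip install templateflow`
        fetch = api.get
    fetched = []
    for name in names:
        fetch(name)
        fetched.append(name)
    return fetched
```

For example, `prefetch_templates(["MNI152NLin2009cAsym", "OASIS30ANTs"], "/share/group/TemplateFlow")` would run on a machine that does have internet access and write to the shared folder.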

@dmd
Author

dmd commented Sep 16, 2019

Thanks for this.

If I only bind:

-B /data/TemplateFlow:/home/fmriprep/.cache/templateflow

then I get:

Traceback (most recent call last):
  File "/usr/local/miniconda/bin/fmriprep", line 10, in <module>
    sys.exit(main())
  File "/usr/local/miniconda/lib/python3.7/site-packages/fmriprep/cli/run.py", line 311, in main
    opts = get_parser().parse_args()
  File "/usr/local/miniconda/lib/python3.7/site-packages/fmriprep/cli/run.py", line 284, in get_parser
    latest = check_latest()
  File "/usr/local/miniconda/lib/python3.7/site-packages/fmriprep/cli/version.py", line 22, in check_latest
    cachefile.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1241, in mkdir
    self._accessor.mkdir(self, mode)
OSError: [Errno 30] Read-only file system: '/home/fmriprep/.cache/fmriprep'

It looks like it still wants to write there.

My complete command now is:

export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/.cache/templateflow
/cm/shared/singularity/bin/singularity run \
    -B /data:/data -B /data1:/data1 -B /data2:/data2 -B /data3:/data3 -B /cm/shared:/cm/shared \
    -B /data/TemplateFlow:/home/fmriprep/.cache/templateflow \
    --cleanenv \
    /cm/shared/singularity/images/fmriprep-1.5.0.simg \
    /data/ajanes/testing \
    /data/ajanes/testing/derivatives \
    participant \
    --fs-license-file /cm/shared/freesurfer-6.0.1/license.txt \
    --participant_label acj \
    --output-spaces "MNI152NLin2009cAsym:res-2 anat func fsaverage" \
    --n_cpus 4 \
    --mem-mb 8192 \
    --notrack \
    --use-plugin /data/ddrucker/workaround.yml \
    --use-aroma --ignore-aroma-denoising-errors \
    --use-syn-sdc \
    -w  /data/ajanes/testing/fmriprep-work

@oesteban
Member

Not sure if this will make any difference, but please note that the folder it is complaining about is /home/fmriprep/.cache/fmriprep (instead of /home/fmriprep/.cache/templateflow).

There must be a glitch somewhere in your command line.

@dmd
Author

dmd commented Sep 16, 2019

My command line is exactly as pasted above...

@effigies
Member

Oh, this is the version check that's causing the problem in #1777, @oesteban.

@oesteban
Member

Can you ls /data/TemplateFlow ?

@dmd
Author

dmd commented Sep 16, 2019

$ ls -l /data/TemplateFlow/
total 64
drwxrwxr-x  2 ddrucker ddrucker  4096 Sep 16 13:54 tpl-fsaverage
drwxrwxr-x  3 ddrucker ddrucker  4096 Sep 16 13:54 tpl-fsLR
drwxrwxr-x  3 ddrucker ddrucker  4096 Sep 16 13:54 tpl-MNI152Lin
drwxrwxr-x  2 ddrucker ddrucker 12288 Sep 16 13:54 tpl-MNI152NLin2009cAsym
drwxrwxr-x  3 ddrucker ddrucker  8192 Sep 16 13:54 tpl-MNI152NLin6Asym
drwxrwxr-x  3 ddrucker ddrucker  4096 Sep 16 13:54 tpl-MNI152NLin6Sym
drwxrwxr-x 14 ddrucker ddrucker  4096 Sep 16 13:54 tpl-MNIInfant
drwxrwxr-x  9 ddrucker ddrucker  4096 Sep 16 13:54 tpl-MNIPediatricAsym
drwxrwxr-x  3 ddrucker ddrucker  4096 Sep 16 13:54 tpl-NKI
drwxrwxr-x  2 ddrucker ddrucker  4096 Sep 16 13:54 tpl-OASIS30ANTs
drwxrwxr-x  2 ddrucker ddrucker  4096 Sep 16 13:54 tpl-PNC

@dmd
Author

dmd commented Sep 16, 2019

Does that need to be writable by other users, or just readable?

@oesteban
Member

oesteban commented Sep 16, 2019

Judging by #1778 (comment) I'm now inclined to think you just found a nasty bug.

EDIT: if all templates are downloaded, it works if it is just readable (and folders executable)

@dmd
Author

dmd commented Sep 16, 2019

It's definitely trying to write to /home/fmriprep/.cache/fmriprep

@dmd
Author

dmd commented Sep 16, 2019

I'm getting on STDERR:

Downloading https://templateflow.s3.amazonaws.com/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_desc-brain_mask.nii.gz

and then

requests.exceptions.ConnectionError: HTTPSConnectionPool(host='templateflow.s3.amazonaws.com', port=443): Max retries exceeded with url: /tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_desc-brain_mask.nii.gz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x2aaadb08d748>: Failed to establish a new connection: [Errno 110] Connection timed out'))

Even though I have
export SINGULARITYENV_TEMPLATEFLOW_HOME=/data/TemplateFlow
and
-B /data/TemplateFlow:/data/TemplateFlow

and in #1778 (comment) you can see I have that downloaded...

@dmd
Author

dmd commented Sep 17, 2019

If I give it an https_proxy, the download succeeds and the run no longer fails, even though the downloaded file is then not actually written anywhere.

My TEMPLATEFLOW_HOME is read-only, so it didn't write it there, and it didn't write it to .cache/fmriprep either (the only file it wrote there is latest).

It looks to me like you're attempting to download the file even if you already have it in TEMPLATEFLOW_HOME, and then just ignoring the result?

@dmd
Author

dmd commented Sep 17, 2019

python -c "from templateflow import api; api.get('MNI152NLin2009cAsym')"

Is it normal that most of the .nii.gz files this downloaded are 0 length?

https://gist.github.com/3faa8b9b1f9e1cdae14fcbc39988fd8f

@effigies
Member

Yes.

@dmd
Author

dmd commented Sep 17, 2019

fmriprep is still erroring out trying to write to TEMPLATEFLOW_HOME, in addition to all the errors above about trying to download things it already has.

@dmd
Author

dmd commented Sep 17, 2019

When I make TEMPLATEFLOW_HOME writeable, fmriprep downloads these files:

$ diff -b ls-lR-before-run ls-lR-after-running-for-a-while
222c222
< total 352
---
> total 11272
272c272
< -rw-rw-r-- 1 ddrucker ddrucker     0 Sep 17 11:22 tpl-MNI152NLin6Asym_res-01_desc-brain_mask.nii.gz
---
> -rw-rw-r-- 1 ddrucker ddrucker   149930 Sep 17 11:45 tpl-MNI152NLin6Asym_res-01_desc-brain_mask.nii.gz
275c275
< -rw-rw-r-- 1 ddrucker ddrucker     0 Sep 17 11:22 tpl-MNI152NLin6Asym_res-01_T1w.nii.gz
---
> -rw-rw-r-- 1 ddrucker ddrucker 11027773 Sep 17 11:45 tpl-MNI152NLin6Asym_res-01_T1w.nii.gz
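
The diff above suggests that zero-length files act as placeholders that are filled in on first use. Assuming that reading is right, a sketch like the following lists the placeholders that are still empty, which on a node without network access hints at what fMRIPrep may still try to download. The function name `empty_template_files` is hypothetical.

```python
from pathlib import Path


def empty_template_files(templateflow_home):
    """Return sorted relative paths of zero-length .nii.gz files."""
    root = Path(templateflow_home)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*.nii.gz")
        if p.is_file() and p.stat().st_size == 0
    )
```

Running it once before and once after a trial run gives the same information as the `ls -lR` diff, restricted to the image files.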

@effigies
Member

Okay. So apologies for this going on so long. I think it got a bit confused (or possibly I'm just getting confused reading this). I was pretty busy yesterday, so I wasn't able to follow in real-time. If I can summarize the issue:

fMRIPrep checks two directories in /home/fmriprep/.cache.

  • /home/fmriprep/.cache/templateflow is a pre-populated directory in the image that you only need write access to if you want to use a non-standard directory.
  • /home/fmriprep/.cache/fmriprep is a place to cache version checks to alert users that a newer version is available. This needs to be writable. In the future, we should find a better approach that doesn't require this.

We can treat these independently.

Templateflow

If all you need access to is the usual templates, exporting SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/.cache/templateflow will ensure that fMRIPrep can find these, assuming you don't bind other directories over /home, /home/fmriprep or /home/fmriprep/.cache. I think this was the initial problem, where you were mounting /home/fmriprep, and losing your pre-cached templates.

If you need additional templates, then you can download them into a separate directory; make sure it's bound and that the TEMPLATEFLOW_HOME environment variable points to it. For consistency, it's probably easiest to use /home/fmriprep/.cache/templateflow.

If you lack network access, pre-populating with all templates is probably simplest.

Version check

Just binding a writable directory into /home/fmriprep/.cache/fmriprep is probably sufficient. Hopefully we handle reset or timed out network connections properly...
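
As a sketch of the version-check workaround, the helper below creates a host directory, checks that it is writable, and returns the matching `-B` arguments. The helper name `version_cache_bind` is hypothetical and the host path is whatever the caller chooses; only the container path comes from this thread.

```python
import os
from pathlib import Path

# Directory fMRIPrep uses inside the container to cache the version check.
CONTAINER_CACHE = "/home/fmriprep/.cache/fmriprep"


def version_cache_bind(host_dir):
    """Create host_dir if needed and return the matching -B arguments."""
    path = Path(host_dir).expanduser()
    path.mkdir(parents=True, exist_ok=True)
    if not os.access(path, os.W_OK):
        raise PermissionError(f"{path} is not writable")
    return ["-B", f"{path}:{CONTAINER_CACHE}"]
```

The returned list can be spliced into the `singularity run` argument list ahead of the image path.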


Your last comment arrived while I was typing this, but I'll look at it separately so as not to spend forever here...

@dmd
Author

dmd commented Sep 17, 2019

My latest command line is:

export SINGULARITYENV_https_proxy=http://micc:8899
export SINGULARITYENV_TEMPLATEFLOW_HOME=/data/TemplateFlow
/cm/shared/singularity/bin/singularity run \
    -B /data:/data -B /data1:/data1 -B /data2:/data2 -B /data3:/data3 -B /cm/shared:/cm/shared \
    -B /data/TemplateFlow:/data/TemplateFlow \
    -B $HOME/.cache:/home/fmriprep/.cache \
    --cleanenv \
    /cm/shared/singularity/images/fmriprep-1.5.0.simg \
    /data/ddrucker/testing \
    /data/ddrucker/testing/derivatives \
    participant \
    --fs-license-file /cm/shared/freesurfer-6.0.1/license.txt \
    --participant_label acj \
    --output-spaces "MNI152NLin2009cAsym:res-2 anat func fsaverage" \
    --n_cpus 4 \
    --mem-mb 8192 \
    --notrack \
    --use-plugin /data/ddrucker/workaround.yml \
    --use-aroma --ignore-aroma-denoising-errors \
    --use-syn-sdc \
    -w  /data/ddrucker/fmriprep-working

@dmd
Author

dmd commented Sep 17, 2019

Running now with

export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/fmriprep/.cache/templateflow

and

    -B /home/ddrucker/.cache/fmriprep:/home/fmriprep/.cache/fmriprep \

and it seems to be progressing without trying to download anything.

Oddly, it hasn't actually written anything to .cache/fmriprep -- I thought it was going to write a version file there?

@effigies
Member

It wouldn't if internet access failed.

Sorry, in meetings again, so I'll be slow...

@dmd
Author

dmd commented Sep 17, 2019

Ah, that makes sense, as I no longer have https_proxy defined. I'll update this thread when my job completes. I'm still concerned that the api.get commands you posted were not actually sufficient to populate if I did want to use a custom TEMPLATEFLOW_HOME -- it was still downloading new files into it.

@effigies
Member

I'm still concerned that the api.get commands you posted were not actually sufficient to populate if I did want to use a custom TEMPLATEFLOW_HOME -- it was still downloading new files into it.

Right, Oscar forgot to add

python -c "from templateflow import api; api.get('MNI152NLin6Asym')"

This I believe is only used for ICA-AROMA.
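
Putting the template names from this thread together, the list to pre-download can be made to depend on whether ICA-AROMA is enabled. This is a sketch based only on what is stated here: the function name is hypothetical, and the claim that MNI152NLin6Asym is needed only for ICA-AROMA is the belief expressed above, not something verified against the fMRIPrep source.

```python
# Templates named in the pre-download instructions earlier in this thread.
BASE_TEMPLATES = ["MNI152NLin2009cAsym", "OASIS30ANTs"]
# Believed (per the comment above) to be needed only with --use-aroma.
AROMA_TEMPLATES = ["MNI152NLin6Asym"]


def templates_to_prefetch(use_aroma=False):
    """Return the template names to pre-download for an offline run."""
    names = list(BASE_TEMPLATES)
    if use_aroma:
        names += AROMA_TEMPLATES
    return names
```

Each returned name can then be passed to `templateflow.api.get` exactly as in the one-liner above.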

@dmd
Author

dmd commented Sep 17, 2019

And in fact we're using ICA-AROMA, so that explains that. In that case, I think we're done here :) Thank you so much!

@effigies
Member

Excellent. Thanks for your patience!

dmd added a commit to dmd/mictools that referenced this issue Sep 17, 2019
@emdupre
Collaborator

emdupre commented Sep 18, 2019

Can we add this to the documentation, @effigies ? I've seen other folks with similar issues. What do you think ?

effigies added the faq label Sep 18, 2019
@effigies
Member

Yes, this should absolutely go into the FAQ.

@oesteban
Member

oesteban commented Oct 2, 2019

Hi @dmd could you check whether the --home argument to singularity works for you?

The idea is to map your home directory into /home/fmriprep:

export SINGULARITYENV_https_proxy=http://micc:8899
/cm/shared/singularity/bin/singularity run \
    --home /home/fmriprep/ \
    -B $HOME:/home/fmriprep -B /data:/data \
    --cleanenv \
    /cm/shared/singularity/images/fmriprep-1.5.0.simg \
    /data/ddrucker/testing \
    /data/ddrucker/testing/derivatives \
    participant \
    --fs-license-file /cm/shared/freesurfer-6.0.1/license.txt \
    --participant_label acj \
    --output-spaces "MNI152NLin2009cAsym:res-2 anat func fsaverage" \
    --n_cpus 4 \
    --mem-mb 8192 \
    --notrack \
    --use-plugin /data/ddrucker/workaround.yml \
    --use-aroma --ignore-aroma-denoising-errors \
    --use-syn-sdc \
    -w  /data/ddrucker/fmriprep-working

If that works, the export SINGULARITYENV_TEMPLATEFLOW_HOME should not be necessary anymore, the /home/fmriprep/.cache/fmriprep directory should be writeable, and everything else should be fine.

@dmd
Author

dmd commented Oct 3, 2019

I actually think I prefer setting SINGULARITYENV_TEMPLATEFLOW_HOME, simply on the general principle of least privilege: giving the container access to just ~/.cache/fmriprep seems better than giving it access to my entire $HOME.

@oesteban
Member

oesteban commented Oct 3, 2019 via email

@dmd
Author

dmd commented Oct 3, 2019

Yes, it does work. But we'll stick with least-privilege. Thanks!

@oesteban
Member

oesteban commented Oct 3, 2019

Closing this thread as this was just added to the FAQ in the latest commit.

@hillhillll

Hi,
I have a similar connection error, and I don't know how to set the SINGULARITYENV_http_proxy variable. How can I get the proxy IP address and port?
