Add pathogen-embed to the base image #221
@@ -312,6 +312,7 @@ RUN if [[ "$TARGETPLATFORM" == linux/arm64 ]]; then \
 ; \
 fi

+RUN pip3 install pathogen-embed==2.0.0
(non-blocking)
Noting from the run summary that this added 8.5 minutes to the build time:
[linux/arm64] RUN pip3 install pathogen-embed==2.0.0 | 496.7s (61.9%) ████████████████████
[linux/amd64] RUN pip3 install pathogen-embed==2.0.0 | 17.0s (2.1%) ▋
I'm not worried: because this dependency is pinned, the cached layer will be used most of the time, so the increased build time should only be noticeable during (infrequent) cache misses.
This one change also adds ~57 MB to the image file size... 😕
I expected an increase, since we need to install scikit-learn, HDBSCAN, and UMAP, but it is kind of a bummer.
Even though the CI passed, this PR doesn't currently work as expected. When I download the image for this branch and run a Nextstrain shell like so, the image is missing the executables for
However, the Python package is installed, since I can load the Python REPL from the Nextstrain shell and import it:
I forgot this step: Lines 433 to 452 in 3f15ce3
Fixing in the next commit.
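The symptom described above (package importable, command-line scripts absent) can be caught with a small PATH check run inside the image. This is only a sketch: `check_commands` is a hypothetical helper, not part of the image or the Nextstrain CLI, and the names passed to it are whatever scripts the package is expected to provide.

```shell
# Hypothetical helper: report which of the given command names resolve on PATH.
# Prints one "ok <name>" or "missing <name>" line per argument and returns
# non-zero if at least one command is missing.
check_commands() {
    rc=0
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "ok $name"
        else
            echo "missing $name"
            rc=1
        fi
    done
    return "$rc"
}
```

Called with the expected script names from inside the Nextstrain shell, a non-zero exit status would flag the missing copy step even when the Python import succeeds.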
Adds paths for the three command-line scripts provided by the pathogen-embed Python package, which are the primary interface to that package.
I confirmed that copying the scripts in the last commit properly provided each of the pathogen-embed tools. I additionally tested that these tools worked in this image with pathogen-embed's cram tests (from the pathogen-embed repo) like so:
$ nextstrain shell --docker --image docker.io/nextstrain/base:branch-add-pathogen-embed .
~/build $ python -m pip install cram
~/build $ ~/.local/bin/cram --shell=/bin/bash tests
.............................
# Ran 29 tests, 0 skipped, 0 failed.
~/build $
I'll plan to merge this PR and the sibling conda-base PR on Monday.
Description of proposed changes
Adds pathogen-embed to the base image so we can use its tools to analyze reassortment in our influenza builds.
Related issue(s)
nextstrain/conda-base#71
Checklist