Supplemental development process for Docker #1319
Conversation
So, similar to the site-community change to the workflow, isn't this what you'd use docker compose for? Seems this is reproducing much of what that gives us already -- defined volumes, ports, image building.

Yes, I have noted the choice in several places: either you have docker-compose as a dependency or you write a rather long

Aye. We're already using docker compose in other repos, so it seems that should be safe to include as a dependency here too. A homebrew approach is almost certainly bound to cause more problems than introducing docker compose would, and docker compose is way more approachable. I still don't think we want to go as far as to worry about docker in CI, as we aren't using that pattern in any of our repos.

I disagree. This repo is much more public-facing.

Anyways, this works, and since you're not using Docker in your development environment and this is just a supplemental change, I would like to conclude it without adding docker-compose. I'm totally open to adding it later 👍 But perhaps we can wait and see if there's a CI integration coming up as well 🔮

Regardless of who is using it, I don't think asking for docker-compose is that high of a barrier. I would say it's a much lower barrier, as I can work with docker compose, but would not feel comfortable mucking around with a homebrew replacement for docker compose. I'm sure other contributors will feel the same.
```make
docker-copy-assets:
	docker cp "$(shell cat .container_id):/project/sphinx_rtd_theme" .
	docker cp "$(shell cat .container_id):/project/package-lock.json" .
```
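For reference, the docker-compose setup suggested in this thread could look roughly like this. This is a minimal sketch, not the file from the PR; the service name `sphinx_rtd_theme` is inferred from commands quoted later in the thread, and the port is an assumption:

```yaml
# docker-compose.yml (illustrative sketch, not the actual PR file)
version: "3"
services:
  sphinx_rtd_theme:
    build: .            # image building, replacing the homebrew Makefile logic
    volumes:
      - .:/project      # bind-mount the repo into the container
    ports:
      - "1919:1919"     # assumed dev-server port; adjust to the real one
```

This covers the defined volumes, ports, and image building that the Makefile approach reimplements by hand.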
As discussed elsewhere already, but just noting here: this homebrew configuration seems to overlap with what docker-compose provides. I'd put some preference on using standard tooling, as it's more approachable.
@humitos would have the most input/opinions here
I don't know anything about homebrew, it's a macOS thing right?
The idea of `docker cp` is to avoid a write operation on a bind mount, because file system permissions are usually troublesome, so it can be better to be explicit about what you expect to get out of the Docker runtime.
I'm already fixing this on your request @agjohnson
This is now removed; the docker-compose command mounts the current git repo and builds inside it.
> I don't know anything about homebrew, it's a macOS thing right?

I meant this in the sense that this is a homebrewed process -- it is a manual, one-off creation, as opposed to a standardized, repeatable process using docker-compose.
I'm not sure I understand what this step is for. Why are we copying these files from inside the container into the host?
I had to backtrack again from the approach where a host directory was mounted and written to by the container.
The reason was that I had issues with a mix of files owned by the user and files owned by root. I also needed to run `npm cache clear`, and I never understood why.
But the fundamental reason was that I was mounting host system stuff into the container and writing irrelevant files there.
The best approach IMO is to keep everything containerized and copy out the artifacts directly from their known location. If you can separate out a bind mount and ensure that ONLY intended artifacts are written there, then that's also a fine approach.
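The "separate bind mount for artifacts only" alternative mentioned above could be sketched like this in docker-compose terms. The service name and the `./dist` directory are illustrative assumptions, not paths from the PR:

```yaml
# Sketch: bind-mount only a dedicated output directory, so the
# container can write build artifacts to one known host location
# while the rest of the filesystem stays containerized.
services:
  sphinx_rtd_theme:
    build: .
    volumes:
      - ./dist:/project/dist   # only intended artifacts land on the host
```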
Gotcha! Yeah, creating files inside the container that are then required outside the container usually generates permission problems. The only way to do that correctly is mapping the UID/GID between the host and the container, and that is where things get more complicated. We've been there and I didn't enjoy it 😄
I like the pattern you used here with `docker cp` for this requirement, where files generated inside the container are required to be distributed/used outside the container.
We currently have this problem when creating migrations on Read the Docs, for example, where the resulting `.py` files are owned by root and left in a directory owned by your user on the host. Then you have to run `sudo chown` to fix the permissions on those files.
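The UID/GID mapping described here can be approximated by running the container process as the host user. A hedged sketch using Compose's `user:` key; the variables would need to be exported by the developer or a wrapper script, and the service name is assumed:

```yaml
# Sketch: run the container as the host user so files written to a
# bind mount are owned by you, not root. Requires exporting the IDs
# on the host first, e.g.:
#   export UID GID="$(id -g)"
services:
  sphinx_rtd_theme:
    build: .
    user: "${UID}:${GID}"
    volumes:
      - .:/project
```

This avoids the `sudo chown` cleanup step, at the cost of the extra host-side setup that makes things "more complicated", as noted above.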
This addition is high priority for working through the remaining PRs for 1.1 to be fast-tracked. Since it's an optional and supplemental part of the development/build process, I'll merge it. I'm quite sure that I'll be back with some adjustments, since this isn't really perfect. @humitos as you can see from the current setup, there's a bit of simplification work left in the separation logic between docker image, docker runtime and docker host. I'll happily introduce this on a call :)
I think this is great. I just asked a bunch of questions to understand the workflow a little more.
```dockerfile
# This implicitly includes Python 3.10
FROM node:14-alpine
```
I'd prefer to use Ubuntu here if possible, mainly to keep consistency with the rest of the repositories we have. Also, because I've found various weird compilation issues when working with Alpine, and the space savings are not really huge.
I think we're on the same page here and Ubuntu images would be the best way to go.
```dockerfile
# Copy in files that we need to run the project. These files
# will not be mounted into the runtime from external environment
# so we copy them during build.
COPY webpack.common.js webpack.dev.js webpack.prod.js /project/
```
Is there a reason why we don't mount them via docker compose instead of copying them here?
You can mount stuff at runtime. This is build time, so we need to copy into the image. The goal is to have node packages built together with the image. However, if I copy the ENTIRE repo, then Docker's cache will be invalidated every time I build the image, so I would have to build it over and over again. So I copy the minimal set of files.
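The cache argument here is the standard Docker layer-caching pattern: copy only the dependency manifests first, install, then copy the frequently-changing files. A sketch, assuming the same base image and file names discussed in this PR:

```dockerfile
FROM node:14-alpine

WORKDIR /project

# Copy only the dependency manifests first: this layer, and the
# npm install layer below, stay cached until these files change.
COPY package.json package-lock.json /project/

RUN npm install

# Copying frequently-edited sources afterwards does not invalidate
# the cached npm install layer above.
COPY webpack.common.js webpack.dev.js webpack.prod.js /project/
```

Copying the entire repo in one early `COPY . /project/` would instead invalidate the `npm install` layer on every source edit.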
I'm still not sure I follow the need for avoiding mounts. Could you clarify what issue you are trying to solve? There is probably prior art to reuse in our application Docker configuration, whether that is permissions issues or avoiding host filesystem artifacts.
I would look at what we do in the application docker processes, as we don't seem to have the issues that this docker config is trying to solve. I would put a strong preference on reusing patterns that are already working for us.
> Could you clarify what issue you are trying to solve?

The issue is to create a reusable and containerized environment for development. As development currently also has responsibility for building the production assets, that's also solved, but I understand that we want to remove these artifacts from Git as soon as possible.
I'd like to propose that the development container specification can be reused for CI. I can open an issue for that.

> I would put strong preference on reusing patterns that are working for us already.

Or perhaps developing and improving those patterns? :)
The reason I put docker-compose with a bind mount was to echo what's in readthedocs.org, although I'd always recommend keeping the mounted set of files to an absolute minimum, especially so that files from the container aren't dumped on the host.
> Issue is to create a reusable and containerized environment for development.

This seems like the overall goal of your change, but I'm more curious what issue you are hitting that requires this.
The main concern here is that we're making a complex solution for a problem we don't really have currently. The initial goal is to give contributors easy environment creation, which this setup does. But absolute isolation isn't currently a strong roadmap concern.
A longer term discussion point should be describing why isolation of build files is necessary for us. There are general arguments for this, but I'm not sure they are enough to bump this work up to an immediate team priority.

> I'd like to propose that the development container specification can be reused for CI

What issue are you hitting that requires this change? The reasons to stick with the current pattern are that it already provides isolation and deterministic builds, and there is more prior art and information on using CircleCI without custom Docker images.
So switching feels a bit like sideways progress, but might also lead to us spending more time scratching our heads.

> Or perhaps developing and improving those patterns?

It depends on the problems you are trying to solve, what design decision we come to, and where we place the work on our roadmap. The place to start changes like this is with discussion first, and I would suggest opening an issue on meta. From there, we'd find a place for the work on our roadmap if we all feel it's a priority.
Whatever we decide, sticking to a singular pattern is the main concern. One-off processes will add confusion and friction to maintenance.
> You can mount stuff at runtime. This is build time, so we need to copy into the image

I don't think there is a need to copy files into the container that are not used at build time. If these files are used at run time, they should be mounted at run time. Otherwise, modifying any of these files would require rebuilding the Docker image, which isn't necessary if they are mounted.
Actually, that's why I commented on this particular line and not the previous one (https://github.com/readthedocs/sphinx_rtd_theme/pull/1319/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R26-R28) where, to me, it's a justified use of `COPY` and makes total sense there.
The files are used at build time; they are required for `package.json` and `npm install` to work. We could perhaps mock them.
Hrm... in that case, I don't follow it. These files are copied after running `npm install` in the Dockerfile: https://github.com/readthedocs/sphinx_rtd_theme/pull/1319/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R36-R38
Also, the comment right above copying these files says they are required to run the project.
Following these two things, I'd think they are not required at build time.
I'm giving some pretty bad off-hand answers right now. I'll have to go over this again and give a proper answer. As I said, the idea is to avoid having anything mounted into the runtime environment to avoid making a mess in the host system's files.
I think this is easier to go through over a meeting than async.
```make
rm -f .container_id
docker-compose run -d sphinx_rtd_theme build > .container_id
```
nit: there is no need to do `rm -f` since you are redirecting the output with `>` in the next command. If the file exists, it will be overwritten with the new output.
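This nit can be verified without Docker: in a POSIX shell, `>` opens the target with truncation, so any previous contents are discarded. The file path below is just for demonstration:

```shell
# '>' truncates the target file before writing, so a prior
# 'rm -f' of the same file is redundant.
echo "stale container id" > /tmp/cid_demo
echo "new-container-id" > /tmp/cid_demo   # overwrites without rm
cat /tmp/cid_demo                          # prints: new-container-id
```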
haha yes, will remove it... 'twas because `docker run` could write the ID to a file - now I need to detach in order to get the ID, because `docker-compose run` doesn't have the same option that `docker run` has.
```console
# Runs the development webserver
$ docker-compose run sphinx_rtd_theme dev
```
Wouldn't it be better to use `docker-compose up` here, which is the standard, and call whatever is required behind the scenes?
`docker-compose run` is doing what `docker run` does but reading the docker-compose configuration - I start the service that I want and give an argument `dev` to the entrypoint. `docker-compose up` is a different type of command.
```console
# If you want to copy stuff out of the Docker environment, run this make
# target or read the actual Makefile to see what is going on.
# We suggest running this command every time that you want to quickly build
# new CSS/JS assets
$ make docker-build-all
```
We are using `invoke` to handle this type of extra command in other repositories. I'm not opposed to using a Makefile, tho, just saying 😄
I'd also prefer to have a singular pattern to use, as much as I like make.
Containers provide a quick way to get started. `make` exists on most systems and doesn't sit on top of an additional stack that's customized by the project.
With `invoke`, we'd have to first bootstrap a Python environment with the correct version of `invoke` - or, as a new contributor, understand what `invoke` even is. I think there's a better chance that people can copy-paste `make` commands without worrying about setting up an environment for `make`.
If the feeling is "let's not have Make", then I would rather have a shell script for running the `docker-build-all` target, since that's the only command that chains a few things together and should be scripted.
I'd still agree that we should stick with a single pattern and use Invoke here. Both Python and Invoke are a normal part of our development workflow and already in use in other repositories, so they are reasonable requirements for contributors. The goal is not to make this repository as universally accessible as possible; it's to make a repeatable environment that is easy for the core team to maintain.
Updated the approach from #1312: `npm run dev` by proxying commands to `make docker-run command=<npm-command>`