Supplemental development process for Docker #1319
Conversation
So, similar to the site-community change to the workflow, isn't this what you'd use docker compose for? Seems this is reproducing much of what that gives us already -- defined volumes, ports, image building.

Yes, I have noted the choice in several places: either you have docker-compose as a dependency or you write a rather long

Aye. We're already using docker compose in other repos, so it seems that should be safe to include as a dependency here too. A homebrew approach is almost certainly bound to cause more problems than introducing docker compose would, and docker compose is way more approachable. I still don't think we want to go as far as to worry about docker in CI, as we aren't using that pattern in any of our repos.

I disagree. This repo is much more public-facing.

Anyways, this works, and since you're not using Docker in your development environment and this is just a supplemental change, I would like to conclude it without adding docker-compose. I'm totally open to adding it later 👍 But perhaps we can wait and see if there's a CI integration coming up as well 🔮

Regardless of who is using it, I don't think asking for docker-compose is that high of a barrier. I would say it's a much lower barrier, as I can work with docker compose, but would not feel comfortable mucking around with a homebrew replacement for docker compose. I'm sure other contributors will feel the same.
```make
docker-copy-assets:
	docker cp "$(shell cat .container_id):/project/sphinx_rtd_theme" .
	docker cp "$(shell cat .container_id):/project/package-lock.json" .
```
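For reference, the docker-compose setup suggested in this thread could look roughly like this. This is a minimal sketch, not the file from the PR; the service name `sphinx_rtd_theme` is inferred from commands quoted later in the thread, and the port is an assumption:

```yaml
# docker-compose.yml (illustrative sketch, not the actual PR file)
version: "3"
services:
  sphinx_rtd_theme:
    build: .            # image building, replacing the homebrew Makefile logic
    volumes:
      - .:/project      # bind-mount the repo into the container
    ports:
      - "1919:1919"     # assumed dev-server port; adjust to the real one
```

This covers the defined volumes, ports, and image building that the Makefile approach reimplements by hand.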
As discussed elsewhere already, but just noting here: this homebrew configuration seems to overlap with what docker-compose provides. I'd put some preference on using standard tooling, as it's more approachable.
@humitos would have the most input/opinions here
I don't know anything about homebrew, it's a macOS thing right?
The idea of `docker cp` is to avoid a write operation on a bind mount, because file system permissions are usually troublesome, so it can be better to be explicit about what you expect to get out of the Docker runtime.
I'm already fixing this on your request @agjohnson
This is now removed; the docker-compose command mounts the current git repo and builds inside it.
> I don't know anything about homebrew, it's a macOS thing right?

I meant this in the sense that this is a homebrewed process -- it is a manual, one-off creation, as opposed to a standardized, repeatable process using docker-compose.
I'm not sure I understand what this step is for. Why are we copying these files from inside the container into the host?
I had to backtrack again from the approach where a host directory was mounted and written to by the container.
The reason was that I had issues with a mix of files owned by the user and files owned by root. I also needed to run `npm cache clear`, and I never understood why.
But the fundamental reason was that I was mounting host system stuff into the container and writing irrelevant files there.
The best approach IMO is to keep everything containerized and copy out the artifacts directly from their known location. If you can separate out a bind mount and ensure that ONLY intended artifacts are written there, then that's also a fine approach.
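The "separate bind mount for artifacts only" alternative mentioned above could be sketched like this in docker-compose terms. The service name and the `./dist` directory are illustrative assumptions, not paths from the PR:

```yaml
# Sketch: bind-mount only a dedicated output directory, so the
# container can write build artifacts to one known host location
# while the rest of the filesystem stays containerized.
services:
  sphinx_rtd_theme:
    build: .
    volumes:
      - ./dist:/project/dist   # only intended artifacts land on the host
```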
Gotcha! Yeah, creating files inside the container that are then required outside the container usually generates permission problems. The only way to do that correctly is mapping the UID/GID between the host and the container, and that is where things get more complicated. We've been there and I didn't enjoy it 😄
I like the pattern you used here with `docker cp` for this requirement, where files generated inside the container are required to be distributed/used outside the container.
We currently have this problem when creating migrations on Read the Docs, for example, where the resulting `.py` files are owned by root and left in a directory owned by your user on the host. Then you have to run `sudo chown` to fix the permissions on those files.
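The UID/GID mapping described here can be approximated by running the container process as the host user. A hedged sketch using Compose's `user:` key; the variables would need to be exported by the developer or a wrapper script, and the service name is assumed:

```yaml
# Sketch: run the container as the host user so files written to a
# bind mount are owned by you, not root. Requires exporting the IDs
# on the host first, e.g.:
#   export UID GID="$(id -g)"
services:
  sphinx_rtd_theme:
    build: .
    user: "${UID}:${GID}"
    volumes:
      - .:/project
```

This avoids the `sudo chown` cleanup step, at the cost of the extra host-side setup that makes things "more complicated", as noted above.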
This addition is high priority for working through the remaining PRs for 1.1 to be fast-tracked. Since it's an optional and supplemental part of the development/build process, I'll merge it. I'm quite sure that I'll be back with some adjustments, since this isn't really perfect. @humitos as you can see from the current setup, there's a bit of simplification work left in the separation logic between docker image, docker runtime and docker host. I'll happily introduce this on a call :)
I think this is great. I just asked a bunch of questions to understand the workflow a little more.
```dockerfile
# This implicitly includes Python 3.10
FROM node:14-alpine
```
I'd prefer to use Ubuntu here if possible, mainly to keep consistency with the rest of the repositories we have. Also, because I've found various weird compilation issues when working with Alpine, and the space savings are not really huge.
I think we're on the same page here and Ubuntu images would be the best way to go.
```dockerfile
# Copy in files that we need to run the project. These files
# will not be mounted into the runtime from external environment
# so we copy them during build.
COPY webpack.common.js webpack.dev.js webpack.prod.js /project/
```
Is there a reason why we don't mount them via docker compose instead of copying them here?
You can mount stuff at runtime. This is build time, so we need to copy into the image. The goal is to have node packages built together with the image. However, if I copy the ENTIRE repo, then Docker's cache will be invalidated every time I build the image, so I would have to build it over and over again. So I copy the minimal set of files.
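The cache argument here is the standard Docker layer-caching pattern: copy only the dependency manifests first, install, then copy the frequently-changing files. A sketch, assuming the same base image and file names discussed in this PR:

```dockerfile
FROM node:14-alpine

WORKDIR /project

# Copy only the dependency manifests first: this layer, and the
# npm install layer below, stay cached until these files change.
COPY package.json package-lock.json /project/

RUN npm install

# Copying frequently-edited sources afterwards does not invalidate
# the cached npm install layer above.
COPY webpack.common.js webpack.dev.js webpack.prod.js /project/
```

Copying the entire repo in one early `COPY . /project/` would instead invalidate the `npm install` layer on every source edit.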
I'm still not sure I follow the need for avoiding mounts. Could you clarify what issue you are trying to solve? There is probably prior art to reuse in our application Docker configuration, whether that is permissions issues or avoiding host filesystem artifacts.
I would look at what we do in the application docker processes, as we don't seem to have the issues that this docker config is trying to solve. I would put a strong preference on reusing patterns that are already working for us.
> Could you clarify what issue you are trying to solve?

The issue is to create a reusable and containerized environment for development. As development currently also has responsibility for building the production assets, that's also solved, but I understand that we want to remove these artifacts from Git as soon as possible.
I'd like to propose that the development container specification can be reused for CI. I can open an issue for that.

> I would put strong preference on reusing patterns that are working for us already.

Or perhaps developing and improving those patterns? :)
The reason I put docker-compose with a bind mount was to echo what's in readthedocs.org, although I'd always recommend keeping the mounted set of files to an absolute minimum, especially so that files from the container aren't dumped on the host.
> Issue is to create a reusable and containerized environment for development.

This seems like the overall goal of your change, but I'm more curious what issue you are hitting that requires this.
The main concern here is that we're making a complex solution for a problem we don't really have currently. The initial goal is to give contributors easy environment creation, which this setup does. But absolute isolation isn't currently a strong roadmap concern.
A longer term discussion point should be describing why isolation of build files is necessary for us. There are general arguments for this, but I'm not sure they are enough to bump this work up to an immediate team priority.

> I'd like to propose that the development container specification can be reused for CI

What issue are you hitting that requires this change? The reasons to stick with the current pattern are that it already provides isolation and deterministic builds, and there is more prior art and information on using CircleCI without custom Docker images.
So switching feels a bit like sideways progress, but might also lead to us spending more time scratching our heads.

> Or perhaps developing and improving those patterns?

It depends on the problems you are trying to solve, what design decision we come to, and where we place the work on our roadmap. The place to start changes like this is with discussion first, and I would suggest opening an issue on meta. From there, we'd find a place for the work on our roadmap if we all feel it's a priority.
Whatever we decide, sticking to a singular pattern is the main concern. One-off processes will add confusion and friction to maintenance.
> You can mount stuff at runtime. This is build time, so we need to copy into the image

I don't think there is a need to copy files into the container that are not used at build time. If these files are used at run time, they should be mounted at run time. Otherwise, modifying any of these files would require rebuilding the Docker image, which isn't necessary if they are mounted.
Actually, that's why I commented on this particular line and not the previous one (https://github.com/readthedocs/sphinx_rtd_theme/pull/1319/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R26-R28) where, to me, it's a justified use of `COPY` and makes total sense there.
The files are used at build time; they are required for `package.json` and `npm install` to work. We could perhaps mock them.
Hrm... in that case, I don't follow it. These files are copied after running `npm install` in the Dockerfile: https://github.com/readthedocs/sphinx_rtd_theme/pull/1319/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R36-R38
Also, the comment right above copying these files says they are required to run the project.
Following these two things, I'd think they are not required at build time.
I'm giving some pretty bad off-hand answers right now. I'll have to go over this again and give a proper answer. As I said, the idea is to avoid having anything mounted into the runtime environment to avoid making a mess in the host system's files.
I think this is easier to go through over a meeting than async.
```make
rm -f .container_id
docker-compose run -d sphinx_rtd_theme build > .container_id
```
nit: there is no need to do `rm -f` since you are redirecting the output with `>` in the next command. If the file exists, it will be overwritten with the new output.
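This nit can be verified without Docker: in a POSIX shell, `>` opens the target with truncation, so any previous contents are discarded. The file path below is just for demonstration:

```shell
# '>' truncates the target file before writing, so a prior
# 'rm -f' of the same file is redundant.
echo "stale container id" > /tmp/cid_demo
echo "new-container-id" > /tmp/cid_demo   # overwrites without rm
cat /tmp/cid_demo                          # prints: new-container-id
```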
haha yes, will remove it... 'twas because `docker run` could write the ID to a file - now I need to detach in order to get the ID, because `docker-compose run` doesn't have the same option that `docker run` has.
```console
# Runs the development webserver
$ docker-compose run sphinx_rtd_theme dev
```
Wouldn't it be better to use `docker-compose up` here, which is the standard, and call whatever is required behind the scenes?
`docker-compose run` is doing what `docker run` does but reading the docker-compose configuration - I start the service that I want and give an argument `dev` to the entrypoint. `docker-compose up` is a different type of command.
```console
# If you want to copy stuff out of the Docker environment, run this make
# target or read the actual Makefile to see what is going on.
# We suggest running this command every time that you want to quickly build
# new CSS/JS assets
$ make docker-build-all
```
We are using `invoke` to handle this type of extra command in other repositories. I'm not opposed to using a Makefile, tho, just saying 😄
I'd also prefer to have a singular pattern to use, as much as I like make.
Containers provide a quick way to get started. `make` exists on most systems and doesn't sit on top of an additional stack that's customized by the project.
With `invoke`, we'd have to first bootstrap a Python environment with the correct version of `invoke` - or, as a new contributor, understand what `invoke` even is. I think there's a better chance that people can copy-paste `make` commands without worrying about setting up an environment for `make`.
If the feeling is "let's not have Make", then I would rather have a shell script for running the `docker-build-all` target, since that's the only command that chains a few things together and should be scripted.
I'd still agree that we should stick with a single pattern and use Invoke here. Both Python and Invoke are a normal part of our development workflow and already in use in other repositories, so they are reasonable requirements for contributors. The goal is not to make this repository as universally accessible as possible; it's to make a repeatable environment that is easy for the core team to maintain.
Updated the approach from #1312: `npm run dev` by proxying commands to `make docker-run command=<npm-command>`