This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

Brotli compression #7

Closed
wants to merge 3 commits into from

trevyn

@trevyn trevyn commented Mar 8, 2021

Draft PR: Only the compression side of Brotli so far; it generates a file of roughly the correct size. I'll attack the decompression side later, after which we will see if the compression side actually works. ;)

@trevyn
Author

trevyn commented Mar 8, 2021

Ok, this actually works round-trip for me when rebuilding the stub by hand, but the stubs npm script fails with

go build github.com/google/brotli/go/cbrotli: build constraints exclude all Go files in /Users/eden/go/pkg/mod/github.com/google/brotli/go/cbrotli@v0.0.0-20210127140805-63be8a994019

I think this is because the Brotli library is actually in C, and cgo is disabled by default when cross-compiling, so every Go file in the cbrotli package gets excluded. I'll take a closer look tomorrow.
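A minimal sketch of what a cgo cross-build would need, assuming a Debian/Ubuntu host with the `gcc-aarch64-linux-gnu` or `gcc-arm-linux-gnueabihf` cross compilers installed. The helper name `cgo_cross_env` is hypothetical, and the Brotli C libraries for the target would most likely also have to be available to the cross linker, which this sketch does not handle.

```shell
# Print the environment a cgo cross-build needs for a given GOOS/GOARCH.
# The compiler names are the Debian/Ubuntu cross-toolchain names (an assumption
# about the build host, not something taken from this PR).
cgo_cross_env() {
  case "$1" in
    linux/arm64) echo "CGO_ENABLED=1 CC=aarch64-linux-gnu-gcc GOOS=linux GOARCH=arm64" ;;
    linux/arm)   echo "CGO_ENABLED=1 CC=arm-linux-gnueabihf-gcc GOOS=linux GOARCH=arm GOARM=7" ;;
    *) echo "no cross compiler listed for $1" >&2; return 1 ;;
  esac
}

# Usage (not run here):
#   go env CGO_ENABLED                          # native build: 1 when a C compiler is found
#   GOOS=linux GOARCH=arm64 go env CGO_ENABLED  # cross build: 0 unless set explicitly
#   env $(cgo_cross_env linux/arm64) go build ./...
```

Setting `CGO_ENABLED=1` explicitly is what re-includes the files that the default cross-compilation settings exclude.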

@leafac
Owner

leafac commented Mar 8, 2021

I appreciate the work so far 😃

Keep going…

@trevyn
Author

trevyn commented Mar 9, 2021

@leafac I can think of two main options here:

  • https://github.com/andybalholm/brotli is a translation of the reference Brotli implementation into pure Go, which should cross-compile without a problem. A disadvantage is that it is not the official implementation. Decompression speed would also need to be verified.
  • We could set up a system to use GitHub Actions to compile the stubs. Having 3x 3MB binaries checked into the git repo seems a little awkward to me, so this could also be a larger effort that would have the packager download the stubs from GitHub Releases or similar. This also has the advantage of providing a verified supply chain for the binaries, which is probably a good idea anyway.

Do you have a preference?

@leafac leafac mentioned this pull request Mar 16, 2021
@leafac
Owner

leafac commented Mar 21, 2021

> @leafac I can think of two main options here:

Thanks for the investigation and for presenting the ideas so clearly 👍

I really like where this is going…

>   • https://github.com/andybalholm/brotli is a translation of the reference Brotli implementation into pure Go, which should cross-compile without a problem. A disadvantage is that it is not the official implementation. Decompression speed would also need to be verified.
>   • We could set up a system to use GitHub Actions to compile the stubs. Having 3x 3MB binaries checked into the git repo seems a little awkward to me, so this could also be a larger effort that would have the packager download the stubs from GitHub Releases or similar. This also has the advantage of providing a verified supply chain for the binaries, which is probably a good idea anyway.
>
> Do you have a preference?

In principle I’d go with the reference implementation (cbrotli), because it seems to be the more principled approach. At this point you probably also know my feelings about cross-compilation: Even if we were able to cross-compile, we’d still have to test our stuff on the target platform, so we might as well just compile there. (Ironically, I built the current stubs by cross-compiling, as you can tell from the stubs npm script in package.json. But I did my due diligence and tested the stubs myself on all platforms, and ran the tests on GitHub Actions on them as well.)

Putting binaries in the Git repository is how caxa currently works. It’s simple and effective, and I have nothing against it. Having the stubs in GitHub Releases and installing them in a postinstall npm script would also be fine, though it’s a bit more work.

One thing to keep in mind is that I don’t think building the stubs using GitHub Actions would be enough moving forward, as there’s a demand for ARM support (Raspberry Pi, specifically) and as far as I can tell GitHub Actions doesn’t support that.

My first idea here is to get a Raspberry Pi, build the stub myself, test everything, and then distribute the stub.

But I also have another idea that I want to run by you: We could have a postinstall npm script that would download Go for your architecture and compile the stub for you. At this point, caxa becomes a weird kind of native module itself, I guess. Maybe we could provide pre-built stubs for the most popular platforms to speed things up for most people.

What do you think about all this?

@maxb2
Contributor

maxb2 commented Mar 21, 2021

> One thing to keep in mind is that I don’t think building the stubs using GitHub Actions would be enough moving forward, as there’s a demand for ARM support (Raspberry Pi, specifically) and as far as I can tell GitHub Actions doesn’t support that.

It is possible to emulate ARM on GitHub Actions using QEMU and docker buildx. Example workflow and Dockerfile to compile for armv7 and arm64.

@leafac
Owner

leafac commented Mar 21, 2021

Oh, that’s interesting. I was thinking about using QEMU for development until I got myself a Pi, but for some reason I never thought about doing it on GitHub Actions.

@leafac
Owner

leafac commented Mar 21, 2021

Do you strictly need Docker for that, or could you run QEMU directly on Ubuntu as provided by GitHub Actions? Isn’t it weird to run an emulator inside an already containerized environment?

@maxb2
Contributor

maxb2 commented Mar 21, 2021

> Do you strictly need Docker for that

I don't think so, but it's really easy to set up this way.

We use docker because it makes it really easy to pull build tools for a particular architecture. For example, there are node docker images specifically made for armv7. You can use docker buildx build --platform linux/arm/v7 . to automatically use the arm images. You can even use docker to set up QEMU for buildx: docker run --privileged --rm tonistiigi/binfmt --install arm64,arm.

The basic idea is to create a Dockerfile that compiles whatever you want, use qemu+docker buildx to build the image, then run the image as a container and copy the files you want.

In summary, with this example Dockerfile on your local machine:

FROM ubuntu

RUN your-compile-command outfile

You can run:

docker run --privileged --rm tonistiigi/binfmt --install arm     # Set up emulation for buildx
docker buildx build --platform linux/arm/v7 -t compiler-image .     # Compile your file in the image
APP=$(docker run --rm -it -d compiler-image)     # Launch container in the background and keep its ID
docker cp "$APP":/path/to/outfile ./     # Copy file to host
docker kill "$APP"     # Kill container (--rm then removes it)

Which will give you outfile in your current directory.

@maxb2 maxb2 mentioned this pull request Mar 22, 2021
4 tasks
@leafac
Owner

leafac commented Apr 3, 2021

@maxb2 Thanks for the information. I played a little with Docker & ARM and I must admit it was a nice experience. I got more things working more quickly than when I tried to set up QEMU. In fact, all I did was:

$ docker run -it arm64v8/ubuntu /bin/bash

And I was able to download both Node.js & Go for ARMv8 and they ran just fine 🙌

Next I think I should learn more about buildx…

@leafac
Owner

leafac commented Apr 3, 2021

Okay, I think I learned all I need to know for now:

Regarding my own question: “Isn’t it weird to run an emulator inside an already containerized environment?”: That’s already how Docker works: https://www.docker.com/blog/multi-arch-images/ (see § How does it work?) In fact, as I mentioned above, running a Docker image for ARM was a breeze, while setting up QEMU by hand was a pain, so I can think of Docker as an easy-to-use emulator.

Regarding buildx and the like: That’s for when you want to build Docker images. We in caxa aren’t in the business of building Docker images; we’re in the business of producing an npm package. In fact, not having to deal with Docker is a motivation to package a project using caxa in the first place. It’s amusingly ironic that Docker may help us get there. Same with Go, by the way: caxa exists so that I can have the Go deployment experience while continuing to write JavaScript. Ironically, building the stub in Go was how we got there…

In conclusion: If we introduce CGO dependencies and lose the ability to cross-compile stubs, we can still use something like the arm64v8/ubuntu Docker image to build & test stuff (probably even on GitHub Actions?). We don’t need to deal with buildx for that.
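The idea of building and testing inside an ARM image can be sketched as a small wrapper. This is only an illustration: the helper name `in_arm64` and the `DRY_RUN` switch are hypothetical, it assumes the host can already run ARM64 containers (Docker Desktop, or QEMU binfmt handlers on Linux), and a cgo build of cbrotli would probably also need the Brotli development libraries installed in the image.

```shell
# Run a command inside the arm64v8/golang image with the current directory
# mounted at /src. With DRY_RUN=1 the docker command is printed instead of run.
in_arm64() {
  set -- docker run --rm -v "$PWD":/src -w /src arm64v8/golang "$@"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage (not run here):
#   in_arm64 go build -o stub-arm64 stub.go   # native ARM64 build, cgo allowed
#   in_arm64 go test ./...                    # run the tests on the same platform
```

Because the compiler runs inside the target architecture, this is a native build from Go's point of view, so the cross-compilation restriction on cgo does not apply.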

Also, we may continue to distribute the stubs in pre-compiled form: It’s the simplest thing that can work and there are only seven platforms we’d have to support:

  • Windows 32 bits.
  • Windows 64 bits.
  • macOS Intel.
  • macOS ARM.
  • Linux 64 bits.
  • Linux ARMv7.
  • Linux ARMv8.

This list is the intersection of the platforms advertised in Node’s and Go’s download pages, so it should cover most use cases.
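The platforms listed above map onto Go targets roughly as follows. This sketch only prints the build commands rather than running them, and it assumes a pure-Go stub (cgo disabled), that "32 bits" means `386` and "64 bits" means `amd64`, and a hypothetical `stubs/stub--<os>--<arch>` naming scheme that may differ from caxa's actual layout.

```shell
# One line per platform: GOOS GOARCH GOARM ("-" means GOARM does not apply).
stub_targets() {
  cat <<'EOF'
windows 386 -
windows amd64 -
darwin amd64 -
darwin arm64 -
linux amd64 -
linux arm 7
linux arm64 -
EOF
}

# Print (do not run) one cross-build command per target.
print_stub_builds() {
  stub_targets | while read -r goos goarch goarm; do
    env="CGO_ENABLED=0 GOOS=$goos GOARCH=$goarch"
    if [ "$goarm" != "-" ]; then
      env="$env GOARM=$goarm"
    fi
    echo "$env go build -o stubs/stub--$goos--$goarch stub.go"
  done
}

print_stub_builds
```

With a cgo dependency, the same table could instead drive one native build per platform, since `CGO_ENABLED=0` would exclude the C bindings.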

@leafac leafac closed this Apr 3, 2021
@leafac leafac reopened this Apr 3, 2021
@maxb2
Contributor

maxb2 commented Apr 4, 2021

> Regarding buildx and the like: That’s for when you want to build Docker images.

I was unaware that the "u/arch" repos had such good coverage of useful tools. I was using buildx as a hack to access the other architectures in multi-arch manifests of official project images (arm64v8/node vs. node with buildx). So long as the u/arch repos are well maintained, that is a much better solution!

On my Ubuntu machine I tried running arm64v8/golang and it doesn't work. It's still trying to pull the host arch from the manifest.

$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes # Setup qemu
$ docker pull arm64v8/golang
Using default tag: latest
latest: Pulling from arm64v8/golang
no matching manifest for linux/amd64 in the manifest list entries

I think the built-in emulation is only available in Docker Desktop. @leafac what is your setup?

@leafac
Owner

leafac commented Apr 4, 2021

In the experiments above I was running Docker Desktop in macOS.

In any case, I’m happy with this setup. If you want to work on caxa for Linux/ARM you should:

  • Ideally, get yourself a Linux/ARM setup, like a Raspberry Pi.
  • Failing that, if you’re on Windows/macOS, use Docker Desktop.
  • Failing that, use Docker buildx.

This list, of course, is ordered by complexity.

We can build the stubs and test everything using these rigs. If GitHub Actions works with the ARM Docker images, we may add them to the test matrix. If this doesn’t work, we run the tests manually using the rigs mentioned above. We continue to distribute the stubs pre-compiled, committed directly to the Git repository and included in the npm package.

As much as I appreciate your work on #11 (I really do! 😃), I think that the solution we arrived at together here is simpler and easier to maintain in the long run. Do you see any issues I may be missing?

@maxb2
Contributor

maxb2 commented Apr 4, 2021

It is possible to use the u/arch images using standard docker if you enable experimental features in the docker daemon and use the --platform option. This eliminates the need for buildx. I've updated #11 to reflect this and it should be much more straightforward now.

As far as maintaining the stubs, I think it would be better to use a GitHub workflow as it transparently shows the compilation process. As an extreme example, someone could open a PR with changes to stub.go and bad binaries. There isn't an easy way to verify that the binary is actually from the source file. There is a potential for bad actors or just headaches for maintainers debugging a bad binary. Having a transparent compilation process would increase trust in the binaries. In addition, compiling on actual hardware creates a barrier for those who want to contribute. Using a workflow to compile for all platforms allows developers to quickly test their changes rather than manually synchronizing and testing their changes on all the different machines required (we can very easily add a smoke test in the workflow).
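One way a workflow could check that a committed binary matches its source is to rebuild it and compare digests. The sketch below uses hypothetical helper names (`sha256_of`, `verify_stub`) and rests on an assumption: the comparison is only meaningful if the build is reproducible, which for Go generally means the same toolchain version and flags (for example `-trimpath`), and is harder to guarantee once cgo is involved.

```shell
# Print the SHA-256 digest of a file, using whichever tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum < "$1" | cut -d' ' -f1
  else
    shasum -a 256 < "$1" | cut -d' ' -f1
  fi
}

# Succeed only if the rebuilt stub and the committed stub have the same digest.
verify_stub() {
  rebuilt=$(sha256_of "$1")
  committed=$(sha256_of "$2")
  [ -n "$rebuilt" ] && [ "$rebuilt" = "$committed" ]
}

# Usage (paths are placeholders):
#   verify_stub ./rebuilt/stub ./stubs/stub || echo "stub does not match its source" >&2
```

A non-zero exit status from `verify_stub` would fail the workflow step, which makes a mismatched binary visible in the PR itself.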

You would have to change how you distribute the binaries (not having them committed in the repo itself), but I think the benefits outweigh the inconvenience.

Sorry @trevyn for polluting your PR with all this docker talk 😄

@leafac
Owner

leafac commented May 2, 2021

Some information that may be useful in this investigation about other compression solutions: #14 (comment)

@leafac
Owner

leafac commented Nov 21, 2023

Hi everyone,

First of all, thank you very much for your contribution! I appreciate the time you put into using caxa, understanding the packaging architecture, and contributing to the conversation about a different compression mechanism to make caxa better.

I’ve been thinking about the broad strategy employed by caxa and concluded that there is a better way to solve the problem. It doesn’t include a self-extracting executable, so it subsumes this pull request.

It’s a different enough approach that I think it deserves a new name, and it’s part of a bigger toolset that I’m building, which I call Radically Straightforward · Package.

I’m deprecating caxa and archiving this repository. I invite you to continue the conversation in Radically Straightforward’s issues.

Best.

@leafac leafac closed this Nov 21, 2023