Yarn workspaces and upgrade to v2 #7446

Closed
clemsos wants to merge 27 commits

Conversation

@clemsos commented Aug 24, 2021

Description

  • yarn workspaces
  • linter conf as package
  • fix various issues preventing install -- see yarn dlx @yarnpkg/doctor
  • upgrade to yarn v2

Issues

Refs #7427

Checklist:

  • 1 PR, 1 purpose: my Pull Request applies to a single purpose
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the docs to reflect my changes if applicable
  • I have added tests (and stories for frontend components) that prove my fix is effective or that my feature works
  • I have performed a self-review of my own code
  • If my code involves visual changes, I am adding applicable screenshots to this thread

cla-bot added the cla-signed label Aug 24, 2021
@clemsos commented Aug 24, 2021

The recommendation for Yarn v2 is to check .yarn/cache into the monorepo so that install time gets down to a few seconds. In our case this is ~500MB of files, so I'm not sure yet if this is the best way to go, as it could also slow things down quite a lot (cloning, indexing, etc.). Needs assessing.

@julien51 commented

Oh wow! Maybe we can discard some of the (heaviest) dependencies we have? Any way to assess them?

@clemsos commented Aug 25, 2021

> Oh wow! Maybe we can discard some of the (heaviest) dependencies we have? Any way to assess them?

To answer this: it seems that won't be necessary after all. What will work well is relying on the Yarn cache on rebuild. Yarn 2 has a feature called "Plug'n'Play", and if we manage to activate the cache, installing packages will take only a few seconds once it's set up.

The main thing now is to create a proper multi-stage Docker image to build directly from the monorepo, with stage 1 being the current core image and other targets building the different packages and apps (using Docker BuildKit).

If we can use the Yarn cache properly, it will be much faster. We can also bundle all the Alpine/apk deps inside stage 1, so we don't have to rebuild those either. The per-service stages can then be used directly in Compose, which is one less hassle.

There is a yarn prod-install plugin that helps package things for Docker.

julien51 self-requested a review August 26, 2021
@julien51 commented

Looks good to me!

@julien51 commented Sep 1, 2021

😱 all of the conflicts :(

@clemsos commented Sep 1, 2021

Yes, that's a lot of conflicts, but since we can get rid of all the individual yarn.lock files, it is going to be fine.

I am in the middle of cleaning up the deps anyway, running some utilities to check them. What works best so far:

  • yarn workspaces foreach dlx depcheck
  • yarn dlx @yarnpkg/doctor

@clemsos commented Sep 2, 2021

OK, time to write down what I have been doing with the yarn v2 + Docker build workflow, to assess what we should do.

The monorepo approach with yarn relies on workspaces.

Cross dependencies

With workspaces, you can reference local packages from the monorepo itself. For instance, we are now loading the ESLint config by adding "@unlock-protocol/eslint-config": "workspace:*" to the package.json. Yarn fetches it directly from the local shared/eslint-config folder where it lives. This is very useful for managing cross-deps during development.
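
As a sketch, the consuming workspace's package.json then looks something like this (the @unlock-protocol/some-app name is hypothetical; only the eslint-config entry is from our repo):

{
  "name": "@unlock-protocol/some-app",
  "devDependencies": {
    "@unlock-protocol/eslint-config": "workspace:*"
  }
}

The workspace:* range tells Yarn to always resolve to the local copy in shared/eslint-config rather than fetching a published version.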

Node modules & Cache

When running yarn install, Yarn v2 creates a cache folder that stores a copy of all packages. On subsequent installs, it fetches directly from that local folder, which usually brings the install time down to a few seconds. It also generates one big yarn.lock with all package versions at the root level and creates the node_modules for each workspace. (NB: there is a more advanced Yarn option called Plug'n'Play that removes the need for node_modules, but we are not using it yet.)
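
For reference, a minimal .yarnrc.yml for this kind of setup might look like the following; these values are assumptions for illustration, not our actual file:

# assumed .yarnrc.yml settings, for illustration only
nodeLinker: node-modules   # keep classic node_modules; Plug'n'Play stays off
cacheFolder: .yarn/cache   # where the package copies live; installs resolve from here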

Yarn v2 + Docker

For the CI, we want to build a Docker image for each of the workspaces, with only the relevant code and deps inside. To achieve this, there are two main issues:

  1. packaging: if we copy our package.json files directly into the image and run yarn install, the workspace: cross-dep references won't resolve. This can be worked around by copying the shared deps in, but it was still throwing errors. There is a yarn plugin that packs everything properly, but it is intended for production builds and strips all dev deps - which we need on CI for tasks such as test, lint, etc.

  2. cache: in our monorepo, the first yarn install fetches all the packages and builds the cache, which takes something like ~15 min. After that, the next install takes only a few seconds. Our goal is to prevent .yarn/cache from being fetched again at each new build. For that we use a standard multi-stage build where a first stage fetches everything and the next ones pick only what they need (see the sketch after this list).
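
As a sketch under those constraints, the multi-stage Dockerfile could take roughly this shape (stage names, base image, paths, and the workspace name are all hypothetical):

# sketch only: stage names, paths, and the workspace name are hypothetical
FROM node:14-alpine AS deps
WORKDIR /home/unlock
# copy the manifests first so Docker caches the dependency layer
COPY package.json yarn.lock .yarnrc.yml ./
COPY .yarn/ .yarn/
# (the per-workspace package.json files would need to be copied too)
RUN yarn install   # ~15 min cold, a few seconds once the cache is warm

FROM deps AS unlock-app
COPY . .
RUN yarn workspace @unlock-protocol/unlock-app build   # build a single workspace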

Export Yarn cache

So far I have used the Docker BuildKit cache, via RUN --mount=type=cache,target=/home/unlock/yarn-cache,uid=1000,gid=1000 yarn install in our Dockerfile (shown in context below). This works well locally but won't on CI, as the BuildKit cache is cleared for each new environment. Currently, there is no direct feature to export the BuildKit cache - it is being worked on in moby/buildkit#1512, but nothing is ready yet.
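
In context, that cache mount sits in the Dockerfile roughly like this; the syntax directive at the top is required for BuildKit mounts, and the surrounding lines are assumed:

# syntax=docker/dockerfile:1.2
FROM node:14-alpine
WORKDIR /home/unlock
COPY package.json yarn.lock .yarnrc.yml ./
# assumes cacheFolder in .yarnrc.yml points at /home/unlock/yarn-cache
RUN --mount=type=cache,target=/home/unlock/yarn-cache,uid=1000,gid=1000 \
    yarn install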

There are possible workarounds: exporting the cache to an archive with docker -o and mounting it back at the start. However, this doesn't seem possible on CircleCI (it requires root access) but should work on GitHub Actions (per that SO post).
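
A rough sketch of that workaround, assuming a dedicated deps stage that holds the warm cache (stage and path names are hypothetical):

$ DOCKER_BUILDKIT=1 docker build --target deps --output type=local,dest=./exported .
$ tar -czf yarn-cache.tgz -C ./exported .yarn/cache

The archive can then be persisted as a CI artifact and copied back into the build context before the next yarn install.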

An alternative is what Yarn calls Zero-Installs, which basically means checking .yarn/cache into the monorepo itself. In our case, that's 0.5GB added to the repo. There is a conversation about the pros and cons in yarnpkg/berry#180.

Another idea is to create a simple "yarn install" image that is rebuilt separately every now and then (once a day?) and used solely in CI as a source for the cache (rough sketch below).
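
Roughly, one way that could work (copying the cache out of the image rather than volume-mounting it; the image name and paths are hypothetical):

$ docker build -t unlock/yarn-cache:nightly -f Dockerfile.cache .   # scheduled job
$ id=$(docker create unlock/yarn-cache:nightly)                     # in CI
$ docker cp "$id":/home/unlock/.yarn/cache ./.yarn/cache
$ docker rm "$id"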

There may be other solutions, I am still digging. More soon : )

clemsos mentioned this pull request Sep 2, 2021
@julien51 commented Sep 2, 2021

Thanks for the clarification here! Maybe one thing we need to understand is why we have more than 500MB of deps? Any clear culprit that we can remove or clean up?

@clemsos commented Sep 2, 2021

> Thanks for the clarification here! Maybe one thing we need to understand is why we have more than 500MB of deps? Any clear culprit that we can remove or clean up?

Yes, that's a better approach. Here's the breakdown:

$ du -sh ./node_modules/* | sort -hr | head -n 20

142M    ./node_modules/@netlify
141M    ./node_modules/@storybook
113M    ./node_modules/truffle
 95M    ./node_modules/@openzeppelin
 87M    ./node_modules/@vue
 70M    ./node_modules/hardlydifficult-ethereum-contracts
 66M    ./node_modules/aws-sdk
 63M    ./node_modules/@truffle
 61M    ./node_modules/mcl-wasm
 61M    ./node_modules/ethers
 58M    ./node_modules/typescript
 58M    ./node_modules/microbundle-crl
 54M    ./node_modules/detective-typescript
 54M    ./node_modules/@firebase
 49M    ./node_modules/ethereumjs-testrpc
 46M    ./node_modules/solidity-coverage
 45M    ./node_modules/ganache-core
 43M    ./node_modules/hardhat
 33M    ./node_modules/hardlydifficult-eth
 30M    ./node_modules/next

What we can do:

  1. clean up smart-contract-extensions: remove @vue/cli (unused), migrate to hardhat (removes truffle, ganache, solidity-coverage, etc.), remove the deprecated @openzeppelin/upgrades
  2. replace the Netlify deployment with a dockerized CLI image. It should be possible to dockerize the CLI and build from an image instead of a node dep. That will remove @netlify and detective-typescript
  3. remove the unused @openzeppelin/cli ref in smart-contracts. This will also remove @firebase

I think we can already cut the size down to <150MB with this.

@julien51 commented Sep 2, 2021

Sounds good!

Please do move smart-contract-extensions to hardhat first! I think that's low-hanging fruit and you have experience with it already :)

I can look at the @netlify stuff and the @openzeppelin/cli.

I will also look into Storybook and see if we can make it lighter.

@clemsos commented Sep 3, 2021

OK, lots of learning here, and also some complexity. I am going to close this and split the problems into smaller PRs.

clemsos closed this Sep 3, 2021
clemsos mentioned this pull request Sep 22, 2021