Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache pip downloads in CI #9192

Merged
merged 4 commits into from
Nov 15, 2022
Merged

Conversation

Avasam
Copy link
Collaborator

@Avasam Avasam commented Nov 13, 2022

Addresses #9189

  • Let actions/setup-python know to cache pip.
  • Jobs that don't run pip install have no cache specified.
  • I've specified the relevant cache-dependency-path for every job.
    • mypy_primer is a bit of a special case, but it does so much downloading it should still be beneficial. Edit: To be reconsidered, we don't have a great way of specifying a cache key to invalidate.

Note that this only caches the download of distributions, not the install (for that, see: actions/setup-python#330 )
This means that any change in one of the files specified for cache-dependency-path will result in a different cache hash and all dependencies will be redownloaded (current behaviour).

Further improvements may include:

  • Caching of machine dependencies (apt, Homebrew, Choco).
  • Extracting the "skip" logic from stubtest_third_party so that the job can skip installs if stubtest is going to be skipped.

I wanna keep this PR simple and just targeting pip dependencies. I can create new issues for those.

@github-actions
Copy link
Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@@ -55,6 +58,10 @@ jobs:
- uses: actions/setup-python@v4
with:
python-version: "3.9"
cache: 'pip'
cache-dependency-path: |
requirements-tests.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this contain METADATA.toml files? 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we'll be able to specify non-types dependencies, are they going to be specified in METADATA.toml ? If so, then yes cache-dependency-path should contain METADATA.toml for stubtest jobs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we'll be able to specify non-types dependencies, are they going to be specified in METADATA.toml ?

I think that's the idea, yup!

- name: Update pip
if: matrix.os == "macos-11"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this line? I usually update pip on all envs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, pip is always updated and included in runner images. Because of this, the latest version of pip is usually installed already.

actions/setup-python#506 (comment)

I can only assume python -m pip install -U pip was done because of "macos-11". As the only times that command is ran is when also using "macos-11. So only running it on that OS is a small optimization for the others.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we have PIP_DISABLE_PIP_VERSION_CHECK: 1 set as an environment variable in this workflow, we probably don't need this step at all. It usually isn't essential to have the very latest version of pip installed; the only significant benefit from upgrading pip in CI is that you don't get the annoying "WARNING: a newer version of pip is available, you should upgrade NOW" message with every CI job.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even notice PIP_DISABLE_PIP_VERSION_CHECK. Should I remove the step? Leave it as it was before?

Copy link
Member

@AlexWaygood AlexWaygood Nov 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reckon you can remove it. There's other workflows we have in this repo where we don't have it as a step, and removing the step fits with the theme of this PR ("speed up CI by getting pip to do less work").

Comment on lines 37 to 39
python-version: "3.9"
cache: 'pip'
cache-dependency-path: requirements-tests.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

          python-version: "3.9"
          cache: 'pip'
          cache-dependency-path: requirements-tests.txt

Consider this as a rant about yaml: these three lines have three different quotes 😒

Copy link
Collaborator Author

@Avasam Avasam Nov 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 What's the preferred quote style between " and ' here? Normally I use the redhat.vscode-yaml plugin which formats to "(non-configurable). But I've disabled all my auto-formatters on non-python files in this project.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the preferred quote style between " and ' here?

I don't think we have one, and I for one don't care at all, as long as it works :)

Comment on lines 35 to 39
cache: 'pip'
cache-dependency-path: |
typeshed_to_test/requirements-tests.txt
typeshed_to_test/stubs/**/@tests/requirements-stubtest.txt
typeshed_to_test/stubs/**/METADATA.toml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this will actually speed anything up. The only thing this workflow directly pip-installs is mypy_primer, which has 0 dependencies. mypy_primer will then dynamically install loads of things during the course of the script's execution, but I don't think it'll ever install anything from typeshed's requirements-tests.txt file, any METADATA.toml files, or any requirements-stubtest.txt files.

Maybe just get rid of the changes to this workflow file, for now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more sensible to base the cache key on mypy_primer.py. But I can't really specify that. mypy_primer downloads so much it would make sense to cache those. But then how do we specify that the cache should be invalidated? Here I tried to target everything that may suggest a need to invalidate (ie version bump or dependency change of the stub).

Anyway, this can be reconsidered, I'll revert for now.

@Avasam Avasam requested review from sobolevn and AlexWaygood and removed request for sobolevn November 15, 2022 00:47
Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@AlexWaygood AlexWaygood changed the title Cache pip downloads Cache pip downloads in CI Nov 15, 2022
@AlexWaygood AlexWaygood merged commit cfed3e1 into python:main Nov 15, 2022
@Avasam Avasam deleted the cache-pip-downloads branch November 15, 2022 01:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants