Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: use cached file for wheel inspection #7916

Merged

Conversation

ralbertazzi
Copy link
Contributor

@ralbertazzi ralbertazzi commented May 13, 2023

Another PR that goes in the way of improving the Poetry experience when dealing with huge wheels (did anyone say PyTorch? 👀 #6409 )

I'm hosting huge wheels on a repository that is not PEP 658 compliant. What happens is that Poetry will rightfully want to get package metadata from the first wheel. The package is correctly cached on filesystem in the repository cache (the whole Authenticator + FileCache flow) so that future downloads are not needed again.

However, when a wheel is already cached on filesystem the inspection code performs an unnecessary data transfer operation from the cache location to a temporary directory. As the speed of this operation is controlled by the low chunk_size: int = 1024 value, we spend a gigantic amount of time waiting for something that we already have. Console logs also incorrectly show that a package is being downloaded, while in reality no network operation is happening.

This PR fixes this behaviour by accessing the cached file directly, and relying on the download to temporary file only if the file is not cached (or the cache is disabled). In order to do so it exploits the public url_to_file_path utility from cachecontrol. I thought it was best leaving any direct cachecontrol dependency inside the Authenticator, which is the reason why I'm touching two classes.

  • Added tests for changed code.
  • [ ] Updated documentation for changed code.

@ralbertazzi ralbertazzi force-pushed the perf/use-cached-weel-for-inspection branch from 14329f9 to 176b109 Compare May 13, 2023 12:48
@ralbertazzi ralbertazzi force-pushed the perf/use-cached-weel-for-inspection branch 2 times, most recently from 1dda952 to 45617e1 Compare May 13, 2023 14:29
@ralbertazzi ralbertazzi force-pushed the perf/use-cached-weel-for-inspection branch 2 times, most recently from 7aeb909 to 48df4ab Compare May 13, 2023 19:57
@radoering radoering mentioned this pull request May 14, 2023
@ralbertazzi ralbertazzi force-pushed the perf/use-cached-weel-for-inspection branch 3 times, most recently from 8570d06 to 06f8dd6 Compare May 14, 2023 16:35
@ralbertazzi ralbertazzi force-pushed the perf/use-cached-weel-for-inspection branch from 06f8dd6 to 9d68507 Compare May 15, 2023 15:13
@radoering radoering merged commit 9d03170 into python-poetry:master May 15, 2023
@ralbertazzi ralbertazzi deleted the perf/use-cached-weel-for-inspection branch May 16, 2023 14:07
mwalbeck pushed a commit to mwalbeck/docker-python-poetry that referenced this pull request May 23, 2023
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [poetry](https://python-poetry.org/) ([source](https://github.com/python-poetry/poetry), [changelog](https://python-poetry.org/history/)) | minor | `1.4.2` -> `1.5.0` |

---

### Release Notes

<details>
<summary>python-poetry/poetry</summary>

### [`v1.5.0`](https://github.com/python-poetry/poetry/blob/HEAD/CHANGELOG.md#&#8203;150---2023-05-19)

[Compare Source](python-poetry/poetry@1.4.2...1.5.0)

##### Added

-   **Introduce the new source priorities `explicit` and `supplemental`** ([#&#8203;7658](python-poetry/poetry#7658),
    [#&#8203;6879](python-poetry/poetry#6879)).
-   **Introduce the option to configure the priority of the implicit PyPI source** ([#&#8203;7801](python-poetry/poetry#7801)).
-   Add handling for corrupt cache files ([#&#8203;7453](python-poetry/poetry#7453)).
-   Improve caching of URL and git dependencies ([#&#8203;7693](python-poetry/poetry#7693),
    [#&#8203;7473](python-poetry/poetry#7473)).
-   Add option to skip installing directory dependencies ([#&#8203;6845](python-poetry/poetry#6845),
    [#&#8203;7923](python-poetry/poetry#7923)).
-   Add `--executable` option to `poetry env info` ([#&#8203;7547](python-poetry/poetry#7547)).
-   Add `--top-level` option to `poetry show` ([#&#8203;7415](python-poetry/poetry#7415)).
-   Add `--lock` option to `poetry remove` ([#&#8203;7917](python-poetry/poetry#7917)).
-   Add experimental `POETRY_REQUESTS_TIMEOUT` option ([#&#8203;7081](python-poetry/poetry#7081)).
-   Improve performance of wheel inspection by avoiding unnecessary file copy operations ([#&#8203;7916](python-poetry/poetry#7916)).

##### Changed

-   **Remove the old deprecated installer and the corresponding setting `experimental.new-installer`** ([#&#8203;7356](python-poetry/poetry#7356)).
-   **Introduce `priority` key for sources and deprecate flags `default` and `secondary`** ([#&#8203;7658](python-poetry/poetry#7658)).
-   Deprecate `poetry run <entry point>` if the entry point was not previously installed via `poetry install` ([#&#8203;7606](python-poetry/poetry#7606)).
-   Only write the lock file if the installation succeeds ([#&#8203;7498](python-poetry/poetry#7498)).
-   Do not write the unused package category into the lock file ([#&#8203;7637](python-poetry/poetry#7637)).

##### Fixed

-   Fix an issue where Poetry's internal pyproject.toml continually grows larger with empty lines ([#&#8203;7705](python-poetry/poetry#7705)).
-   Fix an issue where Poetry crashes due to corrupt cache files ([#&#8203;7453](python-poetry/poetry#7453)).
-   Fix an issue where the `Retry-After` in HTTP responses was not respected and retries were handled inconsistently ([#&#8203;7072](python-poetry/poetry#7072)).
-   Fix an issue where Poetry silently ignored invalid groups ([#&#8203;7529](python-poetry/poetry#7529)).
-   Fix an issue where Poetry does not find a compatible Python version if not given explicitly ([#&#8203;7771](python-poetry/poetry#7771)).
-   Fix an issue where the `direct_url.json` of an editable install from a git dependency was invalid ([#&#8203;7473](python-poetry/poetry#7473)).
-   Fix an issue where error messages from build backends were not decoded correctly ([#&#8203;7781](python-poetry/poetry#7781)).
-   Fix an infinite loop when adding certain dependencies ([#&#8203;7405](python-poetry/poetry#7405)).
-   Fix an issue where pre-commit hooks skip pyproject.toml files in subdirectories ([#&#8203;7239](python-poetry/poetry#7239)).
-   Fix an issue where pre-commit hooks do not use the expected Python version ([#&#8203;6989](python-poetry/poetry#6989)).
-   Fix an issue where an unclear error message is printed if the project name is the same as one of its dependencies ([#&#8203;7757](python-poetry/poetry#7757)).
-   Fix an issue where `poetry install` returns a zero exit status even though the build script failed ([#&#8203;7812](python-poetry/poetry#7812)).
-   Fix an issue where an existing `.venv` was not used if `in-project` was not set ([#&#8203;7792](python-poetry/poetry#7792)).
-   Fix an issue where multiple extras passed to `poetry add` were not parsed correctly ([#&#8203;7836](python-poetry/poetry#7836)).
-   Fix an issue where `poetry shell` did not send a newline to `fish` ([#&#8203;7884](python-poetry/poetry#7884)).
-   Fix an issue where `poetry update --lock` printed operations that were not executed ([#&#8203;7915](python-poetry/poetry#7915)).
-   Fix an issue where `poetry add --lock` did perform a full update of all dependencies ([#&#8203;7920](python-poetry/poetry#7920)).
-   Fix an issue where `poetry shell` did not work with `nushell` ([#&#8203;7919](python-poetry/poetry#7919)).
-   Fix an issue where subprocess calls failed on Python 3.7 ([#&#8203;7932](python-poetry/poetry#7932)).
-   Fix an issue where keyring was called even though the password was stored in an environment variable ([#&#8203;7928](python-poetry/poetry#7928)).

##### Docs

-   Add information about what to use instead of `--dev` ([#&#8203;7647](python-poetry/poetry#7647)).
-   Promote semantic versioning less aggressively ([#&#8203;7517](python-poetry/poetry#7517)).
-   Explain Poetry's own versioning scheme in the FAQ ([#&#8203;7517](python-poetry/poetry#7517)).
-   Update documentation for configuration with environment variables ([#&#8203;6711](python-poetry/poetry#6711)).
-   Add details how to disable the virtualenv prompt ([#&#8203;7874](python-poetry/poetry#7874)).
-   Improve documentation on whether to commit `poetry.lock` ([#&#8203;7506](python-poetry/poetry#7506)).
-   Improve documentation of `virtualenv.create` ([#&#8203;7608](python-poetry/poetry#7608)).

##### poetry-core ([`1.6.0`](https://github.com/python-poetry/poetry-core/releases/tag/1.6.0))

-   Improve error message for invalid markers ([#&#8203;569](python-poetry/poetry-core#569)).
-   Increase robustness when deleting temporary directories on Windows ([#&#8203;460](python-poetry/poetry-core#460)).
-   Replace `tomlkit` with `tomli`, which changes the interface of some *internal* classes ([#&#8203;483](python-poetry/poetry-core#483)).
-   Deprecate `Package.category` ([#&#8203;561](python-poetry/poetry-core#561)).
-   Fix a performance regression in marker handling ([#&#8203;568](python-poetry/poetry-core#568)).
-   Fix an issue where wildcard version constraints were not handled correctly ([#&#8203;402](python-poetry/poetry-core#402)).
-   Fix an issue where `poetry build` created duplicate Python classifiers if they were specified manually ([#&#8203;578](python-poetry/poetry-core#578)).
-   Fix an issue where local versions where not handled correctly ([#&#8203;579](python-poetry/poetry-core#579)).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNS44Mi4wIiwidXBkYXRlZEluVmVyIjoiMzUuODIuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Reviewed-on: https://git.walbeck.it/walbeck-it/docker-python-poetry/pulls/717
Co-authored-by: renovate-bot <bot@walbeck.it>
Co-committed-by: renovate-bot <bot@walbeck.it>
ralbertazzi added a commit to ralbertazzi/poetry that referenced this pull request May 24, 2023
@ralbertazzi ralbertazzi mentioned this pull request May 24, 2023
2 tasks
poetry-bot bot pushed a commit that referenced this pull request May 24, 2023
radoering pushed a commit that referenced this pull request May 24, 2023
Copy link

github-actions bot commented Mar 3, 2024

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants