Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor: linkcheck builder: reduce the lifetime of the 'response' variable #11432

Conversation

jayaddison
Copy link
Contributor

Feature or Bugfix

  • Refactoring

Purpose

  • Reduce the runtime duration during which the response variable that holds the results of HTTP request results should be considered live and is non-garbage-collectable.

Detail

  • I think that delayed resource-collection of the response variable may be a contributing factor to timeouts that occur during our unit tests.

Relates

@AA-Turner AA-Turner merged commit e45fb5e into sphinx-doc:master Jul 20, 2023
@AA-Turner
Copy link
Member

Thanks @jayaddison!

A

@AA-Turner
Copy link
Member

AA-Turner commented Jul 20, 2023

https://github.com/sphinx-doc/sphinx/actions/runs/5615478788/job/15215837543 -- LaTeX timed out

FAILED tests/test_build_linkcheck.py::test_defaults - assert "Anchor 'top' not found" in "links.rst:6: [broken] http://localhost:7777/#top: HTTPConnectionPool(host='localhost', port=7777): Read timed out. (read timeout=0.05)\nlinks.rst:4: [broken] http://localhost:7777/#!bar: HTTPConnectionPool(host='localhost', port=7777): Read timed out. (read timeout=0.05)\nlinks.rst:9: [broken] path/to/notfound: \nlinks.rst:7: [broken] http://localhost:7777#does-not-exist: Anchor 'does-not-exist' not found\nlinks.rst:11: [broken] http://localhost:7777/image.png: 404 Client Error: Not Found for url: http://localhost:7777/image.png\nlinks.rst:13: [broken] http://localhost:7777/image2.png: 404 Client Error: Not Found for url: http://localhost:7777/image2.png\n"

@jayaddison
Copy link
Contributor Author

@jayaddison jayaddison deleted the issue-11324-prep/linkcheck-refactor-response-variable-lifetime branch July 22, 2023 08:51
Comment on lines +322 to +325
with retrieval_method(url=req_url, auth=auth_info, config=self.config,
**retrieval_kwargs, **kwargs) as response:
if response.ok and anchor and not contains_anchor(response, anchor):
raise Exception(__(f'Anchor {anchor!r} not found'))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I introduced a regression with the rewrite of these lines - and the regression is reported here: #11532 (comment)

Basically: when anchor checking is not enabled -- that is, self.linkcheck_anchors is False -- the first retrieval method returned is an HTTP HEAD request. It can have response.ok, and the anchor value may be non-empty -- because we continue to extract (partition) the anchor from the uri regardless of whether anchor linkchecking will be performed. The HTTP HEAD response doesn't contain a content body, so contains_anchor will return false, and the exception is raised.

I'll prepare a fixup/patch for this and some test coverage to cover this situation within the next few days.

cc @adamtheturtle

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 31, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants