Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: add new builder that checks if hardcoded URLs can be replaced with crossrefs #9626

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

hoefling
Copy link
Contributor

Subject: (copied from pytest-dev/pytest#9082)

This PR adds a new no-op Sphinx builder named crossrefcheck that checks whether a hardcoded URL in the text can be replaced with a crossreference from one of the inventories configured in intersphinx_mapping.

Feature or Bugfix

  • Feature

Purpose

If a hardcoded link can be replaced with a crossref, crossrefcheck will emit a warning like this:

pytest/doc/en/reference/reference.rst:127: WARNING: hardcoded link 'https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertAlmostEqual' could be replaced by a cross-reference to 'python' inventory (try using ':py:meth:`unittest.TestCase.assertAlmostEqual`' instead)

If a replacement suggestion is printed, the hardcoded URL can be safely replaced with that suggestion.

Usage

$ sphinx-build -b crossrefcheck doc build

Caveats

The builder is not able to catch the URLs for the internal docs. This is because no information is known about the URL the docs will be hosted at. Thus, if e.g. links to https://docs.pytest.org should be checked as well in the docs for the pytest project, it is best to extend intersphinx_mapping. Example:

 intersphinx_mapping = {
    ...
    "pytest-latest": ("https://docs.pytest.org/en/latest", None),
    "pytest-stable": ("https://docs.pytest.org/en/stable", None),
    "pytest-6": ("https://docs.pytest.org/en/6.2.x", None),
 }

If a URL is reported without a suggestion, it usually indicates that either:

  1. a link points to a section or paragraph that doesn't have a Sphinx reference defined - the external docs should be updated then,
  2. or the reference was changed, so the link is outdated (but still valid!). In this case, it's usually best to check the docs source to find the new correct ref name.

Example output from running over Sphinx repository

sphinx/doc/development/tutorials/helloworld.rst:132: WARNING: hardcoded link 'https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH' could be replaced by a cross-reference to 'python' inventory (try using ':std:envvar:`PYTHONPATH`' instead)
sphinx/doc/development/tutorials/helloworld.rst:151: WARNING: hardcoded link 'https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH' could be replaced by a cross-reference to 'python' inventory (try using ':std:envvar:`PYTHONPATH`' instead)
sphinx/doc/development/tutorials/todo.rst:294: WARNING: hardcoded link 'https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH' could be replaced by a cross-reference to 'python' inventory (try using ':std:envvar:`PYTHONPATH`' instead)
sphinx/doc/examples.rst:95: WARNING: hardcoded link 'https://docs.python.org/3/' could be replaced by a cross-reference to 'python' inventory (no suggestion)
sphinx/doc/man/sphinx-apidoc.rst:17: WARNING: hardcoded link 'https://docs.python.org/3/library/fnmatch.html' could be replaced by a cross-reference to 'python' inventory (try using ':std:doc:`library/fnmatch`' instead)
sphinx/doc/usage/configuration.rst:570: WARNING: hardcoded link 'https://requests.readthedocs.io/en/master/' could be replaced by a cross-reference to 'requests' inventory (no suggestion)
sphinx/doc/usage/restructuredtext/directives.rst:1000: WARNING: hardcoded link 'https://docs.python.org/3/reference/lexical_analysis.html#identifiers' could be replaced by a cross-reference to 'python' inventory (try using ':std:ref:`identifiers`' instead)

@hoefling
Copy link
Contributor Author

hoefling commented Sep 11, 2021

The original proposal was made in pytest-dev/pytest#9082, where it was suggested to add the builder to Sphinx instead, so it can be reused in multiple projects. Please note that this PR isn't ready to be merged yet (e.g. missing tests and proper documentation). Its purpose is merely to propose the builder first and check whether @tk0miya would approve the general idea.

@tk0miya
Copy link
Member

tk0miya commented Sep 12, 2021

Agreed. This must be worthy. So +1 for merging this into the core. But I'm not sure this should be added as a standalone builder or an additional feature of linkcheck builder. Is there any reason to make this standalone?

@jakobandersen
Copy link
Contributor

This is pretty cool. Implementation-wise I'm wondering if it is possible to have a bit more indirection such that this doesn't access the intersphinx inventory directly, but asks the intersphinx extension to lookup an URL and return the necessary information.
The issue is that in order to update intersphinx to work in all cases the inventory format needs to change so it would be good if the rest of Sphinx doesn't assume a specific inventory format.

@Zac-HD
Copy link
Contributor

Zac-HD commented Sep 13, 2021

I'm not sure this should be added as a standalone builder or an additional feature of linkcheck builder. Is there any reason to make this standalone?

I would prefer a standalone crossref builder so I can run it offline in CI, where linkcheck can fail due to flaky websites or network issues.

@hoefling
Copy link
Contributor Author

hoefling commented Sep 25, 2021

Since the changes are added to Sphinx itself, the builder might not be necessary at all, depending on how deeply those warnings should be integrated. E.g. I could add a SphinxPostTransform directly to sphinx.ext.intersphinx and emit the warnings by default when the docs are built and the intersphinx extension is enabled. @tk0miya @Zac-HD what do you think? I have added a prototype in the latest commit where the warnings are emitted directly on doc build and no special builder is needed. This also simplifies the changes a lot (one new SphinxPostTransform, one extra utility function). The move of the code to intersphinx also addresses @jakobandersen's comment, since all the logic is now internal to intersphinx extension.

@hoefling hoefling force-pushed the crossrefcheck-builder branch from 58ae250 to 241a390 Compare September 25, 2021 12:53
@Zac-HD
Copy link
Contributor

Zac-HD commented Sep 25, 2021

That would be fantastic for all my use-cases!

@RonnyPfannschmidt
Copy link

follow-up idea - for linkcheck identify links that are in locations where a intersphinx ref could be used or added (for links to other docs that use sphinx but are not declared as intersphinx

@hoefling
Copy link
Contributor Author

@tk0miya gentle ping for a review.

@hoefling hoefling force-pushed the crossrefcheck-builder branch from 4273745 to e03424f Compare October 20, 2021 21:10
@hoefling hoefling force-pushed the crossrefcheck-builder branch 2 times, most recently from 4026ad1 to f2f07a3 Compare November 12, 2021 19:18
@hoefling hoefling changed the base branch from master to 4.x November 12, 2021 19:18
@hoefling hoefling force-pushed the crossrefcheck-builder branch from f13bbb9 to 6dead54 Compare December 9, 2021 08:56
@hoefling
Copy link
Contributor Author

hoefling commented Dec 9, 2021

@tk0miya gentle ping. Maybe it's still not too late for this to become part of 4.4?

@hoefling hoefling force-pushed the crossrefcheck-builder branch from aeafef9 to 48f8b2b Compare December 11, 2021 13:29
@hoefling hoefling force-pushed the crossrefcheck-builder branch from 48f8b2b to 327f381 Compare December 20, 2021 13:08
@jakobandersen jakobandersen added this to the 4.4.0 milestone Dec 29, 2021
@tk0miya tk0miya modified the milestones: 4.4.0, 4.5.0 Jan 15, 2022
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
…e in docstring

Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: Oleg Hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
@hoefling hoefling force-pushed the crossrefcheck-builder branch from 9423340 to 2d50d64 Compare August 29, 2023 13:52
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
Signed-off-by: oleg.hoefling <oleg.hoefling@gmail.com>
@hoefling
Copy link
Contributor Author

@AA-Turner done 😎

@@ -1063,6 +1063,8 @@ Features added
* #9075: autodoc: Add a config variable :confval:`autodoc_typehints_format`
to suppress the leading module names of typehints of function signatures (ex.
``io.StringIO`` -> ``StringIO``)
* #9626: intersphinx: Emit warning if a hardcoded link is replaceable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to the latest version now!

@@ -116,8 +116,8 @@

intersphinx_mapping = {
'python': ('https://docs.python.org/3/', None),
'requests': ('https://requests.readthedocs.io/en/latest/', None),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's minimize the changes (unless there is a reason to change this)

@@ -25,7 +25,7 @@
import sys
import time
from os import path
from typing import TYPE_CHECKING, cast
from typing import IO, TYPE_CHECKING, Any, cast
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you need IO and Any as runtime types?

# Project: foo
# Version: 2.0
# The remainder of this file is compressed with zlib.
''' + zlib.compress(b'')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcode the value of zlib.compress(..) and put a comment.

Other suggestion: make these constants private or create a class testing find_replacements so that the logic is "contained" (with a class-based approach you could put the setup of your test in a method).

normalize_intersphinx_mapping(app, app.config)
load_mappings(app)

assert next(find_replacements(app, 'https://example.com'), None) is None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert not list(...)

load_mappings(app)

uri = 'https://example.com/foo.html#module-module1'
replacement = next(find_replacements(app, uri), None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert not list(...)

load_mappings(app)

uri = 'https://docs.python.org/foo.html#module-module1'
replacement = next(find_replacements(app, uri))
Copy link
Member

@picnixz picnixz Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replacements = list(...)
assert len(replacements) == ...
assert replacements[0] == ...
...

or

assert list(...) == [...]

@AA-Turner AA-Turner modified the milestones: 7.x, 8.x Jul 20, 2024
@AA-Turner AA-Turner modified the milestones: 8.x, some future version Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants