Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: add new docstring-code-format knob #8854

Merged
merged 2 commits into from
Dec 13, 2023
Merged

Conversation

BurntSushi
Copy link
Member

@BurntSushi BurntSushi commented Nov 27, 2023

This PR does the plumbing to make a new formatting option, docstring-code-format, available in the configuration for end users. It is disabled by default (opt-in). It is opt-in at least initially to reflect a conservative posture. The intent is to make it opt-out at some point in the future.

This was split out from #8811 in order to make #8811 easier to merge. Namely, once this is merged, docstring code snippet formatting will become available to end users. (See comments below for how we arrived at the name.)

Closes #7146

Test Plan

Other than the standard test suite, I ran the formatter over the CPython and polars projects to ensure both that the result looked sensible and that tests still passed. At time of writing, one issue that currently appears is that reformatting code snippets trips the long line lint: https://github.com/BurntSushi/polars/actions/runs/7006619426/job/19058868021

@BurntSushi BurntSushi mentioned this pull request Nov 27, 2023
3 tasks
Copy link
Contributor

github-actions bot commented Nov 27, 2023

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@BurntSushi
Copy link
Member Author

There's also a question of whether this ought to be a boolean option at all. @konstin raised this here: #8811 (comment)

That is, we might want the formatter to specifically raise an error if it comes across a snippet that is invalid Python. I somewhat think this would be better as a lint, but it would likely be much easier to implement in the formatter. If it were a lint, then we'd need to refactor the code snippet logic out of the formatter into something more generalized.

Base automatically changed from ag/fmt/docstrings to main November 27, 2023 16:14
Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change itself looks good to me but I would prefer changing the option name (also on PyFormatOptions).

Rustfmt uses format_code_in_doc_comments which we could change to format_code_in_docstrings.

I prefer the name because

  • It's unclear to me what snippets means without prefixing it with code- which I find to verbose
  • format-embedded might be ambiguous if we start formatting embedded language, let's say SQL or CSS in your Python code (the other way round)
  • I'm okay with format-docstring-code, although I find rustfmt's naming clearer.

I'm okay with using a boolean. I don't think we should bail formatting just because of an incorrect example. Ideally we would emit a warning, but that's tracked as a separate issue.

The other question: Should this be gated behind preview mode (sorry if this is part of your other PR)? If so, should we emit a warning if it is enabled but the formatter is run without preview mode?

@zanieb
Copy link
Member

zanieb commented Nov 28, 2023

I like format-code-in-docstrings. I prefer clarity in settings, a little verbosity goes a long way. As I suggested before, I'm also into format-docstring-code. Another option is format-docstring-code-snippets. I think we can consider non-boolean options in the future, but we should start with the simple case.

Should this be gated behind preview mode (sorry if this is part of your other PR)? If so, should we emit a warning if it is enabled but the formatter is run without preview mode?

The idea is opt-in without preview opt-out (i.e. on by default) with preview. Having a "double" opt-in seems overkill.

Copy link
Member

@konstin konstin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format-code-in-docstrings sounds good

@BurntSushi
Copy link
Member Author

We have a winner. format-code-in-docstrings it is!

@charliermarsh
Copy link
Member

Works for me :)

@BurntSushi BurntSushi changed the title config: add new 'docstring-code' knob config: add new format-code-in-docstrings knob Dec 5, 2023
BurntSushi added a commit that referenced this pull request Dec 5, 2023
(This is not possible to actually use until
#8854 is merged.)

ruff_python_formatter: add reStructuredText docstring formatting support

This commit makes use of the refactoring done in prior commits to slot
in reStructuredText support. Essentially, we add a new type of code
example and look for *both* literal blocks and code block directives.
Literal blocks are treated as Python by default because it seems to be a
[common
practice](adamchainz/blacken-docs#195).

That is, literal blocks like this:

```
def example():
    """
    Here's an example::

        foo( 1 )

    All done.
    """
    pass
```

Will get reformatted. And code blocks (via reStructuredText directives)
will also get reformatted:


```
def example():
    """
    Here's an example:

    .. code-block:: python

        foo( 1 )

    All done.
    """
    pass
```

When looking for a code block, it is possible for it to become invalid.
In which case, we back out of looking for a code example and print the
lines out as they are. As with doctest formatting, if reformatting the
code would result in invalid Python or if the code collected from the
block is invalid, then formatting is also skipped.

A number of tests have been added to check both the formatting and
resetting behavior. Mixed indentation is also tested a fair bit, since
one of my initial attempts at dealing with mixed indentation ended up
not working.

I recommend working through this PR commit-by-commit. There is in
particular a somewhat gnarly refactoring before reST support is added.

Closes #8859
BurntSushi added a commit that referenced this pull request Dec 7, 2023
(This is not possible to actually use until
#8854 is merged.)

This commit slots in support for formatting Markdown fenced code
blocks[1]. With the refactoring done for reStructuredText previously,
this ended up being pretty easy to add. Markdown code blocks are also
quite a bit easier to parse and recognize correctly.

One point of contention in #8860 is whether to assume that unlabeled
Markdown code fences are Python or not by default. In this PR, we make
such an assumption. This follows what `rustdoc` does. The mitigation
here is that if an unlabeled code block isn't Python, then it probably
won't parse as Python. And we'll end up skipping it. So in the vast
majority of cases, the worst thing that can happen is a little bit of
wasted work.

Closes #8860

[1]: https://spec.commonmark.org/0.30/#fenced-code-blocks
@BurntSushi BurntSushi force-pushed the ag/fmt/user-config branch 2 times, most recently from 3d5d433 to a8f9c58 Compare December 8, 2023 18:24
BurntSushi added a commit that referenced this pull request Dec 11, 2023
…#9055)

## Summary

This does the light plumbing necessary to add a new internal option that
permits setting the line width of code examples in docstrings. The plan
is to add the corresponding user facing knob in #8854.

Note that this effectively removes the `same-as-global` configuration
style discussed [in this
comment](#8855 (comment)).
It replaces it with the `{integer}` configuration style only.

There are a lot of commits here, but they are each tiny to make review
easier because of the changes to snapshots.

## Test Plan

I added a new docstring test configuration that sets
`docstring-code-line-width = 60` and examined the differences.
@BurntSushi BurntSushi changed the title config: add new format-code-in-docstrings knob config: add new docstring-code-format knob Dec 11, 2023
@BurntSushi
Copy link
Member Author

I think this was talked about internally and not here on the PR, but we ended up settling on docstring-code-format for the name instead. I think we all originally liked format-code-in-docstrings, but once we realized we were going to add another option for controlling the line width in docstring code snippets, the naming became a little trickier. With docstring-code-format, there's a natural name for line width: docstring-code-line-width. That is, the docstring-code- prefix acts as a namespace of sorts with the noun first and the verb second. But format-code-in-docstrings doesn't really have that property, with the verb first and the noun coming afterwards.

@MichaReiser
Copy link
Member

Note: docstring-code-line-width should be named docstring-code-line-length to align with the global line-length setting. We only use line-width internally (and we may want to perform an internal rename to use consistent naming internally and externally)

@BurntSushi BurntSushi force-pushed the ag/fmt/user-config branch 2 times, most recently from 5405750 to 20ef1a4 Compare December 12, 2023 17:01
@BurntSushi
Copy link
Member Author

All righty, I think this PR is now in good spot where another review would be good. The probable changes since the last time you looked:

  • The docstring-code-line-length option has been added. It defaults to dynamic, but users can also set a fixed integer too.
  • I've added an integration-level test to sanity check that both docstring code options are actually working.
  • I've polished up the docs at the setting level and made some small edits to the existing prose formatter docs. I stopped short of adding a new section because I kind of felt like the docs on the setting itself might cover things, but happy to write more if folks think it would be useful.

Copy link
Member

@charliermarsh charliermarsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but can I request that we add a section like:

## Docstring formatting

The Ruff Formatter is capable of formatting Python code within docstrings...

...to formatter.md, just after its ## Configuration block?

@BurntSushi BurntSushi force-pushed the ag/fmt/user-config branch 2 times, most recently from f3e8073 to a6510f7 Compare December 12, 2023 17:49
@BurntSushi
Copy link
Member Author

@charliermarsh I've added prose docs and fixed up your comments. A review of the prose docs would be great! Thanks.

Copy link
Member

@charliermarsh charliermarsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks!

crates/ruff_workspace/src/options.rs Outdated Show resolved Hide resolved
crates/ruff_workspace/src/options.rs Show resolved Hide resolved
…namic'

I do feel like this is probably the right default and perhaps will be
the least surprising. It also has the benefit of avoiding the line
length lint in all cases.
…th' knobs

This commit does the plumbing to make a new formatting option,
'docstring-code-format', available in the configuration for end users.
It is disabled by default (opt-in). It is opt-in at least initially to
reflect a conservative posture. The intent is to make it opt-out at
some point in the future.

This also adds a compansion option, 'docstring-code-line-length', that
permits setting a line length for reformatted code examples that is
distinct from the global setting. Its default value is 'dynamic', which
means reformatted code will respect the global line width, regardless of
the indent level of the enclosing docstring.
@BurntSushi BurntSushi merged commit b6fb972 into main Dec 13, 2023
17 checks passed
@BurntSushi BurntSushi deleted the ag/fmt/user-config branch December 13, 2023 16:02
@cpcloud
Copy link

cpcloud commented Dec 18, 2023

This feature is most excellent, thank you for working on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using ruff to format code examples in docstrings
6 participants