-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend formatter IR to support Black's expression formatting #5596
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Current dependencies on/for this PR: This comment was auto-generated by Graphite. |
d40b984
to
c025995
Compare
b104f22
to
7135143
Compare
PR Check ResultsEcosystem✅ ecosystem check detected no changes. BenchmarkLinux
Windows
|
This was referenced Jul 8, 2023
4c48749
to
4ffd817
Compare
This was referenced Jul 10, 2023
4ffd817
to
6d11701
Compare
konstin
approved these changes
Jul 11, 2023
6d11701
to
ff18d52
Compare
MichaReiser
added a commit
that referenced
this pull request
Jul 11, 2023
<!-- Thank you for contributing to Ruff! To help us out with reviewing, please consider the following: - Does this pull request include a summary of the change? (See below.) - Does this pull request include a descriptive title? - Does this pull request include references to any relevant issues? --> ## Summary This PR implements Black's behavior where it first splits off parenthesized expressions before splitting before operands to avoid unnecessary parentheses: ```python # We want if a + [ b, c ]: pass # Rather than if ( a + [b, c] ): pass ``` This is implemented by using the new IR elements introduced in #5596. * We give the group wrapping the optional parentheses an ID (`parentheses_id`) * We use `conditional_group` for the lower priority groups (all non-parenthesized expressions) with the condition that the `parentheses_id` group breaks (we want to split before operands only if the parentheses are necessary) * We use `fits_expanded` to wrap all other parenthesized expressions (lists, dicts, sets), to prevent that expanding e.g. a list expands the `parentheses_id` group. We gate the `fits_expand` to only apply if the `parentheses_id` group fits (because we prefer `a\n+[b, c]` over expanding `[b, c]` if the whole expression gets parenthesized). We limit using `fits_expanded` and `conditional_group` only to expressions that themselves are not in parentheses (checking the conditions isn't free) ## Test Plan It increases the Jaccard index for Django from 0.915 to 0.917 ## Incompatibilites There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences). ### Long string literals I commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter than inserts the optional parentheses. But this is kind of useless because the overlong string will still not fit, because there are no new split points. I think we should ignore this incompatibility for now ### Expressions on statement level I don't fully understand the logic behind this yet, but black doesn't break before the operators for the following example even though the expression exceeds the configured line width ```python aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd ``` But it would if the expression is used inside of a condition. What I understand so far is that Black doesn't insert optional parentheses on the expression statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong-lines and use the compatibility report to make a decision if we should implement the same behavior.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
This PR makes two extensions to support Black's expression formatting.
ConditionalGroup
The first extension adds a new
ConditionalGroup
IR element. The semantic is the same as forGroup
, except that it has a condition controlling whether the enclosing content should be grouped or not. Another way to think about this is: Only group this content if some condition is true, but the condition only gets evaluated when printing the IR.This is necessary to support Black's formatting for expressions where the enclosing parentheses are optional, for example Black only adds parentheses around the condition of an
if
statement if the expression, after breaking any lists, dicts, etc., does not fit on a line:which is invalid syntax. Meaning we only want to break after left-parens (
(
,[
,{
), but not before operators.However, we do want to break before operators when the whole expression does not fit:
If the whole expression does not fit. You can see, how this is the opposite of when not adding the parentheses.
The way this extension solves this problem is to remove the binary expression groups (gate them with a condition) when the whole expression fits, but keep them when the whole expression expands.
Another way to think about the new IR is that this is the same as the following, but as custom IR to avoid interning
content
.FitsExpanded
The way we implement the adding of the optional parentheses is by wrapping the whole
if
condition by a group. However, this creates a problem if an inner parenthesized expression expands, because this would automatically expand all enclosing groups, including the group that adds the optional parentheses. But we don't want the optional parentheses if a parenthesized expression expands.The way this PR solves this problem is by introducing a new
FitsExpanded
IR that, similar to best fitting, acts as an expands boundary. Meaning, it won't expand the enclosing parenthesesgroup
if any of its content expands (a parenthesized expression). Other than best fitting, it also has a differentfits
definition. The content inside aFitsExpanded
doesn't get measured in the "all flat" mode, instead, it gets measured assuming all its inner content will expand (what's the least space that is required, rather than what is the most space that it requires).FitsExpanded
also supports an optionalCondition
to control when this changed semantic applies or not. This is necessary because we want to keep the list expression together, if possible, when we break before an operatorTest Plan
I added a few doctests and use the new IR in the next IR to format expressions.
Reference
This behavior mimics Blacks
can_omit_invisible_parentheses
and transformer selection.Ruff does not test whether a parenthesized expression is at the start or end of line. This is done inside of the printer (expanding after an opening parentheses always has the consequence that the expression now is at the end of the line)