Add formatter progress tracking to CI #5919

Merged
merged 2 commits into from
Jul 24, 2023

Conversation

konstin
Member

@konstin konstin commented Jul 20, 2023

Summary

Add a formatter progress testing script to CI. This script will 1) print the black compatibility on each run, 2) catch regressions in formatter stability, emitted invalid syntax, and other kinds of bugs (e.g. #5917) before they land on main, and 3) provide an additional layer of real-world tests when implementing new nodes or other new formatter code.

This is currently a bash script. I'm not sure if we want to keep it that way or switch to e.g. the regular ecosystem scripts. The output separation of format_dev could also use some polishing. We should also consider pinning commits so we don't get spurious regressions when the checked projects change their code.

Test Plan

The script extends CI.
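
For readers unfamiliar with the term, a formatter stability check can be read as an idempotency check: formatting already formatted code should change nothing. Below is a minimal bash sketch under that reading. `toy_format`, `unstable_format`, and `is_stable` are hypothetical stand-ins for illustration; they are not part of ruff or of the script in this PR.

```bash
#!/usr/bin/env bash

# Stand-in formatter (assumption, illustration only): squeezes repeated spaces.
toy_format() { tr -s ' '; }

# Deliberately unstable stand-in: appends " #" to every line on each run.
unstable_format() { sed 's/$/ #/'; }

# Returns 0 if running the formatter on its own output changes nothing.
# Usage: is_stable <formatter-command> <file>
is_stable() {
  local formatter=$1 file=$2 once twice
  once=$("$formatter" <"$file")
  twice=$(printf '%s\n' "$once" | "$formatter")
  # Command substitution strips trailing newlines, so differences that
  # consist only of trailing newlines are not detected by this sketch.
  [[ "$once" == "$twice" ]]
}
```

Calling `is_stable toy_format some_file.py` succeeds because squeezing spaces twice gives the same text as squeezing once, while `is_stable unstable_format some_file.py` fails because every pass adds more text.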

@konstin konstin force-pushed the add_format_testing_script branch from 195624c to 9f5b6b8 Compare July 20, 2023 13:28
@github-actions
Contributor

github-actions bot commented Jul 20, 2023

PR Check Results

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00     10.8±0.24ms     3.8 MB/sec    1.03     11.2±0.49ms     3.6 MB/sec
formatter/numpy/ctypeslib.py               1.02      2.2±0.15ms     7.5 MB/sec    1.00      2.2±0.08ms     7.6 MB/sec
formatter/numpy/globals.py                 1.00   255.2±14.15µs    11.6 MB/sec    1.00   255.2±15.23µs    11.6 MB/sec
formatter/pydantic/types.py                1.00      4.7±0.12ms     5.4 MB/sec    1.04      4.9±0.23ms     5.2 MB/sec
linter/all-rules/large/dataset.py          1.00     16.0±0.30ms     2.5 MB/sec    1.00     16.0±0.32ms     2.5 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      4.0±0.11ms     4.1 MB/sec    1.00      4.0±0.10ms     4.2 MB/sec
linter/all-rules/numpy/globals.py          1.01   540.5±20.84µs     5.5 MB/sec    1.00   537.3±23.58µs     5.5 MB/sec
linter/all-rules/pydantic/types.py         1.00      7.3±0.25ms     3.5 MB/sec    1.00      7.3±0.26ms     3.5 MB/sec
linter/default-rules/large/dataset.py      1.03      8.3±0.25ms     4.9 MB/sec    1.00      8.1±0.14ms     5.0 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.02  1756.3±48.37µs     9.5 MB/sec    1.00  1724.1±46.89µs     9.7 MB/sec
linter/default-rules/numpy/globals.py      1.03    207.2±7.33µs    14.2 MB/sec    1.00    200.5±8.05µs    14.7 MB/sec
linter/default-rules/pydantic/types.py     1.02      3.7±0.16ms     6.9 MB/sec    1.00      3.7±0.06ms     7.0 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.03     13.6±0.45ms     3.0 MB/sec    1.00     13.2±0.41ms     3.1 MB/sec
formatter/numpy/ctypeslib.py               1.00      2.6±0.10ms     6.3 MB/sec    1.06      2.8±0.25ms     6.0 MB/sec
formatter/numpy/globals.py                 1.02   296.8±12.56µs     9.9 MB/sec    1.00   289.9±16.22µs    10.2 MB/sec
formatter/pydantic/types.py                1.00      5.9±0.29ms     4.3 MB/sec    1.03      6.1±0.55ms     4.2 MB/sec
linter/all-rules/large/dataset.py          1.01     18.8±0.49ms     2.2 MB/sec    1.00     18.6±0.62ms     2.2 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      5.0±0.19ms     3.3 MB/sec    1.00      5.0±0.21ms     3.3 MB/sec
linter/all-rules/numpy/globals.py          1.00   583.6±26.43µs     5.1 MB/sec    1.02   597.0±51.20µs     4.9 MB/sec
linter/all-rules/pydantic/types.py         1.01      8.6±0.25ms     3.0 MB/sec    1.00      8.5±0.27ms     3.0 MB/sec
linter/default-rules/large/dataset.py      1.00      9.8±0.29ms     4.2 MB/sec    1.00      9.8±0.26ms     4.2 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.01      2.0±0.09ms     8.2 MB/sec    1.00      2.0±0.07ms     8.3 MB/sec
linter/default-rules/numpy/globals.py      1.02   238.2±13.29µs    12.4 MB/sec    1.00   233.8±10.81µs    12.6 MB/sec
linter/default-rules/pydantic/types.py     1.00      4.4±0.23ms     5.8 MB/sec    1.00      4.4±0.14ms     5.9 MB/sec

@konstin konstin force-pushed the add_format_testing_script branch from 2379138 to 804fc80 Compare July 23, 2023 13:30
@konstin
Member Author

konstin commented Jul 23, 2023

I think the main question remaining for me is: Should we pin the repos to a specific commit?

pro: otherwise the score may change because of changes in the checked repositories that are unrelated to ours
pro: otherwise correct PRs may fail CI when a checked repository adds code that breaks our formatter
con: we won't see any new python syntax with an old commit pinned
con: we won't see any black updates with an old commit pinned

Effectively, I want lockfiles and dependabot for this, but that doesn't exist, so we'll have to pick one of the two options.
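
One way to approximate a lockfile by hand is a small pin file that the script reads before checking out each project. The sketch below is an assumption, not the mechanism used in this PR: the `<project> <commit>` file format and the `pin_commands` helper are hypothetical. It only prints the `git` commands so they can be reviewed before running.

```bash
#!/usr/bin/env bash

# Hypothetical pin file format (assumption): one "<project> <commit>" pair
# per line; blank lines and lines starting with '#' are ignored.
#
# Prints one "git checkout" command per pinned project.
# Usage: pin_commands <pins-file> <projects-dir>
pin_commands() {
  local pins_file=$1 projects_dir=$2 name commit
  while read -r name commit || [[ -n "$name" ]]; do
    # Skip blank lines and comments.
    if [[ -z "$name" || "$name" == \#* ]]; then
      continue
    fi
    printf 'git -C %s/%s checkout --quiet %s\n' "$projects_dir" "$name" "$commit"
  done <"$pins_file"
}
```

Piping the output into `bash` would apply the pins to already cloned repositories. Updating the pins is then a one-file change, which keeps the weekly bump cheap.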

@konstin
Member Author

konstin commented Jul 23, 2023

The result is now posted as a GitHub step summary (https://github.com/astral-sh/ruff/actions/runs/5636816309?pr=5919#summary-15269270283):

image

I wish I could show that without switching to the CI tab and scrolling down, but it's still good to have this in CI either way.
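
For context, GitHub Actions exposes the path of the step summary file in the `GITHUB_STEP_SUMMARY` environment variable, and Markdown appended to that file is shown on the run's summary page. A minimal sketch of posting the similarity lines that way follows. The `post_summary` helper and the heading text are hypothetical, and the fallback to stdout is an addition for local runs.

```bash
#!/usr/bin/env bash

# Appends the "similarity index" lines of a report to the step summary
# as a Markdown bullet list.
# Usage: post_summary <report-file>
post_summary() {
  local report=$1
  # GITHUB_STEP_SUMMARY is set by GitHub Actions; outside of CI this
  # sketch falls back to printing on stdout.
  local out=${GITHUB_STEP_SUMMARY:-/dev/stdout}
  {
    echo "## Formatter progress"
    echo
    grep "similarity index" "$report" | sort | sed 's/^/- /'
  } >>"$out"
}
```

Because the summary file is appended to rather than overwritten, several steps in the same job can each contribute their own section.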

@konstin konstin marked this pull request as ready for review July 23, 2023 13:59
@MichaReiser
Member

MichaReiser commented Jul 24, 2023

I think the main question remaining for me is: Should we pin the repos to a specific commit?

I'm leaning toward pinning:

  • Updating is low effort. We can do this every week
  • It is otherwise unclear whether the similarity index improves because of my changes or because of changes in the tested repositories
  • Having all formatter PRs blocked by a new stability issue would be unfortunate. It also reduces trust in the check because there's always the chance that your change didn't introduce the instability.

the result is now posted as github step summary (astral-sh/ruff/actions/runs/5636816309?pr=5919#summary-15269270283):

Nice: A future improvement could be integrating it into the same job that also posts the benchmark and ecosystem results

@@ -323,14 +323,19 @@ jobs:
     name: "Check formatter stability"
     runs-on: ubuntu-latest
     needs: determine_changes
-    if: needs.determine_changes.outputs.formatter == 'true'
+    # if: needs.determine_changes.outputs.formatter == 'true'
Member

Before merge: Revert

@@ -323,14 +323,19 @@ jobs:
name: "Check formatter stability"
Member

Nit: Rename, considering that it now checks more than just the progress.

Comment on lines 23 to 28
# build 5800521541e5e749d4429617420d1ef8cdb40b46
# django 0016a4299569a8f09ff24053ff2b8224f7fa4113
# transformers 5bb4430edc7df9f9950d412d98bbe505cc4d328b
# typeshed 57c435cd7e964290005d0df0d9b5daf5bd2cbcb1
# warehouse e72cca94e7ac0dbe095db5c2942ad9f2f51b30cc
# zulip 1cd587d24be1d668fcf6d136172bfec69e35cb75
Member

What's this?

Comment on lines 30 to 47
cargo run --bin ruff_dev -- format-dev --stability-check --error-file "$target/progress_projects_errors.txt" \
--multi-project "$dir" >"$target/progress_projects_report.txt"
grep "similarity index" "$target/progress_projects_report.txt" | sort
Member

Nit: An alternative to grepping would be to add a --markdown flag to the CLI that renders the summary as Markdown.
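
Until such a flag exists, the grepped lines could be turned into a Markdown table in the script itself. The sketch below assumes, hypothetically, that each matching report line has the shape `<label>: similarity index <value>`; the actual format_dev output may differ, and `similarity_table` is an illustrative helper, not part of this PR.

```bash
#!/usr/bin/env bash

# Renders the "similarity index" lines of a report as a Markdown table.
# Assumed line shape (hypothetical): "<label>: similarity index <value>".
# Usage: similarity_table <report-file>
similarity_table() {
  echo "| project | similarity index |"
  echo "| --- | --- |"
  grep "similarity index" "$1" | sort | awk -F'similarity index' '{
    label = $1; value = $2
    gsub(/^[ \t]+|[ \t:]+$/, "", label)   # trim whitespace and trailing colon
    gsub(/^[ \t:]+|[ \t]+$/, "", value)   # trim surrounding whitespace
    printf "| %s | %s |\n", label, value
  }'
}
```

The output can be redirected to the step summary file or pasted into a PR comment, since both render Markdown tables.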

@konstin konstin force-pushed the add_format_testing_script branch from a73c03e to f046dc5 Compare July 24, 2023 08:51
@konstin konstin enabled auto-merge (squash) July 24, 2023 09:05
@konstin konstin merged commit 8a7dcb7 into main Jul 24, 2023
@konstin konstin deleted the add_format_testing_script branch July 24, 2023 09:12