Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot save many files at once because process buffers are tied to formatters #64

Closed
mohkale opened this issue Dec 25, 2021 · 1 comment · Fixed by #65
Closed

Cannot save many files at once because process buffers are tied to formatters #64

mohkale opened this issue Dec 25, 2021 · 1 comment · Fixed by #65

Comments

@mohkale
Copy link
Contributor

mohkale commented Dec 25, 2021

Okay, so I've been using apheleia pretty heavily over the past few days/weeks and more than once I've run into issues with formatters failing or even worse my buffers being overwritten with the contents of both that buffer and another buffer.

Reproduction Instructions

  1. Create a temporary directory foo.
  2. In foo create 3 files: a.py, b.py, c.py.
  3. Open all of them in emacs and enable apheleia-mode.
  4. Insert into each of them the following src-code block.
  5. Save all of them at once (example using :wa with evil-mode).
import csv

# need to define cmp function in Python 3
def cmp(a, b):
    return (a > b) - (a < b)

# write stocks data as comma-separated values
with open('stocks.csv', 'w', newline='') as stocksFileW:
    writer = csv.writer(stocksFileW)
    writer.writerows([
        ['GOOG', 'Google, Inc.', 505.24, 0.47, 0.09],
        ['YHOO', 'Yahoo! Inc.', 27.38, 0.33, 1.22],
        ['CNET', 'CNET Networks, Inc.', 8.62, -0.13, -1.4901]
    ])

# read stocks data, print status messages
with open('stocks.csv', 'r') as stocksFile:
    stocks = csv.reader(stocksFile)

    status_labels = {-1: 'down', 0: 'unchanged', 1: 'up'}
    for ticker, name, price, change, pct in stocks:
        status = status_labels[cmp(float(change), 0.0)]
        print ('%s is %s (%.2f)' % (name, status, float(pct)))

Chances are at least one of these attempts to format the buffer on save will fail. In my case they both failed Failed to run isort: exit status 2 (see hidden buffer *apheleia-isort-stderr*) [2 times] and the stderr buffer for isort contains:

Traceback (most recent call last):
  File "/home/mohkale/.local/bin/isort", line 5, in <module>
    from isort.main import main
  File "/home/mohkale/.local/lib/python3.10/site-packages/isort/main.py", line 2, in <module>
    import argparse
  File "/usr/lib/python3.10/argparse.py", line 92, in <module>
    from gettext import gettext as _, ngettext
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 674, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 577, in module_from_spec
  File "<frozen importlib._bootstrap>", line 541, in _init_module_attrs
KeyboardInterrupt

Note: this example also shows how the output of two buffers could end up mixed. I run a formatter, it starts formatting and outputting, then it's interrupted and another formatter is begun outputting to the same stdout buffer, then apheleia concatenates the partial output from the first and the full output of the second into the second buffer.

Analysis

I think this is pretty clearly because apheleia uses the same buffer-names for every formatters stderr and stdout buffer & we explicitly kill any process in that buffer when running the formatter.

I like that apheleia keeps the stderr and stdout buffer around after the process finishes, it helps with debugging, but the current approach doesn't scale well to many files or workflows that touch many files before saving them. What I'd suggest is 2 things:

  1. Don't keep the stdout buffer around. I don't really see the point of this at least from a user POV. Better to use it to format the current buffer and then delete it.
  2. Create separate buffers for stdout and stderr whenever you run a formatter and then insert any stderr output to a shared stderr buffer for that formatter.

So we still keep the *apheleia-isort-stderr* buffer, but instead of connecting it directly to a process, we give each process a unique stderr buffer and then append the output to this buffer at the end. If there's a worry it's getting too large in the background we can even only add to it if the formatter fails.

@mohkale
Copy link
Contributor Author

mohkale commented Dec 25, 2021

Looking into it further, it looks like this behaviour is intentional. We use apheleia--current-process to access the formatter and this is assigned globally.

mohkale added a commit to mohkale/apheleia that referenced this issue Dec 25, 2021
…re#64) (radian-software#65)

Previously when apheleia formatted a buffer it created a stdout and
stderr buffer for each formatter, but it reused this buffer each time
that formatter would run. This makes sense if we only ever format one
buffer at a time (meaning we don't format a new buffer until the
previous buffer has been formatted) such as when calling
`apheleia-format-buffer` interactively (since the interval for running a
formatter is likely far below hitting a key combination for this
command). But this assumption falls apart when using `apheleia-mode` and
`apheleia--format-after-save`.
Now a lot of files could be saved, triggering the same formatters again
and again, within a short period of each other. Apheleia used to keep
track of the current formatter process and kill it when a newer
formatter is attempted, but this also kills all but the last buffer
called by `apheleia--format-after-save`.

With this commit we still have separate stdout and stderr buffers for
each formatter, but we *always* create a new one when attempting a
format. There is a new buffer type, a log buffer, which is populated
with a formatter processes stderr when it fails. We also still have a
`apheleia--current-process` variable, but instead of being global, it's
local to the current buffer being formatted. We now kill it if starting
a new format in the current buffer, but two separate buffers can call
the same formatter with no issue.
mohkale added a commit to mohkale/apheleia that referenced this issue Dec 25, 2021
…re#64) (radian-software#65)

Previously when apheleia formatted a buffer it created a stdout and
stderr buffer for each formatter, but it reused this buffer each time
that formatter would run. This makes sense if we only ever format one
buffer at a time (meaning we don't format a new buffer until the
previous buffer has been formatted) such as when calling
`apheleia-format-buffer` interactively (since the interval for running a
formatter is likely far below hitting a key combination for this
command). But this assumption falls apart when using `apheleia-mode` and
`apheleia--format-after-save`.
Now a lot of files could be saved, triggering the same formatters again
and again, within a short period of each other. Apheleia used to keep
track of the current formatter process and kill it when a newer
formatter is attempted, but this also kills all but the last buffer
called by `apheleia--format-after-save`.

With this commit we still have separate stdout and stderr buffers for
each formatter, but we *always* create a new one when attempting a
format. There is a new buffer type, a log buffer, which is populated
with a formatter processes stderr when it fails. We also still have a
`apheleia--current-process` variable, but instead of being global, it's
local to the current buffer being formatted. We now kill it if starting
a new format in the current buffer, but two separate buffers can call
the same formatter with no issue.
mohkale added a commit to mohkale/apheleia that referenced this issue Dec 25, 2021
…re#64) (radian-software#65)

Previously when apheleia formatted a buffer it created a stdout and
stderr buffer for each formatter, but it reused this buffer each time
that formatter would run. This makes sense if we only ever format one
buffer at a time (meaning we don't format a new buffer until the
previous buffer has been formatted) such as when calling
`apheleia-format-buffer` interactively (since the interval for running a
formatter is likely far below hitting a key combination for this
command). But this assumption falls apart when using `apheleia-mode` and
`apheleia--format-after-save`.
Now a lot of files could be saved, triggering the same formatters again
and again, within a short period of each other. Apheleia used to keep
track of the current formatter process and kill it when a newer
formatter is attempted, but this also kills all but the last buffer
called by `apheleia--format-after-save`.

With this commit we still have separate stdout and stderr buffers for
each formatter, but we *always* create a new one when attempting a
format. There is a new buffer type, a log buffer, which is populated
with a formatter processes stderr when it fails. We also still have a
`apheleia--current-process` variable, but instead of being global, it's
local to the current buffer being formatted. We now kill it if starting
a new format in the current buffer, but two separate buffers can call
the same formatter with no issue.
raxod502 added a commit that referenced this issue Dec 26, 2021
* Run the same formatter in multiple buffers in parallel (#64) (#65)

Previously when apheleia formatted a buffer it created a stdout and
stderr buffer for each formatter, but it reused this buffer each time
that formatter would run. This makes sense if we only ever format one
buffer at a time (meaning we don't format a new buffer until the
previous buffer has been formatted) such as when calling
`apheleia-format-buffer` interactively (since the interval for running a
formatter is likely far below hitting a key combination for this
command). But this assumption falls apart when using `apheleia-mode` and
`apheleia--format-after-save`.
Now a lot of files could be saved, triggering the same formatters again
and again, within a short period of each other. Apheleia used to keep
track of the current formatter process and kill it when a newer
formatter is attempted, but this also kills all but the last buffer
called by `apheleia--format-after-save`.

With this commit we still have separate stdout and stderr buffers for
each formatter, but we *always* create a new one when attempting a
format. There is a new buffer type, a log buffer, which is populated
with a formatter processes stderr when it fails. We also still have a
`apheleia--current-process` variable, but instead of being global, it's
local to the current buffer being formatted. We now kill it if starting
a new format in the current buffer, but two separate buffers can call
the same formatter with no issue.

* Mark change as bugfix in changelog

* Add to docstring

* Remove no longer needed code

* Re-wrap docstring

* Remove newline

* Change spelling

* Use correct buffer when checking (buffer-size)

Co-authored-by: Radon Rosborough <radon.neon@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

1 participant