:reflow does not recognize \n\n as paragraph end/start #2419

Open
getreu opened this issue May 6, 2022 · 11 comments
Labels
A-core Area: Helix core improvements C-bug Category: This is a bug

Comments

@getreu
Contributor

getreu commented May 6, 2022

In markup languages, \n\n marks the end of one paragraph and the start of the next. The current word-wrap implementation does not recognise paragraph endings. Instead, they are treated as ordinary whitespace and the text is formatted as one big block.
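
To make the expected behaviour concrete, here is a minimal standard-library sketch of a paragraph-aware reflow. It is not how Helix or textwrap implement `:reflow`; the names `wrap_words` and `reflow_paragraphs` are hypothetical, and width is counted in `char`s rather than display columns, which is a simplification.

```rust
/// Greedily wrap whitespace-separated words into lines of at most `width` chars.
/// A word longer than `width` is left on a line of its own.
fn wrap_words(text: &str, width: usize) -> Vec<String> {
    let mut lines = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        // The `+ 1` accounts for the space that would join `word` to the line.
        if !current.is_empty() && current.chars().count() + 1 + word.chars().count() > width {
            lines.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}

/// Reflow `text`, treating every run of blank lines as a paragraph boundary.
/// Lines inside a paragraph are joined before wrapping; paragraphs are
/// re-joined with exactly one blank line.
fn reflow_paragraphs(text: &str, width: usize) -> String {
    let mut paragraphs: Vec<String> = Vec::new();
    let mut current: Vec<&str> = Vec::new();
    for line in text.lines() {
        if line.trim().is_empty() {
            if !current.is_empty() {
                paragraphs.push(wrap_words(&current.join(" "), width).join("\n"));
                current.clear();
            }
        } else {
            current.push(line);
        }
    }
    if !current.is_empty() {
        paragraphs.push(wrap_words(&current.join(" "), width).join("\n"));
    }
    paragraphs.join("\n\n")
}
```

With this sketch, `reflow_paragraphs("one two\nthree\n\nfour five six", 9)` keeps the blank line between the two paragraphs instead of merging everything into one block.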

@getreu
Contributor Author

getreu commented May 7, 2022

Workaround

(Maybe this is the intended way to format text in Helix?)

  1. Select the whole text to format.
  2. Split into paragraphs: in normal mode, press S (capital S), type \n\n, then [Enter]
  3. Format text: :reflow, then [Enter]

@archseer archseer added the C-bug Category: This is a bug label May 8, 2022
@kirawi kirawi added the A-core Area: Helix core improvements label May 8, 2022
@getreu
Contributor Author

getreu commented May 9, 2022

The above workaround does not work with, for example, block quotes (>).

A more general approach would be to use Tree-sitter to analyse the text's structure.

@vlmutolo
Contributor

vlmutolo commented Jun 3, 2022

This is a known shortcoming of the current approach. I'm not sure how to get textwrap, the underlying crate powering the feature, to recognize "blank" lines.

We could try to manually do it and call textwrap on the individual paragraphs only, but then we'd have to re-implement the prefix detection to handle things like the following scenario:

/// # title
///
/// some paragraph text
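
A rough sketch of what such manual prefix detection could look like, using only the standard library. The helper names `common_prefix` and `prefixed_paragraphs` are hypothetical, and the heuristic (longest shared run of non-alphanumeric characters) is an assumption for illustration, not the detection Helix or textwrap actually use. It would, for instance, also treat a shared Markdown list marker as a prefix.

```rust
/// Longest prefix shared by all lines, restricted to non-alphanumeric
/// characters so that a shared leading word is not mistaken for a marker.
fn common_prefix<'a>(lines: &[&'a str]) -> &'a str {
    let first: &'a str = match lines.first() {
        Some(line) => *line,
        None => return "",
    };
    let mut end = 0;
    for (idx, ch) in first.char_indices() {
        let shared = lines
            .iter()
            .all(|l| l.get(idx..).map_or(false, |rest| rest.starts_with(ch)));
        if ch.is_alphanumeric() || !shared {
            break;
        }
        end = idx + ch.len_utf8();
    }
    &first[..end]
}

/// Split prefixed lines into paragraphs. A line that is empty once the
/// prefix is removed acts as a paragraph separator.
fn prefixed_paragraphs<'a>(lines: &[&'a str]) -> (&'a str, Vec<Vec<&'a str>>) {
    let prefix = common_prefix(lines);
    let mut paragraphs = Vec::new();
    let mut current = Vec::new();
    for &line in lines {
        // Every line starts with `prefix`, so this slice cannot go out of bounds.
        let body = line[prefix.len()..].trim();
        if body.is_empty() {
            if !current.is_empty() {
                paragraphs.push(std::mem::take(&mut current));
            }
        } else {
            current.push(body);
        }
    }
    if !current.is_empty() {
        paragraphs.push(current);
    }
    (prefix, paragraphs)
}
```

For the three doc-comment lines above, this yields the prefix `///` and two paragraphs, `# title` and `some paragraph text`. Each paragraph could then be wrapped on its own and the prefix re-applied to every output line.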

@mgeisler Any ideas? Is this something textwrap already handles and I just didn't find it?

@getreu
Contributor Author

getreu commented Jun 3, 2022

It seems to me that WrapAlgorithm in textwrap::wrap_algorithms has a notion of "paragraph".

> This wrapping algorithm considers the entire paragraph to find optimal line breaks. When wrapping text, “penalties” are assigned to line breaks based on the gaps left at the end of lines. See [Penalties](https://docs.rs/textwrap/latest/textwrap/wrap_algorithms/struct.Penalties.html) for details.

@mgeisler

mgeisler commented Jun 3, 2022

Hi @vlmutolo and @getreu, you're correct that refill doesn't recognize paragraphs currently. It will consider all lines as belonging to the same paragraph, regardless of blank lines and so on.

> It seems to me that WrapAlgorithm in textwrap::wrap_algorithms has a notion of "paragraph".

The notion of a paragraph is quite simple (primitive): functions like textwrap::wrap will split the input on \n and wrap each line as its own paragraph.

Basically, Textwrap would originally disregard all whitespace and put all words into a single wrapped paragraph. However, people expect newlines to be preserved, so Textwrap now splits on \n, wraps the lines, and then joins everything together with \n again.

That is true for textwrap::wrap and fill, but I neglected to implement this for refill. We should fix this, so I would appreciate it if one of you could open an issue for it in the Textwrap repository.
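
The split, wrap, join behaviour described above can be sketched with the standard library alone. This is an illustration of the idea, not the crate's code; `wrap_line` and `fill_lines` are hypothetical names, and width is counted in `char`s.

```rust
/// Greedy wrap of a single line. An empty input line yields one empty
/// output line, which is what keeps blank lines intact.
fn wrap_line(line: &str, width: usize) -> Vec<String> {
    let mut out = vec![String::new()];
    for word in line.split_whitespace() {
        let last = out.last_mut().expect("`out` is never empty");
        if last.is_empty() {
            last.push_str(word);
        } else if last.chars().count() + 1 + word.chars().count() <= width {
            last.push(' ');
            last.push_str(word);
        } else {
            out.push(word.to_string());
        }
    }
    out
}

/// Split on '\n', wrap every line as its own paragraph, join with '\n' again.
fn fill_lines(text: &str, width: usize) -> String {
    text.split('\n')
        .flat_map(|line| wrap_line(line, width))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because every input line is wrapped on its own, a \n\n in the input survives as an empty line in the output. The flip side is that this approach never joins two short lines, so a reflow additionally has to merge the lines within each paragraph first, while leaving the blank lines between paragraphs alone.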

@getreu
Contributor Author

getreu commented Jun 4, 2022

What about defining a custom WrapAlgorithm in textwrap::wrap_algorithms that honours paragraph boundaries?

@getreu
Contributor Author

getreu commented Jun 4, 2022

We probably need a custom wrap algorithm (WrapAlgorithm) to address other shortcomings anyway: e.g. we do not want to wrap in the middle of URLs, and possibly not even after a - in the middle of a word.

@mgeisler

mgeisler commented Jun 4, 2022

> What about defining a custom WrapAlgorithm in textwrap::wrap_algorithms that honours paragraph boundaries?

The job of the wrap algorithm is to turn a slice of words into wrapped lines. This is done via wrap.

Under the hood, wrap will call wrap_first_fit (the greedy algorithm) or wrap_optimal_fit. These functions operate on "fragments". A fragment is a block of something which has a width plus some whitespace.
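
To illustrate what "operating on fragments" means, here is a small greedy first-fit sketch. The `Frag` struct and the `first_fit` function are hypothetical stand-ins written for this example; they are not the crate's own types or its `wrap_first_fit` implementation.

```rust
/// Hypothetical fragment: a block that is `width` columns wide, followed by
/// `whitespace` columns of trailing space.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Frag {
    width: f64,
    whitespace: f64,
}

/// Greedy first-fit: put fragments on the current line until the next one
/// no longer fits, then start a new line. Returns the fragments per line.
fn first_fit(frags: &[Frag], line_width: f64) -> Vec<&[Frag]> {
    let mut lines = Vec::new();
    let mut start = 0;
    // Width used on the current line, including trailing whitespace.
    let mut used = 0.0;
    for (i, frag) in frags.iter().enumerate() {
        // The whitespace after the last fragment on a line does not count,
        // so only the fragment's own width is tested against the limit.
        if i > start && used + frag.width > line_width {
            lines.push(&frags[start..i]);
            start = i;
            used = 0.0;
        }
        used += frag.width + frag.whitespace;
    }
    if start < frags.len() {
        lines.push(&frags[start..]);
    }
    lines
}
```

Note that the algorithm never sees characters, only widths. Anything that should stay together, such as a URL, has to arrive as a single fragment.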

> We probably need a custom wrap algorithm (WrapAlgorithm) to address other shortcomings anyway: e.g. we do not want to wrap in the middle of URLs, and possibly not even after a - in the middle of a word.

It's the job of other functions to prepare these fragments — and these other functions decide if - should split words or not. Concretely, the WordSplitter enum implements a few different ways to split a word into, well, smaller words. The code there works on words represented by an actual &str.
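
As a sketch of the kind of decision a word splitter makes, the function below returns the byte offsets at which a word may be broken, and refuses to split anything that looks like a URL. The name `split_points` and the `://` heuristic are assumptions for illustration. Whether a function of this shape can be plugged into the crate directly depends on the textwrap version and should be checked against its documentation.

```rust
/// Hypothetical splitter: byte offsets after which `word` may be broken.
/// Words that look like URLs are never split.
fn split_points(word: &str) -> Vec<usize> {
    if word.contains("://") {
        return Vec::new();
    }
    word.char_indices()
        .filter(|&(i, ch)| {
            // Only split after a hyphen that sits between two alphanumeric
            // characters, so leading dashes such as "--flag" stay intact.
            ch == '-'
                && word[..i].chars().next_back().map_or(false, char::is_alphanumeric)
                && word[i + 1..].chars().next().map_or(false, char::is_alphanumeric)
        })
        .map(|(i, _)| i + 1)
        .collect()
}
```

For `"well-known"` this returns `[5]`, meaning the word may be broken after `well-`. For a URL or for `--flag` it returns no split points at all.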

@getreu
Contributor Author

getreu commented Jun 11, 2022

Another workaround

  1. Select text
  2. Type | (pipe)
  3. fmt, [Enter]

@getreu
Contributor Author

getreu commented Jun 11, 2022

> > What about defining a custom WrapAlgorithm in textwrap::wrap_algorithms that honours paragraph boundaries?
>
> The job of the wrap algorithm is to turn a slice of words into wrapped lines. This is done via wrap.
>
> Under the hood, wrap will call wrap_first_fit (the greedy algorithm) or wrap_optimal_fit. These functions operate on "fragments". A fragment is a block of something which has a width plus some whitespace.
>
> > We probably need a custom wrap algorithm (WrapAlgorithm) to address other shortcomings anyway: e.g. we do not want to wrap in the middle of URLs, and possibly not even after a - in the middle of a word.
>
> It's the job of other functions to prepare these fragments, and these other functions decide if - should split words or not. Concretely, the WordSplitter enum implements a few different ways to split a word into, well, smaller words. The code there works on words represented by an actual &str.

Well, your explanation confirms that textwrap is highly customisable?

@mgeisler

> Well, your explanation confirms that textwrap is highly customisable?

Yes, I would say so :-) The most elaborate example of this is the WebAssembly demo, where I could put in toggles for all the options: https://mgeisler.github.io/textwrap. It also shows how Textwrap can wrap non-console text (it uses f64 internally now for its width computations).

I wrote up a blurb of text about it in #136 in a reply to @cessen. I'm sorry for ending up with a discussion split across issues like this.
