Consider the following example output (I added a border for clarity):
Rendering issues are apparent:
The problematic document
The following piece of AsciiDoc was used to generate the first section of the above document:
It translates to the following HTML:
The full document is in this Gist: https://gist.github.com/skalee/fe1a52a797f7c821ae354a59f7812fd9.
Why it fails
The YARD::Templates::Helpers::HtmlHelper#parse_codeblocks method is responsible for post-processing the HTML output produced by the markup processor. It enables code highlighting, adds links to particular Ruby methods and classes, and finally normalizes the HTML document so that it can be styled with CSS:
yard/lib/yard/templates/helpers/html_helper.rb, lines 624 to 643 in 12f56cf
Unfortunately, the regular expression in this method fails to match code listings produced by Asciidoctor when they are annotated with the name of a programming language.
Other considerations
The most obvious solution would be to make that regular expression more liberal. However:
This regular expression is already overcomplicated.
HTML attributes may contain things other than language names, for example <pre class="highlight"> in the aforementioned example. Also, the language name may be decorated with additional specifiers, for example <code class="language-ruby"> in the same example.
Therefore, some smarter detection should be discussed. I have two proposals.
Option 1: Regular expression should be markup-processor-specific.
The method parse_codeblocks(html) is called only from htmlify(text, markup = options.markup). Therefore, an additional argument can be added: parse_codeblocks(html, markup).
Then, a brand new regular expression can be introduced for the Asciidoctor markup processor. Other markup processors can have their own dedicated regular expressions if they need any. The current regular expression will be used as the default generic one.
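To make Option 1 concrete, here is a rough standalone sketch of the dispatch. Everything in it is an assumption for the sake of discussion: the CODEBLOCK_PATTERNS table, both regular expressions (the default one is a simplified stand-in, not YARD's actual expression), the :asciidoc key, and the shape of the normalized output. It only shows how a markup argument could select a dedicated pattern and fall back to a generic one.

```ruby
# Hypothetical per-processor patterns; names and regexes are illustrative only.
CODEBLOCK_PATTERNS = {
  # Assumed Asciidoctor shape:
  #   <pre class="highlight"><code class="language-ruby" ...>...</code></pre>
  asciidoc: %r{<pre class="highlight">\s*<code[^>]*\bclass="language-(?<lang>[\w+-]+)"[^>]*>(?<code>.*?)</code>\s*</pre>}m
}.freeze

# Simplified stand-in for the current generic expression (NOT YARD's real one).
DEFAULT_CODEBLOCK_PATTERN =
  %r{<pre(?:\s+lang="(?<lang>[^"]+)")?\s*>(?:\s*<code>)?(?<code>.*?)(?:</code>\s*)?</pre>}m

# Standalone stand-in with the proposed signature; not YARD's implementation.
def parse_codeblocks(html, markup = nil)
  # Use the processor-specific pattern if one exists, else the generic default.
  pattern = CODEBLOCK_PATTERNS.fetch(markup, DEFAULT_CODEBLOCK_PATTERN)
  html.gsub(pattern) do
    match = Regexp.last_match
    lang = match[:lang] || "plain"
    # Assumed normalized output shape, chosen for this sketch.
    %(<pre class="code #{lang}"><code class="#{lang}">#{match[:code]}</code></pre>)
  end
end
```

With this sketch, an Asciidoctor-style listing passes through the default pattern untouched, but is normalized once :asciidoc is passed as the second argument.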
Option 2: Single regular expression with much smarter detection
A very generic regular expression can recognize different HTML attributes and prioritize them, either by value relevance (looking for a value which resembles a programming language name, like ruby, language-ruby, code-ruby), or by name relevance (looking for attributes named data-lang, lang, class, in that order).
Possibly the same problem as in #781, although that bug report was very unclear.
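For Option 2, the attribute prioritization could look roughly like the sketch below. The detect_lang helper and its constants are hypothetical, not existing YARD code. As a simplifying assumption, class values are only accepted when they carry a language- or code- prefix, so that values like highlight are ignored; a bare ruby in class would need a list of known language names, which this sketch leaves out.

```ruby
# Attribute names checked in the priority order proposed above.
LANG_ATTRIBUTES = %w[data-lang lang class].freeze
# Assumed decorations that mark a class token as a language name.
LANG_PREFIX = /\A(?:language|code)-/

# Hypothetical helper: takes raw attribute strings (e.g. from <pre> and <code>)
# and returns the detected language name, or nil if none is found.
def detect_lang(*attribute_strings)
  attrs = attribute_strings.join(" ")
  LANG_ATTRIBUTES.each do |name|
    # The lookbehind keeps "lang" from matching inside "data-lang".
    values = attrs.scan(/(?<![\w-])#{Regexp.escape(name)}="([^"]*)"/).flatten
    values.each do |value|
      if name == "class"
        token = value.split.find { |t| t.match?(LANG_PREFIX) }
        return token.sub(LANG_PREFIX, "") if token
      elsif !value.empty?
        return value
      end
    end
  end
  nil
end

detect_lang(%(class="highlight"), %(class="language-ruby")) # => "ruby"
detect_lang(%(class="highlight"))                           # => nil
```

The surrounding regular expression would then only need to capture the attribute strings of pre and code, leaving language detection to a helper of this kind.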
I can help implement either, but discussion is required IMO. Personally, I am leaning towards option 1.
I have read the Contributing Guide.