Inline PUML #968

crphang · 2020-01-01T08:41:00Z

Resolves #912

Extremely raw implementation here. This also breaks existing PUML which will be addressed in later commits.

Here's a demo:

Problems with current implementation:

The markdown-it library was hacked and was doing file generation of plantUML and I really don't think it should be doing it.
Currently I commented the file based PlantUML plugin handler as it removes all puml tags.
The logic of file based and inline puml are handled in very different places. This can be made better.
This uses Fenced Code blocks to input PUML but not written in an extensible way.

Going forward:

I believe I would still make use on fenced code blocks to retrieve the content of the PUML file. This helps to ensure that the content is properly escaped. A pluggable handler which allows fenced blocks to handle other kind of content, with PlantUML being only 1 type of it.

This fits in nicely considering fenced block already handling other special types of content like diffs:

+ new
- old

Merge the user experience of both PUML files:

<puml src="filename" />

<puml inline>
```puml
...
</puml>

Allow user to specify output file name for diagrams, and also generate temporary but consistent file names (MD5 hash of content).
Refactor to perform the generation of PlantUML diagram in a seperate. Currently it is done in the markdown-it library itself. Markdown-it will only be used to return the content of the generated HTML and metadata if necessary.

yamgent · 2020-01-02T02:07:57Z

Thanks for working on this. I think the code block thing is a nice trick for us, but it does make the syntax less usable on the author side (to them it seems like an addition that is not intuitive at first look, since it is different from how they use components).

Is it possible for us to do this "code block" thing behind the scenes, so that the author does not have to manually do it themselves, while still allowing us to escape it properly for later processing by markdown-it?

damithc · 2020-01-02T03:16:09Z

Is it possible for us to do this "code block" thing behind the scenes, so that the author does not have to manually do it themselves, while still allowing us to escape it properly for later processing by markdown-it?

Perhaps something like this?

```puml-inline
...
```

yamgent · 2020-01-02T03:51:59Z

Ok I just went back and read through the old puml PRs and realise the difficulty in doing the "behind the scene" method that I suggested, because of how HTML just generally behaves.

Perhaps something like this?
```puml-inline
...
```

We can explore this way. The HTML tag seems to be rather redundant non-aesthetic wise, if we have to escape the content anyway (which makes it seem un-HTML).

damithc · 2020-01-02T04:28:02Z

Here's a demo:

If alternatives are too hard, I don't mind resorting to this solution; while it is a bit more work for the author, it is still easier than having to create an extra puml file.

crphang · 2020-02-03T09:58:41Z

Checked that this PR resolves: #936 on top of #912.

Refactored code to handle both file PUML and inline PUML. Currently inline PUML works well without any new lines. For example, the below inline PUML would render fine:

<puml>
```puml
@startuml
alice -> bob ++ : hello
bob -> bob ++ : self call
bob -> bib ++  #005500 : hello
bob -> george ** : create
return done
return rc
bob -> george !! : delete
return success
@enduml
</puml>

However, there would be issues when we add newlines in the PUML as codefences are not completely escaped when nested as an immediate children under any element as there would be <p> tags added in the middle replacing the newlines and also <pre><code>... at the end.

This behavior is consistent for all codefences when nested under a tag. For example:

<span>
```
Some code fenced

Newline
```
</span>

will not work properly as well.

@yamgent @damithc is this a known issue?

Currently I am thinking to improve the behavior by either:

Resolving the above issue separately if possible.
Change the approach from using markdown-it-attr to inject additional user provided inputs instead (like outputFileName in this case), which removes the problematic tag for inline-puml. For example, the user experience becomes just:

```puml
...
```

The change here should be minimal in terms of the main logic, but I need to investigate further on how to pass in additional context (cwf, resultPath) and registering widgetHandlers to markdown-it renderer in a more generic way.

acjh · 2020-02-03T12:29:22Z

Add an empty line between HTML tag and opening code fence.

crphang · 2020-02-03T14:58:40Z

Add an empty line between HTML tag and opening code fence.

Yes, that workaround usually works, but I think it won't work when we rely on a syntax like below to inject user-input (as with this PR as well):

<tag user-input>
```tag

```

This is also discussed #473 (comment).

Was wondering whether it is a design choice/known issue to use this workaround? I guess if so, do you think it might be better to inject context (user-input) through the markdown-it-attr?

acjh · 2020-02-03T15:22:39Z

You mean the following doesn't work?

<tag user-input>

```tag
```
</tag>

It is in the GitHub Flavored Markdown Spec: https://github.github.com/gfm/#example-157

Example 157

`<div> Emphasized text. </div>`	`<div> <p><em>Emphasized</em> text.</p> </div>`

Example 158

`<div> Emphasized text. </div>`	`<div> Emphasized text. </div>`

crphang · 2020-02-04T12:24:20Z

Huge thanks for pointing that out. Leave a space does not work out of the box. I added a additional handler to make it work.

  static extractAndGenerateCodeBlock(element, codeLanguage = '') {
    let codeBlockElement;
    element.children.forEach((child) => {
      if (child.name === 'pre') {
        const nestedChild = child.children[0];
        [codeBlockElement] = nestedChild.children;
        codeBlockElement.data = `\`\`\`${codeLanguage}\n${codeBlockElement.data}\n\`\`\``;
      }
    });

    return codeBlockElement;
  }

This is because when leaving a space, initial md.render has already rendered the code fence as <pre><code>..., so to make this work, we retrieve the code block and regenerate it with the correct context and handler.

This keeps the behavior consistent as described https://github.github.com/gfm/#example-157 and also allows us to add additional context as well.

What do you think about this approach? If it is okay, I would proceed to refactor, add more edge case handling, additional tests and user guide for this.

openorclose · 2020-02-04T12:28:49Z

Currently I am thinking to improve the behavior by either:

1. Resolving the above issue separately if possible.

2. Change the approach from using markdown-it-attr to inject additional user provided inputs instead (like outputFileName in this case), which removes the problematic tag for inline-puml. For example, the user experience becomes just:

```puml
...


The change here should be minimal in terms of the main logic, but I need to investigate further on how to pass in additional context (cwf, resultPath) and registering widgetHandlers to markdown-it renderer in a more generic way.

Can we try approach 2 instead?

crphang · 2020-02-05T14:40:48Z

Can we try approach 2 instead?

I'm on the fence for both approaches.

Approach 1
<tag user-input>

```tag
...
```
</tag>

Approach 2
```tag {user-input=x}
...
```

While approach 2 is less verbose, approach 1 is a more familiar way to add user-input rather than introducing markdown-it-attr for components (especially when comparing inline and src puml).

@yamgent @acjh what do you guys think?

crphang · 2020-02-11T16:48:50Z

Ready for review. Keeping this PR simpler as it primarily aims to resolve #936 and #912.

Future work: I explored adding PUML to be handled in componentParser, but decided to keep that separate for now. The reason is because I believe each "widget" (PUML/Latex/...) can be made more abstract, defining preprocess and parsing interface and this PR might not be best to introduce this refactor.

ang-zeyu · 2020-02-15T11:41:37Z

Hi @crphang,
I'm working on an api that would allow plugins to define certain html tag types which the parser would ignore, without the use of markdown code blocks.
This should allow us to keep plugin code to plugins strictly for better cohesion, which should be nice for #984 ( we can easily add more plugins for more sorts of diagrams, etc., without touching preprocess or anything else )

It seems to be pretty feasible for both htmlParser2 and markdown-it.
Would you want to base your changes on top of it?

The essential idea is to treat these tags the same way these parsers treat <script> tags, which allows all sorts of weird symbols anyway.

Its in a very early stage though, so if this gives you some ideas on how to do so, do let me know if you want to do it instead!

crphang · 2020-02-15T12:27:24Z

Hey @ang-zeyu, such an API would definitely help! One of the main problems I faced with plugins was that it introduced quite a bit of side effects, and we dont have as much control over the components as we do in parser.

I don't exactly think that introducing new diagrams will require "polluting" preprocess going forward. Definitely like your work on componentParser and refactoring preprocess. For adding new widgets, im looking to abstract it more and adding preprocess/parse handlers to each widget (will be in the scope of another PR #984)

Could you share the branch? I could take a further look. My current belief is that widgets as defined going forward are more easily maintained and beneficial for us developers to provide official supported components. Plugins offer extensibility in user defined components and functionality.

ang-zeyu · 2020-02-15T13:12:38Z

Sure, i'll try to clean it up and push a functional one for htmlparser2 over tonight

ang-zeyu · 2020-02-15T16:31:45Z

ang-zeyu@3f1e5dc These are the changes to htmlparser2 required roughly ( still very much a WIP, but it works for simple use cases like <abc> <thisShouldAppearAsText> </abc> )

It should be much simpler for markdown-it due to its plugin system and use of regex patterns

In addition to this, what needs to be added would be

an additional field in plugin options allowing plugins to return an array of tag names to "ignore". These tag names would be set as the specialTagNames array in the commit above during initial site generation
markdown-it plugin to patch its behaviour
much more polishing and fixing for htmlparser2 patch

I don't exactly think that introducing new diagrams will require "polluting" preprocess going forward. Definitely like your work on componentParser and refactoring preprocess. For adding new widgets, im looking to abstract it more and adding preprocess/parse handlers to each widget (will be in the scope of another PR #984)

Hmm, reason I mentioned this was because I saw changes were needed in your code to _preprocess and _parse; As we want to add more and more diagrams, e.g. <line-graph> ... </line-graph> the amount of code inside these core markbind functions may be quite a lot.

We can alleviate this with something like #1026, but I feel its better to provide a sufficiently robust and consistent api to plugins which should make plugin development easier.
( we would have a clearly defined interface for plugins to operate, and an equally clear api contract for the core code to fufill )

This would also consolidate all optional functionality in plugin files, rather having such optional functionality coupled to the internal processes of parser which is easily subject to change.

One of the main problems I faced with plugins was that it introduced quite a bit of side effects, and we dont have as much control over the components as we do in parser.

Plugins certainly don't have as much control, but we can always add more to the api cautiously when really needed. ( such as "escaping" text inside the certain tags like this PR )

I'm not really aware of any side effects though 😮 I think this is one of the main benefits of a plugin system, where plugins may only interact with the provided content. A plugin can return a malformed html content string, but that would be the fault of the plugin development, not the core code I think. This could just as easily occur when plugins are coupled to the internal processes of the core code as well.

Just my 2 cents though, might be good to get some more opinions in 😄

crphang · 2020-02-16T06:07:06Z

I'm not really aware of any side effects though

What I meant was that the manipulation of the DOM is rather orthogonal from the main parsing process with plugins.

Hmm, reason I mentioned this was because I saw changes were needed in your code to _preprocess and _parse; As we want to add more and more diagrams, e.g. <line-graph> ... </line-graph> the amount of code inside these core markbind functions may be quite a lot.

Yes, certainly not a good idea to keep adding to the core logic. I just made a refactor to make it clearer what I was trying to achieve going forward.

ang-zeyu/markbind@3f1e5dc

Looks really good if it properly escapes from htmlParser and we can go with:

<component>
content
</component>

Certainly something we could explore further. Any rough estimate how long it would take for that POC to be completed? I am thinking of decoupling the two PR. I believe that the user experience without using code blocks to escape content will be better, but would want to start adding the other components. Perhaps we could add this change without modifying the user guide (since the changes are not backward compatible), so that other components in #984 have something to build on since both are rather pluggable. And migrating back to plugins once it is tested should not be too difficult.

ang-zeyu · 2020-02-16T07:01:18Z

What I meant was that the manipulation of the DOM is rather orthogonal from the main parsing process with plugins.

Just looked at your refactor, I think I have a better idea now, thanks!

I'm all for this as well, I think it'd be good to have plugins tie in directly to the per element _parse / _preprocess process. This should keep performance good as we add more and more plugins as well, since we don't have to parse the entire content another round in each plugin.

Since its a major architectural addon for plugins, might be really good to have more opinions in first though @yamgent @acjh @openorclose @yash-chowdhary

And migrating back to plugins once it is tested should not be too difficult.

Just to clarify this, would the user still be able to turn off widgets eventually ( site config )?

Any rough estimate how long it would take for that POC to be completed? I am thinking of decoupling the two PR.

It should take about a week

crphang mentioned this pull request Jan 1, 2020

Minor complication with PlantUML diagrams in a sub-site #936

Open

crphang force-pushed the inline-puml branch from 6449643 to dd78626 Compare January 30, 2020 02:20

crphang mentioned this pull request Feb 2, 2020

Support annotations in code examples #473

Closed

crphang force-pushed the inline-puml branch from 0dd3965 to 0cdf92c Compare February 3, 2020 09:03

crphang changed the title ~~WIP: Inline PUML~~ Inline PUML Feb 3, 2020

crphang force-pushed the inline-puml branch 2 times, most recently from 48145c8 to 1e5d0b0 Compare February 11, 2020 15:21

crphang mentioned this pull request Feb 15, 2020

Strong support for diagrams #984

Open

Add suport for Inline PUML and fix PUML subsite problem

da3f0bb

crphang force-pushed the inline-puml branch from 6d321de to f0105ea Compare February 16, 2020 05:58

Refactor to use pluggable widgets

c7084f7

crphang force-pushed the inline-puml branch from f0105ea to c7084f7 Compare February 16, 2020 06:20

ang-zeyu mentioned this pull request Feb 17, 2020

Implement an api to ignore text in certain tags #1047

Merged

crphang closed this Mar 8, 2020

crphang mentioned this pull request Mar 8, 2020

Inline puml #1100

Merged

ang-zeyu mentioned this pull request Apr 10, 2020

Add more hooks for plugins #1188

Closed

ang-zeyu mentioned this pull request Feb 26, 2022

Support basic charts #1663

Open

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inline PUML #968

Inline PUML #968

crphang commented Jan 1, 2020 •

edited

Loading

yamgent commented Jan 2, 2020

damithc commented Jan 2, 2020

yamgent commented Jan 2, 2020

damithc commented Jan 2, 2020

crphang commented Feb 3, 2020 •

edited

Loading

acjh commented Feb 3, 2020

crphang commented Feb 3, 2020

acjh commented Feb 3, 2020

crphang commented Feb 4, 2020 •

edited

Loading

openorclose commented Feb 4, 2020

crphang commented Feb 5, 2020

crphang commented Feb 11, 2020

ang-zeyu commented Feb 15, 2020 •

edited

Loading

crphang commented Feb 15, 2020

ang-zeyu commented Feb 15, 2020

ang-zeyu commented Feb 15, 2020 •

edited

Loading

crphang commented Feb 16, 2020

ang-zeyu commented Feb 16, 2020

Inline PUML #968

Inline PUML #968

Conversation

crphang commented Jan 1, 2020 • edited Loading

yamgent commented Jan 2, 2020

damithc commented Jan 2, 2020

yamgent commented Jan 2, 2020

damithc commented Jan 2, 2020

crphang commented Feb 3, 2020 • edited Loading

acjh commented Feb 3, 2020

crphang commented Feb 3, 2020

acjh commented Feb 3, 2020

crphang commented Feb 4, 2020 • edited Loading

openorclose commented Feb 4, 2020

crphang commented Feb 5, 2020

crphang commented Feb 11, 2020

ang-zeyu commented Feb 15, 2020 • edited Loading

crphang commented Feb 15, 2020

ang-zeyu commented Feb 15, 2020

ang-zeyu commented Feb 15, 2020 • edited Loading

crphang commented Feb 16, 2020

ang-zeyu commented Feb 16, 2020

crphang commented Jan 1, 2020 •

edited

Loading

crphang commented Feb 3, 2020 •

edited

Loading

crphang commented Feb 4, 2020 •

edited

Loading

ang-zeyu commented Feb 15, 2020 •

edited

Loading

ang-zeyu commented Feb 15, 2020 •

edited

Loading