Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure void elements have closing slash in mermaid svg #15661

Merged
merged 23 commits into from
Feb 2, 2024

Conversation

bollwyvl
Copy link
Contributor

@bollwyvl bollwyvl commented Jan 18, 2024

References

Code changes

  • transform <br> tags into closed <br/> when generating the img tag containing mermaid SVG
    • the regular expression will actually find all void elements, and preserves attributes

  • test
    • add a test/notebook with examples of all (but one) of the upstream documentation examples
    • add screenshots

User-facing changes

  • users will be able to use expressive more typographic features in mermaid diagrams

Security/Compatibility considerations

  • some implementations (e.g. GitHub) will not even parse diagrams with any void elements (or probably any fancy HTML) except hr and br
    • currently, in this PR, as in the mermaid playground, one can do really interesting/outrageous things, for example:
      • the style="border-color: red" attribute can be used on an hr to change its color
      • <input type="checkbox" /> and radio can be used to do weird things
    • in the playground, these are actually clickable, etc. but sadly not feasible with the (otherwise robust and portable) <img/> approach
      • though text labels may contain further img tags
  • no implementation render <script> tags
  • onclick="javascript:alert('pwnd')" attributes do appear to pass through on the playground, and in 4.1.0b1 if a user goes through right-click, open in new tab and then interacts with the diagram

Backwards-incompatible changes

  • if we decide to do more aggressive sanitizing, this could break some diagrams/examples in the wild

Copy link

Thanks for making a pull request to jupyterlab!
To try out this branch on binder, follow this link: Binder

@bollwyvl bollwyvl changed the title Force self-closing breaking space tags in generated mermaid svg Ensure void tags have closing slash in mermaid svg Jan 18, 2024
packages/mermaid/src/manager.ts Fixed Show fixed Hide fixed
packages/mermaid/src/manager.ts Fixed Show fixed Hide fixed
packages/mermaid/src/manager.ts Fixed Show fixed Hide fixed
@bollwyvl bollwyvl force-pushed the gh-15660-mermaid-newlines branch from a6cc997 to 66b9da2 Compare January 18, 2024 23:47
@bollwyvl bollwyvl changed the title Ensure void tags have closing slash in mermaid svg Ensure void elements have closing slash in mermaid svg Jan 18, 2024
@krassowski krassowski added this to the 4.1.0 milestone Jan 19, 2024
@krassowski krassowski added the bug label Jan 19, 2024
@krassowski
Copy link
Member

krassowski commented Jan 22, 2024

Upstream issue: mermaid-js/mermaid#1766

What effect would the flowchart.htmlLabels=false suggested by mermaid team have on compatibility? Can we use it instead of manual HTML substitution?

@bollwyvl
Copy link
Contributor Author

bollwyvl commented Jan 22, 2024 via email

@bollwyvl
Copy link
Contributor Author

htmlLabels=false

This does not solve the problem for \n in node labels.

sandbox

This resulted in strictly worse results e.g. &gt;br&lt; as these are "magic" entities, not defined in the SVG spec.

The <iframe srcdoc> approach is very attractive, as, for example, HTML (and therefore markdown) images can do icky things like:

This is totally valid markdown:

<img src="/api/shutdown" alt="shutdown"/>

And pretty much all frontends will render it, including (at least in preview) GitHub

This is totally valid markdown:

shutdown

But marked communicates over strings, and filters out iframes... we probably don't want to change that posture, today, as they can very much be used for deceptive things, not just their "white hat" security techniques).

@krassowski
Copy link
Member

But marked communicates over strings, and filters out iframes... we probably don't want to change that posture, today, as they can very much be used for deceptive things, not just their "white hat" security techniques).

Could we use the same splicing approach as we do with MathJax? See:

// Separate math from normal markdown text.
const parts = removeMath(source);
// Convert the markdown to HTML.
html = await markdownParser.render(parts['text']);
// Replace math.
html = replaceMath(html, parts['math']);

@krassowski
Copy link
Member

Kicking this PR to check if the labeller for mermaid now works (#15717), also FYI once you merge/rebase the tests for mermaid should now run on CI.

@bollwyvl
Copy link
Contributor Author

Thanks. Alas, it might be a tick before I can get back to this...

Comment on lines 85 to 86
svg = svg.replace(Private.RE_VOID_ELEMENT, Private.replaceVoidElement);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see this unit tested. For example input '<br/>' (which I understand we do not expect from the current version of marked but I guess user or future versions could introduce it) results in '<br / >' which strictly speaking is invalid XHTML. Basically:

new DOMParser().parseFromString('<br>'.replace(RE_VOID_ELEMENT, replaceVoidElement), 'image/svg+xml')

works, but:

new DOMParser().parseFromString('<br/>'.replace(RE_VOID_ELEMENT, replaceVoidElement), 'image/svg+xml')

fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will add some coverage.

I'm semi-disappointed we still don't have any property-based testing or overarching frontend coverage that combines jest and puppeteer outputs, but certainly won't be wading into the stack any more than i have to on this PR.

@bollwyvl
Copy link
Contributor Author

bollwyvl commented Feb 1, 2024

Ok, all of the screenshots are now working (and less sensitive to scroll skew). Will merge upstream, and see if anything pops out, but no further work planned.

);

const clean = MermaidManager.cleanMermaidSvg(raw);
const parsed = parser.parseFromString(clean, SVG_MIME);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would now error if the cleaned string is not valid XML... it's harder to verify that the browser would go into the "fallback" rendering mode (the most likely failure case).

Because of that, it's probably not worth roundtripping inside the implementation, which is already pretty heavy.

* for elements larger than viewport, derived from:
* https://github.com/microsoft/playwright/issues/13486#issuecomment-1112012053
*/
async function resizePageAndScreenshot(locator: Locator) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this is something that should be hoisted up outside of a deeply-nested utils?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could add it as a galata helper in the future if it turns out useful elsewhere too. For now it's easier to do it here because this way we don't need to worry about it becoming a public API yet ;)

@bollwyvl
Copy link
Contributor Author

bollwyvl commented Feb 2, 2024

hooray all 🟢

Copy link
Member

@krassowski krassowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @bollwyvl!

@krassowski krassowski merged commit e51ff33 into jupyterlab:main Feb 2, 2024
79 checks passed
@bollwyvl
Copy link
Contributor Author

bollwyvl commented Feb 2, 2024

Thanks. Alas, the wheels turn: since this was merged, https://github.com/mermaid-js/mermaid/releases/tag/v10.8.0 😿

I don't see any hard blockers/features that would require going out, but will take a look, especially to kick the tires on the screenshot approach.

@bollwyvl bollwyvl mentioned this pull request Feb 2, 2024
18 tasks
@bollwyvl bollwyvl deleted the gh-15660-mermaid-newlines branch February 2, 2024 14:11
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Feb 3, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Feb 5, 2024
krassowski added a commit to bollwyvl/jupyterlab that referenced this pull request Jul 19, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Jul 19, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Sep 9, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Dec 12, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Dec 12, 2024
bollwyvl added a commit to bollwyvl/jupyterlab that referenced this pull request Dec 12, 2024
ImpSy pushed a commit to spotinst/jupyterlab that referenced this pull request Jan 7, 2025
)

* force self-closing breaking space tags in generated mermaid svg

* expand void element regular expression

* use explicit securityLevel

* add more diagram examples to browser test

* remove MIME string from test mermaid test notebook

* roll back existing test change, add diagram gallery test

* rework test selectors

* add screenshots from ci

* fix mermaid snapshot naming

* remove gitgraph screenshots because random hashes

* more mermaid screenshot work

* more isolated, verbose diagram screenshot names

* more screenshots

* add unit tests for void element cleaning

* another unclosed tag

* Fix snapshots for large diagrams

* Scale the diagram up to avoid having a small blurry blob

* Update Playwright Snapshots

---------

Co-authored-by: krassowski <5832902+krassowski@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 2, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants