Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace "inner_contents" config option with "symbol" in Heading Permalinks #505

Closed
markhalliwell opened this issue Jun 20, 2020 · 3 comments · Fixed by #506
Closed

Replace "inner_contents" config option with "symbol" in Heading Permalinks #505

markhalliwell opened this issue Jun 20, 2020 · 3 comments · Fixed by #506
Labels
enhancement New functionality or behavior implemented Change has been implemented
Milestone

Comments

@markhalliwell
Copy link
Contributor

With the recent request and decision made in #504, it brought up the point that the inner_contents config option essentially allows the same thing.

It currently defaults to HeadingPermalinkRenderer::DEFAULT_INNER_CONTENTS which is an Octicon link SVG element.

Problems

  • It creeps into front-end territory where decisions regarding visual aspects like this should live.
  • It bakes the custom markup into the rendered output. This can have serious consequences in CMSes where content is typically only ever rendered at the time of edit/save. This means the rendered output can be archived or cached indefinitely; causing issues during a site redesign where the markup provided may no longer work or in fact, may cause serious display issues.
  • Allows any arbitrary HTML and potential future security vulnerability that may arise due to the planned allowance of runtime configuration overrides via front matter ([2.0] Front matter extension #497).

Proposed solution

  • Introduce a new symbol config option that is limited to a single character and defaults to a paragraph symbol ().
  • Deprecate the inner_contents config option and remove the default value.
  • Only use inner_contents if it was explicitly specified in config.
  • Update documentation to reflect this change and give additional CSS examples on how to provide a custom icon similar to https://commonmark.thephpleague.com/1.5/extensions/external-links/#adding-icons
@colinodell
Copy link
Member

I mostly agree with the proposed solution, but with two caveats:

  1. We should consider whether the default insert option needs to change.
  2. The new symbol option should not be limited to a single character

It creeps into front-end territory where decisions regarding visual aspects like this should live.

This is a very fair point. To give some background as to why 1.4 has the implementation it does:

GFM's implementation adds this HTML inside the link: <span aria-hidden class="octicon octicon-link">. This is not technically part of the GFM spec, but rather their HTML rendering pipeline. However, I believe this library should produce HTML output that doesn't need CSS to function or look nice, which is why the Octicon link SVG was embedded as a constant instead of outputting an empty span.

I'm not opposed to switching the default value to in 2.0 and providing the Octicon SVG as a documentation example instead. If we do that, would we also want to change the default value of the insert from before to after?

With this proposed solution, headings would render like this by default:

¶ Hello World!

Does that look good or would this be better?

Hello World! ¶

I'm not sure myself. Thoughts?


It bakes the custom markup into the rendered output. This can have serious consequences in CMSes where content is typically only ever rendered at the time of edit/save. This means the rendered output can be archived or cached indefinitely; causing issues during a site redesign where the markup provided may no longer work or in fact, may cause serious display issues.

There are a number of configurable settings that add HTML classes to the output - they would also suffer from the same issue.

Ideally, the rendering should happen within a cache layer that is easy to clear and allows for the Markdown to be re-converted when the cache is missing, but I get this isn't always easy or feasible.

#431 would allow an alternate strategy where the parsed Document AST is serialized to XML and then stored in the database. The process to render it would then look something like this:

$finalHtml = $cache->get($page->getId());
if ($finalHtml === null) {
    $finalHtml = $xmlRenderer->convertToHtml($page->getParsedXML());
    $cache->set($page->getId(), $finalHtml);
}

But I'm getting too off-topic here :P Basically just trying to say this "problem" isn't limited to just this one feature and there are some ways around this, including some potential future approaches.


Allows any arbitrary HTML and potential future security vulnerability that may arise due to the planned allowance of runtime configuration overrides via front matter (#497).

I'm not worried about this one as there are much greater risks, like somebody changing html_input from escape to allow. We'll put a big warning on the docs about the security implications of enabling the option to allow front matter to override config settings.

@markhalliwell
Copy link
Contributor Author

2. The new symbol option should not be limited to a single character

Then what's the point of introducing a new option that does the exact same thing as inner_contents? The entire reason behind this deprecation and introduction of a single character symbol is to prevent custom HTML from being injected there... on purpose. There is no need for more than one text-based symbol here. Its sole purpose is simply to make the permalink visible on the page.

This is not technically part of the GFM spec, but rather their HTML rendering pipeline. However, I believe this library should produce HTML output that doesn't need CSS to function or look nice, which is why the Octicon link SVG was embedded as a constant instead of outputting an empty span.

Which is why this doesn't belong here. It's a part of their specific HTML rendering, not anyone elses. Given that one can easily remove a default text-based symbol and then style the entire link via CSS using a pseusdo-element to whatever icon they want, doesn't inherently mean it isn't "nice". It's just what's provided out the box. If someone decides they need to use their own custom icon set or whatever to make it appear "nice" on their site, then that is the type of customization that should be left to the developer to implement on their own.

If we do that, would we also want to change the default value of the insert from before to after?
With this proposed solution, headings would render like this by default:

¶ Hello World!

Does that look good or would this be better?

Hello World! ¶

I'm not sure myself. Thoughts?

Personally, I think if it's before, it should be # Hello World! and if it's after, it should be Hello World! ¶

Regardless, this is purely cosmetic and configurable; which is the entire point. Ultimately, I don't think that warrants changing insert just because it might not "look good" to some (which is completely subjective in the first place).

In order to simplify this specific inner_contents deprecation, I think we shouldn't worry about insert at this time. If we really want to revisit it at another time, that's certainly doable and should have its own issue/discussion/bikeshed.

There are a number of configurable settings that add HTML classes to the output - they would also suffer from the same issue.

True, however as you state, they're classes, not arbitrary HTML markup (which isn't processed or filtered at all). This is the only configuration option that literally allows whatever is passed to be rendered as-in in the final markup. Classes can be more easily ignored (or quickly fixed to a previous BC implementation) if a FE implementation has changed. Adding extra arbitrary HTML elements that can potentially have no styling whatsoever, can physically alter the layout of a page if there isn't appropriate CSS to handle it. I'm not saying that the other configuration doesn't suffer from this issue, but this one exacerbates the problem on a much higher level.

I'm not worried about this one as there are much greater risks, like somebody changing html_input from escape to allow.

I understand what you're saying. But inner_contents isn't filtered based on those settings either. As I said above, this is injected literally as-is. Even if we were to suddenly passing this through filters (and even in CMSes like Drupal where we override this and filter all output after), it doesn't change the fact that the only rationale behind allowing this option is to "provide an icon". Something that can be done in a less potentially destructive/insecure way via other methods. There's no real reason to allow arbitrary HTML here.

Nevermind that this is the only setting I've had to actively "parse" on Drupal's side just so it can allow whatever is provided there actually through the filtering system of "allowed HTML". Which, inherently also presents its own issues as this would allow any HTML passed here anywhere in the document; not strictly related to header permalinks.

Anytime there is a way to "configure", "supply", or "inject" full HTML somewhere, it runs a much higher risk of being exploited down the road in some fashion. Regardless of its intended purpose or perceived isolation; it can and will be eventually exploited somehow.

I don't see how allowing a potentially vulnerable configuration option like this can be justified for the sake of a strictly visual cosmetic enhancement that should be handled on the FE level in the first place.

@colinodell
Copy link
Member

I do think your use case is a little different than most, as Drupal is allowing potentially untrusted users to modify settings that could impact security, whereas most applications are going to "hard-code" these settings to their particular needs. For example, if Reddit or StackOverflow were to change their implementations over to PHP and use this lib, they're not going to expose this setting to their users - they're going to choose what works best for their particular application.

That's not to say we shouldn't consider your use case (we absolutely should), but if a codebase chooses to expose a sensitive setting like html_filter or inner_contents to end-users, I think the responsibility to check and enforce any filtering or permissions checks should also remain within that codebase. Developers who understand the security risks and choose to allow raw HTML or inject embedded SVGs should be able to do so as long as we explain those risks and provide a safer default for those who don't want it.

@colinodell colinodell self-assigned this Jun 21, 2020
@colinodell colinodell added this to the v1.5 milestone Jun 21, 2020
@colinodell colinodell removed their assignment Jun 21, 2020
@close-label close-label bot added the implemented Change has been implemented label Jun 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New functionality or behavior implemented Change has been implemented
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants