Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for multiple renderers #35

Closed
aleemb opened this issue Dec 25, 2014 · 13 comments
Closed

Support for multiple renderers #35

aleemb opened this issue Dec 25, 2014 · 13 comments
Labels
enhancement New functionality or behavior feedback wanted We need your input! up-for-grabs Please feel free to take this one!
Milestone

Comments

@aleemb
Copy link
Contributor

aleemb commented Dec 25, 2014

My use case is that I need custom renderers for images and depending on the image URL, these renders will render the image as an iframe for YouTube, Vimeo, etc. At present I use the Sundown renderer and I have stuffed all these renderers into one big file but would like to move over to this lib and cleanup the mess while I'm at it.

I also imagine that a lot of contributors can then share their own YoutubeRenderer (or generic VideoRenderer). This would make extensibility a lot more fun and build an ecosystem around it. Maybe even give way to an extensions repo.

Based on your suggestion in #32, I currently render my custom Renderer as

$environment->addInlineRenderer('League\CommonMark\Inline\Element\Image', new MediaRenderer());

However, this only allows attaching a single renderer. It would be nice if the renderer can work sort of like conventional event handlers with the possibility to either chain the output as input to other renderers or stop propagation. The easiest approach would be to have symmetric input/output for renderers that would make chaining a breeze. Currently the input/output is (AbstractInline, HtmlRenderer)/HtmlElement.

@aleemb
Copy link
Contributor Author

aleemb commented Dec 25, 2014

For clarity, there are two use-cases.

  1. Define multiple renderers such as YouTubeRenderer, TwitterRenderer, SpellcheckRenderer, etc which may operate on the same entity such as images, text, raw html, links, etc. Currently I found it possible to only define a single renderer for an image and others.

  2. Chaining renderers. I haven't run into this yet but I think this might be relevant for an HTML sanitiser to sanitise links or raw html before passing off to other renderers. HTML sanitisation is probably going to be the most common scenario for rendering links, images and similar untrusted user input:

    A note on security: Neither implementation attempts to sanitize link attributes or raw HTML. If you use these libraries in applications that accept untrusted user input, you must run the output through an HTML sanitizer to protect against XSS attacks.

@colinodell colinodell added the enhancement New functionality or behavior label Dec 25, 2014
@colinodell
Copy link
Member

These are some great thoughts. Let me think about this and play with the code.

Thanks!

@aleemb
Copy link
Contributor Author

aleemb commented Dec 25, 2014

Thanks! With the 0.5 release I find this lib is quite ready for production use. I bumped the repo on HN https://news.ycombinator.com/item?id=8796457

@aleemb
Copy link
Contributor Author

aleemb commented Dec 26, 2014

I haven't delved into the framework so it's based on a cursory glance at the code but the simplest route that wouldn't involve a lot of code juggling and might even simplify the abstraction layers would involve combining AbstractInline and HtmlRenderer and adding an HtmlElements collection property to the HtmlRenderer, so roughly:

class HtmlRenderer
{
    public $data = array(); // from AbstractInline
    public $htmlElements = array();
    // same implementation as before
}

The InlineRendererInterface would be redefined as:

interface InlineRendererInterface
{
    // returns HtmlRenderer
    public function render(HtmlRenderer $htmlRenderer);
}

and the implementation would be:

class ImageRenderer extends HtmlRenderer implements InlineRendererInterface
{
    public function render(HtmlRenderer $htmlRenderer)
    {
        $inline = $htmlRenderer->inline;

        // ...

        $htmlElement = new HtmlElement('img', $attrs, '', true);
        // or something like
        $htmlElement = $this->htmlElement;

        // how would this line work? does order need to be preserved? associative array or what?
        $this->htmlElements[] = $htmlElement;
        // ? or just
        $this->htmlElement = $htmlElement;

        return $this; // exposes chaining and paves way for multiple renderers
    }
}

Something along those lines maybe?

EDIT: Code correction

@colinodell
Copy link
Member

I appreciate all the examples and your patience while I think through this.

Chaining renderers. I haven't run into this yet but I think this might be relevant for an HTML sanitiser to sanitise links or raw html before passing off to other renderers. HTML sanitisation is probably going to be the most common scenario for rendering links, images and similar untrusted user input

I think the renderer should be responsible for all aspects of converting the AST element into HTML, including any sanitization or escaping that needs to happen. The chaining approach sounds cool but I feel it's unnecessary since renderers should make these changes themselves. For example, the TextRenderer knows it needs to escape HTML but the HtmlRenderer shouldn't. I guess I don't see a compelling use case where chaining provides more benefits than the current approach.

My use case is that I need custom renderers for images and depending on the image URL, these renders will render the image as an iframe for YouTube, Vimeo, etc.

Wouldn't it be better to parse out YouTube videos as a new entity/element type with it's own dedicated renderer? For example, given this input:

My favorite video is: ![this one](https://www.youtube.com/watch?v=dQw4w9WgXcQ)

You'd get an AST roughly like this:

But you really want an embedded video, so why not create a Video element instead with its own dedicated VideoRenderer to handle it? You could create an inline processor (not currently documented) to replace all Images with YouTube URLs with a Video element:

class MediaProcessor implements InlineProcessorInterface
{
    public function processInlines(ArrayCollection $inlines, DelimiterStack $delimiterStack, Delimiter $stackBottom = null)
    {
        foreach ($inlines as $index => $inline) {
            if ($inline instanceof Image && $this->isYouTubeUrl($inline->getUrl()) ) {
                $inlines->set($index, new Video($inline->getUrl());
            }
        }
    }
}

That's very rough, untested pseudo-code but something like that should work.

Or you could extend/replace the existing ImageParser to just create Videos instead - whatever works best for you.

Define multiple renderers such as YouTubeRenderer, TwitterRenderer, SpellcheckRenderer, etc which may operate on the same entity such as images, text, raw html, links, etc. Currently I found it possible to only define a single renderer for an image and others.

I'm not necessarily opposed to multiple renderers per element type. The YouTube/Vimeo/etc example is a very compelling reason to support this. OTOH, this introduces another level of complexity that really isn't needed for CommonMark support.

I feel like there are 3 different approaches we could take:

Option 1 - Simple loop, first returner "wins"

The simplest implementation for multiple renderer support would work similar to the parsers - we iterate through them until we get a successful result - nobody else is called or has the opportunity to overwrite the previous result.

Option 2 - Event-based approach

A more-complex implementation could use an event-based approach instead where all renderers get called with an HtmlRenderEvent exposing things like:

  • The inline being parsed
  • The current result (string, HtmlElement, or HtmlElement[])
  • Method to stop propogation (no other renderers get called)

Each renderer would have the ability to modify the current result as needed.

I think this would be the "most correct" way to implement multiple renderers with overwrite capability, but I'm still not convinced we need it.

Options 3 - Composite renderer

An alternative would be creating a "composite renderer" which works like option 1, except you only create/use it when needed.

class CompositeInlineRenderer implements InlineRendererInterface
{
    protected $renderers;

    /**
     * @param InlineRendererInterface[]
     */
    public function __construct($renderers)
    {
        $this->renderers = $renderers;
    }

    public function render(AbstractInline $inline, HtmlRenderer $htmlRenderer)
    {
        foreach ($this->renderers as $subRenderer) {
            if ($result = $subRenderer->render($inline, $htmlRenderer)) {
                return $result;
            }
        }
    }
}

$mediaRenderer = new CompositeInlineRenderer(array(
    new YouTubeRenderer(),
    new VimeoRenderer(),
    // ...
));

$environment->addInlineParser('Image', $mediaRenderer);

This lets you implement whatever approach works best for you and doesn't require any changes to the codebase - you could start using this today.


What are your thoughts on all of this?

@colinodell colinodell added this to the Version 1.0 milestone Dec 30, 2014
@colinodell colinodell added the feedback wanted We need your input! label Dec 30, 2014
@aleemb
Copy link
Contributor Author

aleemb commented Dec 30, 2014

Thanks very much for taking the time to think this through.

Option 3 doesn't follow the open-close principle since each extension would require modification of the composite renderer. My point was more along along the lines of a framework feature--composability and code organisation within plugins is a non-issue.

Option 1 seems pretty useful since it allows pass-through which is a small but powerful feature for plugins. However the custom renders will need to be invoked before the primary renderer so they can render prematurely if needed.

Option 2 would probably require a significant rewrite. I suspect, however, it will be more performant since the processing pipeline needs to run just once and raise events along the way.

The inline processor approach is something I wasn't aware of but it seems this would need to inspect every inline element over again? If so, this will degrade performance with each additional inline processor. If not, this is a pretty good approach and this matter could be closed.

Rant: I really don't mean to complicate the issue and option 1 is more than sufficient here. The point of raising this issue was that developing a plugin ecosystem is something I hold as strongly as a strategy to gain wide adoption. I think PHP-CommonMark is a good contender for the same. WordPress led the way here by exposing thousands of hooks in their platform and then letting developer creativity take over. NPM, Gulp, Grunt, Composer, etc are all built around a strong plugin community. JS implementations of CommonMark such as Remarkable/Markdown-It are also focusing on something similar (taking a route similar to Gulp/Grunt). I like the idea of installing via a single Composer command, things like tables support, HTML sanitiser, Embed.ly providers and get on with my day.

@colinodell
Copy link
Member

The point of raising this issue was that developing a plugin ecosystem is something I hold as strongly as a strategy to gain wide adoption.

That's a great point and I completely agree with you :) Plugin support is definitely a priority and I want to ensure we're using the best approach possible. I'd hate to implement one approach only to discover it's too slow, easily-abused, not flexible enough, or whatever.

I think we can eliminate option 1 as it requires too much work to ensure that your renderer runs first. Although... maybe it could work if we provided methods like addInlineRendererBefore('LinkRenderer', new MyCustomRenderer()) so you can set that order? It would probably work but feels a little dirty to me.

Option 3 wouldn't necessarily violate the open-closed principle. Think of it like a final class that simply encapsulates multiple renderers into the existing structure.

The biggest change in option 2 would require rewriting the method signatures of all renderers. The general structure and rendering processes would remain the same, so it shouldn't be too bad to implement.

Let me try implementing that second option and perhaps some other ideas too. I think we'll have a better grasp of the pros/cons once there's some working code to look at and test.

In the mean time, I'd appreciate any feedback on #42 and #43 :)

Thanks!

@colinodell colinodell self-assigned this Dec 30, 2014
@colinodell colinodell modified the milestones: Version 0.7, Version 0.6 Jan 9, 2015
@colinodell colinodell modified the milestones: Version 0.8, Version 0.7 Feb 16, 2015
@indrora
Copy link

indrora commented Feb 27, 2015

Taking on my $0.002.

I'm attempting to implement a new kind of link type (internal references) and I'm having a hard time. I'm trying to make it such that ^[some page](some/where/inside) can be parsed into what would be [some page](/some/path/some/where/inside)

Having a way to mangle links like this would be fantastic -- it would give the ability to have what OP wants (Totally override the link process) but also fit several other use cases (e.g. a custom one to make your internal documentation, say $[this CVE](2008-4250) could become the equivalent of [this CVE](http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-4250)

@colinodell colinodell added the up-for-grabs Please feel free to take this one! label Apr 10, 2015
@colinodell colinodell removed this from the Version 0.8 milestone Apr 29, 2015
@colinodell colinodell removed their assignment May 8, 2015
@0b10011
Copy link
Contributor

0b10011 commented Jul 10, 2015

I implemented YouTube video rendering using a processor that would replace any Link that has a YouTube video URL with a YouTubeVideo element that extends Link, leaving the url, label, and link alone as they're passed through. Then I added a YouTubeVideoRenderer that ensures the link is by itself in a paragraph, pulls the video ID out of the URL, and creates the necessary elements to display the video.

This way, if the YouTube processor/render are ever removed, it will fallback to a link to the YouTube video in it's own paragraph, which doesn't look half bad and still serves the same purpose.

See https://gist.github.com/bfrohs/9017f38586404f161298 for code.

@skyzyx
Copy link

skyzyx commented Jul 22, 2015

IMO, Option 2 is "the right way" to do it. Fire an event when something happens in the parsing/rendering, and register listeners in a chain that can modify the subject as necessary. Subject/Observer Pattern.

@colinodell
Copy link
Member

My primary objection to Option 1 was ensuring that developers could successfully influence the order in which the renderers executed. However, once #210 is merged, extensions will be loaded in the order they were added (instead of by name), thus giving the control needed for this to work. So if you wanted to override a core renderer, you'd have to ensure your extension is loaded before the code extension.

(This would require additional work outside of #210; that PR simply sets some of the ground work)

@colinodell
Copy link
Member

Please review #341 when you get a chance - I'd love to hear your feedback!

@colinodell colinodell added this to the v0.19 milestone Mar 11, 2019
@colinodell
Copy link
Member

Implemented via #341

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New functionality or behavior feedback wanted We need your input! up-for-grabs Please feel free to take this one!
Projects
None yet
Development

No branches or pull requests

5 participants