Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unescaped HTML inside shortcode attributes breaks block validation #13399

Closed
karmatosed opened this issue Jan 21, 2019 · 9 comments
Closed

Unescaped HTML inside shortcode attributes breaks block validation #13399

karmatosed opened this issue Jan 21, 2019 · 9 comments
Labels
[Feature] Paste [Type] Plugin Interoperability Incompatibilities between a specific plugin and the block editor. Close with workaround notes.
Milestone

Comments

@karmatosed
Copy link
Member

karmatosed commented Jan 21, 2019

Note, this is a direct port of a ticket filed on Trac and which can be found here: https://core.trac.wordpress.org/ticket/45684

(edited by mcsf to use the term "quotes" throughout)

When inserting shortcode containing both single and double quotes (like so: [plugin query="a='b' c=1"] ) using a Shortcode Block, Gutenberg will show a warning next time you load the page, saying that the Block contains unexpected or incorrect code. You can choose to "restore" the code by converting it to a Block (whatever that means in this context), completely wrecking your code, or you can convert it to Custom HTML in which case the single quotes will be converted to double quotes, breaking the HTML. This also happens when you insert the code into the page using Custom HTML right away. Either way, the shortdcode will not work properly and there's no way around it, using vanilla WP 5.0.

The only solution I have found, is to install the Classic Editor Plug-In and then switch to Text View. That basically rules out the possibility of any of my editors being able to make and maintain pages containing such shortcodes, which is extremely not cool. (I'm classifying this as a Bug of Critical Severity, since it effectively breaks an essential plug-in of my site.)

When saving the above code as a Reusable Block (I'm using WP with a Dutch translation so the exact term may differ), you get to see the resulting HTML. Everything between the single quotes is replaced by a single line-break. Ouch.

@mcsf mcsf added [Feature] Paste [Feature] Shortcodes Related to shortcode functionality labels Jan 21, 2019
@mcsf
Copy link
Contributor

mcsf commented Jan 21, 2019

I can't reproduce this. Is there a specific series of steps we could have to better test?

@mcsf mcsf added the [Status] Needs More Info Follow-up required in order to be actionable. label Jan 21, 2019
@chunkyRice
Copy link

chunkyRice commented Jan 21, 2019

The error occurs when trying to activate the plugin xili Post in Post. The original shortcode looks like this:

[xilipostinpost query="tag=dossier" beforeeach="<div class='xi-item'>" aftereach="</div>"]

When the classic editor is in visual mode when opening the post, the shortcode is rewritten to:

<!-- wp:html -->
<div>[xilipostinpost query="dossier" beforeeach="
<div class="xi-item">" aftereach="</div>
"]</div>
<!-- /wp:html -->

When opening the post using Gutenberg, it is automatically put in a Classic Editor block and the code is replaced with:

<div>[xilipostinpost query="dossier" beforeeach="<div class="xi-item">" aftereach="</div>
<p>"]</p></div>

When using Gutenberg from scratch and using the Shortcode block, I get the following warning upon reopening the post:

This block contains illegal or unexpected code

(Original warning in Dutch, the exact text may differ)

Converting it to HTML by clicking on the button, it produces:

<!-- wp:html -->
xilipostinpost query="dossier" beforeeach="<div class="xi-item">" aftereach="</div>"
<!-- /wp:html -->

Using an HTML block produces the same result.
Using Gutenberg from scratch in text editor mode, produces this upon saving and reopening:

<p>xilipostinpost query="dossier" beforeeach="</p>
<div class="xi-item">" aftereach="</div>
<p>"</p>

My best option is to use the classic editor in code mode. But whenever I save any post in visual mode and then open this post (and it's a post that is supposed to be updated regularly, so that happens quite often) it trashes the code again and I have to rebuild it.
If the editors would allow the nesting of quotes, the problem would be solved. I don't understand why that's considered to be an error in the frist place. I tried to escape the double quotes with a backslash, but that doesn't work either.

@mcsf
Copy link
Contributor

mcsf commented Jan 22, 2019

Thanks for the report, @chunkyRice. From your account and what I then reproduced, this seems unrelated to quotes, but related to how escaped and unescaped HTML entities are handled.

This also affects the classic editor. For instance:

classic-text-to-visual

This is because any rich-text editor (including classic) will be looking for HTML elements in post content. Inside the shortcode there is such an element (<div class="xi-item">…</div>), which interferes with the rest of content. In order for the shortcode attributes to be ignored by the editor, the HTML entities would have to be escaped, meaning we would have:

[xilipostinpost query="tag=dossier" beforeeach="&lt;div class='xi-item'&gt;" aftereach="&lt;/div&gt;"]

Incidentally, the above snippet is exactly what we obtain if we paste the shortcode when in Visual mode instead of Text mode:

classic-visual-to-text

When this is the case, Gutenberg has no problem parsing the content and helping with the conversion:

classic-to-shortcode


Note: I understand that the shortcode in question, xilipostinpost, may not even support attributes whose HTML has been escaped. (If that's the case, I would bring up this issue with the plugin maintainer.) However I would personally argue that shortcodes shouldn't operate on unescaped HTML. At the very least, it's a grey area.

There may be room for improvement somewhere in Gutenberg to make situations like these easier on users, but I'm not sure which.

@mcsf mcsf added [Feature] Rich Text Related to the Rich Text component that allows developers to render a contenteditable and removed [Feature] Shortcodes Related to shortcode functionality [Status] Needs More Info Follow-up required in order to be actionable. labels Jan 22, 2019
@mcsf mcsf changed the title Gutenberg breaks shortcode with nested quotes Unescaped HTML inside shortcode attributes breaks block validation Jan 22, 2019
@mcsf mcsf added the [Type] Plugin Interoperability Incompatibilities between a specific plugin and the block editor. Close with workaround notes. label Jan 22, 2019
@ellatrix
Copy link
Member

@mcsf How does this relate to the RichTex component?

@mcsf
Copy link
Contributor

mcsf commented Jan 29, 2019

@iseulde: Not very specifically, this is more about the serialisation layer. Fine to remove whatever labels you like.

@chunkyRice, does the explanation in this issue help you in any way? Is there a course of action you see for yourself or for core Gutenberg? Otherwise, I'm inclined to say there are no action items for Gutenberg and that we can close this issue. Let me know! Thanks.

@ellatrix ellatrix removed the [Feature] Rich Text Related to the Rich Text component that allows developers to render a contenteditable label Jan 29, 2019
@chunkyRice
Copy link

I've tested what @mcsf suggested, but it only works a single time. When you save the page and reopen it, everything is a mess again.

I expected the snippet block to handle raw ascii text, but apparently that's not the case. Doing so would solve this problem, but from your answers I understand that it would also be a potential safety hazard.

I've submitted the problem to the support forum of the plugin.

@mcsf
Copy link
Contributor

mcsf commented Jan 30, 2019

When you save the page and reopen it, everything is a mess again.

Yeah, never mind what I said. The block-validation process runs the serialised HTML through the block parser, where attributes are parsed according to their type:

/**
* Returns an hpq matcher given a source object.
*
* @param {Object} sourceConfig Attribute Source object.
*
* @return {Function} A hpq Matcher.
*/
export function matcherFromSource( sourceConfig ) {
switch ( sourceConfig.source ) {
case 'attribute':
let matcher = attr( sourceConfig.selector, sourceConfig.attribute );
if ( sourceConfig.type === 'boolean' ) {
matcher = toBooleanAttributeMatcher( matcher );
}
return matcher;
case 'html':
return html( sourceConfig.selector, sourceConfig.multiline );
case 'text':
return text( sourceConfig.selector );
case 'children':
return children( sourceConfig.selector );
case 'node':
return node( sourceConfig.selector );
case 'query':
const subMatchers = mapValues( sourceConfig.query, matcherFromSource );
return query( sourceConfig.selector, subMatchers );
case 'tag':
return flow( [
prop( sourceConfig.selector, 'nodeName' ),
( value ) => value.toLowerCase(),
] );
default:
// eslint-disable-next-line no-console
console.error( `Unknown source type "${ sourceConfig.source }"` );
}
}

The block type for Shortcode defines that the text attribute is of source 'text'. This makes the parser parse the attribute via hpq's text(), which ultimately uses Node#textContent instead of Element#innerHTML to return the parsed shortcode. This is where &lt; becomes <.

The discrepancy between the serialised escaped HTML (&lt;) form and the "expected" unescaped forms (<) is what makes the block invalid upon editor initialisation.

Changing Shortcode's text attribute to be of source 'html' does solve your particular case. But I can't speak to any kind of instability that that would bring. @aduth, do you have any insight there? The Code block is the only other core block using source: 'text', since it also deals with plain-text content.

@aduth
Copy link
Member

aduth commented Jan 30, 2019

To me, particularly because it saves using RawHTML, there's no reason the shortcode block shouldn't also define its attribute as being sourced via 'html'. I expect the original implementation didn't expect HTML to be embedded within a shortcode, so it defaulted to text.

It's worth pointing out that regardless of whether the source is changed, the default WordPress unfiltered_html capability will apply when the post is saved; i.e. the text will be escaped anyways server-side for any user who is not an administrator.

@mcsf
Copy link
Contributor

mcsf commented Jan 30, 2019

Thanks for the input. Opened #13609.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Paste [Type] Plugin Interoperability Incompatibilities between a specific plugin and the block editor. Close with workaround notes.
Projects
None yet
Development

No branches or pull requests

6 participants