Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strip HTML from gallery image alt text #26574

Closed
wants to merge 9 commits into from
Closed

Conversation

WunderBart
Copy link
Member

@WunderBart WunderBart commented Oct 29, 2020

Fixes #26549

Description

When a gallery image's alt attribute value is empty, we use what's in the caption prop to fill it out. The issue is that caption can contain links, whereas the alt attribute should not contain any HTML. This PR makes sure the alt attribute when created from caption, receives a value stripped out of HTML.

How has this been tested?

Manually:

  1. Add the Image Gallery block.
  2. Add some images.
  3. Fill the caption with some text and a link.
  4. Preview the site and inspect the image element.
  5. See that the alt attribute doesn't contain any HTML.

To run unit tests covering the change:

npm run test-unit -- test/integration/full-content/full-content.test.js

Types of changes

Bugfix

Checklist:

  • My code is tested.
  • My code follows the WordPress code style.

TODO

  • Figure out a way to access 'implicit' attributes (that aren't stored in the comment, but inferred through a query) in migrate
  • Migration: Unescape HTML in alt attribute.

@WunderBart WunderBart added [Status] In Progress Tracking issues with work in progress [Block] Gallery Affects the Gallery Block - used to display groups of images labels Oct 29, 2020
@WunderBart WunderBart self-assigned this Oct 29, 2020
@github-actions
Copy link

github-actions bot commented Oct 29, 2020

Size Change: +37 B (0%)

Total Size: 1.21 MB

Filename Size Change
build/block-library/index.js 146 kB +35 B (0%)
build/blocks/index.js 48.1 kB +1 B
build/components/index.js 172 kB +1 B
ℹ️ View Unchanged
Filename Size Change
build/a11y/index.js 1.14 kB 0 B
build/annotations/index.js 3.78 kB 0 B
build/api-fetch/index.js 3.45 kB 0 B
build/autop/index.js 2.84 kB 0 B
build/blob/index.js 665 B 0 B
build/block-directory/index.js 8.72 kB 0 B
build/block-directory/style-rtl.css 943 B 0 B
build/block-directory/style.css 942 B 0 B
build/block-editor/index.js 130 kB 0 B
build/block-editor/style-rtl.css 11.1 kB 0 B
build/block-editor/style.css 11.1 kB 0 B
build/block-library/editor-rtl.css 8.97 kB 0 B
build/block-library/editor.css 8.97 kB 0 B
build/block-library/style-rtl.css 7.86 kB 0 B
build/block-library/style.css 7.86 kB 0 B
build/block-library/theme-rtl.css 837 B 0 B
build/block-library/theme.css 838 B 0 B
build/block-serialization-default-parser/index.js 1.88 kB 0 B
build/block-serialization-spec-parser/index.js 3.1 kB 0 B
build/components/style-rtl.css 15.2 kB 0 B
build/components/style.css 15.2 kB 0 B
build/compose/index.js 9.81 kB 0 B
build/core-data/index.js 12.3 kB 0 B
build/data-controls/index.js 772 B 0 B
build/data/index.js 8.77 kB 0 B
build/date/index.js 31.8 kB 0 B
build/deprecated/index.js 768 B 0 B
build/dom-ready/index.js 571 B 0 B
build/dom/index.js 4.46 kB 0 B
build/edit-navigation/index.js 11.2 kB 0 B
build/edit-navigation/style-rtl.css 881 B 0 B
build/edit-navigation/style.css 885 B 0 B
build/edit-post/index.js 306 kB 0 B
build/edit-post/style-rtl.css 6.41 kB 0 B
build/edit-post/style.css 6.39 kB 0 B
build/edit-site/index.js 22.1 kB 0 B
build/edit-site/style-rtl.css 3.88 kB 0 B
build/edit-site/style.css 3.88 kB 0 B
build/edit-widgets/index.js 26.4 kB 0 B
build/edit-widgets/style-rtl.css 3.13 kB 0 B
build/edit-widgets/style.css 3.13 kB 0 B
build/editor/editor-styles-rtl.css 480 B 0 B
build/editor/editor-styles.css 482 B 0 B
build/editor/index.js 43.1 kB 0 B
build/editor/style-rtl.css 3.85 kB 0 B
build/editor/style.css 3.85 kB 0 B
build/element/index.js 4.65 kB 0 B
build/escape-html/index.js 735 B 0 B
build/format-library/index.js 7.7 kB 0 B
build/format-library/style-rtl.css 547 B 0 B
build/format-library/style.css 548 B 0 B
build/hooks/index.js 2.13 kB 0 B
build/html-entities/index.js 623 B 0 B
build/i18n/index.js 3.57 kB 0 B
build/is-shallow-equal/index.js 712 B 0 B
build/keyboard-shortcuts/index.js 2.52 kB 0 B
build/keycodes/index.js 1.94 kB 0 B
build/list-reusable-blocks/index.js 3.11 kB 0 B
build/list-reusable-blocks/style-rtl.css 476 B 0 B
build/list-reusable-blocks/style.css 476 B 0 B
build/media-utils/index.js 5.34 kB 0 B
build/notices/index.js 1.79 kB 0 B
build/nux/index.js 3.42 kB 0 B
build/nux/style-rtl.css 671 B 0 B
build/nux/style.css 668 B 0 B
build/plugins/index.js 2.56 kB 0 B
build/primitives/index.js 1.43 kB 0 B
build/priority-queue/index.js 791 B 0 B
build/redux-routine/index.js 2.85 kB 0 B
build/reusable-blocks/index.js 3.06 kB 0 B
build/rich-text/index.js 13.2 kB 0 B
build/server-side-render/index.js 2.77 kB 0 B
build/shortcode/index.js 1.69 kB 0 B
build/token-list/index.js 1.27 kB 0 B
build/url/index.js 4.06 kB 0 B
build/viewport/index.js 1.84 kB 0 B
build/warning/index.js 1.14 kB 0 B
build/wordcount/index.js 1.22 kB 0 B

compressed-size-action

@WunderBart WunderBart force-pushed the fix/img-caption-html branch from 37d0b33 to 898b5f8 Compare October 29, 2020 16:36
@ockham
Copy link
Contributor

ockham commented Oct 29, 2020

The current error

    Expected mock function not to be called but it was called with:
    ["Block successfully updated for `%s` (%o).·
    New content generated by `save` function:·
    [...]

seems to indicate that a block migration is run, even though it shouldn't be run 🤔

BTW how did you create the fixtures? https://github.com/WordPress/gutenberg/blob/710373b254fbcd15d524afdeb31da0d93c4defd4/packages/e2e-tests/fixtures/blocks/README.md#creating-fixtures says you only need to manually create core__gallery__image-with-caption.html, and can then auto-generate the other fixtures through

npm run fixtures:regenerate test/integration/full-content/full-content.test.js

I had a look at core__gallery__image-with-caption.html. I think that the alt img attribute cannot be empty there, since we're assuming that this block was saved using the current (as of this PR) block code -- which would set the alt img attribute to caption's (HTML-stripped) value.

@ockham
Copy link
Contributor

ockham commented Oct 29, 2020

I had a look at core__gallery__image-with-caption.html. I think that the alt img attribute cannot be empty there, since we're assuming that this block was saved using the current (as of this PR) block code -- which would set the alt img attribute to caption's (HTML-stripped) value.

Yeah, setting those (and regenerating the other fixtures) seems to do the trick 👍

@WunderBart
Copy link
Member Author

I had a look at core__gallery__image-with-caption.html. I think that the alt img attribute cannot be empty there, since we're assuming that this block was saved using the current (as of this PR) block code -- which would set the alt img attribute to caption's (HTML-stripped) value.

Yeah, setting those (and regenerating the other fixtures) seems to do the trick 👍

I did use the npm command to create the remaining files. Missed the fact that the alt attribute should be filled with the output value there, though! 🙌. Thanks for fixing the tures, Bernie! 😄

@WunderBart WunderBart added [Type] Bug An existing feature does not function as intended and removed [Status] In Progress Tracking issues with work in progress labels Oct 30, 2020
@WunderBart WunderBart marked this pull request as ready for review October 30, 2020 11:37
@ockham
Copy link
Contributor

ockham commented Oct 30, 2020

Writing a deprecation here is currently blocked by the fact that apparently, 'implicit' attributes (i.e. the ones that aren't stored as part of the block comment, but inferred through a queue) -- in this case: the images array -- aren't accessible to isEligible and migrate.

I think that's a feature that should be added to deprecations, although it might be a bit tricky to get right, since serializing the updated attribute back to HTML would involve calling the right save method (not the current deprecation's). We can probably assume it's the 'next' one in the stack of deprecations (including the up-to-date save of the block).

cc/ @youknowriad @mcsf @gziolo

@youknowriad
Copy link
Contributor

Writing a deprecation here is currently blocked by the fact that apparently, 'implicit' attributes (i.e. the ones that aren't stored as part of the block comment, but inferred through a queue) -- in this case: the images array -- aren't accessible to isEligible and migrate.

How did you arrive to this conclusion. That's not true I think or unexpected.

anchor: true,
align: true,
},
isEligible( { images } ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need isElligible here? I believe this particular deprecation doesn't really need it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of what you said in your other comment? I guess if we don't need a migrate, we won't need an isEligible. But if we need a migrate, we'd need a way to know that the alt includes HTML, no?

};
} ),
};
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm not sure the migrate function is needed here because if I understand the PR properly, the attribute is not really rewritten, the same attribute will produce another output, that's it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the alt attribute -- which is implicit -- is being changed; so yeah, that attribute is part of the output (and not an explicit attribute in the block comment). Are you saying that a change like that doesn't need a migration? 🤔 If we don't add one, it means the block will load with HTML in the alt attribute -- it will only be removed when saving it again, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, correct. but now that I think about it, there's something weird in the implementation of the gallery save function regardless.

If the image doesn't have any "alt", it will get assigned an "alt" text after saving because it will be parsed from the fallback. I wonder if we should remove the fallback personally and instead apply the fallback in edit function instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, correct. but now that I think about it, there's something weird in the implementation of the gallery save function regardless.

If the image doesn't have any "alt", it will get assigned an "alt" text after saving because it will be parsed from the fallback. I wonder if we should remove the fallback personally and instead apply the fallback in edit function instead.

You mean weird in terms of UX? So right now, after saving and reloading, the alt, if otherwise empty, would be prefilled with the caption. Moving to edit would essentially do the same though, no?

Whatever we do, let's please make sure to strip the HTML, and provide a working migration path. This issue (when combined with some other HTML mangling/sanitization) is causing havoc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the block saving is not idempotent. If you save the same post twice without changes, you get different attributes.

What if we do this:

  • When adding an image, use the caption as a fallback for the alt property. (and escape it)
  • Remove the fallback in save

Would that work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add that I believe that we need some logic like this in the save anyway, for the purpose of the pseudo-migration for alt attributes that were touched with GB 9.2 active, and had HTML inserted. (I don't think we can solve that solely at edit level.) And since the change (at save) level is sufficient also for newly created galleries, there's nothing left to do at edit level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add that I believe that we need some logic like this in the save anyway, for the purpose of the pseudo-migration for alt attributes that were touched with GB 9.2 active, and had HTML inserted.

For me personally, I don't see that as a big issue. I see the conceptual issue as more important: (the conceptual issue being the save function not idempotent creating different block attributes without touching the editor).

How many persons are affected by this issue and for Core, it's still a development plugin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also note that in all cases, the user has to open and save the post anyway since migrations don't run automtically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me personally, I don't see that as a big issue. I see the conceptual issue as more important: (the conceptual issue being the save function not idempotent creating different block attributes without touching the editor).

I'm all for purity of code and all that, but it's the impact of this issues that worries me: Post content actually gets broken, so people will have to revert to an earlier revision (and might lose the changes they were working on). OTOH, the feature that was added by the PR that caused this is an (arguably minor) a11y improvement. IMO, those aren't strong enough reasons to keep around a feature that has such negative side effects.

How many persons are affected by this issue and for Core, it's still a development plugin.

Please refer to Automattic/wp-calypso#46813. I'm counting ~15 so far (some of those comments refer to the WP.com forums, where each thread can potentially contain multiple reports. Here's a few:

https://wordpress.com/forums/topic/gallery-block-problem/
https://wordpress.com/forums/topic/galleries-not-showing-up-right/
https://wordpress.com/forums/topic/unreadable-appearance-of-gallery-page-after-editing/
https://wordpress.com/forums/topic/cant-make-changes-to-website-and-drop-down-menus-are-not-working/
https://wordpress.com/forums/topic/galleries-not-showing-up-right/

I'd also note that in all cases, the user has to open and save the post anyway since migrations don't run automtically.

That's correct, but that's how migrations work, isn't it? This could be addressed by adding additional tooling to e.g. 'fix' the database upon plugin upgrade, but let's not get the lack of a perfect solution in the way of a sufficient one.

Needing to open and save can at least help with the cases where users with GB 9.2 were editing their galleries and because of the issue lost content, revert the post to a previous revision, and would like to re-apply those changes.


Anyway, is the compromise to simply remove the image.caption fallback for alt? I'd be okay with that, for the sake of at least preventing any further breakage into 9.3, even if it means that we won't unbreak existing content.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, is the compromise to simply remove the image.caption fallback for alt? I'd be okay with that, for the sake of at least preventing any further breakage into 9.3, even if it means that we won't unbreak existing content.

Filed the alternative PR: #26676

@ockham
Copy link
Contributor

ockham commented Oct 30, 2020

Writing a deprecation here is currently blocked by the fact that apparently, 'implicit' attributes (i.e. the ones that aren't stored as part of the block comment, but inferred through a queue) -- in this case: the images array -- aren't accessible to isEligible and migrate.

How did you arrive to this conclusion. That's not true I think or unexpected.

I put a console.log into both functions to output their first two arguments, and the first one only contained the attributes that were listed in the block comment.

diff --git a/packages/block-library/src/gallery/deprecated.js b/packages/block-library/src/gallery/deprecated.js
index 8542eb5bfc..b0ed9d7ea1 100644
--- a/packages/block-library/src/gallery/deprecated.js
+++ b/packages/block-library/src/gallery/deprecated.js
@@ -99,7 +99,9 @@ const deprecated = [
                        anchor: true,
                        align: true,
                },
-               isEligible( { images } ) {
+               isEligible( props, blcks ) {
+                       console.log( 'isElig', props, blcks );
+                       const { images } = props;
                        return (
                                images &&
                                images.some(
@@ -107,7 +109,8 @@ const deprecated = [
                                )
                        );
                },
-               migrate( attributes ) {
+               migrate( attributes, blcks ) {
+                       console.log( 'migrate', attributes, blcks );
                        return {
                                ...attributes,
                                images: attributes.images.map( ( image ) => {

image

@ockham ockham added this to the Gutenberg 9.3 milestone Nov 2, 2020
Copy link
Contributor

@ockham ockham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM (but we should look into the failing tests -- maybe a rebase is enough).

I'd like to get final approval by @youknowriad since we ended up going with a save-based approach; see the conversation starting here for the rationale.

Thanks a lot for this fix, @WunderBart!

@ockham ockham force-pushed the fix/img-caption-html branch from 57986d8 to 138e640 Compare November 2, 2020 19:28
@talldan talldan removed this from the Gutenberg 9.3 milestone Nov 4, 2020
@gziolo
Copy link
Member

gziolo commented Nov 4, 2020

See my comment on the alternative PR proposed by @ockham #26676 (comment).

@ockham
Copy link
Contributor

ockham commented Nov 4, 2020

Obsoleted by #26676.

@ockham ockham closed this Nov 4, 2020
@ockham
Copy link
Contributor

ockham commented Nov 4, 2020

Thanks again for working on this and exploring different options to solve it, @WunderBart!

@ockham ockham deleted the fix/img-caption-html branch November 4, 2020 13:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Block] Gallery Affects the Gallery Block - used to display groups of images [Type] Bug An existing feature does not function as intended
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gallery block: Strip HTML from caption before using as alt attribute.
5 participants