-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor serializer: rearrange code for clarity and introspection #1148
Conversation
As discussed elsewhere (#844 for example), this is pretty worrisome. We've chosen a data storage mechanism (
This is more like it. We should be validating inputs, not detecting changes. |
blocks/api/serializer.js
Outdated
Object.keys( realAttributes ), | ||
Object.keys( expectedAttributes ) | ||
); | ||
export function attributesToSave( allAttributes, fromContent ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think getCommentAttributes
was a better name. I would say the signature of this function should be:
function getCommentAttributes( allAttributes, attributesFromContent ) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's fine with me. I found it confusing (which is why I changed it) but I don't care too much
blocks/api/serializer.js
Outdated
|
||
return memo + `${ key }="${ value }" `; | ||
}, '' ); | ||
const transform = key => escapeDoubleQuotes( allAttributes[ key ] ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Transforming should not really be the responsibility of the function that goes from all attributes to attributes saved in comments. This should be done during the serialization-to-string step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point. the reason I did this was to avoid iterating twice. granted, that's an optimization
it did however seem reasonable to do here since this function was taking "attributes from a block" and returning "attributes that need to be saved in the comment header." if we don't transform here, we have a function which returns a set of attributes in a state that should never exist: "the group of attributes which need to serialize in the comment header but which can't be saved as-is because of potential serialization bugs"
thoughts?
blocks/api/serializer.js
Outdated
// Iterate over attributes and produce the set to save | ||
return reduce( | ||
Object.keys( allAttributes ), | ||
( toSave, key ) => Object.assign( toSave, isValid( key ) && { [ key ]: transform( key ) } ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit too magical. What does Object.assign( toSave, false )
actually do?
Combined with the above comment, how about filter( allAttributes, ( value, key ) => isValid( key ) )
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, do we really need isValid
at all? Isn't it enough to say simply !! attributesFromContent[ key ]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit too magical.
I can see where it feels magical, but it's just the behavior of Object.assign()
by specification.
how about filter
this seems very reasonable. I'll update that.
Also, do we really need isValid at all? Isn't it enough to say simply !! attributesFromContent[ key ] ?
well if that's all we said we'd be wrong in the case where allAttributes[ key ]
is undefined
and it doesn't exist in attributesFromContent
. trying to pull out the actual operations that are happening to determine if a key is valid is part of why I created isValid()
since before it was spread across different lines.
@nylen this is the whole point. the goal isn't to slap people on the write for using an external editor, but it's to determine if any external edits caused any change in the structural content of the block and if so, present the user with some indication that decay has occurred. |
b4317d1
to
1ad4c77
Compare
blocks/api/post.pegjs
Outdated
{ return keyValue( name, value ) } | ||
/ name:HTML_Attribute_Name _* "=" _* "'" value:$(("\\'" . / !"'" .)*) "'" | ||
/ name:HTML_Attribute_Name _* "=" _* "'" value:$(("\\" "'" . / !"'" .)*) "'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More on these changes later…
blocks/api/serializer.js
Outdated
return memo + `${ key }="${ value }" `; | ||
}, '' ); | ||
return ! ( contentValue !== undefined || allValue === undefined ) | ||
? Object.assign( toSave, { [ key ]: escapeDoubleQuotes( allValue ) } ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit weird to me that we're encoding the attributes values in this function but we generate the key="value"
in the other function. Serialization of the comment attributes is split between two methods.
I'd think we should avoid encoding here, or generate the complete string here because encoding the values without generating a string makes no sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rearranged in fdd4aa72
eb87c72
to
a0a8351
Compare
These changes are intended to make the flow of transformation from in-memory block data to serialized `post_content` clearer. Additionally they are intended to ease testing and create points for easy trapping and transformation of the data through the pipeline.
01d9635
to
f97ee6a
Compare
@@ -1,8 +1,14 @@ | |||
{ | |||
|
|||
function untransformValue( value ) { | |||
return 'string' === typeof value | |||
? value.replace( /\\-/g, '-' ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this right? What if you have the string "\\\\-"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs to pair with the serializer to make sure we don't get here. we have have to face escaping in some manner. doing it here I think is the easiest way to accomplish this: "You must escape hyphens. You need not escape anything else."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, see #1088 for background on the issues with a double-hyphen in an HTML comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this means we should escape \
as well?
blocks/api/serializer.js
Outdated
@@ -36,35 +36,94 @@ export function getSaveContent( save, attributes ) { | |||
return wp.element.renderToString( rawContent ); | |||
} | |||
|
|||
const escapeDoubleQuotes = value => value.replace( /"/g, '\"' ); | |||
const replaceHyphens = value => value.replace( /-/g, '\\-' ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it should be escapeHyphens
for consistency?
blocks/api/serializer.js
Outdated
* @returns {*} transformed value | ||
*/ | ||
const serializeValue = value => | ||
'string' === typeof value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it is an array or object that contains strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they shouldn't be. we should already be JSON-serializing data if not a number or string.
this will change if we stop using name="value"
pairings, which is being worked on in another PR at the moment (not yet published)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
@@ -127,7 +127,7 @@ describe( 'full post content fixture', () => { | |||
} | |||
} | |||
|
|||
expect( serializedActual ).to.eql( serializedExpected ); | |||
expect( serializedActual.trim() ).to.eql( serializedExpected.trim() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want (or need) to be forgiving with leading/trailing whitespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion we shouldn't be testing for the specific presence of absence of it.
@see #948 for background exploration
These changes are intended to make the flow of transformation from
in-memory block data to serialized
post_content
clearer. Additionallythey are intended to ease testing and create points for easy trapping
and transformation of the data through the pipeline.
As I'm working in the serializer file to make harmonious updates with the parser changes I'm finding it unclear what is and what should be going on in serialization. Hopefully this PR will serve as a base for future work.
It makes subjective calls on style; I know. Please let me know if you don't like it.
Performance shouldn't be an issue and if you think it will please respond with data backing that up.Unlike in #948 I tried to give a reasonable 👁 to performance in this incarnation and I don't anticipate it being reasonably less-performant than the current implementation in production.My plans are to continue to augment this (and update the tests) so that we can have more control over the serialization process. This additional control might give us more ability to automatically detect if posts have been altered out of Gutenberg and if so, if they have been altered in a deleterious manner.