JSONP: Always escape U+2028 and U+2029 #37
Great find, Magnus. UTF is a tricky bitch! :) So, we're basically still handing them the same values, just escaping them first? That sounds like a great solution, as opposed to incorrectly returning a 400 Bad Request. Looks good, I'll merge it in and push it up.
JSONP: Always escape U+2028 and U+2029
# causes a ParseError in the browser. We work around this issue by
# replacing them with the escaped version. This should be safe because
# according to the JSON spec, these characters are *only* valid inside
# a string and should therefore not be present any other places.
Great job commenting this.
I was just about to commit this (without comments) when I said to myself: "WTF is this? It doesn't make any sense at all? What's so special about U+2028/9?" Then I wrote the comment and all was good.
Haha, and it clarified why escaping made the most sense (when my initial thought was to reject it instead). :)
Technically, it's not really a 400 Bad Request either. The request from the client is perfectly fine, it's the response from the server that's wrong…
Of course, but it's only obvious when those UTF-8 characters are explained :)
This needs to use octets to be compatible with 1.8. See #44.
Transforming "\u2028" to '\u2028' seems weird because '\u2028' is only a string of 6 characters, which makes no sense IMO.
I just discovered a "bug" in JSON:
JSON is not a true subset of JavaScript because of two tiny whitespace Unicode characters: U+2028 and U+2029. In ECMA-262 they are defined as "Line Terminator Characters" (see 7.3 Line Terminator) and are therefore equivalent to \n and \r. That means they are not valid unescaped in the middle of a string literal.
According to JSON, U+2028 and U+2029 are just two regular Unicode characters and are therefore valid in the middle of a string. This is usually not a problem as long as you use a proper JSON parser, but in the case of JSONP the browser's JavaScript parser is the JSON parser.
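To make the mismatch concrete, here is a minimal Ruby sketch of the escaping idea. It is not the code from this pull request: the helper name `escape_js_line_terminators` and the sample payload are made up for illustration, and it assumes a Ruby version that understands `\u` escapes in strings and regexps (so not 1.8).

```ruby
require 'json'

# Hypothetical helper (not the code from this pull request): replace each raw
# U+2028 / U+2029 with its six-character JSON escape sequence.
def escape_js_line_terminators(json)
  # Block form, so the backslash in the replacement is never treated as a
  # gsub back-reference.
  json.gsub(/[\u2028\u2029]/) { |char| format('\u%04x', char.ord) }
end

# Made-up payload with a raw U+2028 inside a JSON string.
raw = "{\"description\":\"line one\u2028line two\"}"
escaped = escape_js_line_terminators(raw)

# A JSON parser decodes both forms to the same value...
same_value = JSON.parse(raw) == JSON.parse(escaped)

# ...but only the escaped form contains no raw line terminators.
script_safe = !escaped.match?(/[\u2028\u2029]/)
```

The escaped text is still valid JSON and decodes to the same string, which is why escaping looks like a safer choice than rejecting the response.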
This pull request simply escapes any U+2028/9 (to \u2028/9), which fixes this issue.
For a "real-world" example of this issue, you can try calling GitHub's JSONP API for the YARD repository (whose description happens to include U+2028):