Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSONP: Always escape U+2028 and U+2029 #37

Merged
merged 1 commit into from
May 6, 2011
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 11 additions & 1 deletion lib/rack/contrib/jsonp.rb
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,17 @@ def valid_callback?(callback)
# since JSON is returned as a full string.
#
def pad(callback, response, body = "")
response.each{ |s| body << s.to_s }
response.each do |s|
# U+2028 and U+2029 are allowed inside strings in JSON (as all literal
# Unicode characters) but JavaScript defines them as newline
# seperators. Because no literal newlines are allowed in a string, this
# causes a ParseError in the browser. We work around this issue by
# replacing them with the escaped version. This should be safe because
# according to the JSON spec, these characters are *only* valid inside
# a string and should therefore not be present any other places.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job commenting this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just about to commit this (without comments) when I said to myself: "WTF is this? It doesn't make any sense at all? What's so special about U+2028/9?" Then I wrote the comment and all was good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, and it clarified why escaping made the most sense (when my initial thought was to reject it instead). :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, it's not really a 400 Bad Request either. The request from the client is perfectly fine, it's the response from the server that's wrong…

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, but it's only obvious when those UTF-8 characters are explained :)

body << s.to_s.gsub("\u2028", '\u2028').gsub("\u2029", '\u2029')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cool to summarize why this gsub technique works (are the single and double quotes meaningful?) for folks who aren't deep rubyists. Also, I'm curious what the original was doing with response.each{ |s| body << s.to_s }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cool to summarize why this gsub technique works (are the single and double quotes meaningful?) for folks who aren't deep rubyists.

In Ruby "\u2028" is a string that contains a single character: U+2028. '\u2028' on the other hand contains 6 characters:

# Unpack the Unicode codepoints:
>> "\u2028".unpack("U*")
=> [8232]
>> '\u2028'.unpack("U*")
=> [92, 117, 50, 48, 50, 56]

JSON specifies that \u followed by 4 hex-digits is the same as a single, literal Unicode character.

Also, I'm curious what the original was doing with response.each{ |s| body << s.to_s }

In Ruby the response isn't actually a string; it's something you can iterate. The reason for this is so you can create lazy/streaming responses (so you don't have to put everything into memory at the same time). This code on the other hand will put everything in memory at the same time, but it's quite simple to change it to a streaming version. So far, nobody has requested that, and it only matters if you serve huge JSONP responses.

end

["#{callback}(#{body})"]
end

Expand Down
11 changes: 10 additions & 1 deletion test/spec_rack_jsonp.rb
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,15 @@
headers = Rack::JSONP.new(app).call(request)[1]
headers['Content-Type'].should.equal('application/javascript')
end

specify "should not allow literal U+2028 or U+2029" do
test_body = "{\"bar\":\"\u2028 and \u2029\"}"
callback = 'foo'
app = lambda { |env| [200, {'Content-Type' => 'application/json'}, [test_body]] }
request = Rack::MockRequest.env_for("/", :params => "foo=bar&callback=#{callback}")
body = Rack::JSONP.new(app).call(request).last
body.join.should.not.match(/\u2028|\u2029/)
end

context "but is empty" do
specify "should " do
Expand Down Expand Up @@ -122,4 +131,4 @@ def assert_bad_request(response)
body.should.equal [test_body]
end

end
end