Skip to content

Commit

Permalink
Merge pull request #1746 from sparklemotion/flavorjones-revert-libxml…
Browse files Browse the repository at this point in the history
…-unescaped-attrs

revert libxml unescaped attrs
  • Loading branch information
flavorjones authored Mar 29, 2018
2 parents 41c46be + 57ba668 commit ce93a72
Show file tree
Hide file tree
Showing 4 changed files with 182 additions and 0 deletions.
17 changes: 17 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,22 @@
# unreleased

## Security Notes

[MRI] Behavior in libxml2 has been reverted which caused CVE-2018-8048 (loofah gem), CVE-2018-3740 (sanitize gem), and CVE-2018-3741 (rails-html-sanitizer gem). The commit in question is here:

> https://github.com/GNOME/libxml2/commit/960f0e2
and more information is available about this commit and its impact here:

> https://github.com/flavorjones/loofah/issues/144
This release simply reverts the libxml2 commit in question to protect users of Nokogiri's vendored libraries from similar vulnerabilities.

If you're offended by what happened here, I'd kindly ask that you comment on the upstream bug report here:

> https://bugzilla.gnome.org/show_bug.cgi?id=769760

## Dependencies

* [MRI] libxml2 is updated from 2.9.7 to 2.9.8
Expand Down
2 changes: 2 additions & 0 deletions Manifest.txt
Original file line number Diff line number Diff line change
Expand Up @@ -250,6 +250,7 @@ lib/xalan.jar
lib/xercesImpl.jar
lib/xml-apis.jar
lib/xsd/xmlparser/nokogiri.rb
patches/libxml2/0001-Revert-Do-not-URI-escape-in-server-side-includes.patch
patches/sort-patches-by-date
suppressions/README.txt
suppressions/nokogiri_ruby-2.supp
Expand Down Expand Up @@ -301,6 +302,7 @@ test/html/sax/test_parser.rb
test/html/sax/test_parser_context.rb
test/html/sax/test_parser_text.rb
test/html/sax/test_push_parser.rb
test/html/test_attributes.rb
test/html/test_builder.rb
test/html/test_document.rb
test/html/test_document_encoding.rb
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
From c5538465c08a8ea248a370bf55bc39cd3385e4af Mon Sep 17 00:00:00 2001
From: Mike Dalessio <mike.dalessio@gmail.com>
Date: Thu, 29 Mar 2018 14:09:00 -0400
Subject: [PATCH] Revert "Do not URI escape in server side includes"

This reverts commit 960f0e275616cadc29671a218d7fb9b69eb35588.
---
HTMLtree.c | 49 +++++++++++--------------------------------------
1 file changed, 11 insertions(+), 38 deletions(-)

diff --git a/HTMLtree.c b/HTMLtree.c
index 2fd0c9c..67160c5 100644
--- a/HTMLtree.c
+++ b/HTMLtree.c
@@ -717,49 +717,22 @@ htmlAttrDumpOutput(xmlOutputBufferPtr buf, xmlDocPtr doc, xmlAttrPtr cur,
(!xmlStrcasecmp(cur->name, BAD_CAST "src")) ||
((!xmlStrcasecmp(cur->name, BAD_CAST "name")) &&
(!xmlStrcasecmp(cur->parent->name, BAD_CAST "a"))))) {
+ xmlChar *escaped;
xmlChar *tmp = value;
- /* xmlURIEscapeStr() escapes '"' so it can be safely used. */
- xmlBufCCat(buf->buffer, "\"");

while (IS_BLANK_CH(*tmp)) tmp++;

- /* URI Escape everything, except server side includes. */
- for ( ; ; ) {
- xmlChar *escaped;
- xmlChar endChar;
- xmlChar *end = NULL;
- xmlChar *start = (xmlChar *)xmlStrstr(tmp, BAD_CAST "<!--");
- if (start != NULL) {
- end = (xmlChar *)xmlStrstr(tmp, BAD_CAST "-->");
- if (end != NULL) {
- *start = '\0';
- }
- }
-
- /* Escape the whole string, or until start (set to '\0'). */
- escaped = xmlURIEscapeStr(tmp, BAD_CAST"@/:=?;#%&,+");
- if (escaped != NULL) {
- xmlBufCat(buf->buffer, escaped);
- xmlFree(escaped);
- } else {
- xmlBufCat(buf->buffer, tmp);
- }
-
- if (end == NULL) { /* Everything has been written. */
- break;
- }
-
- /* Do not escape anything within server side includes. */
- *start = '<'; /* Restore the first character of "<!--". */
- end += 3; /* strlen("-->") */
- endChar = *end;
- *end = '\0';
- xmlBufCat(buf->buffer, start);
- *end = endChar;
- tmp = end;
+ /*
+ * the < and > have already been escaped at the entity level
+ * And doing so here breaks server side includes
+ */
+ escaped = xmlURIEscapeStr(tmp, BAD_CAST"@/:=?;#%&,+<>");
+ if (escaped != NULL) {
+ xmlBufWriteQuotedString(buf->buffer, escaped);
+ xmlFree(escaped);
+ } else {
+ xmlBufWriteQuotedString(buf->buffer, value);
}
-
- xmlBufCCat(buf->buffer, "\"");
} else {
xmlBufWriteQuotedString(buf->buffer, value);
}
--
2.9.5

85 changes: 85 additions & 0 deletions test/html/test_attributes.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
require "helper"

module Nokogiri
module HTML
class TestAttr < Nokogiri::TestCase
unless Nokogiri::VersionInfo.instance.libxml2? && Nokogiri::VersionInfo.instance.libxml2_using_system?
#
# libxml2 >= 2.9.2 fails to escape comments within some attributes. It
# wants to ensure these comments can be treated as "server-side includes",
# but as a result fails to ensure that serialization is well-formed,
# resulting in an opportunity for XSS injection of code into a final
# re-parsed document (presumably in a browser).
#
# the offending commit is:
#
# https://github.com/GNOME/libxml2/commit/960f0e2
#
# we'll test this by parsing the HTML, serializing it, then
# re-parsing it to ensure there isn't any ambiguity in the output
# that might allow code injection into a browser consuming
# "sanitized" output.
#
# complaints have been made upstream about this behavior, notably at
#
# https://bugzilla.gnome.org/show_bug.cgi?id=769760
#
# and multiple CVEs have been declared and fixed in downstream
# libraries as a result, a list is being kept up to date here:
#
# https://github.com/flavorjones/loofah/issues/144
#
[
#
# these tags and attributes are determined by the code at:
#
# https://git.gnome.org/browse/libxml2/tree/HTMLtree.c?h=v2.9.2#n714
#
{tag: "a", attr: "href"},
{tag: "div", attr: "href"},
{tag: "a", attr: "action"},
{tag: "div", attr: "action"},
{tag: "a", attr: "src"},
{tag: "div", attr: "src"},
{tag: "a", attr: "name"},
#
# note that div+name is _not_ affected by the libxml2 issue.
# but we test it anyway to ensure our logic isn't modifying
# attributes that don't need modifying.
#
{tag: "div", attr: "name", unescaped: true},
].each do |config|

define_method "test_uri_escaping_of_#{config[:attr]}_attr_in_#{config[:tag]}_tag" do
html = %{<#{config[:tag]} #{config[:attr]}='examp<!--" unsafeattr=unsafevalue()>-->le.com'>test</#{config[:tag]}>}

reparsed = HTML.fragment(HTML.fragment(html).to_html)
attributes = reparsed.at_css(config[:tag]).attribute_nodes

assert_equal [config[:attr]], attributes.collect(&:name)
if Nokogiri::VersionInfo.instance.libxml2?
if config[:unescaped]
#
# this attribute was emitted wrapped in single-quotes, so a double quote is A-OK.
# assert that this attribute's serialization is unaffected.
#
assert_equal %{examp<!--" unsafeattr=unsafevalue()>-->le.com}, attributes.first.value
else
#
# let's match the behavior in libxml < 2.9.2.
# test that this attribute's serialization is well-formed and sanitized.
#
assert_equal %{examp<!--%22%20unsafeattr=unsafevalue()>-->le.com}, attributes.first.value
end
else
#
# yay for consistency in javaland. move along, nothing to see here.
#
assert_equal %{examp<!--%22 unsafeattr=unsafevalue()>-->le.com}, attributes.first.value
end
end
end
end
end
end
end

0 comments on commit ce93a72

Please sign in to comment.