Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport tests and fix for CVE-2018-3740 to 2.x branch. Resolves #187 #188

Merged
merged 7 commits into from
Sep 30, 2018

Conversation

dometto
Copy link

@dometto dometto commented Sep 29, 2018

See #187

I'm getting six test failures locally, but upon inspection it seems that I already get these when I checkout the 2.x branch unmodified. The failures are all of this form:

-"<b>Lorem</b> ipsum <strong>dolor</strong> sit amet script&gt;alert(\"hello world\");"
+"<b>Lorem</b> ipsum <strong>dolor</strong> sit amet &lt;script&gt;alert(\"hello world\");"

That is, there is an unexpected &lt; before "script". We'll see what happens on Travis.

The newly backported tests in test_malicious_html.rb and test_clean_element.rb all pass, though.

# Leading and trailing whitespace around URLs is ignored at parse
# time. Stripping it here prevents it from being escaped by the
# libxml2 workaround below.
attr.value = attr.value.strip
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems at this point it is already possible that attr has been unlinked, but the do loop continues with the same attr. Should the new code somehow be made conditional on whether attr has been unlinked yet, for performance reasons? (No use doing the strip and the gsub if attr is already sanitized, I take it?)

@dometto
Copy link
Author

dometto commented Sep 29, 2018

Getting the same failures on Travis. As I said I'm getting the same failures on a fresh checkout of 2.x, so I don't think we should let this be a showstopper -- but of course it would be nice to get rid of the failures.

On a closer look, there seems to be one additional failure on this branch than on the 2.x branch:

 3) Failure:
Config::RELAXED#test_0020_should not allow protocol-based JS injection: spaces and entities [/home/travis/build/rgrove/sanitize/test/test_sanitize.rb:276]:
Expected: "<img src=\"\">"
  Actual: "<img src>"

EDIT: just changing the expected result is not acceptable because <img src> is (I assume) not valid HTML.

@dometto
Copy link
Author

dometto commented Sep 29, 2018

It's the attr.value = attr.value.strip line that's causing the single additional failure. In this test, before the strip call, attr is this:

#<Nokogiri::XML::Attr:0x3fcaf1c1bd98 name="src" value=" ">

After the strip call, it becomes:

#<Nokogiri::XML::Attr:0x3fcc0417ee30 name="src">

So it seems that in nokogiri, setting value to "" causes value to be removed from the Attr entirely.

I am now omitting the strip if attr.value.strip.empty?. After doing that, the actual output from this test becomes "<img src=\"%20\">", and the test still fails. So I have further omitted the gsub in case attr.value.strip.empty?, which passes the test. @rgrove: is this an acceptable way of dealing with the edge case?

The five remaining failures are those that are already present on vanilla 2.x.

@dometto dometto changed the title Backport tests and fix for CVE-2018-3740 to 2.x branch Backport tests and fix for CVE-2018-3740 to 2.x branch. Resolves #187 Sep 29, 2018
Copy link
Owner

@rgrove rgrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

The test failures on the 2.x branch are most likely due to changes in the output of newer versions of Nokogiri that didn’t exist when the tests were written. As long as the output produced by modern Nokogiri is safe and sane, it should be fine to update the expected output to match Nokogiri’s current output.

EDIT: just changing the expected result is not acceptable because <img src> is (I assume) not valid HTML.

<img src> isn’t technically valid, but it is safe, and safety is more important than validity. I would prefer this output over skipping stripping/escaping on space-only values. It would also be fine to remove the empty src attribute, since it’s useless (more on this in my diff comments).

lib/sanitize/transformers/clean_element.rb Outdated Show resolved Hide resolved
lib/sanitize/transformers/clean_element.rb Outdated Show resolved Hide resolved
lib/sanitize/transformers/clean_element.rb Outdated Show resolved Hide resolved
test/common.rb Outdated Show resolved Hide resolved
test/test_clean_element.rb Outdated Show resolved Hide resolved
test/test_clean_element.rb Outdated Show resolved Hide resolved
test/test_clean_element.rb Outdated Show resolved Hide resolved
test/test_malicious_html.rb Outdated Show resolved Hide resolved
@dometto
Copy link
Author

dometto commented Sep 30, 2018

Thanks for the comments, @rgrove! Will get to work on them.

@dometto
Copy link
Author

dometto commented Sep 30, 2018

Green lights. Please have a look at the new implementation in clean_element.rb.

I took the liberty of bumping version and updating HISTORY for your convenience (though do change the release date if you don't get round to cutting the gem today).

@rgrove rgrove merged commit 9bc8497 into rgrove:2.x Sep 30, 2018
@rgrove
Copy link
Owner

rgrove commented Sep 30, 2018

Thanks @dometto! I've pushed 2.1.1 to RubyGems.

@dometto dometto deleted the 2x_security_fix branch October 1, 2018 11:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants