Remove Selenium webdriver attributes from DOM #108

Merged
merged 7 commits into from
Nov 23, 2016

@englehardt englehardt commented Nov 23, 2016

See commit comments for more information.

@gunesacar Let me know what you think! Thanks for the work on #105.

Also includes automated tests. It seems the order of content script
execution is non-deterministic. Sometimes Selenium's script will run
first and sometimes OpenWPM's will run first. We should be able to handle
this by monitoring the DOM for changes, but need to confirm the
performance degradation isn't too high.
The document attribute is removed with a DOMAttrModified eventListener
that removes itself after the first call. The navigator attribute is
prevented from being set by altering Object.defineProperty until
Selenium attempts to set the attribute (at which point the alteration is
reversed).
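A minimal sketch of the `Object.defineProperty` interception described in this commit message, not the merged code. It assumes a plain object (`fakeNavigator`) standing in for `window.navigator` so it can run outside a browser, and the helper name `blockFirstWebdriverDefine` is hypothetical.

```javascript
// Stand-in for window.navigator so the sketch runs outside a browser (assumption).
const fakeNavigator = {};

// Hypothetical helper: wraps Object.defineProperty until the first attempt to
// define `webdriver` on `target`, drops that attempt, then restores the original.
function blockFirstWebdriverDefine(target) {
  const originalDefineProperty = Object.defineProperty;
  originalDefineProperty(Object, 'defineProperty', {
    configurable: true,
    writable: true,
    value: function (obj, prop, descriptor) {
      if (obj === target && prop === 'webdriver') {
        // Reverse the alteration, then swallow this one call.
        originalDefineProperty(Object, 'defineProperty', {
          value: originalDefineProperty
        });
        return undefined;
      }
      // Every other call behaves exactly as before.
      return originalDefineProperty(obj, prop, descriptor);
    }
  });
  return originalDefineProperty;
}

const original = blockFirstWebdriverDefine(fakeNavigator);

// Simulates the automation framework flagging the browser.
Object.defineProperty(fakeNavigator, 'webdriver', { value: true });

console.log(fakeNavigator.webdriver === undefined); // true
console.log(Object.defineProperty === original);    // true
```

The wrapper keeps its own reference to the original function in a closure, so restoring it does not depend on any global variable.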
driver = kwargs['driver']

# Check if document element has `webdriver` attribute
assert 'true' != driver.execute_script(
Contributor

@englehardt these asserts fail when you have disable_webdriver_self_id=false, right?
Just wanted to make sure since they are called within a function that is passed around.

Collaborator Author

Yes, I added a pathway in 994b080 to handle and reraise exceptions from the child process in the main thread/process. In particular, it handles AssertionError used by py.test.

As a sanity check, I just made a new commit that tests both conditions.

assert 'true' != driver.execute_script(
'return document.documentElement.getAttribute("webdriver")')
# Check if navigator has webdriver property
assert not driver.execute_script('return navigator.webdriver')
Contributor

@gunesacar gunesacar Nov 23, 2016

Since we need to match the exact response from a standard (non-Selenium) browser, I'd do all these checks in JS, using strict equality (=== or !==). Something like:

assert driver.execute_script('return undefined === navigator.webdriver')
assert driver.execute_script('return null === document.documentElement.getAttribute("webdriver")')
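To illustrate why the strict comparison matters, here is a small sketch that mirrors both styles of check in plain JavaScript. The objects in `navigatorStates` and the values in `attributeStates` are hypothetical leftovers a partially cleaned browser might expose; they are assumptions for illustration, not states observed in this PR.

```javascript
// Loose checks, mirroring the Python-side asserts in the diff.
const looseNavigatorCheck = (nav) => !nav.webdriver;
const looseAttributeCheck = (attr) => attr !== 'true';

// Strict checks, in the style suggested in the comment above.
const strictNavigatorCheck = (nav) => nav.webdriver === undefined;
const strictAttributeCheck = (attr) => attr === null;

// Hypothetical states: a clean browser first, then two imperfect cleanups.
const navigatorStates = [{}, { webdriver: false }, { webdriver: null }];
const attributeStates = [null, '', 'false'];

for (const nav of navigatorStates) {
  console.log('navigator', nav,
    'loose:', looseNavigatorCheck(nav),
    'strict:', strictNavigatorCheck(nav));
}
for (const attr of attributeStates) {
  console.log('attribute', JSON.stringify(attr),
    'loose:', looseAttributeCheck(attr),
    'strict:', strictAttributeCheck(attr));
}
```

The loose checks accept all three states in each list, while the strict checks accept only the first, which is the one a standard browser would report. Doing the comparison inside the script also avoids relying on how `undefined` and `null` are converted on their way back to Python.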

value: originalDefineProperty
});
delete originalDefineProperty;
return;
Contributor

Could we use return undefined; instead of return;?
The former would be more readable.

Object.defineProperty(Object, 'defineProperty', {
value: originalDefineProperty
});
delete originalDefineProperty;
Contributor

I couldn't understand why we have to delete originalDefineProperty. A comment would be great.

Also, what happens when Object.defineProperty is called more than once with the arguments navigator and "webdriver"? Wouldn't it set Object.defineProperty to undefined since originalDefineProperty is deleted in the first call?

Collaborator Author

My intention was to prevent that variable from staying around in the global namespace. However, I tested it and I think that's handled anyway by the way we inject scripts. I've removed the delete statement.
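The repeated-call scenario raised in this thread can be explored with a short sketch. It assumes the wrapper holds the original function in a local variable (a closure) and uses a plain object, `fakeNavigator`, in place of `window.navigator`; the merged code may be structured differently.

```javascript
// Stand-in for window.navigator (assumption, so this runs outside a browser).
const fakeNavigator = {};
const originalDefineProperty = Object.defineProperty;

// One-shot wrapper: blocks the first `webdriver` definition, then restores itself.
originalDefineProperty(Object, 'defineProperty', {
  value: function (obj, prop, descriptor) {
    if (obj === fakeNavigator && prop === 'webdriver') {
      originalDefineProperty(Object, 'defineProperty', {
        value: originalDefineProperty
      });
      return undefined;
    }
    return originalDefineProperty(obj, prop, descriptor);
  }
});

// First attempt: swallowed by the wrapper.
Object.defineProperty(fakeNavigator, 'webdriver', { value: true });
const afterFirst = fakeNavigator.webdriver;
const restored = Object.defineProperty === originalDefineProperty;

// Second attempt: reaches the restored original and succeeds.
Object.defineProperty(fakeNavigator, 'webdriver', { value: true });
const afterSecond = fakeNavigator.webdriver;

console.log(afterFirst, restored, typeof Object.defineProperty, afterSecond);
```

In this sketch `Object.defineProperty` never becomes `undefined`, because the closure keeps the original alive. The trade-off is that only the first attempt is blocked, so the approach relies on the `webdriver` property being defined a single time per page.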

@gunesacar
Contributor

This looks way better than my naive approach, it has the resilience we need to cope with various race conditions.

My high-level concern is the overwriting of Object.defineProperty. This function is so fundamental that there's no room for even a tiny glitch.

Please see my other, more specific comments next to the code.

Added a bunch of new tests to ensure Object.defineProperty still works
as expected after our instrumentation runs and removes Selenium's
webdriver property. Other tests refactored to better handle a few
conditions.
@englehardt
Collaborator Author

englehardt commented Nov 23, 2016

Thanks for the detailed review. Just pushed a couple commits to address your comments.

I tried to avoid messing with Object.defineProperty but my other approaches didn't work:

  1. I tried using Object.watch(). It will detect property assignments but won't detect property creation with Object.defineProperty so it's not an option.
  2. I tried to set window.navigator = new Proxy(window.navigator, {...}) and proxy property creation there, but it doesn't seem to be possible to overwrite window.navigator.
  3. I tried creating my own webdriver attribute on the navigator with a custom get and set property that would destroy the attribute the first time it changes, but subsequent Object.defineProperty calls don't make use of the current getters and setters.

Extending from (3) I think it might be possible to define our own property and have some special handling on deletion that prevents the subsequent re-creation of the property...but my current approach seems cleaner.
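The limitation behind approach (3) can be reproduced in isolation. This is a sketch under the assumption that a plain object (`fakeNavigator`) behaves like `navigator` for this purpose; the counter `setterCalls` is only there to make the behavior observable.

```javascript
// Stand-in for window.navigator (assumption).
const fakeNavigator = {};
let setterCalls = 0;

// Pre-create an accessor property, hoping its setter will see later writes.
Object.defineProperty(fakeNavigator, 'webdriver', {
  configurable: true,
  get() { return undefined; },
  set(_value) { setterCalls += 1; }
});

// A plain assignment does go through the setter.
fakeNavigator.webdriver = true;
const callsAfterAssignment = setterCalls;

// Redefining the property replaces the accessor with a data property.
// The setter is not invoked at all.
Object.defineProperty(fakeNavigator, 'webdriver', { value: true });
const callsAfterRedefinition = setterCalls;

console.log(callsAfterAssignment, callsAfterRedefinition, fakeNavigator.webdriver);
```

The setter runs once for the assignment and not at all for the redefinition, after which the property simply holds `true`. Marking the accessor as non-configurable would stop the redefinition, but the later call would then throw a `TypeError`, which is itself a visible difference from a standard browser.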

@gunesacar
Contributor

Thanks for addressing all the comments, @englehardt.
Looks good to me.

Conflicts:
	automation/Extension/firefox/openwpm.xpi
@englehardt englehardt merged commit 87f74a7 into master Nov 23, 2016
@englehardt englehardt deleted the selenium_bot_detection branch November 23, 2016 23:15
@RonnieBlade
RonnieBlade commented Jun 20, 2018

Hi! @gunesacar Is it possible to remove webdriver attributes from Firefox controlled by Selenium using C#?
