Remove support for random_attributes and browser_settings #745

Closed
englehardt opened this issue Sep 11, 2020 · 25 comments · Fixed by #775
Labels: good-first-bug (Bugs that are good for a first-time committer to tackle), task (Doesn't change any behaviour)

Comments

englehardt added the task and good-first-bug labels Sep 11, 2020
@ankushduacodes
Contributor

Hi @englehardt, I am interested in working on this task. Could you please guide me on what needs to be done here?

@vringar
Contributor

vringar commented Sep 14, 2020

Hi @ankushduacodes,
First, I'd suggest you familiarize yourself with OpenWPM's general workings by running demo.py and using your favourite SQL client to inspect the SQLite database that will be put onto your desktop.
Afterwards, I'd search for browser_params and try to understand their purpose and how they are used.
Once you have a bit of context, search for ua_string and screen_res to see all the places where the attributes you want to remove are used.
Delete these pieces of code and run the tests to see if, and what, you broke along the way.
If you have any questions, please feel free to come into our Matrix channel and ask.
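
If you would rather inspect the database from Python than from a SQL client, the following is a minimal sketch using only the standard library's sqlite3 module. The function name summarize_tables and the path in the usage note are placeholders, not part of OpenWPM, and the sketch makes no assumption about which tables a crawl creates.

```python
import sqlite3


def summarize_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every object in the database; keep only tables.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # The names come from the database itself, so quoting them is enough.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling summarize_tables("/path/to/your/crawl.sqlite") (a placeholder path) gives a quick overview of what a demo.py run recorded before you open individual tables.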

@ankushduacodes
Contributor

@vringar, I did clone it, but when I ran demo.py it gave me some module errors, and I could not find a requirements.txt. Is there some other way to install the dependencies?

@ankushduacodes
Contributor

Hi @ankushduacodes,
First, I'd suggest you familiarize yourself with OpenWPM's general workings by running demo.py and using your favourite SQL client to inspect the SQLite database that will be put onto your desktop.
Afterwards, I'd search for browser_params and try to understand their purpose and how they are used.
Once you have a bit of context, search for ua_string and screen_res to see all the places where the attributes you want to remove are used.
Delete these pieces of code and run the tests to see if, and what, you broke along the way.
If you have any questions, please feel free to come into our Matrix channel and ask.

@vringar, what are you trying to achieve with this issue? Why do we need to remove some attributes, and which attributes are they?

@vringar
Contributor

vringar commented Sep 14, 2020

@vringar, I did clone it, but when I ran demo.py it gave me some module errors, and I could not find a requirements.txt. Is there some other way to install the dependencies?

I'd suggest reading the installation section of our README.md on how to get set up.

What are you trying to achieve with this issue? Why do we need to remove some attributes, and which attributes are they?

These attributes were originally intended to help us obfuscate the fact that we were running an automated crawl with Firefox.
However, we have found our attempts to be insufficient, and we don't have the resources to build out and maintain such evasion techniques in a reasonable manner.
As such, we want to remove these attributes/config options to shrink the code base and cut unnecessary complexity.

@ankushduacodes
Contributor

Hi @vringar, I tried running demo.py multiple times. It runs properly, except that I get two exceptions:

Exception in thread OpenWPM-watchdog:
Traceback (most recent call last):
  File "/Users/ankushdua/opt/anaconda3/envs/openwpm/lib/python3.8/site-packages/psutil/_psosx.py", line 342, in wrapper
    return fun(self, *args, **kwargs)
  File "/Users/ankushdua/opt/anaconda3/envs/openwpm/lib/python3.8/site-packages/psutil/_psosx.py", line 484, in memory_full_info
    uss = cext.proc_memory_uss(self.pid)
PermissionError: [Errno 13] Access denied (originated from task_for_pid)

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/Users/ankushdua/opt/anaconda3/envs/openwpm/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/Users/ankushdua/opt/anaconda3/envs/openwpm/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/ankushdua/Documents/GitHub/OpenWPM/automation/TaskManager.py", line 232, in _manager_watchdog
    mem_bytes += child.memory_full_info().uss
  File "/Users/ankushdua/opt/anaconda3/envs/openwpm/lib/python3.8/site-packages/psutil/__init__.py", line 1094, in memory_full_info
    return self._proc.memory_full_info()
  File "/Users/ankushdua/opt/anaconda3/envs/openwpm/lib/python3.8/site-packages/psutil/_psosx.py", line 349, in wrapper
    raise AccessDenied(self.pid, self._name)
psutil.AccessDenied: psutil.AccessDenied (pid=69556)

I can see that these exceptions are related to each other. Any idea what this is about?

(PS: I do get the database on my desktop, though.)

@vringar
Contributor

vringar commented Sep 17, 2020

Hey @ankushduacodes,
Unfortunately, I have no idea what causes this, but seeing as it's "only" the watchdog dying, the actual working of the platform should be unaffected.
Could you please file a bug for this and point out that you are using macOS? I unfortunately don't have the hardware to investigate this issue, but maybe we can hack around it.
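
One possible shape for such a workaround, purely as a sketch and not the fix the project adopted: the traceback shows the watchdog thread dying on `mem_bytes += child.memory_full_info().uss`, so the summing loop could skip children it is not allowed to inspect. The function name sum_child_uss and its skip_errors parameter are hypothetical.

```python
def sum_child_uss(children, skip_errors=(PermissionError,)):
    """Sum USS memory over child processes, skipping unreadable ones.

    `children` is any iterable of objects exposing memory_full_info(),
    such as psutil.Process instances. Returns (total_bytes, skipped_count).
    """
    total = 0
    skipped = 0
    for child in children:
        try:
            total += child.memory_full_info().uss
        except skip_errors:
            # The OS refused access (or the process vanished); do not let
            # one unreadable child kill the whole watchdog thread.
            skipped += 1
    return total, skipped
```

With psutil, the call would look something like sum_child_uss(psutil.Process().children(recursive=True), skip_errors=(psutil.AccessDenied, psutil.NoSuchProcess)). The trade-off is that the reported total under-counts memory whenever children are skipped, so the skipped count is worth logging.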

@vringar
Contributor

vringar commented Sep 28, 2020

@ankushduacodes How is this coming along? Do you need any kind of support from our side?

@ankushduacodes
Contributor

@vringar, I haven't been able to get around to this issue. I have been quite busy for the last two weeks and will be for another few weeks. I will ask for more info whenever I get around to dealing with it. I hope that's fine.

@vringar
Contributor

vringar commented Sep 28, 2020

Of course that's fine! Sorry if I seemed pushy. I just wanted to check in and make sure that you weren't blocked on something from our side.

@ankushduacodes
Contributor

@vringar oh no no, don't worry about it, you weren't pushy.
I really appreciated that gesture of making sure I wasn't stuck. Thank you for that 😇

@ankushduacodes
Contributor

@vringar Hi, thank you for your patience on this issue. I will start working on it soon.
I was wondering whether it is okay to use Python 3.9 for this issue. Is it okay to use new features introduced in 3.9?

@vringar
Contributor

vringar commented Oct 28, 2020

I would like to say yes to using Python 3.9 features, but I think it's a bit too early for that version: conda (our package manager of choice) is currently using Python 3.8, so you shouldn't use these features yet.

@ankushduacodes
Contributor

Okay @vringar, got it. I was also wondering: are we just removing every place where there are variables named "random_attributes", "browser_settings", "browser_params", "ua_string" and "screen_res", or are we modifying how they are used?

Also, can you please point me to the file that is the entry point to the whole program? (I'm looking for the main function here.)

@vringar
Contributor

vringar commented Oct 29, 2020

We want to remove all pieces of code that use "random_attributes", "ua_string" and "screen_res"; the browser_params as a whole will stay.
We have a demo.py for demonstrating the usage of OpenWPM.
Since we only provide a platform, there isn't a main function per se, but I think
https://github.com/mozilla/OpenWPM/blob/e989ce5f74c1737185c89a33bde4015b55df066a/automation/TaskManager.py#L620-L671
is a good entry point.
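
To get a list of every place those identifiers appear before deleting anything, a small search script can help. The following is a minimal sketch using only the standard library; the function name find_usages is made up for illustration, and it does plain substring matching, so it will also report comments and docstrings.

```python
from pathlib import Path

# Identifiers named in this thread as slated for removal.
TARGETS = ("random_attributes", "ua_string", "screen_res")


def find_usages(root, targets=TARGETS, suffix=".py"):
    """Return (path, line_number, stripped_line) for each line mentioning a target."""
    hits = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(target in line for target in targets):
                hits.append((str(path), number, line.strip()))
    return hits
```

Running find_usages(".") from the repository root and printing the result gives a checklist to work through; an editor's project-wide search or grep does the same job.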

@ankushduacodes
Contributor

@vringar is there a way to run all the tests at once, or do I have to run each test manually?

(PS: I am sorry if this seems to be a stupid question; this is my first time contributing to a big project like this.)

@vringar
Contributor

vringar commented Oct 31, 2020

@ankushduacodes don't worry about asking questions! I'm happy to help.
To run the tests you first need to:

  • Have the conda environment activated
  • Change into the test directory
  • Run EDGE_PORT=9999 pytest -vv

Please note that our tests are quite slow: on my machine they can take up to 45 min. It's completely reasonable to push and see if CI passes, since the tests are parallelized there.

@ankushduacodes
Contributor

@vringar I have been running the tests on my local machine, and so far one of the tests has failed: test_crawl.py::TestCrawl::test_browser_profile_coverage XFAIL
Any insight?

@vringar
Contributor

vringar commented Oct 31, 2020

XFAIL means that the test is expected to fail, so you don't have to worry about it.
There was a lot of functionality we used to support but no longer do. We kept the tests around and hope to restore that functionality someday.
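
For reference, this is roughly how a test gets marked as an expected failure in pytest. It is a generic sketch, not the actual OpenWPM test: the test name and the reason string are invented for illustration.

```python
import pytest


@pytest.mark.xfail(reason="feature no longer supported; test kept for later")
def test_unsupported_feature():
    # This assertion fails today, so pytest reports XFAIL rather than FAILED.
    assert False, "functionality has not been restored yet"
```

If such a test unexpectedly starts passing, pytest reports it as XPASS, which is a useful hint that the mark can be removed.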

@ankushduacodes
Contributor

@vringar I have just made a pull request to run the tests on CI, as per your recommendation. Multiple tests have failed. Could you please have a look at what's going on?

@vringar
Contributor

vringar commented Oct 31, 2020

Our CI configuration also enforces that both the code and the tests are formatted uniformly.
I'd suggest that you run
pre-commit install and then pre-commit run --all-files

@ankushduacodes
Contributor

@vringar All the tests have passed. Please review pull request #775.

@ankushduacodes
Contributor

@vringar Do you have any more beginner-friendly issues that I can work on, not just in OpenWPM but in any projects that you are looking after? I would love to work on them.

@vringar
Contributor

vringar commented Nov 2, 2020

@ankushduacodes Thank you for expressing interest in further working with me. I currently spend most of my time working on OpenWPM, so most issues I can offer you will be here.
I don't have anything I could immediately assign to you once your PR is done, but I'm planning on landing a rather big change (#753) soon. Once that PR has landed, there will be a lot of new tests and documentation to be written, where I think you could learn a lot of new things and help make OpenWPM a better tool.
For now please have a look at the comments I left on your PR.

@ankushduacodes
Contributor

ankushduacodes commented Nov 2, 2020

@vringar I would love to take part in more issues related to OpenWPM. Please ping me whenever there is something I can help with.
I will also keep an eye out for issues that I can help with.
Thank you

vringar pushed a commit that referenced this issue Nov 4, 2020
Zaxeli pushed a commit to Zaxeli/OpenWPM that referenced this issue Aug 10, 2021