Explore running smaller, automated crawls in CI to detect regressions #49
To minimize crawl-by-crawl variation while still resembling a real crawl (without actually crawling real URLs, which would be impolite), we ought to set up an isolated, mirrored snapshot of a set of URLs to crawl. Ideally this set should be fairly large, at least 1k URLs, and the crawl should be executed in GCP with Kubernetes and Docker to resemble the way production crawls are executed.
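As a rough sketch of what serving such a snapshot could look like (the directory layout, port, and the use of `wget --mirror` to capture pages are assumptions for illustration, not anything OpenWPM ships):

```python
# Minimal sketch: serve a pre-mirrored set of pages from a local static
# directory so CI crawls never touch live sites. The snapshot layout
# (one subdirectory per host) is a hypothetical convention.
import functools
import http.server
import socketserver

SNAPSHOT_DIR = "./snapshot"  # hypothetical: pages captured e.g. with `wget --mirror`
PORT = 8000

Handler = functools.partial(
    http.server.SimpleHTTPRequestHandler, directory=SNAPSHOT_DIR
)

class SnapshotServer(socketserver.TCPServer):
    allow_reuse_address = True  # lets CI restart the server quickly

with SnapshotServer(("", PORT), Handler) as httpd:
    # crawlers visit http://localhost:8000/<host>/<path> instead of live URLs
    httpd.serve_forever()
```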
I think we should explore running the webserver that we use in local testing in its own Docker container, and then have the crawlers run against it in a docker-compose environment.
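A hedged sketch of what that compose setup might look like; the service names, image, and both commands are assumptions for illustration, and the real setup would reuse OpenWPM's existing test webserver and crawler entrypoints:

```yaml
# Hypothetical docker-compose sketch: one container for the test
# webserver, one for the crawler that runs against it.
version: "3"
services:
  testserver:
    build: .
    command: python -m http.server 8000   # stand-in for the local test webserver
    ports:
      - "8000:8000"
  crawler:
    build: .
    depends_on:
      - testserver
    # hypothetical CI crawl script, pointed at the test server by hostname
    command: python ci_crawl.py --base-url http://testserver:8000
```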
Note that we can …
In the past we've run small test crawls to check for regressions in profile handling. This test has since been disabled alongside the rest of our stateful crawling support (see: https://github.com/mozilla/OpenWPM/projects/2).
As proposed in #28 (review), CI-only crawls would be helpful in detecting regressions in the overall site crash rate, timeout rate, and error rate. Setting this up wouldn't be entirely straightforward, so I'm opening this issue for discussion purposes.
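One way such a CI check could work is to compute the failure rates from the crawl database after the crawl finishes and fail the build if they exceed a baseline. A minimal sketch, assuming a `crawl_history` table with a `command_status` column as in recent OpenWPM schemas; the schema details and the threshold are assumptions, not settled decisions:

```python
# Sketch: compute command-level error/timeout rates from an OpenWPM
# SQLite database and fail CI if they exceed an agreed-on threshold.
import sqlite3
import sys

MAX_FAILURE_RATE = 0.05  # hypothetical CI threshold

conn = sqlite3.connect("crawl-data.sqlite")
total, = conn.execute("SELECT COUNT(*) FROM crawl_history").fetchone()
bad, = conn.execute(
    "SELECT COUNT(*) FROM crawl_history "
    "WHERE command_status IN ('error', 'timeout', 'neterror')"
).fetchone()
conn.close()

rate = bad / total if total else 0.0
print(f"{bad}/{total} failed commands ({rate:.1%})")
if rate > MAX_FAILURE_RATE:
    sys.exit(1)  # non-zero exit marks the CI run as a regression
```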