Use scraped site as referrer rather than a "hack" #2067
Comments
@audiodude I guess you already know my opinion on this! ;)
Sorry, I was confused because mwoffliner scrapes many wikis, not just WMF ones. But I suppose that the only wikis with map tiles will be the ones that are hosted on valid WMF domains, so this could definitely work.
If anyone is using mwoffliner to scrape non-Wikimedia wikis and also attempting to consume map tiles from the Wikimedia tile server as part of that scraping, they should get the 403 rejection unless their hosting is at an authorized domain. This is the point of using "the very poor protection" of checking the HTTP referrer. We want the Wikimedia movement to be able to make use of the Wikimedia tile server, but we also have limited resources to devote to that tile server and cannot scale it to serving map tiles for anyone and everyone on the Internet. The currently implemented solution looks exactly like a bad-faith actor deciding to circumvent the loose protections we have implemented. You read technical details about the service that we published in good faith to provide transparency for the Wikimedia community and then weaponized them against the very project that y'all claim to be attempting to advance. It is not a good look.
Yes exactly. That's the original point that I missed: that we shouldn't expect map tiles to show up on say Minecraft Wiki, and that if they do, we shouldn't expect them to work.
To be fair, our discussion does indicate that we saw the current solution as a temporary workaround while we sought to obtain the proper permissions. I think you should consider our opening of the Phabricator ticket and bringing the issue to light to be a good-faith effort towards that goal.
Thanks for the quick attention, folks. I'm also sorry if I was overly aggressive in my criticisms of the original workaround. My bad days shouldn't be everyone else's problem.
Why not use the actual site/page you are scraping as the referrer instead of this fiction? That would make your scraper function in the same way as any normal user-agent consuming the wiki content and allow you to avoid being seen as deliberately violating the hotlinking protections used on the Wikimedia content farm.
Originally posted by @bd808 in #2062 (comment)
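For illustration, here is a rough sketch of what the suggestion above could look like in a scraper. It is not mwoffliner's actual code: `buildTileRequest`, the `TileRequest` shape, the tile URL, and the user-agent string are all hypothetical names made up for this example. The only idea taken from the thread is that the `Referer` header carries the URL of the page actually being scraped.

```typescript
// Hypothetical helper, not part of mwoffliner: describes a map tile request
// whose Referer is the wiki page that embeds the tile.
interface TileRequest {
  url: string;
  headers: Record<string, string>;
}

function buildTileRequest(
  tileUrl: string,
  articleUrl: string,
  userAgent: string,
): TileRequest {
  // new URL() throws on malformed input, so a bad page URL fails loudly
  // instead of silently sending a bogus referrer.
  const page = new URL(articleUrl);

  // Browsers never include the fragment or credentials in Referer,
  // so strip them to behave like a normal user agent.
  page.hash = '';
  page.username = '';
  page.password = '';

  return {
    url: tileUrl,
    headers: {
      'User-Agent': userAgent,
      Referer: page.toString(),
    },
  };
}

// Usage: the tile host below is a placeholder, not the real tile server.
const example = buildTileRequest(
  'https://tiles.example.org/osm/6/33/22.png',
  'https://en.wikipedia.org/wiki/Paris#History',
  'example-scraper/1.0',
);
console.log(example.headers.Referer);
```

With this approach a scrape of a non-Wikimedia wiki would send that wiki's own URL as the referrer, so the tile server's check would reject it with a 403, which is the behavior described earlier in the thread.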