Special handling for known websites (WP, youtube, ted, etc) #33
No, we've discussed that a while back and apparently we did not create a ticket, but the idea was to have a list of known websites for which we refuse requests and display a message explaining where to find the already-existing ZIMs. Switching scraper is not practical for many reasons; mainly because we have no
Sounds good to me and that was the main point, but then the response message should identify the target and the corresponding ZIM (e.g. "here is the link to en.wikipedia.org's latest ZIM available", not "go to download.kiwix.org/zim and figure it out").
Ideally, yes. It can probably be implemented in two steps so that this gets a chance to be done. At first, we can redirect to the wiki where the files are listed. Or maybe the library with the new kiwix-serve is considered easy enough? The first thing you can do is list the domains and where to point to. It's easy for those we have a category for.
This would have my preference by far, but when I look at the domains requested over the past three months (and this doc), I think we can simply send them to wikipedia_en_all.zim.
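The "list of known websites" idea above could be sketched as a simple lookup that refuses a crawl request and points at an existing ZIM instead. A minimal sketch, assuming a hypothetical `check_request` helper; the domain list and download URLs are illustrative, not the actual zimfarm configuration:

```python
from urllib.parse import urlparse

# Illustrative mapping of known domains to existing ZIM listings
# (assumed values, not the real deny-list).
KNOWN_ZIMS = {
    "en.wikipedia.org": "https://download.kiwix.org/zim/wikipedia/",
    "en.wikibooks.org": "https://download.kiwix.org/zim/wikibooks/",
    "www.youtube.com": "https://download.kiwix.org/zim/other/",
}

def check_request(url: str):
    """Return a refusal message pointing at the existing ZIM, or None."""
    host = urlparse(url).netloc.lower()
    if host in KNOWN_ZIMS:
        return (f"A ZIM for {host} already exists; "
                f"see {KNOWN_ZIMS[host]} for the latest file.")
    return None
```

This matches the two-step plan: the message can first link to a generic listing, and later be refined to point at the exact latest file per target.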
We could have a ZIM metadata "source_url" and then allow library.kiwix.org to filter on it?
Yes, that's an interesting feature for which the default behavior might be tricky: how much matching do you want? Domain? Netloc? Path? Scheme? But yeah, that would be best for us.
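The matching-granularity question above can be made concrete with a small sketch. Assuming a hypothetical `matches` helper (not part of any existing kiwix API), each level compares a different slice of the URL:

```python
from urllib.parse import urlparse

def matches(source_url: str, query_url: str, level: str = "netloc") -> bool:
    """Compare two URLs at a chosen granularity: scheme, domain, netloc, or path."""
    s, q = urlparse(source_url), urlparse(query_url)
    if level == "scheme":
        return s.scheme == q.scheme
    if level == "domain":
        # naive registrable-domain comparison: last two labels only
        return s.netloc.split(".")[-2:] == q.netloc.split(".")[-2:]
    if level == "netloc":
        return s.netloc == q.netloc
    if level == "path":
        return s.netloc == q.netloc and q.path.startswith(s.path)
    raise ValueError(f"unknown level: {level}")
```

For example, en.wikipedia.org and fr.wikipedia.org match at the "domain" level but not at "netloc", which is exactly the default-behavior choice the comment raises.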
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
I see that almost every day (and certainly several times a week) people are submitting requests for Wikipedia, Wikibooks or even YouTube.
Zimit should be able to either a) switch gears and run the corresponding scraper (YouTube), or b) directly offer the latest ZIM available (Wikipedia, Wikibooks).