TED recipes based on --topics are not working anymore #149
Comments
The first obvious thing is that filtering by language no longer works. However, it seems to be linked to a change in the UI (as far as I remember, the UI did not look like this the last time I visited the website), so I'm not sure the rest will work either.
I confirm that not supplying a

A second issue (hidden) is that the topic page (e.g. https://www.ted.com/talks?sort=relevance&topics%5B0%5D=Design) does not accept a

Are you aware of any new way to retrieve this list of videos filtered by topic? It looks like we could plug directly into the underlying API used on the page, even if this is probably as fragile as parsing the HTML.
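For reference, the topic page URL quoted above can be rebuilt for any topic with the standard library. This is only a minimal sketch: `build_topic_url` is a hypothetical helper, not something from the scraper's codebase, and it assumes the site keeps accepting the `sort` and `topics[0]` query parameters seen in that example URL.

```python
from urllib.parse import urlencode

# Base URL taken from the example link in this thread.
TED_TALKS_URL = "https://www.ted.com/talks"


def build_topic_url(topic: str, sort: str = "relevance") -> str:
    """Return the talks listing URL filtered by a single topic.

    Hypothetical helper: the parameter names mirror the example URL
    (`sort` and `topics[0]`) and may stop working if the site changes.
    """
    # urlencode percent-encodes the brackets, so "topics[0]" becomes "topics%5B0%5D".
    query = urlencode({"sort": sort, "topics[0]": topic})
    return f"{TED_TALKS_URL}?{query}"


if __name__ == "__main__":
    print(build_topic_url("Design"))
```

This only covers building the URL; whether the page or its underlying API still returns a usable list of talks is exactly what remains to be checked.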
I believe the scraper uses both, because the internal API was introduced later and some information was easier to access from it, but it has already changed in the past (hence the emphasis on internal). It still appears to be a better strategy than the DOM. It's understood that those scrapers are fragile, and as long as the site doesn't change multiple times per day, adapting is an acceptable effort.
I did not find any reference to an internal API in the current codebase. Do you remember what it was used for? (I probably simply missed it.) I only found scraping of the playlists or tasks page + using JSON found in every video page in a special

Note that we should probably not fix this until the discussion on #150 has settled.
I don't recall, but if you look at the online website, you'll see that every playlist and/or talk calls a JSON file that has the details. I believe we get some data out of it, but I haven't looked at this code base in a long time.
See https://farm.openzim.org/recipes?category=ted