Add reason newly posted youtube video and medium posts #4595
base: master
Nice idea. Have you tested the code?
@Daniil-M-beep I would be happy if someone could help me test the code. The problem is that I am not able to create useful test accounts.
@user12986714 Ok. I'll try and organise some testing later today.
@user12986714 Apologies, it might not be today, but I'll try and get it done relatively soon.
I have tested this, and neither of the two rules works.
I really like the idea, but I'm skeptical of rolling our own YouTube scraper, because the page markup is very much subject to change and a scraper might end up being troublesome to maintain down the road. There may be a couple of more stable alternatives:
I agree it would be a great idea if we can use the API, however it requires coordination w.r.t. API keys and limits the extensibility of such detection mechanisms. For example, it would be great if we later expand to other blog sites like blogspot or something else.
@user12986714 Those are both true. However, we've had to do API key deployment in the past (e.g. for Perspective), and it's pretty simple:
As far as extensibility goes: yes, we're writing special code to use the YouTube API, but that doesn't stop us from using regexes on Medium or Blogspot. I'm worried about using regex specifically on YouTube because YT is not scraper-friendly and I would prefer to stay out of that cat-and-mouse game.
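For illustration only, here is a rough sketch of what the API route could look like. It assumes the YouTube Data API v3 `videos.list` endpoint with `part=snippet`, whose response carries `snippet.publishedAt`. The function names, the one-day threshold, and the key handling are hypothetical and are not taken from this PR. The HTTP call itself is left out so the parsing logic can be checked offline.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumption: YouTube Data API v3 videos.list endpoint.
API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_request_url(video_id, api_key):
    """Build a videos.list request that asks only for the snippet part."""
    query = urlencode({"part": "snippet", "id": video_id, "key": api_key})
    return API_URL + "?" + query


def published_at(response_text):
    """Return the upload time from a videos.list JSON response, or None."""
    items = json.loads(response_text).get("items", [])
    if not items:
        return None  # no such video, or it is not visible to the API key
    stamp = items[0]["snippet"]["publishedAt"]  # e.g. "2020-08-01T12:00:00Z"
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def is_newly_posted(response_text, now, max_age=timedelta(days=1)):
    """True if the video was published within max_age before now.

    The one-day default is a hypothetical threshold, not a value from the PR.
    """
    when = published_at(response_text)
    return when is not None and timedelta(0) <= now - when <= max_age
```

A caller would fetch `build_request_url(video_id, api_key)` with whatever HTTP client the project already uses and pass the response body to `is_newly_posted` together with the current UTC time.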
I've just taken a closer look at the Medium one, too. I'm concerned by the However, there's a MUCH easier way to get the date out of a Medium post. Every Medium post has the following meta-tag:
Finally, since we already have BeautifulSoup, I'd suggest using that instead of regex to parse HTML, as it will be easier and more reliable.
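To make the parser-over-regex point concrete, here is a minimal sketch of pulling a date out of a meta tag with an HTML parser. Two assumptions are mine, not the thread's: the property name `article:published_time` (the Open Graph article property; the exact tag referred to above is not reproduced here), and the use of the standard library's `html.parser` so the sketch has no dependencies. With BeautifulSoup the same lookup would be roughly `soup.find("meta", attrs={"property": "article:published_time"})`.

```python
from html.parser import HTMLParser


class MetaTagFinder(HTMLParser):
    """Remember the content of the first <meta> tag with a given property."""

    def __init__(self, wanted_property):
        super().__init__()
        self.wanted_property = wanted_property
        self.content = None

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <meta ... />.
        if tag != "meta" or self.content is not None:
            return
        attributes = dict(attrs)
        if attributes.get("property") == self.wanted_property:
            self.content = attributes.get("content")


def meta_content(html, wanted_property="article:published_time"):
    """Return the content attribute of the matching meta tag, or None.

    The default property name is an assumption, labeled as such above.
    """
    finder = MetaTagFinder(wanted_property)
    finder.feed(html)
    return finder.content
```

Unlike a regex, this does not depend on attribute order, quoting style, or whitespace inside the tag.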
This issue has been closed because it has had no recent activity. If this is still important, please add another comment and find someone with write permissions to reopen the issue. Thank you for your contributions.
This PR tries to catch newly posted YouTube videos.