Any plans to update the app? #2
Heya, thanks for the kind reply! I'm actually quite surprised (but also very glad) that the app can still be of use today :). As the app fulfilled my needs at the time, I had paused any development and eventually moved on to other things. Knowing that there is a request for added features gives me motivation to start it up again. Number of watches, appearances on lists, etc. should all be fairly simple additions, I think. What did you have in mind for running the scraper on main pages? I'd like to keep it focused on user-specific content, but I'll see what's possible. Unfortunately I'm quite busy until the end of August, so any development will be slow until then. -Arno |
It's definitely been a godsend for me personally... It makes it so much easier to go through these massive lists and pick movies easily. I care about the number of watches because it would allow me to find actual movies. (Sometimes you'll find several shorts, specials, and musicals in a massive list, and then discover that these outnumber the actual movies, no matter how strictly you try to filter with what Letterboxd gives you.) Another possible request would be to output the Top 4 from a given list of users (might be far-fetched, but it would definitely be kinda cool). As for the main pages, I previously used a custom-made scraper that let you grab just the pages you want.
You could easily have it query only the specific pages that you wanted. This scraper has been far more efficient and quicker than the others that I've used. Really hope to see some updates to it soon! |
Just created an account to say thanks for making this, I've been using it to scrape lists based on podcasts and making some stats displays around them (https://letterblocksd.com - still WIP, but I'll be adding the letterboxd stats collections too). I also want to second meanjoep92's request for watchcount. I'm currently pulling "popularity" from TMDB, but that seems kind of fickle and strange. I'd love to have access to the LB watchcount for movies alongside the other info your tool scrapes. Thanks again! |
Thank you for the kind words :). Your website looks amazing! Kudos for creating such a creative and interesting site haha. Looking forward to seeing how it develops. Luckily, I finally have some more time to work on this project again. I've just released an update of the scraper and it now has a lot more functionality (i.e. data columns for watches, likes, list appearances, genres, languages, countries). I hope this can be useful for your own projects in the short term. In the meantime I'll think about implementing more features that would fit this app. Of course, let me know if you have any requests and/or issues. Thanks! |
Very happy to see that this has been updated! I especially love that there are now ways to see the runtimes and the number of watches! (It makes picking films even easier now) ❤️ Would anyone know if it's still possible to scrape certain pages beyond just lists and watchlists? Similar to what I asked months ago, I had a scraper where you could put your own custom query of pages in the brackets, and it would give you info similar to what this scraper gives.
I found that the workaround was to just copy and paste what you see from a desired page, and then put it in a private list to run the scraper on. Just didn't know if there was any capability to do it since manually trying to copy it is always prone to error. Thank you, and y'all are awesome! ❤️❤️ |
Thanks for the update! This is great timing for me, as I was planning to try my hand at extending this next week (and that likely wouldn’t have gone well ;) One thing I was going to add that didn’t get into this update is the fan count (the number displayed above the rating, e.g. ‘1.2k FANS’). It looks like there’s not a more precise number once they go over 1k, and not every movie has them, but that’s an interesting piece of LB data that’s present on the page. Two small issues that persist with the new version:
Thanks again! This is great! |
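For anyone who wants that fan count as a number, here is a minimal sketch of turning a label such as ‘1.2k FANS’ into an integer. The function name `parse_fan_count` and the accepted suffixes are assumptions for illustration, not part of the scraper. Because the page only shows an abbreviated figure above 1k, the result is only as precise as the label.

```python
import re
from typing import Optional

# Multipliers for the abbreviated suffixes; "m" is an assumption, the thread only shows "k".
_SUFFIX = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_fan_count(label: str) -> Optional[int]:
    """Convert a label like '1.2k FANS' or '850 fans' to an int (hypothetical helper).

    Returns None when the text contains no recognisable fan count,
    which covers films that do not display one at all.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s*([km]?)\s*fans?", label, flags=re.IGNORECASE)
    if match is None:
        return None
    number, suffix = match.groups()
    # round() guards against floating-point noise such as 1199.9999999.
    return round(float(number) * _SUFFIX[suffix.lower()])
```

Usage would look like `parse_fan_count("1.2k FANS")`, which gives the count rounded to the nearest hundred, since that is all the label carries.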
Oh, one other feature I'd really appreciate would be the ability to call the function directly, or pass it a list of URLs and a target directory. I'm sure this can be done somehow, and I'm going to try to figure it out next week, but considering the programmatic use case instead of a user with a text prompt would make this tool even more versatile. (Like I said, I'm sure this is already possible somehow, I'm just new ;) |
I put in a fix (4 characters!) for the letterboxd url issue. The https://letterboxd.com/film/reel-5/ issue is actually a problem with letterboxd itself: they're not escaping the quotes correctly in the alt text. I think addressing that edge case is well outside the scope of this project. Thanks again! |
I've been fighting with github for a while now, but maybe I submitted a pull request for a larger change? Hopefully? Your call if you want to merge it. I'm sure there's cleanup that could be done, or best practices to follow, but I wanted to at least share the functionality that I've been using. My fork maintains the original functionality, but if there's a list of letterboxd urls (and optional filenames) in a text file, it will scrape all those lists (up to 4 concurrently) without user input, which is very handy for my use case. Thanks for the functionality! |
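The batch idea described above can be sketched roughly as follows. This is not the code from the fork: the file format (one URL per line, optionally followed by a filename), the names `parse_job_lines`, `run_jobs` and `fake_scrape`, and the use of a thread pool are all assumptions for illustration. The real scraping function would take the place of the `scrape` argument.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional, Tuple

Job = Tuple[str, Optional[str]]  # (list URL, optional output filename)


def parse_job_lines(lines: Iterable[str]) -> List[Job]:
    """Parse lines of the assumed form '<url> [filename]', skipping blanks and # comments."""
    jobs: List[Job] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(maxsplit=1)
        filename = parts[1] if len(parts) > 1 else None
        jobs.append((parts[0], filename))
    return jobs


def run_jobs(
    jobs: List[Job],
    scrape: Callable[[str, Optional[str]], str],
    max_workers: int = 4,
) -> List[str]:
    """Run scrape(url, filename) for every job, at most max_workers at a time.

    pool.map returns results in the same order as the input jobs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: scrape(*job), jobs))


def fake_scrape(url: str, filename: Optional[str]) -> str:
    """Stand-in for the real scraper (hypothetical): only reports what it would do."""
    return f"{url} -> {filename or 'default.csv'}"
```

With a text file on disk, the lines could come from `open(path, encoding="utf-8")`, and `run_jobs(parse_job_lines(f), fake_scrape)` would process every list without any prompt. Threads are a reasonable fit here because the work is mostly waiting on network responses.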
Unfortunately, the fancount isn't in the source html, it's generated, so probably not something to be done with this script |
Apologies for the delay in responding. I've had quite a busy schedule, but have been able to dedicate some time to this project again since last week. The result has been the new v2.0.0 update. Regarding the requests of @meanjoep92:
Besides lists and watchlists, the scraper now also reads user films (
With the new program version you can now scrape certain pages of a list like
Regarding the requests of @BeSweets:
The new scraper has functionality for scraping the fan count, although only in the whole hundreds (i.e.
The film title scraping of e.g. https://letterboxd.com/film/reel-5/ was fixed by changing the way it finds the title in the HTML code. This was luckily quite easy to do and I hope the scraper can now correctly read all film titles. Thanks for this tip!
The scraper can now be called directly from the command line and it is very easy to supply it with a list of URLs from a .txt file by using the
Thank you guys for your involvement and contribution to the project. I'm sorry for the lack of communication and the overall slow development, but I hope the new scraper can still be of (even better) use! :) (Also please don't hesitate to bring up any new suggestions/issues with the scraper!) |
So happy that this scraper has been updated tremendously since I've made this! ❤️ This thing is going to be a blessing for my movie watching. Hope these updates never stop. |
Hey there! Might be a silly question, but is there a way to have the synopsis of each film scraped as well? Just curious! Thanks! |
Hey! Yes certainly possible and easy to implement (don't know why I have not added this earlier haha). I added the change to the new release v2.2.0. |
Man, you are the absolute best! Thank you so much!!! |
This might be another silly request (I greatly apologize for asking so many random things because this scraper has been a life saver for me) but is there a way to possibly grab the URL for the image of the movie poster? (Or is that even possible?) Thanks!! |
Hi certainly no silly requests here! I left this thread open exactly for requests such as this. Adding the movie poster URL is certainly possible. It has been on the TODO list for quite some time. I'm gonna check it out when I have some time next week :) |
So I genuinely tried to get the posters using BeautifulSoup, but to no avail...would Selenium be the only way to do this since it's served through React? UPDATE: At the top of the
Within the
I'm sure the creators might have a better solution, but only wanted this because I really take my movie-watching seriously. :3 |
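One generic way to look for a poster URL without a browser is to check the static HTML for an Open Graph image tag. The sketch below is an assumption-heavy illustration, not the approach from the comment above and not the scraper's own method: it assumes the page includes a `<meta property="og:image">` tag, which this thread does not confirm, and it uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra installs. The names `OgImageParser` and `find_og_image` are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class OgImageParser(HTMLParser):
    """Collects the content of the first <meta property="og:image"> tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.image_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching meta tag is kept.
        if tag != "meta" or self.image_url is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "og:image":
            self.image_url = attr_map.get("content")


def find_og_image(html: str) -> Optional[str]:
    """Return the og:image URL from an HTML string, or None if the tag is absent."""
    parser = OgImageParser()
    parser.feed(html)
    return parser.image_url


# Made-up sample markup; a real page would be downloaded first.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Some Film" />
<meta property="og:image" content="https://example.com/posters/some-film.jpg" />
</head><body></body></html>
"""
```

If the tag is missing from the downloaded HTML, `find_og_image` returns `None`, which would suggest the image is added by JavaScript and needs a different approach.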
Hey there, absolutely loving the hell out of this.
Is there any plan to add any more features to this app? (Like finding the number of watches, appearances on lists, etc)
Would also love to be able to run the scraper on the main pages.
Keep up the good work!