Fix For SexbabesVR Scraper #1847

Merged
merged 6 commits into from
Oct 11, 2024

Conversation

pops64
Contributor

@pops64 pops64 commented Sep 27, 2024

The scene ID on the scene webpage now seems to be 614 for all scenes, causing all scenes to be rescraped and new scenes never to be added.

This pulls the poster URL, which appears to have a unique identifier in the last directory.

Also updated the cover URL to pull the image used for the thumbnail on the index page, as the latest scene has an SBS image for the cover, whereas the thumbnail contains a more useful image.

All appears functional. Might require deleting all scraped SexbabesVR scenes, because the shift in scene ID causes file mismatches and scene preview generation hiccups.

Added migration code.

The scene ID on the webpage now seems to be 614 for all scenes, causing all scenes to be rescraped and new scenes never to be added.

This pulls the poster URL, which appears to have a unique identifier in the second-to-last directory.
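
For illustration, the idea of reading an identifier out of the second-to-last directory of the poster URL could be sketched as below. This is not the code from the PR: the helper name `dirFromEnd` and the `example.com` URL layout are hypothetical, and the real poster URLs may be structured differently.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// dirFromEnd returns the n-th directory from the end of the URL path.
// n=1 is the directory that contains the file, n=2 is its parent, and so on.
// Hypothetical helper for illustration only.
func dirFromEnd(rawURL string, n int) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	// Drop the file name, then split the remaining directories.
	dir := strings.Trim(path.Dir(u.Path), "/")
	if dir == "" || dir == "." {
		return "", fmt.Errorf("no directories in path %q", u.Path)
	}
	parts := strings.Split(dir, "/")
	if n < 1 || n > len(parts) {
		return "", fmt.Errorf("path %q has only %d directories", u.Path, len(parts))
	}
	return parts[len(parts)-n], nil
}

func main() {
	// Assumed URL shape; the identifier sits in the second-to-last directory.
	poster := "https://example.com/media/videos/615/poster/cover.jpg"
	id, err := dirFromEnd(poster, 2)
	if err != nil {
		fmt.Println("could not extract id:", err)
		return
	}
	fmt.Println("sexbabesvr-" + id)
}
```

Returning an error instead of an empty string lets the caller skip a scene whose poster URL has an unexpected shape rather than store a bad scene ID.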

Also updated the cover URL to pull the image used for the thumbnail on the index page, as the latest scene has an SBS image for the cover, whereas the thumbnail contains a more useful image.

All appears functional.
There are three separate variations in how the site posts this information, depending on the age of the scene. A random sampling over all scenes shows that the synopsis is being scraped successfully.
It ran once; I am unsure how to test it properly, though.
@crwxaj
Collaborator

crwxaj commented Oct 10, 2024

The scraper update looks fine, but I think we should get rid of the migration. It takes forever without actually changing the scene IDs, and in my case it even led to a duplicate for some reason:

...
[dev:go] time="2024-10-10T21:24:10+02:00" level=info msg="Updating sceneid: sexbabesvr-417 to sexbabesvr-417"
[dev:go] time="2024-10-10T21:24:11+02:00" level=info msg="Updating sceneid: sexbabesvr-418 to sexbabesvr-418"
[dev:go] time="2024-10-10T21:24:12+02:00" level=info msg="Updating sceneid: sexbabesvr-419 to sexbabesvr-419"
[dev:go] time="2024-10-10T21:24:13+02:00" level=info msg="Updating sceneid: sexbabesvr-420 to sexbabesvr-421"
[dev:go] time="2024-10-10T21:24:14+02:00" level=info msg="Updating sceneid: sexbabesvr-421 to sexbabesvr-421"
...

@pops64
Contributor Author

pops64 commented Oct 11, 2024

According to KLH, some scenes change scene IDs starting around 610. Tomorrow, I will add logic to check only scenes starting at 600 and to skip the update when the ID matches what is already present. This should greatly cut down on the migration time, because only the newer scenes get a little wonky.

Added some error handling in case the website is unreachable.

Added logic to ensure we only check scenes originating from SexBabesVR. Only scenes starting at 600 are checked, as this is where the reported divergence between the scene ID sources' numbering occurred, and only scenes whose IDs diverge are updated.
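
A minimal sketch of the filter described above, not the merged migration code. The function name `needsUpdate` and the constants are hypothetical; the `sexbabesvr-` prefix is taken from the log output earlier in this thread, and the threshold of 600 from the comment above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const (
	scenePrefix    = "sexbabesvr-" // prefix as seen in the migration log
	firstSuspectID = 600           // assumed cut-off, per the comment above
)

// needsUpdate reports whether a stored scene ID should be rewritten.
// It skips scenes from other sites, scenes numbered below the cut-off,
// and scenes whose freshly scraped ID matches the stored one.
func needsUpdate(oldID, newID string) bool {
	if !strings.HasPrefix(oldID, scenePrefix) {
		return false
	}
	n, err := strconv.Atoi(strings.TrimPrefix(oldID, scenePrefix))
	if err != nil || n < firstSuspectID {
		return false
	}
	return oldID != newID
}

func main() {
	// Hypothetical old/new ID pairs for illustration.
	pairs := [][2]string{
		{"sexbabesvr-417", "sexbabesvr-417"},
		{"sexbabesvr-613", "sexbabesvr-614"},
		{"othersite-700", "othersite-701"},
	}
	for _, p := range pairs {
		fmt.Printf("%s -> %s: update=%v\n", p[0], p[1], needsUpdate(p[0], p[1]))
	}
}
```

Checking the cheap conditions (site prefix and numeric cut-off) before comparing IDs means most scenes can be skipped without any database write.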
@pops64
Contributor Author

pops64 commented Oct 11, 2024

Done

@crwxaj crwxaj merged commit 322e62e into xbapps:master Oct 11, 2024
1 check passed
@pops64 pops64 deleted the SexbabesVR_Fix branch October 11, 2024 23:45
2 participants