Apply xpath parseDate after subScraper #606

Merged 1 commit on Jun 15, 2020


@bnkai (Collaborator) commented Jun 9, 2020

Moves parseDate processing so that it runs after the subScraper step.
This is needed so that dates can be fetched from external URLs.
One example is the BangBros site, which shows scene dates in its search results but not on the scene pages.
The BangBros scraper can then be adjusted like this:

name: "BangBros"
sceneByURL:
  - action: scrapeXPath
    url:
      - https://bangbros.com/
    scraper: sceneScraper
xPathScrapers:
  sceneScraper:
    scene:
      Title: //div[@class="ps-vdoHdd"]/h1/text()
      Details: //div[@class="vdoDesc"]/text()
      Tags:
        Name:
          selector: //div[@class="vdoTags"]/a/text()
      Performers:
        Name: //div[@class="vdoCast"]/a[position()>1]/text()
      Image:
        selector: //img[@id="player-overlay-image"]/@src
        replace:
          - regex: ^
            with: "https:"
      Studio:
        Name: //div[@class="vdoCast"]/a[1]/text()
      Date:
        selector: //div[@class="vdoCast" and contains(text(), "Release:")]
        replace:
          - regex: "^Release: "
            with: "https://bangbros.com/search/"
        subScraper:
          selector: //span[@class="thmb_mr_cmn thmb_mr_2 clearfix"]/span[@class="faTxt"]
        parseDate: Jan 2, 2006
        
# Last Updated June 9, 2020
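
The ordering the Date field above relies on can be sketched roughly as follows. This is only an illustration in Python, not the code from this PR: the function names, the `FAKE_PAGES` lookup that stands in for fetching the search page and applying the subScraper selector, and the `abc123` identifier are all made up. The format string `%b %d, %Y` is assumed to be the Python counterpart of the `Jan 2, 2006` layout, and the ISO output format is also an assumption.

```python
import re
from datetime import datetime

# Hypothetical stand-in for the search page: maps a URL to the text that the
# subScraper selector would extract from it. A real scraper would fetch the
# page over HTTP and evaluate the XPath selector instead.
FAKE_PAGES = {
    "https://bangbros.com/search/abc123": " Jun 9, 2020 ",
}


def apply_replace(value, rules):
    """Apply each (regex, replacement) pair in order, like the replace list."""
    for pattern, replacement in rules:
        value = re.sub(pattern, replacement, value)
    return value


def run_sub_scraper(url, fetch):
    """Treat the current value as a URL and return the text scraped from it."""
    return fetch(url).strip()


def parse_date(value, fmt="%b %d, %Y"):
    """Parse the scraped text as a date and normalise it (assumed ISO output)."""
    return datetime.strptime(value, fmt).strftime("%Y-%m-%d")


def post_process(raw, rules, fetch):
    """Order described in this PR: replace, then subScraper, then parseDate."""
    value = apply_replace(raw, rules)       # "Release: abc123" -> search URL
    value = run_sub_scraper(value, fetch)   # search URL -> "Jun 9, 2020"
    return parse_date(value)                # "Jun 9, 2020" -> "2020-06-09"


rules = [(r"^Release: ", "https://bangbros.com/search/")]
result = post_process("Release: abc123", rules, FAKE_PAGES.__getitem__)
print(result)
```

If parseDate ran before the subScraper step, it would be handed the search URL rather than the date text, and parsing would fail. That is why the step has to come last when the date lives on an external page.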

@bnkai added the "improvement" label (Something needed tweaking.) on Jun 9, 2020
@WithoutPants (Collaborator) left a comment

Makes sense.

@WithoutPants merged commit f40e234 into stashapp:develop on Jun 15, 2020
Tweeticoats pushed a commit to Tweeticoats/stash that referenced this pull request Feb 1, 2021