(change): expose most of the scraper code
- bugs were occurring in the scraper code and it felt weird to fix them
  and not be able to commit or version control that data
- in the future, when multiple providers are supported and the API is
  somewhat stable, probably want to split this out into its own
  library / sdk and have the app depend on the SDK instead of doing
  its own parsing
- still gitignore site names and URLs for now
  - and rename exposed files to numbers instead of names for now
    - might use acronyms or something in the future for multi-provider
      support and letting the user choose providers
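
A minimal sketch of how numbered drivers might be registered and selected once multiple providers are supported. Everything here is hypothetical: `createRegistry`, `REQUIRED_FUNCTIONS`, and the stub driver are illustrative names, not the contents of the gitignored `allDrivers.js`. The only thing taken from this commit is the set of function names a driver exports.

```javascript
// Hypothetical list of functions every driver is assumed to export,
// mirroring the exports of driver1.js in this commit.
const REQUIRED_FUNCTIONS = ['getSearch', 'getLatest', 'getChapters', 'getPages', 'getImage']

// Hypothetical registry mapping a driver ID (a number for now) to a driver.
function createRegistry () {
  const drivers = new Map()

  return {
    // Register a driver, rejecting it if any required function is missing.
    register (id, driver) {
      const missing = REQUIRED_FUNCTIONS.filter(
        (name) => typeof driver[name] !== 'function'
      )
      if (missing.length > 0) {
        throw new Error(`driver ${id} is missing: ${missing.join(', ')}`)
      }
      drivers.set(id, driver)
    },
    // Look up the driver the user chose; throws on an unknown ID.
    get (id) {
      if (!drivers.has(id)) throw new Error(`unknown driver: ${id}`)
      return drivers.get(id)
    },
    // List the registered IDs in insertion order.
    ids () {
      return [...drivers.keys()]
    }
  }
}

// Usage with a stand-in driver whose functions all return empty results.
const stub = () => []
const registry = createRegistry()
registry.register(1, {
  getSearch: stub,
  getLatest: stub,
  getChapters: stub,
  getPages: stub,
  getImage: stub
})
```

Keying the registry by an opaque ID keeps site names out of the committed code, and the same shape would still work if the IDs later became acronyms.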
agilgur5 committed Nov 22, 2019
1 parent 8d568fc commit fbfdb89
Showing 3 changed files with 55 additions and 2 deletions.
2 changes: 0 additions & 2 deletions models/.gitignore

This file was deleted.

3 changes: 3 additions & 0 deletions models/scraperDrivers/.gitignore
@@ -0,0 +1,3 @@
# don't commit site names or URLs for now
allDrivers.js
driver*URLs.js
52 changes: 52 additions & 0 deletions models/scraperDrivers/driver1.js
@@ -0,0 +1,52 @@
export { searchURL, latestURL } from './driver1URLs.js'

export function getSearch ($) {
  return $('.post-list li')
    .map((index, el) => ({
      link: $(el).find('a').attr('href'),
      title: $(el).find('img').attr('title'),
      cover: $(el).find('img').attr('src')
    }))
    .get()
}

export function getLatest ($) {
  return $('.post')
    .map((index, el) => ({
      link: $(el).find('a').attr('href'),
      title: $(el).find('img').attr('title'),
      cover: $(el).find('img').attr('src'),
      release: $(el).find('em').text()
    }))
    .get()
}

export function getChapters ($) {
  const title = $('.manga-detail-top .title').text().trim()
  const chapters = $('.chlist a')
    .map((index, el) => {
      // match() returns null when the link text has no digits, so check
      // the result before indexing and fall back to '0'
      const number = $(el).text().match(/[0-9]+/)
      return {
        // only rewrite a leading protocol-relative '//'
        link: $(el).attr('href').replace(/^\/\//, 'http://'),
        title: number ? number[0] : '0',
        date: new Date($(el).find(':not(.newch)').text())
      }
    })
    .get()
  const tags = $('.manga-genres li')
    .map((index, el) => $(el).text().trim())
    .get()
  const summary = $('.manga-summary').text().trim()
  return { title, chapters, tags, summary }
}

export function getPages ($) {
  return $('.mangaread-page option')
    .map((index, el) => ({
      // only rewrite a leading protocol-relative '//'
      link: $(el).attr('value').replace(/^\/\//, 'http://')
    }))
    .get()
}

export function getImage ($) {
  return $('#viewer img').attr('src')
}
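
The link rewriting and chapter-number extraction in `getChapters` and `getPages` could be factored into small helpers that are testable without loading any HTML. This is a sketch under assumptions, not part of the commit: `absoluteLink` and `chapterNumber` are hypothetical names, and defaulting to `http:` simply mirrors the `'http://'` prefix used in the driver.

```javascript
// Hypothetical helper: turn a protocol-relative link ("//host/path") into an
// absolute one. Links that already carry a protocol are returned unchanged,
// and a missing attribute (undefined) yields null instead of throwing.
function absoluteLink (href, protocol = 'http:') {
  if (typeof href !== 'string') return null
  return href.startsWith('//') ? protocol + href : href
}

// Hypothetical helper: extract the first run of digits from a chapter label,
// returning the fallback when the label contains no digits.
function chapterNumber (label, fallback = '0') {
  const match = String(label).match(/[0-9]+/)
  return match ? match[0] : fallback
}
```

Inside a driver these would be called as `absoluteLink($(el).attr('href'))` and `chapterNumber($(el).text())`, so a missing attribute or a chapter title without digits no longer aborts the whole `.map()` pass.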
