[Suggestion] EH JSON from API #1325

Closed
w158297 opened this issue Feb 18, 2021 · 5 comments

@w158297

w158297 commented Feb 18, 2021

Would it be possible to add an option to let the JSON data written for E-Hentai downloads be from their official API? https://ehwiki.org/wiki/API
The current implementation unfortunately leaves out a lot of data, although it does provide information about parent and visibility, which the API doesn't.

Also, gallery-dl -j $gallery_page for EH galleries starts going through all the pages instead of just showing the data for the requested URL.

@mikf
Owner

mikf commented Feb 19, 2021

> leaves out a lot of data

"archiver_key", "category", "thumb", "uploader", "rating", and "torrentcount" from what I can tell. And the total file size is only a rough estimate. Everything else just has a different name.
"category", "thumb", "uploader", "rating", and "torrentcount" would be easy enough to get from the gallery HTML page as well; only "archiver_key" would be missing.

I wouldn't mind adding an option to (also) get the metadata from the API, but that's 1 extra HTTP request which might not be necessary.
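For reference, the extra request could look roughly like the sketch below. It is not gallery-dl's code: the endpoint, the `gdata` method, and the `gmetadata` response key follow the EHWiki API page linked above, while the URL pattern and the function names `build_gdata_payload` and `fetch_gallery_metadata` are assumptions made for illustration.

```python
import json
import re
import urllib.request

# Endpoint as described on the EHWiki API page (assumption: unchanged since).
API_URL = "https://api.e-hentai.org/api.php"

# Assumed gallery URL shape: .../g/<gid>/<token>/
GALLERY_RE = re.compile(r"/g/(\d+)/([0-9a-f]+)")


def build_gdata_payload(gallery_url):
    """Build the JSON body for a 'gdata' request from a gallery URL."""
    match = GALLERY_RE.search(gallery_url)
    if match is None:
        raise ValueError("not a gallery URL: " + gallery_url)
    gid, token = match.groups()
    return {
        "method": "gdata",
        "gidlist": [[int(gid), token]],
        "namespace": 1,
    }


def fetch_gallery_metadata(gallery_url):
    """Send the one extra HTTP request and return the gallery's metadata dict."""
    body = json.dumps(build_gdata_payload(gallery_url)).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["gmetadata"][0]
```

Only `fetch_gallery_metadata` touches the network; `build_gdata_payload` can be checked offline.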

> Also, gallery-dl -j $gallery_page for EH galleries starts going through all the pages instead of just showing the data for the requested URL.

Use -K or -j --range 0
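If you are calling gallery-dl from a script, the two invocations above could be wrapped as in this small sketch. The flags are the ones named in this comment; the helper names `metadata_command` and `dump_metadata` are hypothetical, and the sketch assumes `gallery-dl` is on `PATH`.

```python
import subprocess


def metadata_command(url, list_keywords=False):
    """Return the argv for printing gallery metadata without walking every page."""
    if list_keywords:
        # -K lists the available keywords and their values for the URL
        return ["gallery-dl", "-K", url]
    # -j dumps JSON; --range 0 selects no images, so image pages are skipped
    return ["gallery-dl", "-j", "--range", "0", url]


def dump_metadata(url, list_keywords=False):
    """Run gallery-dl and return whatever it prints to stdout."""
    result = subprocess.run(
        metadata_command(url, list_keywords),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```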

@w158298

w158298 commented Feb 19, 2021

That would also work, thank you very much!

> … from what I can tell

Also, if there are torrents, it gives you the hash, name, and torrent and file size for each of them. Additionally, the upload time is missing seconds, which doesn't matter, I guess.

On a side note, about the current gallery-dl JSON: what does the "subcategory" entry refer to?
(And for some reason there are two entries for language.)

mikf added a commit that referenced this issue Feb 24, 2021
to select between gallery metadata from 'api' or 'html'
@hellupline
Contributor

Now that we are adding more info from the EH API, is it possible to re-download only the metadata for already downloaded galleries?

mikf added a commit that referenced this issue Feb 26, 2021
- gallery_id    -> gid
- gallery_token -> token
- title_jp      -> title_jpn
- visible       -> expunged
- gallery_size  -> filesize
- count         -> filecount

Also changes the function of the 'metadata' option.
It is now boolean and causes extra data fields from the API to be added
instead of completely replacing the data from HTML when activated.
@mikf
Owner

mikf commented Feb 26, 2021

Extracting a gallery_id wasn't the problem, but all format strings still expected the old metadata names, e.g. gallery_id instead of gid from the API. I've now renamed the old metadata fields to match their API counterparts, so this doesn't happen anymore (61fbbd2) and the metadata option is now a simple boolean value.

@hellupline I recommend --range 0 to skip all images, but this will still cost 1 image limit point per gallery.

@hellupline
Contributor

By the way, I created a gist to upgrade existing metadata files to use gid, token, and the other new field names.
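For readers without access to the gist, the key renames from the Feb 26 commit message can be applied with something like the sketch below. This is not the gist's code. It only renames keys; whether any values also changed meaning (for example between `visible` and `expunged`) is not stated in this thread, so values are copied unchanged, and the function name `upgrade_metadata` is hypothetical.

```python
# Old -> new key names, copied from the commit message in this thread.
RENAMES = {
    "gallery_id": "gid",
    "gallery_token": "token",
    "title_jp": "title_jpn",
    "visible": "expunged",
    "gallery_size": "filesize",
    "count": "filecount",
}


def upgrade_metadata(old):
    """Return a copy of a metadata dict with legacy keys renamed.

    Values are copied as-is (assumption: only the names changed).
    If a new-style key is already present, it wins over the legacy one.
    """
    new = {key: value for key, value in old.items() if key not in RENAMES}
    for old_key, new_key in RENAMES.items():
        if old_key in old and new_key not in new:
            new[new_key] = old[old_key]
    return new
```

To convert a file, load it with `json.load`, pass the dict through `upgrade_metadata`, and write the result back with `json.dump`.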

mikf closed this as completed Dec 3, 2022
5 participants
@hellupline @mikf @w158297 @w158298 and others