[paheal] Paheal's metadata is broken and incomplete #2641

Closed
Athari opened this issue May 30, 2022 · 1 comment

Athari commented May 30, 2022

When I run:

    gallery-dl --dump-json "https://rule34.paheal.net/post/list/The_Lion_King_II%3A_Simba%27s_Pride/1"

I get entries like this:

    {
      "category": "paheal",
      "file_url": "https://tulip.paheal.net/_images/42d3c148468101f78ae6c4be3b33f<...>.jpg",
      "height": 1536,
      "id": 4423277,
      "md5": "42d3c148468101f78ae6c4be3b33f1a9",
      "search_tags": "The_Lion_King_II:_Simba's_Pride",
      "size": 0,
      "subcategory": "tag",
      "tags": "Simba The_Lion_Guard The_Lion_King The_Lion_King_II:_Simba&#039;s_Pride Vitani kaion",
      "width": 2200
    }

Issues:

  1. [bug] The tags field contains HTML-escaped text: The_Lion_King_II:_Simba&#039;s_Pride. This seems to happen when searching for a tag, but not when downloading a post directly. I haven't tested it thoroughly.
  2. [bug] The size field is sometimes 0. Conversely, this seems to happen in the post subcategory, but not in the tag subcategory.
  3. [feature-request] The source_link field is missing from the metadata, even though it is a really important piece of information.
  4. [feature-request] The upload_date field (and possibly uploader_name) is missing. The 'Last-Modified' HTTP header seems to match the value, but I'd rather have the data directly in the metadata if the website provides it.
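
Until a fix is available, the escaped tags from item 1 can be decoded on the consumer side. The following is a minimal sketch, not part of gallery-dl: `unescape_tags` is a hypothetical helper name, and it assumes the tags field is a space-separated string as in the entry above. It relies only on `html.unescape` from the Python standard library.

```python
import html
from typing import List


def unescape_tags(tags: str) -> List[str]:
    """Decode HTML entities in a space-separated tag string and split it.

    Hypothetical helper: assumes tags are separated by whitespace,
    as in the metadata entry shown in this issue.
    """
    return html.unescape(tags).split()


# Tag string copied from the metadata entry in this issue.
raw = ("Simba The_Lion_Guard The_Lion_King "
       "The_Lion_King_II:_Simba&#039;s_Pride Vitani kaion")

tags = unescape_tags(raw)
print(tags[3])
```

Strings without entities pass through `html.unescape` unchanged, so the helper should also be safe to apply to output that is already unescaped.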

gallery-dl version 1.21.2. (I noticed the new 1.22 release, but Paheal isn't mentioned in its list of changes.)

mikf added a commit that referenced this issue Jun 1, 2022
- unescape 'tags'
- add 'date', 'source', and 'uploader' for single posts
mikf added a commit that referenced this issue Jun 4, 2022

mikf (Owner) commented Jun 7, 2022

Done in 61fa9b5 and 4b78bd4, and part of v1.22.1.

You need to enable the metadata option to get full metadata for tag search URLs. Single posts work without it.

    gallery-dl -o metadata=1 --dump-json "https://rule34.paheal.net/post/list/The_Lion_King_II%3A_Simba%27s_Pride/1"
    {
      "category": "paheal",
      "date": "2021-06-28 03:23:01",
      "extension": "jpg",
      "file_url": "https://tulip.paheal.net/_images/42d3c148468101f78ae6c4be3b33f1a9/4423277%20-%20Simba%20The_Lion_Guard%20The_Lion_King%20The_Lion_King_II%3A_Simba%27s_Pride%20Vitani%20kaion.jpg",
      "filename": "4423277 - Simba The_Lion_Guard The_Lion_King The_Lion_King_II:_Simba's_Pride Vitani kaion",
      "height": 1536,
      "id": 4423277,
      "md5": "42d3c148468101f78ae6c4be3b33f1a9",
      "search_tags": "The_Lion_King_II:_Simba's_Pride",
      "size": 282931,
      "source": "https://inkbunny.net/s/2477229",
      "subcategory": "tag",
      "tags": "Simba The_Lion_Guard The_Lion_King The_Lion_King_II:_Simba's_Pride Vitani kaion",
      "uploader": "TheFapper19",
      "width": 2200
    }
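
For scripts that consume entries like the one above, a small post-processing step can turn the string fields into typed values. This is only a sketch under stated assumptions: `normalize_entry` is a hypothetical name, the date format is inferred from the single sample in this thread, and treating a size of 0 as "unknown" is a design choice suggested by the original report rather than documented gallery-dl behavior.

```python
import html
from datetime import datetime


def normalize_entry(entry: dict) -> dict:
    """Return a copy of a metadata entry with typed, cleaned-up fields.

    Hypothetical helper; field names follow the entry shown in this issue.
    """
    result = dict(entry)

    # Decode any leftover HTML entities and split tags into a list.
    result["tags"] = html.unescape(entry.get("tags", "")).split()

    # Assumed format, based on the sample value "2021-06-28 03:23:01".
    # The time zone is not stated in the thread, so the datetime stays naive.
    date = entry.get("date")
    if date:
        result["date"] = datetime.strptime(date, "%Y-%m-%d %H:%M:%S")

    # Assumption: a size of 0 means the size was not extracted.
    if not result.get("size"):
        result["size"] = None

    return result


# Subset of the fields from the entry above.
sample = {
    "date": "2021-06-28 03:23:01",
    "id": 4423277,
    "size": 282931,
    "tags": "Simba The_Lion_Guard The_Lion_King "
            "The_Lion_King_II:_Simba's_Pride Vitani kaion",
}

entry = normalize_entry(sample)
print(entry["date"].year, len(entry["tags"]), entry["size"])
```

Entries produced without the metadata option have no date field; the helper leaves them without one instead of inventing a value.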

mikf closed this as completed Jun 7, 2022