[InstagramTagExtractor] Error: Download untagged images when downloading #2659

Closed
frank3215 opened this issue Jun 5, 2022 · 1 comment

For example, I downloaded from #inart using:

gallery-dl --write-metadata https://www.instagram.com/explore/tags/inart/ -u my_username -p my_password --verbose

Although the first few images are indeed tagged #inart, at some point the downloaded images have nothing to do with the tag.

For instance, this is gallery-dl's output for the 149th downloaded image:

[urllib3.connectionpool][debug] Starting new HTTPS connection (53): scontent-sea1-1.cdninstagram.com
[urllib3.connectionpool][debug] https://scontent-sea1-1.cdninstagram.com:443 "GET /v/t51.2885-15/285809758_512970987285184_3795440801648580192_n.webp?stp=dst-jpg_e35&_nc_ht=scontent-sea1-1.cdninstagram.com&_nc_cat=104&_nc_ohc=HsGxgQEf_LsAX8iGuEv&edm=AMKDjl4BAAAA&ccb=7-5&ig_cache_key=Mjg1MzA1NjE5ODA1Mjk0MTc3NA%3D%3D.2-ccb7-5&oh=00_AT-MKwUUTmRXu75ngbZsJariIe7qi-LA-BJ72dWwmVF6Zg&oe=62A2637A&_nc_sid=1fe099 HTTP/1.1" 200 204685
./gallery-dl/instagram/tag/inart/2853056198052941774.jpg

Checking the original URL for 2853056198052941774.jpg shows that the post is indeed not tagged #inart.

The generated .json file for 2853056198052941774.jpg is as follows, for reference:

{
    "category": "instagram",
    "date": "2022-06-04 08:20:24",
    "display_url": "https://scontent-sea1-1.cdninstagram.com/v/t51.2885-15/285809758_512970987285184_3795440801648580192_n.webp?stp=dst-jpg_e35&_nc_ht=scontent-sea1-1.cdninstagram.com&_nc_cat=104&_nc_ohc=HsGxgQEf_LsAX8iGuEv&edm=AMKDjl4BAAAA&ccb=7-5&ig_cache_key=Mjg1MzA1NjE5ODA1Mjk0MTc3NA%3D%3D.2-ccb7-5&oh=00_AT-MKwUUTmRXu75ngbZsJariIe7qi-LA-BJ72dWwmVF6Zg&oe=62A2637A&_nc_sid=1fe099",
    "extension": "jpg",
    "filename": "285809758_512970987285184_3795440801648580192_n",
    "fullname": "Daniel K",
    "height": 1084,
    "media_id": "2853056198052941774",
    "num": 1,
    "owner_id": 48406547440,
    "post_id": "2853056198052941774",
    "post_shortcode": "CeYF375LA_O",
    "shortcode": "CeYF375LA_O",
    "subcategory": "tag",
    "tag": "inart",
    "tagged_users": [],
    "username": "richketodiet99",
    "video_url": null,
    "width": 1080
}
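
To repeat the manual check above for other files, the post URL can be rebuilt from the metadata. The following is a minimal sketch, not part of gallery-dl: the helper names are hypothetical, the URL pattern is taken from the post links elsewhere in this report, and the shortcode-to-ID conversion assumes the commonly observed convention that an Instagram shortcode is the media ID written in a URL-safe base-64 alphabet.

```python
import json
from pathlib import Path

# Assumption: Instagram shortcodes use this URL-safe base-64 alphabet.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_"
)


def load_metadata(path):
    """Read a metadata file written by --write-metadata."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def post_url(metadata):
    """Build the post URL from the 'shortcode' field (hypothetical helper)."""
    return "https://www.instagram.com/p/{}/".format(metadata["shortcode"])


def shortcode_to_id(shortcode):
    """Interpret the shortcode as a base-64 number (assumed convention)."""
    value = 0
    for char in shortcode:
        value = value * 64 + ALPHABET.index(char)
    return value


if __name__ == "__main__":
    meta = {"shortcode": "CeYF375LA_O", "media_id": "2853056198052941774"}
    print(post_url(meta))
    # Sanity check: the shortcode should decode to the media ID in the file name.
    print(shortcode_to_id(meta["shortcode"]) == int(meta["media_id"]))
```

Opening the printed URL in a browser shows the post's caption, which is where the tag would normally appear.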

Downloading from #gawrt shows the same behavior: the first 27 files are correct, but the rest are completely unrelated to the tag.

However, when I downloaded about 600 images from #instagram, all of the images I randomly checked were correct. Maybe this issue mainly affects tags with fewer posts?

Just in case it matters, I am using gallery-dl on an Apple M1 MacBook Pro running macOS Monterey 12.2.1.

frank3215 commented Jun 5, 2022

P.S. The 28th file downloaded from #gawrt comes from this link. As the link shows, the post is NOT tagged #gawrt. Its JSON is:

{
    "category": "instagram",
    "date": "2022-06-04 14:06:29",
    "display_url": "https://scontent-sea1-1.cdninstagram.com/v/t51.2885-15/285677995_155250503734869_4054416472648917107_n.jpg?stp=dst-jpg_e15_s480x480&_nc_ht=scontent-sea1-1.cdninstagram.com&_nc_cat=109&_nc_ohc=EO9Hb_EMRy4AX9oPv0V&edm=AMKDjl4BAAAA&ccb=7-5&ig_cache_key=Mjg1MzIzMDE1MzkyNTQxMzYzMg%3D%3D.2-ccb7-5&oh=00_AT-yvb14v6GsfI7LVSPBDK00qGsCHISdKCT8tZrh8ckvUg&oe=62A33E0C&_nc_sid=1fe099",
    "extension": "mp4",
    "filename": "285846941_1413030355845362_755058295149053200_n",
    "fullname": "ʟɪʟ ᴠɪʙᴇ",
    "height": 576,
    "media_id": "2853230153925413632",
    "num": 1,
    "owner_id": 53347276766,
    "post_id": "2853230153925413632",
    "post_shortcode": "CeYtbU7KEsA",
    "shortcode": "CeYtbU7KEsA",
    "subcategory": "tag",
    "tag": "gawrt",
    "tagged_users": [],
    "username": "vibeposting__",
    "video_url": "https://scontent-sea1-1.cdninstagram.com/v/t50.16885-16/285846941_1413030355845362_755058295149053200_n.mp4?efg=eyJ2ZW5jb2RlX3RhZyI6InZ0c192b2RfdXJsZ2VuLjU3Ni5pZ3R2LmJhc2VsaW5lIiwicWVfZ3JvdXBzIjoiW1wiaWdfd2ViX2RlbGl2ZXJ5X3Z0c19vdGZcIl0ifQ&_nc_ht=scontent-sea1-1.cdninstagram.com&_nc_cat=102&_nc_ohc=Cnc_K4XlCmQAX8S4C4l&edm=AMKDjl4BAAAA&vs=544470010726649_1984068881&_nc_vs=HBksFQAYJEdKMnRDUkh5d0dyRUpBVUZBQkJGRXR1RWdYb0tidlZCQUFBRhUAAsgBABUAGCRHQm1UQkJGMDI0NXNwUzhTQUUxSzNDMVBERFpVYnZWQkFBQUYVAgLIAQAoABgAGwGIB3VzZV9vaWwBMRUAACas%2Foz3pavhPxUCKAJDMywXQBwAAAAAAAAYEmRhc2hfYmFzZWxpbmVfMV92MREAdewHAA%3D%3D&_nc_rid=818f635a41&ccb=7-5&oe=629E812E&oh=00_AT_MEdjOcQTpwioWRjyZOMAVc4aMmoD78d-P9G9EQkcL4Q&_nc_sid=1fe099",
    "width": 576
}

Downloading that post directly with

gallery-dl --write-metadata https://www.instagram.com/p/CeYtbU7KEsA/ -u my_username -p my_password --verbose

gives the following output and JSON:

[gallery-dl][debug] Version 1.22.0 - Git HEAD: 30e3d8883
[gallery-dl][debug] Python 3.9.12 - macOS-12.2.1-arm64-64bit
[gallery-dl][debug] requests 2.18.4 - urllib3 1.22
[gallery-dl][debug] Starting DownloadJob for 'https://www.instagram.com/p/CeYtbU7KEsA/'
[instagram][debug] Using InstagramPostExtractor for 'https://www.instagram.com/p/CeYtbU7KEsA/'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.instagram.com
[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /graphql/query/?query_hash=2efa04f61586458cef44441f474eee7c&variables=%7B%22shortcode%22%3A+%22CeYtbU7KEsA%22%2C+%22child_comment_count%22%3A+3%2C+%22fetch_comment_count%22%3A+40%2C+%22parent_comment_count%22%3A+24%2C+%22has_threaded_comments%22%3A+true%7D HTTP/1.1" 200 4003
[instagram][debug] Active postprocessor modules: [MetadataPP]
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): scontent-sea1-1.cdninstagram.com
[urllib3.connectionpool][debug] https://scontent-sea1-1.cdninstagram.com:443 "GET /v/t50.16885-16/285510174_550575363243087_7951415962826832842_n.mp4?_nc_ht=scontent-sea1-1.cdninstagram.com&_nc_cat=103&_nc_ohc=DuxghbZoIu4AX9t8j45&edm=AP_V10EBAAAA&ccb=7-5&oe=629EBC85&oh=00_AT9ueNkjvw9i4SMk5HoOQwztq5tyD4O524kJ2WTwhv1DJA&_nc_sid=4f375e HTTP/1.1" 200 1058296
{
    "category": "instagram",
    "date": "2022-06-04 14:06:29",
    "description": "Tag prettiest person😼",
    "display_url": "https://scontent-sea1-1.cdninstagram.com/v/t51.2885-15/285677995_155250503734869_4054416472648917107_n.jpg?stp=dst-jpg_e15&_nc_ht=scontent-sea1-1.cdninstagram.com&_nc_cat=109&_nc_ohc=EO9Hb_EMRy4AX9oPv0V&edm=AP_V10EBAAAA&ccb=7-5&oh=00_AT-0y49vxvsGVBQXnOO9kQVjZGHU4jSYvHANGB5M58rOCA&oe=629EA0CC&_nc_sid=4f375e",
    "extension": "mp4",
    "filename": "285510174_550575363243087_7951415962826832842_n",
    "fullname": "ʟɪʟ ᴠɪʙᴇ",
    "height": 640,
    "likes": 13,
    "media_id": "2853230153925413632",
    "owner_id": "53347276766",
    "post_id": "2853230153925413632",
    "post_shortcode": "CeYtbU7KEsA",
    "post_url": "https://www.instagram.com/p/CeYtbU7KEsA/",
    "shortcode": "CeYtbU7KEsA",
    "subcategory": "post",
    "tagged_users": [],
    "typename": "GraphVideo",
    "username": "vibeposting__",
    "video_url": "https://scontent-sea1-1.cdninstagram.com/v/t50.16885-16/285510174_550575363243087_7951415962826832842_n.mp4?_nc_ht=scontent-sea1-1.cdninstagram.com&_nc_cat=103&_nc_ohc=DuxghbZoIu4AX9t8j45&edm=AP_V10EBAAAA&ccb=7-5&oe=629EBC85&oh=00_AT9ueNkjvw9i4SMk5HoOQwztq5tyD4O524kJ2WTwhv1DJA&_nc_sid=4f375e",
    "width": 640
}
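
The post extractor's metadata includes a "description" field (the caption), which the tag extractor's metadata above lacks. As a rough sketch, and not anything gallery-dl provides, a caption can be checked for a hashtag as follows. The function name is hypothetical. Hashtags may also be placed in comments, which this metadata does not contain, so a negative result is only a hint.

```python
import re


def caption_has_hashtag(caption, tag):
    """Return True if the caption contains #tag as a whole hashtag.

    Matching is case-insensitive; '#gawrtx' does not count as '#gawrt'.
    """
    pattern = r"(?<!\w)#" + re.escape(tag) + r"(?!\w)"
    return re.search(pattern, caption or "", flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    # Caption taken from the "description" field of the JSON above.
    print(caption_has_hashtag("Tag prettiest person😼", "gawrt"))
```

For the caption in this report the check finds no #gawrt hashtag, which matches what the post page shows.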

mikf added the bug label Jun 6, 2022
mikf added a commit that referenced this issue Jun 7, 2022
mikf closed this as completed Jun 7, 2022