Retrieving metadata to construct Deviantart Gallery structure? #419

Closed
cloudywings2 opened this issue Sep 12, 2019 · 8 comments


cloudywings2 commented Sep 12, 2019

First of all, my current Deviantart extractor settings are as follows:

        "deviantart":
        {
            "username": "******",
            "password": "******",
            "client-id": "******",
            "client-secret": "******",
            "refresh-token": "******",
            "flat": true,
            "journals": "html",
            "folders": true,
            "mature": true,
            "original": true,
            "wait-min": 0,
            "filename": "{filename}.{extension}",
            "directory": ["DeviantArt"],
            "metadata": true,

            "gallery":
            {
                "directory": ["DeviantArt", "{author[username]}", "Gallery"]
            },

            "favorite":
            {
                "directory": ["DeviantArt", "{username}", "Favorites"]
            },

            "postprocessors": [{
                "name": "metadata",
                "mode": "json"
            }]
        },

I am trying to reconstruct the folder structure of the entire gallery. From what I gather it's impossible to get all the files within a gallery without downloading it using the flat structure, so I have a separate external script to check the json metadata files and copy them to the proper folders.
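
For readers attempting the same approach, here is a minimal sketch of what such an external script might look like. It is not the poster's script. It assumes that every download `name.ext` sits next to a sidecar `name.ext.json` and that the sidecar carries a `folders` list of gallery folder names; both the sidecar naming and the field name are assumptions here, and `plan_copies`, `apply_plan`, and `_unsorted` are hypothetical names.

```python
import json
import shutil
from pathlib import Path


def plan_copies(download_dir, target_root):
    """Return (source, destination) pairs derived from metadata sidecars.

    Assumption: "name.ext" has a sidecar "name.ext.json" whose "folders"
    key holds a list of gallery folder names.
    """
    download_dir, target_root = Path(download_dir), Path(target_root)
    plan = []
    for sidecar in sorted(download_dir.glob("*.json")):
        media = sidecar.with_suffix("")  # "name.ext.json" -> "name.ext"
        if not media.is_file():
            continue  # sidecar without a matching download
        meta = json.loads(sidecar.read_text(encoding="utf-8"))
        # Files that belong to no folder go to a catch-all directory.
        for folder in meta.get("folders") or ["_unsorted"]:
            safe = str(folder).replace("/", "_")  # keep it a single path part
            plan.append((media, target_root / safe / media.name))
    return plan


def apply_plan(plan):
    """Copy every planned file, creating folders as needed."""
    for source, destination in plan:
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)
```

Splitting planning from copying makes it possible to print and inspect the plan before any file is touched.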

However, I can't figure out how to properly extract all of the files within a user's account. If I try to use the root of an artist's page, such as the url

https://www.deviantart.com/shimoda7/

this downloads the Gallery only, ignoring favorites. I can process a second version of each URL that targets the Favorites folder, but when I do so, the JSON file does not contain the Collections the file is a part of. It seems the only way to retrieve this information is by manually retrieving the path of each folder and extracting them one-by-one, which is impractical if you're trying to process a batch of galleries.

Is there any way I can get all the files within a user's gallery, all the favorites in that user's favorites, and the metadata containing the folders they are within?

edit: oops i made a typo in the title 😳

edit2: This is a separate issue but I believe it's also related to having improper settings. I tried to download a gallery with these settings, and after about 18 hours I got 300 images before it froze up. I know using the folders = true option is supposed to take a long time, but is this behavior normal? This seems to be an exceptionally long time.

It also seems that "Literature" posts no longer work. If I try
gallery-dl https://www.deviantart.com/gliitchlord/art/brashstrokes-812942668
the only output I get is a jpg. I was previously able to download them, but no longer.

mikf added a commit that referenced this issue Sep 19, 2019
Some journal-like posts are not reported to be journals (isJournal
is set to False), even though they have a textContent field.

https://www.deviantart.com/gliitchlord/art/brashstrokes-812942668
Owner

mikf commented Sep 25, 2019

You should remove the global "folders": true (and "flat": true) options and only apply them for the sub-categories where they are actually necessary, i.e.:

	"gallery": 
	{ 
		"directory" : ["DeviantArt" , "{author[username]}" , "Gallery"],
		"folders": true
	},

	"favorite": 
	{ 
		"directory" : ["DeviantArt", "{username}", "Favorites"],
		"flat": false
	},

To explain:
The folders option gets the deviation's folder locations in its artist's own gallery, so using this for a collection of favorites will scan the galleries of each individual artist, and that, as you may have noticed, is not a good idea.
"flat": false for the favorite sub-category will spawn collection extractors for each favorite collection instead of bundling them all into one big "package". As a result, each deviation gets a collection metadata field containing the title of its favorite folder, like

      "collection": {
        "index": "70595441",
        "owner": "pencilshadings",
        "title": "3D Favorites",
        "uuid": "F050486B-CB62-3C66-87FB-1105A7F6379F"
      },

(although the index field will always be 0 thanks to the shortcomings of the OAuth API that's currently used)
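
To illustrate how that field could be consumed afterwards, here is a minimal sketch that groups metadata files by the `title` inside their `collection` field, as in the excerpt above. The function name `group_by_collection` is hypothetical, and the sketch assumes the metadata files are JSON files with a `.json` extension somewhere below one directory.

```python
import json
from collections import defaultdict
from pathlib import Path


def group_by_collection(metadata_dir):
    """Map each favorite collection title to the metadata files naming it."""
    groups = defaultdict(list)
    for sidecar in sorted(Path(metadata_dir).rglob("*.json")):
        meta = json.loads(sidecar.read_text(encoding="utf-8"))
        collection = meta.get("collection") or {}
        # Files without a collection field are grouped under None.
        groups[collection.get("title")].append(sidecar.name)
    return dict(groups)
```

Grouping on `title` rather than `index` sidesteps the always-zero index mentioned above.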

The issue with "Literature" posts such as https://www.deviantart.com/gliitchlord/art/brashstrokes-812942668 should also be fixed (01bc7ad)

@cloudywings2
Author

Thanks for replying! I'll try this out ASAP and get back to you!

@cloudywings2
Author

Okay, I reinstalled and tried the program again with your settings. Unfortunately, any attempt to download a Favorites link just dumps everything in the root directory, and the favorite "directory" option is ignored. My settings are as follows:

        "username": "*****",
        "password": "*****",
        "client-id": "*****",
        "client-secret": "*****",
        "refresh-token": "*****",
        "journals": "html",
        "mature": true,
        "original": true,
        "wait-min": 0,
        "filename": "{filename}.{extension}",
        "directory": ["DeviantArt"],
        "metadata": true,

        "gallery":
        {
            "directory": ["DeviantArt", "Users", "{author[username]}", "Gallery"],
            "folders": true
        },

        "favorite":
        {
            "directory": ["DeviantArt", "Users", "{username}", "Favorites"],
            "flat": false
        },

        "postprocessors": [{
            "name": "metadata",
            "mode": "json"
        }]

Owner

mikf commented Sep 26, 2019

My bad, sorry about that.
Since the individual favorite folders get handled by collection extractors when using "flat": false, gallery-dl looks for the directory setting in the "collection" subsection of your config and falls back to the general "directory":["DeviantArt"] when it can't find one.

Essentially you'll have to replace the "favorite" section in your config with


		"favorite": 
		{ 
			"flat": false
		},
		"collection": 
		{ 
			"directory" : ["DeviantArt", "Users", "{username}", "Favorites", "{collection[title]}"]
		},
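
Putting the advice from this thread together, the relevant sub-category sections could be generated like this. It is only a sketch: the function name `build_deviantart_config` is hypothetical, nesting the block under a top-level `extractor` key is an assumption about the overall config file layout (the thread only shows the `deviantart` part), and credentials, filename, and postprocessor settings are left out.

```python
import json


def build_deviantart_config(base_dir="DeviantArt"):
    """Return a config dict with the sub-category settings from this thread."""
    gallery_dir = [base_dir, "Users", "{author[username]}", "Gallery"]
    collection_dir = [base_dir, "Users", "{username}", "Favorites",
                      "{collection[title]}"]
    return {
        "extractor": {  # assumed top-level key of the config file
            "deviantart": {
                "directory": [base_dir],  # general fallback directory
                # "folders" only where it is needed: the artist's gallery
                "gallery": {"directory": gallery_dir, "folders": True},
                # one collection extractor per favorite collection
                "favorite": {"flat": False},
                # directory used by those collection extractors
                "collection": {"directory": collection_dir},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_deviantart_config(), indent=4))
```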

Owner

mikf commented Sep 26, 2019

And to actually come back to your initial question: To get everything from a DeviantArt profile, you'll have to run gallery-dl with 4 different URLs:

The "top-level" URL (https://www.deviantart.com/shimoda7/) gets handled as if it were a /gallery URL, assuming that most users would only want to download an artist's gallery and nothing else. There are plans on changing that (or at least adding an option where you can select what you want to download) when using such a "top-level" URL as input.
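
Running gallery-dl once per URL can be scripted. The sketch below is one hedged way to do it: `build_commands` and `run_all` are hypothetical helper names, the URLs you pass in are your own choice (the ones in the usage note are assumed forms for illustration only), and treating a non-zero exit code as a failed run is also an assumption.

```python
import subprocess


def build_commands(urls, executable="gallery-dl"):
    """Return one command line (as an argument list) per URL."""
    return [[executable, url] for url in urls]


def run_all(urls, runner=subprocess.run):
    """Run one download per URL and return {url: exit_code}."""
    results = {}
    for command in build_commands(urls):
        # The URL is the last argument of each command.
        results[command[-1]] = runner(command).returncode
    return results
```

For example, `run_all(["https://www.deviantart.com/shimoda7/gallery", "https://www.deviantart.com/shimoda7/favourites"])` would start two separate downloads and report which of them ended with a non-zero exit code.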

Author

cloudywings2 commented Sep 27, 2019

There are plans on changing that (or at least adding an option where you can select what you want to download) when using such a "top-level" URL as input.

This would be very helpful! But everything seems to be working now! :D

Edit: Actually, on large galleries (200+ images) it seems to lose track of images. Sometimes I get OAuth errors that terminate the whole download, which leaves me with an incomplete gallery and no way to recover. Is there a way for me to validate whether or not I have everything?

Edit2: Actually I realized it's this #273 but this issue says it's been fixed. Has Deviantart changed something since then?

Owner

mikf commented Sep 27, 2019

Those OAuth errors are most likely due to your refresh-token chain being broken, so you now get errors whenever gallery-dl tries to find private images at the very end of a gallery, but that's only a guess. More information, for example a --verbose log, would be helpful in figuring out what's going on.

Anyway, try getting a new refresh-token and see if that helps. It should be noted that the refresh-token implementation for DeviantArt requires a persistent cache file. If you are on Windows, deleting your temporary files will also delete your cache file and invalidate your current refresh-token.

Is there a way for me to validate whether or not I have everything?

Not really, but if gallery-dl finishes without error, you should have gotten everything.

Edit2: Actually I realized it's this #273 but this issue says it's been fixed. Has Deviantart changed something since then?

It has, but it shouldn't be the same error again. If your error looks like

[deviantart][info] Refreshing private access token
[deviantart][error] HTTP request failed:  400: Bad Request for url: https://www.deviantart.com/oauth2/token

it's your refresh-token. Otherwise I'd need more information.
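
If you keep logs of your runs, a check for this specific pattern can be automated. The following is a minimal sketch that only looks for the error line quoted above (a 400 response from the OAuth token endpoint); the function names are hypothetical, and it assumes the log lines keep exactly that format.

```python
import re

TOKEN_ENDPOINT = "https://www.deviantart.com/oauth2/token"
# Matches the start of the error line quoted above, e.g.
# "[deviantart][error] HTTP request failed:  400: Bad Request for url: ..."
ERROR_LINE = re.compile(r"^\[deviantart\]\[error\] HTTP request failed:\s+400\b")


def is_refresh_token_error(line):
    """True if the line reports a 400 from the OAuth token endpoint."""
    line = line.strip()
    return bool(ERROR_LINE.match(line)) and TOKEN_ENDPOINT in line


def refresh_token_failed(log_lines):
    """True if any line in the log matches the refresh-token error."""
    return any(is_refresh_token_error(line) for line in log_lines)
```

Any other error line, or a 400 for a different URL, is deliberately not matched, since in that case more information is needed.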

Owner

mikf commented Nov 8, 2019

#377 (comment)

@mikf mikf closed this as completed Nov 8, 2019