
deviantART broken image files #112

Closed
Elytreus opened this issue Aug 29, 2014 · 6 comments

Comments

@Elytreus

The downloader works great most of the time, but once in a while it downloads broken images. The pictures cannot be opened with any program and are in some cases only a few bytes in size. If I download these images manually, they are completely fine.

The files look like this:
[screenshot: broken images]

@Bendito999

Sometimes it seems to do this when the deviantART download link times out. If you open one of these files in Notepad, it is an HTML redirect page that takes you back to the original page it was downloaded from (which then contains an updated link to redownload). The ripper ends up saving that redirect page under the expected .png or .jpg name. Now we just need to figure out how to make the deviantART ripper detect these failed links and retry with updated links reparsed from the page.

Though I don't know the inner workings of this program well enough to implement that logic myself, I will throw together a quick little script that checks for these hidden .html files and redownloads the real ones. I am still working on it, though.
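In the meantime, a quick way to confirm that a suspiciously small file is one of these disguised redirect pages (without opening each one in Notepad) is to check whether it starts with HTML markup instead of image data. A minimal Python 3 sketch of that check; the album folder path is just a placeholder:

```python
import os

ALBUM_DIR = r"C:\rips\deviantart\some_album"  # placeholder path

for name in os.listdir(ALBUM_DIR):
    if not name.lower().endswith((".png", ".jpg", ".jpeg", ".gif")):
        continue
    path = os.path.join(ALBUM_DIR, name)
    with open(path, "rb") as f:
        head = f.read(512).lstrip()
    # Real images start with binary magic bytes; the failed downloads
    # start with HTML markup such as "<html" or "<!DOCTYPE".
    if head.startswith(b"<"):
        print("broken:", path)
```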

Until that's done, here is a manual method that works for me.

Edit: Simpler method:

1. In RipMe, uncheck "Overwrite existing files" and "Preserve order", and change the maximum download threads to 1.
2. Delete all of the tiny "failed files" from the album folder (put "size:tiny" in the Explorer search bar to find them).
3. Rerun RipMe until the corrupt files start piling up (there's a handy rerun button in "History").
4. Stop RipMe.
5. Repeat from step 2.

Edit 2:
I modified another downloading script to take all of the broken files in a directory (whether they are named correctly or not) and redownload them. It doesn't search recursively (yet) and doesn't delete the old .html files, but it is a work in progress. You will need Python 2.7 with the Mechanize library:
http://pastebin.com/JZa1Pr2z
Though I haven't tried it, the original script that the actual deviantART downloading routines came from may also be promising, as the logic in it seems sound:
https://github.com/voyageur/dagr
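In case the Pastebin link goes stale, here is a rough sketch of the same cleanup idea in a few lines of Python with requests. It is illustrative only: the regex for the redirect target and the assumption that the refreshed page exposes its full-size file through a link whose markup mentions "download" are guesses based on the behavior described above, not verified deviantART selectors.

```python
# Illustrative cleanup sketch: find "images" that are really HTML redirect
# pages, follow the redirect back to the deviation page, and redownload the
# full-size file. The URL/anchor patterns are guesses, not verified selectors.
import os
import re
import requests

ALBUM_DIR = r"C:\rips\deviantart\some_album"   # placeholder path
URL_RE = re.compile(rb'https?://[^"\'<>\s]+')  # first URL in the redirect page
DOWNLOAD_RE = re.compile(r'<a[^>]+href="([^"]+)"[^>]*download', re.IGNORECASE)

for name in os.listdir(ALBUM_DIR):
    if not name.lower().endswith((".png", ".jpg", ".jpeg", ".gif")):
        continue
    path = os.path.join(ALBUM_DIR, name)
    with open(path, "rb") as f:
        head = f.read(4096)
    if not head.lstrip().startswith(b"<"):
        continue  # real image, leave it alone
    found = URL_RE.search(head)
    if not found:
        continue
    page_url = found.group(0).decode("utf-8", "replace")
    page = requests.get(page_url, timeout=30)
    link = DOWNLOAD_RE.search(page.text)
    if not link:
        print("no download link found on", page_url)
        continue
    image = requests.get(link.group(1), timeout=30)
    with open(path, "wb") as f:   # overwrite the broken file in place
        f.write(image.content)
    print("redownloaded", name)
```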

Complicated manual method (doesn't catch pictures only available as thumbnails): if you sort by size and find all of these failed links, you can rename their extensions, whatever they are, to .html with a program called Bulk Rename Utility.
http://www.bulkrenameutility.co.uk/Download.php

[screenshot: Bulk Rename Utility settings]
To do the .html rename, sort by size, change the pictured setting, select the .png and .jpg images, and press Rename.

Because Firefox won't let you open many HTML files at once, move all of these HTML files into their own folder, then open a command prompt and navigate to that folder.

First, prep Firefox for the abuse by installing Image Block
https://addons.mozilla.org/en-US/firefox/addon/image-block/

and DownThemAll
https://addons.mozilla.org/en-US/firefox/addon/downthemall/

You should now have a little button for Image Block; set it to block images.
Also, go to Firefox's options and, under the "Tabs" tab, uncheck "Don't load tabs until selected".

Open up a new Window in Firefox

Then run these two commands in the command window we opened earlier:
dir /b > url_list.txt

for /F %i in (url_list.txt) do "C:\Program Files (x86)\Mozilla Firefox\firefox.exe" -new-tab "%i"

You may have to change "Program Files (x86)" to "Program Files", depending on where Firefox is installed.
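One note on the for /F loop: by default it splits each line at the first space or tab, so any filename containing a space would be cut short. Adding the "delims=" option, i.e. for /F "delims=" %i in (url_list.txt) do ..., passes the whole filename through in %i. (Also, url_list.txt itself will show up in the listing, since the redirect creates it before dir runs; Firefox just opens it as a text tab, which is harmless.)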

Once the tabs have loaded, use the DownThemAll Firefox extension and press DownThemAll "All Tabs". You may need to find this button by opening the "3 lines" Firefox menu and customizing your toolbar to include the DTA buttons. Once that is successful...
In the fast filter section, type "download", and all of the links across all tabs with "download" in the description should be selected. It should catch everything RipMe missed.

I suggest changing DownThemAll's preferences under its Network tab to "concurrent downloads = 1" and, under "Advanced", setting the max number of segments per download to 1/disabled. This makes the whole thing look less like a mass downloading operation to deviantART.

The bad part is that, if it is a huge album, we might run into this problem again while DownThemAll is running, as the links expire once more. Just run the command-line step again, and tell DTA to skip any files it already downloaded when it starts asking about overwriting.

@Elytreus (Author)

Thank you for your quick and detailed response. I will try your method, but I hope the creator of the program will find a solution, as I am not familiar with programming.

@Bendito999

Yeah, sorry about the excessively complicated instructions. I'm sure there's a better way (I edited the previous post with simpler instructions that seem to work), and a cleanup script you can run in a directory shouldn't be too hard (edit: see my first post), since I run into this problem when manually downloading from deviantART as well.

Edit (better programming-change solution, though it would probably require more extensive restructuring):
The "run ahead" scanning behavior is actually fine, but the links to the actual images need to be generated on the fly. The subroutine that takes a deviation "page" with the picture on it and turns it into a full-definition download link needs to be moved from the scanning phase to the download phase.
The queue will fill up with the raw page links, and the download links will be generated "just in time" for downloading (preferably one at a time) as part of the download routine; see the sketch below.

Old (possibly simpler but not foolproof) delay solution:
Instead of running ahead and queuing up expiring download links, the ripper should:
1. wait after scanning one page,
2. send it to the download thread,
3. wait for the download thread to finish the links on that page,
4. then advance to get fresher links from the next page,
5. and repeat.
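A rough, purely illustrative sketch of the "just in time" idea (plain Python, not RipMe code; resolve_download_url and download are hypothetical stand-ins): the queue only ever holds the stable deviation page URLs, and the short-lived download link is generated inside the download phase, immediately before the fetch.

```python
# Illustrative only: queue the stable page URLs during scanning, and resolve
# the expiring download link just in time inside the download phase.
import queue

def resolve_download_url(page_url):
    # Hypothetical stand-in for the routine that turns a deviation page
    # into a full-size download link; the proposal moves this call here,
    # into the download phase, instead of running it while scanning.
    return page_url + "/download"

def download(url, dest):
    print("downloading", url, "->", dest)  # stand-in for the real fetch

page_queue = queue.Queue()

# Scanning phase: "run ahead" as far as you like; nothing queued here expires.
for page_url in ["https://example.deviantart.com/art/piece-1",
                 "https://example.deviantart.com/art/piece-2"]:
    page_queue.put(page_url)

# Download phase: links are generated one at a time, immediately before use,
# so they cannot go stale while earlier items are still downloading.
while not page_queue.empty():
    page_url = page_queue.get()
    fresh_url = resolve_download_url(page_url)
    download(fresh_url, dest=page_url.rsplit("/", 1)[-1] + ".jpg")
```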

@4pr0n (Owner) commented Oct 21, 2014

I've had issues getting some download links from deviantART pages. Exhibit A & B:
https://github.com/4pr0n/ripme/blob/master/src/main/java/com/rarchives/ripme/ripper/rippers/DeviantartRipper.java#L162

@rautamiekka

I haven't ever encountered this ancient problem; we're up to RipMe 1.5.5 nowadays, but I'm still on 1.5.2.

Window$ 7 Ultimate SP1 x64, Oracle Java SE 8 Update 144 x64. I've ripped dozens of entire galleries (which are still on the disk) and every so often sync my favs folders, with no real problems.

@metaprime (Collaborator) commented Aug 14, 2017

Yeah, the dA ripper still has a lot of problems, but I think we can probably close this one at this point. I've never seen this issue either.

cyian-1756 added a commit to cyian-1756/ripme that referenced this issue Nov 4, 2017