Questions, Feedback and Suggestions #3 #146
simple snippet to turn gallery-dl into an API:

```python
from types import SimpleNamespace
from unittest.mock import patch

import click
from flask.cli import FlaskGroup
from flask import (
    Flask,
    jsonify,
    request,
)

from gallery_dl import main, option
from gallery_dl.job import DataJob


def get_json():
    data = None
    parser = option.build_parser()
    args = parser.parse_args()
    args.urls = request.args.getlist('url')
    if not args.urls:
        return jsonify({'error': 'No url(s)'})
    args.list_data = True

    class CustomClass:
        """Stand-in for gallery_dl.job.DataJob that records the extracted data."""
        data = []

        def run(self):
            dj = DataJob(*self.data_job_args, **self.data_job_kwargs)
            dj.run()
            self.data.append({
                'args': self.data_job_args,
                'kwargs': self.data_job_kwargs,
                'data': dj.data,
            })

        def DataJob(self, *args, **kwargs):
            self.data_job_args = args
            self.data_job_kwargs = kwargs
            retval = SimpleNamespace()
            retval.run = self.run
            return retval

    c1 = CustomClass()
    # Patch gallery-dl's argument parser and DataJob so that main() operates on
    # the URLs from the request and the extracted data is collected instead of printed.
    with patch('gallery_dl.option.build_parser') as m_bp, \
            patch('gallery_dl.job.DataJob', side_effect=c1.DataJob) as m_jt:
        m_bp.return_value.parse_args.return_value = args
        m_jt.__name__ = 'DataJob'
        main()
        data = c1.data
    return jsonify({'data': data, 'urls': args.urls})


def create_app(script_info=None):
    """Create the Flask app."""
    app = Flask(__name__)
    app.add_url_rule('/api/json', 'gallery_dl_json', get_json)
    return app


@click.group(cls=FlaskGroup, create_app=create_app)
def cli():
    """Management script for the application."""


if __name__ == '__main__':
    cli()
```

e: this could be simpler when using DataJob directly to handle the URLs, but I haven't checked whether anything has to be done before initializing a DataJob instance |
You don't need to do anything before initializing any of the Job classes. You can initialize logging-related things if you want logging output. |
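For illustration, a minimal sketch of that direct approach, using only the DataJob interface already shown in this thread; the URL is a placeholder and the plain stdlib logging setup is an assumption, not necessarily what was originally posted here:

```python
# Sketch: run a DataJob directly, with basic logging enabled first so
# gallery-dl's log output is visible. The URL is a placeholder.
import logging

from gallery_dl.job import DataJob

logging.basicConfig(level=logging.INFO)

job = DataJob("https://example.com/gallery/12345")
job.run()
print(job.data)  # metadata collected by the DataJob
```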
@rachmadaniHaryono what does that code do? |
simpler API (based on the above suggestion):

```python
#!/usr/bin/env python
import click
from flask.cli import FlaskGroup
from flask import (
    Flask,
    jsonify,
    request,
)

from gallery_dl import option
from gallery_dl.job import DataJob
from gallery_dl.exception import NoExtractorError


def get_json():
    data = []
    parser = option.build_parser()
    args = parser.parse_args()
    args.urls = request.args.getlist('url')
    if not args.urls:
        return jsonify({'error': 'No url(s)'})
    args.list_data = True
    # Run a DataJob for each URL and collect its extracted data.
    for url in args.urls:
        url_res = None
        error = None
        try:
            job = DataJob(url)
            job.run()
            url_res = job.data
        except NoExtractorError as err:
            error = err
        data_item = [url, url_res, {'error': str(error) if error else None}]
        data.append(data_item)
    return jsonify({'data': data, 'urls': args.urls})


def create_app(script_info=None):
    """Create the Flask app."""
    app = Flask(__name__)
    app.add_url_rule('/api/json', 'gallery_dl_json', get_json)
    return app


@click.group(cls=FlaskGroup, create_app=create_app)
def cli():
    """Management script for the application."""


if __name__ == '__main__':
    cli()
```
|
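As a hedged usage sketch: with FlaskGroup providing the `run` command, the script would be started with something like `python api.py run --port 5013` (the script name and port are placeholders), and the endpoint could then be queried like this:

```python
# Hypothetical client-side check, assuming the script above is running on port 5013;
# the gallery URL is a placeholder.
import requests

resp = requests.get(
    "http://localhost:5013/api/json",
    params={"url": "https://example.com/gallery/12345"},
)
print(resp.json())  # -> {'data': [...], 'urls': [...]}
```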
@rachmadaniHaryono instructions on using this GUG and combining it with Hydrus? Any pre-configurations besides |
@rachmadaniHaryono add that to the Wiki in https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts if you can, it sounded like a really good solution. Also, why port 5013, is that port specifically used for something? |
No real technical reason; I just use it because the default port is used by another program of mine.
I will consider it, but I'm not sure where to put it. Another plan is to fork (or create a PR for) a server command, but I'm not sure if @mikf wants a PR for this |
@rachmadaniHaryono https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/wiki |
This depends on Hydrus vs. imgbrd-grabber download speed. From my test, gallery-dl gives a direct link, so Hydrus doesn't have to process the link anymore. |
I've already had something similar to this in mind (implementing a (local) server infrastructure to (remotely) send commands / queries). A few questions from me concerning Hydrus: |
This still depends on how big this will be: will it just be an API, or will there be an HTML interface for it? An existing framework would make it easier, though, and plugins for the framework would let other developers create the features they want. Of course there are other frameworks besides Flask, e.g. Sanic or Django, but I actually doubt that using only the standard library would be better than those.
That is a modified version of the Flask CLI example. Flask can do it more simply, but that requires setting an environment variable, which adds another command.
The Hydrus dev plans to add an API for this in the next milestone. There is also another Hydrus user who made an unofficial API, but he hasn't made one for downloads yet. So either wait for that or use an existing Hydrus parser.
Hydrus expects either HTML or JSON and tries to extract data based on the parsers the user made/imported. I made this one for HTML, but it may change in a future version: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/guide/create_parser_furaffinity.md . If someone wants to make one, they can try making an API similar to the 4chan API, copy its structure, and use a modified version of the existing 4chan parser. My best recommendation is to try a Hydrus parser directly and see what options are there. Ask in the Hydrus Discord channel if anything is unclear. |
Can gallery-dl support Weibo? I found https://github.com/nondanee/weiboPicDownloader but it takes too long to scan and doesn't have the ability to skip downloaded files. |
@rachmadaniHaryono I opened a new branch for API server related stuff. The first commit there implements the same functionality as your script, but without external dependencies. Go take a look at it if you want. And when I said your script "should be simplified ... further" I didn't mean it should use fewer lines of code, but fewer resources in terms of CPU and memory. Python might not be the right language to use when caring about things like that, but there is still no need to call functions that effectively do nothing - command-line argument parsing, for example. |
Will it be API-only, or will there also be an HTML interface, @mikf? e: I will comment on the code in the commit. |
I don't think there should be an HTML interface directly inside of gallery-dl. I would prefer it to have a separate front-end (HTML or whatever) communicating with the API back-end that's baked into gallery-dl itself. It is a more general approach and would allow any programming language and framework to interact with gallery-dl more easily, not just Python. |
still on port 5013 e: related issue CuddleBear92/Hydrus-Presets-and-Scripts#69 |
About the Twitter extractor: we have a limited number of requests depending on how many tweets a user has, right? |
@wankio The Twitter extractor gets the same tweets you would get by visiting a timeline in your browser and scrolling down until no more tweets get dynamically loaded. I don't know how many tweets you can access like that, but Twitter's public API has a similar restriction: https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-user_timeline.html
You could try ripme. It uses the public API instead of a "hidden", browser-only API like gallery-dl. Maybe you can get more results with that. |
But if I remember correctly, ripme rips all tweets/retweets, not just the user's own tweets. |
For some reason, logging in with OAuth and App Garden tokens or with the -u/-p options doesn't work with Flickr, which makes images that require a login to view not downloadable. But otherwise, amazing tool, thank you so much! |
Today when I was checking e-hentai/exhentai, it just got stuck forever. Maybe my ISP is the problem, because I can't access e-hentai while exhentai is still OK. So I think OAuth should help, using cookies instead of id+password to bypass it. |
Is there a way to download files directly into a specified folder instead of subfolders? |
@tddschn You can manually specify the browser profile folder path when the defaults don't work. |
Did gofile.io change something? Only getting errors (with other Python scripts as well). |
GoFile.io issue solution:
https://github.com/Jules-WinnfieldX/CyberDropDownloader/pull/802 |
Could someone please help me with making Instagram download stories, reels, and posts to their own files using the command line? |
I notice that even when gallery-dl says I am rate-limited, I can still use the Twitter website, go to a user's profile, and view their tweets if I click the Replies or Media tab. But when I change the timeline strategy to media and set the Twitter "include" option to "media", it doesn't work and I am still rate-limited. Why is there a rate limit for gallery-dl but not for the website's Media tab? |
Can you pass the cookies that gallery-dl is currently using to a post-processor? |
Could we have a config option to sleep the extractor for a set amount of time upon encountering a 429 Too Many Requests error, and then retry with the base delay before it goes into the delay-interval-increase routine? For larger image repositories (my use case in this instance is DeviantArt; I'm downloading collections), I'm wondering whether sleeping for five or ten minutes and then continuing as normal might be faster than getting stuck in 17-second-delay purgatory. It's effectively what I'm doing now when I interrupt the process once it gets too egregious and attempt again 10 minutes later, and it seems to work as usual on restart; I just want to be able to continue where I left off. |
@WhyEssEff Is that not what |
@biggestsonicfan I'd prefer the behavior I'm trying to get at to happen specifically on encountering an error. I'd like to assume the minimum request time when possible, while telling the extractor to halt for x seconds if it throws back a 429, to see if it can restart comfortably on the default delay after simply doing nothing for x amount of time. E.g., assume a 0.5-second sleep-request until a 429 is thrown, pause the extractor for 120 seconds, retry with the default delay, and then, if it's still throwing 429s, assume the current behavior of increasing the delay interval by 1s and trying again until it works. What this could look like would be something akin to the following: and then it could retry with the default delay, upon which, if it still fails, it increases the delay interval. I'm wondering about this because the longer delays rack up runtime cumulatively, and it might be more optimal for larger galleries to have this option, even if you have to set it to 5/10 minutes to use it effectively. |
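A rough sketch of the retry policy described above, as an editorial illustration only; the function, parameter names, and use of requests here are assumptions, not an actual gallery-dl option:

```python
# Sketch of the proposed policy: keep the minimum request delay until a 429
# shows up, pause once for a longer period, retry at the base delay, and only
# then fall back to increasing the delay by 1s per further 429 (current behavior).
import time

import requests


def get_with_429_pause(url, base_delay=0.5, pause_429=120):
    paused = False      # has the one-time long pause been used yet?
    extra_delay = 0     # incremental delay added only if the pause didn't help
    while True:
        time.sleep(base_delay + extra_delay)
        response = requests.get(url)
        if response.status_code != 429:
            return response
        if not paused:
            time.sleep(pause_429)  # first 429: halt, then retry at the base delay
            paused = True
        else:
            extra_delay += 1       # still rate-limited: bump the delay per attempt
```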
Hi, is there a way to download all the saved posts on my Instagram account, like, all of them at once? |
@useless642 |
@github-userx @biggestsonicfan Enable |
How often are these QFS issues rotated? This one is getting kinda long. |
hi, does
does |
You can just run |
@BakedCookie (This isn't properly documented for some reason, while other options with similar semantics like |
Can gallery-dl cache cookies grabbed from the browser for a duration? I'm noticing startup takes a while per use, whereas if I use cookies from a file, it's instant. |
@biggestsonicfan To improve startup time, you could use
and then load them from there. |
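One way to do such a one-time export is sketched below; this is an assumption for illustration (the third-party browser_cookie3 package and the cookies.txt filename are not from the comment above, which may have referred to a built-in gallery-dl option instead):

```python
# Sketch: dump Firefox cookies once to a Netscape-format cookies.txt, which a
# file-based "cookies" setting can then load without touching the browser again.
# Assumes the third-party browser_cookie3 package is installed.
from http.cookiejar import MozillaCookieJar

import browser_cookie3

jar = MozillaCookieJar("cookies.txt")
for cookie in browser_cookie3.firefox():
    jar.set_cookie(cookie)
jar.save(ignore_discard=True, ignore_expires=True)
```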
I noticed that per #80 there was some talk about Collections, but they still aren't implemented. They probably aren't that different from albums (see e.g. https://www.artstation.com/gallifreyan/collections/197428), so probably (?) wouldn't be that hard to implement. Should I open an issue for this? |
I used to do cookies.txt on a per-site basis but it got a little tedious to manage. I already do
if that's what you meant. I will try something like Ah:
Moving on from that though, I would like to contribute support for a new site to gallery-dl, but other than browsing the existing code, I don't really see any templates for an extractor and a test suite. Where would I start for a site that has the page's contents embedded as JSON in its HTML? |
@JinEnMok @biggestsonicfan
There isn't any. You could take a look at merged PRs that add support for a new site.
|
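As an editorial illustration of the "embedded JSON" step from the question above (not gallery-dl's actual extractor API; the URL and script-tag id are made-up placeholders), the core of such an extractor usually boils down to something like this:

```python
# Sketch: pull a JSON blob embedded in an HTML page, e.g. inside a <script> tag.
# A real gallery-dl extractor would subclass the classes under gallery_dl/extractor/
# instead of using requests directly; this only shows the JSON-extraction step.
import json
import re

import requests

page = requests.get("https://example.com/gallery/12345").text
match = re.search(
    r'<script[^>]*id="__INITIAL_STATE__"[^>]*>(.*?)</script>',
    page,
    re.DOTALL,
)
if match:
    data = json.loads(match.group(1))
    # From here, walk the parsed structure for image URLs and metadata.
    print(list(data))
```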
@mikf you're the best, cheers! :) |
Closing this as suggested by taskhawk (#146 (comment)). |
Continuation of the old issue as a central place for any sort of question or suggestion that doesn't deserve its own separate issue. There is also https://gitter.im/gallery-dl/main if that seems more appropriate.
Links to older issues: #11, #74