youtube-dl solution for YouTube throttling DASH streams #199

Closed
embryo10 opened this issue Feb 12, 2018 · 44 comments

Comments

@embryo10

After the recent problems with the throttling of DASH streams (audio or video), youtube-dl seems to have solved the problem at last!
Pafy 0.5.4 still has it though.
Trying to get an m4a with pafy (m4astreams[-1]) still gets throttled, but getting the same stream directly with youtube-dl 2018.02.11 is as fast as it was before the speed limiting.
Am I doing something wrong?
Any ideas?

@vn-ki
Member

vn-ki commented Feb 14, 2018

Pafy uses its own download function for downloading streams. Any fixes made to the downloader in YouTube-dl won't be inherited.

As a side note, it would be nice if we could use the YouTube-dl downloader to download the streams. But I have no idea whether it's possible or not.

@embryo10
Author

I thought pafy used YouTube-dl for something, but if that something is not downloading, then I wonder what it is.
Most of the other stuff is done through the YouTube API (I think), and much faster than with YouTube-dl.
Now that I think about it, downloading with pafy gives me a much smoother progress status, so it must be doing something else...

@vn-ki
Member

vn-ki commented Feb 14, 2018

something is not downloading, then I wonder what it is.

Pafy uses youtube-dl to retrieve information about a video (title, description, stream info, etc.).

Most of the other stuff is done through the YouTube API

Pafy uses YouTube API for playlist and channel retrieval (I don't know about the internal backend, maybe in that too).

Do you have any examples where the throttling is evident? I can try to port the fix over here.

@garynicolas

Hello, I have exactly the same problem: pafy downloads have been very slow for a few weeks. When I download with youtube-dl, there is no problem.
For me, the throttling is evident on every video I try to download.

@embryo10
Author

@vn-ki you can try to download any m4a dash audio stream.
The download starts really fast for about the first 15-20% of a 3-4 min file, and after that it can take over a minute to finish...
This is not happening for the files that have video & audio.
These get downloaded with full speed.
As a temporary solution for my app I added an option to get the audio from the video/audio stream, but it takes more time..

@vn-ki
Member

vn-ki commented Feb 14, 2018

We could either port the youtube-dl fix onto our download function or use the youtube_dl.downloader.http.HttpFD class to handle our download. @ids1024 What do you think?

I am currently trying to implement the latter. It seems doable, but I'm quite busy over the next few weeks and this seems to be an important issue. If I hit a breakthrough and if @ids1024 accepts this approach, I will make a PR.

@ids1024
Contributor

ids1024 commented Feb 14, 2018

This sounds like a good idea. But using something from youtube-dl (without copying the code) isn't an option with the internal backend, which doesn't depend on youtube-dl.

Pafy originally did not use youtube-dl, but instead implemented similar functionality. But youtube-dl is better maintained and did not have various issues that pafy did, so I changed it to use youtube-dl (#109). Some people complained about this (for some legitimate reasons), so I added back the original code as a separate backend.

I'm not sure what to do about the internal backend; perhaps it should just be removed. I disabled it by default in the last release (726c1a7) because it has bugs, and people were reporting issues that would not have happened if they had youtube-dl installed. I don't know if many/any people still rely on it (or for that matter, if it is currently working).

@ids1024
Contributor

ids1024 commented Feb 15, 2018

I thought pafy used YouTube-dl for something, but if that something is not downloading, then I wonder what it is.

Youtube-dl is used to get the stream urls; then pafy just downloads them normally.

@embryo10
Author

Is there any progress with this?
I'm just asking because if no one is working on this (and given my lack of knowledge of the pafy code, I can't contribute a PR), I will have to change my app to use youtube-dl for downloading streams.
If someone IS working on this, I can wait and maybe help by testing or whatever is needed.

@vn-ki
Member

vn-ki commented Feb 24, 2018

I have my exams till 1st of March. I'll surely work on this after that.

If someone can work on this before that, then cheers!

@embryo10
Author

That's great!
I can wait that much, I'm not in a hurry, just wondering...
Good luck with your exams!!

@ritiek
Member

ritiek commented Feb 24, 2018

I have no idea what's going on but I came across something interesting (while wondering about ytdl-org/youtube-dl#15271).

Passing headers {'Range': 'bytes=0-'} to requests library allows full speed downloads:

import requests
import pafy

content = pafy.new('https://www.youtube.com/watch?v=sJa-1MKCx3w')
audio = content.getbestaudio()

# Without the header: the download is throttled
slow_resp = requests.get(audio.url)
with open('slow_download.webm', 'wb') as fout:
    fout.write(slow_resp.content)

# With the header: the download runs at full speed
lucky_header = {'Range': 'bytes=0-'}
fast_resp = requests.get(audio.url, headers=lucky_header)
with open('fast_download.webm', 'wb') as fout:
    fout.write(fast_resp.content)

Just compare how long the fetch takes with and without the header, and you'll see the difference.
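
A minimal sketch of how that comparison could be timed with only the standard library, assuming Python 3. The helper names `build_request` and `timed_fetch` are hypothetical and not part of pafy or requests; the URL passed in would be `audio.url` from the snippet above.

```python
import time
import urllib.request


def build_request(url, use_range=True):
    """Build a GET request, optionally asking for the whole file via Range."""
    headers = {"Range": "bytes=0-"} if use_range else {}
    return urllib.request.Request(url, headers=headers)


def timed_fetch(request, opener=urllib.request.urlopen):
    """Download the request fully and return (bytes_read, elapsed_seconds)."""
    start = time.monotonic()
    with opener(request) as response:
        size = len(response.read())
    return size, time.monotonic() - start
```

Calling `timed_fetch(build_request(url, use_range=False))` and then `timed_fetch(build_request(url))` on the same stream URL should reproduce the difference described here, if the throttling behaves as reported.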

@embryo10
Author

Wow!
Where did you get this header? What does it mean?
Trying with a normal header (like "Mozilla/5.0 (Windows ...") does not work.
But this...
slow_resp: 0:01:07.841000
fast_resp: 0:00:03.373000
Can we integrate it in the pafy code?

@vn-ki
Member

vn-ki commented Feb 24, 2018

@ritiek Yup. It should work for a single uninterrupted download. But I don't know whether this would work when you are resuming the download (it might). Did you try resuming the download?

From what I understood from looking at the HTTP downloader in youtube-dl, you have to do more work if you want to resume. Their downloader looked more robust, so I thought utilizing their downloader would be a better choice than trying to port the fix over.

But if this works, doing this would be a lot easier than trying to utilize their downloader class.

@ritiek
Member

ritiek commented Feb 24, 2018

Can we integrate it in the pafy code?

@embryo10 Yep, I actually made a local working pafy fork. I'll make a PR soon.

EDIT:

Where did you get this header? What does it mean?

Since YouTube supports resuming, we can request partial content using {'Range': 'bytes=<start_byte>-<end_byte>'}, but I have no idea why this behaves differently from passing nothing at all.

But I don't know whether this would work when you are resuming the download

@vn-ki That should probably work as well but gotta test that.
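
For reference, a small sketch of the two header formats involved in a partial-content exchange: the `Range` value a client sends and the `Content-Range` value a server returns with a 206 response. The function names are hypothetical helpers written for this illustration, not pafy or youtube-dl code.

```python
import re

_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def range_header(start, end=None):
    """Format a Range value; an omitted end means 'to the end of the file'."""
    last = "" if end is None else str(end)
    return "bytes=%d-%s" % (start, last)


def parse_content_range(value):
    """Parse 'bytes start-end/total'; total is None when the server sends '*'."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError("unrecognised Content-Range: %r" % value)
    start, end, total = match.groups()
    return int(start), int(end), None if total == "*" else int(total)
```

With these, `range_header(0)` gives the `bytes=0-` value used above, and a resume from a known offset would use `range_header(offset)`.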

@embryo10
Author

embryo10 commented Feb 24, 2018

OK did it too..
Added this in backend_shared.py line 622

else:
    resuming_opener = build_opener()
    resuming_opener.addheaders = [("Range", "bytes=%s-" % offset)]
    response = resuming_opener.open(self.url)

and everything works fast again....

@vn-ki
Member

vn-ki commented Feb 24, 2018

@embryo10 If you want to use this right now, call this line before calling download on a stream.

pafy.g.opener.addheaders.append(('Range', 'bytes=0-'))

@ritiek It should work. When resuming, pafy adds the range header.

EDIT: Works on resume too. Nice work, @ritiek!
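
If that line ends up being called more than once (for example, once per download), the header would be appended repeatedly. A hedged sketch of an idempotent variant follows; `ensure_header` is a hypothetical helper, and a plain `urllib.request` opener stands in for `pafy.g.opener` so the example runs without pafy installed.

```python
import urllib.request


def ensure_header(opener, name, value):
    """Append (name, value) to opener.addheaders unless that name is already set."""
    existing = {key.lower() for key, _ in opener.addheaders}
    if name.lower() not in existing:
        opener.addheaders.append((name, value))
    return opener


# Stand-in for pafy.g.opener (an assumption made for this illustration).
opener = ensure_header(urllib.request.build_opener(), "Range", "bytes=0-")
ensure_header(opener, "Range", "bytes=0-")  # second call changes nothing
```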

@embryo10
Author

@vn-ki Thank you.
I added it permanently to the g.opener.
It will suffice until the official update.
@ritiek Great catch, thank you!

@ritiek
Member

ritiek commented Feb 24, 2018

Actually, I think this is not a great solution either. Long videos like https://www.youtube.com/watch?v=ffQM8ALVJV8 are still throttled with pafy even when passing the Range header (but they download at full speed with youtube-dl).

@embryo10
Author

Checked it and sadly you are right...
More than 12-13mins and the throttling starts.
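
The thread does not say how youtube-dl avoids the slowdown on long streams. One approach that could be tried, shown here purely as an assumption and not as youtube-dl's confirmed method, is to split the file into bounded Range requests instead of a single open-ended one. The function names and the 10 MiB default chunk size are illustrative.

```python
def chunk_ranges(total_size, chunk_size=10 * 1024 * 1024):
    """Yield inclusive (start, end) byte offsets that cover total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_headers(total_size, chunk_size=10 * 1024 * 1024):
    """Return one Range header dict per chunk, in download order."""
    return [{"Range": "bytes=%d-%d" % pair}
            for pair in chunk_ranges(total_size, chunk_size)]
```

Each header dict would be sent as a separate request and the response bodies written to the output file in order. Whether this actually sidesteps the throttling on long videos would need testing.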

@embryo10
Author

embryo10 commented Mar 5, 2018

@vn-ki Any news from the throttling front? :o)

@vn-ki
Member

vn-ki commented Mar 5, 2018

@embryo10 I have implemented the basic http downloader from youtube-dl. This means throttling is fixed (Yay!).

But I have to make sure the download function's functionality remains same. This means I have to find a way to implement the callback and generate the filename in the same old way.

This week is pretty heavy for me, so please wait for 1 week (I'll try to do it before this weekend). Within that time frame, I will fix this (at least for the youtube-dl backend).

EDIT: I did take a look at porting their fix onto our download function. It is doable (not too complex) but would require a lot of rewriting. I want to eventually port that over here.

@embryo10
Author

embryo10 commented Mar 5, 2018

This is great news! :o)
I just asked because I was curious about the current status.
Please, take your time..

@vn-ki
Member

vn-ki commented Mar 9, 2018

@embryo10 @ritiek I have pushed my changes into the develop branch in my fork. Can one of you test whether the throttling is actually fixed? My internet connection is suffering from some problems at the moment, so I can't test it (My speed is less than 100 kbps right now, -_- ).

@embryo10
Author

embryo10 commented Mar 9, 2018

I'm getting this:

Traceback (most recent call last):
  File "D:\Apps\DEV\PROJECTS\KataLib\secondary.py", line 1185, in process
    self.get_stream()
  File "D:\Apps\DEV\PROJECTS\KataLib\secondary.py", line 1240, in get_stream
    stream.download(self.m4a_file, quiet=True, callback=self.progress_down)
  File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\pafy\backend_shared.py", line 575, in download
    self._youtubedl_download(*args, **kwargs)
  File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\pafy\backend_shared.py", line 642, in _youtubedl_download
    downloader.real_download(filename, infodict)
  File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\youtube_dl\downloader\http.py", line 341, in real_download
    return download()
  File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\youtube_dl\downloader\http.py", line 298, in download
    'elapsed': now - ctx.start_time,
  File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\youtube_dl\downloader\common.py", line 372, in _hook_progress
    ph(status)
  File "D:\Apps\DEV\PYTHON\Python27\lib\site-packages\pafy\backend_shared.py", line 600, in progress_hook
    rate = s['speed']/1024
TypeError: unsupported operand type(s) for /: 'NoneType' and 'int'

I just overwrote the pafy files with yours.
Should I change anything else?
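
The traceback shows `s['speed']` arriving as `None` in the progress hook, which suggests the downloader can report a status before it has a speed estimate. A minimal sketch of a defensive conversion, assuming the status dictionary carries an optional `speed` in bytes per second; `rate_kbps` is a hypothetical helper, not the code that was later pushed.

```python
def rate_kbps(status):
    """Return the download rate in KB/s, or 0.0 when no speed is reported yet."""
    speed = status.get("speed")
    if speed is None:
        return 0.0
    return speed / 1024.0
```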

@embryo10
Author

embryo10 commented Mar 9, 2018

I used the https://www.youtube.com/watch?v=ffQM8ALVJV8 link that you used.
I removed the division by 1024 and tried again, but the throttling still exists.
The callback worked OK even without the division...

@vn-ki
Member

vn-ki commented Mar 9, 2018

@embryo10 I did some changes locally and forgot to push! Sorry for that!

Try the code now!

@ritiek
Member

ritiek commented Mar 9, 2018

@vn-ki Nope, tried on e80e993. Same results, download speed keeps decaying.

@vn-ki
Member

vn-ki commented Mar 9, 2018

@ritiek Can you try against the new HEAD? It should be fixed now (it was a math error on my part).

@ritiek
Member

ritiek commented Mar 9, 2018

@vn-ki Wow, yes. It works great now. Amazing. Thanks for the great work!

@embryo10
Author

embryo10 commented Mar 9, 2018

The download speed seems to be OK now (!!!!)
but the stream gets downloaded to my working directory and not where I'm telling it to.

@embryo10
Author

embryo10 commented Mar 9, 2018

It seems that the savedir from line 641 is not used in the downloader.download(filename, infodict) call.

@embryo10
Author

embryo10 commented Mar 9, 2018

Added

        if savedir:
            filename = os.path.join(savedir, filename)

before the
downloader.download(filename, infodict)
and it seems to be working OK
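
The same fix written as a standalone function, as a sketch only; `resolve_filepath` is a hypothetical name and the snippet above is what was actually added.

```python
import os


def resolve_filepath(filename, savedir=None):
    """Join filename onto savedir when one is given, else return it unchanged."""
    if savedir:
        return os.path.join(savedir, filename)
    return filename
```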

@vn-ki
Member

vn-ki commented Mar 9, 2018

@embryo10 Fixed and created PR!

@embryo10
Author

embryo10 commented Mar 9, 2018

Great! Waiting for a new release...
Thank you for your time.
Should we close this or wait to test the release?

@vn-ki
Member

vn-ki commented Mar 9, 2018

Wait for the merge, at least!

@embryo10
Author

embryo10 commented Mar 9, 2018

OK ;o)

@g8keepa

g8keepa commented Mar 30, 2018

Seems like this issue returned. Anybody else noticed the slowdown lately?

@embryo10
Author

Nothing wrong here yet..
Any links to try?

@g8keepa

g8keepa commented Mar 30, 2018

@embryo10 I tried this, super slow download on my end (latest version installed)

youtube-dl -f140,264 https://www.youtube.com/watch?v=xegAZE0ez04

@embryo10
Author

I've just updated to youtube-dl-2018.3.26.1
Also checked with youtube-dl-2018.3.20
No speed problems...

@g8keepa

g8keepa commented Mar 30, 2018

@embryo10 Same version here. Weird. I'll try a few more and report back.

@g8keepa

g8keepa commented Mar 31, 2018

@embryo10 Try this one:
youtube-dl -f313,140 https://www.youtube.com/watch?v=GtHOlXJsVtI

Painfully slow on my end. I pulled this one much quicker a few weeks ago.

@embryo10
Author

No speed problems here..
Keep in mind that I don't use the command line for youtube-dl, rather my apps (Downer, a front end for youtube-dl and KataLib, a Player/Librarian/Converter)
