Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"--hls-use-mpegts --no-part" should create files with ".ts" extension #11105

Open
Vangelis66 opened this issue Nov 2, 2016 · 7 comments
Open
Labels

Comments

@Vangelis66
Copy link

Vangelis66 commented Nov 2, 2016

OS: Windows Vista SP2 x86, latest OS updates from Microsoft.
Using the latest standalone "youtube-dl.exe" downloaded from Github.

youtube-dl --version => 2016.11.02

Also using FFmpeg.exe 3.0 (x86) downloaded from the Zeranoe repo.

The first channel of the Greek National TV broadcaster uses Youtube to host its live stream;
ERT1 LIVE => https://www.youtube.com/watch?v=gnnO4TeJG-4
The actual youtube URL changes frequently...
--hls-prefer-native doesn't seem to be able to handle this stream,
so ffmpeg MUST be used. When issuing:

youtube-dl -f 94 --no-part "https://www.youtube.com/watch?v=gnnO4TeJG-4"

the stream download is delegated to FFmpeg and (correctly) youtube-dl creates
a file with the ".mp4" extension. When I CTRL+C, the end file is correctly
identified by MediaInfo to be an MP4 file.
On the contrary, when I issue:

youtube-dl -f 94 --hls-use-mpegts --no-part "https://www.youtube.com/watch?v=gnnO4TeJG-4"

youtube-dl (again) generates a file with file extension ".mp4", but
the --hls-use-mpegts switch creates an MPEG-TS container which has
(by default) a ".ts" file extension (... at least on Windows)!
CTRL+Cing results in an end file which is, in essence, a ".ts" file,
as reported by MediaInfo, but with an erroneous ".mp4" extension.
I have to manually rename the file back to the correct ".ts" extension.

I am of the opinion this is a bugged behaviour on the part of the
application and that youtube-dl should be smart enough to attribute
the correct file extention (.ts) when --hls-use-mpegts is employed...

Many thanks to all the devs' team for their on-going efforts!

@jaimeMF jaimeMF added the request label Nov 3, 2016
@Vangelis66
Copy link
Author

@jaimeMF

Thanks for reading my report; though, if you ask me,
the "bug" label would have been more appropriate...

I am not "requesting" some extra (new) feature of youtube-dl
(the case on most issues here), rather pointing out
that an existing feature be ironed out!
Anyhow, keep up the good work!

@ghost
Copy link

ghost commented Dec 15, 2016

@jaimeMF @Vangelis66 I am new to youtube-dl, but I think most or all modules default to "%(title)s-%(id)s.%(ext)s", so instead of just sending --hls-use-mpegts, you should instead send --hls-use-mpegts --output "%(title)s-%(id)s.ts". You will still get good filenames that way.

But I fully agree that this should be done by default... if possible. Because --hls-use-mpegts is an "if" option, which starts downloading m3u8s as mpegts IF the URL contained an m3u8. But if the given URL wasn't detected as m3u8 then I don't want to get a .ts extension. But with my workaround that's what would happen. Something like a .flv file would get a .ts extension instead, if downloaded via my flags.

So for now I have to be sure to only send the .ts/use-mpegts flags when I know the site uses mpegts.

@jaimeMF I think it makes a lot of sense to automatically replace the meaning of "ext" with ".ts" when --hls-use-mpegts is enabled and youtube-dl detects a m3u8 playlist. That way we can use a single command to download any file, and always be sure to get live-playable .ts files when they are livestreams. I hate using .mp4 since its metadata isn't written until the end so if my computer or the app crashes then I can never recover the file since it has no metadata. And only mpegts files can be live-previewed before the download is complete. So I choose to get all livestreams as .ts and then manually convert to a different container later for archival.

@diamond12
Copy link

@SteveJobzniak Or maybe it is easier to replace ext with .ts if ffmpeg is executed with the -f mpegts parameter.

@ghost
Copy link

ghost commented Jan 23, 2017

@diamond12 Not sure if that would work. The "external downloader" script that launches ffmpeg says ffmpeg -f mpegts every time you use youtube-dl --hls-use-mpegts, even if the actual stream is not mpegts. See: https://github.com/rg3/youtube-dl/blob/master/youtube_dl/downloader/external.py#L264

And https://github.com/rg3/youtube-dl/blob/master/youtube_dl/downloader/hls.py is another HLS downloader which would need fixing too.

I propose a possible solution: If the youtube-dl extractor detects a .m3u8 file AND --hls-use-mpegts is passed as a parameter (basically what the linked line of code already checks for above), then rewrite %(ext)s to "ts". That means that anytime "ext" is used in the output filename pattern, it would correctly use ".ts" - IF - the stream is HLS and is being saved as mpegts.

Either way, this is a pretty silly problem in youtube-dl. Its "automatic filename" feature definitely shouldn't be applying a totally incorrect file extension! :-O It's not a "feature request". It's a necessary bugfix.

@diamond12
Copy link

@SteveJobzniak I agree it's a bug and I don't know why it is marked as a feature request.

The source that you pointed me to is right after if protocol in ('m3u8', 'm3u8_native') so it should be fine. Besides, -f mpegts will always produce a .ts. But yes, your proposed solution is a good option.

@ghost
Copy link

ghost commented Jan 25, 2017

@diamond12 Oh yeah you are right! So they do check that it's a HLS stream before applying --hls-use-mpegts. Nice find! That should mean it's possible to always run with --hls-use-mpegts on the command line even if the URL isn't HLS. And that would support both regular downloads and HLS streams with a single command.

All that's missing now is applying the correct output extension (.ts) whenever it detects a HLS stream.

I think correcting the meaning of ext to ts before the output filename pattern is parsed (as described above) makes the most sense. That way it fixes the ext portion of the filename everywhere in the code that tries to look at / use the extension, and even in any user-provided filename patterns on the command-line (if the user uses %(ext)s.

@ghost
Copy link

ghost commented Jan 25, 2017

@jaimeMF I am extremely overworked, but took a minute to look at the source to get things rolling. It seems the function process_video_result is the correct injection point:

https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L1363

It loops through all possible video sources and fills in their 'ext' value if missing. And then it also runs determine_protocol (which sets the value to m3u8 or m3u8_native?) if the 'protocol' value is missing.

The determine_protocol code is in utils:

https://github.com/rg3/youtube-dl/blob/master/youtube_dl/utils.py#L2323

It seems like all of this extension/protocol/URL-determining code is done without connecting to the server to check any mimetype. It looks like any advanced detection (such as a ".m3u8" hiding behind a ".php" URL), is up to that website's extractor plugin to put in the returned protocol value (however, that "protocol" value isn't provided by many extractors since it isn't necessary at all if the URL itself already starts with a clear protocol like rtmp:// or ends in a clear extension like .m3u8). If the protocol isn't provided by the extractor, YouTube-DL guesses it (https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L1366). So it looks like YouTube-DL doesn't connect any further to the URL it's given by the extractor. It just fills in any missing fields (like protocol) and then passes everything to the desired downloader method. If a website hides its .m3u8 file behind a ".php" extension script, then it'd be up to the extractor to set protocol = m3u8.

Since it seems like YouTube-DL hasn't got any responsibility to further verify the protocol, it seems like modifying the extension selector in YoutubeDL.py's process_video_result is the correct location for the bugfix, from what I can see:

  • First, let it run the existing code that does URL format determination if none was provided by the extractor (if format.get('format') is None:), so that we know what file format the URL is providing.
  • Next, let it run the existing code that does extension determination if none was provided by the extractor (if format.get('ext') is None:), so that we know what extension to use for the output file in normal situations.
  • Next, let it run the existing code that does protocol determination if none was provided by the extractor (if format.get('protocol') is None:), so that we see when it's a m3u8 or m3u8_native protocol. This just looks at the extension/URL, so it's quite basic. So in case the website uses something like a .php script as I theorized above, then it'd be up to the extractor to actually provide a proper protocol value to YouTube-DL.
  • After all of that, we know the format, extension and protocol, so then just add the following code: If --hls-use-mpegts was provided as an option to YouTube-DL (maybe using the same if self.params.get('hls_use_mpegts', False) or tmpfilename == '-': check that external.py uses? https://github.com/rg3/youtube-dl/blob/master/youtube_dl/downloader/external.py#L264), and the protocol value is either m3u8 or m3u8_native, then rewrite format['ext'] = 'ts'.
  • Voila?!

That would fix the bug. Please comment to let us know your thoughts. Is there a better location to fix the bug? Did I find the right location? Could you please fix this bug without a pull request? I need to go now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants