
HTTP PUT crashes in static-linked linux builds #996

Closed
mariocynicys opened this issue Oct 11, 2021 · 12 comments
Assignees
Labels
priority: P1 Big impact or workaround impractical; resolve before feature release status: archived Archived and locked; will not be updated type: bug Something isn't working correctly
Milestone

Comments

@mariocynicys

System info

Operating System: Ubuntu 18.04
Shaka Packager Version: v2.6

Issue and steps to reproduce the problem

Run the command below while having an HTTP server running on port 5000 on localhost.
Packager Command:

packager 'in=testvid1.mp4,stream=video,init_segment=http://localhost:5000/output_files/video_init.mp4,segment_template=http://localhost:5000/output_files/video_$Number$.mp4' --segment_duration 10 --generate_static_live_mpd --mpd_output http://localhost:5000/output_files/dash.mpd

What is the expected result?
Packaged content is sent to the server using PUT requests.

What happens instead?
Only the first init segment reaches the server, and then Shaka Packager crashes with a segfault.

testvid1.mp4

repro.zip

@joeyparrish joeyparrish added priority: P1 Big impact or workaround impractical; resolve before feature release type: bug Something isn't working correctly labels Oct 12, 2021
@joeyparrish joeyparrish self-assigned this Oct 12, 2021
@joeyparrish
Member

Thank you for the simple Python server. Super helpful!

If I build locally and statically link, I get a segfault. If I build locally and don't statically link, it works correctly. The official v2.6 binaries are statically linked.

@joeyparrish
Member

With the non-static build, I get inconsistent results.

In some cases, I get something like [1012/121026:ERROR:http_file.cc(231)] HttpFile request failed: 7 (HTTP_FAILURE): Failed sending data to the peer. When this happens, the number of segments it uploaded successfully varies greatly: 5, 6, 9, 8... If it ever makes it to 11 segments output, it seems to hang. (My test clip should be 41 segments long.)

@shaka-bot shaka-bot added this to the v2.6 milestone Oct 12, 2021
@joeyparrish
Member

When this failure occurs, IoCache has a large amount of data written to it, but libcurl has only read a small amount of it back out. In the run I just did, curl read out 8 x 16,372 byte chunks (total 130,976 bytes), but Packager wrote 333 x 65,536 byte chunks + 18797 bytes at the end (total 21,842,285 bytes).

When I've resolved this, I'll return to the question of why it is crashing in a static-link build.

@joeyparrish
Member

The non-crash failure is caused by the python server, which doesn't read the input data and doesn't support chunked-transfer encoding.

If I update it to read the upload data and handle chunked transfer encoding, the failure goes away. Here's the new server code:

from http.server import HTTPServer
from http.server import SimpleHTTPRequestHandler
import http.server


class HTTPRequestHandler(SimpleHTTPRequestHandler):
    def do_PUT(self):
      if "Content-Length" in self.headers:
        content_length = int(self.headers["Content-Length"])
        body = self.rfile.read(content_length)

      elif "chunked" in self.headers.get("Transfer-Encoding", ""):
        body = bytes()

        while True:
          line = self.rfile.readline().strip()
          chunk_length = int(line, 16)

          if chunk_length != 0:
            chunk = self.rfile.read(chunk_length)
            body += chunk

          # Each chunk is followed by an additional empty newline
          # that we have to consume.
          self.rfile.readline()

          # Finally, a chunk size of 0 is an end indication
          if chunk_length == 0:
            break

      else:
        # Neither header is present; treat it as an empty body.
        body = bytes()

      # Acknowledge the upload on every path, not only for chunked uploads.
      self.send_response(201, 'Created')
      self.end_headers()

      print(self.headers)
      print('Body length', len(body))

if __name__ == '__main__':
    http.server.test(HandlerClass=HTTPRequestHandler,
                     ServerClass=HTTPServer,
                     port=5000)
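The chunk-parsing loop in the handler can be exercised without a live server by feeding a pre-encoded chunked body through the same logic. A minimal sketch, with io.BytesIO standing in for self.rfile:

```python
import io

def read_chunked(rfile):
    """Parse an HTTP/1.1 chunked-encoded body, mirroring the handler's loop."""
    body = bytes()
    while True:
        # Each chunk starts with its size as a hex string on its own line.
        chunk_length = int(rfile.readline().strip(), 16)
        if chunk_length != 0:
            body += rfile.read(chunk_length)
        # Consume the CRLF that terminates the chunk data.
        rfile.readline()
        # A zero-length chunk marks the end of the body.
        if chunk_length == 0:
            break
    return body

# Two chunks ("Shaka" and "Packager") followed by the terminating 0 chunk.
encoded = b"5\r\nShaka\r\n8\r\nPackager\r\n0\r\n\r\n"
print(read_chunked(io.BytesIO(encoded)))  # b'ShakaPackager'
```

This is only the body-parsing half; real chunked encoding also permits chunk extensions and trailers, which the server above ignores.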

@joeyparrish
Member

I'm still getting a segfault on the static-linked binary with the fixed server. So the failures I was seeing were red herrings.

@joeyparrish
Member

Backtrace from the crash:

(gdb) bt
#0  0x00007ffff45c1298 in internal_getent (stream=stream@entry=0x7fffcc000b60, result=result@entry=0x7fffed7fc910, buffer=buffer@entry=0x7fffed7fcd70 "\177", buflen=buflen@entry=1024, 
    errnop=errnop@entry=0x7fffed7fd6b0, herrnop=herrnop@entry=0x7fffed7fd6e8, af=0) at nss_files/files-XXX.c:173
#1  0x00007ffff45c2544 in _nss_files_gethostbyname4_r (name=0x7fffd400c0e0 "localhost", pat=0x7fffed7fca68, buffer=0x7fffed7fcd70 "\177", buflen=1024, errnop=0x7fffed7fd6b0, 
    herrnop=0x7fffed7fd6e8, ttlp=0x0) at nss_files/files-hosts.c:400
#2  0x0000000000ffed96 in gaih_inet.constprop ()
#3  0x0000000000fffc15 in getaddrinfo ()
#4  0x000000000069c1fc in Curl_getaddrinfo_ex (nodename=0x7fffd400c0e0 "localhost", servname=0x7fffed7fd25c "5000", hints=0x7fffd400c0a0, result=0x7fffd400c098)
    at ../../packager/third_party/curl/source/lib/curl_addrinfo.c:124
#5  0x0000000000694359 in getaddrinfo_thread (arg=0x7fffd400c078) at ../../packager/third_party/curl/source/lib/asyn-thread.c:279
#6  0x000000000069d903 in curl_thread_create_thunk (arg=0x7fffd400c140) at ../../packager/third_party/curl/source/lib/curl_threads.c:57
#7  0x0000000000f60d07 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8  0x0000000001003d1f in clone ()

@mariocynicys
Author

Oops, I had already implemented a server that reads the chunked transfer, but sent an over-simplified one for readability. It's weird, though, that the version of packager I tested works fine whether or not the chunks are read by the server. Maybe it has to do with the chunk size. Let me compile v2.6 to get a better view of the bug.
Thanks for the updates :)

@joeyparrish
Member

It appears that only the static-linked official binaries are crashing, and only inside of libcurl. If you build it yourself with standard flags, it should work correctly. I hope this unblocks you for now while I try to find a solution.

@joeyparrish joeyparrish changed the title HTTP PUT not working HTTP PUT crashes in static-linked linux builds Oct 12, 2021
@joeyparrish
Member

It turns out that getaddrinfo and all of its alternatives in Linux's glibc fail when statically linked. Alpine Linux's musl (replacement for glibc) doesn't seem to have this issue, but we can't require that on other distros.

If I configure libcurl at build time to use libc-ares for DNS instead of libc's getaddrinfo, this seems to resolve the crash.

Since this would be a new dependency on Linux, I need to test this across GitHub Actions, Dockerfile, and the various docker builds in packager/testing/dockers, and I may need to update documentation, too.
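For reference, when building libcurl standalone with autotools, the switch that swaps the threaded resolver (which calls glibc's getaddrinfo) for c-ares looks like this. This is only an illustration; Packager vendors curl under packager/third_party/curl and wires the dependency through its own build configuration:

```shell
# Configure libcurl to use the c-ares asynchronous resolver instead of
# the threaded resolver that calls glibc's getaddrinfo.
./configure --enable-ares
make
```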

@joeyparrish
Member

The fix is in Packager v2.6.1, which the robots are pushing out now. After that, I'll update shaka-streamer and shaka-streamer-binaries as well.

@wader
Contributor

wader commented Oct 15, 2021

I have some Dockerfiles where I build static PIE binaries and test them with a build step that copies the binaries over to a scratch image, runs some basic tests, tries network requests, etc. Maybe that's an interesting approach in this case?

@joeyparrish
Member

That sounds great. Would you like to submit a PR?

@shaka-bot shaka-bot added the status: archived Archived and locked; will not be updated label Dec 13, 2021
@shaka-project shaka-project locked and limited conversation to collaborators Dec 13, 2021
sr1990 pushed a commit to sr1990/shaka-packager that referenced this issue Feb 18, 2023
The official, static-linked linux builds were crashing in their use of
getaddrinfo, which libcurl was configured to use.  Both getaddrinfo
and all of its alternatives available in glibc fail with static
linking.

We can fix this by configuring libcurl to use libc-ares on Linux
instead.  This allows us to keep the benefits of a statically-linked
Linux binary.

Closes shaka-project#996

Change-Id: Ib4a9eb939813fd165727788726459ef4adf3fc4d