local polis-file-server sometimes fails to serve files; polis-server crashes #896

Open
midgleyc opened this issue Mar 9, 2021 · 5 comments


@midgleyc
Contributor

midgleyc commented Mar 9, 2021

Expected behavior:
Polis-file-server always serves static files, or polis-server can handle a single connection drop. Polis-server does not crash.

Actual behavior:
Polis-file-server sometimes drops the connection; polis-server crashes

To Reproduce:
Deploy both polis-server and polis-file-server in production, running polis-file-server as 'local' upload. Wait several hours, then access a page on polis-server.

Screenshots:
(screenshot not preserved)

Device information:

  • AWS t2.small running docker-compose

Additional context:
Logs from polis-server:

part2
{}
{
  'x-forwarded-proto': 'https',
  host: 'polis.client.newredo.com',
  connection: 'close',
  'user-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:86.0) Gecko/20100101 Firefox/86.0',
  accept: 'image/webp,*/*',
  'accept-language': 'en-GB,en;q=0.7,en-US;q=0.3',
  'accept-encoding': 'gzip, deflate, br',
  referer: 'https://polis.client.newredo.com/',
  cookie: 'REDACTED'
}
/app/node_modules/http-proxy/lib/http-proxy/index.js:120
    throw err;
    ^

Error: socket hang up
    at connResetException (internal/errors.js:617:14)
    at Socket.socketCloseListener (_http_client.js:443:25)
    at Socket.emit (events.js:327:22)
    at Socket.EventEmitter.emit (domain.js:486:12)
    at TCP.<anonymous> (net.js:673:12)
    at TCP.callbackTrampoline (internal/async_hooks.js:129:14) {
  code: 'ECONNRESET'
}

After this, polis-server crashes. I have the Docker container set to restart: unless-stopped, so it comes back up afterwards, and the next bit of the log is just init 1. The page loads on refresh.

@midgleyc
Contributor Author

midgleyc commented Mar 17, 2021

The cause is that routingProxy sometimes emits an ECONNRESET error. Because no 'error' listener is registered on it -- e.g.:

routingProxy.on('error', (e) => {
...
})

this error propagates and crashes the application.

I think this is http-party/node-http-proxy#1455
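To illustrate the mechanism described above, here is a minimal sketch using Node's built-in EventEmitter as a stand-in for routingProxy. The helper names (makeFakeProxy, simulateReset, and so on) are made up for this example, and the real http-proxy object is not used, so treat it as an analogy rather than the actual polis-server code path:

```javascript
const { EventEmitter } = require('events');

// Stand-in for routingProxy (assumption): any emitter that reports
// failures through an 'error' event.
function makeFakeProxy() {
  return new EventEmitter();
}

// Simulates the upstream dropping the connection.
function simulateReset(proxy) {
  const err = new Error('socket hang up');
  err.code = 'ECONNRESET';
  proxy.emit('error', err);
}

// Case 1: no 'error' listener is registered, so emit() throws the error.
// In a real server nothing catches it and the process goes down.
function resetWithoutListener() {
  const proxy = makeFakeProxy();
  try {
    simulateReset(proxy);
    return 'no throw';
  } catch (e) {
    return `thrown: ${e.code}`;
  }
}

// Case 2: a listener is registered, so the error is delivered to it
// and nothing is thrown.
function resetWithListener() {
  const proxy = makeFakeProxy();
  const seen = [];
  proxy.on('error', (e) => seen.push(e.code));
  try {
    simulateReset(proxy);
    return `handled: ${seen.join(',')}`;
  } catch (e) {
    return `thrown: ${e.code}`;
  }
}

console.log(resetWithoutListener()); // thrown: ECONNRESET
console.log(resetWithListener());    // handled: ECONNRESET
```

What a real listener should do with the error (log it, respond with a 502, retry) is a separate decision that this sketch does not make.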

@patcon
Contributor

patcon commented Mar 17, 2021

This is great sleuthing @midgleyc! Much appreciated! Sorry, there are not many outside people trying to host Polis in production-like environments.

@joshsmith2 are you hitting something like this as well?

@pluby

pluby commented Feb 8, 2022

@patcon The application does not terminate as such; instead it logs the error, then keeps running with the broken socket and does not respond to further requests. It's unclear why the process does not terminate, but perhaps the use of setInterval and long setTimeout calls is related. As the process does not terminate, application supervisors (like Docker) don't know to restart it.

I'd like to fix this by making sure the application terminates properly, with an error code, as this will provide a broad solution that also covers other failure scenarios. Any objections or other ideas?
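One possible shape for the "terminate with an error code" idea, as a rough sketch only. The function names makeFatalHandler and installFatalHandlers are hypothetical and nothing here is taken from the polis codebase. It relies on Node's standard 'uncaughtException' and 'unhandledRejection' process events, and it does not explain why the process currently stays alive:

```javascript
// Hypothetical helper: builds a handler that logs a fatal error and then
// exits with a non-zero code so a supervisor can see the process failed.
// `exit` and `log` are injectable so the handler can be exercised in tests.
function makeFatalHandler(kind, exit = process.exit, log = console.error) {
  return (err) => {
    log(`${kind}:`, err);
    exit(1);
  };
}

// Hypothetical helper: registers the handler for both kinds of
// otherwise-unhandled failure. Call once at startup.
function installFatalHandlers(exit, log) {
  process.on('uncaughtException', makeFatalHandler('uncaughtException', exit, log));
  process.on('unhandledRejection', makeFatalHandler('unhandledRejection', exit, log));
}
```

Calling installFatalHandlers() early in the server's entry point would be the intended usage. Exiting explicitly means pending setInterval or setTimeout timers cannot keep the process alive after a fatal error.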

@patcon
Contributor

patcon commented Feb 9, 2022

No objection from me, but to be clear, I don't have merge access directly. I can't imagine the polis team would have an issue so long as the solution is scoped and as minimal as feasible :) Feel free to talk it out loud here

@edenist

edenist commented Aug 31, 2022

I don't have much to add [other than my appreciation to those who have discussed beforehand], but wanted to confirm that we are hitting this issue on our polis deployment in what appears to be a similar configuration to @midgleyc. Happy to contribute to some possible fixes.
