Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apple pushes don't recover after a proxy failure #304

Open
MatMaul opened this issue May 24, 2022 · 4 comments
Open

Apple pushes don't recover after a proxy failure #304

MatMaul opened this issue May 24, 2022 · 4 comments
Labels

Comments

@MatMaul
Copy link
Contributor

MatMaul commented May 24, 2022

We use sygnal with a HTTP proxy.
Sometimes our HTTP proxy restarts and is unavailable for a short period of time, returning a 503 response.

In this case we can clearly see the related errors in the logs, for both APNS and GCM.

However APNS never recovers after that, and no errors are logged after. GCM manages to recover well.

Here is the logs around the time the proxy restarts.

sygnal_crash.txt

@MatMaul MatMaul self-assigned this May 24, 2022
@richvdh
Copy link
Member

richvdh commented May 24, 2022

@MatMaul can you paste some keywords (and the stacktrace?) from the log into the issue, so that they will come up with a keyword search?

@MatMaul
Copy link
Contributor Author

MatMaul commented May 24, 2022

Good idea.
Here is the stacktrace related to the 503, for APNS side:

May 18 00:00:07 tchap-sygnal sygnal-0[24431]: sygnal.exceptions.ProxyConnectError: An error occurred while connecting: Unexpected status on CONNECT: b'503'.
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: 2022-05-18 00:00:07,703 [24431] twisted 147 - CRITICAL -
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: Traceback (most recent call last):
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/asyncioreactor.py", line 138, in _readOrWrite
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     why = method()
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/tcp.py", line 248, in doRead
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     return self._dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/tcp.py", line 253, in _dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     rval = self.protocol.dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/endpoints.py", line 147, in dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     return self._wrappedProtocol.dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/helper/proxy/connectproxyclient_twisted.py", line 207, in dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     return self.http_setup_client.dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/protocols/basic.py", line 549, in dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     why = self.lineReceived(line)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/web/http.py", line 635, in lineReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     self.handleStatus(version, status, message)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/helper/proxy/connectproxyclient_twisted.py", line 239, in handleStatus
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     raise ProxyConnectError("Unexpected status on CONNECT: %s" % status)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: sygnal.exceptions.ProxyConnectError: An error occurred while connecting: Unexpected status on CONNECT: b'503'.
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: 2022-05-18 00:00:07,704 [24431] sygnal.gcmpushkin 394 - WARNING - [a92c89b5-8ca7-4deb-b336-46c0022a4eb2] Temporary failure, will retry in 10 seconds
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: Traceback (most recent call last):
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 196, in _perform_http_request
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     bodyProducer=body_producer,
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/asyncioreactor.py", line 138, in _readOrWrite
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     why = method()
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/tcp.py", line 248, in doRead
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     return self._dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/tcp.py", line 253, in _dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     rval = self.protocol.dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/endpoints.py", line 147, in dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     return self._wrappedProtocol.dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/helper/proxy/connectproxyclient_twisted.py", line 207, in dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     return self.http_setup_client.dataReceived(data)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/protocols/basic.py", line 549, in dataReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     why = self.lineReceived(line)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/web/http.py", line 635, in lineReceived
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     self.handleStatus(version, status, message)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/helper/proxy/connectproxyclient_twisted.py", line 239, in handleStatus
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     raise ProxyConnectError("Unexpected status on CONNECT: %s" % status)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: sygnal.exceptions.ProxyConnectError: An error occurred while connecting: Unexpected status on CONNECT: b'503'.
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: The above exception was the direct cause of the following exception:
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: Traceback (most recent call last):
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 379, in _dispatch_notification_unlimited
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     n, log, body, headers, pushkeys, span
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 220, in _request_dispatch
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     response, response_text = await self._perform_http_request(body, headers)
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 202, in _perform_http_request
May 18 00:00:07 tchap-sygnal sygnal-0[24431]:     ) from exception
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: sygnal.exceptions.TemporaryNotificationDispatchException: GCM request failure
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: 2022-05-18 00:00:07,704 [24431] twisted 147 - INFO - Stopping factory <sygnal.helper.proxy.connectproxyclient_twisted.HTTPProxiedClientFactory object at 0x7f680f5aa0b8>
May 18 00:00:07 tchap-sygnal sygnal-0[24431]: 2022-05-18 00:00:07,712 [24431] aioapns 356 - ERROR - Could not connect to server: An error occurred while connecting: Error from HTTP Proxy whilst attempting CONNECT: 503 (Service

And here is the one for GCM (that manages to recover afterwards):

May 17 23:59:46 tchap-sygnal sygnal-0[24431]: 2022-05-17 23:59:46,092 [24431] sygnal.gcmpushkin 394 - WARNING - [38369a3f-17b0-4851-998c-391805fadca9] Temporary failure, will retry in 10 seconds
May 17 23:59:46 tchap-sygnal sygnal-0[24431]: Traceback (most recent call last):
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 196, in _perform_http_request
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     bodyProducer=body_producer,
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/asyncioreactor.py", line 138, in _readOrWrite
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     why = method()
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/tcp.py", line 248, in doRead
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     return self._dataReceived(data)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/tcp.py", line 253, in _dataReceived
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     rval = self.protocol.dataReceived(data)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/internet/endpoints.py", line 147, in dataReceived
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     return self._wrappedProtocol.dataReceived(data)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/helper/proxy/connectproxyclient_twisted.py", line 207, in dataReceived
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     return self.http_setup_client.dataReceived(data)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/protocols/basic.py", line 549, in dataReceived
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     why = self.lineReceived(line)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/twisted/web/http.py", line 635, in lineReceived
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     self.handleStatus(version, status, message)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/helper/proxy/connectproxyclient_twisted.py", line 239, in handleStatus
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     raise ProxyConnectError("Unexpected status on CONNECT: %s" % status)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]: sygnal.exceptions.ProxyConnectError: An error occurred while connecting: Unexpected status on CONNECT: b'503'.
May 17 23:59:46 tchap-sygnal sygnal-0[24431]: The above exception was the direct cause of the following exception:
May 17 23:59:46 tchap-sygnal sygnal-0[24431]: Traceback (most recent call last):
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 379, in _dispatch_notification_unlimited
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     n, log, body, headers, pushkeys, span
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 220, in _request_dispatch
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     response, response_text = await self._perform_http_request(body, headers)
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:   File "/opt/sygnal/env/lib/python3.7/site-packages/sygnal/gcmpushkin.py", line 202, in _perform_http_request
May 17 23:59:46 tchap-sygnal sygnal-0[24431]:     ) from exception
May 17 23:59:46 tchap-sygnal sygnal-0[24431]: sygnal.exceptions.TemporaryNotificationDispatchException: GCM request failure

@MatMaul
Copy link
Contributor Author

MatMaul commented May 24, 2022

Not happy news: while the proxy restarts every night (we can see the errors from my previous message in the logs every night), the bug is not triggered each time, meaning sometimes it manages to recover and keep sending APNS after the proxy is back.
Probably meaning it's a quite hairy edge case to find.

@DMRobertson
Copy link
Contributor

DMRobertson commented Dec 7, 2023

I have just noticed the following aioapns bug: Fatal1ty/aioapns#42. It sounds very similar to the symptoms described here. That bug was fixed in aioapns v3---so I am very tempted to think that this might magically work by upgrading to the latest Sygnal release, which requires at least this version:

"aioapns>=3.0",

If we can get some confirmation of this that would be ideal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants