Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebPush support for urgency and coalescing #213

Merged
merged 14 commits into from
Apr 13, 2021
Merged
1 change: 1 addition & 0 deletions changelog.d/213.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
WebPush: add support for Urgency and Topic header
19 changes: 18 additions & 1 deletion docs/applications.md
Original file line number Diff line number Diff line change
Expand Up @@ -255,7 +255,24 @@ so it can be used to pass identifiers specific to your client

##### events_only

As of the time of writing, all webpush-supporting browsers require you to set `userVisibleOnly: true` when calling (`pushManager.subscribe`)[https://developer.mozilla.org/en-US/docs/Web/API/PushManager/subscribe], to (prevent abusing webpush to track users)[https://goo.gl/yqv4Q4] without their knowledge. With this (mandatory) flag, the browser will show a "site has been updated in the background" notification if no notifications are visible after your service worker processes a `push` event. This can easily happen when sygnal sends a push message to clear the unread count, which is not specific to an event. With `events_only: true` in the pusher data, sygnal won't forward any push message without a event id. This prevents your service worker being forced to show a notification to push messages that clear the unread count.
As of the time of writing, all webpush-supporting browsers require you to set
`userVisibleOnly: true` when calling (`pushManager.subscribe`)
[https://developer.mozilla.org/en-US/docs/Web/API/PushManager/subscribe], to
(prevent abusing webpush to track users)[https://goo.gl/yqv4Q4] without their
knowledge. With this (mandatory) flag, the browser will show a "site has been
updated in the background" notification if no notifications are visible after
your service worker processes a `push` event. This can easily happen when sygnal
sends a push message to clear the unread count, which is not specific
to an event. With `events_only: true` in the pusher data, sygnal won't forward
any push message without a event id. This prevents your service worker being
forced to show a notification to push messages that clear the unread count.

##### only_last_per_room

You can opt in to only receive the last notification per room by setting
`only_last_per_room: true` in the push data. Note that if a first notification
bwindels marked this conversation as resolved.
Show resolved Hide resolved
can be delivered before the second one is sent, you will still get both;
it only has effect for when notifications are queued up on the gateway.
bwindels marked this conversation as resolved.
Show resolved Hide resolved

##### Multiple pushers on one origin

Expand Down
77 changes: 49 additions & 28 deletions sygnal/webpushpushkin.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@
import json
import logging
import os.path
from base64 import urlsafe_b64encode
from hashlib import blake2s
from io import BytesIO
from typing import List, Optional, Pattern
from urllib.parse import urlparse
Expand Down Expand Up @@ -102,7 +104,7 @@ def __init__(self, name, sygnal, config):
contextFactory=tls_client_options_factory,
proxy_url_str=proxy_url,
)
self.http_agent_wrapper = HttpAgentWrapper(self.http_agent)
self.http_request_factory = HttpRequestFactory()

self.allowed_endpoints = None # type: Optional[List[Pattern]]
allowed_endpoints = self.get_config("allowed_endpoints")
Expand Down Expand Up @@ -173,6 +175,17 @@ async def _dispatch_notification_unlimited(self, n, device, context):
payload = WebpushPushkin._build_payload(n, device)
data = json.dumps(payload)

# web push only supports normal and low priority, so assume normal if absent
low_priority = n.prio == "low"
# allow dropping earlier notifications in the same room if requested
topic = None
if n.room_id and device.data.get("only_last_per_room") is True:
richvdh marked this conversation as resolved.
Show resolved Hide resolved
# ask for a 22 byte hash, so the base64 of it is 32,
# the limit webpush allows for the topic
topic = urlsafe_b64encode(
blake2s(n.room_id.encode(), digest_size=22).digest()
)

# note that webpush modifies vapid_claims, so make sure it's only used once
vapid_claims = {
"sub": "mailto:{}".format(self.vapid_contact_email),
Expand All @@ -186,15 +199,17 @@ async def _dispatch_notification_unlimited(self, n, device, context):
try:
with SEND_TIME_HISTOGRAM.time():
with ACTIVE_REQUESTS_GAUGE.track_inprogress():
response_wrapper = webpush(
request = webpush(
subscription_info=subscription_info,
data=data,
ttl=self.ttl,
vapid_private_key=self.vapid_private_key,
vapid_claims=vapid_claims,
requests_session=self.http_agent_wrapper,
requests_session=self.http_request_factory,
)
response = await request.execute(
self.http_agent, low_priority, topic
)
Comment on lines +202 to 212
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does the deduplication actually take place, out of interest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't do this, it happens in the push gateway. If you have two messages with the same topic header, the second will replace the first (if the first hasn't been delivered yet).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, so there is a separate "push gateway" after sygnal (which I tend to think of as a, y'know, push gateway)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, webpush pipes notification to a push provider (or gateway, hard to name things correctly with so many parts to it) like FCM that will actually deliver it to the browser application, similarly to how the apns and gcm pushkins forward it to the push providers of those platforms. Every browser that supports web push needs to have a server like this that application servers can post push messages to. It's the browser-specific push provider that will drop the first message if it hasn't been deliver yet.

response = await response_wrapper.deferred
response_text = (await readBody(response)).decode()
finally:
self.connection_semaphore.release()
Expand Down Expand Up @@ -314,14 +329,11 @@ def _handle_response(self, response, response_text, pushkey, endpoint_domain):
return False


class HttpAgentWrapper:
class HttpRequestFactory:
"""
Provide a post method that matches the API expected from pywebpush.
"""

def __init__(self, http_agent):
self.http_agent = http_agent

def post(self, endpoint, data, headers, timeout):
"""
Convert the requests-like API to a Twisted API call.
Expand All @@ -336,25 +348,15 @@ def post(self, endpoint, data, headers, timeout):
timeout (int)
Ignored for now
"""
body_producer = FileBodyProducer(BytesIO(data))
# Convert the headers to the camelcase version.
headers = {
b"User-Agent": ["sygnal"],
b"Content-Encoding": [headers["content-encoding"]],
b"Authorization": [headers["authorization"]],
b"TTL": [headers["ttl"]],
}
deferred = self.http_agent.request(
b"POST",
endpoint.encode(),
headers=Headers(headers),
bodyProducer=body_producer,
)
return HttpResponseWrapper(deferred)
return HttpDelayedRequest(endpoint, data, headers)


class HttpResponseWrapper:
class HttpDelayedRequest:
"""
Captures the values received from pywebpush for the endpoint request.
The request isn't immediately executed, to allow adding headers
not supported by pywebpush, like Topic and Urgency.

Provide a response object that matches the API expected from pywebpush.
bwindels marked this conversation as resolved.
Show resolved Hide resolved
pywebpush expects a synchronous API, while we use an asynchronous API.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks outdated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still serves as something that looks like a response to pywebpush, while actually containing the information to perform the request ourselves once pywebpush finishes. I agree the naming is not ideal, but couldn't come up with something better. Ideas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, I see.


Expand All @@ -363,8 +365,6 @@ class HttpResponseWrapper:
in the background.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it now doesn't really happen "in the background", afaict? it's more that it is deferred until later.

How about:

To keep pywebpush happy we present it with some hardcoded values that
make its assertions pass even though the HTTP request has not yet been
made.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I forgot to update this as the async request was fired immediately before but now is deferred to the execute method 👍 Have adopted your suggestion.


Attributes:
deferred (Deferred):
The deferred to await the actual response after calling pywebpush.
status_code (int):
Defined to be 200 so the pywebpush check to see if is below 202
passes.
Expand All @@ -375,5 +375,26 @@ class HttpResponseWrapper:
status_code = 200
text = None
richvdh marked this conversation as resolved.
Show resolved Hide resolved

def __init__(self, deferred):
self.deferred = deferred
def __init__(self, endpoint, data, vapid_headers):
self.endpoint = endpoint
self.data = data
self.vapid_headers = vapid_headers

def execute(self, http_agent, low_priority, topic):
body_producer = FileBodyProducer(BytesIO(self.data))
# Convert the headers to the camelcase version.
headers = {
b"User-Agent": ["sygnal"],
b"Content-Encoding": [self.vapid_headers["content-encoding"]],
b"Authorization": [self.vapid_headers["authorization"]],
b"TTL": [self.vapid_headers["ttl"]],
b"Urgency": ["low" if low_priority else "normal"],
}
if topic:
headers[b"Topic"] = [topic]
return http_agent.request(
b"POST",
self.endpoint.encode(),
headers=Headers(headers),
bodyProducer=body_producer,
)