Freeway Auth Token Implementation #118

travis · 2024-10-04T06:04:48Z

travis
Oct 4, 2024
Maintainer

Introduction 😄

Per storacha/project-tracking#140, https://github.com/storacha/RFC/blob/feat/egress-billing/rfc/egress-billing.md and #109 we are adding the notion of "auth tokens" to freeway. Once the implementation is complete, users of the w3s.link gateway will need to include an "authorization token" with their request or be subject to fairly strict rate limits. These rate limits are necessary because Storacha currently pays all costs associated with the egress of data through our gateway, and this is not financially sustainable.

Problem 😞

According to the Egress Billing RFC, auth tokens should be created by including them in the pol field of a delegation. This effectively creates a restriction on the UCAN invocations used to authorize data egress out of our backends, and is a very flexible and extensible mechanism for limiting egress now and in the future. Unfortunately, pol is a feature of the UCAN 1.0 specification, and does not exist in the version that we use. As a result, we need to figure out where to record and track auth tokens in the short-to-medium term, before we upgrade to UCAN 1.0. We suggest two possible paths forward.

Possible Solutions 💁

1. Include an auth token in the `nb` field of "content claim" delegations

The proposed delegations to the gateway look like this (some fields elided for clarity):

{
  "iss": "did:key:zAlice",
  "aud": "did:web:w3s.link",
  "exp": 1716235987, // restrict to a month
  "cmd": "/assert/location",
  "sub": "did:web:asia.web3.storage",
  "pol": [
    // Request URL must have a query string that includes "token=zrptvx"
    ["==", ".query.token", "zrptvx"]
  ]
}

This says, effectively, that did:web:w3s.link is authorized to invoke /assert/location on did:web:asia.web3.storage AS LONG AS it includes the "auth token" zrptvx.
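To make the check concrete, here is a minimal TypeScript sketch of how a gateway might evaluate a policy like the one above against an incoming request. The `select`, `satisfies`, and `requestFromURL` helpers are hypothetical names invented for illustration, and the sketch only handles the `==` operator with simple dotted selectors; the actual UCAN 1.0 policy language is richer than this.

```typescript
// Illustrative only: a tiny evaluator for policies shaped like
// [["==", ".query.token", "zrptvx"]]. Not the UCAN 1.0 policy engine.
type Statement = ["==", string, string];

interface GatewayRequest {
  query: Record<string, string>;
}

// Resolve a dotted selector such as ".query.token" against a value.
function select(value: unknown, selector: string): unknown {
  let current: unknown = value;
  for (const key of selector.split(".").filter(Boolean)) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// A policy is satisfied only when every statement in it holds.
function satisfies(policy: Statement[], request: GatewayRequest): boolean {
  return policy.every(
    ([op, selector, expected]) =>
      op === "==" && select(request, selector) === expected
  );
}

// Build the request view the policy is evaluated against from a URL.
function requestFromURL(url: string): GatewayRequest {
  const query: Record<string, string> = {};
  new URL(url).searchParams.forEach((value, key) => {
    query[key] = value;
  });
  return { query };
}

const pol: Statement[] = [["==", ".query.token", "zrptvx"]];
```

With this sketch, a request URL ending in `?token=zrptvx` satisfies `pol`, while a request with a different token or no token does not.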

Contrast this with the current implementation of content claims, described by the following schema:

export const location = capability({
  can: 'assert/location',
  with: URI.match({ protocol: 'did:' }),
  nb: Schema.struct({
    /** CAR CID */
    content: linkOrDigest(),
    location: Schema.array(URI),
    range: Schema.struct({
      offset: Schema.integer(),
      length: Schema.integer().optional()
    }).optional()
  })
})

One possible solution to the current problem is to add an optional token field to the nb struct. The upload service will then need to generate location claims targeted at the gateway that include this new field and save them (using the currently in-progress decentralized indexing service), and the gateway will find these new claims when it queries for content claims relevant to a request for a particular CID.
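A rough sketch of what this could look like on the gateway side, written in plain TypeScript rather than against the real capability definitions so that it stands alone. The `LocationCaveats` interface mirrors the `nb` struct above with the proposed optional `token` field added; the `isAuthorized` helper, the string-typed `content` field, and the example CID are assumptions made for illustration.

```typescript
// Plain-TypeScript mirror of the `nb` struct, for illustration only.
interface LocationCaveats {
  content: string; // CAR CID, represented as a string in this sketch
  location: string[];
  range?: { offset: number; length?: number };
  token?: string; // proposed optional auth token (hypothetical field)
}

interface LocationClaim {
  can: "assert/location";
  with: string;
  nb: LocationCaveats;
}

// Hypothetical gateway check: does any location claim for this CID
// carry the token presented with the request?
function isAuthorized(
  claims: LocationClaim[],
  cid: string,
  token: string | undefined
): boolean {
  if (token === undefined) return false;
  return claims.some((c) => c.nb.content === cid && c.nb.token === token);
}

const exampleClaims: LocationClaim[] = [
  {
    can: "assert/location",
    with: "did:web:asia.web3.storage",
    nb: {
      content: "bafyExampleCar",
      location: ["https://example.com/bafyExampleCar.car"],
      token: "zrptvx",
    },
  },
  {
    // A claim without a token, as produced today.
    can: "assert/location",
    with: "did:web:asia.web3.storage",
    nb: {
      content: "bafyOtherCar",
      location: ["https://example.com/bafyOtherCar.car"],
    },
  },
];
```

Note that each claim binds one CID to one token, which is why this approach needs a separate claim per CID per auth token.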

Benefits

This is probably the closest we can come to the implementation proposed in https://github.com/storacha/RFC/blob/feat/egress-billing/rfc/egress-billing.md
No need for any new services beyond the ones we are already implementing
We already query for content claims in the gateway, so this likely comes with very little performance penalty

Drawbacks

This will need to be migrated, along with the rest of our stack, as part of the UCAN 1.0 upgrade
Origin restrictions and other similar user-facing functionality would require a similar hack
We'll need to create a separate content claim per CID per auth token (i.e., we can't support https://github.com/storacha/RFC/blob/feat/egress-billing/rfc/egress-billing.md or any sort of "bulk configuration" in general)
We'll need to add support for this temporary hack to, at least, the following APIs, and likely others, and then remove/update it later:
a. https://github.com/storacha/content-claims/blob/main/packages/core/src/client/api.ts#L5
b. https://github.com/storacha/content-claims/blob/main/packages/core/src/capability/assert.js#L13
c. https://github.com/storacha/blob-fetcher/blob/main/src/api.ts#L29
Generally we get all the downsides of needing to update our capabilities in a variety of places without many of the benefits

2. Implement a temporary "auth token" service

In this solution we'd add a table (probably in Dynamo?) to store "auth tokens" generated by our clients. The gateway would make a request to a lightweight (HTTP?) service to determine whether an auth token is valid. If it is, the request would be served with no rate limits; if not, it would be subject to normal CID rate limits.
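The decision logic could be as small as the following TypeScript sketch. It uses an in-memory `Map` as a stand-in for the table, and everything in it (the `TokenRecord` shape, the `owner` and `revoked` fields, the `AuthTokenStore` class, and the `limitFor` function) is a hypothetical illustration rather than a proposed schema.

```typescript
// Hypothetical record shape; the real table columns are undecided.
interface TokenRecord {
  token: string;
  owner: string; // e.g. the DID of the client that created the token
  revoked: boolean;
}

// In-memory stand-in for the auth token table.
class AuthTokenStore {
  private records = new Map<string, TokenRecord>();

  put(record: TokenRecord): void {
    this.records.set(record.token, record);
  }

  isValid(token: string): boolean {
    const record = this.records.get(token);
    return record !== undefined && !record.revoked;
  }
}

type Limit = "unlimited" | "cid-rate-limit";

// Valid tokens skip rate limiting; everything else gets normal CID limits.
function limitFor(store: AuthTokenStore, token: string | null): Limit {
  return token !== null && store.isValid(token)
    ? "unlimited"
    : "cid-rate-limit";
}

const store = new AuthTokenStore();
store.put({ token: "zrptvx", owner: "did:key:zAlice", revoked: false });
```

Because validity is keyed only on the token, a single token can cover any number of CIDs, which is what makes the bulk use case straightforward here.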

Benefits

This matches the conceptual model we are proposing for the preliminary implementation of egress billing
This can be fairly easily extended to support origin restrictions and whatever other user needs we discover through this process
This does not expand the footprint of the UCAN 1.0 upgrade project
This trivially supports bulk auth use cases
This requires no changes to existing APIs

Drawbacks

This is a "web 2.0" style solution that only works because we control the gateway (though other people could run a gateway that did effectively the same thing and pointed at their own auth token service)
This requires the implementation of a temporary service that will likely be thrown away within the next 6-12 months
This requires a request to a new service, which could add latency to read requests (though it can be done concurrently with the content claims query and is likely to return as fast if not faster, and is fairly easily cached for at least short periods)
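The latency mitigation mentioned in the last point could be sketched as follows, in TypeScript. The `cached` wrapper and `authorizeAndLocate` function are hypothetical names, and the `validate` and `queryClaims` callbacks stand in for the token service request and the existing content claims query. The cache stores the in-flight promise, so concurrent requests for the same token share one lookup.

```typescript
type Validate = (token: string) => Promise<boolean>;

// Wrap a validator with a short-lived cache keyed by token.
function cached(
  validate: Validate,
  ttlMs: number,
  now: () => number = Date.now
): Validate {
  const cache = new Map<string, { result: Promise<boolean>; expires: number }>();
  return (token) => {
    const hit = cache.get(token);
    if (hit !== undefined && hit.expires > now()) return hit.result;
    const result = validate(token);
    cache.set(token, { result, expires: now() + ttlMs });
    // Do not keep failed lookups around until the TTL expires.
    result.catch(() => cache.delete(token));
    return result;
  };
}

// Start both lookups before awaiting either, so the token check adds
// latency only if it is slower than the content claims query.
async function authorizeAndLocate<T>(
  token: string,
  cid: string,
  validate: Validate,
  queryClaims: (cid: string) => Promise<T>
): Promise<{ valid: boolean; claims: T }> {
  const [valid, claims] = await Promise.all([validate(token), queryClaims(cid)]);
  return { valid, claims };
}
```

The TTL bounds how long a revoked token could continue to bypass rate limits, so it would need to be chosen with that trade-off in mind.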

Recommendations 🧙

Having rolled this around in my head for a few days, I'm partial to option (2) - I think it's conceptually simpler, easier to implement, and gives us more flexibility to iterate on the design of the authorization token and rate limiting system. Both systems likely require the implementation of a new auth-token/create capability that will allow users to create new auth tokens, but the implementation (especially for "bulk" use-cases) of this capability feels much simpler with option (2). While it would be nice to move closer to the eventual steady-state "decentralized" implementation of this functionality, I'm not sure that comes with many benefits at the moment, and the migration from an auth token "service" feels like less work than migrating the "hacky" location claim UCAN v.current implementation to the pol-based UCAN v1.0 service of the future.

I can definitely be convinced that it's worth going with (1) - would love to hear arguments in that direction!

One final note: either of these should support "private" data equally well, and I believe that question is effectively orthogonal to this conversation. Once we have the content claims indexing service up and running, the existence of any content claim for a particular CID delegated to the gateway will determine whether the gateway considers that CID "public". I could be wrong here, and if I am, I do think that's potentially a strong argument to go with (1).

travis · 2024-10-07T12:16:49Z

travis
Oct 7, 2024
Maintainer Author

I think the discussion here supersedes this and points to a better way to implement this - closing!

storacha/project-tracking#138 (comment)
