Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"De novo" project creation via API tokens/OIDC #11296

Closed
woodruffw opened this issue May 1, 2022 · 26 comments · Fixed by #12915
Closed

"De novo" project creation via API tokens/OIDC #11296

woodruffw opened this issue May 1, 2022 · 26 comments · Fixed by #12915
Assignees

Comments

@woodruffw
Copy link
Member

There's currently a little bit of friction around bootstrapping API tokens on a new PyPI project:

  1. Prepare a package for a new project
  2. Upload the first version of the project to PyPI using a user-scoped API token (not ideal for long-term actions) or user credentials (shouldn't be used in CI)
  3. Log into PyPI and create a project-scoped token for the new project
  4. Use the new project-scoped token for all future releases

A similar problem will happen with OIDC: we currently require a project to exist before an OIDC publisher can be registered against it, meaning that a user will have to upload at least one release with a user-scoped token or credentials before they can switch to a credential-less workflow.

Some potential solutions:

  1. Allow users to pre-create projects on PyPI before uploading any releases.
    • Pros: Solves the "de novo" problem above.
    • Cons: Malicious users can abuse this as a low-friction way to squat names.
  2. Allow users to pre-register OIDC publishers; on use, the project would be created for the first time.
    • Pros: Solves the "de novo" problem and minimizes squatting potential.
    • Cons: Complicates the OIDC data mode: pre-registering a publisher means either pre-creating the project (same problem as (1)) or creating a "symbolic" link to a nonexistent project (another column). We'd have to be careful about invalidating a publisher if someone else races/beats it to creating a project, so that someone can't quietly takeover a project's releases via OIDC.
  3. Something else?
@woodruffw
Copy link
Member Author

From discussions, I'm currently leaning towards solution (1).

Here's the proposed flow:

  1. An already registered PyPI user who meets certain requirements can "pre-register" a project name via the Warehouse web UI
    • Some potential requirements: the account must be over a week old, must have a verified email, must have 2FA, etc.
  2. Pre-registering is done in a new view + form, where the user specifies the name they'd like to pre-register.
  3. If the name is available and passes PyPI's normal name restriction checks (e.g. for typosquatting) an empty project is created, with the user in the Owner role.
  4. Simultaneously, a timer is begun: if the user does not upload a release to the project within the reservation timespan, the project is deleted entirely and released for anyone to use. This has the added effect of invalidating any tokens that the user might have minted (but not used) for the project, since the project ID (and user role) will no longer match.
    • Additionally, a user who allows the reservation period to expire could be temporarily locked out of other reservations for a short period, like 72 hours.

Additional considerations:

  • No user should be able to pre-register more than a small handful of project names at once. 5 is probably a reasonable restriction, since it'll allow people uploading a handful of new projects to do them all at once.
  • We could consider adding a "cooldown" period (with exponential backoff) for expired registration periods, to prevent people from squatting a name by cycling it between two accounts. This may be more work than it's worth, however, since namesquatting on PyPI is already easier than that.

@woodruffw
Copy link
Member Author

Working on this this week.

@woodruffw woodruffw self-assigned this Nov 1, 2022
@woodruffw
Copy link
Member Author

Just some notes:

We'll probably want the button for this under the project management view (/manage/projects), i.e. somewhere on the top of this page:

Screenshot 2022-11-03 at 5 26 42 PM

...probably right below "Your projects," with language like "Reserve a project name." The button will be inactive if the user isn't eligible to reserve a project name (per the restrictions above).

Clicking that button should, in turn, produce a form modal (not a separate page, since it's pretty much a single input) that includes (1) the name to reserve, and (2) a confirmation checkbox ("I understand that this is a temporary reservation, and that I need to upload a package in the next XX hours to make it permanent"), plus a submit button.

Pressing the button should produce a success or error flash, as appropriate.

@woodruffw
Copy link
Member Author

Current sketch:

image

As-is, I've dropped the plan for the modal: it doesn't add anything to the flow, and makes it less accessible.

@webknjaz
Copy link
Member

webknjaz commented Nov 9, 2022

Hi @woodruffw, it's great that this effort is getting revived. Have you seen the previous discussion and my previous suggestions @ #6378 (comment) + #6378 (comment).

It seems that people in the discussion agreed that it was a good idea, and it seems to partially align with what you're describing here. So I'm mostly just checking to see if you know about it and whether you're planning to incorporate those ideas.

@woodruffw
Copy link
Member Author

I did, thanks! And thanks for cross-referencing as well.

I think the ultimate plan here is close to what you commented in that issue: we aren't planning single-use tokens, and instead we're going to make "reservations" part of the Warehouse UI. I'll have some initial work pushed up soon to demonstrate the rough idea there.

@woodruffw
Copy link
Member Author

Demonstrating the "reserved" chip on a project:

Screenshot 2022-11-10 at 5 14 30 PM

@webknjaz
Copy link
Member

@woodruffw that sounds roughly what I had in mind originally (except for the expiration part). Do you think the Reserve button could also automatically make a token so that it's not a two-step process but a single-step one?

@di
Copy link
Member

di commented Nov 11, 2022

I think we want to limit this to use by OIDC publishers only (see #11272).

@woodruffw
Copy link
Member Author

woodruffw commented Nov 11, 2022

I think we want to limit this to use by OIDC publishers only (see #11272).

I might not be understanding, but I think there's a chicken-and-egg problem with restricting this to OIDC publishers: we don't know which publishers have been registered against with PyPI projects until the project already exists, so we can't temporarily register the project itself via OIDC.

The exception to that would be some kind of "TOFU" scheme, where the user could submit their OIDC token to an endpoint with the project name they'd like to reserve, which would then reserve it and produce a scoped API token. I think that would work, but it would need to be slightly different than the endpoint we've already built in #11272 (since it would need to specify the project being created). But even then we'd need to figure out how to add the owner role to the temporary project, since we don't know (ahead of time) which PyPI users are associated with which OIDC publishers.

@di
Copy link
Member

di commented Nov 15, 2022

I'm thinking that there would be an 'add a publisher' view that has a free-form field for the project name, limited to projects that don't exist -- so OIDCPublishers would be slightly less coupled to the Project than they currently are.

The user creates the publisher, which doesn't necessarily reserve the project name, but lives in a 'pending' state until it enables them to publish successfully, creating a strong relationship between the OIDCPublisher and Project like we have now.

As a result, we don't need 'reservations' or expirations, because if another user (or publisher) successfully publishes it before they do, we can just deactivate/disable all other publishers that are in a 'pending' state for that project name.

So, sort of TOFU, but more like TOFU with a headsup?

@woodruffw
Copy link
Member Author

I think that scheme makes sense!

I'll need to think some more about how to shoehorn this into the existing user <-> projects <-> providers relationship. Currently every OIDCProvider has a many-many Projects relationship, which might be a little cumbersome to refactor.

Some things I'm not sure about yet:

  • How should we handle multiple users adding the same publisher for different uncreated project names? For example, users Alice and Bob might both register release.yml @ foo/bar, but Alice registers it for cool-project-1 while Bob registers it for cool-project-2. Should we honor both registrations, or treat them as first-come-first serve on just the uniqueness of the OIDC token's claims?

  • How do we track which user should ultimately get added to the project? We know the user(s) doing the reservation at the "add a publisher" stage, but when we're consuming the OIDC token all we know is what's in the token. This might require us to add some kind of User relationship for each OIDCProvider, which again complicates the data model (since the point of OIDC providers is to be entirely independent of user identities).

Some random ideas (not fully fleshed out):

  • We could have a PendingOIDCProvider model hierarchy, roughly mirroring the existing OIDCProvider hierarchy. This would have a project name instead of a many-many project relationship, and we could "reify" the pending provider into a full OIDCProvider once a compatible OIDC token is seen.

    • Pros: Minimally invasive, requires minimal refactoring of the current OIDC handling
    • Cons: Structurally complex, requires significant duplication against the existing OIDCProvider hierarchy
  • We could add some kind of pending_creation: str | None field to OIDCProvider, along with a constraint of pending_creation XOR projects. pending_creation would then be the name of the project we'll create once a compatible OIDC token is seen.

    • Pros: Still relatively unintrusive, requires fewer changes than a separate hierarchy.
    • Cons: Still requires us to track the user somehow, also unclear how to handle "duplicate" requests from multiple users

@di
Copy link
Member

di commented Nov 17, 2022

Should we honor both registrations, or treat them as first-come-first serve on just the uniqueness of the OIDC token's claims?

I think they should both be honored? This wouldn't be any different than setting the same publisher for multiple existing projects now.

How do we track which user should ultimately get added to the project?

This should be the user that created the publisher -- probably something we want to track anyways.

Some random ideas (not fully fleshed out):

I think either approach could work. The PendingOIDCProvider seems duplicative, but possibly also safer, as we could never accidentally forget to check the pending_creation field and allow a 'pending' publisher to actually publish something they shouldn't have been able to publish, or forget to clean up 'duplicate' publishers, etc.

@woodruffw
Copy link
Member Author

This should be the user that created the publisher -- probably something we want to track anyways.

Makes sense! Yeah, we should definitely be tracking this.

I think either approach could work. The PendingOIDCProvider seems duplicative, but possibly also safer, as we could never accidentally forget to check the pending_creation field and allow a 'pending' publisher to actually publish something they shouldn't have been able to publish, or forget to clean up 'duplicate' publishers, etc.

Yeah, I think I'm leaning towards the PendingOIDCProvider approach. Thinking about it more, the duplication here won't be too bad -- we can turn the actual claim verification parts (which are shared) into mixin functionality.

@woodruffw
Copy link
Member Author

woodruffw commented Nov 28, 2022

Now that the initial models are merged, I'm going to start work on the corresponding routes and views.

My current plan: a new "Publishing" link under the "Your account" ToC, which goes to a new view just for pending OIDC publishers. This view will closely mirror the extant /project/{project}/settings/publishing/ view, but will be at the account layer instead.

Edit: Something like this:

Screenshot 2022-11-28 at 2 57 43 PM

@woodruffw
Copy link
Member Author

Just to fully enumerate the behavior here, this is how API token exchange with a "pending" OIDC provider will work:

  1. PyPI's token endpoint receives the OIDC JWT;
  2. The JWT's signature is verified;
  3. The PendingOIDCProvider for the JWT's claims is retrieved;

Then, if the pending provider's project does not already exist, it is created, a temporary macaroon is minted for it, and the PendingOIDCProvider is "reified" into a full-fledged OIDCProvider.

If the pending provider's project does already exist, then the PendingOIDCProvider is invalidated, removed, and no temporary macaroon is returned.

@woodruffw
Copy link
Member Author

woodruffw commented Dec 8, 2022

From discussion with @di, there are some additional constraints we'll need to impose to limit the potential for abuse from "pending" OIDC providers:

  1. In addition to the user restrictions described above (needs 2FA, verified email, etc.), we should limit the number of "pending" OIDC providers that any given user should be able to register at once. This will limit any individual user's ability to "grief" other users.
  2. Each "pending" OIDC provider's "provider-unique" state needs to remain unique. For GitHub, that means that we need to retain the UniqueConstraint on repository_name, repository_owner, workflow_filename. This is more a consequence of the data model than a security property: we need to map a OIDC JWT to its pending provider, and then to exactly one project for creation.
  3. Finally, as a consequence of (2): we need to present users with a big warning when pending OIDC provider registration fails, if the failure reason is that someone else has already registered a pending OIDC provider with the same "provider-unique" state. This will prevent users from accidentally activating a pending OIDC provider that they didn't actually register, giving some other user control over a newly created package.

Finally: we should not give pending OIDC providers an expiration window: doing so would would allow an attacker to place a reservation as soon as the legitimate user's reservation has expired, leading to a confusing situation where the legitimate user believes they're creating a project under their own account but are actually creating it under the attacker's account.

Edit: With these constraints, the PendingOIDCProvider.project_name does not have to be unique: multiple users can attempt to register for the same nonexistent PyPI project name using different OIDC providers, and the first one to actually submit an OIDC token is the winner.

Edit: These constraints have been added to #12646.

@woodruffw
Copy link
Member Author

#12646 is the the penultimate component of this: it adds the views, routes, forms, etc. needed to register and manage "pending" OIDC providers.

I'll do a follow-up PR for the final part, which will be validating OIDC JWTs using those "pending" providers similarly to the current functionality for "normal" OIDC providers.

@woodruffw
Copy link
Member Author

I'll open the final PR to close this out after #12646 is merged.

Here's an enumeration of its behavior when mint-token is used with an OIDC token linked to a pending OIDC provider:

  1. If the pending provider's project name already exists, fail (since the project has already been created, possibly by an unrelated party).
    • Remove the pending OIDC provider in the process, since it is now invalid.
  2. If the pending provider's project name does not exist, create the project.
    • We need to do this up-front in order to create an appropriate Macaroon caveat for its token.
    • Convert the pending provider into a full provider, and attach it to the newly created project.
    • Perform the token minting as normal, now that we have both a project and a full OIDC provider.

@woodruffw
Copy link
Member Author

woodruffw commented Jan 17, 2023

Oh, and some miscellaneous cleanup/usability tasks:

  1. The provider pages should be inverted to list all existing providers, followed by the form(s) to add new providers.
  2. All OIDC providers (both pending and normal) should be listed under manage/account/publishing, not just pending ones.
  3. Possibly unifying the creation of both provider types under manage/account/publishing, with the correct one selected contextually (dropdown for normal providers, freeform text input for pending providers).
  4. The admin app should provide a list of pending providers, at minimum, and possibly some functionality for administering them.

@di
Copy link
Member

di commented Jan 25, 2023

If the pending provider's project name already exists, fail (since the project has already been created, possibly by an unrelated party).
Remove the pending OIDC provider in the process, since it is now invalid.

This should probably just happen at upload time, when the project is created by the unrelated third party (and we should probably notify the owner of the pending provider)

@woodruffw
Copy link
Member Author

This should probably just happen at upload time, when the project is created by the unrelated third party (and we should probably notify the owner of the pending provider)

Makes sense!

@webknjaz
Copy link
Member

I think we want to limit this to use by OIDC publishers only (see #11272).

What about non-OIDC publishers? I think, it'd make sense to still support my original idea for those. Probably as a disconnected effort, though.

@woodruffw
Copy link
Member Author

Status update: starting work on those follow-on items now.

@webknjaz
Copy link
Member

@woodruffw when will this be available for early testing? I'd like to try it out and I hope to introduce it into the pypi-publish GitHub Action too.

@woodruffw
Copy link
Member Author

@webknjaz Very soon! You should register interest on #12965 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants