Determine new API URL structure for warehouse (starting with new JSON API) #284

ctheune · 2014-04-15T14:49:28Z

At the PyCon2014 sprint I have started to make bandersnatch easier to cache. This means moving away from XML-RPC in general.

I'm leveraging the existing /pypi//json API which already helps, but I'll need two more endpoints:

get a list of all packages and their most recent serial
get the changelog
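As a rough illustration of why a package/serial listing helps a mirror, here is a minimal sketch. It assumes the listing can be read as a mapping of package name to most recent serial; that response shape and the function name `packages_to_sync` are hypothetical and are not specified anywhere in this thread.

```python
def packages_to_sync(local_serials, remote_serials):
    """Compare serials to decide what a mirror has to fetch or drop.

    Both arguments are assumed to map package name -> most recent serial.
    """
    # A package is stale if it is new to us or its remote serial is higher.
    stale = sorted(
        name
        for name, serial in remote_serials.items()
        if local_serials.get(name, -1) < serial
    )
    # A package that vanished from the remote listing should be removed locally.
    removed = sorted(set(local_serials) - set(remote_serials))
    return stale, removed
```

With a listing like this, a mirror only needs one cacheable request to find out which packages changed, instead of one XML-RPC call per package.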

I implemented the necessary code on a branch for PyPI:
https://bitbucket.org/ctheune/pypi/branch/ctheune-bandersnatch-json

However, I don't want to force this through; I'd rather have a decision on what the URLs should look like.

Ideally we can implement this in both warehouse and PyPI in a way that bandersnatch can support both of them without breaking when you guys switch the public server (and I might be on vacation. ;) )

dstufft · 2014-04-15T15:11:34Z

So I have some ideas on both a new API for accessing data about PyPI and also some rough ideas for a new mirroring API in general. I'll take a look at what you have so far.

dstufft · 2014-04-15T16:02:51Z

So if I read your PR correctly, the new URLs would be https://pypi.python.org/json/changes and https://pypi.python.org/json/packages? If that's the case then I'm not really a big fan of adding those in Warehouse.

Ideally what I'd like to do is get a nice hypermedia based API set up, probably rooted at /api/. Using something based on https://jsonapi.org/ is a possibility. There are a few options and I need to dive into it to figure out what exactly needs to be done. Ideally the new API will also replace the existing JSON API and we can deprecate the old JSON API (but leave it in place until (or if!) it's no longer getting traffic).

r1chardj0n3s · 2014-04-15T16:09:19Z

I echo @dstufft in this. The question I have is whether we go all the way to /api/v0/ to future-proof us a little too. Unless we'd be happy with /api-v1/ or similar later on?

dstufft · 2014-04-15T16:12:40Z

So there are two ways to deal with that. One way is to version using the content type, so the URL is always /api/ and the server selects the version based on the requested content type; GitHub does this with headers like Accept: application/vnd.github.beta+json or Accept: application/vnd.github.v3+json. The other way is to do /api/v0/ etc. I lean towards using the content type, but we'll need to figure out in general how we want to handle versioning going forward and what the code to handle that looks like.
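To make the content-type option concrete, here is a minimal sketch of picking an API version from the Accept header. The vendor media type `application/vnd.pypi.<version>+json`, the version names, and the function name are all hypothetical (nothing in this thread settles them), and q-value weighting is deliberately ignored to keep the example short.

```python
import re

# Hypothetical vendor media type; the real name was never decided in this thread.
_MEDIA_TYPE = re.compile(r"^application/vnd\.pypi\.(?P<version>[a-z0-9]+)\+json$")


def select_api_version(accept_header, supported=("v1", "v2"), default="v1"):
    """Return the first supported version named in the Accept header.

    Falls back to `default` when no versioned media type is requested.
    """
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.9" and normalise case.
        media_type = part.split(";", 1)[0].strip().lower()
        match = _MEDIA_TYPE.match(media_type)
        if match and match.group("version") in supported:
            return match.group("version")
    return default
```

The URL stays at /api/ for every version, so clients that send a plain `Accept: application/json` keep working and simply get the default version.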

steveklabnik · 2014-04-15T16:22:08Z

Just let me know if I can help out at all regarding JSON API stuff.

brainwane · 2018-01-24T20:52:36Z

As I understand it, this issue (designing and implementing a new Warehouse API) is a prerequisite for integrating twine into pip and thus dealing with pypa/packaging-problems#76 and pypa/packaging-problems#60, per @dstufft's comment in pypa/twine#127. Is that correct? If so, I'd suggest we add this to one of our upcoming milestones.

brainwane · 2018-01-30T01:06:57Z

We talked about this issue in today's bug triage meeting and folks explained to me: Even though this may be necessary for some Twine improvements, this is not a ticket we will address before launch. This is a new feature and is best suited for post-launch; Warehouse needs to be done before we can improve twine.

phildini · 2018-02-11T16:04:17Z

Hello! Quick call-out that some of us would really enjoy the JSON API containing info like owners/maintainers before the XMLRPC API is shut down. See ticket linked right above. Cheers, thanks for all your work!

brainwane · 2018-03-06T20:01:37Z

I've marked #2914 as something we should address before shutting down legacy PyPI, but developing the structure for the new API can wait till after we shut down the legacy site.

brainwane · 2018-03-12T17:08:30Z

As we develop the new API we should consider #347 as well. And I've added this issue to the list of things we might work on at the PyCon sprints.

theacodes · 2018-04-17T22:02:51Z

I would also love an API for managing both my account and my projects. For some examples of where this is useful:

We have an account that owns all of the projects our organization publishes. I want to rotate its password every week.
Likewise, I want to audit all of my organization's projects and verify that no more than n people have admin access to each of them.
I am actually in the process of migrating all of my projects in my personal account to a new account. It would be cool to do that programmatically.

I'm happy to help with the design and discussions around this (my day job is helping design APIs and implement clients for Google Cloud Platform).

di · 2018-04-17T22:10:38Z

I am actually in the process of migrating all of my projects in my personal account to a new account. It would be cool to do that programmatically.

We could probably just call this "the ability to add/remove collaborators via API" I think, since actual account migration is probably not something that happens very often.

dstufft · 2018-04-17T22:16:07Z

I'm hoping to carve out some ideas on this soon, maybe next week? Ideally the output of this ticket is the basic framework/skeleton of the API, and then further tasks can extend the functionality of it.

Defining APIs for PyPI is a tad bit trickier than the general case, because we generally have to design for a decade+ (for instance, XMLRPC got added, but it has not aged or scaled well! From my investigations so far, GraphQL would be a similar mistake). I'm almost certain that something Hypermedia based is the way forward here, but there are a lot of different ways to take that, and we'll also need to be sure to include all of the typical scaling things like pagination and the like.

theacodes · 2018-04-17T22:19:56Z

We could probably just call this "the ability to add/remove collaborators via API" I think, since actual account migration is probably not something that happens very often.

Yep, just calling out a specific use case.

I'm hoping to carve out some ideas on this soon, maybe next week? Ideally the output of this ticket is the basic framework/skeleton of the API, and then further tasks can extend the functionality of it.

Sounds good, happy to review and be around to bounce ideas off of (I'm on IRC during PST working hours as thea).

Hypermedia based is the way forward here, but there are a lot of different ways to take that, and we'll also need to be sure to include all of the typical scaling things like pagination and the like.

Agreed - REST/JSON (and to some extent RPC/JSON) has more or less stood the test of time (in tech years, at least). Happy to provide feedback on that sort of stuff as well.

dstufft · 2018-05-18T20:18:48Z

@theacodes I guess I'm just not seeing what an IDL actually gets us here? The example of JSON Hyper Schema has JSON-Schema as part of it, it's just instead of your client hardcoding URLs and actions all over the place, it can discover them at runtime. You can also ship them as part of your client so that a network access isn't required in the common case (unless you introduce a new schema).
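A minimal sketch of the "discover instead of hardcode" idea, assuming a schema document that carries a `links` list of `rel`/`href` pairs in the style of JSON Hyper-Schema. The example schema, its URLs, and `resolve_link` are hypothetical, and `str.format` stands in for real URI template expansion.

```python
# Hypothetical schema document; it could be fetched at runtime or shipped
# with the client so that no network access is needed in the common case.
SCHEMA = {
    "links": [
        {"rel": "self", "href": "/api/projects/{name}"},
        {"rel": "releases", "href": "/api/projects/{name}/releases"},
    ]
}


def resolve_link(schema, rel, **variables):
    """Look up a link by its relation name and fill in the URL variables."""
    for link in schema.get("links", []):
        if link["rel"] == rel:
            # str.format is a simplification of URI template expansion.
            return link["href"].format(**variables)
    raise LookupError(f"no link with rel={rel!r}")
```

The client code only knows relation names such as "releases"; if the server moves the URL, only the schema changes.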

dstufft · 2018-05-18T20:22:57Z

@theacodes If it would be helpful, I'm happy to jump on a call or into IRC to go over the two things in a higher bandwidth setting instead of throwing github comments back and forth. I feel like there are probably some misconceptions on both sides about RPC and Hypermedia, and perhaps a higher bandwidth mechanism would help to work out what those are?

theacodes · 2018-05-22T15:59:24Z

I don't want to hold up progress. I would love to see a design doc or proof of concept if/when we have one.

asmacdo · 2018-05-25T02:34:23Z

A POC is up at #4078, and I'm using this Etherpad to document the design proposal.
https://pad.sfconservancy.org/p/hypermedia_api_design

I did my best to incorporate the ideas discussed here, as well as in person discussions at pycon. I've set aside some time to keep working on this, so all feedback is welcome.

Following discussion with team, the xmlrpc api is not deprecated today. It will not disappear soon.

Also, as:
- parsing the legacy html api [1] is considered bad practice
- discussions exist to create equivalent apis to their deprecated/legacy apis [1] [2]

We chose to implement the xmlrpc one.

[1] https://warehouse.readthedocs.io/api-reference/legacy/#simple-project-api
[2] pypi/warehouse#284
[3] pypi/warehouse#4078

Related T422

@werwty

Adds a new API that covers the usage of the XML-RPC and the simple api. pypi#284

This work is intended as a proof of concept for how a hypermedia API could be implemented, setting up the patterns that can be extended to cover the rest of the API.

The new API introduces pagination to reduce the load for list views. Serializers are used to increase maintainability and code reuse. Some filtering is added to meet the use cases of XML-RPC.

Many thanks to @werwty for hacking out an initial implementation which has been squashed.

Introduces new dependencies:
apispec==0.37.0 : Used to generate an api spec at /api/
marshmallow==3.0.0b10 : Used to serialize responses
PyYAML==3.12 : Dependency of apispec

All new endpoints are added to a new domain, "sandbox".

Note: Locally, all subdomains were treated just like the actual domain, so I was unable to make the subdomain work as expected. I followed the pattern that forklift uses, and guessed how it should work.

brainwane · 2020-06-02T17:09:32Z

Per discussion in IRC just now -- the author closed #4078 last year, and it's unclear whether this kind of feature would be welcome if someone were to try again at implementing it.

brainwane · 2020-06-02T17:10:48Z

Maintainers' opinions are welcome. Also, in my opinion, it would be easier to finalize design and implementation, test and review, and deploy this if we had funding for it.

brainwane · 2020-11-09T15:06:11Z

@asmacdo that Etherpad has now dissolved and reset -- did you keep a copy of your design proposal anywhere else?

Reminder to others that work on this could probably use funding.

asmacdo · 2020-11-10T21:41:34Z

@brainwane unfortunately I don't have a backup, but the PR could still be distilled down into a design proposal.

Key points:

Create a resource based, hypermedia API
Use Marshmallow to serialize
use apispec to generate OpenAPI schema

Additional necessary features

Pagination
Filtering
CDN caching (consider how pagination/filtering will affect)
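As a rough sketch of how pagination, filtering, and cache-friendly URLs could fit together, assuming projects are plain dictionaries and filters are exact-match key/value pairs. The function name, the response keys (`count`, `results`, `links`), and the `/api/projects` URL are hypothetical, and Marshmallow and apispec are left out so the example runs on the standard library alone.

```python
from urllib.parse import urlencode


def paginate(items, base_url, page=1, per_page=2, **filters):
    """Return one page of matching items plus hypermedia links to its neighbours."""
    matching = [
        item for item in items
        if all(item.get(key) == value for key, value in filters.items())
    ]
    start = (page - 1) * per_page
    chunk = matching[start:start + per_page]

    def url(number):
        # Sorting the filters gives every page one canonical URL, which keeps
        # a CDN from caching the same page under several query orderings.
        query = dict(sorted(filters.items()), page=number, per_page=per_page)
        return f"{base_url}?{urlencode(query)}"

    links = {"self": url(page)}
    if page > 1:
        links["prev"] = url(page - 1)
    if start + per_page < len(matching):
        links["next"] = url(page + 1)
    return {"count": len(matching), "results": chunk, "links": links}
```

A client walks the collection by following `links["next"]` until it is absent, so it never has to construct page URLs itself.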

di · 2022-06-26T22:12:46Z

Bit of a related update here: PEP 691 has been accepted.

r1chardj0n3s changed the title ~~Decide for better json API URL structure~~ Determine new API URL structure for warehouse (starting with new JSON API) Apr 15, 2014

This was referenced Jun 6, 2016

Provide a way to get a list of all trove classifiers #1241

Closed

JSON API for Trove Classifiers #1244

Closed

nlhkabu added the requires triaging maintainers need to do initial inspection of issue label Jul 2, 2016

brainwane added feature request needs discussion a product management/policy issue maintainers and users should discuss labels Jan 24, 2018

brainwane added this to the 5: Shut Down Legacy PyPI milestone Jan 30, 2018

ewdurbin mentioned this issue Feb 11, 2018

API providing contributor data #2914

Open

brainwane added the APIs/feeds label Feb 12, 2018

brainwane modified the milestones: 5: Shut Down Legacy PyPI, 6. Post Legacy Shutdown Mar 6, 2018

brainwane mentioned this issue Mar 18, 2018

Python package index upload API spec pypa/packaging-problems#128

Open

brainwane mentioned this issue Apr 11, 2018

mobile app #3623

Open

This was referenced Apr 19, 2018

Ordering of releases in JSON API #3650

Closed

Search API to search over keywords #3436

Open

Package update feed #2165

Closed

asmacdo mentioned this issue May 25, 2018

Add hypermedia API to replace XML-RPC and simple #4078

Closed

nlhkabu removed the Europython-2018-sprint label Jul 31, 2018

brainwane mentioned this issue Aug 10, 2018

Add API endpoint to get latest version of all projects #347

Open

di mentioned this issue Dec 26, 2018

Add API to search for packages matching a pattern #5231

Open

brainwane mentioned this issue Jun 27, 2019

Update to new PyPI API myint/yolk#16

Open

di mentioned this issue Oct 8, 2019

API feature to request new project-scoped upload API token #6396

Open

di mentioned this issue Jan 2, 2020

Allow querying RSS feeds for items published since some time #7116

Open

uranusjr mentioned this issue May 29, 2020

add search subcommand pypa/pipx#249

Closed

brainwane mentioned this issue Jun 10, 2020

Request: PEP to describe current Warehouse JSON API pypa/packaging-problems#367

Open


uranusjr mentioned this issue Aug 3, 2020

a search --strict option to search only for an exact package name pypa/pip#7985

Closed

di mentioned this issue Sep 16, 2020

Publish a list of malicious packages that have been taken down #4703

Open

pradyunsg mentioned this issue Nov 19, 2020

[2020-resolver] Pip downloads lots of different versions of the same package pypa/pip#8713

Closed

di mentioned this issue Mar 2, 2021

Add cursor to updates RSS feed. #9155

Closed

di mentioned this issue Mar 11, 2021

browse(classifiers) without XML-RPC #9156

Closed

xmunoz mentioned this issue Apr 23, 2021

New download API for PyPI psf/fundable-packaging-improvements#22

Open

abitrolly mentioned this issue Sep 1, 2021

REST API is missing package_roles info from XML-RPC #9700

Open

di removed the requires triaging maintainers need to do initial inspection of issue label Jun 26, 2022

pradyunsg mentioned this issue Oct 7, 2022

"pip search" should use PackageFinder, not XML-RPC API pypa/pip#395

Closed

di mentioned this issue Nov 6, 2022

RSS feeds and JSON APIs don't provide the capabilities of Mirroring Support that XML-RPC does #12488

Open
