Search API to search over keywords #3436

Open
mitar opened this issue Mar 27, 2018 · 13 comments
Labels
APIs/feeds; blocked (Issues we can't or shouldn't get to yet); feature request; help needed (We'd love volunteers to advise on or help fix/implement this.); search (Elasticsearch, search filters, and so on)

Comments

@mitar

mitar commented Mar 27, 2018

With the planned deprecation of the XML-RPC API, it seems there will be no way to search packages by their keywords. So I would like to make a feature request for this in the new API.

@brainwane added the help needed, feature request, search, and APIs/feeds labels Mar 27, 2018
@brainwane added this to the 6. Post Legacy Shutdown milestone Mar 27, 2018
@brainwane
Contributor

@mitar Thank you for bringing this up! We will only remove the XML-RPC API when its functionality is covered by other new APIs, so I have put this in a future milestone, and I've tagged it so people will see it when looking at API issues. Thanks again.

@yuvalreches

Hey @brainwane
Looking at your roadmap and this thread, do I understand correctly that the old XML-RPC API will remain operational in the near future under pypi.org as well?

Currently, when running pip search <packageName> against pypi.python.org, we get an HTTP 302 and are redirected to pypi.python.org/pypi

However, when running the same command (pip search) against pypi.org, we get an HTTP 404 error while requesting http://pypi.org/RPC2

Isn't it supposed to be redirected to the old API for now?
Thanks

@di
Member

di commented Apr 8, 2018

@yuvalreches:

Looking at your roadmap and this thread, do I understand correctly that the old XML-RPC API will remain operational in the near future under pypi.org as well?

Yes, the XML-RPC API will remain for now.

Currently, when running pip search <packageName> against pypi.python.org, we get an HTTP 302 and are redirected to pypi.python.org/pypi

Correct, the 302 redirect from https://pypi.python.org/ to https://pypi.python.org/pypi doesn't exist for pypi.org.

However, when running the same command (pip search) against pypi.org, we get an HTTP 404 error while requesting http://pypi.org/RPC2

I'm not sure exactly what index URL you're using here, but you should be using https://pypi.org/pypi:

$ pip search foobar -vvv --index https://pypi.org/pypi
Starting new HTTPS connection (1): pypi.org
https://pypi.org:443 "POST /pypi HTTP/1.1" 200 272
foobar (1.1)         - This is the FooBar  project. (foo is taken at PyPI -
                       hahaha)
django-foobar (1.0)  - Super awesome portable module for Django

@yuvalreches

Hey @di

I get your point, but why redirect to an endpoint that doesn't exist?

Would it be possible to have the same redirect as https://pypi.python.org/ has for the time being (until XML-RPC v1 is deprecated)?

Meaning redirect from https://pypi.org/ to https://pypi.org/pypi instead of https://pypi.org/RPC2

It would help us a lot

@di
Member

di commented Apr 9, 2018

I get your point, but why redirect to an endpoint that doesn't exist?

I'm not sure I follow. There is nothing on pypi.org that is redirecting to an endpoint that doesn't exist. The base URL for link is just misconfigured, e.g. it's using 'https://pypi.org' + '/' + 'RPC2' instead of 'https://pypi.org/pypi' + '/' + 'RPC2', where 'RPC2' is just the search query.

Would it be possible to have the same redirect as https://pypi.python.org/ has for the time being (until XML-RPC v1 is deprecated)?

Meaning redirect from https://pypi.org/ to https://pypi.org/pypi instead of https://pypi.org/RPC2

I don't think this is necessary and we're unlikely to add it. Your client should just use the correct search index instead.

@yuvalreches

Sorry, my bad.
You are right - the redirect to RPC2 happens upon pip search <package> -i https://pypi.org instead of using pip search ... -i pypi.org/pypi
It does look strange to me to have a redirect to a page that doesn't exist.

I'll explain our scenario:
JFrog Artifactory sends the search to PyPI when a repository points at it.
The registry URL has been configured by default (for the past several years) as https://pypi.python.org,
and every search request goes to this URL.

When sending the search request (POST method) we rely on the redirect and perform the search on the path redirected to.

Now with the new registry version we don't get that redirect and the search fails.

Easiest way to see it:
curl -XPOST -i https://pypi.python.org results in HTTP/2 302 with location: https://pypi.python.org/pypi
Which is the desired behaviour.

However in the warehouse
curl -XPOST -i https://pypi.org results in HTTP/2 405

Setting up the same redirect in Warehouse would let Artifactory users keep using the search function without anything breaking when the full redirect of pypi.python.org to pypi.org takes place.
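
To make the scenario above concrete, here is a minimal sketch of a client that relies on the redirect. It is not Artifactory's code: the function name `search_url_from_response` and the set of status codes it accepts are assumptions made for illustration only. It shows why a 302 with a Location header yields a usable search URL while a 405 leaves the client with nothing to follow.

```python
from typing import Optional
from urllib.parse import urljoin


def search_url_from_response(
    base_url: str, status: int, location: Optional[str]
) -> Optional[str]:
    """Return the URL to send the search to, or None if no redirect was offered.

    Hypothetical helper: it models a client that only knows the registry
    root and depends on the server's Location header to find the endpoint.
    """
    if status in (301, 302, 307, 308) and location:
        # Location may be absolute or relative; urljoin handles both.
        return urljoin(base_url, location)
    return None


# The two responses reported in the curl commands above.
print(search_url_from_response(
    "https://pypi.python.org", 302, "https://pypi.python.org/pypi"
))  # → https://pypi.python.org/pypi
print(search_url_from_response("https://pypi.org", 405, None))  # → None
```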

@di
Member

di commented Apr 9, 2018

@yuvalreches After chatting with the other maintainers, we realized that the attempt to use /RPC2 is standard behavior for most XML-RPC clients when the root URL does not support XML-RPC. We decided to add this endpoint to Warehouse to duplicate the /pypi endpoint, which should fix your issues once merged (#3594).

However, Artifactory should probably still use the correct XML-RPC endpoint (either /pypi or /RPC2) directly when using the XML-RPC API, rather than letting your client test the root domain and then fall back on /RPC2. That fallback technically generates unnecessary requests that are likely slowing down responses for your users.
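
As an illustration of the /RPC2 fallback described above, the sketch below derives the path an XML-RPC client would POST to from its configured URL. The helper name `xmlrpc_handler_path` is hypothetical. The rule it encodes (use /RPC2 when the URL has no path) is an assumption modelled on how Python's standard `xmlrpc.client.ServerProxy` is understood to behave; other clients may differ.

```python
from urllib.parse import urlsplit


def xmlrpc_handler_path(index_url: str) -> str:
    """Return the path an XML-RPC client would POST to for this URL.

    Assumption: a URL with an empty path falls back to "/RPC2",
    which is the conventional default for XML-RPC clients.
    """
    path = urlsplit(index_url).path
    return path or "/RPC2"


print(xmlrpc_handler_path("https://pypi.org"))       # → /RPC2
print(xmlrpc_handler_path("https://pypi.org/pypi"))  # → /pypi
```

Pointing the client at https://pypi.org/pypi (or https://pypi.org/RPC2) names the endpoint explicitly, so no fallback is involved.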

@yuvalreches

yuvalreches commented Apr 10, 2018

Thank you @di

Currently we want to make sure Artifactory instances won't be broken due to the changes implemented in Warehouse, and the /pypi redirect will assure that :)

I see your PR is already merged, when can we expect to see the change in pypi.org?

We sure have that in our roadmap.
Also, please feel free to reach out again when an RPC2 beta is operational so we can implement the changes needed in Artifactory.

@di
Member

di commented Apr 10, 2018

Currently we want to make sure Artifactory instances won't be broken due to the changes implemented in Warehouse, and the /pypi redirect will assure that.

I think we are having communication issues. Let me be clear: there will not be a redirect from / to /pypi in Warehouse.

What we have added is an endpoint at /RPC2 that your client should be able to use. We have not added any redirects.

I see your PR is already merged, when can we expect to see the change in pypi.org?

It is live now.

@yuvalreches

Artifactory's code relies only on the Location header that is returned for a POST to pypi.org (it doesn't matter whether it's /pypi or /rpc2).

Would it be possible to change the response of such requests?
I see it still returns HTTP/2 405 instead of HTTP/2 302 location: https://pypi.org/rpc2

Such a redirect (like the one on pypi.python.org, although that one points to /pypi) would ensure nothing breaks on our end.

@mitar
Author

mitar commented Apr 18, 2018

Is the new API location really the same as the old one? I am sure I was getting results before for the following query, but now it returns nothing:

import xmlrpc.client as xmlrpc  # Python 3; the alias keeps the original call unchanged
client = xmlrpc.ServerProxy('https://pypi.python.org/pypi')
client.search({'keywords': 'd3m_primitive'})

Or do packages have to be (re)published for them to be visible through the new API location?
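
When a query like the one above starts returning nothing, it can help to first confirm what the client actually sends, independent of the state of the server's index. The sketch below builds the XML-RPC request body offline with Python's standard `xmlrpc.client.dumps` and decodes it again with `xmlrpc.client.loads`. No network request is made, and the keyword value is simply the one from the query above.

```python
import xmlrpc.client

# The search spec from the query above: match packages by keyword.
spec = {'keywords': 'd3m_primitive'}

# Serialize the call exactly as it would be sent in the POST body.
# params must be a tuple of positional arguments.
payload = xmlrpc.client.dumps((spec,), methodname='search')

# Decode it again to check that the method name and arguments round-trip.
params, method = xmlrpc.client.loads(payload)
print(method)  # → search
print(params)  # → ({'keywords': 'd3m_primitive'},)
```

If the payload looks right, an empty result is more likely to come from the server side than from the client call.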

@ewdurbin
Member

@mitar We're tracking the bug where our search index becomes empty at #3746, and search oddities in general at #3717.

@di
Member

di commented Apr 19, 2018

More or less blocked on #284.

@di added the blocked (Issues we can't or shouldn't get to yet) label Apr 19, 2018