Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add top_k parameter to rerank endpoint #396

Merged
merged 4 commits into from
Oct 7, 2024

Conversation

aloababa
Copy link
Contributor

@aloababa aloababa commented Oct 4, 2024

This PR adds top_k parameter (number of top documents to return after reranking) to the rerank endpoint.

#342

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR adds a 'top_k' parameter to the rerank endpoint, allowing users to specify the number of top documents to return after reranking, as requested in Issue #342.

  • Added 'top_k' parameter with default value 0 to RerankInput model in libs/infinity_emb/infinity_emb/fastapi_schemas/pymodels.py
  • Implemented 'top_k' functionality in rerank endpoint in libs/infinity_emb/infinity_emb/infinity_server.py
  • The new parameter aligns with industry standards for reranking APIs, improving flexibility and usability
  • Consider adding input validation for 'top_k' to ensure it's a non-negative integer
  • Recommend updating documentation and adding tests to cover the new functionality

2 file(s) reviewed, 1 comment(s)
Edit PR Review Bot Settings

Comment on lines 299 to 301
if data.top_k > 0:
data.documents = data.documents[: data.top_k]
scores = scores[: data.top_k]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Add error handling for case where top_k > len(data.documents)

@michaelfeil
Copy link
Owner

Thanks for opening the PR. There need quite some improvements unfortunate.

Align to one of the existing protocols (jina, cohere, voyage, tei), happy to discuss if this is a useful feature.

Expectation would be:

  • implementation in "AsyncEngine", and not infinity_server.py
  • full unit test.
    • Covering int=0, int=1, int=2, int=very large
    • int=0 is a bad default.
    • Covering from engine side and covering from API side.

@codecov-commenter
Copy link

codecov-commenter commented Oct 4, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.56%. Comparing base (2ff39c6) to head (47524b7).

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

❗ There is a different number of reports uploaded between BASE (2ff39c6) and HEAD (47524b7). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (2ff39c6) HEAD (47524b7)
2 1
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #396      +/-   ##
==========================================
- Coverage   78.62%   72.56%   -6.06%     
==========================================
  Files          39       39              
  Lines        3008     3011       +3     
==========================================
- Hits         2365     2185     -180     
- Misses        643      826     +183     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@aloababa
Copy link
Contributor Author

aloababa commented Oct 5, 2024

I've updated the PR with the following changes:

  • top_k is now optional and must be greater than 0 on API side.
  • top_k is now implemented in AsyncEngine.
  • Added tests that covers Engine and API.

@pytest.mark.anyio
async def test_reranker_top_k(client):
query = "Where is the Eiffel Tower located?"
documents = [
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential flaw: If backend does not do sorting, top_k=1 will just take always the first result.

Better unit test: Wrap this with a unit test, and return_text. Make sure that topk=1 solution is always paris in this unit test. e.g. use https://docs.python.org/3/library/itertools.html

# someting like:
for return_text in [true, False]
    for raw_score in [True, False]:
          for permutation in itertools.permutation([..paris, ..us, uk]):
               # you test above
```

@michaelfeil
Copy link
Owner

michaelfeil commented Oct 5, 2024

@aloababa Thanks for adding all the changes! One concern, I am not sure if the top_k is working as intended, I think it currently just filters the list, but not doing topk actually ( if it implements sort)

Some extra wish:

Example why this currently does not work:

curl -X 'POST' \
  'https://infinity.modal.michaelfeil.eu/rerank' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "where is paris",
  "documents": [
    "italy", "us", "uk", "france"
  ],
  "return_documents": true,
  "raw_scores": false,
  "model": "mixedbread-ai/mxbai-rerank-xsmall-v1"
}'
{
  "object": "rerank",
  "results": [
    {
      "relevance_score": 0.018829345703125,
      "index": 0,
      "document": "italy"
    },
    {
      "relevance_score": 0.06927490234375,
      "index": 1,
      "document": "us"
    },
    {
      "relevance_score": 0.0240936279296875,
      "index": 2,
      "document": "uk"
    },
    {
      "relevance_score": 0.177001953125,
      "index": 3,
      "document": "france"
    }
  ],
  "model": "mixedbread-ai/mxbai-rerank-xsmall-v1",
  "usage": {
    "prompt_tokens": 71,
    "total_tokens": 71
  },
  "id": "infinity-14b91ebc-f8e2-4bfc-9593-bb11ec7ac95a",
  "created": 1728163265
}

@michaelfeil michaelfeil changed the base branch from main to tmp-top-n October 7, 2024 07:20
@michaelfeil michaelfeil merged commit 7e0b5e5 into michaelfeil:tmp-top-n Oct 7, 2024
27 of 36 checks passed
michaelfeil added a commit that referenced this pull request Oct 7, 2024
* add top_n parameter to rerank endpoint

* RerankInput: top_k as optional + add field validation

* handle top_k parameter in AsyncEngine

* add tests for top_k

* MichaelFeil: add sorting, dtype, to rerank api

* add parametrization for engine

* fix: typing

---------

Co-authored-by: Benjamin Gustin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants