-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: converting Pinecone
metadata fields from float back to int
#1034
fix: converting Pinecone
metadata fields from float back to int
#1034
Conversation
Pinecone
metadata fields from float back to int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic looks good.
I found some small opportunities for improvement.
Let's also add a small unit test for the new class method.
integrations/pinecone/src/haystack_integrations/document_stores/pinecone/document_store.py
Outdated
Show resolved
Hide resolved
@@ -257,3 +260,59 @@ def test_embedding_retrieval(self, document_store: PineconeDocumentStore): | |||
assert len(results) == 2 | |||
assert results[0].content == "Most similar document" | |||
assert results[1].content == "2nd best document" | |||
|
|||
@pytest.mark.skipif("PINECONE_API_KEY" not in os.environ, reason="PINECONE_API_KEY not set") | |||
def test_sentence_window_retriever(self, sleep_time): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this test could be made much shorter if we use the document_store
fixture, as done for example in test_write_documents_duplicate_overwrite
.
The fixture should take care of creating and deleting the document store, automatically sleep after writing docs, etc...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point 👍🏽 thanks
Co-authored-by: Stefano Fiorucci <[email protected]>
Do you think an extra unit test is necessary? It's a private function, and testing its functionality is already covered by other test |
I'll leave it up to you; I'd like to have it. |
|
integrations/pinecone/src/haystack_integrations/document_stores/pinecone/document_store.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed some small changes.
I'll merge if the tests pass.
) * Pinecone converting floats back to int * linting * simplifying tests * fixing tests * fixing test * linting issues * adding and sleep between inserting and querying * cleaning the indexes after tests ran * ruff fix * fixing tests * Update integrations/pinecone/tests/test_document_store.py Co-authored-by: Stefano Fiorucci <[email protected]> * changing docstring for private utility function * adding unit tests to private function * fixing linting * fixing linting according to black rules * fixing linting * removing time sleep, it's contained in the fixture * fixes --------- Co-authored-by: Stefano Fiorucci <[email protected]>
Proposed Changes:
Pinecone converts all meta numbers in the meta field to float (https://docs.pinecone.io/guides/data/filter-with-metadata). This causes the
SentenceWindowRetriever
to crash completely.This PR checks if the metadata values are floats and convert them back to integers making
Pinecone
supported by theSentenceWindowRetriever
How did you test it?
Checklist
fix:
,feat:
,build:
,chore:
,ci:
,docs:
,style:
,refactor:
,perf:
,test:
.