AzureSearch vectorstore does not work asynchronously #24064
Labels
🤖:bug
Related to a bug, vulnerability, unexpected error with an existing feature
Ɑ: vector store
Related to vector store module
Comments
dosubot (bot) added the Ɑ: vector store and 🤖:bug labels on Jul 10, 2024
isahers1 added a commit that referenced this issue on Jul 12, 2024
…24081) Thank you for contributing to LangChain!

**Description**: This PR fixes the bug described in issue #24064, which occurs when the AzureSearch vector store is searched with its asynchronous methods; these are also the methods the retriever uses. The change makes access to the embedding optional, because the embedding is not used anywhere to retrieve documents. The synchronous retrieval methods do not use it either. With this PR, the code given by the user in the issue works:

```python
vectorstore = AzureSearch(
    azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"),
    azure_search_key=os.getenv("AI_SEARCH_API_KEY"),
    index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"),
    fields=fields,
    embedding_function=encoder,
)
retriever = vectorstore.as_retriever(search_type="hybrid", k=2)

await vectorstore.avector_search("what is the capital of France")
await retriever.ainvoke("what is the capital of France")
```

**Issue**: The Azure Search vector store does not work when searching for documents with asynchronous methods, as described in issue #24064.

**Dependencies**: There are no extra dependencies required for this change.

---------

Co-authored-by: isaac hershenson <[email protected]>
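As a rough illustration of what "optional" access to the embedding could look like, here is a minimal sketch using plain dictionaries in place of search hits. The helper `row_to_triple` and the `content` field name are hypothetical; only the `content_vector` key and the `@search.score` key come from the traceback in this issue. This is not the code from the PR.

```python
from typing import Any, Dict, List, Optional, Tuple

# Key named in the KeyError reported in this issue.
FIELDS_CONTENT_VECTOR = "content_vector"


def row_to_triple(
    result: Dict[str, Any],
) -> Tuple[str, float, Optional[List[float]]]:
    """Hypothetical helper: turn one hit into (content, score, embedding-or-None)."""
    return (
        result["content"],  # assumed field name, for illustration only
        float(result["@search.score"]),
        # .get() returns None instead of raising KeyError when the hit
        # does not carry the vector field.
        result.get(FIELDS_CONTENT_VECTOR),
    )


hit_without_vector = {
    "content": "Paris is the capital of France.",
    "@search.score": 0.87,
}
hit_with_vector = {**hit_without_vector, FIELDS_CONTENT_VECTOR: [0.1, 0.2]}

print(row_to_triple(hit_without_vector))
print(row_to_triple(hit_with_vector))
```

With this kind of access, a hit that lacks the vector field yields `None` for the embedding rather than aborting the whole search.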
isahers1 pushed a commit that referenced this issue on Aug 13, 2024
**Description** Fix the asynchronous methods that retrieve documents from the AzureSearch VectorStore. The previous changes from [this commit](ffe6ca9) made the asynchronous methods mirror the synchronous ones, but the asynchronous client returns an asynchronous iterator, "AsyncSearchItemPaged", as noted in issue #24740. To solve this, the synchronous iteration in the asynchronous methods was changed to asynchronous iteration.

@chrislrobert said in [this comment](#24740 (comment)) that there was still a flaw: the `with` blocks close the client after each call. I removed these `with` blocks around the `async_client`, following the same pattern as the sync `client`. To close the connections, a `__del__` method is included that gently closes the clients once the vectorstore object is destroyed.

**Issue:** #24740 and #24064

**Dependencies:** No new dependencies for this change

**Example notebook:** I created a notebook to test that the changes work and give the same results as the synchronous methods for vector and hybrid search. With these changes, the asynchronous methods in the retriever work as well.

![image](https://github.com/user-attachments/assets/697e431b-9d7f-4d0d-b205-59d051ac2b67)

**Lint and test**: Passes the tests and the linter
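The two points in this description (async iteration over the results, and not closing the client after every call) can be sketched with stand-in classes. `FakeAsyncPager` and `FakeAsyncClient` are hypothetical and only mimic the shape of an awaitable `search` that returns an async-iterable pager. The sketch closes the client explicitly in a `finally` block, which is a different cleanup choice from the `__del__` method the commit describes.

```python
import asyncio


class FakeAsyncPager:
    """Hypothetical stand-in for an async paged result (async-iterable only)."""

    def __init__(self, rows):
        self._rows = list(rows)

    async def __aiter__(self):
        for row in self._rows:
            await asyncio.sleep(0)  # yield control, as a network pager would
            yield row


class FakeAsyncClient:
    """Hypothetical client that stays open across calls until close() is awaited."""

    def __init__(self):
        self.closed = False

    async def search(self, search_text):
        if self.closed:
            raise RuntimeError("client is closed")
        return FakeAsyncPager([{"content": "Paris", "@search.score": 1.0}])

    async def close(self):
        self.closed = True


async def collect(client, query):
    results = await client.search(search_text=query)
    # A plain `for` would raise TypeError here: the pager is not sync-iterable.
    return [row async for row in results]


async def main():
    client = FakeAsyncClient()
    try:
        first = await collect(client, "what is the capital of France")
        # The same client is reused; it was not closed after the first call.
        second = await collect(client, "what is the capital of France")
    finally:
        await client.close()
    return first + second


rows = asyncio.run(main())
print(len(rows))
```

Wrapping each call in its own `async with` would instead close the client after the first search, so the second call on the same client would fail.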
olgamurraft pushed a commit to olgamurraft/langchain that referenced this issue on Aug 16, 2024
…-ai#24921) Same description as the commit pushed on Aug 13, 2024, with references qualified as langchain-ai#24740, langchain-ai#24064, and langchain-ai@ffe6ca9.
dosubot (bot) added the stale label (Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed) on Oct 9, 2024
dosubot (bot) removed the stale label on Oct 16, 2024
Checked other resources
Example Code
I am trying to use the Azure AI Search vector store and the retriever derived from it. Both work perfectly when retrieving documents with the synchronous methods, but raise an error when I run the async methods.
Creating the instances of embeddings and Azure Search
Synchronous methods working and returning documents
Asynchronous methods raising an error instead of returning documents
Error Message and Stack Trace (if applicable)
KeyError Traceback (most recent call last)
Cell In[15], line 1
----> 1 await vectorstore.avector_search("what is the capital of France")
File ~/.local/lib/python3.11/site-packages/langchain_community/vectorstores/azuresearch.py:695, in AzureSearch.avector_search(self, query, k, filters, **kwargs)
682 async def avector_search(
683 self, query: str, k: int = 4, *, filters: Optional[str] = None, **kwargs: Any
684 ) -> List[Document]:
685 """
686 Returns the most similar indexed documents to the query text.
687
(...)
693 List[Document]: A list of documents that are most similar to the query text.
694 """
--> 695 docs_and_scores = await self.avector_search_with_score(
696 query, k=k, filters=filters
697 )
698 return [doc for doc, _ in docs_and_scores]
File ~/.local/lib/python3.11/site-packages/langchain_community/vectorstores/azuresearch.py:742, in AzureSearch.avector_search_with_score(self, query, k, filters, **kwargs)
730 """Return docs most similar to query.
731
732 Args:
(...)
739 to the query and score for each
740 """
741 embedding = await self._aembed_query(query)
--> 742 docs, scores, _ = await self._asimple_search(
743 embedding, "", k, filters=filters, **kwargs
744 )
746 return list(zip(docs, scores))
File ~/.local/lib/python3.11/site-packages/langchain_community/vectorstores/azuresearch.py:1080, in AzureSearch._asimple_search(self, embedding, text_query, k, filters, **kwargs)
1066 async with self._async_client() as async_client:
1067 results = await async_client.search(
1068 search_text=text_query,
1069 vector_queries=[
(...)
1078 **kwargs,
1079 )
-> 1080 docs = [
1081 (
1082 _result_to_document(result),
1083 float(result["@search.score"]),
1084 result[FIELDS_CONTENT_VECTOR],
1085 )
1086 async for result in results
1087 ]
1088 if not docs:
1089 raise ValueError(f"No {docs=}")
File ~/.local/lib/python3.11/site-packages/langchain_community/vectorstores/azuresearch.py:1084, in <listcomp>(.0)
1066 async with self._async_client() as async_client:
1067 results = await async_client.search(
1068 search_text=text_query,
1069 vector_queries=[
(...)
1078 **kwargs,
1079 )
1080 docs = [
1081 (
1082 _result_to_document(result),
1083 float(result["@search.score"]),
-> 1084 result[FIELDS_CONTENT_VECTOR],
1085 )
1086 async for result in results
1087 ]
1088 if not docs:
1089 raise ValueError(f"No {docs=}")
KeyError: 'content_vector'
Description
At least the async methods for searching documents do not work: they raise the error above. Possibly the async client is not being used by the async retrieval methods.
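The failure mode in the traceback can be reproduced without Azure, assuming the hits coming back simply do not contain the `content_vector` key. That cause is an assumption, not something the issue confirms. `fake_results` is a hypothetical stand-in for the search results and `content` is an assumed field name.

```python
import asyncio

FIELDS_CONTENT_VECTOR = "content_vector"  # key named in the KeyError


async def fake_results():
    """Hypothetical async result stream whose hits omit the vector field."""
    yield {"content": "Paris is the capital of France.", "@search.score": 0.9}


async def build_docs():
    # Same shape as the comprehension shown at azuresearch.py:1080.
    return [
        (
            result["content"],
            float(result["@search.score"]),
            result[FIELDS_CONTENT_VECTOR],  # strict access: raises if absent
        )
        async for result in fake_results()
    ]


def reproduce() -> str:
    try:
        asyncio.run(build_docs())
    except KeyError as exc:
        return f"KeyError: {exc}"
    return "no error"


outcome = reproduce()
print(outcome)
```

Under that assumption, the error comes from the strict dictionary lookup on each hit, independently of which client issued the request.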
System Info
langchain==0.2.6
langchain-community==0.2.4
langchain-core==0.2.11
langchain-openai==0.1.8
langchain-text-splitters==0.2.1