Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community: fix AzureSearch vectorstore asyncronous methods #24921

Merged

Conversation

thedavgar
Copy link
Contributor

@thedavgar thedavgar commented Aug 1, 2024

Description
Fix the asyncronous methods to retrieve documents from AzureSearch VectorStore. The previous changes from this commit create a similar code for the syncronous methods and the asyncronous ones but the asyncronous client return an asyncronous iterator "AsyncSearchItemPaged" as said in the issue #24740.
To solve this issue, the syncronous iterators in asyncronous methods where changed to asyncronous iterators.

@chrislrobert said in this comment that there was a still a flaw due to with blocks that close the client after each call. I removed this with blocks in the async_client following the same pattern as the sync client.

In order to close up the connections, a del method is included to gently close up clients once the vectorstore object is destroyed.

Issue: #24740 and #24064
Dependencies: No new dependencies for this change

Example notebook: I created a notebook just to test the changes work and gives the same results as the syncronous methods for vector and hybrid search. With these changes, the asyncronous methods in the retriever work as well.
image

Lint and test: Passes the tests and the linter

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Aug 1, 2024
Copy link

vercel bot commented Aug 1, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Aug 1, 2024 3:51pm

@dosubot dosubot bot added community Related to langchain-community Ɑ: vector store Related to vector store module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Aug 1, 2024
@chrislrobert
Copy link

Thanks, @thedavgar, for tackling this!

From a quick look at the code, it looks like this should allow accessing search results, multiple calls, etc. Just two questions:

  1. Is anything special needed to ensure that the async_client is eventually cleaned up (closed)?
  2. There are still with blocks in aadd_embeddings and adelete, which leaves those as kind of one-way doors: they'll suffer the same issue where subsequent calls will fire exceptions about the client being closed. I can see focusing this fix on the actual search cases, but is there an issue somewhere to clean that up too? Shame to leave it broken.

Thanks again!

@thedavgar
Copy link
Contributor Author

Thanks, @thedavgar, for tackling this!

From a quick look at the code, it looks like this should allow accessing search results, multiple calls, etc. Just two questions:

  1. Is anything special needed to ensure that the async_client is eventually cleaned up (closed)?
  2. There are still with blocks in aadd_embeddings and adelete, which leaves those as kind of one-way doors: they'll suffer the same issue where subsequent calls will fire exceptions about the client being closed. I can see focusing this fix on the actual search cases, but is there an issue somewhere to clean that up too? Shame to leave it broken.

Thanks again!

  1. The best way could be to close the clients once the vectorstore instance is closed.
  2. Thank you telling me the missing methods that still have the with block. I will try to tackle it.

Copy link
Contributor Author

@thedavgar thedavgar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The async code works

Copy link

@chrislrobert chrislrobert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thedavgar, this all looks good to me — but I have to say that I'm not an official reviewer for LangChain and don't know their standards and practices. If possible, I would ask @eyurtsev and/or @baskaryan to review, as they were involved in the original async PR #22075.

@thedavgar
Copy link
Contributor Author

Hi @baskaryan,

I’m looking forward to approving this PR for the next LangChain community version. I’ve followed the steps in the contributor guide, but I’m not sure how to assign or approve the code.

Thanks in advance!

@thedavgar
Copy link
Contributor Author

thedavgar commented Aug 12, 2024

Hi @isahers1, the async version of AzureSearch vectorstore continues to fail. This PR solves several issues in the code. Could you help me get these changes approved to have it fixed for the next version of langchain community?

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Aug 13, 2024
@isahers1 isahers1 merged commit 9d08369 into langchain-ai:master Aug 13, 2024
43 checks passed
olgamurraft pushed a commit to olgamurraft/langchain that referenced this pull request Aug 16, 2024
…-ai#24921)

**Description**
Fix the asyncronous methods to retrieve documents from AzureSearch
VectorStore. The previous changes from [this
commit](langchain-ai@ffe6ca9)
create a similar code for the syncronous methods and the asyncronous
ones but the asyncronous client return an asyncronous iterator
"AsyncSearchItemPaged" as said in the issue langchain-ai#24740.
To solve this issue, the syncronous iterators in asyncronous methods
where changed to asyncronous iterators.

@chrislrobert said in [this
comment](langchain-ai#24740 (comment))
that there was a still a flaw due to `with` blocks that close the client
after each call. I removed this `with` blocks in the `async_client`
following the same pattern as the sync `client`.

In order to close up the connections, a __del__ method is included to
gently close up clients once the vectorstore object is destroyed.

**Issue:** langchain-ai#24740 and langchain-ai#24064
**Dependencies:** No new dependencies for this change

**Example notebook:** I created a notebook just to test the changes work
and gives the same results as the syncronous methods for vector and
hybrid search. With these changes, the asyncronous methods in the
retriever work as well.

![image](https://github.com/user-attachments/assets/697e431b-9d7f-4d0d-b205-59d051ac2b67)


**Lint and test**: Passes the tests and the linter
@ND-code-ai
Copy link

Amazing! For me your changes also fixed some faulty behaviour in the @search.score calculation where retrieved Doc objects would contain unexpected extremely low scores

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature community Related to langchain-community lgtm PR looks good. Use to confirm that a PR is ready for merging. size:L This PR changes 100-499 lines, ignoring generated files. Ɑ: vector store Related to vector store module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants