-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change from basic psycopg to psycopg_pool #847
Comments
Hey, @johantandy! I understand the problem and I think we can do something to handle it... (although, I am not a DB expert) The only perplexity is about using a different package: Any other possible ideas? Or a draft of how you imagine the change? |
Facing the same issue atm, but even for 5mins timeout. Could be potentially use ? cc: @anakin87 |
Hey @anakin87 @silvanocerza just checking if this would worked on by the team or are you guys open to a PR |
Sorry, I missed the discussion. To be fair I don't see the benefit of a connection pool in this case, nor how it would solve the problem. As of now the integration has at most one open connection and no more so why we need a pool of multiple connections? The problem we have is that we open a single connection and keep it open as long as the Document Store exists. What we should do in my opinion is handle the connection context correctly, something that right we completely ignore. If anyone wants to take care of that feel free to open a PR. |
@octalpixel @johantandy any of you willing to take care of this using the connection context properly? I would suggest taking a look at this first 👇 |
This is up for grabs by anyone that want to fix how the connection is handled, see my above comment. ☝ |
@silvanocerza @julian-risch I would like to take up this issue. |
Is your feature request related to a problem? Please describe.
Yes, i'm currently using haystack pipeline & pgvector with FastAPI to serve embedding search but seems the current implementation of psycopg doesn't handle the reconnection after a long idle time Eg. overnight without any calls to API
the exception that thrown is
psycopg.OperationalError: the connection is lost
Describe the solution you'd like
i'd like to implement psycopg_pool to make it better in handling lost connection also for anyone who serve it via API this
is more scaleable to use a connection pool.
Describe alternatives you've considered
Increasing DB Connection max lifetime but seems not a good practice, make a retry call to PG also doesn't work with current implementation, re-initiate the connection are impossible because of the connection var never be None once it initialized
haystack-core-integrations/integrations/pgvector/src/haystack_integrations/document_stores/pgvector/document_store.py
Lines 169 to 173 in 60c6666
Additional context
I'd be happy to make a PR for this myself, just want to make sure that I'm not missing anything before working on this.
open for discussion & also correct me if i'm wrong 🙇
The text was updated successfully, but these errors were encountered: