Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change from basic psycopg to psycopg_pool #847

Open
johantandy opened this issue Jun 24, 2024 · 7 comments
Open

Change from basic psycopg to psycopg_pool #847

johantandy opened this issue Jun 24, 2024 · 7 comments
Labels
contributions wanted! Looking for external contributions feature request Ideas to improve an integration integration:pgvector P3

Comments

@johantandy
Copy link

johantandy commented Jun 24, 2024

Is your feature request related to a problem? Please describe.
Yes, i'm currently using haystack pipeline & pgvector with FastAPI to serve embedding search but seems the current implementation of psycopg doesn't handle the reconnection after a long idle time Eg. overnight without any calls to API
the exception that thrown is psycopg.OperationalError: the connection is lost

Describe the solution you'd like
i'd like to implement psycopg_pool to make it better in handling lost connection also for anyone who serve it via API this
is more scaleable to use a connection pool.

Describe alternatives you've considered
Increasing DB Connection max lifetime but seems not a good practice, make a retry call to PG also doesn't work with current implementation, re-initiate the connection are impossible because of the connection var never be None once it initialized

def connection(self):
if self._connection is None:
self._create_connection()
return self._connection

Additional context
I'd be happy to make a PR for this myself, just want to make sure that I'm not missing anything before working on this.
open for discussion & also correct me if i'm wrong 🙇

@johantandy johantandy added the feature request Ideas to improve an integration label Jun 24, 2024
@anakin87
Copy link
Member

Hey, @johantandy!

I understand the problem and I think we can do something to handle it... (although, I am not a DB expert)

The only perplexity is about using a different package: psycopg_pool.

Any other possible ideas? Or a draft of how you imagine the change?

(@silvanocerza)

@octalpixel
Copy link

Facing the same issue atm, but even for 5mins timeout.

Could be potentially use ?
https://github.com/supabase/vecs

cc: @anakin87

@octalpixel
Copy link

Hey @anakin87 @silvanocerza just checking if this would worked on by the team or are you guys open to a PR

@silvanocerza
Copy link
Contributor

Sorry, I missed the discussion.

To be fair I don't see the benefit of a connection pool in this case, nor how it would solve the problem.

As of now the integration has at most one open connection and no more so why we need a pool of multiple connections?

The problem we have is that we open a single connection and keep it open as long as the Document Store exists. What we should do in my opinion is handle the connection context correctly, something that right we completely ignore.

If anyone wants to take care of that feel free to open a PR.

@silvanocerza
Copy link
Contributor

@octalpixel @johantandy any of you willing to take care of this using the connection context properly?

I would suggest taking a look at this first 👇
https://www.psycopg.org/psycopg3/docs/basic/usage.html#connection-context

@silvanocerza silvanocerza added contributions wanted! Looking for external contributions and removed P1 labels Sep 2, 2024
@silvanocerza
Copy link
Contributor

This is up for grabs by anyone that want to fix how the connection is handled, see my above comment. ☝

@srini047
Copy link

@silvanocerza @julian-risch I would like to take up this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contributions wanted! Looking for external contributions feature request Ideas to improve an integration integration:pgvector P3
Projects
Development

No branches or pull requests

7 participants