
Add (and test) explicit support for PgBouncer #67

Open
mehcode opened this issue Jan 16, 2020 · 17 comments
Labels: db:other (DBs that are not explicitly supported but may work), db:postgres (Related to PostgreSQL), enhancement (New feature or request)

Comments

@mehcode (Member) commented Jan 16, 2020

Notable differences

  • Prepared statements are invalid after a SYNC (so they can be used but may not be cached).

  • Apparently does not support the startup parameter IntervalStyle. That may need to be deferred to a SET after connect.
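The second bullet could, hypothetically, be handled by issuing the option as a plain statement once the connection is up instead of as a startup parameter (the value shown is just an example):

```sql
-- Deferred alternative to the IntervalStyle startup parameter:
-- run once after connecting through PgBouncer.
SET IntervalStyle = 'iso_8601';
```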

@abonander (Collaborator) commented Jan 16, 2020

How would PgConnection differentiate? Maybe a query parameter? ?pgbouncer=1 or something.

@mehcode (Member, Author) commented Jan 16, 2020

@abonander My hope is that we can identify that it's PgBouncer on start-up. If not, we'll need to tinker and provide configuration for people to use, perhaps a way to disable the prepared statement cache in the database URL.

@abonander added the db:other label on Apr 10, 2021
@Firaenix commented Sep 9, 2021

Any news on this? I would love to be able to use PgBouncer for my sqlx apps.

@abonander (Collaborator) commented:
We'd gladly accept a pull request that improves compatibility, but otherwise it's not currently a priority.

@mehcode (Member, Author) commented Sep 9, 2021

You should be able to get support by turning off the prepared statement cache. Set the cache capacity to 0. See https://docs.rs/sqlx/0.5.5/sqlx/postgres/struct.PgConnectOptions.html#method.statement_cache_capacity

I haven't tried it but from the OP of this issue, it sounds like this should work.
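A minimal sketch of that suggestion (host, port, and credentials are placeholders; this assumes a sqlx version where `PgConnectOptions::statement_cache_capacity` exists, i.e. 0.5+):

```rust
use sqlx::postgres::{PgConnectOptions, PgPoolOptions};

// Sketch: connect through PgBouncer with sqlx's prepared statement
// cache disabled. Connection values below are illustrative only.
async fn connect() -> Result<sqlx::PgPool, sqlx::Error> {
    let options = PgConnectOptions::new()
        .host("localhost")
        .port(6432) // PgBouncer's default listen port
        .username("app")
        .database("app")
        // capacity 0 => statements are not cached across calls
        .statement_cache_capacity(0);

    PgPoolOptions::new().connect_with(options).await
}
```

Per query, `sqlx::query(...).persistent(false)` similarly opts a single statement out of the cache.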

@TuhinNair commented:
I can confirm that it does indeed work with the cache capacity set to 0.

Specifically, using PgBouncer's transaction pooling mode and statement cache capacity at 0, I was able to use sqlx with AWS Lambda to begin and commit transactions.

@crepererum (Contributor) commented:
I don't think setting the cache capacity to 0 works in all cases. If you run queries without a transaction and the cache size is 0, SQLx will still prepare the statement if the query has any arguments:

    // excerpt from sqlx's Postgres executor:
    let format = if let Some(mut arguments) = arguments {
        // prepare the statement if this is our first time executing it
        // always return the statement ID here
        let (statement, metadata_) = self
            .get_or_prepare(query, &arguments.types, persistent, metadata_opt)
            .await?;

    // second excerpt, from the prepare path:
    let statement = prepare(self, sql, parameters, metadata).await?;
    if store_to_cache && self.cache_statement.is_enabled() {

Since pgbouncer doesn't keep the "prepare" step and the execution together, this will lead to confusing error messages where prepared statements cannot be found.

I'm wondering if there is a way to pass arguments to a query without relying on prepared statements (and without requiring the user to do manual, but safe, string/parameter escaping).

@abonander (Collaborator) commented:
I don't really think that's something that SQLx should tackle. I think that could be built as a generic SQL interpolation library.

If you have access to PgBouncer's config, it looks like you can set the pooling mode to Session which should make everything work as expected. Kinda defeats the point of PgBouncer, I guess, but it would work.
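For reference, the PgBouncer side of that suggestion is a one-line change (illustrative fragment; the rest of the config is assumed):

```ini
; pgbouncer.ini (fragment)
[pgbouncer]
; session mode pins a client to one server connection for its lifetime,
; so prepared statements survive -- at the cost of less multiplexing
pool_mode = session
```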

@uthng commented Apr 6, 2022

Hi,

I hit the same issue, with the error prepared statement "sqlx_s_1" already exists, while going through pgbouncer. Setting the cache capacity to 0 and per-query persistent to false did not help. I finally worked around it by executing DEALLOCATE ALL; before each query.
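A sketch of that workaround in sqlx (the helper name, table, and query are illustrative; it assumes you hold a PgConnection directly, so the DEALLOCATE and the query that follows go to the same client session):

```rust
use sqlx::{Executor, PgConnection};

// Hypothetical helper mirroring the workaround above: drop any prepared
// statements left behind on whatever server connection we were handed,
// then run the query without caching it.
async fn count_users(conn: &mut PgConnection) -> Result<i64, sqlx::Error> {
    conn.execute("DEALLOCATE ALL").await?;
    let row: (i64,) = sqlx::query_as("SELECT COUNT(*) FROM users")
        .persistent(false) // skip sqlx's statement cache for this query
        .fetch_one(conn)
        .await?;
    Ok(row.0)
}
```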

@abonander (Collaborator) commented:
@uthng more discussion here: #1740 (comment)

@digizeph commented:
Adding my workaround here, though it will bottleneck on a single connection:

    let pool = PgPoolOptions::new().max_connections(1).connect(db_url).await?;

The pool above will work with pgbouncer.

@crepererum (Contributor) commented:
Another approach would be to implement proper prepared statement support in pgbouncer, since sqlx isn't the only client suffering from this limitation; see pgbouncer/pgbouncer#695

@abusch commented Nov 17, 2022

Just wanted to mention that I've had that exact issue with AWS RDS Proxy (in front of RDS Postgres), and the workaround from #67 (comment) worked.

I couldn't find any info on whether they use pgbouncer under the hood or whether it's just a similar problem.

@vtselfa commented Jul 25, 2023

I have been looking into this problem and my conclusions are the following:

Using pgbouncer with pool_mode=session is fine (quite pointless, but fine). The problem is with pool_mode=transaction, and it is twofold:

  1. we cannot use named statements
  2. preparing, binding, describing and executing statements needs to be done in one round-trip.

The reason behind both problems is that after a SYNC pgbouncer may assign us a different physical connection, and statements are prepared per connection.

In the case of problem 1), if we are using named statements and our physical connection changes, the statement will no longer be there when we try to reuse it. Even worse, some other application will have it in their connection and they could get the famous prepared statement sqlx_s_1 already exists. This can be solved in different ways:
a) implementing unnamed statement support in sqlx (not hard, I have a proof of concept)
b) setting the statement cache size to 0 AND setting the option server_reset_query_always=1 in pgbouncer, so it removes any prepared named statements we create. If we don't want pgbouncer to do the cleanup, we can do it manually by executing DEALLOCATE all; before each query execution.
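The PgBouncer side of option b) would look roughly like this (fragment; everything not shown is assumed to keep its default, including server_reset_query, which defaults to DISCARD ALL):

```ini
; pgbouncer.ini (fragment) -- option b) above
[pgbouncer]
pool_mode = transaction
; run server_reset_query even in transaction mode, so stray prepared
; statements are cleaned up whenever a server connection is reassigned
server_reset_query_always = 1
```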

Problem 2) is much trickier. Assuming we have solved problem 1) and are no longer caching statements, we still have the problem that sqlx currently does the following when executing queries:

1. Parameter type name -> OID resolution. We only have the names of the parameter types, so we run a query to get their OIDs. Potentially a full-blown query.
2. PREPARE
3. DESCRIBE
4. (*) SYNC
5. (*) Parameter type OID -> extended type info (potentially does a full blown query, i.e. at least PREPARE SYNC BIND EXECUTE CLOSE_PORTAL SYNC)
6. (*) Return type OID -> extended type info (potentially does a full blown query, similar to the previous one)
7. BIND
8. EXECUTE
9. CLOSE_PORTAL
10. SYNC

The problem is that every time that we do a SYNC (see asterisks in previous list), potentially pgbouncer changes our connection and we lose our prepared statement (anonymous or not), making the binding and execution fail.

Note that we can send PREPARE DESCRIBE BIND EXECUTE SYNC together, in one round-trip, but that does not help, because we need the extended type info from steps 5 and 6 before we return the first row. Assuming that all the parameters are known before we prepare the statement, step 5 can be done together with step 1, but step 6 still requires the output from step 3, so that one cannot be moved. It's also not possible to defer the requests to get the extended type info until after we do EXECUTE SYNC, because we are streaming the rows and the first row already needs that metadata available.
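The failure mode described above can be modelled with a few lines of plain Rust (the Msg enum and sync_chunks helper are illustrative, not sqlx types): everything between two SYNCs is the most that PgBouncer's transaction mode guarantees to run on one physical connection.

```rust
// Illustrative model (not sqlx code): PgBouncer in transaction mode may
// hand us a different server connection after every Sync, so per-connection
// state (like a prepared statement) created before a Sync boundary may be
// gone after it.
#[derive(Debug, PartialEq)]
enum Msg {
    Parse,
    Describe,
    Bind,
    Execute,
    ClosePortal,
    Sync,
}
use Msg::*;

/// Split a message sequence at Sync boundaries; each chunk is the most
/// that is guaranteed to execute on a single physical connection.
fn sync_chunks(msgs: &[Msg]) -> Vec<&[Msg]> {
    msgs.split_inclusive(|m| *m == Sync).collect()
}

fn main() {
    // Steps 2-10 above, with the type-info lookups (steps 5 and 6) elided:
    let flow = [Parse, Describe, Sync, Bind, Execute, ClosePortal, Sync];
    let chunks = sync_chunks(&flow);
    // Two chunks: the Bind/Execute in the second chunk may land on a
    // different server connection than the Parse in the first.
    println!("{} chunks: {:?}", chunks.len(), chunks);
}
```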

Possible solutions that come to my mind:
a) PREPARE twice. This is a dumb solution, but it works. Since steps 4, 5 and 6 may have destroyed our prepared statement, before step 7 we PREPARE it again.
b) Cache locally the information from pg_catalog that we need and update this cache every time we run a query. We would do this just before step 1, and it would prevent steps 1, 5 and 6 from reaching out to the DB. This cache would be shared by all the connections in the pool. The first query run would need to fetch the whole catalog (basically executing fetch_type_by_oid for all OIDs), but subsequent ones would only need to fetch new OIDs (OIDs greater than the latest we have in the cache), if any. I'm not sure if updating the cache is even necessary... the compile-time checks were run against some types, so I'm not sure if changing them afterwards is safe. Which brings me to another possibility...
c) Use the data from the compile time checks. All the metadata information required could be extracted during compilation and stored in the binary. We would then at runtime just check this data.
d) Parse the query to know, before the PREPARE statement, the type info of the result columns. Fetch those during step 1, move step 5 also to step one. Send all commands in one round-trip.

I'm by far not an expert in this codebase, so I may have gotten things wrong. What do you think, @abonander?
In any case, I'm willing to help solve this.

@vtselfa commented Jul 26, 2023

It seems that pgcat (a pgbouncer alternative) has just added support for prepared statements. See https://postgresml.org/blog/making-postgres-30-percent-faster-in-production
If this is the case, no changes to sqlx would be needed.

@MTRNord commented Sep 8, 2023

Hm, so pgbouncer/pgbouncer#845 in theory works. But when testing it, I found that sqlx fails to query with encountered unexpected or invalid data: expecting ParseComplete but received ParameterDescription in some cases (also mentioned in pgbouncer/pgbouncer#845 (comment)).

Specifically, it is happening in this query:

#[instrument(skip(self, username))]
async fn user_exists(&self, username: &str) -> bool {
    let exists = sqlx::query("SELECT 1 FROM users WHERE username = $1")
        .bind(username)
        .fetch_all(self.get_pool())
        .await;
    match exists {
        Ok(exists) => {
            debug!("[POSTGRES] [user_exists] {:?}", exists.len());
            exists.len() == 1
        }
        Err(e) => {
            if !matches!(e, sqlx::Error::RowNotFound) {
                error!("[DB] Error checking if user exists: {}", e);
            }
            false
        }
    }
}

I am not sure which side is buggy or incomplete here, but I thought it worth mentioning so it isn't forgotten. For reference, this was tested with sqlx = { version = "0.7.1", features = ["postgres", "runtime-tokio-rustls"] } and commit 70ad45b7ec0d183caa65e15fef2e7b8ed6926957 of the PR linked above.

Edit: This was a bug in the pgbouncer PR and is fixed now it seems :)

@albankurti commented:
To anyone still experiencing this issue with a database hosted on Supabase, please note the following comment:
supabase/supavisor#239 (comment)
