Spurious conn busy - bug in QueryRow() ? #635
More info: the query that fails for me, and it was the first query executed on a particular code path, is an UPSERT. I added a SELECT query just above it, inside the same HTTP handler. Whenever the upsert starts to fail, the query just above it executes just fine, no errors. I suppose the upsert is different from the select in that it (potentially) modifies the database, while a select does not. Still, I'm not sure how to solve the "conn busy" error and would appreciate help.
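The original code blocks were lost from this comment; below is a minimal sketch of the handler shape described (the table, columns, and handler name are hypothetical, and pgx v4 import paths are assumed). The key detail is that both queries run on one shared `*pgx.Conn`:

```go
package main

import (
	"context"
	"fmt"
	"net/http"

	"github.com/jackc/pgx/v4"
)

// conn is a single shared connection -- exactly the setup that can
// produce "conn busy" under concurrent requests.
var conn *pgx.Conn

func httpHandler(w http.ResponseWriter, r *http.Request) {
	// The SELECT just above the upsert keeps working...
	var count int
	if err := conn.QueryRow(r.Context(), `SELECT count(*) FROM items`).Scan(&count); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	// ...while the UPSERT starts returning "conn busy".
	var id int64
	err := conn.QueryRow(r.Context(),
		`INSERT INTO items (key, value) VALUES ($1, $2)
		 ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value
		 RETURNING id`,
		r.FormValue("key"), r.FormValue("value")).Scan(&id)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "id=%d\n", id)
}

func main() {
	var err error
	conn, err = pgx.Connect(context.Background(), "postgres://localhost/test") // hypothetical DSN
	if err != nil {
		panic(err)
	}
	defer conn.Close(context.Background())
	http.HandleFunc("/", httpHandler)
	http.ListenAndServe(":8080", nil)
}
```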
I did some debugging, adding trace printouts. I believe there is a bug in QueryRow, but QueryRow doesn't seem to surface the error. Please comment if I'm wrong; meanwhile I'll try switching from QueryRow to Query to see if the issue goes away.

Edit: don't mind this comment. The simple reproducer case below fails with Query() the same way it fails with QueryRow(), i.e. with "conn busy".
One more thing re: my earlier comment on Query continuing to work even when the upsert started throwing "conn busy". As it turns out, Query is missing error propagation when pgConn.lock() fails (with "conn busy").
A simple reproducer: run many goroutines that all issue queries on the same connection. You should see "conn busy" errors; it can also throw a different error. Maybe something goes wrong in buffer / parameter management when the connection is used concurrently.
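The reproducer code itself was lost from this comment. A reconstruction along the lines described (hypothetical DSN and query, pgx v4 import path assumed) would look something like this — 20 goroutines sharing one `*pgx.Conn`, which the pgx docs say is not safe for concurrent use:

```go
package main

import (
	"context"
	"fmt"
	"sync"

	"github.com/jackc/pgx/v4"
)

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://localhost/test") // hypothetical DSN
	if err != nil {
		panic(err)
	}
	defer conn.Close(ctx)

	var wg sync.WaitGroup
	for i := 0; i < 20; i++ { // 20 goroutines sharing one connection
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			var out int
			// pgx.Conn is not safe for concurrent use, so some of
			// these calls are expected to fail with "conn busy".
			if err := conn.QueryRow(ctx, "SELECT $1::int", n).Scan(&out); err != nil {
				fmt.Println("error:", err)
			}
		}(i)
	}
	wg.Wait()
}
```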
@kmansoft submitted a PR that should fix your issue. In the meantime, my current workaround was to acquire a connection from the pool manually for each query and then release it manually.
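The workaround described above might be sketched like this (the table, query, and DSN are hypothetical; pgx v4 import paths assumed). `Acquire` gives the query a dedicated connection, and `Release` hands it back:

```go
package main

import (
	"context"
	"fmt"

	"github.com/jackc/pgx/v4/pgxpool"
)

// fetchValue acquires a dedicated connection for one query, then releases it.
func fetchValue(ctx context.Context, pool *pgxpool.Pool, key string) (string, error) {
	c, err := pool.Acquire(ctx)
	if err != nil {
		return "", err
	}
	defer c.Release() // hand the connection back when done

	var value string
	err = c.QueryRow(ctx, `SELECT value FROM items WHERE key = $1`, key).Scan(&value)
	return value, err
}

func main() {
	ctx := context.Background()
	pool, err := pgxpool.Connect(ctx, "postgres://localhost/test") // hypothetical DSN
	if err != nil {
		panic(err)
	}
	defer pool.Close()
	fmt.Println(fetchValue(ctx, pool, "example"))
}
```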
Thank you @durango, I'll be looking forward to it getting merged. In the meantime I'm wondering if I should switch from a pgx connection to a pgx pool. My simple test passes with pgxpool with no errors. Edit: no, pgxpool also fails, please see below. https://godoc.org/github.com/jackc/pgx/pgxpool But coming from Go's "sql" package, I was expecting pgx to perform pooled connection management behind the scenes. Is this assumption wrong? Should I always be using a pgxpool in a multi-goroutine setup? Or is that not necessary? A pgx API design question, really.
About pgxpool: it also fails on concurrent workloads. The trick to triggering it is that the number of goroutines needs to be 20 (10 works with no error).
I was using pgxpool with Query and QueryRow and still hitting the same problem as you (I also noted the 10 vs 20 goroutines difference on my own, so our experiences seem to be very much the same). My fix won't work because the pool gets released too early for QueryRow, unfortunately.
@kmansoft my current working solution is to drop down to pgx's database/sql driver.
Sorry, it took me a while to get to this. There are a number of questions that have come up in this issue and I'll try to answer them all in this post.
The default pool size is 4 (https://github.com/jackc/pgx/blob/master/pgxpool/pool.go#L15).
Set MaxConns in the pgxpool.Config struct, or set it in the connection string with the pool_max_conns argument (https://github.com/jackc/pgx/blob/master/pgxpool/pool.go#L183).
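For illustration, the connection-string route might look like this. The `buildDSN` helper and DSN are hypothetical, and it naively assumes the base DSN has no query parameters yet; setting `MaxConns` on a `pgxpool.Config` achieves the same thing:

```go
package main

import "fmt"

// buildDSN appends pool_max_conns to a base connection string.
// Hypothetical helper: it assumes the base DSN has no "?" yet.
func buildDSN(base string, maxConns int) string {
	return fmt.Sprintf("%s?pool_max_conns=%d", base, maxConns)
}

func main() {
	// Passing this DSN to pgxpool.Connect would cap the pool at
	// 20 connections instead of the default 4.
	dsn := buildDSN("postgres://localhost/test", 20)
	fmt.Println(dsn)
}
```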
The underlying query started by QueryRow is closed when Scan is called on the returned Row (https://github.com/jackc/pgx/blob/master/rows.go#L87).
The pgx package itself is strictly connection based. If you want a pool you need to use pgxpool (or some other synchronization / pooling mechanism). From the docs for pgx.Conn (https://godoc.org/github.com/jackc/pgx#Conn): "Conn is a PostgreSQL connection handle. It is not safe for concurrent usage."

@kmansoft Regarding the failing test case with pgxpool: the reason it is failing is the line As an aside, it would be less error prone to do a
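Putting the answers above together, a sketch of the safe concurrent pattern (hypothetical DSN, pgx v4 import paths assumed): one shared pool, per-goroutine QueryRow, and Scan closing the underlying query so the connection returns to the pool:

```go
package main

import (
	"context"
	"fmt"
	"sync"

	"github.com/jackc/pgx/v4/pgxpool"
)

func main() {
	ctx := context.Background()
	pool, err := pgxpool.Connect(ctx, "postgres://localhost/test") // hypothetical DSN
	if err != nil {
		panic(err)
	}
	defer pool.Close()

	var wg sync.WaitGroup
	for i := 0; i < 20; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			var out int
			// Scan closes the underlying query and returns the
			// connection to the pool, so 20 goroutines can share
			// a pool whose default size is 4.
			if err := pool.QueryRow(ctx, "SELECT $1::int", n).Scan(&out); err != nil {
				fmt.Println("error:", err)
			}
		}(i)
	}
	wg.Wait()
}
```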
For the record, I used max pool settings this entire time (ranging from 20 to 400), and checked for long-running queries (> 5 minutes); there were none.
Saw this yesterday actually; forgot to close my PR / was trying to resolve my pgx issues. Tbh, I haven't really found a working solution for pgx other than using it as a connection driver. I had to drop down to *sql.DB and *sql.Conn. Somewhere down the line, connections were not being released despite closing/releasing them manually. Fwiw I was also using v4 of pgx and have noticed fewer problems on v3 for pooling, but that could just be bias in my observations. I've also noticed fewer problems when I stopped using

@kmansoft by any chance, are you connecting to a pgbouncer server?
Ran into the same issue when closing the acquired connection from stdlib but not properly closing the pgx.BatchResults after running SendBatch.
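A sketch of the batch pattern this refers to (the statements and table are hypothetical; pgx v4 import paths assumed). The crucial step is closing the BatchResults — skipping `br.Close()` leaves the connection locked, which later surfaces as "conn busy":

```go
package main

import (
	"context"

	"github.com/jackc/pgx/v4"
)

// runBatch queues two statements and, crucially, closes the BatchResults.
func runBatch(ctx context.Context, conn *pgx.Conn) error {
	batch := &pgx.Batch{}
	batch.Queue(`UPDATE items SET value = $1 WHERE id = $2`, "a", 1)
	batch.Queue(`UPDATE items SET value = $1 WHERE id = $2`, "b", 2)

	br := conn.SendBatch(ctx, batch)
	defer br.Close() // releases the connection even on early return

	for i := 0; i < batch.Len(); i++ {
		if _, err := br.Exec(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://localhost/test") // hypothetical DSN
	if err != nil {
		panic(err)
	}
	defer conn.Close(ctx)
	if err := runBatch(ctx, conn); err != nil {
		panic(err)
	}
}
```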
This was a good read, thanks. For anyone coming across this thread from Google, the TLDR here is that in PostgreSQL land, you need to use either pgxpool or database/sql on top of pgx, or else you will get the dreaded "conn busy" error. This is because, as jackc mentions above, the default pool size is 4. And increasing the connection size is a bad idea because vanilla pgx is only designed to be used by one process/thread at a time.
These two blogs are recommended reading if you want to know more:
It's embarrassing to admit this, but I want to share why this problem happened to me in case anyone else encounters it. I have an HTTP server and was trying to re-use the same connection for all requests. If you're doing this, be sure to use a single root/main-level pool, and then acquire a new connection from the pool per incoming HTTP request (or, if you're using GraphQL, a connection per resolver / dataloader).
Using the pgxpool package to create a connection pool is safer than using a single connection, because the DB functions are concurrently accessed from multiple goroutines. jackc/pgx#635 (comment)
@warent do you return a pointer to the connection from the single main-level pool connect function, or use a global pool variable?
You'll probably want to use a global pool var in a separate database package, and create connections from that wherever you need them. I'm using GraphQL, so my HTTP handler and my pool are in the same file (main.go). When the GraphQL endpoint is called, I put a pointer to the pool in ctx, which is passed into the resolvers, and then every resolver creates its own connection as needed. Both approaches seem to work fine, but in retrospect a global variable would probably be cleaner.
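The pool-per-app, connection-per-request pattern described in these two comments might be sketched like this (hypothetical DSN and endpoint; pgx v4 import paths assumed). `pool.QueryRow` checks a connection out of the pool for each call and returns it when Scan completes:

```go
package main

import (
	"context"
	"log"
	"net/http"

	"github.com/jackc/pgx/v4/pgxpool"
)

var pool *pgxpool.Pool // single app-level pool

func handler(w http.ResponseWriter, r *http.Request) {
	// Each request implicitly gets its own connection from the pool.
	var now string
	if err := pool.QueryRow(r.Context(), "SELECT now()::text").Scan(&now); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Write([]byte(now))
}

func main() {
	var err error
	pool, err = pgxpool.Connect(context.Background(), "postgres://localhost/test") // hypothetical DSN
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```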
Does it make sense to keep this issue open, or can we close it?
Per my comment #750 (comment), take care before just blindly adopting pgbouncer as a proxy or reverse proxy between pgx and postgres. In session-pooling mode it won't help you much, and in transaction-pooling mode it changes the semantics of many postgres operations in ways that'll be very important for you to account for in your app's database usage. It's even worse in statement-pooling mode, which is best avoided whenever possible. If pgx works like other drivers in row-at-a-time mode, apps should try to consume the result set promptly after a query, then close the result set. The app should defer doing any expensive work (RPCs and other network I/O, nontrivial disk I/O, acquiring contended locks, etc.) until after it reads the result and closes the result set. Doing that should be enough to solve most problems unless the app is highly parallel and database-heavy.
If you came here to find out why your connection reports "conn busy": always make sure you close your rows when you are done reading them.
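The advice in the last two comments might be sketched as follows (hypothetical table and query; pgx v4 import path assumed): read the whole result set, close it, and only then do slow work:

```go
package main

import (
	"context"

	"github.com/jackc/pgx/v4"
)

// loadIDs reads the whole result set and closes it before any slow work.
func loadIDs(ctx context.Context, conn *pgx.Conn) ([]int64, error) {
	rows, err := conn.Query(ctx, `SELECT id FROM items`)
	if err != nil {
		return nil, err
	}
	defer rows.Close() // without this, the connection stays "busy"

	var ids []int64
	for rows.Next() {
		var id int64
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	if err := rows.Err(); err != nil { // errors raised mid-iteration
		return nil, err
	}
	// Do expensive work (RPCs, disk I/O, contended locks) only after
	// this returns, once the result set is closed.
	return ids, nil
}
```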
Hello,
I've just started using your library, pretty impressed, esp. with JSON handling.
I'm seeing spurious conn busy errors on my development machine. Once they start, they never stop until I restart the app.
The app uses pgx from basically two modules: one "does something" and stores the result into pgsql, and the other is an HTTP API which fetches this data from pgsql. The failures always occur on the HTTP side (which obviously uses separate goroutines from the first, "does something" part).
How can I track down what's causing these errors? I already checked my code for any leaked row sets or transactions; there don't seem to be any.
Second, related to the above: what is the default connection pool size if I create my connection like this? And how can I change the connection pool size?
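The connection snippet was lost from this issue; it was presumably a plain connect call along these lines (hypothetical DSN, pgx v4 import path assumed):

```go
package main

import (
	"context"

	"github.com/jackc/pgx/v4"
)

func main() {
	ctx := context.Background()
	// pgx.Connect returns a single *pgx.Conn -- per the maintainer's
	// answer above, there is no pool behind it, and it is not safe
	// for concurrent use.
	conn, err := pgx.Connect(ctx, "postgres://localhost/test") // hypothetical DSN
	if err != nil {
		panic(err)
	}
	defer conn.Close(ctx)
}
```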