
README should clearly state some gotchas #750

Closed · wants to merge 1 commit

Conversation

@Zamiell commented May 16, 2020

Feel free to edit what I've done, obviously.

@jackc (Owner) commented May 16, 2020

There is no requirement or even suggestion that people should use PgBouncer. They should use pgxpool or connect with database/sql which has its own built-in pool. PgBouncer only becomes relevant when you have multiple application servers.

I've added a getting started guide to the wiki and linked to it from the README -- hopefully that clarifies some things.
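
(For readers landing here from the README: a minimal sketch of the two options described above, assuming a DATABASE_URL environment variable and the pgx v5 import paths; under v4, pgxpool.Connect plays the role of pgxpool.New and the import paths use /v4/.)

```go
package main

import (
	"context"
	"database/sql"
	"log"
	"os"

	"github.com/jackc/pgx/v5/pgxpool"
	_ "github.com/jackc/pgx/v5/stdlib" // registers the "pgx" driver for database/sql
)

func main() {
	ctx := context.Background()

	// Option 1: pgxpool, pgx's own concurrency-safe connection pool.
	pool, err := pgxpool.New(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	var greeting string
	if err := pool.QueryRow(ctx, "select 'hello from pgxpool'").Scan(&greeting); err != nil {
		log.Fatal(err)
	}
	log.Println(greeting)

	// Option 2: database/sql, which has its own built-in pool, using pgx as the driver.
	db, err := sql.Open("pgx", os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := db.QueryRowContext(ctx, "select 'hello from database/sql'").Scan(&greeting); err != nil {
		log.Fatal(err)
	}
	log.Println(greeting)
}
```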

@Zamiell (Author) commented May 16, 2020

What about lines 32 and 33?

@jackc (Owner) commented May 16, 2020

That addition would be fine, though it should link to https://pkg.go.dev/ instead of the old Go doc. The old Go doc doesn't support modules, so it would be for v3 instead of v4.

@montanaflynn

I think simply linking to the wiki would be helpful; I found it hard to find information about pgxpool or the need for it. This section of the wiki is very helpful:

https://github.com/jackc/pgx/wiki/Getting-started-with-pgx#using-a-connection-pool

@ashish21ghosh left a comment

This is helpful to people who are using PgBouncer, like me. This should be somewhere in the documentation, or at least a link for people using PgBouncer. For me, with PgBouncer, pgxpool was failing silently and returning fewer rows than expected (2 out of 250).

Comment on lines +32 to +33
// The format for "DATABASE_URL" is here:
// https://godoc.org/github.com/jackc/pgconn#ParseConfig

I've been searching the wiki and README looking for this documentation. Much thanks! I was confused because the only examples I could find were not using the standard URL format even though the variable was called DATABASE_URL.
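
(A small sketch of the two connection-string shapes that pgconn.ParseConfig accepts, in case it saves the next reader some searching; the host, credentials, and database name are placeholders, and under pgx v5 the package lives at github.com/jackc/pgx/v5/pgconn rather than github.com/jackc/pgconn.)

```go
package main

import (
	"fmt"
	"log"

	"github.com/jackc/pgconn"
)

func main() {
	// URL style -- the usual shape of a DATABASE_URL variable.
	urlStyle := "postgres://myuser:mypassword@localhost:5432/mydb?sslmode=disable"

	// Keyword/value (DSN) style, which ParseConfig also accepts.
	dsnStyle := "host=localhost port=5432 user=myuser password=mypassword dbname=mydb sslmode=disable"

	for _, connString := range []string{urlStyle, dsnStyle} {
		cfg, err := pgconn.ParseConfig(connString)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(cfg.Host, cfg.Port, cfg.Database)
	}
}
```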

@bfontaine (Contributor)

@Zamiell would you mind editing the PR to apply the changes required by Jack and fix the conflicts so we can merge it? Thanks!

@bfontaine added the "needs response" label (Needs a response from the issue/PR author) on Oct 25, 2022
@Zamiell (Author) commented Oct 25, 2022

It's a simple README update; feel free to edit what I have to your liking.

@ringerc (Contributor) commented Aug 10, 2023

> This is because pgx is only designed to be used by one process/thread at a time, unlike the other Golang SQL drivers that you may be used to.

Even with pgxpool?

I assume this is because, unlike most drivers, pgx defaults to lazily reading result-sets, so the connection isn't released for use by another query as soon as Query() returns; it's kept busy until the current code closes the row-set? Essentially, pgx always uses one-row-at-a-time mode for result-set processing.

If so, being able to internally buffer and materialize small result-sets would help a lot for this, like libpq, PgJDBC, psycopg2 etc can do (and do by default). If Rows accumulated the full result for common operations with empty or small resultsets, connection utilisation would be greatly improved.

It can be done heuristically too - during post-Query() processing, when looking for the start of the result-set after the row descriptors etc, continue on to read result rows until some small internal limit on total size of rows read is reached. That way the row-set can release its handle on the connection early for most operations in which empty or small result-sets are returned. A hint passed to Query could control the default read-ahead.

Though one possible downside of doing it heuristically is that apps could see behaviour changes as result-sets get bigger, where a workload that previously released the connection immediately instead holds it across some expensive and slow operation performed while iterating over each result row. Being explicit might be better, perhaps with a Query() option that requests rows be immediately read and buffered, with explicit limits: either a soft limit (falls back to lazy) or a hard limit (errors if the limit is reached) on the result size and/or row count that will be read.

... but it looks like pgxpool should generally take care of this anyway, so is this proposed README change now somewhat obsolete?
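
(A minimal sketch of the behaviour being discussed, assuming pgx v5 and a hypothetical widgets table: the connection that Query checks out of the pool stays busy until the rows are fully read or closed, so small result sets are best drained immediately.)

```go
package db

import (
	"context"

	"github.com/jackc/pgx/v5/pgxpool"
)

// loadNames drains a small result set immediately so the pooled connection is
// released as soon as possible. The "widgets" table is a placeholder.
func loadNames(ctx context.Context, pool *pgxpool.Pool) ([]string, error) {
	rows, err := pool.Query(ctx, "select name from widgets")
	if err != nil {
		return nil, err
	}
	// The underlying connection stays checked out until Close is called (or the
	// rows are fully read), not just until Query returns.
	defer rows.Close()

	var names []string
	for rows.Next() {
		var name string
		if err := rows.Scan(&name); err != nil {
			return nil, err
		}
		names = append(names, name)
	}
	return names, rows.Err()
}
```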

@ringerc (Contributor) commented Aug 10, 2023

@Zamiell It's not as simple as just saying "use pgbouncer" IMO.

There should be no need to if the app treats the window between Query() and closing of the result-set as a performance-critical window during which it must avoid making expensive calls, acquiring locks that might block, etc. If the app has anything expensive to do it should consume the result-set promptly and close it before proceeding. Unless an app is very database-heavy, in which case it'll need something that can use multiple concurrent connections anyway, that should be sufficient for most cases.
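
(A minimal sketch of that pattern, assuming pgx v5 and a hypothetical jobs table and handleJob helper: materialize the small result set first so the connection goes back to the pool, then do the slow work.)

```go
package worker

import (
	"context"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgxpool"
)

func processPending(ctx context.Context, pool *pgxpool.Pool) error {
	rows, err := pool.Query(ctx, "select id from jobs where state = 'pending'")
	if err != nil {
		return err
	}
	// Materialize the (small) result set right away; CollectRows also closes
	// the rows, which returns the connection to the pool.
	ids, err := pgx.CollectRows(rows, pgx.RowTo[int64])
	if err != nil {
		return err
	}

	// No connection is held here, so the slow per-job work (network calls,
	// lock acquisition, file IO, ...) doesn't tie up the pool.
	for _, id := range ids {
		if err := handleJob(ctx, id); err != nil {
			return err
		}
	}
	return nil
}

// handleJob stands in for whatever expensive work the application does per row.
func handleJob(ctx context.Context, id int64) error {
	return nil
}
```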

There are many downsides to using a pgbouncer proxy (or putting it as a reverse proxy in front of your app); it's not a lot of help in session-pooling mode. But in transaction-pooling mode many postgres features are unusable or difficult to use, such as session-level SET, some advisory locking modes, WITH HOLD cursors, some temp table modes, and much more. Handling of prepared statements is also awkward in transaction pooling mode and can cause ballooning memory use on the backends if your app uses many different kinds of statements. A blanket recommendation to "use pgbouncer" will probably do more harm than good.
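
(To make the first of those gotchas concrete, an illustrative sketch; the PgBouncer endpoint, credentials, and setting are hypothetical. Behind a transaction-pooling proxy, statements issued outside an explicit transaction may each run on a different backend session, so a session-level SET may silently not apply to a later query.)

```go
package main

import (
	"context"
	"log"

	"github.com/jackc/pgx/v5"
)

func main() {
	ctx := context.Background()

	// Hypothetical PgBouncer endpoint running in transaction-pooling mode.
	conn, err := pgx.Connect(ctx, "postgres://app:secret@pgbouncer:6432/appdb")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	// Outside an explicit transaction, each statement may be routed to a
	// different backend session, so this session-level setting can silently
	// fail to apply to later queries.
	if _, err := conn.Exec(ctx, "set statement_timeout = '5s'"); err != nil {
		log.Fatal(err)
	}

	var timeout string
	// May report the server default rather than '5s' if the two statements
	// landed on different backends.
	if err := conn.QueryRow(ctx, "show statement_timeout").Scan(&timeout); err != nil {
		log.Fatal(err)
	}
	log.Println("statement_timeout =", timeout)
}
```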

Doesn't pgxpool address most of this anyway? Apps should still minimise the time they hold a particular connection busy, of course, but internal pooling should improve things significantly for apps that require it.

@paudley (Contributor) commented Aug 10, 2023

Experience with large production apps has made me conclude that pgbouncer is rarely the answer. It requires very careful planning, and the tradeoffs often surprise people new to it. I would certainly not recommend it for most users (or indeed at all). pgxpool is the correct answer to the question "How can I use pgx in a multithreaded app?".
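
(A minimal sketch of what that looks like in practice, assuming pgx v5 and a DATABASE_URL environment variable: the pool is safe for concurrent use, so goroutines can share it directly.)

```go
package main

import (
	"context"
	"log"
	"os"
	"sync"

	"github.com/jackc/pgx/v5/pgxpool"
)

func main() {
	ctx := context.Background()

	pool, err := pgxpool.New(ctx, os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(worker int) {
			defer wg.Done()
			var now string
			// Each call checks a connection out of the pool and returns it once
			// the row has been scanned.
			if err := pool.QueryRow(ctx, "select now()::text").Scan(&now); err != nil {
				log.Printf("worker %d: %v", worker, err)
				return
			}
			log.Printf("worker %d: %s", worker, now)
		}(i)
	}
	wg.Wait()
}
```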

@Zamiell (Author) commented Aug 26, 2023

Closing this since the maintainers don't seem to be interested.

@Zamiell closed this on Aug 26, 2023
@ringerc (Contributor) commented Aug 27, 2023

Thanks for trying to contribute, @Zamiell. I'm not a maintainer of this project, btw. It's almost always helpful to suggest docs changes and the like; even when it doesn't end up being something that's agreed on, it can highlight an issue or help others find an answer.

In this case I think pgxpool is probably a suitable answer for most applications, especially when combined with limiting how long result-sets are held open. PgBouncer is useful, but it has too many gotchas to recommend for the general case.
