Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using pg-gateway, pglite & prisma #4

Open
BracketJohn opened this issue Aug 10, 2024 · 21 comments
Open

Using pg-gateway, pglite & prisma #4

BracketJohn opened this issue Aug 10, 2024 · 21 comments
Labels
documentation Improvements or additions to documentation

Comments

@BracketJohn
Copy link

BracketJohn commented Aug 10, 2024

Improve documentation

I followed the pglite guide to get a local development-setup using pg-gateway, pglite and prisma running. I could not get it to work following the docs. I then tried to use the example/pglite of this repository to create a minimal reproduction. The minimal reproduction also did not work (see below).

Describe the problem

Minimum reproduction using the example/pglite fromm this repository.

First, clone the repo and install the required dependencies:

# clone this repo
git clone https://github.com/supabase-community/pg-gateway

# go into example dir
cd examples/pglite

# install dependencies
pnpm i 

# install prisma
pnpm install @prisma/client prisma

# create the prisma schema file
make prisma
touch prisma/schema.prisma

Add the following content into the schema.prisma:

datasource db {
  provider = "postgresql"
  url      = "postgres://postgres@localhost:5432"
}

model User {
  id    Int     @id @default(autoincrement())
}

Start the database:

pnpm dev

Attempt to push the schema in a parallel terminal:

> pnpm prisma db push

Prisma schema loaded from prisma/schema.prisma
Datasource "db": PostgreSQL database "postgres", schema "public" at "localhost:5432"

Error: Schema engine error:
unexpected message from server

Running it a second time still yields an error, albeit a different one:

❯ pnpm prisma db push
Prisma schema loaded from prisma/schema.prisma
Datasource "db": PostgreSQL database "postgres", schema "public" at "localhost:5432"

Error: Schema engine error:
ERROR: prepared statement "s0" already exists

Running these commands with DEBUG="*" set reveals that:

  • during the first iteration prisma fails: when running select version();,
  • during the second iteration the "prepared statement" that already exists is select version(); - so that is probably left over from the first run

Describe the improvement

Add a prisma-subsection that specifies any extra steps to get prisma to work.

Additional context

Exact package versions this is running on:

❯ pnpm list
Legend: production dependency, optional only, dev only

pglite-example /Users/n/.tmp/pg-gateway/examples/pglite

dependencies:
@prisma/client 5.18.0
pg-gateway 0.3.0-alpha.3
prisma 5.18.0

devDependencies:
@gregnr/pglite 0.2.0-dev.8
@types/node 20.14.15
tsx 4.17.0
typescript 5.5.4

As I'm unsure whether this is related to this package at all, desired behavior or a bug I opened this as a documentation issue - I hope that's alright! Thanks for this great package, it looks super promising and I'd love to use it to create a local postgres-development setup without having to spin up a full docker container every time!

@BracketJohn BracketJohn added the documentation Improvements or additions to documentation label Aug 10, 2024
@gregnr
Copy link
Contributor

gregnr commented Aug 10, 2024

Hey @BracketJohn thanks for the detailed issue. This is good to know - I haven't tried connecting prisma to pg-gateway yet so don't know exactly what might be going wrong. Regardless I am certainly open to adding this as another example. Give me a bit to debug this one and I'll get back to you 😃

@BracketJohn
Copy link
Author

Hey @gregnr 👋

Big thanks for the reply & looking into it - glad, that the issue is helpful!

Looking forward to anything you may find,

have a good one! (:

@BracketJohn
Copy link
Author

Hey @gregnr - quick q: Is there anything I can do to support resolving this / is there any insights you've already gathered? E.g., if it seems likely to you that it is unrelated to pg-gateway, I'd try to look into the other dependencies of the setup (:

Thanks and have a good one!

@gregnr
Copy link
Contributor

gregnr commented Aug 28, 2024

Hey @BracketJohn, sorry for the slowness on this. We're in the middle of a pretty major refactor which I was hoping to get done before moving forward with more examples/tests, but it's taking longer than I expected.

I just took a stab at Prisma + pg-gateway + PGlite and am also receiving the same error as you:

unexpected message from server

Right now I'm trying to determine if this is a pg-gateway issue or a PGlite issue. In case you're interested in the details - I'm logging the protocol messages back and forth between Prisma and PGlite and noticing that the trip-up seems to happen during extended query protocol messages. For some reason PGlite responds with nothing (empty Uint8Array) after Prisma sends a Sync message, whereas IIUC it should be a ReadyForQuery message instead.

I'm going to continue digging and will keep you updated as I learn more. This is almost certainly a bigger issue than Prisma so will be important to get to the bottom of it.

@gregnr
Copy link
Contributor

gregnr commented Aug 29, 2024

Just an update - I've narrowed the problem down to how PGlite handles the extended query protocol. A similar issue exists on PGlite that tracks this: electric-sql/pglite#223

Once that is solved Prisma (and all other PG clients using extended queries) should JustWork™ 😃

@BracketJohn
Copy link
Author

Thanks for the updates & in-depth info - very helpful and interesting 🙏 Will follow the fix of electric-sql/pglite#223 closely then!

@BracketJohn
Copy link
Author

BracketJohn commented Sep 10, 2024

So the pglite folks have fixed electric-sql/pglite#223 🎊 There also was a release 0.2.6 after the fix was merged, but: the release notes do not contain / reference the fixing commit electric-sql/pglite@1522c40.

Maybe just ambiguity, definitely worth for me to try out whether it works now! Will report back here, once I get to that.

@BracketJohn
Copy link
Author

@gregnr I just tried out [email protected], when installing it I get:

❯ pnpm i -D [email protected]
 ERR_PNPM_FETCH_404  GET https://registry.npmjs.org/@jsr%2Fstd__bytes: Not Found - 404

This error happened while installing the dependencies of [email protected]

@jsr/std__bytes is not in the npm registry, or you have no permission to fetch it.

I guess this is related to:

"dependencies": {
"@std/bytes": "npm:@jsr/std__bytes@^1.0.2",
"@std/crypto": "npm:@jsr/std__crypto@^1.0.3",
"@std/encoding": "npm:@jsr/std__encoding@^1.0.3"
},

Just sharing it in case you did not notice. I know that by using an unofficial beta-version this may be intended / unproblematic / fixed on a more mature release.

@gregnr
Copy link
Contributor

gregnr commented Sep 12, 2024

@BracketJohn you're too fast 😆 this bug was caused by those deps living under JSR and losing the @jsr:registry=https://npm.jsr.io scope after packaging/publishing. This is fixed now in 0.3.0-beta.2 👍

@gregnr
Copy link
Contributor

gregnr commented Sep 12, 2024

With regard to PGlite - 0.2.6 does indeed fix the extended query issues, but we noticed one more bug where error messages do not follow up with a ReadyForQuery (causing clients to hang after errors). This has been fixed and hopefully published soon in another release.

@BracketJohn
Copy link
Author

Good to hear! I saw that they just released 0.2.7 which includes the fix for the latest bug 🐛

@BracketJohn
Copy link
Author

BracketJohn commented Sep 13, 2024

I further test everything out with [email protected], [email protected] and [email protected].

I learned the following:

  • the @next-Readme pglite example script contains a tiny error, you need to add async, see [1]
  • pnpm prisma db push leads to an error, see [2]
  • nevertheless, db push seems to work (🎊) independent of the error being thrown
    • manual inspection of db after push shows all tables
    • the application using the db can startup and function
    • still this seems off / could hint at a persisting problem
  • I also tested it inside our CI. There one of our tests still leads to the prepared statement already exists-error, see [3]
    • this test attempts to create data that already exists and then "succeeds" if an error is being thrown, so I guess this could be related to the hang-on-error you described above
    • the test does succeed, it's just stdout / stderr that logs the error thrown by prisma
    • same as above: this seems off / could hint at a persisting problem
  • see the exact script I used in [4]

[1] diff for readme example to work

- const server = net.createServer((socket) => {
+ const server = net.createServer(async (socket) => {

[2] error thrown on pglite + pg-gateway side

> vite-node prisma/pglite.ts

Server listening on port 5432
Error: write EPIPE
    at afterWriteDispatched (node:internal/stream_base_commons:161:15)
    at writeGeneric (node:internal/stream_base_commons:152:3)
    at Socket._writeGeneric (node:net:952:11)
    at Socket._write (node:net:964:8)
    at writeOrBuffer (node:internal/streams/writable:570:12)
    at _write (node:internal/streams/writable:499:10)
    at Socket.Writable.write (node:internal/streams/writable:508:10)
    at Object.write (node:internal/webstreams/adapters:215:63)
    at invokePromiseCallback (node:internal/webstreams/util:180:10)
    at node:internal/webstreams/util:185:23 {
  errno: -32,
  code: 'EPIPE',
  syscall: 'write'
}
Client disconnected

[3] prepared statement already exists error

[nitro] [unhandledRejection] PrismaClientUnknownRequestError: 
Invalid `prisma.positionTemplate.findFirst()` invocation:


Error occurred during query execution:
ConnectorError(ConnectorError { user_facing_error: None, kind: QueryError(PostgresError { code: "42P05", message: "prepared statement \"s19\" already exists", severity: "ERROR", detail: None, column: None, hint: None }), transient: false })
    at _n.handleRequestError (/home/runner/work/the-project/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:122:7117)
    at _n.handleAndLogRequestError (/home/runner/work/the-project/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:122:6235)
    at _n.request (/home/runner/work/the-project/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:122:5919)
    at async l (/home/runner/work/the-project/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:131:9116)
    at async seed (file:///home/runner/work/the-project/.nuxt/dev/index.mjs:9144:37)
    at async Object.run (file:///home/runner/work/the-project/.nuxt/dev/index.mjs:9279:24)
    at async runTask (file:///home/runner/work/the-project/.nuxt/dev/index.mjs:1732:17)
    at async file:///home/runner/work/the-project/.nuxt/dev/index.mjs:6377:3 {
  clientVersion: '5.16.2'
}

[4] Script used to pglite + pg-gateway:

import net from 'node:net'
import { PGlite } from '@electric-sql/pglite'
import { fromNodeSocket } from 'pg-gateway/node'
import { join } from 'pathe'

// create a single instance of the db, so that subsequent requests use the same db
const db = new PGlite({ dataDir: join(import.meta.dirname, 'pglite-data') })

const server = net.createServer(async (socket) => {
  await fromNodeSocket(socket, {
    serverVersion: '16.3',

    auth: {
      // No password required
      method: 'trust',
    },

    async onStartup() {
      // Wait for PGlite to be ready before further processing
      await db.waitReady
    },

    // Hook into each client message
    async onMessage(data, { isAuthenticated }) {
      // Only forward messages to PGlite after authentication
      if (!isAuthenticated) {
        return
      }

      // Forward raw message to PGlite and send response to client
      return await db.execProtocolRaw(data)
    },
  })

  socket.on('end', () => {
    console.info('Client disconnected')
  })
})

server.listen(5432, () => {
  console.info('Server listening on port 5432')
})

@BracketJohn
Copy link
Author

I did further testing. While the pnpm prisma db push-command seems to work, albeit throwing an error, the pnpm prisma migrate command which is used to automatically generate & apply migrations (see docs here) does not seem to work: The command gets stuck. I'm using [email protected] which should've resolved the hang-bug that sounds related to the problem I'm encountering here.

Here's a minimal reproduction: pg-gateway-migrate-stuck-reproduction.zip. The README.md contains the (few) steps to reproduce.

@gregnr
Copy link
Contributor

gregnr commented Sep 24, 2024

Thanks for testing @BracketJohn. We did a bit of a deep dive into prisma and learned that prisma creates a shadow database during prisma migrate dev to detect schema drift. By default prisma creates this DB automatically via:

CREATE DATABASE "prisma_migrate_shadow_db_<uuid>"

Then opens a new connection to it concurrently with the regular DB. There are 2 issues going on:

  1. Our PGlite example currently ignores the database param sent by the client. We could grab this via clientParams state and pass it to PGlite in order to open the correct DB. But:
  2. PGlite is single-user, so concurrent connections to a single instance is not possible. Normally your options are:
    • Create a new PGlite instance every connection. This only really works if you want an ephemeral DB each connection that doesn't persist.
    • Reject more than 1 connection to the DB
    • Queue connections so that only one connection operates at a time

You could experiment with the queuing strategy, but I have a hunch prisma expects both connection to run concurrently. Another approach is to create a second temporary PGlite instance dedicated to act as the shadow DB, then manually configure prisma to use this as the shadow DB. It would roughly look like:

  1. Configure prisma manually with a shadow DB:

    datasource db {
      provider          = "postgresql"
      url               = env("DATABASE_URL")
      shadowDatabaseUrl = env("SHADOW_DATABASE_URL")
    }
    

    where:

    DATABASE_URL=postgres://postgres@localhost/postgres
    SHADOW_DATABASE_URL=postgres://postgres@localhost/prisma-shadow
    
  2. In pg-gateway, check to see which DB the client is connecting to and direct to the appropriate instance (untested):

    import { PGlite } from '@electric-sql/pglite'
    import { createServer } from 'node:net'
    import { join } from 'node:path'
    import { fromNodeSocket } from 'pg-gateway/node'
    
    // create a single instance of the db, so that subsequent requests use the same db
    const db = new PGlite({ dataDir: join(import.meta.dirname, 'pglite-data') })
    
    const server = createServer(async (socket) => {
      let activeDb = db;
    
      await fromNodeSocket(socket, {
        serverVersion: '16.3',
    
        auth: {
          // No password required
          method: 'trust',
        },
    
        async onStartup({ clientParams }) {
          // create a temp in-memory instance if connecting to the prisma shadow DB
          if (clientParams?.database === 'prisma-shadow') {
            activeDb = new PGlite();
          }
    
          // Wait for PGlite to be ready before further processing
          await activeDb.waitReady
        },
    
        // Hook into each client message
        async onMessage(data, { isAuthenticated }) {
          // Only forward messages to PGlite after authentication
          if (!isAuthenticated) {
            return
          }
    
          // Forward raw message to PGlite and send response to client
          return await activeDb.execProtocolRaw(data)
        },
      })
    
      socket.on('end', () => {
        console.info('Client disconnected')
      })
    })
    
    server.listen(5432, () => {
      console.info('Server listening on port 5432')
    })

@AndresGnu
Copy link

  • prepared statement already exists

My temporal solution for [3] prepared statement already exists is add this param ?pgbouncer=true in DATABASE_URL

https://www.prisma.io/docs/orm/prisma-client/setup-and-configuration/databases-connections/pgbouncer

@BracketJohn
Copy link
Author

BracketJohn commented Sep 25, 2024

My temporal solution for [3] prepared statement already exists is add this param ?pgbouncer=true in DATABASE_URL

Thanks, I'll try this out.

update: Just tried it: The current 0.3.0-beta version already resolves the prepared statement-problem in a better fashion than adding the pgbounce-param! @AndresGnu I suggest you try that version out, see the next branch for up-to-date instal instructions: #21


Hey @gregnr!

Thanks for taking the time & doing a deep dive into the reported topics, very helpful! 💯 I think the shadow DB strategy could work and will try that one out to resolve our migration-woes.

One thing I do not understand yet: Why does pnpm prisma db push cause an error to be logged, see my point here:

pnpm prisma db push leads to an error, see [2]

and the error that is shown documented here:

[2] error thrown on pglite + pg-gateway side

I'm pretty sure that pnpm prisma db push does not need a shadow-db to work: If it would, it would also get stuck. So, what is going on for the documnted error to be caused?

@AndresGnu
Copy link

@BracketJohn I am using version 0.3.0-beta.3. But in my case I am not managing only one prisma connection.

image
This is a diagram of my case

When I connect to only one of my two services it works correctly. But when the other service connects and starts sending requests I have the error of [3] prepared statement already exists

Implementing the pgbounce parameter still generates internal errors but they do not break the flow as it does without the parameter

This is my implementation.


const base = process.env.APP_DATA || '';
console.log(base);

const dbPath = resolve(base, './data/bot/db');
fs.ensureDirSync(dbPath);

function createDb() {
  return new PGlite(dbPath, {
    // debug: 5,
    // database: 'bot',
    extensions: { uuid_ossp },
  });
}
let closing = false;
let db = createDb();
const server = net.createServer(async (socket) => {
  // console.log(sockt.remoteAddress,socket);
  const conection = `${socket.remoteAddress}:${socket.remotePort}`;
  console.log('Client connected', conection);
  if (closing) {
    console.log('waiting for closing');
    await waitForFunction(async () => !closing);
  }
  if (!db) {
    db = createDb();
  }

  await fromNodeSocket(socket, {
    serverVersion: '16.3',

    auth: {
      // No password required
      method: 'trust',
    },

    async onStartup({ clientParams }) {
      console.log(clientParams?.database);

      console.log('onStartup');
      await waitForFunction(async () => {
        console.log('onStartup: waiting for transaction');
        return !db.isInTransaction();
      }, 100);
      // Wait for PGlite to be ready before further processing
      await db.waitReady;
    },

    // Hook into each client message
    async onMessage(data, { isAuthenticated }) {
      // Only forward messages to PGlite after authentication
      if (!isAuthenticated) {
        return;
      }
      await waitForFunction(async () => {
        console.log('onMessage: waiting for transaction');
        return !db.isInTransaction();
      }, 100);
      // Forward raw message to PGlite and send response to client
      const r = await db.execProtocolRaw(data);
      return r;
    },
  });

  socket.on('end', async () => {
    console.log('Client disconnected', conection);
    await waitForFunction(async () => {
      return !db.isInTransaction();
    });
    closing = true;
    await db?.close().catch(console.error);
    db = null;
    closing = false;
  });
});

server.listen(+process.env.PORT, () => {
  console.log('Server listening on port 5472');
});

@BracketJohn
Copy link
Author

So, I've done some extensive testing and playing around with different pg-gateway + pglite solutions on 3 of our internal production-grade projects. The TLDR of our experiences is:

  • connection queuing did not work well
  • the shadow-database-approach that @gregnr does indeed work well
  • crucially, setting the connection_limit and pgbouncer options that prisma supports makes a large difference in usability of the solution
  • Specifically, not only the database_url but also the shadow database url and direct url should be configured to get the best experience

Here are the prisma settings that worked best for us:

  // This `env(..)` value will be read from the `.env` file or from the environment.
  url = "postgres://postgres@localhost:5432/postgres?pgbouncer=true&connection_limit=1"

  // These two are required for development only. NOTE: If we ever use connection poolers such as `pgbouncer` or prisma accelerate, we'll need to make the `directUrl` dynamic and set it in production, see `directUrl` description here: https://www.prisma.io/docs/orm/reference/prisma-schema-reference
  shadowDatabaseUrl = "postgres://postgres@localhost/prisma-shadow?pgbouncer=true&connection_limit=1"
  directUrl         = "postgres://postgres@localhost:5432/postgres?connection_limit=1"

The exact combinations of connection_limit and pgboucner flags are important.

Here is the final script we used to run the database, supporting the shadow database approach:

/**
 * Script that starts a postgres database using pg-gateway (https://github.com/supabase-community/pg-gateway) and pglite (https://github.com/electric-sql/pglite).
 *
 * We use this database for local development with prisma ORM. The script also supports creating a `shadow-database`, which is a second, separate database
 * that prisma uses for certain commands, such as `pnpm prisma migrate dev`: https://www.prisma.io/docs/orm/prisma-migrate/understanding-prisma-migrate/shadow-database.
 *
 * To make use of the shadow-database add `/prisma-shadow` to the DSN you provide. This script will then spin up a second, in-memory-only database and connect you to it.
 *
 * This whole script approach is novel to us (before we used sqlite locally). Here is the PR that brought it all together: https://github.com/sidestream-tech/REDACTED
 */
import net from 'node:net'
import { PGlite } from '@electric-sql/pglite'
import { fromNodeSocket } from 'pg-gateway/node'

const db = new PGlite({ dataDir: 'pglite-data' })
let activeDb = db

const server = net.createServer(async (socket) => {
  activeDb = db

  console.info(`Client connected: ${socket.remoteAddress}:${socket.remotePort}`)
  await fromNodeSocket(socket, {
    serverVersion: '16.3',

    auth: {
      // No password required
      method: 'trust',
    },

    async onStartup({ clientParams }) {
      // create a temp in-memory instance if connecting to the prisma shadow DB
      if (clientParams?.database === 'prisma-shadow') {
        console.info(`Connecting client to shadow database`)
        activeDb = new PGlite()
      }

      // Wait for PGlite to be ready before further processing
      await activeDb.waitReady
    },

    // Hook into each client message
    async onMessage(data, { isAuthenticated }) {
      // Only forward messages to PGlite after authentication
      if (!isAuthenticated) {
        return
      }

      // Forward raw message to PGlite and send response to client
      return await activeDb.execProtocolRaw(data)
    },
  })

  socket.on('end', () => {
    console.info('Client disconnected')
  })
})

server.listen(5432, () => {
  console.info('Server listening on port 5432')
})

With this setup every command we tested worked flawlessly. You can see that prisma nicely respects the connection limit. The only "off" thing that still occurs is this error on the first pnpm prisma db push run:

Error: write EPIPE
    at afterWriteDispatched (node:internal/stream_base_commons:161:15)
    at writeGeneric (node:internal/stream_base_commons:152:3)
    at Socket._writeGeneric (node:net:952:11)
    at Socket._write (node:net:964:8)
    at writeOrBuffer (node:internal/streams/writable:570:12)
    at _write (node:internal/streams/writable:499:10)
    at Socket.Writable.write (node:internal/streams/writable:508:10)
    at Object.write (node:internal/webstreams/adapters:215:63)
    at invokePromiseCallback (node:internal/webstreams/util:180:10)
    at node:internal/webstreams/util:185:23 {
  errno: -32,
  code: 'EPIPE',
  syscall: 'write'
}

This error seems to be inconsequential and does not lead to any problems we've experienced.

With this, we are happy to use pg-gateway + pglite as a solution for our local development setups.

Thanks @gregnr for all the help & the nice debugging-tandem, I certainly enjoyed it 🙌

@gregnr
Copy link
Contributor

gregnr commented Sep 30, 2024

Amazing work @BracketJohn. Thanks for sharing your final implementation 🙌

I'm in the middle of implementing more robust connection handling logic (closing each side of the stream at the correct time, exposing close events you can hook into via promise, etc) which hopefully solves the above issue since it sounds related. This is likely the last piece before 0.3.0 is released. Will keep you updated.

@gregnr
Copy link
Contributor

gregnr commented Sep 30, 2024

@AndresGnu if I understand your use case correctly, it's worth reiterating that PGlite only operates on a single connection (single-user mode), so sending messages from multiple connections to the same PGlite instance will have undefined behaviour. For example:

  • Any session variables / prepared statements / etc will be shared between both connections
  • Any postgres wire protocol that operates over multiple messages (eg. extended query protocol) will lead to corrupt requests/responses if both connections send queries at the same time
  • If a transaction is opened by one connection, the other connection will be querying against that transaction

At a minimum I would suggest tracking extended queries and queue them between connections so that they're atomic, and also tracking and queuing transactions between connections so that they don't overlap. Though there are no guarantees these 2 things alone will make this work.

Future versions of pg-gateway will support onQuery hooks and possibly transaction monitoring that might make these things easier to track.

@BracketJohn
Copy link
Author

BracketJohn commented Oct 8, 2024

@gregnr happy to share & thanks again to you for being along on the ride plus working the pg-gateway side of things. We've now been using pg-gateway + pglite for local development and it is mostly working fine for us. There's some hiccups, but I want to oberserve further before documenting them here.

Getting rid of Error: write EPIPE would definitely be nice for us, it was one of the main feedback points by devs & PMs: They were confused on whether this is an actionable problem or something to be ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

3 participants