Shape meta data not cleaned up properly on delete #1550
This seems to be the reason for the bug:
Thus, we must ensure that the ETS entry is removed when a table is altered.
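To illustrate why the cached entry has to be removed, here is a minimal sketch in Python. It is not Electric's code: `ColumnCache`, the `items` table, and the in-memory `database` dict are all hypothetical stand-ins for the ETS table and Postgres.

```python
from typing import Callable, Dict, List


class ColumnCache:
    """Hypothetical stand-in for the ETS table that caches column info per table."""

    def __init__(self, load_columns: Callable[[str], List[str]]) -> None:
        self._load_columns = load_columns
        self._entries: Dict[str, List[str]] = {}

    def columns(self, table: str) -> List[str]:
        # Only hit the loader when the table has no cached entry.
        if table not in self._entries:
            self._entries[table] = list(self._load_columns(table))
        return self._entries[table]

    def invalidate(self, table: str) -> None:
        # Drop the cached entry so the next lookup reloads it.
        self._entries.pop(table, None)


# Hypothetical "database": table name -> current column list.
database = {"items": ["c1"]}
cache = ColumnCache(lambda table: database[table])

before = cache.columns("items")       # loads and caches the original columns
database["items"] = ["c1", "c2"]      # the table is altered
stale = cache.columns("items")        # cache still answers with the old columns
cache.invalidate("items")             # what must happen when the table changes
fresh = cache.columns("items")        # reloaded from the "database"
```

Without the `invalidate` call, every later lookup keeps returning the column list from before the table was altered.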
kevin-dp added a commit that referenced this issue on Sep 2, 2024
Fixes #1550. This PR extends the shape log collector to also clean the cached column information from ETS when a relation changes. This ensures that when a table is migrated, we don't re-use the old table information from ETS but instead load it from Postgres and re-populate the cache with the new column info.

**There's one corner case that this PR does not address:**

- Create table, insert data
- Sync a shape containing that table
- Drop the table
- Delete the shape
- Recreate the table but with a different schema
- Insert some data into the new table
- Sync a shape containing the newly recreated table

In the above scenario, when syncing the newly recreated table, we get the new data but in the old schema (so we only get the columns that also existed in the old schema). This is because Postgres logical replication stream does not inform us when a table is dropped. As a result, we can't detect that a table was dropped and thus don't know that we need to clean the cached column information. It's only when we get a Relation message that we know this. But Postgres only sends a Relation message the first time the data in the table changes and **we're subscribed to that table in the replication stream** (and that's only after syncing the shape). So, the data that was inserted before we synced the table in the last step, does not lead to a Relation message. Only if we insert data into the table after that sync step, will Postgres send a Relation message that will make us clean the cached column information. But note that at that point the row that was inserted in the previous step is already stored in storage in the format of the old schema.
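The corner case can be walked through with a toy model. This is a simplified sketch, not Electric's implementation: `ToyPostgres`, `ToySyncService`, and the `items` table are hypothetical, and the model assumes that every insert into a subscribed table delivers a Relation message, whereas Postgres sends one less often.

```python
class ToyPostgres:
    """Hypothetical in-memory database: table name -> columns and rows."""

    def __init__(self):
        self.tables = {}

    def create_table(self, name, columns):
        self.tables[name] = {"columns": list(columns), "rows": []}

    def drop_table(self, name):
        # Dropping a table sends nothing to the sync service.
        del self.tables[name]


class ToySyncService:
    """Hypothetical sync service with a column cache cleared on Relation messages."""

    def __init__(self, pg):
        self.pg = pg
        self.column_cache = {}
        self.subscribed = set()

    def sync_shape(self, table):
        self.subscribed.add(table)
        if table not in self.column_cache:
            self.column_cache[table] = list(self.pg.tables[table]["columns"])
        columns = self.column_cache[table]
        # Rows are projected onto the cached columns, stale or not.
        return [{c: row.get(c) for c in columns} for row in self.pg.tables[table]["rows"]]

    def delete_shape(self, table):
        # Only the subscription goes away; the cached columns stay.
        self.subscribed.discard(table)

    def insert(self, table, row):
        self.pg.tables[table]["rows"].append(dict(row))
        if table in self.subscribed:
            # Assumed Relation message: clear the cached column info.
            self.column_cache.pop(table, None)


pg = ToyPostgres()
service = ToySyncService(pg)

pg.create_table("items", ["c1"])
service.insert("items", {"c1": 1})
first_sync = service.sync_shape("items")

pg.drop_table("items")
service.delete_shape("items")

pg.create_table("items", ["c1", "c2"])
service.insert("items", {"c1": 2, "c2": "x"})   # not subscribed: no Relation message
stale_sync = service.sync_shape("items")        # served with the old column list

service.insert("items", {"c1": 3, "c2": "y"})   # subscribed: cache is cleared
fresh_sync = service.sync_shape("items")        # columns reloaded
```

In this toy, a later sync recomputes every row from the table, so `fresh_sync` shows both columns for all rows. In the scenario described above, the row inserted before the sync has already been written to storage in the old format, so clearing the cache afterwards does not repair it.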
When we delete a shape, we forget to clean up some metadata somewhere: the old schema information reappears if we later create a new shape for the same table.
To reproduce this problem, first create a table:
Now, fetch the shape (I'm using HTTPie, but this can be done with curl too):
Ok, we get the shape with the 2 rows. This is fine.
Now drop the table:
Now, tell Electric to delete the shape:
Ok, the shape is deleted.
Let's recreate the table but with an extra column:
Let's fetch that table:
Now, the returned data is wrong as the rows only include the `c1` column and not the `c2` column (also the schema only includes `c1`).