Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sqlite): Prevent the execution of 'migrate' script if user_version==0 #2031

Merged
merged 6 commits into from
Sep 14, 2023

Conversation

Ivansete-status
Copy link
Collaborator

@Ivansete-status Ivansete-status commented Sep 13, 2023

Description

Given a nwaku node with Store-SQLite mounted, it will try to apply a migration script because the user_version present in the SQLite file is 0, whereas the expected version is 7.

Changes

  • Avoid migrate the SQLite schema if the user_version == 0 and the # of columns in PK is 3. In this case, we also enforce the SchemaVersion 7.

How to replicate the issue:

In order to replicate the issue, the node should have a certain amount of messages stored in the database.

Use the next command, tailored by @vpavlin, to start the node, but change the docker image each time:

docker run -it --rm -v $PWD/data:/data:z XXXdocker-imageXXX --relay --store --store-message-db-url=sqlite:///data/wakustore.sqlite3 --log-level=DEBUG --dns-discovery=true --dns-discovery-url=enrtree://AOGECG2SPND25EEFMAJ5WF3KSGJNSGV356DSTL2YVLLZWIV6SAYBM@prod.waku.nodes.status.im --discv5-discovery=true --discv5-udp-port=9005 --discv5-enr-auto-update=True
  1. Start the node with: statusteam/nim-waku:v0.17.0. This will start normally.
  2. Start the node with: statusteam/nim-waku:v0.19.0. The node fails to start with the message: "failed to setup archive driver: failed to execute migration scripts: failed to execute migration statement".

How to validate the fix:

  1. Start the node with: statusteam/nim-waku:v0.17.0. This will start normally.
  2. Start the node with: quay.io/wakuorg/nwaku-pr:2031. This will start normally, and the next message should appear: "We found user_version 0 but the database schema reflects the user_version 7".
  3. Start the node with: statusteam/nim-waku:v0.19.0. This will start normally.

Issue

closes #2027

@Ivansete-status Ivansete-status self-assigned this Sep 13, 2023
@github-actions
Copy link

github-actions bot commented Sep 13, 2023

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:2031

Built from 54bdf98

@jm-clius
Copy link
Contributor

Not sure I follow this fix - why would we consider nodes with user_version=0 to be up to date? How will we update their schemas in future? And do we know why the user_version would have been reset to 0 at all? If we know why this would have happened, I would imagine a temporary fix script is that for such nodes the user_version should be set to the latest target_version and migration skipped.

@vpavlin
Copy link
Member

vpavlin commented Sep 13, 2023

I am not sure how to reproduce the original issue - I ran

docker run -it --rm -v $PWD/data:/data:z statusteam/nim-waku:v0.17.0 --relay=true --store=true --store-message-db-url=sqlite:///data/wakustore.sqlite3

then

docker run -it --rm -v $PWD/data:/data:z statusteam/nim-waku:v0.19.0 --relay=true --store=true --store-message-db-url=sqlite:///data/wakustore.sqlite3

And I get

INF 2023-09-13 15:33:51.702+00:00 database schema is outdated                topics="sqlite" tid=1 file=db_sqlite.nim:444 userVersion=0 targetVersion=7

In both executions, but no errors

@vpavlin
Copy link
Member

vpavlin commented Sep 13, 2023

Ok, reproduced it - it needs some messages in the db (which makes sense, since it is caused by some constraint change)

docker run -it --rm -v $PWD/data:/data:z statusteam/nim-waku:v0.17.0 --relay --store --store-message-db-url=sqlite:///data/wakustore.sqlite3 --log-level=DEBUG --dns-discovery=true --dns-discovery-url=enrtree://AOGECG2SPND25EEFMAJ5WF3KSGJNSGV356DSTL2YVLLZWIV6SAYBM@prod.waku.nodes.status.im --discv5-discovery=true --discv5-udp-port=9005 --discv5-enr-auto-update=True

wait to get messages in (or send your own non-ephemeral messages..)

THen I got

INF 2023-09-13 15:47:54.442+00:00 database schema is outdated                topics="sqlite" tid=1 file=db_sqlite.nim:444 userVersion=0 targetVersion=7
DBG 2023-09-13 15:47:54.442+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS  Message(\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:47:54.442+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS  Message(\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:47:54.442+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS Message_backup (\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:47:54.454+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS Message_backup (\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:47:54.455+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="INSERT INTO Message_backup SELECT id, timestamp, contentTopic, pubsubTopic, payload, version FROM Message;"
ERR 2023-09-13 15:47:54.468+00:00 failed to execute migration statement      topics="sqlite" tid=1 file=db_sqlite.nim:465 statement="INSERT INTO Message_backup SELECT id, timestamp, contentTopic, pubsubTopic, payload, version FROM Message;" error="constraint failed"
ERR 2023-09-13 15:47:54.468+00:00 4/7 Mounting protocols failed              topics="wakunode main" tid=1 file=wakunode2.nim:85 error="failed to setup archive driver: failed to execute migration scripts: failed to execute migration statement"

@Ivansete-status
Copy link
Collaborator Author

Thanks for the comments!
Yes, this is a workaround that doesn't fix the actual problem. The way how @vpavlin runs it makes it to properly run the migration scripts. But, if it runs from a docker compose, then it can't find the migration scripts and crashes.

I will look for a better solution. This cannot be merged indeed.

@vpavlin
Copy link
Member

vpavlin commented Sep 13, 2023

No, I do not agree, it is actually executing migrations and failing to insert messages from Message table into Message_backup table for some reason (some constraint is failing - PK, NULL, unique index?) But it does not say what kind of constraint

@Ivansete-status
Copy link
Collaborator Author

Okay I see, in my case I saw that the migration scripts are being run properly, having the table empty:

INF 2023-09-13 15:50:54.911+00:00 database schema is outdated                topics="sqlite" tid=1 file=db_sqlite.nim:444 userVersion=0 targetVersion=7
DBG 2023-09-13 15:50:54.911+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS  Message(\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.911+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS  Message(\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.911+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS Message_backup (\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.921+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS Message_backup (\n        id BLOB PRIMARY KEY,\n        timestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.921+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="INSERT INTO Message_backup SELECT id, timestamp, contentTopic, pubsubTopic, payload, version FROM Message;"
DBG 2023-09-13 15:50:54.921+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="INSERT INTO Message_backup SELECT id, timestamp, contentTopic, pubsubTopic, payload, version FROM Message;"
DBG 2023-09-13 15:50:54.921+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP TABLE Message;"
DBG 2023-09-13 15:50:54.923+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP TABLE Message;"
DBG 2023-09-13 15:50:54.923+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS Message(\n        id BLOB PRIMARY KEY,\n        receiverTimestamp REAL NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp REAL NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.924+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS Message(\n        id BLOB PRIMARY KEY,\n        receiverTimestamp REAL NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp REAL NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.924+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="INSERT INTO Message (id, receiverTimestamp, contentTopic, pubsubTopic, payload, version, senderTimestamp)\n    SELECT id, timestamp, contentTopic, pubsubTopic, payload, version, 0\n    FROM Message_backup;"
DBG 2023-09-13 15:50:54.924+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="INSERT INTO Message (id, receiverTimestamp, contentTopic, pubsubTopic, payload, version, senderTimestamp)\n    SELECT id, timestamp, contentTopic, pubsubTopic, payload, version, 0\n    FROM Message_backup;"
DBG 2023-09-13 15:50:54.924+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP TABLE Message_backup;"
DBG 2023-09-13 15:50:54.925+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP TABLE Message_backup;"
DBG 2023-09-13 15:50:54.925+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS Message_backup (\n        id BLOB PRIMARY KEY,\n        receiverTimestamp REAL NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp REAL NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.926+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS Message_backup (\n        id BLOB PRIMARY KEY,\n        receiverTimestamp REAL NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp REAL NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.926+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="INSERT INTO Message_backup SELECT id, receiverTimestamp, contentTopic, pubsubTopic, payload, version, senderTimestamp FROM Message;"
DBG 2023-09-13 15:50:54.926+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="INSERT INTO Message_backup SELECT id, receiverTimestamp, contentTopic, pubsubTopic, payload, version, senderTimestamp FROM Message;"
DBG 2023-09-13 15:50:54.926+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP TABLE Message;"
DBG 2023-09-13 15:50:54.927+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP TABLE Message;"
DBG 2023-09-13 15:50:54.927+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS Message(\n        id BLOB PRIMARY KEY,\n        receiverTimestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.928+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS Message(\n        id BLOB PRIMARY KEY,\n        receiverTimestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp INTEGER NOT NULL\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.928+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="INSERT INTO Message (id, receiverTimestamp, contentTopic, pubsubTopic, payload, version, senderTimestamp)\n    SELECT id, CAST(receiverTimestamp*1000000000 AS INTEGER), contentTopic, pubsubTopic, payload, version, CAST(senderTimestamp*1000000000 AS INTEGER)\n    FROM Message_backup;"
DBG 2023-09-13 15:50:54.928+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="INSERT INTO Message (id, receiverTimestamp, contentTopic, pubsubTopic, payload, version, senderTimestamp)\n    SELECT id, CAST(receiverTimestamp*1000000000 AS INTEGER), contentTopic, pubsubTopic, payload, version, CAST(senderTimestamp*1000000000 AS INTEGER)\n    FROM Message_backup;"
DBG 2023-09-13 15:50:54.928+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP TABLE Message_backup;"
DBG 2023-09-13 15:50:54.928+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP TABLE Message_backup;"
DBG 2023-09-13 15:50:54.928+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="ALTER TABLE Message RENAME TO Message_backup;"
DBG 2023-09-13 15:50:54.930+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="ALTER TABLE Message RENAME TO Message_backup;"
DBG 2023-09-13 15:50:54.931+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS Message(\n        id BLOB,\n        receiverTimestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp INTEGER NOT NULL,\n        CONSTRAINT messageIndex PRIMARY KEY (senderTimestamp, id, pubsubTopic)\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.932+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS Message(\n        id BLOB,\n        receiverTimestamp INTEGER NOT NULL,\n        contentTopic BLOB NOT NULL,\n        pubsubTopic BLOB NOT NULL,\n        payload BLOB,\n        version INTEGER NOT NULL,\n        senderTimestamp INTEGER NOT NULL,\n        CONSTRAINT messageIndex PRIMARY KEY (senderTimestamp, id, pubsubTopic)\n    ) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.932+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="INSERT INTO Message\n    SELECT *\n    FROM Message_backup;"
DBG 2023-09-13 15:50:54.932+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="INSERT INTO Message\n    SELECT *\n    FROM Message_backup;"
DBG 2023-09-13 15:50:54.932+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP TABLE Message_backup;"
DBG 2023-09-13 15:50:54.933+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP TABLE Message_backup;"
DBG 2023-09-13 15:50:54.933+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE INDEX IF NOT EXISTS i_msg ON Message (contentTopic, pubsubTopic, senderTimestamp, id);"
DBG 2023-09-13 15:50:54.934+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE INDEX IF NOT EXISTS i_msg ON Message (contentTopic, pubsubTopic, senderTimestamp, id);"
DBG 2023-09-13 15:50:54.934+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="ALTER TABLE message RENAME COLUMN receiverTimestamp TO storedAt;"
DBG 2023-09-13 15:50:54.936+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="ALTER TABLE message RENAME COLUMN receiverTimestamp TO storedAt;"
DBG 2023-09-13 15:50:54.937+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP INDEX IF EXISTS i_msg;"
DBG 2023-09-13 15:50:54.938+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP INDEX IF EXISTS i_msg;"
DBG 2023-09-13 15:50:54.938+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE INDEX IF NOT EXISTS i_query ON message (contentTopic, pubsubTopic, storedAt, id);"
DBG 2023-09-13 15:50:54.938+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE INDEX IF NOT EXISTS i_query ON message (contentTopic, pubsubTopic, storedAt, id);"
DBG 2023-09-13 15:50:54.939+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP INDEX IF EXISTS i_rt;"
DBG 2023-09-13 15:50:54.939+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP INDEX IF EXISTS i_rt;"
DBG 2023-09-13 15:50:54.939+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE INDEX IF NOT EXISTS i_ts ON message (storedAt);"
DBG 2023-09-13 15:50:54.939+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE INDEX IF NOT EXISTS i_ts ON message (storedAt);"
DBG 2023-09-13 15:50:54.939+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="ALTER TABLE message RENAME TO message_backup;"
DBG 2023-09-13 15:50:54.942+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="ALTER TABLE message RENAME TO message_backup;"
DBG 2023-09-13 15:50:54.942+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="CREATE TABLE IF NOT EXISTS message(\n  pubsubTopic BLOB NOT NULL,\n  contentTopic BLOB NOT NULL,\n  payload BLOB,\n  version INTEGER NOT NULL,\n  timestamp INTEGER NOT NULL,\n  id BLOB,\n  storedAt INTEGER NOT NULL,\n  CONSTRAINT messageIndex PRIMARY KEY (storedAt, id, pubsubTopic)\n) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.943+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="CREATE TABLE IF NOT EXISTS message(\n  pubsubTopic BLOB NOT NULL,\n  contentTopic BLOB NOT NULL,\n  payload BLOB,\n  version INTEGER NOT NULL,\n  timestamp INTEGER NOT NULL,\n  id BLOB,\n  storedAt INTEGER NOT NULL,\n  CONSTRAINT messageIndex PRIMARY KEY (storedAt, id, pubsubTopic)\n) WITHOUT ROWID;"
DBG 2023-09-13 15:50:54.943+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="INSERT OR IGNORE INTO message(pubsubTopic, contentTopic, payload, version, timestamp, id, storedAt)\n  SELECT pubsubTopic, contentTopic, payload, version, senderTimestamp, id, storedAt\n  FROM message_backup;"
DBG 2023-09-13 15:50:54.943+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="INSERT OR IGNORE INTO message(pubsubTopic, contentTopic, payload, version, timestamp, id, storedAt)\n  SELECT pubsubTopic, contentTopic, payload, version, senderTimestamp, id, storedAt\n  FROM message_backup;"
DBG 2023-09-13 15:50:54.943+00:00 executing migration statement              topics="sqlite" tid=1 file=db_sqlite.nim:461 statement="DROP TABLE message_backup;"
DBG 2023-09-13 15:50:54.944+00:00 migration statement executed succesfully   topics="sqlite" tid=1 file=db_sqlite.nim:468 statement="DROP TABLE message_backup;"
DBG 2023-09-13 15:50:54.945+00:00 database user_version updated              topics="sqlite" tid=1 file=db_sqlite.nim:473 userVersion=7
DBG 2023-09-13 15:50:54.945+00:00 finished message store's sqlite database migration topics="waku archive migration" tid=1 file=migrations.nim:37

@jm-clius
Copy link
Contributor

Right, so this may have been caused by a migration that was interrupted? One impact is that potentially there might now be inconsistent schemas on upgraded nodes (since the migration failed) and a temporary backup tables that were never deleted. The fix would be to ensure that the migration is fixed and that impacted nodes can properly "get themselves out of the situation"?

@vpavlin
Copy link
Member

vpavlin commented Sep 13, 2023

Migration 00004_extendPrimaryKey "broke it"

Because it changes the PK to also inclide senderTimestamp, so when the user_version is wrong and the migrations try to re-run from scratch, since the 00004 already happened, the older migrations could have duplicates for the old PK (which was only id column) - we can have same id accross multiple messages after 00004 is applied since the PK consists of multiple columns

So if the user_version is 0, nwaku re-runs all the migrations and migration script 00002 will fail because some rows may have conflicting id values

So the real issue is really actually missing user_version which breaks the whole migration process (and forces nwaku to start from scratch, although the DB is already past that point)

@vpavlin
Copy link
Member

vpavlin commented Sep 13, 2023

My proposal would to

  1. check the user_version and if it is 0, then we are starting from scratch, or it is highly probably the screwed up v0.17.0 release
  2. Analyze the schema and decide if it actually matches 0.17.0
  3. If it does match, set the user_version in the DB

Execute all the above before migrations kick in, so when they do, we actually go from the right schema version

THen keep this workaround in for a few releases to make sure people have enough time to migrate from 0.17.0 and then remove the check

WDYT?

@Ivansete-status
Copy link
Collaborator Author

My proposal would to ...

I like the idea of checking the current schema, as you mentioned to me on Discord, and if the PK has three columns, then we assume that the version is 7, and we write 7 to the user_version, and skip the migration procedure.

@Ivansete-status
Copy link
Collaborator Author

Btw, the schema version was set to 7 in the next commit: 4509f4f

And therefore, started to be present from the nwaku 0.13.0.

@Ivansete-status
Copy link
Collaborator Author

Not sure I follow this fix - why would we consider nodes with user_version=0 to be up to date? How will we update their schemas in future? And do we know why the user_version would have been reset to 0 at all? If we know why this would have happened, I would imagine a temporary fix script is that for such nodes the user_version should be set to the latest target_version and migration skipped.

The main issue is that, when starting a node from scratch with for example 0.17.0, it doesn't set the SchemaVersion at all, although it is creating a database whose scheme reflects the expectations for version 7.

In the PR I'm applying @vpavlin's suggestion: if user_version == 0 && num_columns_in_PK == 3, then enforce SchemaVersion (user_version) to 7 within the database, and that fixes the database itself and then, a 0.19.0 version can start correctly.

@Ivansete-status
Copy link
Collaborator Author

🔔 The migration problem started to occur in version 0.14.0. In other words, the node started with SchemaVersion 7 (three columns in PK) but didn't set the user_version to 7 because the "migration" scripts didn't kick off.

The reason why the migration scripts didn't start is because the migration.nim file was moved to a deeper folder but the relative path was not updated. That can be seen by looking for "migrations.nim" in https://github.com/waku-org/nwaku/compare/v0.13.0..v0.14.0

image

info "We found user_version 0 but the database schema reflects the user_version 7"
## Force the correct schema version
? db.setUserVersion( 7 )
return ok()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in case where

  1. User is running v0.17.0
  2. We update schema in v0.21.0
  3. User updates to v0.21.0

IIUC the user_version gets set to 7 and then migrate proc returns and the migration does not happen - which is wrong - right? IMO the migration should continue with corrected user_version

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!
In this case, the migration should not happen because the database is using the schema version 7 (three columns in PK) but the user_version value was 0.

If we update the schema in 0.21.0 we will need to revisit this logic of course. In fact, I think we need to modify this logic the next time we modify the schema somehow.

Copy link
Contributor

@jm-clius jm-clius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, as it seems like this will indeed fix the immediate issue. What bothers me though is that the CREATE script will still create a DB now with user_version 0 and this logic just updates it to 7? I think the CREATE script itself should set the user_version to the correct schema number.

@Ivansete-status
Copy link
Collaborator Author

Thanks so much indeed for the feedback and comments guys!

I recently applied a better approach, advised by @vpavlin, and we aim to have a future-proof solution.

We'd need to keep this workaround during 10 version (?)

I'll elaborate more on the actual problem in the related issue

@vpavlin
Copy link
Member

vpavlin commented Sep 14, 2023

Approving, as it seems like this will indeed fix the immediate issue. What bothers me though is that the CREATE script will still create a DB now with user_version 0 and this logic just updates it to 7? I think the CREATE script itself should set the user_version to the correct schema number.

Ah, you are right, I haven't realized - if the DB is initialized from scratch, it will have user_version = 0, only if it got migrated it will have 7, so we should set the user_verion on init of the DB, which should prevent future issues with this

Copy link
Member

@vpavlin vpavlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested the upgrade from "broken" v0.17.0 instance to this PR and it worked.

I think we are ok even if we merge it like this, but we need to resolve the cause (which seems to be not setting user_version on DB init) soon-ish

@Ivansete-status Ivansete-status merged commit aa3e1a6 into master Sep 14, 2023
16 checks passed
@Ivansete-status Ivansete-status deleted the fix-user-version-migrate-sqlite branch September 14, 2023 12:12
vpavlin pushed a commit that referenced this pull request Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: wrong user_version in sqlite database that blocks the run of a Waku node
3 participants