
bug: failed to mount waku archive protocol when using postgres #2242

Closed · jakubgs opened this issue Nov 23, 2023 · 5 comments · Fixed by #2244
Labels: bug (Something isn't working)

Comments

jakubgs (Contributor) commented Nov 23, 2023

Problem

Both current master and the v0.22.0 release are stuck in a restart loop when using a PostgreSQL database:

ERR 2023-11-23 12:37:19.449+00:00 4/7 Mounting protocols failed
    topics="wakunode main" tid=1 file=wakunode2.nim:89
    error="failed to mount waku archive protocol: error in mountArchive: failed execution of retention policy: failed to get Page size:"

Impact

The current codebase can't be used on the status.test fleet, which is being switched to PostgreSQL.

To reproduce

Use Nim-Waku node with PostgreSQL.
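For reference, a rough reproduction sketch follows. The flag names (--store, --store-message-db-url, --store-message-retention-policy) and the connection string are my assumption of a typical setup rather than the exact fleet configuration; as discussed in the comments below, the size-based retention policy is what triggers the failure.

# Hypothetical reproduction sketch; flag names, value formats and the
# connection string are assumptions, adjust to the actual deployment.
./build/wakunode2 \
  --store=true \
  --store-message-db-url="postgres://user:password@localhost:5432/nwaku" \
  --store-message-retention-policy="size:10GB"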

Expected behavior

The node starts and mounts the Waku Archive protocol successfully.

Screenshots/logs

See the error log above.

nwaku version/commit hash

Both v0.22.0 and current master.


jakubgs (Contributor, Author) commented Nov 23, 2023

I just checked and this issue appears in v0.21.0 as well.

jakubgs (Contributor, Author) commented Nov 23, 2023

The shards.test fleet is running aeb77a3e, which is current master, but doesn't have this problem:

[email protected]:~ % d
CONTAINER ID   NAMES            IMAGE                              CREATED        STATUS
920badb37699   nim-waku-store   wakuorg/nwaku:deploy-shards-test   20 hours ago   Up 20 hours (healthy)

[email protected]:~ % d inspect wakuorg/nwaku:deploy-shards-test | grep commit
                "commit": "aeb77a3e",

Ivansete-status (Collaborator) commented

Thanks for the detailed info!
We have a bug related to the "size" retention policy: it fails when the database doesn't exist yet.

method execute*(p: SizeRetentionPolicy,
                driver: ArchiveDriver):
                Future[RetentionPolicyResult[void]] {.async.} =
  ## when db size overshoots the database limit, shred 20% of outdated messages

  # get page size of database
  let pageSizeRes = await driver.getPagesSize()
  let pageSize: int64 = int64(pageSizeRes.valueOr(0) div 1024)

  if pageSize == 0:
    return err("failed to get Page size: " & pageSizeRes.error)

This is something we need to fix.
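As a purely illustrative sketch (not the actual fix that landed in #2244), one option would be to treat an unreadable or zero page size as "nothing to prune yet" instead of a fatal error, so a freshly created PostgreSQL database doesn't abort the archive mount:

# Hypothetical sketch only; assumes the surrounding module's imports
# (stew/results, chronos, etc.) and the existing SizeRetentionPolicy type.
method execute*(p: SizeRetentionPolicy,
                driver: ArchiveDriver):
                Future[RetentionPolicyResult[void]] {.async.} =
  ## when db size overshoots the limit, shred 20% of outdated messages

  let pageSizeRes = await driver.getPagesSize()
  if pageSizeRes.isErr():
    # a just-created database may not report a size yet; skip this pass
    # instead of failing the whole archive mount
    return ok()

  let pageSize: int64 = int64(pageSizeRes.get() div 1024)
  if pageSize == 0:
    # empty database, nothing to delete yet
    return ok()

  # ... existing pruning logic would continue here ...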

In order to allow the node to start in the meantime, switch to either the "time" or "capacity" retention policy, e.g.:

capacity:20000000
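In the node configuration that would look roughly like the following (assuming the policy is passed via the store-message-retention-policy option; verify the exact flag name and value format against the deployed version's --help):

# Workaround sketch, flag name assumed: use a capacity- or time-based
# policy instead of the size-based one until the bug is fixed.
--store-message-retention-policy=capacity:20000000
# or, for example, a time-based policy (value assumed to be in seconds):
# --store-message-retention-policy=time:2592000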

jakubgs (Contributor, Author) commented Nov 23, 2023

Good catch, thanks, will try this in a bit.

jakubgs (Contributor, Author) commented Nov 23, 2023

Indeed, I can confirm that switching to a retention policy not based on size fixes the issue at startup:

[email protected]:~ % d
CONTAINER ID   NAMES      IMAGE                              CREATED         STATUS
3d6868ff0701   nim-waku   wakuorg/nwaku:deploy-status-test   2 minutes ago   Up 2 minutes (healthy)

jakubgs added a commit to status-im/infra-status-legacy that referenced this issue Nov 23, 2023
It's currently broken:
waku-org/nwaku#2242

Signed-off-by: Jakub Sokołowski <[email protected]>
ABresting self-assigned this Nov 23, 2023