EDIT: When deployed on August 24, 2022, the PR reduced peak RAM use by over 200 GB (out of a total reduction of over 300 GB). The initial estimate of -150 GB was based on an old checkpoint file; by August the checkpoint file had grown substantially, so the memory savings were better than estimated. Checkpoint duration is about 16 minutes today (Sep 7); it was 46-58 minutes in mid-August, and it was 11-17 hours in Dec 2021, depending on system load.
Problem
The recent increase in transactions is causing WAL files to be created more frequently, which makes checkpoints happen more frequently, increases checkpoint file size, and increases the ledger state size in memory. Together, these increases are causing checkpointing to consume too much RAM and take more than twice as long as it did earlier this year.
| Date | Checkpoint File Size | Checkpoint Frequency |
| --- | --- | --- |
| Early 2022 | 53 GB | 0-2 times per day |
| July 8, 2022 | 126 GB | every 2 hours |
Without PR #1944, checkpointing would currently be:

- taking well over 20-30 hours each time, making it impossible to complete every 2 hours
- requiring more operational RAM, making OOM crashes very frequent
- creating billions more allocations and GC pressure, consuming CPU cycles and slowing down the EN
After PR #1944 reduced the MTrie flattening and serialization phase to under 5 minutes (a phase that sometimes took 17 hours on mainnet16), creating a separate MTrie state now accounts for most of the duration and memory used by checkpointing. This opens up new possibilities, such as reusing the ledger state to significantly reduce checkpoint duration and operational RAM again.
Updates epic #1744

The Proposed Solution
We can avoid creating a separate MTrie state during checkpoint creation. This can reduce peak RAM use by (very roughly) about 150 GB and reduce checkpoint duration by 24 minutes (estimates based on a snapshot from July 8, 2022). Memory savings will increase over time.
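For illustration only, here is a minimal Go sketch (Go because flow-go is written in Go) of the difference between rebuilding a separate MTrie state from WAL segments and reusing the live one. Every name below (`Trie`, `Forest`, `serialize`, both checkpoint functions) is a hypothetical placeholder, not flow-go's actual API:

```go
// Sketch of the "reuse ledger state" idea. Because tries are immutable,
// it is safe to share them between the live forest and the checkpointer.
package main

import "fmt"

// Trie stands in for an immutable MTrie.
type Trie struct{ rootHash [32]byte }

// Forest stands in for the in-memory collection of tries (the ledger state).
type Forest struct{ tries []*Trie }

// checkpointByRebuilding models the old flow: replay WAL segments into a
// brand-new forest, roughly doubling the ledger state held in RAM.
func checkpointByRebuilding(segments [][]byte) {
	rebuilt := &Forest{}
	for range segments {
		// Replaying each segment allocates a fresh copy of the trie nodes.
		rebuilt.tries = append(rebuilt.tries, &Trie{})
	}
	serialize(rebuilt.tries)
}

// checkpointByReusing models the proposed flow: serialize references to
// the live forest's tries directly, so no second copy is ever built.
func checkpointByReusing(live *Forest) {
	serialize(live.tries)
}

func serialize(tries []*Trie) {
	fmt.Printf("serializing %d tries\n", len(tries))
}

func main() {
	live := &Forest{tries: []*Trie{{}, {}}}
	checkpointByReusing(live) // no rebuild, no duplicate allocations
}
```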
Determine if it's feasible to avoid creating a separate MTrie state during checkpoint creation. If the proof-of-concept doesn't reveal showstoppers, then proceed with a new PR.

- Proof-of-concept: [EN Performance] [POC] Reduce operational RAM by 152+ GB and checkpoint duration by 24 mins by reusing ledger state #2770 shows it is feasible to avoid creating a separate MTrie state.
- PR: [EN Performance] Reuse ledger state for about -200GB peak RAM, -160GB disk i/o, and about -32 minutes duration #2792 has fewer edge cases to handle than PR #2770 and looser coupling between layers, but may require extra locks.
Currently, WAL/checkpoints are disconnected from the MTrie progression: a checkpoint is essentially the state after a given complete WAL segment, and segment creation is an implementation detail of mForest and the WAL.
If, however, we were able to signal the moment a new WAL segment is created, we could stop evicting MTries (or keep a separate index) and use them to create the new checkpoint without reallocating memory.
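As a rough illustration of that signal, here is a hedged Go sketch of pinning the live tries at a segment boundary so eviction cannot free them mid-checkpoint. The names (`Forest`, `OnSegmentComplete`, `Unpin`) are assumptions for illustration, not flow-go's API; the mutex also hints at why the PR above "may require extra locks":

```go
// Hypothetical sketch of signaling a WAL segment boundary to the forest.
// When a segment is finalized, the forest pins references to its current
// tries so eviction cannot free them before the checkpointer serializes
// them. All names are illustrative, not flow-go's actual API.
package main

import "sync"

type Trie struct{ rootHash [32]byte }

type Forest struct {
	mu     sync.Mutex
	tries  []*Trie
	pinned []*Trie // tries retained for an in-progress checkpoint
}

// OnSegmentComplete is the "new WAL segment" signal: it snapshots
// references to the current tries instead of copying any nodes.
func (f *Forest) OnSegmentComplete() []*Trie {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.pinned = append([]*Trie(nil), f.tries...)
	return f.pinned
}

// Unpin releases the retained tries once the checkpoint file is
// written, letting normal eviction reclaim the memory again.
func (f *Forest) Unpin() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.pinned = nil
}

func main() {
	f := &Forest{tries: []*Trie{{}, {}}}
	toCheckpoint := f.OnSegmentComplete() // called on segment rotation
	_ = toCheckpoint                      // checkpointer serializes these
	f.Unpin()                             // after the checkpoint file is durable
}
```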
Title history:

- Jul 12, 2022: fxamacker changed the title from "[Execution State] Determine if it's feasible to avoid creating separate MTrie state during checkpoint creation" to "[Execution State] Avoid creating separate MTrie state during checkpoint creation to reduce peak RAM use by 152GB and checkpoint duration by 24 minutes"
- Aug 12, 2022: fxamacker changed the title to "[Execution State] Avoid creating separate MTrie state during checkpoint creation to reduce peak RAM use by ~150GB and checkpoint duration by 24 minutes"
- Sep 8, 2022: fxamacker changed the title to "[Execution State] Avoid creating separate MTrie state during checkpoint creation for about -330GB peak RAM use and -32 minutes duration"
- Sep 8, 2022: fxamacker changed the title to "[Execution State] Avoid creating separate MTrie state during checkpoint creation for about -200GB peak RAM use and -32 minutes duration"