Historically, DataShards could exchange large readsets (containing actual read results), so it made sense for a DataShard not to keep them in memory after sending and to re-read them on every resend attempt. However, modern "generic" readsets are usually very small (3 bytes), and it doesn't make sense to re-read them on every disconnect. We should just store these small readsets in memory and avoid unnecessary transactions.
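A minimal sketch of the in-memory approach, assuming readsets are keyed by a sequence number and bodies are small strings (the class and method names here are illustrative, not YDB's actual API):

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache of small readset bodies: once a body is produced or
// read, keep it in memory so a pipe disconnect does not require a new
// local-DB transaction to re-read it before resending.
class TReadSetCache {
public:
    // Remember a readset body after it has been read or produced.
    void Put(uint64_t seqNo, std::string body) {
        Cache_[seqNo] = std::move(body);
    }

    // On resend, serve the body from memory; nullopt means the caller
    // must fall back to re-reading the readset from the local database.
    std::optional<std::string> Get(uint64_t seqNo) const {
        auto it = Cache_.find(seqNo);
        if (it == Cache_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Drop the body once the target shard has acknowledged delivery.
    void Erase(uint64_t seqNo) {
        Cache_.erase(seqNo);
    }

private:
    std::unordered_map<uint64_t, std::string> Cache_;
};
```

Since modern readset bodies are only a few bytes, the memory overhead of keeping them until acknowledgement is negligible compared to the cost of a re-read transaction per disconnect.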
Additionally, we don't use column families for OutReadSets, so readset data is stored alongside all other columns. This means we read all the data at init time anyway. We should probably either move the data column into a separate column family, or keep the data in memory until these readsets are sent for the first time.
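The second option above could look roughly like this: the body is already loaded at init time, so hold on to it only until the first successful send, then release it and fall back to DB re-reads as before. This is a hedged sketch; the struct and field names are hypothetical:

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

// Hypothetical per-readset state: Body is populated when OutReadSets is
// scanned at init, and freed after the first send so large historical
// readsets don't stay resident.
struct TOutReadSetState {
    uint64_t SeqNo = 0;
    std::optional<std::string> Body;  // loaded at init, freed on first send

    // Returns the body for the first send and releases the cached copy;
    // subsequent calls return nullopt, signalling that a resend must
    // re-read the body from the local database.
    std::optional<std::string> TakeBodyForFirstSend() {
        std::optional<std::string> body = std::move(Body);
        Body.reset();
        return body;
    }
};
```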
Next, the current progress queue may be quadratic, since we don't clear the sent-readsets hashset when adding its entries to the progress queue. This means that in the unlikely case where the pipe fails again before the queue is drained, we could re-add the same readsets multiple times unnecessarily. This should be an easy fix too.
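The fix can be sketched as follows: on a pipe failure, move the unacknowledged readsets into the progress queue and clear the sent set in the same step, so a second failure before the queue drains has nothing to re-add. Names here are illustrative, not the actual DataShard members:

```cpp
#include <cstdint>
#include <deque>
#include <unordered_set>

// Hypothetical sketch of the out-readset bookkeeping.
struct TOutReadSetQueue {
    std::deque<uint64_t> Progress;      // readsets waiting to be (re)sent
    std::unordered_set<uint64_t> Sent;  // sent but not yet acknowledged

    void OnSend(uint64_t seqNo) {
        Sent.insert(seqNo);
    }

    void OnAck(uint64_t seqNo) {
        Sent.erase(seqNo);
    }

    // Pipe to the target shard failed: everything sent but unacked must
    // be resent. Clearing Sent here is the fix: repeated failures before
    // the queue drains can no longer enqueue the same readset twice.
    void OnPipeFailure() {
        for (uint64_t seqNo : Sent) {
            Progress.push_back(seqNo);
        }
        Sent.clear();
    }
};
```

Without the `Sent.clear()`, each additional pipe failure would append the same unacked readsets again, which is where the quadratic growth comes from.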