This repository has been archived by the owner on Apr 4, 2023. It is now read-only.
Grenad is an append-only key-value store, which means that it can only write key-value pairs after sorting the keys and merging the conflicting entries. The resulting sorted sets of key-value pairs are also called sorted-string tables (SSTables).
The library also supports sorters: a data structure that lets the user insert keys in any order. Grenad sorts the key-value pairs in memory, and the user can define the maximum amount of memory to use. Once a sorter reaches that limit, it sorts and merges its entries in memory, then dumps them to disk.
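To make the sorter behaviour concrete, here is a minimal sketch of a memory-bounded sorter. It is not grenad's API: `SketchSorter`, `dump_chunk`, `finish` and the `concat` merge function are hypothetical names, and an in-memory `Vec` stands in for the chunks written to disk.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

type Pair = (Vec<u8>, Vec<u8>);

/// Hypothetical, simplified stand-in for a sorter (NOT grenad's real API).
struct SketchSorter {
    max_memory: usize,
    used: usize,
    entries: Vec<Pair>,
    /// Each chunk is sorted and merged; a Vec stands in for a file on disk.
    chunks: Vec<Vec<Pair>>,
    merge: fn(&[u8], &[u8]) -> Vec<u8>,
}

impl SketchSorter {
    fn new(max_memory: usize, merge: fn(&[u8], &[u8]) -> Vec<u8>) -> Self {
        SketchSorter { max_memory, used: 0, entries: Vec::new(), chunks: Vec::new(), merge }
    }

    /// Accepts keys in any order; dumps a chunk when the budget would be exceeded.
    fn insert(&mut self, key: &[u8], value: &[u8]) {
        let size = key.len() + value.len();
        if !self.entries.is_empty() && self.used + size > self.max_memory {
            self.dump_chunk();
        }
        self.entries.push((key.to_vec(), value.to_vec()));
        self.used += size;
    }

    /// Sorts the buffered entries by key, merges conflicting keys, stores the chunk.
    fn dump_chunk(&mut self) {
        let merge = self.merge;
        let mut sorted: BTreeMap<Vec<u8>, Vec<u8>> = BTreeMap::new();
        for (key, value) in std::mem::take(&mut self.entries) {
            match sorted.entry(key) {
                Entry::Occupied(mut slot) => {
                    let merged = merge(slot.get(), &value);
                    slot.insert(merged);
                }
                Entry::Vacant(slot) => {
                    slot.insert(value);
                }
            }
        }
        self.chunks.push(sorted.into_iter().collect());
        self.used = 0;
    }

    /// Dumps whatever is left and returns every chunk.
    fn finish(mut self) -> Vec<Vec<Pair>> {
        if !self.entries.is_empty() {
            self.dump_chunk();
        }
        self.chunks
    }
}

/// Example merge function: concatenates the conflicting values.
fn concat(a: &[u8], b: &[u8]) -> Vec<u8> {
    [a, b].concat()
}
```

With an 8-byte budget and 2-byte entries, the fifth insert forces a dump of the first four. With values larger than the whole budget, every insert dumps a chunk containing a single entry.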
The grenad library, and more specifically the sorter data structure, is used in the indexing system. Keys and values are given to a sorter, and each sorter is limited to the max memory specified at program startup, divided by the number of threads dedicated to the indexing task, and then divided by the number of sorters in each of those threads.
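The division described above can be written down as a small helper. `per_sorter_budget` is a hypothetical name, and the numbers in the usage note are illustrative assumptions, not values taken from the indexer.

```rust
/// Hypothetical helper: memory budget of one sorter, following the
/// "max memory / indexing threads / sorters per thread" rule.
/// Zero counts are treated as one to avoid dividing by zero.
fn per_sorter_budget(max_memory: usize, indexing_threads: usize, sorters_per_thread: usize) -> usize {
    max_memory / indexing_threads.max(1) / sorters_per_thread.max(1)
}
```

For example, assuming 2 GiB of max memory, 8 indexing threads and 16 sorters per thread, each sorter is left with only 16 MiB.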
This becomes a problem: sometimes the sorter data structure "blocks" the indexing system because it dumps a single (1) entry to disk at a time. This happens because we use RoaringBitmap values and the max memory of the sorter can hold only one such value in memory. A RoaringBitmap sometimes needs multiple containers to represent its u32s, which takes a lot of space, so a single serialized value is enough to trigger the chunk dump to disk.
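To get a feeling for why scattered u32s are expensive, here is a rough size estimate. It assumes the usual Roaring layout (one container per distinct high 16 bits, array containers of 2 bytes per value, bitmap containers capped at 8 KiB) and roughly 8 bytes of header plus 8 bytes of bookkeeping per container. Run containers are ignored, and `estimated_serialized_size` is a hypothetical function; for real numbers, measure with the roaring crate (for example `RoaringBitmap::serialized_size`).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Rough, hypothetical estimate of a serialized Roaring bitmap size in bytes.
/// Assumption: no run containers, 8 header bytes, 8 bookkeeping bytes per container.
fn estimated_serialized_size(values: &[u32]) -> usize {
    // Group the low 16 bits of each value under its high 16 bits (one group = one container).
    let mut containers: BTreeMap<u16, BTreeSet<u16>> = BTreeMap::new();
    for &v in values {
        containers.entry((v >> 16) as u16).or_default().insert(v as u16);
    }
    let header = 8;
    let per_container_overhead = 8;
    // Array container: 2 bytes per value; bitmap container: 8192 bytes at most.
    let data: usize = containers
        .values()
        .map(|lows| (2 * lows.len()).min(8192))
        .sum();
    header + per_container_overhead * containers.len() + data
}
```

Under these assumptions, 1000 values packed into one container are estimated at about 2 KB, while 1000 values that each land in a different container are estimated at about 10 KB, so the same number of u32s can cost several times more space.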
There are probably multiple ways to fix that:
Increase the amount of memory allocated to the sorters by either:
Reducing the number of threads used for indexing, which gives more memory to the individual sorters of each thread.
Increasing the max memory allocated to the program itself.
Share the same allocator between all of the threads and grenad sorters. Combined with an allocator wrapper, it would help balance memory usage.
We could also use something like a memory-mapped Vec, like zkp-mmap-vec but with resize features.
We could also move the data that is too big into another file. It would work well with an enum that specifies where the data is located, with oversized data moved into another place (like a file).
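The last idea could look like the sketch below. Everything here is hypothetical (`ValueLocation`, `SpillStore`, the `threshold` parameter), nothing is existing grenad API. The store is generic over `Read + Write + Seek` so it works with a `std::fs::File` as well as an in-memory `std::io::Cursor`.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Hypothetical enum that specifies where a value is located.
enum ValueLocation {
    /// Small value, kept in the sorter's memory.
    Inline(Vec<u8>),
    /// Oversized value, moved to a side file.
    Spilled { offset: u64, len: u64 },
}

/// Hypothetical side store for oversized values.
struct SpillStore<F> {
    file: F,
    threshold: usize,
}

impl<F: Read + Write + Seek> SpillStore<F> {
    fn new(file: F, threshold: usize) -> Self {
        SpillStore { file, threshold }
    }

    /// Keeps small values inline, appends oversized ones to the side file.
    fn store(&mut self, value: &[u8]) -> io::Result<ValueLocation> {
        if value.len() <= self.threshold {
            return Ok(ValueLocation::Inline(value.to_vec()));
        }
        let offset = self.file.seek(SeekFrom::End(0))?;
        self.file.write_all(value)?;
        Ok(ValueLocation::Spilled { offset, len: value.len() as u64 })
    }

    /// Reads a value back, wherever it is located.
    fn load(&mut self, location: &ValueLocation) -> io::Result<Vec<u8>> {
        match location {
            ValueLocation::Inline(bytes) => Ok(bytes.clone()),
            ValueLocation::Spilled { offset, len } => {
                self.file.seek(SeekFrom::Start(*offset))?;
                let mut buffer = vec![0u8; *len as usize];
                self.file.read_exact(&mut buffer)?;
                Ok(buffer)
            }
        }
    }
}
```

In this sketch the sorter would only account for the small `Spilled { offset, len }` marker in its memory budget, so one huge serialized bitmap would no longer force a dump on its own.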