In the S3 scenario, combining opendal's AsyncRead and AsyncSeek abstractions with object_store::layers::LruCacheLayer results in the LRU cache storing data from the seek point to the end of the file almost every time a seek + read operation is performed.
Improvement Plan:
Puffin and Index should stop using AsyncRead + AsyncSeek as input sources. Instead, they should use a RangeReader that takes a Range<u64> as its read parameter. That maps directly onto opendal, reads exactly the requested range from the object store, and avoids caching unnecessary content.
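As a rough illustration of the plan above, here is a minimal synchronous sketch. The trait shape, `CountingStore`, and `read_from_seek` are all hypothetical names made up for this example; they are not the actual GreptimeDB or opendal API. The real reader would presumably be async, and the in-memory store only stands in for S3 so that the number of fetched bytes can be compared.

```rust
use std::io::{Error, ErrorKind, Result};
use std::ops::Range;

/// Hypothetical trait: the caller states exactly which bytes it needs.
pub trait RangeReader {
    fn read(&mut self, range: Range<u64>) -> Result<Vec<u8>>;
}

/// In-memory stand-in for an object store (assumption, not a real backend).
/// It counts how many bytes it had to serve, which approximates what a
/// cache layer sitting in front of it would end up storing.
pub struct CountingStore {
    data: Vec<u8>,
    pub bytes_fetched: u64,
}

impl CountingStore {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data, bytes_fetched: 0 }
    }

    /// Models the seek + read pattern described in this issue: the request
    /// is open-ended, so everything from `offset` to the end of the file is
    /// fetched even though the caller only wants `len` bytes.
    pub fn read_from_seek(&mut self, offset: u64, len: usize) -> Vec<u8> {
        let start = (offset as usize).min(self.data.len());
        let tail = &self.data[start..];
        self.bytes_fetched += tail.len() as u64;
        tail[..len.min(tail.len())].to_vec()
    }
}

impl RangeReader for CountingStore {
    /// Fetches only the bytes inside `range`; rejects out-of-bounds ranges.
    fn read(&mut self, range: Range<u64>) -> Result<Vec<u8>> {
        let file_len = self.data.len() as u64;
        if range.start > range.end || range.end > file_len {
            return Err(Error::new(ErrorKind::InvalidInput, "range out of bounds"));
        }
        let slice = &self.data[range.start as usize..range.end as usize];
        self.bytes_fetched += slice.len() as u64;
        Ok(slice.to_vec())
    }
}
```

With a 20480-byte file, `read_from_seek(4, 16)` touches 20476 bytes while `read(4..20)` touches 16, which is the difference the plan aims for.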
What type of bug is this?
Other
What subsystems are affected?
Datanode
Minimal reproduce step
What did you expect to see?
no duplicate ranges in cached files
What did you see instead?
Too many duplicated ranges in the cache, such as 0 ~ 20480 and 4 ~ 20480.
2 GB of data files consumes ~60 GB of disk cache.
What operating system did you use?
ArchLinux AMD64
What version of GreptimeDB did you use?
nightly
Relevant log output and stack trace
No response