Fetch everything during a read in one go. #9

Open
whilo opened this issue Feb 23, 2023 · 4 comments

Comments

Member

whilo commented Feb 23, 2023

At the moment, the header, metadata, and value are fetched sequentially as each is needed. They could instead be fetched in a single request to reduce latency, by inspecting the :operation in the env, e.g. https://github.com/replikativ/konserve/blob/e2c1cb45708006a62a1df2133261620a2b70c3c8/src/konserve/impl/defaults.cljc#L400.

Member Author

whilo commented Feb 26, 2023

konserve-s3 already fetches everything at once (https://github.com/replikativ/konserve-s3/blob/1d8a512f93765739557c5412788161a0f323ad58/src/konserve_s3/core.clj#L161), but there it would also be reasonable to inspect :operation and pick what to fetch. It could also issue a range request so that, when only metadata is needed, only the first megabyte or so of a very large blob is fetched.
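
A minimal sketch of the range-request idea, in Python for illustration only (konserve-s3 itself is Clojure). The function name `blob_request_headers` and the one-megabyte default are assumptions, not part of konserve-s3; the only thing relied on is the standard HTTP `Range` header, which S3 object reads accept.

```python
def blob_request_headers(metadata_only, prefix_bytes=1024 * 1024):
    """Build HTTP headers for a blob GET (hypothetical helper).

    When only metadata is needed, ask for just the first `prefix_bytes`
    bytes instead of the whole object.
    """
    if metadata_only:
        # HTTP byte ranges are inclusive, so the first N bytes are 0..N-1.
        return {"Range": f"bytes=0-{prefix_bytes - 1}"}
    # No Range header: the server returns the full object.
    return {}
```

This only works if header and metadata are stored at the start of the blob and fit within the requested prefix, which is an assumption here.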

Collaborator

alekcz commented Apr 3, 2023

In JDBC we only pull the required column: header, meta, and data are all separate columns, so data isn't fetched unnecessarily. @whilo, are you comfortable with me closing this?

Member Author

whilo commented Apr 4, 2023

Yes, we don't fetch redundant data, but by the time we fetch the header we already know from :operation which columns we need, e.g. header and metadata for https://github.com/replikativ/konserve/blob/main/src/konserve/impl/defaults.cljc#L333. Making multiple round trips to the SQL server increases latency accordingly (3x at the moment). In konserve-s3 I decided to always fetch everything, because that is fine for Datahike (except for GC, which only reads metadata): https://github.com/replikativ/konserve-s3/blob/main/src/konserve_s3/core.clj#L170. Ideally I should dispatch on :operation there as well.
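
To illustrate the round-trip argument, here is a small sketch using Python's built-in `sqlite3` as a stand-in for the SQL server. The table name `store`, the key column `id`, and both function names are hypothetical; only the three column names come from the discussion above. An in-memory SQLite database has no network latency, so this shows only the difference in statement count, not a timing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE store (id TEXT PRIMARY KEY, header BLOB, meta BLOB, data BLOB)"
)
conn.execute("INSERT INTO store VALUES (?, ?, ?, ?)", ("k1", b"h", b"m", b"v"))


def fetch_sequential(conn, key):
    """One statement per column: three round trips on a networked database."""
    parts = []
    for column in ("header", "meta", "data"):
        row = conn.execute(
            f"SELECT {column} FROM store WHERE id = ?", (key,)
        ).fetchone()
        parts.append(row[0])
    return tuple(parts)


def fetch_at_once(conn, key):
    """One statement selecting all three columns: a single round trip."""
    row = conn.execute(
        "SELECT header, meta, data FROM store WHERE id = ?", (key,)
    ).fetchone()
    return tuple(row)
```

Both functions return the same tuple; the second issues one statement instead of three.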

Member Author

whilo commented Apr 6, 2023

It is enough to distinguish the :operation :read-meta, which does not need the value (so fetch header and metadata at once); all other operations need header, metadata, and value.
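
The rule above can be sketched as a two-way dispatch. This is Python for illustration, not konserve code: `columns_for`, `select_statement`, the table name `store`, and the key column `id` are all hypothetical, and the operation is passed as a plain string standing in for the Clojure keyword.

```python
def columns_for(operation):
    """Pick the columns a read needs, based on the operation (hypothetical helper)."""
    if operation == "read-meta":
        # Metadata reads never touch the value, so skip the data column.
        return ("header", "meta")
    # Every other operation needs header, metadata, and value.
    return ("header", "meta", "data")


def select_statement(operation, table="store"):
    """Build a single SELECT covering everything the operation needs."""
    columns = ", ".join(columns_for(operation))
    return f"SELECT {columns} FROM {table} WHERE id = ?"
```

With this shape, each read issues exactly one statement regardless of the operation, and only :read-meta avoids transferring the value.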

2 participants