storage: introduce CasManager to support chunk dedup at runtime #1626
base: master
Commits on Sep 27, 2024
storage: add helper copy_file_range
Add a copy_file_range() helper which:
- avoids copying data into userspace
- may support reflink on filesystems such as XFS

Signed-off-by: Jiang Liu <[email protected]>
Commit beb5cfc
storage: improve copy_file_range
- improve copy_file_range when the target OS is not Linux
- add more comprehensive tests

Signed-off-by: Yadong Ding <[email protected]>
Commit 38b5708
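The non-Linux path mentioned in the commit above could look roughly like the following. This is a minimal sketch, not the code from this PR: `copy_range_fallback` is a hypothetical name, and it assumes a Unix target where `std::os::unix::fs::FileExt` is available. Unlike the real `copy_file_range()` syscall, it moves data through a userspace buffer.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::FileExt;

/// Hypothetical fallback: copy up to `len` bytes from `src` at `src_off`
/// to `dst` at `dst_off` using positional reads and writes.
/// Returns the number of bytes copied, which is short if `src` hits EOF.
fn copy_range_fallback(
    src: &File,
    src_off: u64,
    dst: &File,
    dst_off: u64,
    len: usize,
) -> io::Result<usize> {
    let mut buf = [0u8; 8192];
    let mut copied = 0usize;
    while copied < len {
        let want = buf.len().min(len - copied);
        let n = match src.read_at(&mut buf[..want], src_off + copied as u64) {
            Ok(n) => n,
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if n == 0 {
            break; // EOF on the source file
        }
        dst.write_all_at(&buf[..n], dst_off + copied as u64)?;
        copied += n;
    }
    Ok(copied)
}
```

Positional I/O leaves both file cursors untouched, which matters when several chunks are copied into the same blob cache file concurrently.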
storage: implement CasManager to support chunk dedup at runtime
Implement CasManager to support chunk dedup at runtime. The manager provides two major interfaces:
- add chunk data to the CAS database
- check whether a chunk exists in the CAS database and, if it does, copy it to the blob file with copy_file_range()

Signed-off-by: Jiang Liu <[email protected]>
Commit 57985b8
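As a rough illustration of the two interfaces described in the commit above, here is a sketch of a chunk index. Everything in it is an assumption made for illustration: `CasIndex`, `ChunkLocation`, `add_chunk`, and `lookup` are hypothetical names, and an in-memory `HashMap` stands in for the CAS database. The copy into the blob file is left out.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Where a chunk lives: which blob file, and at what offset.
#[derive(Clone, Debug, PartialEq)]
struct ChunkLocation {
    blob: PathBuf,
    offset: u64,
}

/// Hypothetical in-memory stand-in for the CAS database.
#[derive(Default)]
struct CasIndex {
    chunks: HashMap<String, ChunkLocation>,
}

impl CasIndex {
    /// Record a chunk. The first recorded location is kept
    /// if the same chunk id is added again.
    fn add_chunk(&mut self, chunk_id: &str, blob: &Path, offset: u64) {
        self.chunks
            .entry(chunk_id.to_string())
            .or_insert_with(|| ChunkLocation {
                blob: blob.to_path_buf(),
                offset,
            });
    }

    /// Check whether a chunk is known and return its location if so.
    fn lookup(&self, chunk_id: &str) -> Option<&ChunkLocation> {
        self.chunks.get(chunk_id)
    }
}
```

A caller that gets `Some(location)` from `lookup` would then copy the chunk bytes from `location.blob` at `location.offset` into its own blob file.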
storage: add garbage collection in CasMgr
- Changed the `delete_blobs` method in `CasDb` to take an immutable reference (`&self`) instead of a mutable reference (`&mut self`).
- Updated the `dedup_chunk` method in `CasMgr` to correctly handle the deletion of non-existent blob files from both the file descriptor cache and the database.
- Implemented the `gc` (garbage collection) method in `CasMgr` to identify and remove blobs that no longer exist on the filesystem, ensuring the database and cache remain consistent.

Signed-off-by: Yadong Ding <[email protected]>
Commit f737a3c
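The garbage collection idea from the commit above can be sketched as follows. This is not the `CasMgr::gc` implementation: `BlobTable` and its methods are hypothetical, and a `Mutex<HashMap>` stands in for the database so that `gc` can take `&self`, mirroring the move from `&mut self` to `&self` described in the commit message.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Mutex;

/// Hypothetical table mapping blob ids to blob file paths.
/// Interior mutability lets every method take `&self`.
#[derive(Default)]
struct BlobTable {
    blobs: Mutex<HashMap<String, PathBuf>>,
}

impl BlobTable {
    fn add(&self, blob_id: &str, path: PathBuf) {
        self.blobs.lock().unwrap().insert(blob_id.to_string(), path);
    }

    fn contains(&self, blob_id: &str) -> bool {
        self.blobs.lock().unwrap().contains_key(blob_id)
    }

    /// Drop every entry whose blob file is gone from the filesystem.
    /// Returns the number of entries removed.
    fn gc(&self) -> usize {
        let mut blobs = self.blobs.lock().unwrap();
        let before = blobs.len();
        blobs.retain(|_, path| path.exists());
        before - blobs.len()
    }
}
```

In a fuller version, the same pass would also evict the matching entries from the file descriptor cache, so that the cache and the database stay in step.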
storage: enable chunk deduplication for file cache
Enable chunk deduplication for file cache. It works as follows:
- When a chunk is not yet in the blob cache file, query the CAS database to check whether other blob data files contain the required chunk. If a duplicate chunk exists in another data file, copy the chunk data into the current blob cache file using copy_file_range().
- After downloading a data chunk from the remote, save its file/offset/chunk-id into the CAS database so it can be reused later.

Co-authored-by: Jiang Liu <[email protected]>
Co-authored-by: Yading Ding <[email protected]>
Signed-off-by: Yadong Ding <[email protected]>
Commit c2e5bfa
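The lookup order described in the commit above can be sketched as a small decision function. The names `ChunkSource`, `resolve`, `record_download`, and `CasMap` are hypothetical, and plain collections stand in for the blob cache state and the CAS database. The actual data movement (local copy or remote download) is omitted.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical CAS mapping: chunk id -> (blob file, offset).
type CasMap = HashMap<String, (String, u64)>;

/// Where the data for a chunk should come from.
#[derive(Debug, PartialEq)]
enum ChunkSource {
    /// Already present in the current blob cache file.
    LocalCache,
    /// Present in another blob file: copy it locally.
    OtherBlob { blob: String, offset: u64 },
    /// Not available locally: download from the remote.
    Remote,
}

/// Decide how to obtain a chunk: cache first, then CAS, then remote.
fn resolve(chunk_id: &str, cached: &HashSet<String>, cas: &CasMap) -> ChunkSource {
    if cached.contains(chunk_id) {
        return ChunkSource::LocalCache;
    }
    match cas.get(chunk_id) {
        Some((blob, offset)) => ChunkSource::OtherBlob {
            blob: blob.clone(),
            offset: *offset,
        },
        None => ChunkSource::Remote,
    }
}

/// After a remote download, record the chunk so later lookups can reuse it.
fn record_download(cas: &mut CasMap, chunk_id: &str, blob: &str, offset: u64) {
    cas.insert(chunk_id.to_string(), (blob.to_string(), offset));
}
```

Once a chunk has been downloaded and recorded, a later request for the same chunk id from a different blob resolves to `OtherBlob` instead of `Remote`, which is what saves the network round trip.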
docs: add documentation for cas
Add documentation for CAS.

Signed-off-by: Jiang Liu <[email protected]>
Commit a938849
Commits on Oct 1, 2024
smoke: add smoking test for cas and chunk dedup
Add a smoke test case for CAS and chunk dedup.

Signed-off-by: Yadong Ding <[email protected]>
Commit a9b8fe4