Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repository - going away from transactions, log, refcounts? #7377

Closed
ThomasWaldmann opened this issue Feb 24, 2023 · 7 comments · Fixed by #8332
Closed

repository - going away from transactions, log, refcounts? #7377

ThomasWaldmann opened this issue Feb 24, 2023 · 7 comments · Fixed by #8332
Assignees
Milestone

Comments

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Feb 24, 2023

current state (current master branch, borg 1.x, borg 0.x, attic)

A borg repository is primarily a key/value store (with some aux functions).

The key is the chunk id (== MAC(plaintext)), the value is the compressed/encrypted/authenticated data.

borg uses transactions and a LOG when writing to the repo:

  • start of transaction (usually triggered by PUT/DEL)
  • writes more objects by appending PUT entries to the log
  • deletes objects by appending DEL entries to the log
  • commits (appends a COMMIT entry to the log)
  • end of transaction (S: saves repo index and hints, C: saves chunks index and files cache)

LOG means that new stuff is always appended at the end of the last/current segment file. In general, old segment files are never modified in place.

borg compact defrags non-compact segment files:

  • a segment file contains PUTs, DELs, COMMITs
  • if a PUT(id) is later deleted by a DEL(id), it creates a logical hole in a segment file (that object is not used any more), making it non-compact
  • compaction / defragging works by reading all still-needed objects from an old segment file and appending them to a new segment file. after that is finished, the old segment file is deleted (and that frees disk space because the new segment file is smaller).

advantages of this approach

  • transactions and append-only log are a very safe approaches (even if stuff crashes it usually can roll back to previous state and be fine again)
  • segment files are medium size files: not too large, not too small, not too many
    • works well even with not very scalable filesystems
    • has little overhead due to fs block / cluster size
    • can be copied or deleted rather quickly (not many fs objects)

disadvantages of this approach

  • borg compact can cause lots of I/O when shuffling objects from old non-compact segments to new compact segments
  • borg compact needs some space on the fs to be able to work. bad if your fs is 100% full...
  • compaction code is rather complex, same for transaction management
  • to quickly access objects, the repository needs an index mapping id -> (segment, offset, flags)
  • borg currently loads the repo index (hashtable) into memory. RAM usage is about 44b * object_count + free space in hashtable. if you have a lot of files and/or a lot of data volume, repo index can need GBs of RAM.
  • to implement this, some special borg code is needed with access to the repo filesystem
  • hard to work like this without locking the repository against concurrent access.
@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented Feb 24, 2023

alternative implementation

We'll need a key/value store, of course.
The aux functions will need checking (e.g. the nonce reservation stuff won't be needed for borg2 repos).

But maybe we could go away from transactions and LOG-like behaviour and rather use garbage collection (gc) to clean up anything that is not referenced (either because it is not used any more, because all references got deleted or because it was never used, because something crashed before the action was completed).

We could delegate a lot of stuff borg used to do with own code to the filesystem:

  • the key would directly map to a fs path (e.g. data/01/0123/0123456789abcdef...)
  • object exists/read/write/delete would map to fs operations on the respective file
  • besides "data", there could be other namespaces (e.g. archives, manifest, keys, config, ...)

For speed reasons, the borg client could still have a chunks index, e.g. id -> (refcount, size) (maybe something simpler without precise refcount could be sufficient also for a lot of operations).

Why/how we could live without transactions

borg create

  • borg create would write file content chunks, then it references these chunks from an item (which is written to a metadata stream), the metadata stream is referenced by writing an archive item, the archive item will be referenced after writing a manifest entry. The manifest entry is the root of the tree of references - until it is written none of the other references can be discovered / will be considered by borg check.
  • if anything crashes before writing the manifest entry, we'll just have a lot of objects with too high reference counts:
    • some new chunks (which did not exist in the repo before). borg gc would recompute the correct refcount and delete them as it finds refcount==0.
    • some old chunks would have a too high refcount (but are not orphans because older backups reference them) - borg gc would recompute the correct refcount, but not delete them as refcount>0.
    • in general, too high refcounts are not a critical issue, they just might waste some space until the next borg check.

borg delete / borg prune

  • deleting an archive could just kill its manifest entry and leave everything else to the next borg check to clean up
  • super easy and fast, but space would only be freed by next borg gc

borg gc (maybe as part of borg check?)

  • needs to lock the repo against any concurrent access
  • needs to establish correct sets of all existing objects in the repo and all used/referenced objects. then compute all unreferenced objects.
  • needs to run client-side as archives (and other data structures containing references) are encrypted
  • removes all unreferenced objects (deletes files, cleans up empty dirs), frees repo space

advantages

  • simpler borg code
  • no "repo index" needed to find an object
  • no segment files, thus no need for compact_segments (borg compact).
  • easier to implement concurrent operations in the repo (some stuff like borg check / borg gc must still lock the repo. reference counting also hard to parallelise.)

disadvantages

  • needs a well scalable fs as it would contain magnitudes more objects than the backup source filesystems (due to number of backups, due to chunking)
  • likely higher space usage for the repo due to fs overhead (fs block / chunk size)
  • copying or deleting a repo would take rather long (lots of fs objects)
  • likely we would still need some clientside hashtables for good speed (to avoid network roundtrips, to avoid fs access)
  • borg check would be slower (more fs operations, smaller files)

@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented Feb 24, 2023

refcounts discussion moved to #7379.

@RonnyPfannschmidt
Copy link
Contributor

Extra consideration could be done for something like lmdb or git style packs

@ThomasWaldmann ThomasWaldmann changed the title repository (and indexes/caches) - going away from transactions and log? repository - going away from transactions, log, refcounts? Feb 24, 2023
@ThomasWaldmann ThomasWaldmann self-assigned this Feb 24, 2023
@jorickert
Copy link
Contributor

I think duplicacy uses this file system based approach. They managed to implement the GC part to be lock free too, relying on atomic properties of the used file system

@ThomasWaldmann ThomasWaldmann removed their assignment Jun 8, 2023
@ThomasWaldmann ThomasWaldmann added this to the ng1 milestone Sep 13, 2023
@DiagonalArg
Copy link

DiagonalArg commented Nov 8, 2023

I think duplicacy uses this file system based approach. They managed to implement the GC part to be lock free too, relying on atomic properties of the used file system

Indeed. What's missing from duplicacy is the capacity to fuse-mount backups. There have been a couple of PR's but apparently their performance is poor. It's the only reason I haven't been using it.

@ThomasWaldmann ThomasWaldmann self-assigned this Aug 13, 2024
@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented Aug 13, 2024

Updates:

In my experimental branch, I implemented a new Repository class:

  • working with borgstore
  • without repo index (each chunk is a separate store object)
  • without transactions
  • without log-like behaviour
  • as there are no segment files, no compact_segments is needed
  • borg compact is a garbage collector now, deleting unused objects from the store
  • the single "manifest-with-list-of-archives" chunk of borg 1.x was split into:
    • a config/manifest store object with some metadata
    • an archives/* store namespace with 1 store object per archive

@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented Aug 22, 2024

Hmm, thinking about whether we still need checkpoint archives...

borg 1.x rolled back the repo if there was no commit, thus we saved/committed checkpoint archives to save progress in case borg create gets interrupted by having the checkpoint archive reference chunks and having an intermediate COMMIT in the repo, so that rollback does not roll back all progress.

borg2 (as in #8332) does not do repo transactions any more and only "cleans up" if the user is invoking borg compact (which ideally is NOT after each borg create, but less frequently), thus we don't need checkpoint archives any more?

removing archive checkpointing would make the code quite a bit simpler.

update: removed checkpoint archives and part files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants
@RonnyPfannschmidt @ThomasWaldmann @DiagonalArg @jorickert and others