* Add changelog entry for validating exclude patterns
* Update 030_preparing_a_new_repo.rst
* prune: Fix crash on empty snapshot
* prune: Don't print stack trace if snapshot can't be loaded
* Update github.com/minio/minio-go/v7 to v7.0.27. This version adds support for Cloudflare R2, as discussed in restic#3757.
* Update gopkg.in/yaml. This fixes a panic on invalid input, but I think we aren't affected.
* internal/restic: Custom ID.MarshalJSON. This skips an allocation. internal/archiver benchmarks, Linux/amd64:

      name                     old time/op    new time/op    delta
      ArchiverSaveFileSmall-8  3.94ms ± 6%    3.91ms ± 6%    ~        (p=0.947 n=20+20)
      ArchiverSaveFileLarge-8  304ms  ± 3%    301ms  ± 4%    ~        (p=0.265 n=18+18)

      name                     old speed      new speed      delta
      ArchiverSaveFileSmall-8  1.04MB/s ± 6%  1.05MB/s ± 6%  ~        (p=0.803 n=20+20)
      ArchiverSaveFileLarge-8  142MB/s  ± 3%  143MB/s  ± 4%  ~        (p=0.421 n=18+19)

      name                     old alloc/op   new alloc/op   delta
      ArchiverSaveFileSmall-8  17.9MB ± 0%    17.9MB ± 0%    -0.01%   (p=0.000 n=19+19)
      ArchiverSaveFileLarge-8  382MB  ± 2%    382MB  ± 1%    ~        (p=0.687 n=20+19)

      name                     old allocs/op  new allocs/op  delta
      ArchiverSaveFileSmall-8  540   ± 1%     528   ± 0%     -2.19%   (p=0.000 n=19+19)
      ArchiverSaveFileLarge-8  1.93k ± 3%     1.79k ± 4%     -7.06%   (p=0.000 n=20+20)

* Fix linter check
* archiver: Remove cleanup goroutine from BufferPool. This isn't doing anything: channels get cleaned up by the GC when the last reference to them disappears, just like all other data structures. Also inlined BufferPool.Put in Buffer.Release, its only caller.
* cmd/restic: Remove trailing "..." from progress messages. These were added after messages since the last refactor of the progress printing code. Also skips an allocation in the common case.
* migrate: Cleanup option to request repository check
* archiver: remove tomb usage
* archiver: free workers once finished
* get rid of tomb package
* backend/sftp: Support atomic rename ... if the server has posix-rename@openssh.com. OpenSSH introduced this extension in 2008: openssh/openssh-portable@7c29661
* internal/repository: Fix LoadBlob + fuzz test. When given a buf that is big enough for a compressed blob but not its decompressed contents, the copy at the end of LoadBlob would skip the last part of the contents. Fixes restic#3783.
* fix handling of maxKeys in SearchKey
* fix flaky key test
* tweak password test count changelog
* all: Move away from pkg/errors, easy cases. github.com/pkg/errors is no longer getting updates, because Go 1.13 went with the more flexible errors.{As,Is} functions. Use those instead: errors from pkg/errors already support the Unwrap interface used by 1.13 error handling. Also:
  * check for io.EOF with a straight ==. That value should not be wrapped, and the chunker (whose error is checked in the cases changed) does not wrap it.
  * Give custom Error methods pointer receivers, so there's no ambiguity when type-switching, since the value type will no longer implement error.
  * Make restic.ErrAlreadyLocked private, and rename it to alreadyLockedError to match the stdlib convention that error type names end in Error.
  * Same with rest.ErrIsNotExist => rest.notExistError.
  * Make s3.Backend.IsAccessDenied a private function.
* backend: Move semaphores to a dedicated package called backend/sema (see the sketch below). I resisted the temptation to call the main type sema.Phore. Also, semaphores are now passed by value to skip a level of indirection when using them.
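
A minimal sketch of the channel-based counting semaphore pattern described in the backend/sema entry above; all names are illustrative and not necessarily restic's actual API:

    package sema

    // Semaphore admits at most n concurrent holders. The struct only wraps
    // a channel (a reference type), so copies share the same underlying
    // counter, which is what makes passing it by value safe and cheap.
    type Semaphore struct {
        ch chan struct{}
    }

    // New returns a semaphore allowing up to n concurrent token holders.
    func New(n uint) Semaphore {
        return Semaphore{ch: make(chan struct{}, n)}
    }

    // GetToken blocks until a slot is free.
    func (s Semaphore) GetToken() { s.ch <- struct{}{} }

    // ReleaseToken frees a slot acquired with GetToken.
    func (s Semaphore) ReleaseToken() { <-s.ch }

Because the value only contains a channel, there is no pointer indirection on each use, matching the "passed by value" note above.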
* restic prune: Merge three loops over the index. There were three loops over the index in restic prune: to find duplicates, to determine sizes (in pack.Size) and to generate packInfos. These three are now one loop. This way, prune doesn't need to construct a set of duplicate blobs, pack.Size doesn't need to contain special logic for prune's use case (the onlyHdr argument) and pack.Size doesn't need to construct a map only to have it immediately transformed into a different map. Some quick testing on a 160GiB local repo doesn't show running time or memory use of restic prune --dry-run changing significantly.
* cmd/restic, limiter: Move config knowledge to internal packages. The GlobalOptions struct now embeds a backend.TransportOptions, so it doesn't need to construct one in open and create. The upload and download limits are similarly now a struct in internal/limiter that is embedded in GlobalOptions.
* Revert "restic prune: Merge three loops over the index". This reverts commit 8bdfcf7. Should fix restic#3809. Also needed to make restic#3290 apply cleanly.
* repository: index saving belongs into the MasterIndex
* repository: add Save method to MasterIndex interface
* repository: make flushPacks private
* repository: remove unused index.Store
* repository: inline index.encode
* repository: remove unused index.ListPack
* repository: remove unused (Master)Index.Count
* repository: Properly set id for finalized index. As MergeFinalIndex and index uploads can occur concurrently, it is necessary for MergeFinalIndex to check whether the IDs for an index were already set before merging it. Otherwise, we'd lose the ID of an index which is set _after_ uploading it.
* repository: remove MasterIndex.All()
* repository: hide MasterIndex.FinalizeFullIndexes / FinalizeNotFinalIndexes
* repository: simplify CreateIndexFromPacks
* repository: remove unused packIDToIndex field
* repository: cleanup
* drop unused repository.Loader interface
* redact http authorization header in debug log output (see the sketch below)
* redact keys/token in backend config debug log
* redact swift auth token in debug output
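
To illustrate the debug-log redaction entries above, a minimal sketch assuming a hypothetical redactHeader helper; this is not restic's actual code:

    package debug

    import "net/http"

    // redactHeader returns a copy of h with sensitive values masked, so an
    // http.Request can be dumped to the debug log without leaking
    // credentials. The header names listed here are illustrative.
    func redactHeader(h http.Header) http.Header {
        redacted := h.Clone()
        for _, name := range []string{"Authorization", "X-Auth-Token"} {
            if redacted.Get(name) != "" {
                redacted.Set(name, "**redacted**")
            }
        }
        return redacted
    }

Cloning before masking matters: the live request headers must stay intact, only the logged copy is altered.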
* Return real size from SaveBlob
* Print number of bytes added to the repo. This includes optional compression and crypto overhead.
* stats: return storage size for raw-data mode. raw-data summed up the size of the blob plaintexts. However, with compression this makes little sense, as the storage size in the repository is lower due to compression. Thus sum up the actual size each blob takes in the repository.
* Account for pack header overhead at each entry. This will miss the pack header crypto overhead and the length field, which only amount to a few bytes per pack file.
* extend compression feature changelog entry
* rebuild-index: correctly rebuild index for mixed packs. For mixed packs, data and tree blobs were stored in separate index entries. This results in warnings from the check command and maybe other problems.
* check: Print full ids. The short ids are not always unique. In addition, recovering from damage is easier with the full ids, as they make it easier to access the corresponding files.
* check: remove dead code
* Don't crash if SecretString is uninitialized
* tag: Remove unnecessary flush call
* repository: Rework blob saving to use an async pack uploader. Previously, SaveAndEncrypt would assemble blobs into packs and either return immediately if the pack was not yet full, or upload the pack file otherwise. The upload would block the current goroutine until it finished. Now, the upload is done using separate goroutines. This requires changes to the error handling. As uploads are no longer tied to a SaveAndEncrypt call, failed uploads are signaled using an errgroup. To count the uploaded amount of data, the pack header overhead is no longer returned by `packer.Finalize` but rather by `packer.HeaderOverhead`. This helper method is necessary to continue returning the pack header overhead directly to the responsible call to `repository.SaveBlob`. Without the method this would not be possible, as packs are finalized asynchronously.
* archiver: Limit blob saver count to GOMAXPROCS. With the asynchronous uploaders there's no more benefit from using more blob savers than we have CPUs. Thus use just one blob saver for each CPU we are allowed to use.
* archiver: Reduce tree saver concurrency. A large number of tree savers has no obvious benefit; however, they can increase the number of (potentially large) trees kept in memory.
* repository: Limit to a single pending pack file. Use only a single not yet completed pack file to keep the number of open and active pack files low. The main change here is to defer hashing the pack file to the upload step. This prevents the pack assembly step from becoming a bottleneck, as its only task is now to write data to the temporary pack file. The tests are cleaned up to no longer reimplement packer manager functions.
* Document connections and compression option
* Add changelog for async pack uploads
* adapt workers based on whether an operation is CPU or IO-bound (see the sketch below). Use runtime.GOMAXPROCS(0) as the worker count for CPU-bound tasks, repo.Connections() for IO-bound tasks, and a combination if a task can be both. Streaming packs is treated as IO-bound, as adding more workers cannot provide a speedup. Typical IO-bound tasks are downloading / uploading / deleting files. Decoding / encoding / verifying are usually CPU-bound. Several tasks are a combination of both, e.g. combined download and decode functions. In the latter case, add both limits together. As the backends have their own concurrency limits, restic still won't download more than repo.Connections() files in parallel, but the additional workers can decode already downloaded data in parallel.
* Document automatic CPU/IO-concurrency
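
A sketch of the CPU/IO worker heuristic from the "adapt workers" entry above; the helper name and signature are hypothetical, not restic's actual API:

    package workers

    import "runtime"

    // workerCount picks a concurrency limit for a task. conns is the
    // backend connection limit (repo.Connections() in restic).
    func workerCount(cpuBound, ioBound bool, conns uint) int {
        n := 0
        if cpuBound {
            n += runtime.GOMAXPROCS(0) // one worker per usable CPU
        }
        if ioBound {
            n += int(conns) // bounded by backend connections
        }
        return n
    }

For a combined download-and-decode task both flags would be set, yielding the "add both limits together" behavior: the backend still caps parallel downloads at conns, while the extra workers decode data that has already arrived.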
* Fix data race in blob_saver. After the `BlobSaver` job is submitted, the buffer can be released and reused by another `FileSaver` even before `BlobSaver.Save` returns. That FileSaver will then change `buf.Data`, leading to wrong backup statistics. Found by `go test -race ./...`:

      WARNING: DATA RACE
      Write at 0x00c0000784a0 by goroutine 41:
        github.com/restic/restic/internal/archiver.(*FileSaver).saveFile()
            /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:176 +0x789
        github.com/restic/restic/internal/archiver.(*FileSaver).worker()
            /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:242 +0x2af
        github.com/restic/restic/internal/archiver.NewFileSaver.func2()
            /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:88 +0x5d
        golang.org/x/sync/errgroup.(*Group).Go.func1()
            /home/michael/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 +0x91

      Previous read at 0x00c0000784a0 by goroutine 29:
        github.com/restic/restic/internal/archiver.(*BlobSaver).Save()
            /home/michael/Projekte/restic/restic/internal/archiver/blob_saver.go:57 +0x1dd
        github.com/restic/restic/internal/archiver.(*BlobSaver).Save-fm()
            <autogenerated>:1 +0xac
        github.com/restic/restic/internal/archiver.(*FileSaver).saveFile()
            /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:191 +0x855
        github.com/restic/restic/internal/archiver.(*FileSaver).worker()
            /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:242 +0x2af
        github.com/restic/restic/internal/archiver.NewFileSaver.func2()
            /home/michael/Projekte/restic/restic/internal/archiver/file_saver.go:88 +0x5d
        golang.org/x/sync/errgroup.(*Group).Go.func1()
            /home/michael/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57 +0x91

* Fix minor typo in docs
* Wording: change repo to repository
* Restore: validate provided patterns
* Add testRunRestoreAssumeFailure function
* Test restore fails when using invalid patterns
* Fix wording in changelog template
* Add changelog entry
* Added hint for --compression max in migration process. Since this is a one-time process, users should be aware of this and consider this step.
* Wording: replace further repo occurrences with repository
* doc: update sample help output
* doc: Rework hint to repack with max compression
* doc: Add note about using rclone for Google Drive. It wasn't clear that Google Cloud Storage and Google Drive are two different services and that one should use the rclone backend for the latter. This commit adds a note with this information.
* azure: add SAS authentication option
* azure: Strip ? prefix from sas token
* backup: clarify usage string. Using the `--files-from` option it is possible to run `backup` without specifying any source paths directly on the command line.
* prune: Enhance treatment of duplicates
* prune: handle very high duplication of some blobs. Suggested-By: Alexander Weiss <[email protected]>
* prune: code cleanups
* repository: extract LoadTree/SaveTree. The repository has no real idea what a Tree is, so these methods never belonged there.
* repository: extract Load/StoreJSONUnpacked. A Load/Store method for each data type is much clearer. As a result, the repository no longer needs a method to load / store json.
* mock: move to internal/backend
* limiter: move to internal/backend
* crypto: move crypto buffer helpers
* restorer: extract hardlinks index from restic package
* backend: extract readerat from restic package
* check: complain about mixed pack files
* check: Complain about usage of s3 legacy layout
* check: Deprecate `--check-unused` (see the sketch below). Unused blobs are not a problem but rather expected to exist now that prune by default does not remove every unused blob. However, the option has caused questions from users whether a repository is damaged or not, so just remove that option. Note that the remaining code is left intact, as it is still useful for our test cases.
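
For the `--check-unused` deprecation above, a sketch of the general cobra/pflag deprecation pattern; the command wiring and messages here are illustrative, not restic's actual code:

    package cmd

    import "github.com/spf13/cobra"

    func newCheckCommand() *cobra.Command {
        var checkUnused bool
        cmd := &cobra.Command{Use: "check"}
        f := cmd.Flags()
        f.BoolVar(&checkUnused, "check-unused", false, "find unused blobs")
        // pflag hides a deprecated flag from the help output and prints the
        // message whenever it is used, but the flag keeps working, so
        // existing scripts and tests do not break immediately.
        _ = f.MarkDeprecated("check-unused", "unused blobs are expected and not a problem")
        return cmd
    }
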
* checker: Fix S3 legacy layout detection
* Fix S3 legacy layout migration
* Add changelog for stricter checks
* archiver: remove dead attribute from FutureNode
* archiver: cleanup Saver interface
* archiver: remove unused fileInfo from progress callback
* archiver: unify FutureTree/File into futureNode. There is no real difference between the FutureTree and FutureFile structs. However, differentiating both increases the size of the FutureNode struct. The FutureNode struct is now only 16 bytes large on 64bit platforms. That way it has a very low overhead if the corresponding file/directory was not processed yet. There is a special case for nodes that were reused from the parent snapshot, as a Go channel seems to have 96 bytes of overhead, which would result in a memory usage regression.
* archiver: Incrementally serialize tree nodes. That way it is not necessary to keep both the Nodes forming a Tree and the serialized JSON version in memory.
* archiver: reduce memory usage for large files. FutureBlob now uses a Take() method as a more memory-efficient way to retrieve the future's result. In addition, futures are now collected while saving the file. As only a limited number of blobs can be queued for uploading, for a large file nearly all FutureBlobs already have their result ready, such that the FutureBlob object just consumes memory.
* Add changelog for the optimized tree serialization
* Remove stale comments from backend/sftp. The preExec and postExec functions were removed in 0bdb131 from 2018.
* Speed up restic init over slow SFTP links (see the sketch below). pkg/sftp.Client.MkdirAll(d) does a Stat to determine if d exists and is a directory, then a recursive call to create the parent, so the calls for data/?? each take three round trips. Doing a Mkdir first should eliminate two round trips for 255/256 data directories as well as all but one of the top-level directories. Also, we can do all of the calls concurrently. This may reintroduce some of the Stat calls when multiple goroutines try to create the same parent, but at the default number of connections, that should not be much of a problem.
* Add environment variable RESTIC_COMPRESSION
* prune: separate collecting/printing/pruning
* prune: split into smaller functions
* prune: Add internal integrity check. After repacking, every blob that should be kept must have been repacked. We have seen a few cases in which a single blob went missing, which could have been caused by a bitflip somewhere. This sanity check might help catch some of these cases.
* repository: try to recover from invalid blob while repacking. If a blob that should be kept is invalid, Repack will now try to request the blob using LoadBlob, and only return an error if that fails.
* prune: move code
* repository: Test fallback to existing blobs
* Add changelog for restic#3837/restic#3840
* internal/restic: Handle EINVAL for xattr on Solaris. Also make the errors a bit less verbose by not prepending the operation, since pkg/xattr already does that. Old errors looked like:

      Listxattr: xattr.list /myfiles/.zfs/snapshot: invalid argument
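
To illustrate the SFTP `restic init` speedup described above, a minimal sketch of the concurrent directory creation; names and error handling are simplified relative to the real code:

    package sftputil

    import (
        "fmt"

        "github.com/pkg/sftp"
        "golang.org/x/sync/errgroup"
    )

    // createDataDirs creates the 256 data/?? subdirectories concurrently.
    // A plain Mkdir is tried first (a single round trip); the multi-round-trip
    // MkdirAll is only the fallback for when the parent is still missing.
    func createDataDirs(client *sftp.Client) error {
        var g errgroup.Group
        for i := 0; i < 256; i++ {
            name := fmt.Sprintf("data/%02x", i)
            g.Go(func() error {
                if err := client.Mkdir(name); err == nil {
                    return nil
                }
                return client.MkdirAll(name)
            })
        }
        return g.Wait()
    }
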
* Add possibility to set snapshot ID (used in test)
* Generalize fuse snapshot dirs implementation + allow "/" in tags and snapshot template
* Make snapshots dirs in mount command customizable
* fuse: cleanup test
* fuse: remove unused MetaDir
* add option for setting min pack size
* prune: add repack-small parameter
* prune: reduce priority of repacking small packs
* repository: prevent header overfill
* document minPackSize
* rework pack size parameter documentation
* update restic help snippets in documentation
* Add changelog for packsize option
* rename option to --pack-size
* Only repack small files if there are multiple of them
* Always repack very small pack files
* s3: Disable multipart uploads below 200MB
* Add note that pack-size is not an exact limit
* Reword prune --repack-small description
* repository: StreamPack in parts if there are too large gaps. For large pack sizes we might only be interested in the first and last blob of a pack file. Thus stream a pack file in multiple parts if the gaps between requested blobs grow too large.
* Remove unused hooks mechanism
* debug: enable debug support for release builds
* debug: support roundtripper logging also for release builds. Unlike debug builds, do not use the eofDetectRoundTripper if logging is disabled.
* update documentation to reflect DEBUG_LOG for release builds
* Add changelog for DEBUG_LOG available in release builds
* fuse: Redesign snapshot dirstruct. Cleanly separate the directory presentation and the snapshot directory structure. SnapshotsDir now translates the dirStruct into a format usable by the fuse library and contains only minimal special case rules. All decisions have moved into SnapshotsDirStructure, which now creates a fully preassembled tree data structure.
* Mention --snapshot-template and --time-template in changelog
* mount: remove unused inode field from root node
* mount: Fix parent inode used by snapshots dir
* Update tests to Go 1.19
* Bump golangci-lint version
* restic: Use stable sorting in snapshot policy (see the sketch below). sort.Sort is not guaranteed to be stable. Go 1.19 has changed the sorting algorithm, which resulted in changes of the sort order. When comparing snapshots with identical timestamps but different paths and tags lists, there is no meaningful order among them, so just keep their order stable.
* stats: Add snapshots count to json output
* Fix typo with double percentage in help text
* doc: Update link to GCS documentation. Updates the link to Google Cloud Storage documentation about creating a service account key.
* doc: Update more links to GCS documentation
* forget: Error when invalid unit is given in duration policy
* doc: Fix typo in compression section
* forget: Fail test if duration parsing error is missing
* comment cleanup. gofmt reformatted the comment.
* mount: Map slashes in tags to underscores. Suggested-by: greatroar <>
* copy: replace --repo2 with --from-repo. `init` and `copy` used `--repo2` with two different meanings, which has proven to be confusing for users. `--from-repo` now consistently marks a source repository from which data is read. `--repo` is now always the target/destination repository.
* mount: Only remember successful snapshot refreshes. If the context provided by the fuse library is canceled before the index is loaded, this could lead to missing snapshots.
* gofmt all files. Apparently the rules for comment formatting have changed with go 1.19.
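
For the stable-sorting entry above, a minimal sketch of the pattern using sort.SliceStable; the snapshot type here is illustrative, not restic's actual struct:

    package policy

    import (
        "sort"
        "time"
    )

    type snapshot struct {
        Time time.Time
        // paths, tags, ... omitted
    }

    // sortNewestFirst orders snapshots by time, newest first. Snapshots with
    // identical timestamps have no meaningful order among them, so a stable
    // sort keeps their input order instead of depending on the sorting
    // algorithm, which changed in Go 1.19.
    func sortNewestFirst(snapshots []snapshot) {
        sort.SliceStable(snapshots, func(i, j int) bool {
            return snapshots[i].Time.After(snapshots[j].Time)
        })
    }

sort.Stable with a sort.Interface would work equally well; the point is only that equal elements keep their relative input order.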
* doc: document aws session token
* helper: don't setup cmd paths twice
* helper: cleanups
* helper: Reduce number of parallel builds a bit. The go compiler is already parallelized. The high concurrency caused my podman container to hit a resource limit.
* helper: download modules as first step. There's no use in running that step in parallel.
* repository: Do not report ignored packs in EachByPack. Ignored packs were reported as an empty pack by EachByPack. The most immediate effect of this is that the progress bar for rebuilding the index reports processing more packs than actually exist.
* Add note that larger packs increase disk wear
* doc: fix typo
* update dependencies
* downgrade bazil/fuse again to retain macOS support
* remain compatible with go 1.15
* restic: Cleanup xattr error handling for Solaris. Since xattr 0.4.8 (pkg/xattr#68), it returns ENOTSUP similar to Linux.
* rclone: Return a permanent error if rclone already exited. rclone can exit early, for example when the connection to rclone is relayed via ssh: `-o rclone.program='ssh [email protected] forced-command'`
* Polish changelog entries
* doc: Improve/clarify preparing and versions of repositories
* Further changelog polishing
* Fix typo in the environment variable name for --from-password-file
* Prepare changelog for 0.14.0
* Generate CHANGELOG.md for 0.14.0
* Update manpages and auto-completion
* Add version for 0.14.0
* go mod tidy run
* fix some merge errors
* tweaked linting to avoid merge nightmares
* took out lint because it's not working and bugged me, and updated the version for netapp
* updated to 1.8

---------

Co-authored-by: Lorenz Bausch <[email protected]>
Co-authored-by: MichaelEischer <[email protected]>
Co-authored-by: Arigbede Moses <[email protected]>
Co-authored-by: Alexander Neumann <[email protected]>
Co-authored-by: Alexander Neumann <[email protected]>
Co-authored-by: greatroar <[email protected]>
Co-authored-by: Jayson Wang <[email protected]>
Co-authored-by: mattxtaz <[email protected]>
Co-authored-by: lbausch <[email protected]>
Co-authored-by: JsBergbau <[email protected]>
Co-authored-by: rawtaz <[email protected]>
Co-authored-by: Roger Gammans <[email protected]>
Co-authored-by: Alexander Weiss <[email protected]>
Co-authored-by: Kyle Brennan <[email protected]>
Co-authored-by: greatroar <@>
Co-authored-by: Leo R. Lundgren <[email protected]>
Co-authored-by: bmason <[email protected]>