Commit graph

1446 commits

Author SHA1 Message Date
Michael Eischer
c147422ba5 repository: special case SaveBlob for all zero chunks
Sparse files contain large regions containing only zero bytes. Checking
that a blob only contains zeros is possible with over 100GB/s for modern
x86 CPUs. Calculating sha256 hashes is only possible with 500MB/s (or
2GB/s using hardware acceleration). Thus we can speed up the hash
calculation for all zero blobs (which always have length
chunker.MinSize) by checking for zero bytes and then using the
precomputed hash.

The all zeros check is only performed for blobs with the minimal chunk
size, and thus should add no overhead most of the time. For chunks which
are not all zero but have the minimal chunks size, the overhead will be
below 2% based on the above performance numbers.

This allows reading sparse sections of files as fast as the kernel can
return data to us. On my system using BTRFS this resulted in about
4GB/s.
2022-09-24 21:39:39 +02:00
Michael Eischer
34fe1362da restorer: move zeroPrefixLen to restic package 2022-09-24 21:39:39 +02:00
Michael Eischer
a5ebd5de4b restorer: Fix race condition in partialFile.WriteAt
The restorer can issue multiple calls to WriteAt in parallel. This can
result in unexpected orderings of the Truncate and WriteAt calls and
sometimes too short restored files.
2022-09-24 21:39:39 +02:00
Michael Eischer
5b6a77058a Enable sparseness only conditionally
We can either preallocate storage for a file or sparsify it. This
detects a pack file as sparse if it contains an all zero block or
consists of only one block. As the file sparsification is just an
approximation, hide it behind a `--sparse` parameter.
2022-09-24 21:20:00 +02:00
greatroar
5d4568d393 Write sparse files in restorer
This writes files by using (*os.File).Truncate, which resolves to the
truncate system call on Unix.

Compared to the naive loop,

	for _, b := range p {
		if b != 0 {
			return false
		}
	}

the optimized allZero is about 10× faster:

name       old time/op    new time/op     delta
AllZero-8    1.09ms ± 1%     0.09ms ± 1%    -92.10%  (p=0.000 n=10+10)

name       old speed      new speed       delta
AllZero-8  3.84GB/s ± 1%  48.59GB/s ± 1%  +1166.51%  (p=0.000 n=10+10)
2022-09-24 21:18:48 +02:00
Michael Eischer
eb83402d39
Merge pull request #3935 from miles170/master
Only display the message if there were locks to be removed
2022-09-24 20:53:13 +02:00
Michael Eischer
ef58ddd7b1
Merge pull request #3923 from MichaelEischer/fix-flaky-cache-test
cache: fix flaky TestFileSaveConcurrent on windows
2022-09-24 20:52:55 +02:00
Michael Eischer
7fc178aaf4 internal/cache: extend description of cache sharing test failure 2022-09-24 13:07:01 +02:00
Miles Liu
1acbda18f8
Only display the message if there were locks to be removed
`restic unlock` now only shows `successfully removed locks` if there were locks to be removed.
In addition, it also reports the number of the removed lock files.
2022-09-24 19:02:24 +08:00
Michael Eischer
da1a359c8b
Merge pull request #3927 from MichaelEischer/faster-index-each
Speed up MasterIndex.Each
2022-09-24 12:35:23 +02:00
Michael Eischer
041a51512a
Merge pull request #3780 from jkmw/fix/2578
Remove existing path before restoring a symlink
2022-09-24 12:34:42 +02:00
Michael Eischer
1ebd57247a repository: optimize MasterIndex.Each
Sending data through a channel at very high frequency is extremely
inefficient. Thus use simple callbacks instead of channels.

> name                old time/op  new time/op  delta
> MasterIndexEach-16   6.68s ±24%   0.96s ± 2%  -85.64%  (p=0.008 n=5+5)
2022-09-24 12:21:59 +02:00
Michael Eischer
825b95e313 repository: add benchmark for MasterIndex.Each 2022-09-24 12:21:59 +02:00
greatroar
1220fe9650 internal/cache: Concurrent use of cache not working on Windows 2022-09-17 19:49:44 +02:00
Jerome Küttner
ef618bdd3f use os.Remove if path already exists on symlink restore 2022-09-14 08:14:31 +02:00
Michael Eischer
8b9778d537
Merge pull request #3900 from MichaelEischer/b2-init-timeout
Add timeout for the initial connection to B2
2022-09-10 23:28:59 +02:00
Michael Eischer
17c27400f8
Merge pull request #3921 from MichaelEischer/filter-cleanup-error-handling
filter: deduplicate error handling for pattern validation
2022-09-10 23:24:50 +02:00
Michael Eischer
be9ccc186e
Merge pull request #3875 from MichaelEischer/fix-fuse-context-cancel
mount: Fix input/output errors for canceled syscalls
2022-09-10 23:20:29 +02:00
Michael Eischer
8e0ca80547 filter: deduplicate error handling for pattern validation 2022-09-09 23:12:41 +02:00
Michael Eischer
8b4dd70013 migrate: Report why an migration cannot be applied
Just returning that `Migration upgrade cannot be applied: check failed`
is not too useful when running `migrate upgrade_repo_v2`.
2022-09-03 11:49:31 +02:00
Michael Eischer
6c69f08a7b
Merge pull request #3905 from DRON-666/haspaths-linear
Reduce quadratic time complexity of `Snapshot.HasPaths`
2022-08-30 20:35:56 +02:00
DRON-666
d0f1060df7 Fix quadratic time complexity of Snapshot.HasPaths 2022-08-30 04:38:17 +03:00
Michael Eischer
e5b2c4d571 b2: sniff the error that caused init retry loops 2022-08-28 17:46:03 +02:00
Michael Eischer
dc2db2de5e b2: cancel connection setup after a minute
If the connection to B2 fails, the library enters an endless loop.
2022-08-28 14:56:17 +02:00
Michael Eischer
7682149c9d repository: cleanup copy connection count check 2022-08-28 11:40:56 +02:00
Michael Eischer
b03277ead5 repository: don't hang when copying using a single connection 2022-08-28 11:40:31 +02:00
Fred
be6baaec12 Add success callback to the backend 2022-08-27 22:27:15 +02:00
Fred
baf58fbaa8 Add unit tests 2022-08-27 22:21:06 +02:00
Fred
d629333efe Add function to notify of success after retrying 2022-08-27 22:21:06 +02:00
Michael Eischer
908f7441fe
Merge pull request #3885 from MichaelEischer/delete-fixes
Improve reliability of upload retries and B2 file deletions
2022-08-26 22:30:50 +02:00
Michael Eischer
4c90d91d4d backend: Test that failed uploads are not removed for backends with atomic replace 2022-08-26 21:20:52 +02:00
Michael Eischer
cf0a8d7758 sftp: Only connect once for repository creation
This is especially useful if ssh asks for a password or if closing the
initial connection could return an error due to a problematic server
implementation.
2022-08-26 20:50:40 +02:00
Michael Eischer
dd7cd5b9b3 fuse: remove unused context parameter 2022-08-26 20:48:48 +02:00
Michael Eischer
a0c1ae9f90 mount: Correctly return context.Canceled for interrupted syscalls
bazil/fuse expects us to return context.Canceled to signal that a
syscall was successfully interrupted. Returning a wrapped version of
that error however causes the fuse library to signal an EIO (input/output
error). Thus unwrap context.Canceled errors before returning them.
2022-08-26 20:48:48 +02:00
MichaelEischer
f7808245aa
Merge pull request #3878 from MichaelEischer/cheaper-cache-load
cache: Just try to open cache entry without calling stat first
2022-08-26 20:33:36 +02:00
MichaelEischer
bee15dd555
Merge pull request #3879 from MichaelEischer/mem-optimize
Some random (minor) memory-allocation optimizations
2022-08-26 20:33:02 +02:00
MichaelEischer
be90a565cc
Merge pull request #3887 from MichaelEischer/rclone-permanent-error
rclone: Return a permanent error if rclone already exited
2022-08-24 21:19:00 +02:00
Michael Eischer
506d92e87c rclone: Return a permanent error if rclone already exited
rclone can exit early for example when the connection to rclone is
relayed for example via ssh: `-o rclone.program='ssh user@example.org
forced-command'`
2022-08-23 22:05:04 +02:00
Michael Eischer
7681a63fdb restic: Cleanup xattr error handling for Solaris
Since xattr 0.4.8 (https://github.com/pkg/xattr/pull/68) returns ENOTSUP
similar to Linux.
2022-08-23 21:25:15 +02:00
Michael Eischer
623556bab6 b2: Increase list size to maximum
Just request as many files as possible in one call to reduce the number
of network roundtrips.
2022-08-21 11:20:03 +02:00
Michael Eischer
de0162ea76 backend/retry: Overwrite failed uploads instead of deleting them
For backends which are able to atomically replace files, we just can
overwrite the old copy, if it is necessary to retry an upload. This has
the benefit of issuing one operation less and might be beneficial if a
backend storage, due to bugs or similar, could mix up the order of the
upload and delete calls.
2022-08-21 11:14:53 +02:00
Michael Eischer
fc506f8538 b2: Repeat deleting until all file versions are removed
When hard deleting the latest file version on B2, this uncovers earlier
versions. If an upload required retries, multiple version might exist
for a file. Thus to reliably delete a file, we have to remove all
versions of it.
2022-08-21 11:11:00 +02:00
Michael Eischer
cc4728d287 repository: Do not report ignored packs in EachByPack
Ignored packs were reported as an empty pack by EachByPack. The most
immediate effect of this is that the progress bar for rebuilding the
index reports processing more packs than actually exist.
2022-08-21 10:38:40 +02:00
Michael Eischer
7a992fc794 repository: Reduce buffer reallocations in ForAllIndexes
Previously the buffer was grown incrementally inside `repo.LoadUnpacked`.
But we can do better as we already know how large the index will be.
Allocate a bit more memory to increase the chance that the buffer can be
reused in the future.
2022-08-19 21:13:40 +02:00
Michael Eischer
77b1980d8e repository: MasterIndex.Packs: reduce allocations 2022-08-19 21:10:43 +02:00
Michael Eischer
6ff9517e45 repository: MasterIndex.ListPacks / Index.EachByPack allow earlier GC
Allow earlier garbage collection of some of the intermediate data
structures.
2022-08-19 21:06:33 +02:00
Michael Eischer
ce902aac67 cache: Just try to open cache entry without calling stat first
Instead of first checking whether a file is in the repository cache and
then opening it, we just can open the file. This saves one stat call. If
the file is in the cache, everything is fine and otherwise the code
follows its normal fallback path.
2022-08-19 20:59:06 +02:00
MichaelEischer
0d9ac78437
Merge pull request #3873 from MichaelEischer/gofmt-comments
gofmt comments
2022-08-19 19:54:30 +02:00
MichaelEischer
7e96a5af62
Merge pull request #3872 from MichaelEischer/fuse-fix
mount: Only remember successful snapshot refreshes
2022-08-19 19:21:29 +02:00
Michael Eischer
f414db987d gofmt all files
Apparently the rules for comment formatting have changed with go 1.19.
2022-08-19 19:12:26 +02:00
Michael Eischer
522406b4f0 mount: Only remember successful snapshot refreshes
If the context provided by the fuse library is canceled before the index
was loaded this could lead to missing snapshots.
2022-08-19 19:07:07 +02:00
Michael Eischer
af50fe9ac0 mount: Map slashes in tags to underscores
Suggested-by: greatroar <>
2022-08-19 18:17:57 +02:00
Michael Eischer
2ea6c82cf6 comment cleanup
gofmt reformatted the comment
2022-08-18 20:15:38 +02:00
Michael Eischer
bb27f7408c forget: Fail test if duration parsing error is missing 2022-08-18 20:14:09 +02:00
Leo R. Lundgren
6f517858e8 forget: Error when invalid unit is given in duration policy 2022-08-10 13:37:26 +02:00
MichaelEischer
9ad3ad5972
Merge pull request #3850 from lbausch/go1.19
Update tests to Go 1.19
2022-08-07 14:56:17 +02:00
MichaelEischer
2930a102de
Merge pull request #3731 from metalsp0rk/feature/min-packsize-flag
Feature: min packsize flag
2022-08-07 14:54:45 +02:00
Michael Eischer
f3fdc66b32
restic: Use stable sorting in snapshot policy
sort.Sort is not guaranteed to be stable. Go 1.19 has changed the
sorting algorithm which resulted in changes of the sort order. When
comparing snapshots with identical timestamp but different paths and
tags lists, there is not meaningful order among them. So just keep their
order stable.
2022-08-07 14:10:40 +02:00
Michael Eischer
0b7291b8b2 mount: Fix parent inode used by snapshots dir 2022-08-07 13:03:32 +02:00
greatroar
cfa80e2c6b mount: remove unused inode field from root node 2022-08-07 13:03:26 +02:00
MichaelEischer
74ae76036f
Merge pull request #2913 from aawsome/mount-snapshot-slashes
mount: Make snapshots dir structure customizable
2022-08-07 12:27:59 +02:00
Michael Eischer
caa17988a3 fuse: Redesign snapshot dirstruct
Cleanly separate the directory presentation and the snapshot directory
structure. SnapshotsDir now translates the dirStruct into a format
usable by the fuse library and contains only minimal special case rules.
All decisions have moved into SnapshotsDirStructure which now creates a
fully preassembled tree data structure.
2022-08-07 12:13:06 +02:00
Michael Eischer
1ed775e3a8 debug: support roundtripper logging also for release builds
Different from debug builds do not use the eofDetectRoundTripper if
logging is disabled.
2022-08-05 23:49:39 +02:00
Michael Eischer
38becfc436 debug: enable debug support for release builds 2022-08-05 23:49:39 +02:00
Michael Eischer
82c268c917 Remove unused hooks mechanism 2022-08-05 23:49:39 +02:00
Michael Eischer
7266f07c87 repository: StreamPack in parts if there are too large gaps
For large pack sizes we might be only interested in the first and last
blob of a pack file. Thus stream a pack file in multiple parts if the
gaps between requested blobs grow too large.
2022-08-05 23:48:36 +02:00
Michael Eischer
7f3b2be1e8 s3: Disable multipart uploads below 200MB 2022-08-05 23:48:36 +02:00
Michael Eischer
1b076cda97 rename option to --pack-size 2022-08-05 23:47:43 +02:00
Kyle Brennan
1e3f05c3f1 repository: prevent header overfill 2022-08-05 23:47:12 +02:00
Michael Eischer
0a6fa602c8 add option for setting min pack size 2022-08-05 23:47:12 +02:00
Michael Eischer
2db7733ee3 fuse: remove unused MetaDir 2022-08-05 23:46:46 +02:00
Michael Eischer
f678f7cb04 fuse: cleanup test 2022-08-05 23:46:46 +02:00
Alexander Weiss
1751afae26 Make snapshots dirs in mount command customizable 2022-08-05 23:46:46 +02:00
Alexander Weiss
57f4003f2f Generalize fuse snapshot dirs implemetation
+ allow "/" in tags and snapshot template
2022-08-05 23:46:46 +02:00
Alexander Weiss
696c18e031 Add possibility to set snapshot ID (used in test) 2022-08-05 23:46:46 +02:00
MichaelEischer
04a8ee80fb
Merge pull request #3829 from MichaelEischer/prune-refactor
Split prune into slightly small functions
2022-08-05 23:29:52 +02:00
greatroar
ad6ac680af internal/restic: Handle EINVAL for xattr on Solaris
Also make the errors a bit less verbose by not prepending the operation,
since pkg/xattr already does that. Old errors looked like

    Listxattr: xattr.list /myfiles/.zfs/snapshot: invalid argument
2022-08-01 12:45:17 +02:00
Michael Eischer
73053674d9 repository: Test fallback to existing blobs 2022-07-30 17:37:07 +02:00
Michael Eischer
623770eebb repository: try to recover from invalid blob while repacking
If a blob that should be kept is invalid, Repack will now try to request
the blob using LoadBlob. Only return an error if that fails.
2022-07-30 17:37:07 +02:00
greatroar
2bdc40e612 Speed up restic init over slow SFTP links
pkg/sftp.Client.MkdirAll(d) does a Stat to determine if d exists and is
a directory, then a recursive call to create the parent, so the calls
for data/?? each take three round trips. Doing a Mkdir first should
eliminate two round trips for 255/256 data directories as well as all
but one of the top-level directories.

Also, we can do all of the calls concurrently. This may reintroduce some
of the Stat calls when multiple goroutines try to create the same
parent, but at the default number of connections, that should not be
much of a problem.
2022-07-30 13:09:08 +02:00
greatroar
23ebec717c Remove stale comments from backend/sftp
The preExec and postExec functions were removed in
0bdb131521 from 2018.
2022-07-30 13:07:25 +02:00
Michael Eischer
4a10ebed15 archiver: reduce memory usage for large files
FutureBlob now uses a Take() method as a more memory-efficient way to
retrieve the futures result. In addition, futures are now collected
while saving the file. As only a limited number of blobs can be queued
for uploading, for a large file nearly all FutureBlobs already have
their result ready, such that the FutureBlob object just consumes
memory.
2022-07-23 14:45:07 +02:00
Michael Eischer
b817681a11 archiver: Incrementally serialize tree nodes
That way it is not necessary to keep both the Nodes forming a Tree and
the serialized JSON version in memory.
2022-07-23 14:45:07 +02:00
Michael Eischer
c206a101a3 archiver: unify FutureTree/File into futureNode
There is no real difference between the FutureTree and FutureFile
structs. However, differentiating both increases the size of the
FutureNode struct.

The FutureNode struct is now only 16 bytes large on 64bit platforms.
That way is has a very low overhead if the corresponding file/directory
was not processed yet.

There is a special case for nodes that were reused from the parent
snapshot, as a go channel seems to have 96 bytes overhead which would
result in a memory usage regression.
2022-07-23 14:45:07 +02:00
Michael Eischer
32f4997733 archiver: remove unused fileInfo from progress callback 2022-07-23 14:16:23 +02:00
Michael Eischer
dcb00fd2d1 archiver: cleanup Saver interface 2022-07-23 14:16:23 +02:00
Michael Eischer
79321a195c archiver: remove dead attribute from FutureNode 2022-07-23 14:16:23 +02:00
Michael Eischer
5a6f2f9fa0 Fix S3 legacy layout migration 2022-07-23 11:19:32 +02:00
Michael Eischer
04e49924fb checker: Fix S3 legacy layout detection 2022-07-23 11:19:32 +02:00
Michael Eischer
fcb3ddf181 check: Complain about usage of s3 legacy layout 2022-07-23 11:19:32 +02:00
Michael Eischer
8b8bd4e8ac check: complain about mixed pack files 2022-07-23 11:19:32 +02:00
MichaelEischer
443cc49afd
Merge pull request #3830 from MichaelEischer/cleanup-repo
Extract Load/SaveTree/JSONUnpacked from repository
2022-07-23 10:46:13 +02:00
MichaelEischer
1f5369e072
Merge pull request #3831 from MichaelEischer/move-code
Move code out of the restic package and consolidate backend specific code
2022-07-23 10:33:05 +02:00
Michael Eischer
9729e6d7ef backend: extract readerat from restic package 2022-07-17 15:29:09 +02:00
Michael Eischer
c44b21d366 restorer: extract hardlinks index from restic package 2022-07-17 13:45:42 +02:00
Michael Eischer
8c11fc3ec9 crypto: move crypto buffer helpers 2022-07-17 13:42:23 +02:00
Michael Eischer
a0cef9f247 limiter: move to internal/backend 2022-07-17 13:40:15 +02:00
Michael Eischer
163ab9c025 mock: move to internal/backend 2022-07-17 13:40:06 +02:00
Michael Eischer
89d3ce852b repository: extract Load/StoreJSONUnpacked
A Load/Store method for each data type is much clearer. As a result the
repository no longer needs a method to load / store json.
2022-07-17 13:22:00 +02:00
Michael Eischer
fbcbd5318c repository: extract LoadTree/SaveTree
The repository has no real idea what a Tree is. So these methods never
belonged there.
2022-07-17 13:11:28 +02:00