Commit graph

1431 commits

Author SHA1 Message Date
greatroar
22147e1e02 all: Minor cleanups
if x { return true } return false => return x

	fmt.Sprintf("%v", x) => fmt.Sprint(x) or x.String()

The fmt.Sprintf idiom is still used in the SecretString tests, where it
serves security hardening.
2022-10-16 10:50:39 +02:00
greatroar
d03460010f internal/restic: Fix ID.UnmarshalJSON, ParseID
ID.UnmarshalJSON accepted non-JSON input with ' as the string delimiter.
Also, the error message for non-hex input was less informative than it
could be and it performed too many checks.

Changed ParseID to keep the error messages consistent.
2022-10-16 10:39:52 +02:00
greatroar
0e155fd9a6 internal/restic: Fix UID/GID parsing
The helper function uidGidInt used strconv.ParseInt instead of
ParseUint, so it silently ignored some invalid user/group IDs.

Also, improve the error message. "Invalid UID" is more informative than
having "ParseInt" twice (*strconv.NumError displays the function name).

Finally, the user.User struct can be passed by pointer to get reduce
code size.
2022-10-14 18:21:00 +02:00
greatroar
e0b743c64d internal/restic: Remove unused ID.EqualString 2022-10-14 18:20:11 +02:00
greatroar
6922360179 ui/backup: Remove unused ProgressReporter type, Progress field 2022-10-14 14:36:19 +02:00
greatroar
d4aadfa389 all: Drop ctxhttp
This package is no longer needed, since we can use the stdlib's
http.NewRequestWithContext.

backend/rclone already did, but it needed a different error check due to
a difference between net/http and ctxhttp.

Also, store the http.Client by value in the REST backend (changed to a
pointer when ctxhttp was introduced) and use errors.WithStack instead
of errors.Wrap where the message was no longer accurate. Errors from
http.NewRequestWithContext will start with "net/http" or "net/url", so
they're easy to identify.
2022-10-14 14:33:49 +02:00
Michael Eischer
1a6160d152
Merge pull request #3880 from MichaelEischer/archiver-savedir-cleanup
archiver: Improve handling of "file xxx already present" error
2022-10-08 21:48:14 +02:00
Michael Eischer
5278ab51c8 archiver: Check that duplicates are only ignored if identical 2022-10-08 21:38:36 +02:00
Michael Eischer
403b01b788 backup: Only return a warning for duplicate directory entries
The backup command failed if a directory contains duplicate entries.
Downgrade the severity of this problem from fatal error to a warning.
This allows users to still create a backup.
2022-10-08 21:38:21 +02:00
Michael Eischer
d7d7b4ab27 archiver: refactor TreeSaverTest 2022-10-08 21:29:32 +02:00
Michael Eischer
8e38c43c27 archiver: let FutureNode.Take return an error if no data is available
This ensures that we cannot accidentally store an invalid node.
2022-10-08 21:28:39 +02:00
Michael Eischer
2b88cd6eab archiver: Restructure SaveTree to work like SaveDir
SaveTree did not use the TreeSaver but rather managed the tree
collection and upload itself. This prevents using the parallelism
offered by the TreeSaver and duplicates all related code. Using the
TreeSaver can provide some speed-ups as all steps within the backup tree
now rely on FutureNodes. This can be especially relevant for backups
with large amounts of explicitly specified files.

The main difference between SaveTree and SaveDir is, that only the
former can save tree blobs in which nodes have a different name than the
actual file on disk. This is the result of resolving name conflicts
between multiple files with the same name. The filename that must be
used within the snapshot is now passed directly to
restic.NodeFromFileInfo. This ensures that a FutureNode already contains
the correct filename.
2022-10-08 21:28:39 +02:00
Michael Eischer
2e3f1c08c5 repository: split index into a separate package 2022-10-08 21:15:34 +02:00
Michael Eischer
5760ba6989
Merge pull request #3949 from MichaelEischer/simplify-mixedpacks
repository: remove IsMixedPack and add replacement for checker
2022-10-08 21:14:14 +02:00
Michael Eischer
5600f11696 rclone: Fix stderr handling if command exits unexpectedly
According to the documentation of exec.Cmd Wait() must not be called
before completing all reads from the pipe returned by StdErrPipe(). Thus
return a context that is canceled once rclone has exited and use that as
a precondition to calling Wait(). This should ensure that all errors
printed to stderr have been copied first.
2022-10-08 20:16:06 +02:00
Michael Eischer
b8acad4da0 rclone: return rclone error instead of canceled context
When rclone fails during the connection setup this currently often
results in a context canceled error. Replace this error with the exit
code from rclone.
2022-10-08 20:15:24 +02:00
greatroar
07e5c38361 errors: Drop Cause in favor of Go 1.13 error handling
The only use cases in the code were in errors.IsFatal, backend/b2,
which needs a workaround, and backend.ParseLayout. The last of these
requires all backends to implement error unwrapping in IsNotExist.
All backends except gs already did that.
2022-10-08 13:08:08 +02:00
Michael Eischer
4bb5240720 repository: remove unused PrefixLength 2022-10-03 12:15:53 +02:00
Michael Eischer
999fe29976 repository: hide prepareCache 2022-10-03 12:15:53 +02:00
Michael Eischer
ddcf549eba repository: remove IsMixedPack and add replacement for checker
Repositories with mixed packs are probably quite rare by now. When
loading data blobs from a mixed pack file, this will no longer trigger
caching that file. However, usually tree blobs are accessed first such
that this shouldn't make much of a difference.

The checker gets a simpler replacement.
2022-10-03 12:03:59 +02:00
Michael Eischer
401e432e9d lock: Do not ignore invalid lock files
While searching for lock file from concurrently running restic
instances, restic ignored unreadable lock files. These can either be
in fact invalid or just be temporarily unreadable. As it is not really
possible to differentiate between both cases, just err on the side of
caution and consider the repository as already locked.

The code retries searching for other locks up to three times to smooth
out temporarily unreadable lock files.
2022-10-03 00:19:46 +02:00
Michael Eischer
d92957dd78 lock: Implement strict lock expiry monitoring
Restic continued e.g. a backup task even when it failed to renew the
lock or failed to do so in time. For example if a backup client enters
standby during the backup this can allow other operations like `prune`
to run in the meantime (after calling `unlock`). After leaving standby
the backup client will continue its backup and upload indexes which
refer pack files that were removed in the meantime.

This commit introduces a goroutine explicitly monitoring for locks that
are not refreshed in time. To simplify the implementation there's now a
separate goroutine to refresh the lock and monitor for timeouts for each
lock. The monitoring goroutine would now cause the backup to fail as the
client has lost it's lock in the meantime.

The lock refresh goroutines are bound to the context used to lock the
repository initially. The context returned by `lockRepo` is also
cancelled when any of the goroutines exits. This ensures that the
context is cancelled whenever for any reason the lock is no longer
refreshed.
2022-10-03 00:19:46 +02:00
Michael Eischer
7ce4cb7908
Merge pull request #3947 from MichaelEischer/fix-cache-verify-test
cache: Fix file descriptor leak in TestBackendRemoveBroken
2022-10-03 00:19:26 +02:00
Michael Eischer
430ab32941 cache: Fix file descriptor leak in TestBackendRemoveBroken 2022-10-03 00:06:44 +02:00
Michael Eischer
2e606ca70b backup: rework read concurrency 2022-10-02 22:55:14 +02:00
Michael Eischer
9ec7eee803
Merge pull request #3521 from MichaelEischer/redownload-broken-files
Redownload files with wrong hash
2022-10-02 22:50:03 +02:00
Michael Eischer
e89fc2a29d
Merge pull request #3943 from MichaelEischer/find-match-only-valid-ids
ignore filenames which are not IDs when expanding a prefix
2022-09-27 20:56:48 +02:00
Michael Eischer
5d3c5b9e50 restic: ignore filenames which are not IDs when expanding a prefix
Some backends generate additional files for each existing file, e.g.

1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef
1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef.sha256

For some commands this leads to an "multiple IDs with prefix" error when
trying to reference a snapshot.
2022-09-27 20:30:40 +02:00
Leo R. Lundgren
ebe9f2c969 rclone/sftp: Improve handling of ErrDot errors
Restic now yields a more informative error message when exec.ErrDot occurs.
2022-09-25 16:19:03 +02:00
Michael Eischer
34c1a83340 cache: Drop cache entry if it cannot be processed
Failing to process data requested from the cache usually indicates a
problem with the returned data. Assume that the cache entry is somehow
damaged and retry downloading it once.
2022-09-25 11:55:09 +02:00
Michael Eischer
aa3b1925b4 cache: Simplify loadFromCacheOrDelegate 2022-09-25 11:35:35 +02:00
Michael Eischer
5c6b6edefe retry index, lock and snapshot loading on hash mismatch 2022-09-25 11:35:35 +02:00
Michael Eischer
822422ef03 retry key loading on hash mismatch 2022-09-25 11:35:35 +02:00
Michael Eischer
78d2312ee9
Merge pull request #3854 from MichaelEischer/sparsefiles
restore: Add support for sparse files
2022-09-24 22:04:02 +02:00
Michael Eischer
19afad8a09 restore: support sparse restores also on windows 2022-09-24 21:39:39 +02:00
Michael Eischer
c147422ba5 repository: special case SaveBlob for all zero chunks
Sparse files contain large regions containing only zero bytes. Checking
that a blob only contains zeros is possible with over 100GB/s for modern
x86 CPUs. Calculating sha256 hashes is only possible with 500MB/s (or
2GB/s using hardware acceleration). Thus we can speed up the hash
calculation for all zero blobs (which always have length
chunker.MinSize) by checking for zero bytes and then using the
precomputed hash.

The all zeros check is only performed for blobs with the minimal chunk
size, and thus should add no overhead most of the time. For chunks which
are not all zero but have the minimal chunks size, the overhead will be
below 2% based on the above performance numbers.

This allows reading sparse sections of files as fast as the kernel can
return data to us. On my system using BTRFS this resulted in about
4GB/s.
2022-09-24 21:39:39 +02:00
Michael Eischer
34fe1362da restorer: move zeroPrefixLen to restic package 2022-09-24 21:39:39 +02:00
Michael Eischer
a5ebd5de4b restorer: Fix race condition in partialFile.WriteAt
The restorer can issue multiple calls to WriteAt in parallel. This can
result in unexpected orderings of the Truncate and WriteAt calls and
sometimes too short restored files.
2022-09-24 21:39:39 +02:00
Michael Eischer
5b6a77058a Enable sparseness only conditionally
We can either preallocate storage for a file or sparsify it. This
detects a pack file as sparse if it contains an all zero block or
consists of only one block. As the file sparsification is just an
approximation, hide it behind a `--sparse` parameter.
2022-09-24 21:20:00 +02:00
greatroar
5d4568d393 Write sparse files in restorer
This writes files by using (*os.File).Truncate, which resolves to the
truncate system call on Unix.

Compared to the naive loop,

	for _, b := range p {
		if b != 0 {
			return false
		}
	}

the optimized allZero is about 10× faster:

name       old time/op    new time/op     delta
AllZero-8    1.09ms ± 1%     0.09ms ± 1%    -92.10%  (p=0.000 n=10+10)

name       old speed      new speed       delta
AllZero-8  3.84GB/s ± 1%  48.59GB/s ± 1%  +1166.51%  (p=0.000 n=10+10)
2022-09-24 21:18:48 +02:00
Michael Eischer
eb83402d39
Merge pull request #3935 from miles170/master
Only display the message if there were locks to be removed
2022-09-24 20:53:13 +02:00
Michael Eischer
ef58ddd7b1
Merge pull request #3923 from MichaelEischer/fix-flaky-cache-test
cache: fix flaky TestFileSaveConcurrent on windows
2022-09-24 20:52:55 +02:00
Michael Eischer
7fc178aaf4 internal/cache: extend description of cache sharing test failure 2022-09-24 13:07:01 +02:00
Miles Liu
1acbda18f8
Only display the message if there were locks to be removed
`restic unlock` now only shows `successfully removed locks` if there were locks to be removed.
In addition, it also reports the number of the removed lock files.
2022-09-24 19:02:24 +08:00
Michael Eischer
da1a359c8b
Merge pull request #3927 from MichaelEischer/faster-index-each
Speed up MasterIndex.Each
2022-09-24 12:35:23 +02:00
Michael Eischer
041a51512a
Merge pull request #3780 from jkmw/fix/2578
Remove existing path before restoring a symlink
2022-09-24 12:34:42 +02:00
Michael Eischer
1ebd57247a repository: optimize MasterIndex.Each
Sending data through a channel at very high frequency is extremely
inefficient. Thus use simple callbacks instead of channels.

> name                old time/op  new time/op  delta
> MasterIndexEach-16   6.68s ±24%   0.96s ± 2%  -85.64%  (p=0.008 n=5+5)
2022-09-24 12:21:59 +02:00
Michael Eischer
825b95e313 repository: add benchmark for MasterIndex.Each 2022-09-24 12:21:59 +02:00
greatroar
1220fe9650 internal/cache: Concurrent use of cache not working on Windows 2022-09-17 19:49:44 +02:00
Jerome Küttner
ef618bdd3f use os.Remove if path already exists on symlink restore 2022-09-14 08:14:31 +02:00