The Save methods of the BlobSaver, FileSaver and TreeSaver return early
on when the archiver is stopped due to an error. For that they select on
both the tomb.Dying() and context.Done() channels, which can lead to a
race condition when the tomb is killed due to an error: The tomb first
closes its Dying channel before canceling all child contexts.
Archiver.SaveDir only aborts its execution once the context was
canceled. When the tomb killing is paused between closing its Dying
channel and canceling the child contexts, this lets the
FileSaver/TreeSaver.Save methods return immediately, however, ScanDir
still reads further files causing the test case to fail.
As a killed tomb always cancels all child contexts and as the Savers
always use a context bound to the tomb, it is sufficient to just use
context.Done() as escape hatch in the Save functions. This fixes the
mismatch between SaveDir and Save.
Adjust the tests to use contexts bound to the tomb for all interactions
with the Savers.
The previous implementation was repeating the implementation that is
found inside of io.WriteString. Simplify by making use of the stdlib's
implementation.
The username and hostname for new keys can be specified with the new
--user and --host flags, respectively. The flags are used only by the
`key add` command and are otherwise ignored.
This allows adding keys with for a desired user and host without having
to run restic as that particular user on that particular host, making
automated key management easier.
Co-authored-by: James TD Smith <ahktenzero@mohorovi.cc>
The method only ever receives *hashing.Writers, which don't implement
io.Closer. These come from packerManager.findPacker and have their
actual writers closed in Repository.savePacker. Moving the closing logic
to hashing.Writer results in "file already closed" errors.
When loading a blob, restic first looks up pack files containing the
blob. To avoid unnecessary work an already cached pack file is preferred.
However, if there is only a single pack file to choose from (which is
the normal case) sorting the one-element list won't change anything.
Therefore avoid the unnecessary cache check in that case.
The previous benchmark spent much of its time allocating RNGs and
generating too many random numbers. It now spends 90% of its time
hashing and half of the rest writing to files.
name old time/op new time/op delta
PackerManager-8 319ms ± 1% 247ms ± 1% -22.48% (p=0.000 n=20+18)
name old speed new speed delta
PackerManager-8 143MB/s ± 1% 213MB/s ± 1% +48.63% (p=0.000 n=10+18)
name old alloc/op new alloc/op delta
PackerManager-8 635kB ± 0% 92kB ± 0% -85.48% (p=0.000 n=10+19)
name old allocs/op new allocs/op delta
PackerManager-8 1.64k ± 0% 1.43k ± 0% -12.76% (p=0.000 n=10+20)
Archivers TestMetadataChanged incorrectly clears the Extended Attributes
from the expected metadata of the temporary file. This is incorrect as on
SELinux enabled filesystem, as the kernel will automaticly add a SElinux
label. However, since ExtendedAttributes{} != ExtendedAttributes{nil} we
still need to clear them if there are no attributes found.
The pool was used improperly, causing more allocations to be
performed than without it.
name old time/op new time/op delta
SaveAndEncrypt-8 36.8ms ± 2% 36.9ms ± 2% ~ (p=0.218 n=10+10)
name old speed new speed delta
SaveAndEncrypt-8 114MB/s ± 2% 114MB/s ± 2% ~ (p=0.218 n=10+10)
name old alloc/op new alloc/op delta
SaveAndEncrypt-8 21.1MB ± 0% 21.0MB ± 0% -0.44% (p=0.000 n=10+10)
name old allocs/op new allocs/op delta
SaveAndEncrypt-8 79.0 ± 0% 77.0 ± 0% -2.53% (p=0.000 n=10+10)
The `dump`, `find`, `forget`, `ls`, `mount`, `restore`, `snapshots`,
`stats` and `tag` commands will now take into account multiple
`--host` and `-H` flags.
Much simpler implementation that guarantees each required pack
is downloaded only once (and hence does not need to manage
pack cache). Also improves large file restore performance.
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
internal/archiver.readdir and internal/fs.ReadDir were unused.
internal/fs.ReadDirNames and internal/archiver.readdirnames were doing
nearly the same thing, except one sorted its output and opened with
fs.O_NOFOLLOW. Both were only used in internal/archiver.
Each of the random test files was split into the same five blobs. The
test fails once the fifth blob is passed on to `SaveBlob`. That is for
certain interleavings of goroutine execution it would be possible for
the test to trigger the testErr just after storing the first file.
The fixed test uses a different file content for each of the nine files
and fails after writing the fourth blob. The file content is also small
enough to ensure that for each file only a single blob is saved. This
guarantees that the test cannot fail before reading the first four
files. FileReadConcurrency = 2 allows up to two files queued for
processing. Therefore the test can at most open the sixth file before it
has to save the fourth file / blob which triggers the testErr.
internal/ui/jsonstatus and termstatus sound similar but are not related
in any way. Instead `internal/ui/backup` and `internal/ui/jsonstatus/status`
are the counterparts. Rename the latter to `internal/ui/json/backup` to
make this clear.
jsonstatus wrote the JSON output without synchronization to the
stdio_wrapper which caused mangling between different status lines.
Use the Print and Error methods of termstatus instead which use a
central goroutine to synchronize output.
I was running "golangci-lint" and found this two warnings
internal/checker/checker.go:135:18: (*Checker).LoadIndex$3 - result 0 (error) is always nil (unparam)
final := func() error {
^
internal/repository/repository.go:457:18: (*Repository).LoadIndex$3 - result 0 (error) is always nil (unparam)
final := func() error {
^
It turns out that these functions are used only in "RunWorkers(...)",
which is used only two times in whole project right after this "final"
functions.
And because these "final" functions always return "nil", I've
descided, that it would be better to remove requriments for "final" func
to return error to avoid magick "return nil" at their end.
In some (rare) cases "fake" UID 51234 may exist in a system running a
test. When this is the case, `cmp.Equal(want, node3)` will fail based on
difference between empty string and an actual username present in a
system.
Fixes github issue #2372
Restic used to quit if the repository password was typed incorrectly once.
Restic will now ask the user again for the repository password if typed incorrectly.
The user will now get three tries to input the correct password before restic quits.
Return valid directory info from Lstat() for parent directories of the
specified filename. Previously only "/" and "." were valid directories.
Also set directory mode as this is checked by archiver.
Closes#2063
Windows does not have a concept of a `change time` in the sense as Unix
has it: the field `CreationTime` of the `Win32FileAttributeData` struct
is not updated when attributes or content is changed. So from now on
we're using the `LastWriteTime` as the `change time` on Windows.
Since I could not remember what the value for `Check` means this commit
renames it to `SameFile`: when set to true, the test should make sure
that `FileChanged` should return false (=file is unmodified).
Sometimes restic gets bogus timestamps which cannot be converted to
JSON, because the stdlib JSON encoder returns an error if the year is
not within [0, 9999]. We now make sure that we at least record _some_
timestamp and cap the year either to 0000 or 9999. Before, restic would
refuse to save the file at all, so this improves the status quo.
This fixes#2174 and #1173
This commit is a followup to the addition of the --group-by flag for the
snapshots command. Adding the grouping code there introduced duplicated
code (the forget command also does grouping). This commit refactors
boths sides to only use shared code.
This commit changes the signatures for repository.LoadAndDecrypt and
utils.LoadAll to allow passing in a []byte as the buffer to use. This
buffer is enlarged as needed, and returned back to the caller for
further use.
In later commits, this allows reducing allocations by reusing a buffer
for multiple calls, e.g. in a worker function.
The `s3.storage-class` option can be passed to restic (using `-o`) to
specify the storage class to be used for S3 objects created by restic.
The storage class is passed as-is to S3, so it needs to be understood by
the API. On AWS, it can be one of `STANDARD`, `STANDARD_IA`,
`ONEZONE_IA`, `INTELLIGENT_TIERING` and `REDUCED_REDUNDANCY`. If
unspecified, the default storage class is used (`STANDARD` on AWS).
You can mix storage classes in the same bucket, and the setting isn't
stored in the restic repository, so be sure to specify it with each
command that writes to S3.
Closes#706
This change allows passing no arguments to rclone, using `-o
rclone.args=""`. It is helpful when running rclone remotely via SSH
using a key with a forced command (via `command=` in `authorized_keys`).
This commit changes the internal file system implementation for reading
data from stdin, it now returns an error when no bytes could be read. I
think it's worth failing in this case, the user instructed restic to
read some data from stdin, and no data was read at all. Maybe it was in
a pipe and some earlier stage failed.
See #2135 for a short discussion.
When restic reads the backup from stdin, the number of bytes processed
was always displayed as zero. The reason is that the UI for the archive
uses the total bytes as returned by the scanner, which is zero for
stdin. So instead we keep track of the real number of bytes processed
and print that at the end.
Closes#2136
Make restic forget --keep-within accept time ranges measured in hours and choose
accordingly which snapshots to keep and which to forget. Add relative tests.
don't create fileInfo structs for empty files. this saves memory.
this also avoids extra serial scan of all fileInfo, which should
make restore faster and more consistent.
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
reworked restore error callback to use file location
path instead of much heavier Node. this reduced restore
memory usage by as much as 50% in some of my tests.
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
* uses less memory as common prefix is only stored once
* stepping stone for simpler error callback api, which
will allow further memory footprint reduction
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
When the scanner is slower than the actual backup, the tomb cancels the
context passed to Scan(), which then returns ctx.Err(). In the end, the
main function prints an error message that is not helpful ("Context
cancelled") and exits with an error code although no error occurred.
The code now ignores the error in the context and just uses it for
cancellation. The scanner is not supposed to return an error anyway.
Closes#1978
Sometimes, users run restic without retaining the local cache
directories. This was reported several times in the past.
Restic will now print a message whenever a new cache directory is
created from scratch (i.e. it did not exist before), so users have a
chance to recognize when the cache is not kept between different runs of
restic.
This commit adds a command called `self-update` which downloads the
latest released version of restic from GitHub and replacing the current
binary with it. It does not rely on any external program (so it'll work
everywhere), but still verifies the GPG signature using the embedded GPG
public key.
By default, the `self-update` command is hidden behind the `selfupdate`
built tag, which is only set when restic is built using `build.go`. The
reason for this is that downstream distributions will then not include
the command by default, so users are encouraged to use the
platform-specific distribution mechanism.
Adds a SelectByName method to the archive and scanner which only require
the filename as input, and can thus be run before calling lstat on the
file. Can speed up scanning significantly if a lot of filename excludes
are used.
Accessing beyond the end of the file now removes the file from the cache
because it is assumed to be truncated. Usually, this means that the data
is fetched directly from the backend instead.
traverseTree() was meant to call enterDir() whenever a directory is
selected for restore, either explicitly or implicitly (=contains a file
which is to be restored). After restoring a file, leaveDir() is called
in reverse order for all intermediate directories so that the metadata
can be restored.
When a directory is selected implicitly, the metadata for it is
restored. This is different from the previous restorer behavior, which
created implicitly selected intermediate directories with permissions
0700 (only user can read/write it).
This commit changes the behavior back to the old one. Only a directory
is explicitly selected for restore, enterDir()/leaveDir() are called for
it. Otherwise, only visitNode() is called, so visitNode() needs to make
sure the parent directory exists. If the directory is explicitly
included, leaveDir() will then restore the metadata correctly.
When we decide to change the behavior (restore metadata for all
intermediate directories, even if selected implicitly), we should do
that in the selection functions, not here.
This finally resolves#1870
This fixes#1833, which consists of two different bugs:
* The `defer` in `cacheFile()` may remove a channel from the
`inProgress` map although it is not responsible for downloading the
file
* If the download fails, goroutines waiting for the file to be cached
assumed that the file was there, there was no way to signal the
error.
When the archiver is faster than the scanner, restic deadlocks. This
commit adds a `finished` channel to the struct in `ui/backup.go` so that
scanner results are ignored when the archiver is already finished.
Closes#1834
If our ssh process has died, not only the next, but all subsequent
calls to clientError() should indicate the error.
restic output when the ssh process is killed with "kill -9":
Save(<data/afb68adbf9>) returned error, retrying after 253.661803ms: Write: failed to send packet header: write |1: file already closed
Save(<data/afb68adbf9>) returned error, retrying after 580.752212ms: ssh command exited: signal: killed
Save(<data/afb68adbf9>) returned error, retrying after 790.150468ms: ssh command exited: signal: killed
Save(<data/afb68adbf9>) returned error, retrying after 1.769595051s: ssh command exited: signal: killed
[...]
error in cleanup handler: ssh command exited: signal: killed
Before this patch:
Save(<data/de698d934f>) returned error, retrying after 252.84163ms: Write: failed to send packet header: write |1: file already closed
Save(<data/de698d934f>) returned error, retrying after 660.236963ms: OpenFile: failed to send packet header: write |1: file already closed
Save(<data/de698d934f>) returned error, retrying after 568.049909ms: OpenFile: failed to send packet header: write |1: file already closed
Save(<data/de698d934f>) returned error, retrying after 2.428813824s: OpenFile: failed to send packet header: write |1: file already closed
[...]
error in cleanup handler: failed to send packet header: write |1: file already closed
This commit changes how the worker goroutines for saving e.g. blobs
interact. Before, it was possible to get stuck sending an instruction to
archive a file or dir when no worker goroutines were available any more.
This commit introduces a `done` channel for each of the worker pools,
which is set to the channel returned by `tomb.Dying()`, so it is closed
when the first worker returned an error.
This commit changes the archiver so that low-level errors saving data to
the repo are returned to the caller (instead of being handled by the
error callback function). This correctly bubbles up errors like a full
temp file system and makes restic abort early and makes all other worker
goroutines exit.
This now keeps the cursor at the first column of the first status line
so that messages printed to stdout or stderr by some other part of the
progarm will still be visible. The message will overwrite the status
lines, but those are easily reprinted on the next status update.