If a pack file is missing try to determine the contained pack ids based
on the repository index. This helps with assessing the damage to a
repository before running `rebuild-index`.
Just passing the list of blobs to packsToBlobs would also work in most
cases, however, it could cause unexpected results when multiple pack
files have the same prefix. Forget found prefixes to prevent this.
Apparently readahead was disabled by default. Enable readahead with the
Linux default size of 128kB. Larger values seem to have no effect.
This can speed up reading from the fuse mount by at least factor 5.
Speedup for a 1G random file stored in a local repository:
(Only one result shown, but times were quite stable, restarted restic
after each command)
$ dd if=/dev/urandom bs=1M count=1024 of=rand
$ shasum -a 256 tmp/rand
75dd9b374e712577d64672a05b8ceee40dfc45dce6321082d2c2fd51d60c6c2d tmp/rand
before: $ time shasum -a 256 fuse/snapshots/latest/tmp/rand
75dd9b374e712577d64672a05b8ceee40dfc45dce6321082d2c2fd51d60c6c2d fuse/snapshots/latest/tmp/rand
real 0m18.294s
user 0m4.522s
sys 0m3.305s
before: $ time cat fuse/snapshots/latest/tmp/rand > /dev/null
real 0m14.924s
user 0m0.000s
sys 0m4.625s
after: $ time shasum -a 256 fuse/snapshots/latest/tmp/rand
75dd9b374e712577d64672a05b8ceee40dfc45dce6321082d2c2fd51d60c6c2d fuse/snapshots/latest/tmp/rand
real 0m6.106s
user 0m3.115s
sys 0m0.182s
after: $ time cat fuse/snapshots/latest/tmp/rand > /dev/null
real 0m3.096s
user 0m0.017s
sys 0m0.241s
This patch adds a `--latest` option to limit snapshots list to the n
last snapshots. It is very similar to the `--last` one but does not
limit to one entry. It also deprecates the `--last` flag usage in
favor of `--latest 1`
Output example:
$ restic snapshots --latest 2
repository 0d3eb989 opened successfully, password is correct
ID Time Host Tags Paths
------------------------------------------------------------
5a33bdcc 2020-12-14 12:30:00 local /home
73887d8e 2020-12-15 12:30:00 local /home
------------------------------------------------------------
2 snapshots
Signed-off-by: Sébastien Gross <seb•ɑƬ•chezwam•ɖɵʈ•org>
Previously the progress bar / status update interval used
stdoutIsTerminal to determine whether it is possible to update the
progress bar or not. However, its implementation differed from the
detection within the backup command which included additional checks to
detect the presence of mintty on Windows. mintty behaves like a terminal
but uses pipes for communication.
This adds stdoutCanUpdateStatus() which calls the same terminal detection
code used by backup. This ensures that all commands consistently switch
between interactive and non-interactive terminal mode.
stdoutIsTerminal() now also returns true whenever stdoutCanUpdateStatus()
does so. This is required to properly handle the special case of mintty.
The `init` and `copy` commands can now use `--repository-file2` flag and
the `$RESTIC_REPOSITORY_FILE2` environment variable.
This also fixes the conflict with the `--repository-file` and `--repo2`
flag.
Tests are added for the initSecondaryGlobalOpts function.
This adds a NOK function to the test helper functions. This NOK tests if
err is not nil, and otherwise fail the test.
With the NOK function a couple of sad paths are tested in the
initSecondaryGlobalOpts function.
In total the tests checks wether the following are passed correct:
- Password
- PasswordFile
- Repo
- RepositoryFile
The following situation must return an error to pass the test:
- no Repo or RepositoryFile defined
- Repo and RepositoryFile defined both
This avoids problems when for some reason the JSON encoding changes.
This also ensures forward compatibility with future restic versions
which might e.g. add new fields to the tree metadata.
This commit changes the error message so that a list of file names is
printed. Before, just the raw map was printed, which is not a great user
interface.
For example `restic find --show-pack-id --blob f78dc991 5b9e4366 ddd8c7d4`
would previously only expand one blob if all of them belong to the same
file.
This assigns an id to each tree root and then keeps track of how many
tree loads (i.e. trees referenced for the first time) are pending per
tree root. Once a tree root and its subtrees were fully processed there
are no more pending tree loads and the tree root is reported as
processed.
When a file system is mounted at a directory, lstat() returns attributes
of the root node of the mounted file system, including the device ID of
the other file system. The previous code used when --one-file-system is
specified excluded the directory itself because of that.
This commit changes the code so that mountpoints are kept as empty
directories, its attributes set to the root note of the mounted file
system. The behavior mimics `tar`, which does the same.
Note that this fix only solves the statistics problem, if
all duplicates are marked for repacking.
If not all duplicates are marked for repacking, we lack the
information which
The situation that not all duplicates are marked for repacking can occur
when using the `max-repack-size` option
UnusedBlobs now directly reads the list of existing blobs from the
repository index. This removes the need for the blobStatusExists flag,
which in turn allows converting the blobRefs map into a BlobSet.
Add a callback to the PruneOptions struct which calculates the number of
bytes allowed to be unused after prune is done. This way, the logic is
closer to the option parsing code.
Also, add an explicit option `unlimited` for the use case when storage
does not matter but bandwidth and time do. Internally, this sets the
maximum number of unused bytes to MaxUint64.
Rework the documentation slightly so that no more "packs" are
mentioned and it talks about "files" instead.
Make it clear in the documentation that the percentage given to
`--max-unused` is relative to the whole repository size after pruning is
done. If specified, it must be below 100%, otherwise the repository
would contain 100% of unused data, which is pointless.
I had a hard time coming up with the correct formula to calculate the
maximum number of unused bytes based on the number of used bytes. For a
fraction `p` (0 ≤ p < 1), a repo with `u` bytes used, and the number of
unused bytes `x` the following holds:
x ≤ p * (u+x)
⇔ x ≤ p*u + p*x
⇔ x - p*x ≤ p*u
⇔ x * (1-p) ≤ p*u
⇔ x ≤ p/(1-p) * u
The VSS support works for 32 and 64-bit windows, this includes a check that
the restic version matches the OS architecture as required by VSS. The backup
operation will fail the user has not sufficient permissions to use VSS.
Snapshotting volumes also covers mountpoints but skips UNC paths.
The io.Reader interface does not support contexts, such that it is
necessary to embed the context into the backendReaderAt struct. This has
the problem that a reader might suddenly stop working when it's
contained context is canceled. However, this is now problem here as the
reader instances never escape the calling function.
Now that lockRepo receives a context, it is possible that it is canceled
before a lock was created. Thus `unlockRepo` must be able to handle this
case.
The archiver first called the Select function for a path before checking
whether the Lstat on that path actually worked. The RejectFuncs in
exclude.go worked around this by checking whether they received a nil
os.FileInfo. Checking first is more obvious and requires less code.
This removes the requirement on `restic self-update --output` to point
to a path of an existing file, to overwrite. In case the specified
path does exist we still want to verify that it's a regular file,
rather than a directory or a device, which gets overwritten.
We also want to verify that a path to a new file exists within an
existing directory. The alternative being running into that issue
after the actual download, etc has completed.
While at it I also replace `errors.Errorf` with the more appropriately
verbose `errors.Fatalf`.
Resolves#2491
As an alternative to -r, this allows to read the repository URL
from a file in order to prevent certain types of information leaks,
especially for URLs containing credentials.
Fixes#1458, fixes#2900.
This allows creating multiple repositories with identical chunker
parameters which is required for working deduplication when copying
snapshots between different repositories.
When the diff calculation compares two trees with identical id then no
differences between them can ever show up. Optimize for that case by
simply traversing the tree only once to collect all referenced blobs for
a proper calculation of added and removed blobs.
Just skipping the common subtrees is not possible as this would skew the
results if the added or removed blobs are shared with one of the
subtrees.
There was an issue that prevented the dump command from working
correctly when either:
* `/` contained multiple nodes (e.g. `restic backup /`)
* dumping a file in the first sublevel was attempted (e.g. `/foo`)
The help messages suggested that the `ls` command work without
explicitly passing a snapshot ID. However, this was never the case:
without a snapshot ID the command just failed with the error
`Ignoring "", it is not a snapshot id`.
Fixes#2299
Use the `Original` field of the copied snapshot to store a persistent
snapshot ID. This can either be the ID of the source snapshot if
`Original` was not yet set or the previous value stored in the
`Original` field. In order to still copy snapshots modified using the
tags command the source snapshot is compared to all snapshots in the
destination repository which have the same persistent ID. Snapshots are
only considered equal if all fields except `Original` and `Parent`
match. That way modified snapshots are still copied while avoiding
duplicate copies at the same time.
The standard UNIX-style ordering of command-line arguments places
optional flags before other positional arguments. All of restic's
commands support this ordering, but some of the usage strings showed the
flags after the positional arguments (which restic also parses just
fine). This change updates the doc strings to reflect the standard
ordering.
Because the `restic help` command comes directly from Cobra, there does
not appear to be a way to update the argument ordering in its usage
string, so it maintains the non-standard ordering (positional arguments
before optional flags).
Add a copy command to copy snapshots between repositories. It allows the user
to specify a destination repository, password, password-file, password-command
or key-hint to supply the necessary details to open the destination repository.
You need to supply a list of snapshots to copy, snapshots which already exist
in the destination repository will be skipped.
Note, when using the network this becomes rather slow, as it needs to read the
blocks, decrypt them using the source key, then encrypt them again using the
destination key before finally writing them out to the destination repository.
The old behavior was problematic in the context of rebuild-index as it
could leave old, possibly invalid index files behind without returning a
fatal error.
Prune calls rebuildIndex before removing any data from the repository.
For this use case failing to delete an old index MUST be treated as a
fatal error. Otherwise the index could still contain an old index file
that refers to blobs/packs that were later on deleted by prune. Later
backup runs will assume that the affected blobs already exist in the
repository which results in a backup which misses data.
The previous check only approximately verified whether all required
blobs were found. However, after forgetting a few snapshots the
repository contains lots of unused blobs whose number can be sufficient
to make up for missing packs.
When coupled with a malfunctioning backend that temporarily returns broken
data this could cause restic to regard the corresponding packs as
invalid and thereby delete data that's still in use. This change lets
restic play it safe and refuse to delete anything if data is missing.
Do not lock the repository if --no-lock global flag is set. This allows
to mount repositories which are archived on a read only system.
Signed-off-by: Sébastien Gross <seb•ɑƬ•chezwam•ɖɵʈ•org>
The seen BlobSet always contained a subset of the entries in blobs.
Thus use blobs instead and avoid the memory overhead of the second set.
Suggested-by: Alexander Weiss <alex@weissfam.de>
If a data blob and a tree blob with the same ID (= same content) exist,
then the checker did not report a data or tree blob as unused when the
blob of the other type was still in use.
The backup command used to return a zero exit code as long as a snapshot
could be created successfully, even if some of the source files could not
be read (in which case the snapshot would contain the rest of the files).
This made it hard for automation/scripts to detect failures/incomplete
backups by looking at the exit code. Restic now returns the following exit
codes for the backup command:
- 0 when the command was successful
- 1 when there was a fatal error (no snapshot created)
- 3 when some source data could not be read (incomplete snapshot created)
That site might not have supported https:// when those links were
originally added. It does now.
Also dropping the _spec.html_ ending of the url, there being a `<link
rel="canonical" ...>` tag suggesting that that no longer being the
preferred address.
cmd/restic/globals.go already provides Printf, Println and Warnf wrapper
which get their output streams from the globalOptions object. This
allows for stream replacements when testing.
- The SaveBlob method now checks for duplicates.
- Moves handling of pending blobs to MasterIndex.
-> also cleans up pending index entries when they are saved in the index
-> when using SaveBlob no need to care about index any longer
- Always check for full index and save it when storing packs.
-> removes the need of an index uploader
-> also removes the verbose "uploaded intermediate index" messages
- The Flush method now also saves the index
- Fix race condition when checking and saving full/non-finalized indexes
errors.Fatalf wraps a error and just keeps an error message as a string.
This prevents the `restic.IsAlreadyLocked(err)` check from working as
the error is no longer an ErrAlreadyLocked.
Just add an additional remark to the error using `errors.WithMessage`.
This command can only be built on Darwin, FreeBSD and Linux
(and if we upgrade bazil.org/fuse, only FreeBSD and Linux:
https://github.com/bazil/fuse/issues/224).
Listing the few supported operating systems explicitly here makes
porting restic to new platforms easier.
`term.Print` sends the output via a channel to a goroutine which
actually prints the message. This may race with the password prompt
printed by `OpenRepository` resulting in a missing prompt.
restic uses a cleanup hook to ensure that it restores the terminal
configuration to a sane state, when restic is interrupted while reading
a password from the terminal. However, this causes a problem, when
restic runs in a background job, as reconfiguring a terminal will cause
a SIGTTOU to be sent to restic pausing it. Therefore, restic seems to
hang on shutdown when it was running in the background.
This commit changes the behavior to only restore the terminal
configuration if restic was interrupted while reading a password from
the terminal. As reading a password from the terminal requires that
restic is in the foreground, this should avoid restic getting stopped.
Fixes#2298
Issue introduced in #402
In a damaged repository with a missing blob, the error message tried to
dereference the subtreeID field of the current node, which is a file
however. Said field is set to nil for a file thus causing a segfault
when dereferenced.
Fix this by using the actual parentTreeID.
The username and hostname for new keys can be specified with the new
--user and --host flags, respectively. The flags are used only by the
`key add` command and are otherwise ignored.
This allows adding keys with for a desired user and host without having
to run restic as that particular user on that particular host, making
automated key management easier.
Co-authored-by: James TD Smith <ahktenzero@mohorovi.cc>
The `dump`, `find`, `forget`, `ls`, `mount`, `restore`, `snapshots`,
`stats` and `tag` commands will now take into account multiple
`--host` and `-H` flags.
internal/ui/jsonstatus and termstatus sound similar but are not related
in any way. Instead `internal/ui/backup` and `internal/ui/jsonstatus/status`
are the counterparts. Rename the latter to `internal/ui/json/backup` to
make this clear.
Restic used to quit if the repository password was typed incorrectly once.
Restic will now ask the user again for the repository password if typed incorrectly.
The user will now get three tries to input the correct password before restic quits.
The help text for `restic stats` lists a number of modes in a list.
Make sure the "more info" text is a separate paragraph rather than
being part of the list.
With this change it is possible to dump a folder to stdout as a tar. The
It can be used just like the normal dump command:
`./restic dump fa97e6e1 "/data/test/" > test.tar`
Where `/data/test/` is a a folder instead of a file.
This commit is a followup to the addition of the --group-by flag for the
snapshots command. Adding the grouping code there introduced duplicated
code (the forget command also does grouping). This commit refactors
boths sides to only use shared code.
This commit moves the code which is used to group snapshots in the
snapshots command into an own function to deduplicate code shared by the
snapshots command and forget command.
This commit will add json tags to the structs for json output, so all
json variables of the snapshot command output are lowercase and
snake-case.
Furthermore it adds some internal code changes based on the feedback in
the pull request #2087.
This commit adds a --group-by option to the snapshots command, which
behaves similar to the --group-by option of forget. Valid option values
are "host, paths, tags". If this option is given, the output of
snapshots will be divided into multiple tables, according to the value
given (i.e. "host" will create a table of snapshots for each host, that
has a snapshot in the list). Also the JSON output will be grouped.
The default behavior (when --group-by is not given) has not changed.
More to this discussion can be found in issue #2037.
Reading the password from non-terminal stdin used io.ReadFull with a
byte slice of length 1000.
We are now using a Scanner to read one line of input, independent of its
length.
Additionally, if stdin is not a terminal, the password is read only
once instead of twice (in an effort to detect typos).
Fixes#2203
Signed-off-by: Peter Schultz <peter.schultz@classmarkets.com>
This commit changes the signatures for repository.LoadAndDecrypt and
utils.LoadAll to allow passing in a []byte as the buffer to use. This
buffer is enlarged as needed, and returned back to the caller for
further use.
In later commits, this allows reducing allocations by reusing a buffer
for multiple calls, e.g. in a worker function.
This patch makes it more explicit what is meant by the CACHEDIR.TAG file.
It not only has to have this particular name, but also a specific content
(described at http://bford.info/cachedir/spec.html), which is not immediately
obvious to the user.
This adds a test of the json output of the forget command, by running it
once, asking it to keep one snapshot, and verifying that the output has
the right number of snapshots listed in the Keep and Remove fields of
the result.
This commit changes the logic slightly: checking the permissions in the
fuse mount when nobody else besides the current user can access the fuse
mount does not sense. The current user has access to the repo files in
addition to the password, so they can access all data regardless of what
the fuse mount does.
Enabling `--allow-root` allows the root user to access the files in the
fuse mount, for this user no permission checks will be done anyway.
The code now enables `DefaultPermissions` automatically when
`--allow-other` is set, it can be disabled with
`--no-default-permissions` to restore the old behavior.
This option restores the previous behavior of `mount` by disabling the "DefaultPermissions" FUSE option. This allows any user that can access the mountpoint to read any file from the snapshot. Normal FUSE rules apply, so `allow-root` or `allow-other` can be used to allow users besides the mounting user to access these files.
This enforces the Unix permissions of the snapshot files within the mounted filesystem, which will only allow users to access snapshot files if they had access to the file outside of the snapshot.
Make restic forget --keep-within accept time ranges measured in hours and choose
accordingly which snapshots to keep and which to forget. Add relative tests.
The default value of the `--host` flag was set to 'H' (the shorthand
version of the flag), this caused the snapshot lookup to fail.
Also add shorthand `-H` for `backup` command.
Closes#2040
Some time ago we changed the paths in the repo to always use a slash for
separation, it seems we missed that the `dump` command still uses the
`filepath` package, so on Windows backslashes are used.
Closes#2079
reworked restore error callback to use file location
path instead of much heavier Node. this reduced restore
memory usage by as much as 50% in some of my tests.
Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
Sometimes, users run restic without retaining the local cache
directories. This was reported several times in the past.
Restic will now print a message whenever a new cache directory is
created from scratch (i.e. it did not exist before), so users have a
chance to recognize when the cache is not kept between different runs of
restic.
With --blob, --tree and --pack, the find command now lists the snapshots
that contain a specific tree or blob, or the snapshots that contain
blobs belonging to a given pack.
It also displays the pack ID a blob belongs to.
A list of IDs can be given, as long as the IDs are all of the same type.
This commit adds a command called `self-update` which downloads the
latest released version of restic from GitHub and replacing the current
binary with it. It does not rely on any external program (so it'll work
everywhere), but still verifies the GPG signature using the embedded GPG
public key.
By default, the `self-update` command is hidden behind the `selfupdate`
built tag, which is only set when restic is built using `build.go`. The
reason for this is that downstream distributions will then not include
the command by default, so users are encouraged to use the
platform-specific distribution mechanism.
This commit introduces two functions: withinDir() and
approachingMatchingTree()
Both bind the list of directories with a closure, so we don't need to
iterate over the list in the function passed to Walk(). This reduces the
indentation level and since we can just use return, we don't need the
breaks any more.
The case that len(dirs) == 0 can also be handled by the functions with a
return, which saves another indentation level.
The main function body of the function passed to Walk() was reduced to
three cases:
* Within one of the dirs: Print the node, and if recursive operation is
requested, directly return, so the walker continues recursive
traversal
* Approaching one of the dirs: don't print anything, but continue
recursive traversal.
* Nothing of the two: abort walking this branch of the tree.
Adds a SelectByName method to the archive and scanner which only require
the filename as input, and can thus be run before calling lstat on the
file. Can speed up scanning significantly if a lot of filename excludes
are used.
Before:
$ restic list
Fatal: type not specified
After:
$ restic list
Fatal: type not specified, usage: list [blobs|packs|index|snapshots|keys|locks]
Closes#1783
This patch should fix the following panic when trying to backup the
root filesystem with thre --one-file-system flag:
% restic backup --one-file-system /
(...)
panic: item /, device id 2082 not found, allowedDevs: map[/:2082]
This commit changes how the worker goroutines for saving e.g. blobs
interact. Before, it was possible to get stuck sending an instruction to
archive a file or dir when no worker goroutines were available any more.
This commit introduces a `done` channel for each of the worker pools,
which is set to the channel returned by `tomb.Dying()`, so it is closed
when the first worker returned an error.