These commands filter the snapshots according to some criteria which
essentially requires loading the index before filtering the snapshots.
Thus create a copy of the snapshots list beforehand and use it later on.
Load tree blobs with more than 50MB only from a single goroutine. Very
large tree blobs with for example 400 MB size can otherwise require
roughly 1GB * streamTreeParallelism memory.
This ensures that restic won't create lots of new lock files without
deleting them later on.
In some cases a Delete operation on a backend can return a "File does
not exist" error even though the Delete operation succeeded. This can
for example be caused by request retries. This caused restic to forget
about the new lock file and continue trying to remove the old (already
deleted) lock file.
Currently, `restic backup` (if a `--parent` is not provided)
will choose the most recent matching snapshot as the parent snapshot.
This makes sense in the usual case,
where we tag the snapshot-being-created with the current time.
However, this doesn't make sense if the user has passed `--time`
and is currently creating a snapshot older than the latest snapshot.
Instead, choose the most recent snapshot
which is not newer than the snapshot-being-created's timestamp,
to avoid any time travel.
Impetus for this change:
I'm using restic for the first time!
I have a number of existing BTRFS snapshots
I am backing up via restic to serve as my initial set of backups.
I initially `restic backup`'d the most recent snapshot to test,
then started backing up each of the other snapshots.
I noticed in `restic cat snapshot <id>` output
that all the remaining snapshots have the most recent as the parent.
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible, for
pack files this is during assembly of the pack file. That way the hash
would even capture corruptions of the temporary pack file on disk.
Ensure that only snapshots made in the past are taken into account when running restic forget with the within switches (--keep-within, --keep-within- hourly, and friends)
Allow keeping hourly/daily/weekly/monthly/yearly snapshots for a given time period.
This adds the following flags/parameters to restic forget:
--keep-within-hourly duration
--keep-within-daily duration
--keep-within-weekly duration
--keep-within-monthly duration
--keep-within-yearly duration
Includes following changes:
- Add tests for --keep-within-hourly (and friends)
- Add documentation for --keep-within-hourly (and friends)
- Add changelog for --keep-within-hourly (and friends)
This assigns an id to each tree root and then keeps track of how many
tree loads (i.e. trees referenced for the first time) are pending per
tree root. Once a tree root and its subtrees were fully processed there
are no more pending tree loads and the tree root is reported as
processed.
The io.Reader interface does not support contexts, such that it is
necessary to embed the context into the backendReaderAt struct. This has
the problem that a reader might suddenly stop working when it's
contained context is canceled. However, this is now problem here as the
reader instances never escape the calling function.
The list operation used by RemoveStaleLocks or RemoveAllLocks will
already be canceled by the passed in context. Therefore we can also just
cancel the remove operation as the unlock command won't process all lock
files anyways.
Makes the following corrections to the "Applying Policy:" output:
- keep the last 1 snapshots snapshots => keep 1 latest snapshots
- keep the last 1 snapshots, 3 hourly, 5 yearly snapshots => keep 1 latest, 3 hourly, 5 yearly snapshots
The seen BlobSet always contained a subset of the entries in blobs.
Thus use blobs instead and avoid the memory overhead of the second set.
Suggested-by: Alexander Weiss <alex@weissfam.de>
If a data blob and a tree blob with the same ID (= same content) exist,
then the checker did not report a data or tree blob as unused when the
blob of the other type was still in use.
- The SaveBlob method now checks for duplicates.
- Moves handling of pending blobs to MasterIndex.
-> also cleans up pending index entries when they are saved in the index
-> when using SaveBlob no need to care about index any longer
- Always check for full index and save it when storing packs.
-> removes the need of an index uploader
-> also removes the verbose "uploaded intermediate index" messages
- The Flush method now also saves the index
- Fix race condition when checking and saving full/non-finalized indexes
The `dump`, `find`, `forget`, `ls`, `mount`, `restore`, `snapshots`,
`stats` and `tag` commands will now take into account multiple
`--host` and `-H` flags.
Windows does not have a concept of a `change time` in the sense as Unix
has it: the field `CreationTime` of the `Win32FileAttributeData` struct
is not updated when attributes or content is changed. So from now on
we're using the `LastWriteTime` as the `change time` on Windows.
Sometimes restic gets bogus timestamps which cannot be converted to
JSON, because the stdlib JSON encoder returns an error if the year is
not within [0, 9999]. We now make sure that we at least record _some_
timestamp and cap the year either to 0000 or 9999. Before, restic would
refuse to save the file at all, so this improves the status quo.
This fixes#2174 and #1173
This commit is a followup to the addition of the --group-by flag for the
snapshots command. Adding the grouping code there introduced duplicated
code (the forget command also does grouping). This commit refactors
boths sides to only use shared code.
This commit changes the signatures for repository.LoadAndDecrypt and
utils.LoadAll to allow passing in a []byte as the buffer to use. This
buffer is enlarged as needed, and returned back to the caller for
further use.
In later commits, this allows reducing allocations by reusing a buffer
for multiple calls, e.g. in a worker function.
Make restic forget --keep-within accept time ranges measured in hours and choose
accordingly which snapshots to keep and which to forget. Add relative tests.
This commit fixes a bug introduced in
e9ea268847: When an invalid lock is
encountered (e.g. if the file is empty), the code used to ignore that,
but now returns the error.
Now, invalid files are ignored for the normal lock check, and removed
when `restic unlock --remove-all` is run.
Closes#1652
As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346)
this changes the signature for `backend.Save()`. It now takes a
parameter of interface type `RewindReader`, so that the backend
implementations or our `RetryBackend` middleware can reset the reader to
the beginning and then retry an upload operation.
The `RewindReader` interface also provides a `Length()` method, which is
used in the backend to get the size of the data to be saved. This
removes several ugly hacks we had to do to pull the size back out of the
`io.Reader` passed to `Save()` before. In the `s3` and `rest` backend
this is actively used.
This is a bug fix: Before, when the worker function fn in List() of the
RetryBackend returned an error, the operation is retried with the next
file. This is not consistent with the documentation, the intention was
that when fn returns an error, this is passed on to the caller and the
List() operation is aborted. Only errors happening on the underlying
backend are retried.
The error leads to restic ignoring exclusive locks that are present in
the repo, so it may happen that a new backup is written which references
data that is going to be removed by a concurrently running `prune`
operation.
The bug was reported by a user here:
https://forum.restic.net/t/restic-backup-returns-0-exit-code-when-already-locked/484
When looking up a blob in the master index, with several
indexes present in the master index, a significant amount of time
is spent generating errors for each failed lookup. However, these
errors are often used to check if a blob is present, but the contents
are not inspected making the overhead of the error not useful.
Instead, change Index.Lookup (and Index.LookupSize) to instead return
a boolean denoting if the blob was found instead of an error. Also change
all the calls to these functions to handle the new function signature.
benchmark old ns/op new ns/op delta
BenchmarkMasterIndexLookupSingleIndex-6 820 897 +9.39%
BenchmarkMasterIndexLookupMultipleIndex-6 12821 2001 -84.39%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 5378 492 -90.85%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 17026 1649 -90.31%
benchmark old allocs new allocs delta
BenchmarkMasterIndexLookupSingleIndex-6 9 9 +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6 59 19 -67.80%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 22 6 -72.73%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 72 16 -77.78%
benchmark old bytes new bytes delta
BenchmarkMasterIndexLookupSingleIndex-6 160 160 +0.00%
BenchmarkMasterIndexLookupMultipleIndex-6 3200 240 -92.50%
BenchmarkMasterIndexLookupSingleIndexUnknown-6 1232 48 -96.10%
BenchmarkMasterIndexLookupMultipleIndexUnknown-6 4272 128 -97.00%
Add a RESTIC_PROGRESS_FPS environment variable to limit the interval
at which the progress indicator updates (allowed values: 1-60).
The default rate of 60 FPS can cause high terminal CPU load on some
systems, like iTerm2 on macOS with font anti-aliasing enabled.
Usage:
RESTIC_PROGRESS_FPS=1 restic ...
RESTIC_PROGRESS_FPS=60 restic ...
- be explicit when discarding returned errors from .Close(), etc.
- remove named return values from funcs when naked return not used
- fix some "err" shadowing when redeclaration not needed
This commits adds rudimentary support for a cache directory, enabled by
default. The cache directory is created if it does not exist. The cache
is used if there's anything in it, newly created snapshot and index
files are written to the cache automatically.
An exclude filter is basically a 'wildcard but foo', so even if a
childMayMatch, other children of a dir may not, therefore childMayMatch
does not matter, but we should not go down unless the dir is selected
for restore.
This improves restore performance by several orders of magniture by not
going through the whole tree recursively when we can anticipate that no
match will ever occur.