Merge pull request #3656 from lgommans/forget-security

forget: Update docs for readability and append-only considerations
Alexander Neumann 2022-03-24 21:36:19 +01:00 committed by GitHub
commit 6087c4ad75
GPG key ID: 4AEE18F83AFDEB23
4 changed files with 168 additions and 109 deletions


@@ -16,8 +16,9 @@ var cmdForget = &cobra.Command{
 Long: `
 The "forget" command removes snapshots according to a policy. Please note that
 this command really only deletes the snapshot object in the repository, which
-is a reference to data stored there. In order to remove this (now unreferenced)
-data after 'forget' was run successfully, see the 'prune' command.
+is a reference to data stored there. In order to remove the unreferenced data
+after "forget" was run successfully, see the "prune" command. Please also read
+the documentation for "forget" to learn about important security considerations.

 EXIT STATUS
 ===========


@@ -657,7 +657,8 @@ credentials) is encrypted/decrypted locally, then sent/received via
 A more advanced version of this setup forbids specific hosts from removing
 files in a repository. See the `blog post by Simon Ruderich
 <https://ruderich.org/simon/notes/append-only-backups-with-restic-and-rclone>`_
-for details.
+for details and the documentation for the ``forget`` command to learn about
+important security considerations.

 The rclone command may also be hard-coded in the SSH configuration or the
 user's public key, in this case it may be sufficient to just start the SSH


@@ -14,17 +14,16 @@
 Removing backup snapshots
 #########################

-All backup space is finite, so restic allows removing old snapshots.
-This can be done either manually (by specifying a snapshot ID to remove)
-or by using a policy that describes which snapshots to forget. For all
-remove operations, two commands need to be called in sequence:
-``forget`` to remove a snapshot and ``prune`` to actually remove the
-data that was referenced by the snapshot from the repository. This can
-be automated with the ``--prune`` option of the ``forget`` command,
-which runs ``prune`` automatically if snapshots have been removed.
+All backup space is finite, so restic allows removing old snapshots. This can
+be done either manually (by specifying a snapshot ID to remove) or by using a
+policy that describes which snapshots to forget. For all remove operations, two
+commands need to be called in sequence: ``forget`` to remove snapshots, and
+``prune`` to remove the remaining data that was referenced only by the removed
+snapshots. The latter can be automated with the ``--prune`` option of ``forget``,
+which runs ``prune`` automatically if any snapshots were actually removed.

 Pruning snapshots can be a time-consuming process, depending on the
-amount of snapshots and data to process. During a prune operation, the
+number of snapshots and data to process. During a prune operation, the
 repository is locked and backups cannot be completed. Please plan your
 pruning so that there's time to complete it and it doesn't interfere with
 regular backup runs.
@@ -156,67 +155,75 @@ to ``forget``:
 Removing snapshots according to a policy
 ****************************************

-Removing snapshots manually is tedious and error-prone, therefore restic
-allows specifying which snapshots should be removed automatically
-according to a policy. You can specify how many hourly, daily, weekly,
-monthly and yearly snapshots to keep, any other snapshots are removed.
-The most important command-line parameter here is ``--dry-run`` which
-instructs restic to not remove anything but print which snapshots would
-be removed.
+Removing snapshots manually is tedious and error-prone, therefore restic allows
+specifying a policy (one or more ``--keep-*`` options) for which snapshots to
+keep. You can for example define how many hourly, daily, weekly, monthly and
+yearly snapshots to keep, and any other snapshots will be removed.

-When ``forget`` is run with a policy, restic loads the list of all
-snapshots, then groups these by host name and list of directories. The grouping
-options can be set with ``--group-by``, to only group snapshots by paths and
-tags use ``--group-by paths,tags``. The policy is then applied to each group of
-snapshots separately. This is a safety feature.
+.. warning:: If you use an append-only repository with policy-based snapshot
+   removal, some security considerations are important. Please refer to the
+   section below for more information.

-The ``forget`` command accepts the following parameters:
+.. note:: You can always use the ``--dry-run`` option of the ``forget`` command,
+   which instructs restic to not remove anything but instead just print what
+   actions would be performed.
+
+The ``forget`` command accepts the following policy options:

-- ``--keep-last n`` never delete the ``n`` last (most recent) snapshots
-- ``--keep-hourly n`` for the last ``n`` hours in which a snapshot was
-  made, keep only the last snapshot for each hour.
+- ``--keep-last n`` keep the ``n`` last (most recent) snapshots.
+- ``--keep-hourly n`` for the last ``n`` hours which have one or more
+  snapshots, keep only the most recent one for each hour.
 - ``--keep-daily n`` for the last ``n`` days which have one or more
-  snapshots, only keep the last one for that day.
+  snapshots, keep only the most recent one for each day.
 - ``--keep-weekly n`` for the last ``n`` weeks which have one or more
-  snapshots, only keep the last one for that week.
+  snapshots, keep only the most recent one for each week.
 - ``--keep-monthly n`` for the last ``n`` months which have one or more
-  snapshots, only keep the last one for that month.
+  snapshots, keep only the most recent one for each month.
 - ``--keep-yearly n`` for the last ``n`` years which have one or more
-  snapshots, only keep the last one for that year.
+  snapshots, keep only the most recent one for each year.
 - ``--keep-tag`` keep all snapshots which have all tags specified by
   this option (can be specified multiple times).
-- ``--keep-within duration`` keep all snapshots which have been made within
-  the duration of the latest snapshot. ``duration`` needs to be a number of
-  years, months, days, and hours, e.g. ``2y5m7d3h`` will keep all snapshots
-  made in the two years, five months, seven days, and three hours before the
-  latest snapshot.
-- ``--keep-within-hourly duration`` keep all hourly snapshots made within
-  specified duration of the latest snapshot. The duration is specified in
-  the same way as for ``--keep-within`` and the method for determining
-  hourly snapshots is the same as for ``--keep-hourly``.
-- ``--keep-within-daily duration`` keep all daily snapshots made within
+- ``--keep-within duration`` keep all snapshots having a timestamp within
+  the specified duration of the latest snapshot, where ``duration`` is a
+  number of years, months, days, and hours. E.g. ``2y5m7d3h`` will keep all
+  snapshots made in the two years, five months, seven days and three hours
+  before the latest (most recent) snapshot.
+- ``--keep-within-hourly duration`` keep all hourly snapshots made within the
+  specified duration of the latest snapshot. The ``duration`` is specified in
+  the same way as for ``--keep-within`` and the method for determining hourly
+  snapshots is the same as for ``--keep-hourly``.
+- ``--keep-within-daily duration`` keep all daily snapshots made within the
   specified duration of the latest snapshot.
-- ``--keep-within-weekly duration`` keep all weekly snapshots made within
+- ``--keep-within-weekly duration`` keep all weekly snapshots made within the
   specified duration of the latest snapshot.
-- ``--keep-within-monthly duration`` keep all monthly snapshots made within
+- ``--keep-within-monthly duration`` keep all monthly snapshots made within the
   specified duration of the latest snapshot.
-- ``--keep-within-yearly duration`` keep all yearly snapshots made within
+- ``--keep-within-yearly duration`` keep all yearly snapshots made within the
   specified duration of the latest snapshot.
-.. note:: All calendar related ``--keep-*`` options work on the natural time
-   boundaries and not relative to when you run the ``forget`` command. Weeks
-   are Monday 00:00 -> Sunday 23:59, days 00:00 to 23:59, hours :00 to :59, etc.
+.. note:: All calendar related options (``--keep-{hourly,daily,...}``) work on
+   natural time boundaries and *not* relative to when you run ``forget``. Weeks
+   are Monday 00:00 to Sunday 23:59, days 00:00 to 23:59, hours :00 to :59, etc.
+   They also only count hours/days/weeks/etc which have one or more snapshots.
+
+.. note:: All duration related options (``--keep-{within,-*}``) ignore snapshots
+   with a timestamp in the future (relative to when the ``forget`` command is
+   run) and these snapshots will hence not be removed.

 .. note:: Specifying ``--keep-tag ''`` will match untagged snapshots only.

-Multiple policies will be ORed together so as to be as inclusive as possible
-for keeping snapshots.
+When ``forget`` is run with a policy, restic loads the list of all snapshots,
+then groups these by host name and list of directories. The grouping options can
+be set with ``--group-by``, to e.g. group snapshots by only paths and tags use
+``--group-by paths,tags``. The policy is then applied to each group of snapshots
+separately. This is a safety feature to prevent accidental removal of unrelated
+backup sets.

-Additionally, you can restrict removing snapshots to those which have a
-particular hostname with the ``--host`` parameter, or tags with the
-``--tag`` option. When multiple tags are specified, only the snapshots
-which have all the tags are considered. For example, the following command
-removes all but the latest snapshot of all snapshots that have the tag ``foo``:
+Additionally, you can restrict the policy to only process snapshots which have a
+particular hostname with the ``--host`` parameter, or tags with the ``--tag``
+option. When multiple tags are specified, only the snapshots which have all the
+tags are considered. For example, the following command removes all but the
+latest snapshot of all snapshots that have the tag ``foo``:

 .. code-block:: console
@@ -243,21 +250,8 @@ the tag.

     $ restic forget --tag '' --keep-last 1

-All the ``--keep-*`` options above only count
-hours/days/weeks/months/years which have a snapshot, so those without a
-snapshot are ignored.
-
-For safety reasons, restic refuses to act on an "empty" policy. For example,
-if one were to specify ``--keep-last 0`` to forget *all* snapshots in the
-repository, restic will respond that no snapshots will be removed. To delete
-all snapshots, use ``--keep-last 1`` and then finally remove the last
-snapshot ID manually (by passing the ID to ``forget``).
-
-All snapshots are evaluated against all matching ``--keep-*`` counts. A
-single snapshot on 2017-09-30 (Sat) will count as a daily, weekly and monthly.
-
-Let's explain this with an example: Suppose you have only made a backup
-on each Sunday for 12 weeks:
+Let's look at a simple example: Suppose you have only made one backup every
+Sunday for 12 weeks:

 .. code-block:: console
@@ -280,8 +274,8 @@ on each Sunday for 12 weeks:
     ---------------------------------------------------------------
     12 snapshots

-Then ``forget --keep-daily 4`` will keep the last four snapshots for the last
-four Sundays, but remove the rest:
+Then ``forget --keep-daily 4`` will keep the last four snapshots, for the last
+four Sundays, and remove the other snapshots:

 .. code-block:: console
@@ -312,29 +306,86 @@ four Sundays, but remove the rest:
     ---------------------------------------------------------------
     8 snapshots

-The result of the ``forget --keep-daily`` operation does not depend on when it
-is run, it will only count the days for which a snapshot exists. This is a
-safety feature: it prevents restic from removing snapshots when no new ones are
-created. Otherwise, running ``forget --keep-daily 4`` on a Friday (without any
-snapshot Monday to Thursday) would remove all snapshots!
+The processed snapshots are evaluated against all ``--keep-*`` options but a
+snapshot only need to match a single option to be kept (the results are ORed).
+This means that the most recent snapshot on a Sunday would match both hourly,
+daily and weekly ``--keep-*`` options, and possibly more depending on calendar.
-Another example: Suppose you make daily backups for 100 years. Then
-``forget --keep-daily 7 --keep-weekly 5 --keep-monthly 12 --keep-yearly 75``
-will keep the most recent 7 daily snapshots, then 4 (remember, 7 dailies
-already include a week!) last-day-of-the-weeks and 11 or 12
-last-day-of-the-months (11 or 12 depends if the 5 weeklies cross a month).
-And finally 75 last-day-of-the-year snapshots. All other snapshots are
-removed.
+For example, suppose you make one backup every day for 100 years. Then ``forget
+--keep-daily 7 --keep-weekly 5 --keep-monthly 12 --keep-yearly 75`` would keep
+the most recent 7 daily snapshots and 4 last-day-of-the-week ones (since the 7
+dailies already include 1 weekly). Additionally, 12 or 11 last-day-of-the-month
+snapshots will be kept (depending on whether one of them ends up being the same
+as a daily or weekly). And finally 75 or 74 last-day-of-the-year snapshots are
+kept, depending on whether one of them ends up being the same as an already kept
+snapshot. All other snapshots are removed.
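The way several ``--keep-*`` options combine can be sketched as a set union. The Python example below covers only the daily and weekly parts of the policy above and assumes the newest backup falls on a Sunday; ``keep_newest_per_bucket`` is a hypothetical helper, and ISO weeks (Monday to Sunday) stand in for the week boundaries described in this document.

```python
from datetime import date, timedelta


def keep_newest_per_bucket(days, n, bucket):
    """Keep the newest entry of each of the last n buckets that have entries."""
    kept, seen = set(), set()
    for day in sorted(days, reverse=True):  # newest first
        key = bucket(day)
        if key in seen:
            continue  # this bucket already has its newest entry
        if len(seen) == n:
            break
        seen.add(key)
        kept.add(day)
    return kept


newest = date(2022, 3, 20)  # a Sunday, chosen for illustration
backups = [newest - timedelta(days=i) for i in range(60)]  # one backup per day

daily = keep_newest_per_bucket(backups, 7, lambda d: d)
weekly = keep_newest_per_bucket(backups, 5, lambda d: d.isocalendar()[:2])

kept = daily | weekly  # a snapshot is kept if any option matches it
print(len(daily), len(weekly), len(kept))  # 7 5 11
```

Here exactly one weekly snapshot coincides with a daily one, giving 7 + 4 = 11 kept snapshots. If the newest backup fell on another weekday, more than one weekly snapshot could coincide with a daily one and the total would be lower.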
-You might want to maintain the same policy as for the example above, but have
+You might want to maintain the same policy as in the example above, but have
 irregular backups. For example, the 7 snapshots specified with ``--keep-daily 7``
-might be spread over a longer period. If what you want is to keep daily snapshots
-for a week, weekly for a month, monthly for a year and yearly for 75 years, you
-could specify:
-``forget --keep-within-daily 7d --keep-within-weekly 1m --keep-within-monthly 1y
---keep-within-yearly 75y``
-(Note that `1w` is not a recognized duration, so you will have to specify
-`7d` instead)
+might be spread over a longer period. If what you want is to keep daily
+snapshots for the last week, weekly for the last month, monthly for the last
+year and yearly for the last 75 years, you can instead specify ``forget
+--keep-within-daily 7d --keep-within-weekly 1m --keep-within-monthly 1y
+--keep-within-yearly 75y`` (note that `1w` is not a recognized duration, so
+you will have to specify `7d` instead).
+
+For safety reasons, restic refuses to act on an "empty" policy. For example,
+if one were to specify ``--keep-last 0`` to forget *all* snapshots in the
+repository, restic will respond that no snapshots will be removed. To delete
+all snapshots, use ``--keep-last 1`` and then finally remove the last snapshot
+manually (by passing the ID to ``forget``).
+
+Security considerations in append-only mode
+===========================================
+
+.. note:: TL;DR: With append-only repositories, one should specifically use the
+   ``--keep-within`` option of the ``forget`` command when removing snapshots.
+
+To prevent a compromised backup client from deleting its backups (for example
+due to a ransomware infection), a repository service/backend can serve the
+repository in a so-called append-only mode. This means that the repository is
+served in such a way that it can only be written to and read from, while delete
+and overwrite operations are denied. Restic's `rest-server`_ features an
+append-only mode, but few other standard backends do. To support append-only
+with such backends, one can use `rclone`_ as a complement in between the backup
+client and the backend service.
+
+.. _rest-server: https://github.com/restic/rest-server/
+.. _rclone: https://rclone.org/commands/rclone_serve_restic/
+
+To remove snapshots and recover the corresponding disk space, the ``forget``
+and ``prune`` commands require full read, write and delete access to the
+repository. If an attacker has this, the protection offered by append-only
+mode is naturally void. The usual and recommended setup with append-only
+repositories is therefore to use a separate and well-secured client whenever
+full access to the repository is needed, e.g. for administrative tasks such
+as running ``forget``, ``prune`` and other maintenance commands.
+
+However, even with append-only mode active and a separate, well-secured client
+used for administrative tasks, an attacker who is able to add garbage snapshots
+to the repository could bring the snapshot list into a state where all the
+legitimate snapshots risk being deleted by an unsuspecting administrator that
+runs the ``forget`` command with certain ``--keep-*`` options, leaving only the
+attacker's useless snapshots.
+
+For example, if the ``forget`` policy is to keep three weekly snapshots, and
+the attacker adds an empty snapshot for each of the last three weeks, all with
+a timestamp (see the ``backup`` command's ``--time`` option) slightly more
+recent than the existing snapshots (but still within the target week), then the
+next time the repository administrator (or a scheduled job) runs the ``forget``
+command with this policy, the legitimate snapshots will be removed (since the
+policy will keep only the most recent snapshot within each week). Even without
+running ``prune``, recovering data would be messy and some metadata lost.
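The three-weekly-snapshots scenario above can be reproduced with a small model. This is a simplified Python sketch, not restic's implementation: ``keep_weekly`` is a hypothetical helper, snapshots are plain ``(timestamp, label)`` tuples, and the dates are invented for illustration.

```python
from datetime import datetime, timedelta


def keep_weekly(snapshots, n):
    """Simplified ``--keep-weekly n``: newest snapshot of each of the last
    n weeks that contain a snapshot. Snapshots are (timestamp, label) tuples."""
    kept, seen_weeks = [], set()
    for snap in sorted(snapshots, reverse=True):  # newest first
        week = snap[0].isocalendar()[:2]  # (ISO year, ISO week), Monday to Sunday
        if week in seen_weeks:
            continue  # a newer snapshot already represents this week
        if len(seen_weeks) == n:
            break
        seen_weeks.add(week)
        kept.append(snap)
    return kept


# One legitimate backup per week, Sundays at noon.
legit = [(datetime(2022, 3, 6, 12) + timedelta(weeks=i), "legit") for i in range(3)]
# The attacker back-dates one empty snapshot per week, one hour later each.
garbage = [(ts + timedelta(hours=1), "attacker") for ts, _ in legit]

kept = keep_weekly(legit + garbage, 3)
print([label for _, label in kept])  # ['attacker', 'attacker', 'attacker']
```

Every week is represented by the attacker's snapshot because it is the most recent one in that week, so all legitimate snapshots would be selected for removal.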
+To avoid this, ``forget`` policies applied to append-only repositories should
+use the ``--keep-within`` option, as this will keep not only the attacker's
+snapshots but also the legitimate ones. Assuming the system time is correctly
+set when ``forget`` runs, this will allow the administrator to notice problems
+with the backup or the compromised host (e.g. by seeing more snapshots than
+usual or snapshots with suspicious timestamps). This is, of course, limited to
+the specified duration: if ``forget --keep-within 7d`` is run 8 days after the
+last good snapshot, then the attacker can still use that opportunity to remove
+all legitimate snapshots.
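Both outcomes described above, the protection and its limit, can be seen in a small model of ``--keep-within``. This is a Python sketch under stated assumptions rather than restic's code: ``keep_within`` is a hypothetical helper, the duration is a plain ``timedelta``, the exact handling of a snapshot lying precisely on the cutoff is a guess, and snapshots dated after ``now`` are passed through untouched as the note on future timestamps describes.

```python
from datetime import datetime, timedelta


def keep_within(snapshots, within, now):
    """Simplified ``--keep-within``: keep everything newer than
    (latest snapshot - within). Snapshots are (timestamp, label) tuples."""
    past = [s for s in snapshots if s[0] <= now]
    future = [s for s in snapshots if s[0] > now]  # ignored, so never removed
    if not past:
        return sorted(future)
    latest = max(ts for ts, _ in past)
    kept = [s for s in past if s[0] > latest - within]  # boundary is an assumption
    return sorted(kept + future)


week = timedelta(days=7)
# Seven legitimate daily backups, 14 to 20 March at noon (illustrative dates).
legit = [(datetime(2022, 3, 14 + i, 12), "legit") for i in range(7)]

# Case 1: a garbage snapshot is added one hour after the last good one.
early = legit + [(datetime(2022, 3, 20, 13), "attacker")]
kept_early = keep_within(early, week, now=datetime(2022, 3, 21))

# Case 2: the garbage snapshot is added 8 days after the last good one.
late = legit + [(datetime(2022, 3, 28, 13), "attacker")]
kept_late = keep_within(late, week, now=datetime(2022, 3, 29))

print([label for _, label in kept_early].count("legit"))  # 7
print([label for _, label in kept_late])  # ['attacker']
```

In the first case the legitimate snapshots survive next to the attacker's one, which gives the administrator a chance to notice the anomaly. In the second case the window has moved past every legitimate snapshot.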
 Customize pruning
 *****************


@@ -607,7 +607,7 @@ examples of things an adversary could achieve in various circumstances.
 An adversary with read access to your backup storage location could:

 - Attempt a brute force password guessing attack against a copy of the
-  repository (even more reason to use long, 30+ character passwords).
+  repository (please use strong passwords with sufficient entropy).
 - Infer which packs probably contain trees via file access patterns.
 - Infer the size of backups by using creation timestamps of repository objects.
@@ -618,7 +618,7 @@ An adversary with network access could:
 - Determine from where you create your backups (i.e., the location where the
   requests originate).
 - Determine where you store your backups (i.e., which provider/target system).
-- Infer the size of backups by using creation timestamps of repository objects.
+- Infer the size of backups by observing network traffic.

 The following are examples of the implications associated with violating some
 of the aforementioned assumptions.
@@ -629,11 +629,11 @@ system making backups could:
 - Render the entire backup process untrustworthy (e.g., intercept password,
   copy files, manipulate data).
 - Create snapshots (containing garbage data) which cover all modified files
-  and wait until a trusted host has used forget often enough to forget all
+  and wait until a trusted host has used ``forget`` often enough to remove all
   correct snapshots.
-- Create a garbage snapshot for every existing snapshot with a slightly different
-  timestamp and wait until forget has run, thereby removing all correct
-  snapshots at once.
+- Create a garbage snapshot for every existing snapshot with a slightly
+  different timestamp and wait until certain ``forget`` configurations have been
+  run, thereby removing all correct snapshots at once.

 An adversary with write access to your files at the storage location could:
@@ -645,21 +645,27 @@ An adversary with write access to your files at the storage location could:
   the snapshot cannot be restored completely. Restic is not designed to detect
   this attack.

-An adversary who compromises a host system with append-only access to the
-backup repository could:
+An adversary who compromises a host system with append-only (read+write allowed,
+delete+overwrite denied) access to the backup repository could:

+- Capture the password and decrypt backups from the past and in the future
+  (see the "leaked key" example below for related information).
 - Render new backups untrustworthy *after* the host has been compromised
   (due to having complete control over new backups). An attacker cannot delete
   or manipulate old backups. As such, restoring old snapshots created *before*
   a host compromise remains possible.
-
-*Note: It is **not** recommended to ever run forget automatically for an
-append-only backup to which a potentially compromised host has access
-because an attacker using fake snapshots could cause forget to remove
-correct snapshots.*
+- Potentially manipulate the use of the ``forget`` command into deleting all
+  legitimate snapshots, keeping only bogus snapshots added by the attacker.
+  Ransomware might try this in order to leave only one option to get your data
+  back: paying the ransom. For safe use of ``forget``, please see the
+  corresponding documentation on removing backup snapshots and append-only mode.

-An adversary who has a leaked key for a repository which has not been re-encrypted
-could:
+An adversary who has a leaked (decrypted) key for a repository could:

-- Decrypt existing and future backup data. If multiple hosts backup into the same
-  repository, an attacker will get access to the backup data of every host.
+- Decrypt existing and future backup data. If multiple hosts backup into the
+  same repository, an attacker will get access to the backup data of every host.
+  Note that since the local encryption key gives access to the master key, a
+  password change will not prevent this. Changing the master key can currently
+  only be done using the ``copy`` command, which moves the data into a new
+  repository with a new master key, or by making a completely new repository
+  and new backup.