forked from TrueCloudLab/restic
doc: rewrite troubleshooting section
This commit is contained in:
parent
7c8dd61e8c
commit
9c64a95df8
1 changed files with 154 additions and 71 deletions
|
@ -14,98 +14,181 @@
|
||||||
Troubleshooting
|
Troubleshooting
|
||||||
#########################
|
#########################
|
||||||
|
|
||||||
Being a backup software, the repository format ensures that the data saved in the repository
|
The repository format used by restic is designed to be error resistant. In
|
||||||
is verifiable and error-restistant. Restic even implements some self-healing functionalities.
|
particular, commands like, for example, ``backup`` or ``prune`` can be interrupted
|
||||||
|
at *any* point in time without damaging the repository. You might have to run
|
||||||
|
``unlock`` manually though, but that's it.
|
||||||
|
|
||||||
However, situations might occur where your repository gets in an incorrect state and measurements
|
However, a repository might be damaged if some of its files are damaged or lost.
|
||||||
need to be done to get you out of this situation. These situations might be due to hardware failure,
|
This can occur due to hardware failures, accidentally removing files from the
|
||||||
accidentially removing files directly from the repository or bugs in the restic implementation.
|
repository or bugs in the implementation of restic.
|
||||||
|
|
||||||
This document is meant to give you some hints about how to recover from such situations.
|
The following steps will help you recover a repository. This guide does not cover
|
||||||
|
all possible types of repository damages. Thus, if the steps do not work for you
|
||||||
|
or you are unsure how to proceed, then ask for help. Please always include the
|
||||||
|
check output discussed in the next section and what steps you've taken to repair
|
||||||
|
the repository so far.
|
||||||
|
|
||||||
1. Stay calm and don't over-react
|
* `Forum <https://forum.restic.net/>`_
|
||||||
********************************************
|
* Our IRC channel ``#restic`` on ``irc.libera.chat``
|
||||||
|
|
||||||
The most important thing if you find yourself in the situation of a damaged repository is to
|
Make sure that you **use the latest available restic version**. It can contain
|
||||||
stay calm and don't do anything you might regret later.
|
bugfixes, and improvements to simplify the repair of a repository. It might also
|
||||||
|
contain a fix for your repository problems!
|
||||||
The following point should be always considered:
|
|
||||||
|
|
||||||
- Make a copy of you repository and try to recover from that copy. If you suspect a storage failure,
|
|
||||||
it may be even better, to make *two* copies: one to get all data out of the possibly failing storage
|
|
||||||
and another one to try the recovery process.
|
|
||||||
- Pause your regular operations on the repository or let them run on a copy. You will especially make
|
|
||||||
sure that no `forget` or `prune` is run as these command are supposed to remove data and may result
|
|
||||||
in data loss.
|
|
||||||
- Search if your issue is already known and solved. Good starting points are the restic forum and the
|
|
||||||
github issues.
|
|
||||||
- Get you some help if you are unsure what to do. Find a colleage or friend to discuss what should be done.
|
|
||||||
Also feel free to consult the restic forum.
|
|
||||||
- When using the commands below, make sure you read and understand the documentation. Some of the commands
|
|
||||||
may not be your every-day commands, so make sure you really understand what they are doing.
|
|
||||||
|
|
||||||
|
|
||||||
2. `check` is your friend
|
1. Find out what is damaged
|
||||||
********************************************
|
***************************
|
||||||
|
|
||||||
Run `restic check` to find out what type of error you have. The results may be technical but can give you
|
The first step is always to check the repository.
|
||||||
a good hint what's really wrong.
|
|
||||||
|
|
||||||
Moreover, you can always run a `check` to ensure that your repair really was sucessful and your repository
|
.. code-block:: console
|
||||||
is in a sane state again.
|
|
||||||
But make sure that your needed data is also still contained in your repository ;-)
|
|
||||||
|
|
||||||
Note that `check` also prints out warning in some cases. These warnings point out that the repo may be
|
$ restic check --read-data
|
||||||
optimized but is still in perfect shape and does not need any troubleshooting.
|
|
||||||
|
|
||||||
3. Index trouble -> `repair index`
|
using temporary cache in /tmp/restic-check-cache-1418935501
|
||||||
********************************************
|
repository 12345678 opened (version 2, compression level auto)
|
||||||
|
created new cache in /tmp/restic-check-cache-1418935501
|
||||||
|
create exclusive lock for repository
|
||||||
|
load indexes
|
||||||
|
check all packs
|
||||||
|
check snapshots, trees and blobs
|
||||||
|
error for tree 7ef8ebab:
|
||||||
|
id 7ef8ebabc59aadda1a237d23ca7abac487b627a9b86508aa0194690446ff71f6 not found in repository
|
||||||
|
[0:02] 100.00% 7 / 7 snapshots
|
||||||
|
read all data
|
||||||
|
[0:05] 100.00% 25 / 25 packs
|
||||||
|
Fatal: repository contains errors
|
||||||
|
|
||||||
A common problem with broken repostories is that the index does no longer correctly represent the contents
|
.. note::
|
||||||
of your pack files. This is especially the case if some pack files got lost.
|
|
||||||
`repair index` recovers this situation and ensures that the index exactly represents the pack files.
|
|
||||||
|
|
||||||
You might even need to manually remove corrupted pack files. In this case make sure, you run
|
This will download the whole repository. If retrieving data from the backend is
|
||||||
`restic repair index` after.
|
expensive, then omit the ``--read-data`` option. Keep a copy of the check output
|
||||||
|
as it might be necessary later on!
|
||||||
|
|
||||||
Also if you encounter problems with the index files itselves, `repair index` will solve these problems
|
If the output contains warnings that the ``ciphertext verification failed`` for
|
||||||
immediately.
|
some blobs in the repository, then please ask for help in the forum or our IRC
|
||||||
|
channel. These errors are often caused by hardware problems which **must** be
|
||||||
|
investigated and fixed. Otherwise, the backup will be damaged again and again.
|
||||||
|
|
||||||
However, rebuilding the index does not solve every problem, e.g. lost pack files.
|
Similarly, if a repository is repeatedly damaged, please open an `issue on Github
|
||||||
|
<https://github.com/restic/restic/issues/new/choose>`_ as this could indicate a bug
|
||||||
|
somewhere. Please include the check output and additional information that might
|
||||||
|
help locate the problem.
|
||||||
|
|
||||||
4. Delete unneeded defect snapshots -> `forget`
|
|
||||||
********************************************
|
|
||||||
|
|
||||||
If you encounter defect snapshots but realize you can spare them, it is often a good idea to simply
|
2. Backup the repository
|
||||||
delete them using `forget`. In case that your repository remains with just sane snapshots (including
|
************************
|
||||||
all trees and files) the next `prune` run will put your repository in a sane state.
|
|
||||||
|
|
||||||
This can be also used if you manage to create new snapshots which can replace the defect ones, see
|
Create a full copy of the repository if possible. Or at the very least make a
|
||||||
below.
|
copy of the ``index`` and ``snapshots`` folders. This will allow you to roll back
|
||||||
|
the repository if the repair procedure fails. If your repository resides in a
|
||||||
|
cloud storage, then you can for example use `rclone <https://rclone.org/>`_ to
|
||||||
|
make such a copy.
|
||||||
|
|
||||||
5. No fear to `backup` again
|
Please disable all regular operations on the repository to prevent unexpected
|
||||||
********************************************
|
changes. Especially, ``forget`` or ``prune`` must be disabled as they could
|
||||||
|
remove data unexpectedly.
|
||||||
|
|
||||||
There are quite some self-healing mechanisms withing the `backup` command. So it is always a good idea to
|
.. warning::
|
||||||
backup again and check if this did heal your repository.
|
|
||||||
If you realize that a specific file is broken in your repository and you have this file, any run of
|
|
||||||
`backup` which includes that file will be able to heal the situation.
|
|
||||||
|
|
||||||
Note that `backup` relies on a correct index state, so make sure your index is fine or run `repair index`
|
If you suspect hardware problems, then you *must* investigate those first.
|
||||||
before running `backup`.
|
Otherwise, the repository will soon be damaged again.
|
||||||
|
|
||||||
6. Unreferenced tree -> `recover`
|
Please take the time to understand what the commands described in the following
|
||||||
********************************************
|
do. If you are unsure, then ask for help in the forum or our IRC channel. Search
|
||||||
|
whether your issue is already known and solved. Please take a look at the
|
||||||
|
`forum`_ and `Github issues <https://github.com/restic/restic/issues>`_.
|
||||||
|
|
||||||
If for some reason you have unreferenced trees in your repository but you actually need them, run
|
|
||||||
`recover` it will generate a new snapshot which allows access to all trees that you have in your
|
|
||||||
repository.
|
|
||||||
|
|
||||||
Note that `recover` relies on a correct index state, so make sure your index is fine or run `repair index`
|
3. Repair the index
|
||||||
before running `recover`.
|
*******************
|
||||||
|
|
||||||
7. Repair defect snapshots using `repair`
|
Restic relies on its index to contain correct information about what data is
|
||||||
********************************************
|
stored in the repository. Thus, the first step to repair a repository is to
|
||||||
|
repair the index:
|
||||||
|
|
||||||
If all other things did not help, you can repair defect snapshots with `repair`. Note that the repaired
|
.. code-block:: console
|
||||||
snapshots will miss data which was referenced in the defect snapshot.
|
|
||||||
|
$ restic repair index
|
||||||
|
|
||||||
|
repository a14e5863 opened (version 2, compression level auto)
|
||||||
|
loading indexes...
|
||||||
|
getting pack files to read...
|
||||||
|
removing not found pack file 83ad44f59b05f6bce13376b022ac3194f24ca19e7a74926000b6e316ec6ea5a4
|
||||||
|
rebuilding index
|
||||||
|
[0:00] 100.00% 27 / 27 packs processed
|
||||||
|
deleting obsolete index files
|
||||||
|
[0:00] 100.00% 3 / 3 files deleted
|
||||||
|
done
|
||||||
|
|
||||||
|
This ensures that no longer existing files are removed from the index. All later
|
||||||
|
steps to repair the repository rely on a correct index. That is, you must always
|
||||||
|
repair the index first!
|
||||||
|
|
||||||
|
Please note that it is not recommended to repair the index unless the repository
|
||||||
|
is actually damaged.
|
||||||
|
|
||||||
|
|
||||||
|
4. Run all backups (optional)
|
||||||
|
*****************************
|
||||||
|
|
||||||
|
With a correct index, the ``backup`` command guarantees that newly created
|
||||||
|
snapshots can be restored successfully. It can also heal older snapshots,
|
||||||
|
if the missing data is also contained in the new snapshot.
|
||||||
|
|
||||||
|
Therefore, it is recommended to run all your ``backup`` tasks again. In some
|
||||||
|
cases, this is enough to fully repair the repository.
|
||||||
|
|
||||||
|
|
||||||
|
5. Remove missing data from snapshots
|
||||||
|
*************************************
|
||||||
|
|
||||||
|
If your repository is still missing data, then you can use the ``repair snapshots``
|
||||||
|
command to remove all inaccessible data from the snapshots. That is, this will
|
||||||
|
result in a limited amount of data loss. Using the ``--forget`` option, the
|
||||||
|
command will automatically remove the original, damaged snapshots.
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ restic repair snapshots --forget
|
||||||
|
|
||||||
|
snapshot 6979421e of [/home/user/restic/restic] at 2022-11-02 20:59:18.617503315 +0100 CET)
|
||||||
|
file "/restic/internal/fuse/snapshots_dir.go": removed missing content
|
||||||
|
file "/restic/internal/restorer/restorer_unix_test.go": removed missing content
|
||||||
|
file "/restic/internal/walker/walker.go": removed missing content
|
||||||
|
saved new snapshot 7b094cea
|
||||||
|
removed old snapshot 6979421e
|
||||||
|
|
||||||
|
modified 1 snapshots
|
||||||
|
|
||||||
|
If you did not add the ``--forget`` option, then you have to manually delete all
|
||||||
|
modified snapshots using the ``forget`` command. In the example above, you'd have
|
||||||
|
to run ``restic forget 6979421e``.
|
||||||
|
|
||||||
|
|
||||||
|
6. Check the repository again
|
||||||
|
*****************************
|
||||||
|
|
||||||
|
Phew, we're almost done now. To make sure that the repository has been successfully
|
||||||
|
repaired please run ``check`` again.
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ restic check --read-data
|
||||||
|
|
||||||
|
using temporary cache in /tmp/restic-check-cache-2569290785
|
||||||
|
repository a14e5863 opened (version 2, compression level auto)
|
||||||
|
created new cache in /tmp/restic-check-cache-2569290785
|
||||||
|
create exclusive lock for repository
|
||||||
|
load indexes
|
||||||
|
check all packs
|
||||||
|
check snapshots, trees and blobs
|
||||||
|
[0:00] 100.00% 7 / 7 snapshots
|
||||||
|
read all data
|
||||||
|
[0:00] 100.00% 25 / 25 packs
|
||||||
|
no errors were found
|
||||||
|
|
||||||
|
If the ``check`` command did not complete with ``no errors were found``, then
|
||||||
|
the repository is still damaged. At this point, please ask for help at the
|
||||||
|
`forum`_ or our IRC channel ``#restic`` on ``irc.libera.chat``.
|
||||||
|
|
Loading…
Reference in a new issue