Update documentation

Alexander Neumann 2015-01-17 16:57:16 +01:00
parent fb95f02af6
commit dc3e97adc7

@@ -48,8 +48,6 @@ The basic layout of a sample restic repository is shown below:
├── keys
│ └── b02de829beeb3c01a63e6b25cbd421a98fef144f03b9a02e46eff9e2ca3f0bd7
├── locks
├── maps
│ └── 3c0721e5c3f5d2d78a12664b568a1bc992d17b993d41079599f8437ed66192fe
├── snapshots
│ └── 22a5af1bdc6e616f8a29579458c49627e01b32210d09adb288d1ecda7c5711ec
├── tmp
@@ -60,8 +58,11 @@ The basic layout of a sample restic repository is shown below:
│ │ └── 32ea976bc30771cebad8285cd99120ac8786f9ffd42141d452458089985043a5
│ ├── 95
│ │ └── 95f75feb05a7cc73e328b2efa668b1ea68f65fece55a93bc65aff6cd0bcfeefc
│ └── e0
│ └── e01150928f7ad24befd6ec15b087de1b9e0f92edabd8e5cabb3317f8b20ad044
│ ├── b8
│ │ └── b8138ab08a4722596ac89c917827358da4672eac68e3c03a8115b88dbf4bfb59
│ ├── e0
│ │ └── e01150928f7ad24befd6ec15b087de1b9e0f92edabd8e5cabb3317f8b20ad044
│ [...]
└── version
A repository can be initialized with the `restic init` command, e.g.:
@@ -124,8 +125,13 @@ pretty-print the contents of a snapshot file:
Enter Password for Repository:
{
"time": "2015-01-02T18:10:50.895208559+01:00",
"tree": "2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf",
"map": "3c0721e5c3f5d2d78a12664b568a1bc992d17b993d41079599f8437ed66192fe",
"tree": "",
"tree": {
"id": "2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf",
"size": 282,
"sid": "b8138ab08a4722596ac89c917827358da4672eac68e3c03a8115b88dbf4bfb59",
"ssize": 330
},
"dir": "/tmp/testdata",
"hostname": "kasimir",
"username": "fd0",
@@ -134,60 +140,33 @@ pretty-print the contents of a snapshot file:
}
Here it can be seen that this snapshot represents the contents of the directory
`/tmp/testdata`.
The two most important fields are `map` and `tree`.
Maps
----
`/tmp/testdata`. The most important field is `tree`.
All content within a restic repository is referenced according to its SHA-256
hash. Before saving, each file is split into variable sized chunks of data. The
SHA-256 hashes of all chunks are saved in an ordered list which then represents
the content of the file. In order to relate these plain text hashes to the
actual encrypted storage hashes (which vary due to random IVs), each snapshot
references a map.
the content of the file.
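
As a rough illustration of the paragraph above, the following Python sketch hashes each chunk of a file and collects the hashes in an ordered list. It is not restic's code: restic splits files into variable sized chunks, whereas the hypothetical `split_fixed` helper below uses fixed-size chunks purely to keep the example short. The name `content_ids` is an assumption as well.

```python
import hashlib


def split_fixed(data: bytes, size: int) -> list[bytes]:
    """Split data into fixed-size chunks.

    Assumption: this is only a stand-in for a real variable-size chunker.
    """
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_ids(chunks: list[bytes]) -> list[str]:
    """Return the SHA-256 hex digest of every chunk, preserving order."""
    return [hashlib.sha256(chunk).hexdigest() for chunk in chunks]


sample = b"hello world, this is test data"
chunks = split_fixed(sample, 8)
ids = content_ids(chunks)  # ordered list that represents the file content
```

Because the list is ordered, concatenating the chunks it refers to reproduces the original file.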
A map is an encrypted and compressed JSON document which contains a large list
of plain text hashes and associated storage hashes. This list is sorted by the
plain text hash in order to speed up lookups.
Maps are referenced by their storage ID, which is the SHA-256 hash of the
encrypted file stored in the `maps` directory.
The command `restic cat map` can be used to inspect the content of a map:
$ restic -r /tmp/restic-repo cat map 3c0721e5c3f5d2d78a12664b568a1bc992d17b993d41079599f8437ed66192fe
Enter Password for Repository:
[
{
"id": "1424916fc7279d58e3b2d8b533f481981ea5cb0f21a43932f26475e308e9b599",
"size": 287,
"sid": "32ea976bc30771cebad8285cd99120ac8786f9ffd42141d452458089985043a5",
"ssize": 335
},
{
"id": "160916dec2e9f4597a2cc3f0787ff6b3726c21e056177292eb85281c9c2afaa0",
"size": 812,
"sid": "73d04e6125cf3c28a299cc2f3cca3b78ceac396e4fcf9575e34536b26782413c",
"ssize": 860
},
[...]
]
In order to relate these plain text hashes to the actual encrypted storage
hashes (which vary due to random IVs), each object contains a list that maps
all referenced plaintext hashes to storage hashes. In the case of the snapshot
data structure listed above, the list only consists of one entry for the
referenced tree, so the field `tree` consists of such a mapping.
Trees and Data
--------------
The second thing a snapshot references is a tree. Trees are referenced by the
SHA-256 hash of the JSON string representation of their contents and are saved in
a subdirectory of the directory `trees`. The subdirectory's name is the first
two characters of the filename the tree object is stored in.
A snapshot references a tree by the SHA-256 hash of the JSON string
representation of its contents. Trees are saved in a subdirectory of the
directory `trees`. The subdirectory's name is the first two characters of the
filename the tree object is stored in.
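
The naming rule can be sketched in a few lines of Python. This is a minimal sketch, not restic's implementation: the function name `tree_path` is hypothetical, and it assumes (based on the sample layout shown earlier) that the filename is the tree's storage ID.

```python
import posixpath


def tree_path(repo: str, storage_id: str) -> str:
    """Build the path of a tree object inside a repository.

    Assumption: the file is named after its storage ID and lives in a
    subdirectory named after the first two characters of that name.
    """
    return posixpath.join(repo, "trees", storage_id[:2], storage_id)


path = tree_path(
    "/tmp/restic-repo",
    "b8138ab08a4722596ac89c917827358da4672eac68e3c03a8115b88dbf4bfb59",
)
```

For the tree referenced by the snapshot above, this yields a file below `trees/b8/`.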
The command `restic cat tree` can be used to inspect the tree referenced above:
$ restic -r /tmp/restic-repo cat tree 2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf
$ restic -r /tmp/restic-repo cat tree b8138ab08a4722596ac89c917827358da4672eac68e3c03a8115b88dbf4bfb59
Enter Password for Repository:
[
{
"nodes": [
{
"name": "testdata",
"type": "dir",
@@ -200,21 +179,33 @@ The command `restic cat tree` can be used to inspect the tree referenced above:
"user": "fd0",
"inode": 409704562,
"content": null,
"subtree": "a8838fdbf2902095fb1b9de8b0e30d2e4e2a91bbc82fb15f98f6f1535b9ccbe6"
"subtree": "b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc"
}
],
"map": [
{
"id": "b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc",
"size": 910,
"sid": "8b238c8811cc362693e91a857460c78d3acf7d9edb2f111048691976803cf16e",
"ssize": 958
}
]
}
A tree is a list of entries which contain metadata like a name and timestamps.
When an entry references a directory, the field `subtree` contains the plain
text ID of another tree object. The associated storage ID can be found in the
map object.
A tree contains a list of entries (in the field `nodes`) which contain
metadata like a name and timestamps. When an entry references a directory, the
field `subtree` contains the plain text ID of another tree object. The
associated storage ID can be found in the map object. All referenced plaintext
hashes are mapped to their corresponding storage hashes in the list contained
in the field `map`.
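
To make the lookup concrete, here is a small Python sketch that resolves the `subtree` ID of a directory entry to its storage ID using the `map` list of the same tree. The dictionary repeats the values from the tree printed above, trimmed to the fields that matter here. The helper name `storage_id` is hypothetical and not part of restic.

```python
# Trimmed copy of the tree shown above (only the fields used in this sketch).
tree = {
    "nodes": [
        {
            "name": "testdata",
            "type": "dir",
            "subtree": "b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc",
        }
    ],
    "map": [
        {
            "id": "b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc",
            "size": 910,
            "sid": "8b238c8811cc362693e91a857460c78d3acf7d9edb2f111048691976803cf16e",
            "ssize": 958,
        }
    ],
}


def storage_id(tree: dict, plaintext_id: str) -> str:
    """Return the storage ID that the tree's map lists for a plaintext ID."""
    for entry in tree["map"]:
        if entry["id"] == plaintext_id:
            return entry["sid"]
    raise KeyError(plaintext_id)


# Resolve every directory entry to the storage ID of its subtree.
resolved = {
    node["name"]: storage_id(tree, node["subtree"])
    for node in tree["nodes"]
    if node["type"] == "dir"
}
```

A linear scan is enough for a sketch; the map of a single tree only covers the hashes that this tree references.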
This can also be inspected by using `restic cat tree`, which automatically
searches all available maps for the storage ID:
When the command `restic cat tree` is used, the storage hash is needed to print
a tree. The tree referenced above can be dumped as follows:
$ restic -r /tmp/restic-repo cat tree a8838fdbf2902095fb1b9de8b0e30d2e4e2a91bbc82fb15f98f6f1535b9ccbe6
$ restic -r /tmp/restic-repo cat tree 8b238c8811cc362693e91a857460c78d3acf7d9edb2f111048691976803cf16e
Enter Password for Repository:
[
{
"nodes": [
{
"name": "testfile",
"type": "file",
@@ -233,23 +224,34 @@ searches all available maps for the storage ID:
]
},
[...]
],
"map": [
{
"id": "50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d",
"size": 1234,
"sid": "00634c46e5f7c055c341acd1201cf8289cabe769f991d6e350f8cd8ce2a52ac3",
"ssize": 1282
},
[...]
]
}
This tree contains a file entry. In contrast to the entry above, the `subtree`
field is not present and the `content` field contains a list with one plain
text SHA-256 hash. The storage ID for this ID can in turn be looked up in the
map. Data chunks are stored as encrypted files in a subdirectory of the
directory `data`, similar to tree objects.
This tree contains a file entry. This time, the `subtree` field is not present
and the `content` field contains a list with one plain text SHA-256 hash. The
storage ID for this hash can in turn be looked up in the map. Data chunks are
stored as encrypted files in a subdirectory of the directory `data`, similar to
tree objects.
The command `restic cat blob` can be used to look up, extract and decrypt data,
e.g. for the data mentioned above:
The command `restic cat blob` can be used to extract and decrypt data given a
storage hash, e.g. for the data mentioned above:
$ restic -r /tmp/restic-repo cat blob 50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d | sha256sum
$ restic -r /tmp/restic-repo cat blob 00634c46e5f7c055c341acd1201cf8289cabe769f991d6e350f8cd8ce2a52ac3 | sha256sum
Enter Password for Repository:
50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d -
As can be seen from the output of the program `sha256sum`, the hash is the
same, so the correct data has been returned.
As can be seen from the output of the program `sha256sum`, the hash matches the
plaintext hash from the map included in the tree above, so the correct data has
been returned.
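
The same check that `sha256sum` performs on the command line can be sketched in Python. The function name `matches_plaintext_id` is hypothetical, and the sample input is the empty byte string rather than real repository data, since the decrypted chunk itself is not reproduced in this document.

```python
import hashlib


def matches_plaintext_id(data: bytes, plaintext_id: str) -> bool:
    """Check that decrypted data hashes to the expected plaintext ID."""
    return hashlib.sha256(data).hexdigest() == plaintext_id


# SHA-256 of the empty input, used here only as a well-known test value.
EMPTY_ID = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

ok = matches_plaintext_id(b"", EMPTY_ID)
corrupted = matches_plaintext_id(b"x", EMPTY_ID)
```

Here `ok` is true and `corrupted` is false, which mirrors how a mismatch between the computed hash and the `id` from the map would reveal wrong or damaged data.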
Backups and Deduplication
=========================