Update README and Design documentation

Florian Daniel 2015-08-21 22:08:10 +02:00
parent 34d7a674f8
commit 8b485c59fc
2 changed files with 24 additions and 22 deletions

@@ -4,18 +4,19 @@
[![sourcegraph status](https://sourcegraph.com/api/repos/github.com/restic/restic/.badges/status.png)](https://sourcegraph.com/github.com/restic/restic) [![sourcegraph status](https://sourcegraph.com/api/repos/github.com/restic/restic/.badges/status.png)](https://sourcegraph.com/github.com/restic/restic)
[![Coverage Status](https://coveralls.io/repos/restic/restic/badge.svg)](https://coveralls.io/r/restic/restic) [![Coverage Status](https://coveralls.io/repos/restic/restic/badge.svg)](https://coveralls.io/r/restic/restic)
Restic Restic Design Principles
====== ========================
Restic is a program that does backups right. The design goals are: Restic is a program that does backups right and was designed with the following
principles in mind:
* Easy: Doing backups should be a frictionless process, otherwise you are * Easy: Doing backups should be a frictionless process, otherwise you might be
tempted to skip it. Restic should be easy to configure and use, so that in tempted to skip it. Restic should be easy to configure and use, so that, in
the unlikely event of a data loss you can just restore it. Likewise, the event of a data loss, you can just restore it. Likewise,
restoring data should not be complicated. restoring data should not be complicated.
* Fast: Backing up your data with restic should only be limited by your * Fast: Backing up your data with restic should only be limited by your
network or harddisk bandwidth so that you can backup your files every day. network or hard disk bandwidth so that you can backup your files every day.
Nobody does backups if it takes too much time. Restoring backups should only Nobody does backups if it takes too much time. Restoring backups should only
transfer data that is needed for the files that are to be restored, so that transfer data that is needed for the files that are to be restored, so that
this process is also fast. this process is also fast.
@@ -31,12 +32,12 @@ Restic is a program that does backups right. The design goals are:
* Efficient: With the growth of data, additional snapshots should only take * Efficient: With the growth of data, additional snapshots should only take
the storage of the actual increment. Even more, duplicate data should be the storage of the actual increment. Even more, duplicate data should be
de-duplicated before it is actually written to the storage backend to save de-duplicated before it is actually written to the storage back end to save
precious backup space. precious backup space.
Building Build restic
======== ============
Install Go/Golang (at least version 1.3), then run `go run build.go`, Install Go/Golang (at least version 1.3), then run `go run build.go`,
afterwards you'll find the binary in the current directory: afterwards you'll find the binary in the current directory:
@@ -89,7 +90,7 @@ Contribute and Documentation
Contributions are welcome! More information can be found in Contributions are welcome! More information can be found in
[`CONTRIBUTING.md`](CONTRIBUTING.md). A document describing the design of [`CONTRIBUTING.md`](CONTRIBUTING.md). A document describing the design of
restic and the data structures stored on disc is contained in restic and the data structures stored on the backend is contained in
[`doc/Design.md`](doc/Design.md). [`doc/Design.md`](doc/Design.md).
The development environment is described in [`CONTRIBUTING.md`](CONTRIBUTING.md). The development environment is described in [`CONTRIBUTING.md`](CONTRIBUTING.md).

@@ -6,15 +6,15 @@ Terminology
This section introduces terminology used in this document. This section introduces terminology used in this document.
*Repository*: All data produced during a backup is sent to and stored at a *Repository*: All data produced during a backup is sent to and stored in a
repository in structured form, for example in a file system hierarchy of with repository in a structured form, for example in a file system hierarchy with
several subdirectories. A repository implementation must be able to fulfil a several subdirectories. A repository implementation must be able to fulfill a
number of operations, e.g. list the contents. number of operations, e.g. list the contents.
*Blob*: A Blob combines a number of data bytes with identifying information *Blob*: A Blob combines a number of data bytes with identifying information
like the SHA256 hash of the data and its length. like the SHA256 hash of the data and its length.
*Pack*: A Pack combines one or more Blobs together, e.g. in a single file. *Pack*: A Pack combines one or more Blobs, e.g. in a single file.
*Snapshot*: A Snapshot stands for the state of a file or directory that has *Snapshot*: A Snapshot stands for the state of a file or directory that has
been backed up at some point in time. The state here means the content and meta been backed up at some point in time. The state here means the content and meta
@@ -22,7 +22,7 @@ data like the name and modification time for the file or the directory and its
contents. contents.
*Storage ID*: A storage ID is the SHA-256 hash of the content stored in the *Storage ID*: A storage ID is the SHA-256 hash of the content stored in the
repository. This ID is needed in order to load the file from the repository. repository. This ID is required in order to load the file from the repository.
Repository Format Repository Format
================= =================
@@ -36,19 +36,20 @@ parallel. Only the delete operation removes data from the repository.
At the time of writing, the only implemented repository type is based on At the time of writing, the only implemented repository type is based on
directories and files. Such repositories can be accessed locally on the same directories and files. Such repositories can be accessed locally on the same
system or via the integrated SFTP client. The directory layout is the same for system or via the integrated SFTP client (or any other storage back end).
both access methods. This repository type is described in the following. The directory layout is the same for both access methods.
This repository type is described in the following section.
Repositories consist of several directories and a file called `config`. For Repositories consist of several directories and a file called `config`. For
all other files stored in the repository, the name for the file is the lower all other files stored in the repository, the name for the file is the lower
case hexadecimal representation of the storage ID, which is the SHA-256 hash of case hexadecimal representation of the storage ID, which is the SHA-256 hash of
the file's contents. This allows easily checking all files for accidental the file's contents. This allows for easy verification of files for accidental
modifications like disk read errors by simply running the program `sha256sum` modifications, like disk read errors, by simply running the program `sha256sum`
and comparing its output to the file name. If the prefix of a filename is and comparing its output to the file name. If the prefix of a filename is
unique amongst all the other files in the same directory, the prefix may be unique amongst all the other files in the same directory, the prefix may be
used instead of the complete filename. used instead of the complete filename.
Apart from the files stored below the `keys` directory, all files are encrypted Apart from the files stored within the `keys` directory, all files are encrypted
with AES-256 in counter mode (CTR). The integrity of the encrypted data is with AES-256 in counter mode (CTR). The integrity of the encrypted data is
secured by a Poly1305-AES message authentication code (sometimes also referred secured by a Poly1305-AES message authentication code (sometimes also referred
to as a "signature"). to as a "signature").
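
The file-naming rule in the hunk above (a file's name is the lower-case hexadecimal SHA-256 hash of its contents, and a unique prefix may stand in for the full name) can be sketched in Go roughly as follows. This is an illustration only and not code from restic: the helper names `storageID`, `verifyName`, and `resolvePrefix` are hypothetical, and the sketch hashes whatever bytes it is given, which for a real repository would be the file exactly as stored.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// storageID returns the lower-case hexadecimal SHA-256 hash of content.
// (Hypothetical helper; the name is not taken from restic.)
func storageID(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// verifyName reports whether a file name matches the storage ID of the
// file's content, which is the same check as comparing the output of
// sha256sum against the file name.
func verifyName(name string, content []byte) bool {
	return name == storageID(content)
}

// resolvePrefix returns the one name in names that starts with prefix.
// It fails if no name matches or if the prefix is not unique.
func resolvePrefix(prefix string, names []string) (string, error) {
	match := ""
	for _, n := range names {
		if strings.HasPrefix(n, prefix) {
			if match != "" {
				return "", fmt.Errorf("prefix %q is not unique", prefix)
			}
			match = n
		}
	}
	if match == "" {
		return "", fmt.Errorf("no file matches prefix %q", prefix)
	}
	return match, nil
}

func main() {
	content := []byte("abc")
	name := storageID(content)

	// An unmodified file verifies; a modified one does not.
	fmt.Println(verifyName(name, content))
	fmt.Println(verifyName(name, []byte("abd")))

	// A unique prefix resolves to the full file name.
	names := []string{name, storageID([]byte(""))}
	full, err := resolvePrefix(name[:8], names)
	fmt.Println(full == name, err)
}
```

Because the name is derived from the content, any accidental modification changes the hash and the comparison fails, without needing a separate checksum file.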
@@ -398,7 +399,7 @@ required to create a lock on the repository before doing anything.
Locks come in two types: Exclusive and non-exclusive locks. At most one Locks come in two types: Exclusive and non-exclusive locks. At most one
process can have an exclusive lock on the repository, and during that time process can have an exclusive lock on the repository, and during that time
there mustn't be any other locks (exclusive and non-exclusive). There may be there must not be any other locks (exclusive and non-exclusive). There may be
multiple non-exclusive locks in parallel. multiple non-exclusive locks in parallel.
A lock is a file in the subdir `locks` whose filename is the storage ID of A lock is a file in the subdir `locks` whose filename is the storage ID of