From 8b485c59fc354fca6262a9be3dc1b8cf4426ff0f Mon Sep 17 00:00:00 2001
From: Florian Daniel
Date: Fri, 21 Aug 2015 22:08:10 +0200
Subject: [PATCH] Update README and Design documentation

---
 README.md     | 23 ++++++++++++-----------
 doc/Design.md | 23 ++++++++++++-----------
 2 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/README.md b/README.md
index d9b8609ce..438f46cb0 100644
--- a/README.md
+++ b/README.md
@@ -4,18 +4,19 @@
 [![sourcegraph status](https://sourcegraph.com/api/repos/github.com/restic/restic/.badges/status.png)](https://sourcegraph.com/github.com/restic/restic)
 [![Coverage Status](https://coveralls.io/repos/restic/restic/badge.svg)](https://coveralls.io/r/restic/restic)
 
-Restic
-======
+Restic Design Principles
+========================
 
-Restic is a program that does backups right. The design goals are:
+Restic is a program that does backups right and was designed with the following
+principles in mind:
 
- * Easy: Doing backups should be a frictionless process, otherwise you are
-   tempted to skip it. Restic should be easy to configure and use, so that in
-   the unlikely event of a data loss you can just restore it. Likewise,
+ * Easy: Doing backups should be a frictionless process, otherwise you might be
+   tempted to skip it. Restic should be easy to configure and use, so that, in
+   the event of a data loss, you can just restore it. Likewise,
    restoring data should not be complicated.
 
  * Fast: Backing up your data with restic should only be limited by your
-   network or harddisk bandwidth so that you can backup your files every day.
+   network or hard disk bandwidth so that you can backup your files every day.
    Nobody does backups if it takes too much time. Restoring backups should only
    transfer data that is needed for the files that are to be restored, so that
    this process is also fast.
@@ -31,12 +32,12 @@ Restic is a program that does backups right. The design goals are:
 
  * Efficient: With the growth of data, additional snapshots should only take
    the storage of the actual increment. Even more, duplicate data should be
-   de-duplicated before it is actually written to the storage backend to save
+   de-duplicated before it is actually written to the storage back end to save
    precious backup space.
 
 
-Building
-========
+Build restic
+============
 
 Install Go/Golang (at least version 1.3), then run `go run build.go`,
 afterwards you'll find the binary in the current directory:
@@ -89,7 +90,7 @@ Contribute and Documentation
 
 Contributions are welcome! More information can be found in
 [`CONTRIBUTING.md`](CONTRIBUTING.md). A document describing the design of
-restic and the data structures stored on disc is contained in
+restic and the data structures stored on the backend is contained in
 [`doc/Design.md`](doc/Design.md). The development environment is described in
 [`CONTRIBUTING.md`](CONTRIBUTING.md).
 
diff --git a/doc/Design.md b/doc/Design.md
index 5b08c708e..5a3d0e2e9 100644
--- a/doc/Design.md
+++ b/doc/Design.md
@@ -6,15 +6,15 @@ Terminology
 
 This section introduces terminology used in this document.
 
-*Repository*: All data produced during a backup is sent to and stored at a
-repository in structured form, for example in a file system hierarchy of with
-several subdirectories. A repository implementation must be able to fulfil a
+*Repository*: All data produced during a backup is sent to and stored in a
+repository in a structured form, for example in a file system hierarchy with
+several subdirectories. A repository implementation must be able to fulfill a
 number of operations, e.g. list the contents.
 
 *Blob*: A Blob combines a number of data bytes with identifying information
 like the SHA256 hash of the data and its length.
 
-*Pack*: A Pack combines one or more Blobs together, e.g. in a single file.
+*Pack*: A Pack combines one or more Blobs, e.g. in a single file.
 
 *Snapshot*: A Snapshot stands for the state of a file or directory that has
 been backed up at some point in time. The state here means the content and meta
@@ -22,7 +22,7 @@ data like the name and modification time for the file or the directory and its
 contents.
 
 *Storage ID*: A storage ID is the SHA-256 hash of the content stored in the
-repository. This ID is needed in order to load the file from the repository.
+repository. This ID is required in order to load the file from the repository.
 
 Repository Format
 =================
@@ -36,19 +36,20 @@ parallel. Only the delete operation removes data from the repository.
 
 At the time of writing, the only implemented repository type is based on
 directories and files. Such repositories can be accessed locally on the same
-system or via the integrated SFTP client. The directory layout is the same for
-both access methods. This repository type is described in the following.
+system or via the integrated SFTP client (or any other storage back end).
+The directory layout is the same for both access methods.
+This repository type is described in the following section.
 
 Repositories consist of several directories and a file called `config`. For
 all other files stored in the repository, the name for the file is the lower
 case hexadecimal representation of the storage ID, which is the SHA-256 hash of
-the file's contents. This allows easily checking all files for accidental
-modifications like disk read errors by simply running the program `sha256sum`
+the file's contents. This allows for easy verification of files for accidental
+modifications, like disk read errors, by simply running the program `sha256sum`
 and comparing its output to the file name. If the prefix of a filename is
 unique amongst all the other files in the same directory, the prefix may be
 used instead of the complete filename.
 
-Apart from the files stored below the `keys` directory, all files are encrypted
+Apart from the files stored within the `keys` directory, all files are encrypted
 with AES-256 in counter mode (CTR). The integrity of the encrypted data is
 secured by a Poly1305-AES message authentication code (sometimes also referred
 to as a "signature").
@@ -398,7 +399,7 @@ required to create a lock on the repository before doing anything.
 
 Locks come in two types: Exclusive and non-exclusive locks. At most one
 process can have an exclusive lock on the repository, and during that time
-there mustn't be any other locks (exclusive and non-exclusive). There may be
+there must not be any other locks (exclusive and non-exclusive). There may be
 multiple non-exclusive locks in parallel.
 
 A lock is a file in the subdir `locks` whose filename is the storage ID of