restorer: Allow writing target file blobs out of order
Much simpler implementation that guarantees each required pack is downloaded only once (and hence does not need to manage pack cache). Also improves large file restore performance. Signed-off-by: Igor Fedorenko <igor@ifedorenko.com>
This commit is contained in:
parent
f2bf06a419
commit
f17ffa0283
13 changed files with 321 additions and 1448 deletions
|
@ -5,29 +5,20 @@
|
|||
// request and avoiding repeated downloads of the same pack. In addition,
|
||||
// several pack files are fetched concurrently.
|
||||
//
|
||||
// Here is high-level pseudo-code of the how the Restorer attempts to achieve
|
||||
// Here is high-level pseudo-code of how the Restorer attempts to achieve
|
||||
// these goals:
|
||||
//
|
||||
// while there are packs to process
|
||||
// choose a pack to process [1]
|
||||
// get the pack from the backend or cache [2]
|
||||
// retrieve the pack from the backend [2]
|
||||
// write pack blobs to the files that need them [3]
|
||||
// if not all pack blobs were used
|
||||
// cache the pack for future use [4]
|
||||
//
|
||||
// Pack download and processing (steps [2] - [4]) runs on multiple concurrent
|
||||
// Goroutines. The Restorer runs all steps [2]-[4] sequentially on the same
|
||||
// Goroutine.
|
||||
// Retrieval of repository packs (step [2]) and writing target files (step [3])
|
||||
// are performed concurrently on multiple goroutines.
|
||||
//
|
||||
// Before a pack is downloaded (step [2]), the required space is "reserved" in
|
||||
// the pack cache. Actual download uses single backend request to get all
|
||||
// required pack blobs. This may download blobs that are not needed, but we
|
||||
// assume it'll still be faster than getting individual blobs.
|
||||
//
|
||||
// Target files are written (step [3]) in the "right" order, first file blob
|
||||
// first, then second, then third and so on. Blob write order implies that some
|
||||
// pack blobs may not be immediately used, i.e. they are "out of order" for
|
||||
// their respective target files. Packs with unused blobs are cached (step
|
||||
// [4]). The cache has capacity limit and may purge packs before they are fully
|
||||
// used, in which case the purged packs will need to be re-downloaded.
|
||||
// Implementation does not guarantee order in which blobs are written to the
|
||||
// target files and, for example, the last blob of a file can be written to the
|
||||
// file before any of the preceeding file blobs. It is therefore possible to
|
||||
// have gaps in the data written to the target files if restore fails or
|
||||
// interrupted by the user.
|
||||
package restorer
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue