26 commits

Author SHA1 Message Date
Michael Eischer
ddcf549eba repository: remove IsMixedPack and add replacement for checker
Repositories with mixed packs are probably quite rare by now. Loading
data blobs from a mixed pack file will no longer trigger caching of that
file. However, tree blobs are usually accessed first, so this shouldn't
make much of a difference.

The checker gets a simpler replacement.
2022-10-03 12:03:59 +02:00
Kyle Brennan
1e3f05c3f1 repository: prevent header overfill
2022-08-05 23:47:12 +02:00
Michael Eischer
0a6fa602c8 add option for setting min pack size
2022-08-05 23:47:12 +02:00
Michael Eischer
753e56ee29 repository: Limit to a single pending pack file
Use only a single incomplete pack file to keep the number of open and
active pack files low. The main change here is to defer hashing the pack
file to the upload step. This prevents the pack assembly step from
becoming a bottleneck, as its only task is now to write data to the
temporary pack file.

The tests are cleaned up to no longer reimplement packer manager
functions.
2022-07-02 22:42:34 +02:00
Michael Eischer
120ccc8754 repository: Rework blob saving to use an async pack uploader
Previously, SaveAndEncrypt would assemble blobs into packs and either
return immediately if the pack was not yet full or upload the pack file
otherwise. The upload would block the current goroutine until it
finished.

Now, the upload is done using separate goroutines. This requires changes
to the error handling. As uploads are no longer tied to a SaveAndEncrypt
call, failed uploads are signaled using an errgroup.

To count the amount of uploaded data, the pack header overhead is no
longer returned by `packer.Finalize` but rather by
`packer.HeaderOverhead`. This helper method is necessary to continue
returning the pack header overhead directly to the responsible call to
`repository.SaveBlob`. Without the method this would not be possible,
as packs are finalized asynchronously.
2022-07-02 22:42:34 +02:00
Michael Eischer
a6e9e08034 Account for pack header overhead at each entry
This will miss the pack header crypto overhead and the length field,
which only amount to a few bytes per pack file.
2022-07-02 18:55:58 +02:00
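
A small worked example of the accounting in a6e9e08034, under stated assumptions. The three sizes below are assumed values for illustration and are not taken from the commit; only the structure (a per-entry cost plus a fixed per-pack remainder that is ignored) follows the message.

```go
package main

import "fmt"

// Assumed sizes, for illustration only.
const (
	entrySize      = 37 // assumed per-blob header entry: type + length + ID
	cryptoOverhead = 32 // assumed encryption overhead of the header
	lengthField    = 4  // assumed trailing header-length field
)

// perEntryAccounting is the header cost attributed blob by blob.
func perEntryAccounting(blobs int) int {
	return blobs * entrySize
}

// exactHeaderSize is the full header cost of one pack file.
func exactHeaderSize(blobs int) int {
	return blobs*entrySize + cryptoOverhead + lengthField
}

func main() {
	for _, n := range []int{1, 100, 10000} {
		missed := exactHeaderSize(n) - perEntryAccounting(n)
		fmt.Printf("%d blobs: %d bytes unaccounted per pack\n", n, missed)
	}
}
```

Under these assumptions the unaccounted part is constant per pack file and does not grow with the number of blobs, which is why the message treats it as negligible.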
Michael Eischer
a77d5c4d11 repository: index saving belongs into the MasterIndex
2022-07-02 18:38:56 +02:00
Michael Eischer
ae7e51382a Fix error on temp file deletion on Windows
Apparently it can take a moment between closing a temp file marked as
DELETE_ON_CLOSE and it actually being deleted. During that time the file
is inaccessible. Thus just skip deleting the temp file on Windows.
2022-05-09 22:43:26 +02:00
Alexander Weiss
81876d5c1b Simplify cache logic
2021-09-03 21:01:00 +02:00
Michael Eischer
9aa2eff384 Add plumbing to calculate backend specific file hash for upload
This enables the backends to request the calculation of a
backend-specific hash. For the currently supported backends this will
always be MD5. The hash calculation happens as early as possible; for
pack files this is during assembly of the pack file. That way the hash
would even capture corruption of the temporary pack file on disk.
2021-08-04 22:17:46 +02:00
greatroar
ab2b7d7f9a Decrease allocation rate in internal/pack
internal/repository benchmark results:

name             old time/op    new time/op    delta
PackerManager-8     179ms ± 1%     181ms ± 1%   +0.78%  (p=0.009 n=10+10)

name             old speed      new speed      delta
PackerManager-8   294MB/s ± 1%   292MB/s ± 1%   -0.77%  (p=0.009 n=10+10)

name             old alloc/op   new alloc/op   delta
PackerManager-8    91.3kB ± 0%    72.2kB ± 0%  -20.92%  (p=0.000 n=9+7)

name             old allocs/op  new allocs/op  delta
PackerManager-8     1.38k ± 0%     0.76k ± 0%  -45.20%  (p=0.000 n=10+7)
2020-11-15 16:51:47 +01:00
aawsome
0fed6a8dfc Use "pack file" instead of "data file" (#2885)
- changed variable names, especially changed DataFile into PackFile
- changed some comments
- always use "pack file" in the documentation
2020-08-16 11:16:38 +02:00
MichaelEischer
dd7b4f54f5 Merge pull request #2709 from greatroar/minio-sha256
Use Minio's optimized SHA-256
2020-06-12 23:32:58 +02:00
Alexander Weiss
70347e95d5 disable index uploads for prune command
+ modifications of changelog
2020-06-12 09:24:38 +02:00
Alexander Weiss
91906911b0 Fix non-intuitive repository behavior
- The SaveBlob method now checks for duplicates.
- Moves handling of pending blobs to MasterIndex.
  -> also cleans up pending index entries when they are saved in the index
  -> when using SaveBlob, there is no need to care about the index any longer
- Always check for a full index and save it when storing packs.
  -> removes the need for an index uploader
  -> also removes the verbose "uploaded intermediate index" messages
- The Flush method now also saves the index.
- Fix race condition when checking and saving full/non-finalized indexes.
2020-06-11 13:05:23 +02:00
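
The duplicate check and the pending-blob handling listed in 91906911b0 can be sketched as follows. This is a simplified illustration, not the MasterIndex from the repository: the types `index` and `blobID` and the functions `addPending`, `storePack`, and `saveBlob` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

type blobID [32]byte

// index is a hypothetical, much simplified stand-in for a master
// index that tracks both stored and in-flight (pending) blobs.
type index struct {
	mu      sync.Mutex
	stored  map[blobID]bool
	pending map[blobID]bool
}

func newIndex() *index {
	return &index{stored: map[blobID]bool{}, pending: map[blobID]bool{}}
}

// addPending reports whether the caller should save the blob. It
// returns false if the blob is already stored or already in flight.
func (ix *index) addPending(id blobID) bool {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	if ix.stored[id] || ix.pending[id] {
		return false
	}
	ix.pending[id] = true
	return true
}

// storePack records the blobs of a finished pack and clears their
// pending entries in the same critical section.
func (ix *index) storePack(ids []blobID) {
	ix.mu.Lock()
	defer ix.mu.Unlock()
	for _, id := range ids {
		delete(ix.pending, id)
		ix.stored[id] = true
	}
}

// saveBlob returns the blob's ID and whether it was new.
func saveBlob(ix *index, data []byte) (blobID, bool) {
	id := blobID(sha256.Sum256(data))
	return id, ix.addPending(id)
}

func main() {
	ix := newIndex()
	id, isNew := saveBlob(ix, []byte("hello"))
	fmt.Println("first save new: ", isNew)
	_, isNew = saveBlob(ix, []byte("hello"))
	fmt.Println("second save new:", isNew)
	ix.storePack([]blobID{id})
	_, isNew = saveBlob(ix, []byte("hello"))
	fmt.Println("after pack new: ", isNew)
}
```

Checking and updating both sets under one lock is the design choice that matters here: a caller never sees a blob as neither pending nor stored while its pack is being recorded.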
greatroar
42a3db05b0 Use Minio's optimized SHA-256
internal/repository benchmarks on an Intel i7-3770k:

name               old speed      new speed       delta
PackerManager-8     209MB/s ± 1%    291MB/s ± 1%  +38.94%  (p=0.008 n=5+5)
SaveAndEncrypt-8    112MB/s ± 1%    135MB/s ± 1%  +20.25%  (p=0.008 n=5+5)
2020-04-28 07:57:18 +02:00
Alexander Neumann
99f7fd74e3 backend: Improve Save()
As mentioned in issue [#1560](https://github.com/restic/restic/pull/1560#issuecomment-364689346)
this changes the signature for `backend.Save()`. It now takes a
parameter of interface type `RewindReader`, so that the backend
implementations or our `RetryBackend` middleware can reset the reader to
the beginning and then retry an upload operation.

The `RewindReader` interface also provides a `Length()` method, which is
used in the backend to get the size of the data to be saved. This
removes several ugly hacks we had to do to pull the size back out of the
`io.Reader` passed to `Save()` before. In the `s3` and `rest` backends
this is actively used.
2018-03-03 15:49:44 +01:00
Alexander Neumann
663c57ab4d debug: Remove manual Str() call Log()
2018-01-25 20:49:41 +01:00
George Armhold
d886cb5c27 replace ad-hoc context.TODO() with gopts.ctx, so that cancellation
can properly trickle down from cmd_*.

gh-1434
2017-12-03 07:22:14 -05:00
Alexander Neumann
7a5fde8f5a repository: Save pack files for trees in cache
2017-09-24 21:54:53 +02:00
Alexander Neumann
3541d06d07 repo: Split packers for tree and data
The code now bundles tree blobs and data blobs into different pack
files, so we'll end up with pack files that either only contain data or
trees. This is in preparation for adding a cache (#1040), because
tree-only pack files can easily be cached later on.
2017-09-22 15:36:47 +02:00
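
The split described in 3541d06d07 amounts to routing each blob to a packer chosen by blob type. The following is a minimal sketch of that routing only; `blobType`, `packer`, and `packerManager` as written here are hypothetical and omit everything else a real packer does.

```go
package main

import "fmt"

type blobType int

const (
	dataBlob blobType = iota
	treeBlob
)

// packer collects blobs of a single type for one pack file.
type packer struct {
	kind  blobType
	blobs [][]byte
}

// packerManager keeps one packer per blob type, so a pack file never
// mixes tree and data blobs.
type packerManager struct {
	packers map[blobType]*packer
}

func newPackerManager() *packerManager {
	return &packerManager{packers: map[blobType]*packer{
		dataBlob: {kind: dataBlob},
		treeBlob: {kind: treeBlob},
	}}
}

// add routes the blob to the packer that matches its type.
func (m *packerManager) add(t blobType, blob []byte) {
	p := m.packers[t]
	p.blobs = append(p.blobs, blob)
}

func main() {
	m := newPackerManager()
	m.add(dataBlob, []byte("file chunk 1"))
	m.add(treeBlob, []byte("directory listing"))
	m.add(dataBlob, []byte("file chunk 2"))

	fmt.Println("data pack blobs:", len(m.packers[dataBlob].blobs))
	fmt.Println("tree pack blobs:", len(m.packers[treeBlob].blobs))
}
```

Keeping the types apart at this level is what lets a later cache decide per pack file, not per blob, whether a file is worth keeping locally.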
Alexander Neumann
db0e3cd772 repo: Remove packer limits
This commit simplifies finding a packer: The first open packer is taken,
and the upper limit for the pack file is removed.
2017-09-22 15:36:47 +02:00
Alexander Neumann
d610c60991 repo: Remove unused sync.Pool
2017-09-22 12:37:10 +02:00
Alexander Neumann
23c903074c Move restic package to internal/restic
2017-07-24 17:43:32 +02:00
Alexander Neumann
6caeff2408 Run goimports
2017-07-23 14:21:03 +02:00
Alexander Neumann
83d1a46526 Moves files
2017-07-23 14:19:13 +02:00
Renamed from src/restic/repository/packer_manager.go