Commit graph

67 commits

Author SHA1 Message Date
Michael Eischer
2033c02b09 index: replace CountedBlobSet with AssociatedSet 2024-05-24 22:18:14 +02:00
Michael Eischer
e52033a8bd index: slightly reduce Rewrite concurrency
The index operations are likely CPU-bounded. Thus, reduce the
concurrency accordingly.
2024-05-24 21:33:17 +02:00
Michael Eischer
57d69aa640 index: cleanup SaveIndex method 2024-05-24 21:33:17 +02:00
Michael Eischer
2ca1c37721 index: additional tests for new index save methods 2024-05-24 21:33:17 +02:00
Michael Eischer
5f7b48e65f index: replace Save() method with Rewrite and SaveFallback
Rewrite implements a streaming rewrite of the index that excludes the
given packs. For this it loads all index files from the repository and
only modifies those that require changes. This will reduce the index
churn when running prune. Rewrite does not require the in-memory index
and thus can drop it to significantly reduce the memory usage.

However, `prune --unsafe-recovery` cannot use this strategy and requires
a separate method to save the whole in-memory index. This is now handled
using SaveFallback.
2024-05-24 21:33:17 +02:00
Michael Eischer
72482ce5bd index: misc cleanups 2024-05-24 21:33:17 +02:00
Michael Eischer
9aa0c90fb2 index: remove supersedes field
Using the field with its current semantics is nearly impossible to get
right. Remove it as it will be replaced anyways in repository format 3.
2024-05-24 21:33:17 +02:00
Michael Eischer
550d1eeac3 repository: remove SaveIndex from interface
The method is now only indirectly accessible via Prune or RepairIndex.
2024-05-24 21:33:17 +02:00
Michael Eischer
fb59e00614 index: rewrite MasterIndex load/save test to be independent of repository 2024-05-24 21:33:17 +02:00
Michael Eischer
447b486c20 index: deduplicate index loading of check and repository 2024-05-24 21:33:17 +02:00
Michael Eischer
4df887406f repository: inline MasterIndex interface into Repository interface 2024-05-24 21:33:17 +02:00
Michael Eischer
223aa22cb0 replace some uses of restic.Repository with finegrained interfaces 2024-05-18 21:42:51 +02:00
Michael Eischer
8a425c2f0a remove usages of repo.Backend() from tests 2024-05-18 21:42:51 +02:00
Michael Eischer
6328b7e1f5 replace "too small" with "too short" in error messages 2024-05-18 19:59:26 +02:00
Michael Eischer
940a3159b5 let index.Each() and pack.Size() return error on canceled context
This forces a caller to actually check that the function did complete.
2024-04-22 22:39:32 +02:00
Michael Eischer
31624aeffd Improve command shutdown on context cancellation 2024-04-22 22:31:38 +02:00
Michael Eischer
10355c3fb6 repository: Better error message if blob is larger than 4GB 2024-04-19 22:00:35 +02:00
Michael Eischer
dc441c57a7 repository: unify repository initialization in tests
Tests should use a helper from internal/repository/testing.go to
construct a Repository object.
2024-03-28 23:17:02 +01:00
Michael Eischer
f8852f0eb6 repair index: fix deletion of legacy indexes 2024-03-09 18:21:22 +01:00
Michael Eischer
1a8bf358f1 index: deprecate legacy index format 2024-03-09 18:21:14 +01:00
Alexander Neumann
c0514dd8ba Fix linter errors (except for tests) 2024-02-10 22:58:10 +01:00
Michael Eischer
bfb56b78e1 replace some usages of restic.Repository with more specific interface
This should eventually make it easier to test the code.
2024-01-27 13:02:02 +01:00
Michael Eischer
cb50832d50 index: let MasterIndex.Save also delete obsolete indexes 2024-01-27 12:51:08 +01:00
Andrea Gelmini
241916d55b
Fix typos 2023-12-06 13:11:55 +01:00
Michael Eischer
c7b770eb1f convert MemorizeList to be repository based
Ideally, code that uses a repository shouldn't directly interact with
the underlying backend. Thus, move MemorizeList one layer up.
2023-10-25 23:01:35 +02:00
Michael Eischer
1b8a67fe76 move Backend interface to backend package 2023-10-25 23:00:18 +02:00
Michael Eischer
91aef00df3 check: add index loading progress bar 2023-10-01 19:55:29 +02:00
Michael Eischer
3fd0ad7448 repository: list index files only once 2023-10-01 19:53:26 +02:00
arjunajesh
ed65a7dbca implement progress bar for index loading 2023-10-01 19:52:59 +02:00
Michael Eischer
2bec99dc6f master_index: fix inconsistent length blob length in test
Two blobs with the same hash must always have the same content length.
2023-08-19 20:04:25 +02:00
Michael Eischer
090f9d6237 restic: Cleanup and simplify TestCreateSnapshot 2023-07-22 19:55:57 +02:00
Michael Eischer
b2ed42cec4 index: add basic hat test 2023-06-16 23:12:30 +02:00
Michael Eischer
55c21846b1 Revert "index: remove redundant storage of indexmap size"
This reverts commit f1c388c623.

For an uninitialized indexmap the returned size was `-1` which is
unexpected and could cause problems.
2023-06-08 18:08:46 +02:00
Michael Eischer
ac1dfc99bb index: fix blocklist size 2023-06-02 19:39:12 +02:00
Michael Eischer
9a7056a479 index: implement indexmap.grow() without random access 2023-05-30 20:13:33 +02:00
Michael Eischer
fc05e35a08 index: let indexmap.Each iterate in allocation order
Iterating through the indexmap according to the bucket order has the
problem that all indexEntries are accessed in random order which is
rather cache inefficient.

As we already keep a list of all allocated blocks, just iterate through
it. This allows iterating through a batch of indexEntries without random
memory accesses. In addition, the packID will likely remain similar
across multiple blobs as all blobs of a pack file are added as a single
batch.
2023-05-30 20:12:36 +02:00
Michael Eischer
f1c388c623 index: remove redundant storage of indexmap size 2023-05-30 20:11:53 +02:00
Michael Eischer
12141afbad index: Allow inlining of HAT 2023-05-30 20:11:14 +02:00
Michael Eischer
fed33295c3 index: store indexEntries in hashed array tree
This data structure reduces the wasted memory to O(sqrt(n)). The
top-layer of the hashed array tree (HAT) also has a size of O(sqrt(n)),
which makes it cache efficient. The top-layer should be small enough to
easily fit into the CPU cache and thus only adds little overhead
compared to directly accessing an index entry via a pointer.
2023-05-29 00:24:15 +02:00
Michael Eischer
b217f38ee7 index: Remove pointers from within indexentrys
The indexEntry objects are now allocated in a separate array. References
to an indexEntry are now stored as array indices. This has the benefit
of allowing the garbage collector to ignore the indexEntry objects as
these do not contain pointers and are part of a single large allocation.
2023-05-29 00:24:15 +02:00
Michael Eischer
0c1240360d index: add garbage collection benchmark
Allocates an index and repeatedly triggers the GC.
2023-05-29 00:23:04 +02:00
Michael Eischer
0058745881 test: use parameter instead of hardcoded constant 2023-05-18 21:17:53 +02:00
greatroar
d129baba7a repository: Reuse buffers in Repository.LoadUnpacked
This method had a buffer argument, but that was nil at all call sites.
That's removed, and instead LoadUnpacked now reuses whatever it
allocates inside its retry loop.
2023-01-30 22:01:01 +01:00
greatroar
99755c634b index: Optimize generatePackList
name                 old time/op    new time/op    delta
EncodeIndex/100-8      1.56ms ± 2%    1.48ms ± 3%   -5.37%  (p=0.000 n=10+10)
EncodeIndex/1000-8     14.5ms ± 2%    13.1ms ± 2%   -9.49%  (p=0.000 n=9+10)
EncodeIndex/10000-8     120ms ± 2%     116ms ± 2%   -3.58%  (p=0.000 n=10+10)

name                 old alloc/op   new alloc/op   delta
EncodeIndex/100-8       306kB ± 1%     275kB ± 1%  -10.28%  (p=0.000 n=10+10)
EncodeIndex/1000-8     3.69MB ±11%    2.88MB ± 5%  -22.07%  (p=0.000 n=10+9)
EncodeIndex/10000-8    35.9MB ±11%    31.9MB ±10%  -11.13%  (p=0.005 n=10+10)

name                 old allocs/op  new allocs/op  delta
EncodeIndex/100-8       3.39k ± 0%     2.39k ± 0%  -29.61%  (p=0.000 n=10+10)
EncodeIndex/1000-8      32.6k ± 0%     22.9k ± 0%  -29.63%  (p=0.000 n=10+9)
EncodeIndex/10000-8      326k ± 0%      229k ± 0%  -29.71%  (p=0.000 n=10+10)

The bulk of the allocation rate improvement comes from just removing the
debug.Log calls: every one of those copied a restic.ID to the heap.
2023-01-14 20:41:07 +01:00
greatroar
c0b5ec55ab repository: Remove empty cleanup functions in tests
TestRepository and its variants always returned no-op cleanup functions.
If they ever do need to do cleanup, using testing.T.Cleanup is easier
than passing these functions around.
2022-12-11 11:06:25 +01:00
greatroar
0e8893dae9 index: Compact data structure for Index.EachByPack 2022-10-29 23:09:17 +02:00
Michael Eischer
ae45f3b04f restic: Unify code to load Index/Lock/Snapshot 2022-10-21 21:25:11 +02:00
Michael Eischer
2e3f1c08c5 repository: split index into a separate package 2022-10-08 21:15:34 +02:00
Alexander Weiss
9607cad267 Remove internal/index 2020-11-15 07:05:09 +01:00
greatroar
21b787a4d1 Stop Counters where they're constructed and started 2020-11-09 13:03:31 +01:00