Commit graph

487 commits

Author SHA1 Message Date
Evgenii Stratonikov
61ae8b0a2c shard: ignore errors in UpdateID
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:47:39 +03:00
Evgenii Stratonikov
7b5b735fb2 [#1550] engine: Split errors on write- and meta- errors
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:45:57 +03:00
Evgenii Stratonikov
dafc21b052 [#1550] engine: Set default error threshold
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:45:57 +03:00
Leonard Lyubich
a6d1eefeff [#1549] shard: Always close metabase
Make `meta.DB` to call `Close` method on `bbolt.DB` instance if it is
non-nil only. Call `meta.DB.Close` in `shard.Shard.Close` anyway.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-08 13:45:57 +03:00
Leonard Lyubich
596d877a44 [#1549] engine: Disable shard on blobovnicza init failure
There is a need to support working w/o shard if it has problems with
blobovnicza tree.

Make `BlobStor.Init` to return new `ErrInitBlobovniczas` error. Remove
shard from storage engine's shard set if it returned this error from
`Init` call. So if some of the shards (but not all) return this error,
the node will be able to continue working without them.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-08 13:45:57 +03:00
Leonard Lyubich
263497a92b [#1549] shard: Turn to ModeDegraded on metabase failure
Make `Shard` to work in degraded mode if metabase is unavailable on
opening/init stage. Close metabase in non-degraded mode only.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-08 13:45:57 +03:00
Evgenii Stratonikov
e0e4f1f7ee engine: initialize shards in parallel
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:44:05 +03:00
Evgenii Stratonikov
9857a20c0d [#1505] pilorama: Provide timeout to bbolt.Open
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:28:23 +03:00
Evgenii Stratonikov
1fed255c5b [#1505] pilorama: Allow to customize database parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:59 +03:00
Evgenii Stratonikov
982cb987a3 [#1333] engine: Increase error counter for pilorama errors
1. Modifying operations are not expected to fail, unless the shard is
   read-only.
2. `Get*` operations should increase error counter too, unless the
   error is `ErrTreeNotFound`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:38 +03:00
Evgenii Stratonikov
5408efef82 [#1333] services/control: Return pilorama info in ListShards RPC
Do not return backend type from the service for now, because memory
backend is expected to vanish.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:17 +03:00
Evgenii Stratonikov
62b2769a66 [#1333] local_object_storage: Support ReadOnly mode in pilorama
The tricky part here is the engine itself: we stop iteration on
`ErrReadOnly` because it is better to synchronize the shard later than
to have partial trees stored in 2 shards.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:53 +03:00
Evgenii Stratonikov
199ee3a680 [#1481] pilorama: Fix TreeApply
Current implementation prevents invalid operations to become valid at
some later point (consider adding a child to the non-existent parent and
then adding the parent). This seems to diverge from the paper algorithm
and complicates implementation. Make it simpler.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:30 +03:00
Evgenii Stratonikov
6b02df7b8c [#1444] pilorama: Fix TreeMove in bbolt backend
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:23:23 +03:00
Evgenii Stratonikov
aec4f54a00 [#1444] pilorama: Optimize internal encoding/decoding
```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     55.5µs ± 4%    55.5µs ± 3%     ~     (p=1.000 n=10+7)
ApplyReorderLast/bbolt-8     108µs ± 6%     112µs ± 8%     ~     (p=0.077 n=9+9)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     28.8kB ± 3%    27.7kB ± 6%   -3.79%  (p=0.005 n=10+10)
ApplyReorderLast/bbolt-8    41.4kB ± 5%    38.9kB ± 5%   -6.19%  (p=0.001 n=10+9)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        262 ± 2%       235 ±10%  -10.41%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8       684 ± 6%       616 ± 7%  -10.04%  (p=0.000 n=10+9)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:22:00 +03:00
Evgenii Stratonikov
06f2681178 [#1442] pilorama: Generate timestamp based on node position in the container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:50 +03:00
Evgenii Stratonikov
fa57a8be44 [#1431] pilorama: Use Batch for write transactions
Helps a lot in case of concurrent request flow.

```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     78.0µs ± 9%    59.8µs ± 4%  -23.39%  (p=0.000 n=10+9)
ApplyReorderLast/bbolt-8     143µs ± 5%     113µs ±15%  -21.06%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     56.9kB ± 8%    28.9kB ± 3%  -49.22%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8    87.3kB ± 3%    40.9kB ±10%  -53.16%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        224 ±11%       262 ± 5%  +16.93%  (p=0.000 n=9+10)
ApplyReorderLast/bbolt-8       518 ± 4%       674 ±11%  +30.09%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:55 +03:00
Evgenii Stratonikov
d6d7e35454 [#1431] pilorama: Cache attributes in the index
Currently to find a node by path we iterate over all the children on
each level. This is far from optimal and scales badly with the number of
nodes on a single level. Thus we introduce "indexed attributes" for
which an additional information is stored and which can be use in
`*ByPath` operations. Currently this set only includes `FileName`
attribute but this may change in future.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:23 +03:00
Evgenii Stratonikov
241d4d6810 [#1431] engine: Add benchmark for Select vs TreeGetByPath
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:15 +03:00
Evgenii Stratonikov
35fa445195 [#1329] pilorama: Allow to benchmark all tree backends
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:01 +03:00
Evgenii Stratonikov
9cbd4271f1 [#1329] services/tree: Implement GetOpLog RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:17:22 +03:00
Evgenii Stratonikov
3cc67db083 [#1419] pilorama: Create new nodes in path if needed
Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add
a new node by path `["dir", "file.txt"]`, create a new intermediate node
with a single attribute.

`GetByPath` now also considers only nodes with a single attribute while building a path.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:16:16 +03:00
Evgenii Stratonikov
730f14e4eb [#1406] pilorama: Return parent from TreeGetMeta
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:15:52 +03:00
Evgenii Stratonikov
34cab7be82 [#1344] pilorama: Document errors for Get* methods
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:39 +03:00
Evgenii Stratonikov
59bd5ac973 [#1344] engine: Log errors in Tree* operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:32 +03:00
Evgenii Stratonikov
e2c88a9983 [#1344] pilorama: Use require.ErrorIs in tests
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:27 +03:00
Evgenii Stratonikov
dd7c4385c6 [#1326] services/tree: Implement GetSubTree RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:50:13 +03:00
Evgenii Stratonikov
4a65eb7e5f [#1324] engine: Implement Forest interface for storage engine
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:47:40 +03:00
Evgenii Stratonikov
cf73feb3f8 [#1324] local_object_storage: Implement tree service backend
In this commit we implement algorithm for CRDT trees from
https://martin.klepmann.com/papers/move-op.pdf

Each tree is identified by the ID of a container it belongs to
and the tree name itself. Essentially, it is a sequence of operations
which should be applied in chronological order to get a usual tree
representation.

There are 2 backends for now: bbolt database and in-memory.
In-memory backend is here for debugging and will eventually act
as a memory-cache for the on-disk database.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:47:40 +03:00
Evgenii Stratonikov
633b4e7d2d [#1483] metabase: Add VERSION.md
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:48:28 +03:00
Evgenii Stratonikov
6f243a2a76 [#1483] metabase: Store version
The main problem is to distinguish the case of initial initialization
and update from version 0. We can't do this at `Open`, because of
`resync_metabase` flag. Thus, the following approach was taken:
1. During `Open` check whether the metabase was initialized.
2. Check for the version in `Init` or write the new one if the metabase
   is new.
3. Update version in `Reset`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:48:28 +03:00
Evgenii Stratonikov
7df50297cd [#1520] shard: Ignore errors on metabase refill
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:25:28 +03:00
Evgenii Stratonikov
78ea450c25 [#1502] shard: Process locks on metabase refill
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 11:07:26 +03:00
Evgenii Stratonikov
972ca83e23 [#1524] writecache: Add some bolt parameters to the configuration
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-20 17:04:35 +03:00
Evgenii Stratonikov
07e06249d5 [#1524] metabase: Add some bolt parameters to the configuration
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-20 17:04:35 +03:00
Evgenii Stratonikov
81f925d5a0 [#1516] metabase: Optimize ListWithCursor for long listings
Cache buckets outside of the main loop and allocate memory for the
resulting offset only once.

```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       6.45µs ±14%    5.79µs ±11%  -10.24%  (p=0.002 n=10+10)
ListWithCursor/10_items-8     20.9µs ±17%    17.3µs ± 9%  -17.27%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     153µs ±12%     131µs ± 9%  -14.63%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       2.31kB ± 0%    1.91kB ± 0%  -17.46%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     6.94kB ± 0%    5.50kB ± 0%  -20.78%  (p=0.000 n=8+8)
ListWithCursor/100_items-8    53.3kB ± 0%    41.5kB ± 0%  -22.18%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         40.0 ± 0%      34.0 ± 0%  -15.00%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        121 ± 0%       100 ± 0%  -17.36%  (p=0.000 n=10+10)
ListWithCursor/100_items-8       930 ± 0%       758 ± 0%  -18.49%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov
a93373fe71 [#1516] metabase: Cache graveyard buckets in ListWithCursor
```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       6.40µs ±13%    6.45µs ±14%     ~     (p=0.739 n=10+10)
ListWithCursor/10_items-8     30.9µs ±21%    20.9µs ±17%  -32.49%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     274µs ±27%     153µs ±12%  -44.09%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       2.26kB ± 0%    2.31kB ± 0%   +2.46%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     10.8kB ± 0%     6.9kB ± 0%  -36.07%  (p=0.000 n=8+8)
ListWithCursor/100_items-8    96.8kB ± 0%    53.3kB ± 0%  -44.98%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         39.0 ± 0%      40.0 ± 0%   +2.56%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        192 ± 0%       121 ± 0%  -36.98%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     1.72k ± 0%     0.93k ± 0%  -45.93%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>

name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       5.23µs ±19%    5.26µs ±15%     ~     (p=0.853 n=10+10)
ListWithCursor/10_items-8     27.2µs ±15%    18.0µs ±19%  -33.80%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     250µs ±13%     139µs ±15%  -44.27%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       1.99kB ± 0%    2.04kB ± 0%   +2.82%  (p=0.000 n=8+8)
ListWithCursor/10_items-8     10.3kB ± 0%     6.4kB ± 0%  -37.83%  (p=0.000 n=8+10)
ListWithCursor/100_items-8    93.9kB ± 0%    50.4kB ± 0%  -46.37%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         35.0 ± 0%      36.0 ± 0%   +2.86%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        184 ± 0%       113 ± 0%  -38.59%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     1.67k ± 0%     0.88k ± 0%  -47.29%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov
af4db8a73b [#1516] metabase: Cache address key and do not decode address twice
```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       10.6µs ± 1%     6.4µs ±13%  -39.62%  (p=0.000 n=7+10)
ListWithCursor/10_items-8     75.3µs ± 2%    30.9µs ±21%  -58.97%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     726µs ± 2%     274µs ±27%  -62.28%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       3.19kB ± 0%    2.26kB ± 0%  -29.21%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     20.7kB ± 0%    10.8kB ± 0%  -47.68%  (p=0.000 n=10+8)
ListWithCursor/100_items-8     196kB ± 0%      97kB ± 0%  -50.65%  (p=0.000 n=7+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         55.0 ± 0%      39.0 ± 0%  -29.09%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        346 ± 0%       192 ± 0%  -44.51%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     3.25k ± 0%     1.72k ± 0%  -47.13%  (p=0.000 n=9+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov
504f45e9ee [#1516] metabase: Add benchmark for ListWithCursor
```
name                        time/op
ListWithCursor/1_item-8     10.6µs ± 1%
ListWithCursor/10_items-8   75.3µs ± 2%
ListWithCursor/100_items-8   726µs ± 2%

name                        alloc/op
ListWithCursor/1_item-8     3.19kB ± 0%
ListWithCursor/10_items-8   20.7kB ± 0%
ListWithCursor/100_items-8   196kB ± 0%

name                        allocs/op
ListWithCursor/1_item-8       55.0 ± 0%
ListWithCursor/10_items-8      346 ± 0%
ListWithCursor/100_items-8   3.25k ± 0%
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov
f602d05b0a [#1494] *: Fix linter warnings
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 12:26:10 +03:00
Evgenii Stratonikov
38772e1a2e writecache: provide timeout to bbolt.Open
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-09 14:19:47 +03:00
Pavel Karpy
010253a97a [#1460] blobovnicza: Do not use pointers as the results
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
0e4a1beecf [#1460] blobstor: Do not use pointers as the results
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
08bf8a68f1 [#1460] engine: Do not use pointers as the results
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
7b6363f4c6 [#1460] shard: Do not use pointers as the results
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
9b2932609b [#1460] meta: Do not use pointers as the results
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
a580429996 [#1460] meta: Add a benchmark on Get operation
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
b0c7b7851a [#1418] blobstor: Do not use pointers as parameters
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
14366bbd89 [#1418] engine: Do not use pointers as parameters
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00
Pavel Karpy
5f57db6bf8 [#1418] shard: Do not use pointers as parameters
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-06 18:03:12 +03:00