Commit Graph

498 Commits (7b882b26d8cfd35e13bf61dc5c63d55903da0840)

Author SHA1 Message Date
Evgenii Stratonikov 7b882b26d8 [#1559] shard: Remove public functions
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov 34d319fed2 [#1559] metabase: Use `Set` prefix for parameter setting
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov f58234aa2f [#1559] metabase: Remove public functions
Reduce public interface of this package. Later each result will contain
an additional status, so it makes more sense to use the same functions
and result processing everywhere.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov a4adb79db7 [#1607] pilorama: Enable tree service explicitly
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov d62723f038 [#1505] pilorama: Provide timeout to `bbolt.Open`
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 26041f18bf [#1505] pilorama: Allow to customize database parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov c7b10598f9 [#1333] engine: Increase error counter for pilorama errors
1. Modifying operations are not expected to fail, unless the shard is
   read-only.
2. `Get*` operations should increase error counter too, unless the
   error is `ErrTreeNotFound`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 5e843a73f9 [#1333] services/control: Return pilorama info in `ListShards` RPC
Do not return backend type from the service for now, because memory
backend is expected to vanish.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 8f4ee1aded [#1333] local_object_storage: Support ReadOnly mode in pilorama
The tricky part here is the engine itself: we stop iteration on
`ErrReadOnly` because it is better to synchronize the shard later than
to have partial trees stored in 2 shards.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 735931c842 [#1481] pilorama: Fix `TreeApply`
Current implementation prevents invalid operations to become valid at
some later point (consider adding a child to the non-existent parent and
then adding the parent). This seems to diverge from the paper algorithm
and complicates implementation. Make it simpler.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov ad3038d16d [#1444] pilorama: Fix `TreeMove` in bbolt backend
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 8027b7bb6b [#1444] pilorama: Optimize internal encoding/decoding
```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     55.5µs ± 4%    55.5µs ± 3%     ~     (p=1.000 n=10+7)
ApplyReorderLast/bbolt-8     108µs ± 6%     112µs ± 8%     ~     (p=0.077 n=9+9)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     28.8kB ± 3%    27.7kB ± 6%   -3.79%  (p=0.005 n=10+10)
ApplyReorderLast/bbolt-8    41.4kB ± 5%    38.9kB ± 5%   -6.19%  (p=0.001 n=10+9)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        262 ± 2%       235 ±10%  -10.41%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8       684 ± 6%       616 ± 7%  -10.04%  (p=0.000 n=10+9)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 4437cd7113 [#1442] pilorama: Generate timestamp based on node position in the container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 3312924b82 [#1431] pilorama: Use `Batch` for write transactions
Helps a lot in case of concurrent request flow.

```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     78.0µs ± 9%    59.8µs ± 4%  -23.39%  (p=0.000 n=10+9)
ApplyReorderLast/bbolt-8     143µs ± 5%     113µs ±15%  -21.06%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     56.9kB ± 8%    28.9kB ± 3%  -49.22%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8    87.3kB ± 3%    40.9kB ±10%  -53.16%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        224 ±11%       262 ± 5%  +16.93%  (p=0.000 n=9+10)
ApplyReorderLast/bbolt-8       518 ± 4%       674 ±11%  +30.09%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov f0a67f948d [#1431] pilorama: Cache attributes in the index
Currently to find a node by path we iterate over all the children on
each level. This is far from optimal and scales badly with the number of
nodes on a single level. Thus we introduce "indexed attributes" for
which an additional information is stored and which can be use in
`*ByPath` operations. Currently this set only includes `FileName`
attribute but this may change in future.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov f5d35571d0 [#1431] engine: Add benchmark for `Select` vs `TreeGetByPath`
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 835170e452 [#1329] pilorama: Allow to benchmark all tree backends
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 536857ea5a [#1329] services/tree: Implement `GetOpLog` RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 7703dd5d7f [#1419] pilorama: Create new nodes in path if needed
Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add
a new node by path `["dir", "file.txt"]`, create a new intermediate node
with a single attribute.

`GetByPath` now also considers only nodes with a single attribute while building a path.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov ad48918a97 [#1406] pilorama: Return parent from `TreeGetMeta`
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 8844d9b2db [#1344] pilorama: Document errors for Get* methods
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov d8ad68d613 [#1344] engine: Log errors in Tree* operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 910db42748 [#1344] pilorama: Use `require.ErrorIs` in tests
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov aea855e8f3 [#1326] services/tree: Implement GetSubTree RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 46f4ce2773 [#1324] engine: Implement Forest interface for storage engine
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 8cf71b7f1c [#1324] local_object_storage: Implement tree service backend
In this commit we implement algorithm for CRDT trees from
https://martin.klepmann.com/papers/move-op.pdf

Each tree is identified by the ID of a container it belongs to
and the tree name itself. Essentially, it is a sequence of operations
which should be applied in chronological order to get a usual tree
representation.

There are 2 backends for now: bbolt database and in-memory.
In-memory backend is here for debugging and will eventually act
as a memory-cache for the on-disk database.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov 11c9df6f00 [#1611] shard: Print shard ID in component logs
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-19 15:57:53 +03:00
Pavel Karpy 9f7a22e2aa [#1561] object: Return `OUT_OF_RANGE` status
Replace `ErrRangeOutOfBounds` error from `pkg/core/object` package with
`ObjectOutOfRange` from `apistatus` package. That error is returned by
storage node's server as NeoFS API statuses.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-19 13:29:45 +03:00
Pavel Karpy cb0bb7207c [#1461] shard: Add a separate `ErrLockObjectRemoval`
Do not return `meta.ErrLockObjectRemoval` from shard's methods, add shard's
own error for that.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy 558cc1193a [#1461] engine: Clarify force removal
Document force removal behaviour in all the Storage engine parts.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy 51afcc1182 [#1461] engine, policer: Force remove objects w/o container
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy 18ec5d7c8e [#1461] meta: Return error on lock object removal
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy b5c56d459a [#1461] engine: Add force lock removal tests
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy 63c00e785d [#1461] shard: Fix option naming
`WitDeletedLockCallback` => `WithDeletedLockCallback`.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy 5122be34e7 [#1461] shard: Add lock tests
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy eaf96bccf7 [#1461] meta: Add lock tests
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy c6cf8e5c0b [#1461] engine: Fix LOCK test
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy 9c5ef3bab8 [#1461] node: Allow force LOCK removal
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy fed9e6679d [#1461] node: Unlock locked object on its lock removal
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Evgenii Stratonikov 72c044e2eb [#1599] engine: Parallelize shard initialization
Shard is intended to be used as a separate failure domain,
which usually resides on a separate disk. Thus, sequential
initialization is bound by IO and this change speeds up thing a bit.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-15 13:44:54 +03:00
Evgenii Stratonikov 633b4e7d2d [#1483] metabase: Add VERSION.md
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:48:28 +03:00
Evgenii Stratonikov 6f243a2a76 [#1483] metabase: Store version
The main problem is to distinguish the case of initial initialization
and update from version 0. We can't do this at `Open`, because of
`resync_metabase` flag. Thus, the following approach was taken:
1. During `Open` check whether the metabase was initialized.
2. Check for the version in `Init` or write the new one if the metabase
   is new.
3. Update version in `Reset`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:48:28 +03:00
Evgenii Stratonikov 7df50297cd [#1520] shard: Ignore errors on metabase refill
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:25:28 +03:00
Evgenii Stratonikov 78ea450c25 [#1502] shard: Process locks on metabase refill
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 11:07:26 +03:00
Evgenii Stratonikov 972ca83e23 [#1524] writecache: Add some bolt parameters to the configuration
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-20 17:04:35 +03:00
Evgenii Stratonikov 07e06249d5 [#1524] metabase: Add some bolt parameters to the configuration
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-20 17:04:35 +03:00
Evgenii Stratonikov 81f925d5a0 [#1516] metabase: Optimize `ListWithCursor` for long listings
Cache buckets outside of the main loop and allocate memory for the
resulting offset only once.

```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       6.45µs ±14%    5.79µs ±11%  -10.24%  (p=0.002 n=10+10)
ListWithCursor/10_items-8     20.9µs ±17%    17.3µs ± 9%  -17.27%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     153µs ±12%     131µs ± 9%  -14.63%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       2.31kB ± 0%    1.91kB ± 0%  -17.46%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     6.94kB ± 0%    5.50kB ± 0%  -20.78%  (p=0.000 n=8+8)
ListWithCursor/100_items-8    53.3kB ± 0%    41.5kB ± 0%  -22.18%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         40.0 ± 0%      34.0 ± 0%  -15.00%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        121 ± 0%       100 ± 0%  -17.36%  (p=0.000 n=10+10)
ListWithCursor/100_items-8       930 ± 0%       758 ± 0%  -18.49%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov a93373fe71 [#1516] metabase: Cache graveyard buckets in `ListWithCursor`
```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       6.40µs ±13%    6.45µs ±14%     ~     (p=0.739 n=10+10)
ListWithCursor/10_items-8     30.9µs ±21%    20.9µs ±17%  -32.49%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     274µs ±27%     153µs ±12%  -44.09%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       2.26kB ± 0%    2.31kB ± 0%   +2.46%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     10.8kB ± 0%     6.9kB ± 0%  -36.07%  (p=0.000 n=8+8)
ListWithCursor/100_items-8    96.8kB ± 0%    53.3kB ± 0%  -44.98%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         39.0 ± 0%      40.0 ± 0%   +2.56%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        192 ± 0%       121 ± 0%  -36.98%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     1.72k ± 0%     0.93k ± 0%  -45.93%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>

name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       5.23µs ±19%    5.26µs ±15%     ~     (p=0.853 n=10+10)
ListWithCursor/10_items-8     27.2µs ±15%    18.0µs ±19%  -33.80%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     250µs ±13%     139µs ±15%  -44.27%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       1.99kB ± 0%    2.04kB ± 0%   +2.82%  (p=0.000 n=8+8)
ListWithCursor/10_items-8     10.3kB ± 0%     6.4kB ± 0%  -37.83%  (p=0.000 n=8+10)
ListWithCursor/100_items-8    93.9kB ± 0%    50.4kB ± 0%  -46.37%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         35.0 ± 0%      36.0 ± 0%   +2.86%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        184 ± 0%       113 ± 0%  -38.59%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     1.67k ± 0%     0.88k ± 0%  -47.29%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov af4db8a73b [#1516] metabase: Cache address key and do not decode address twice
```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       10.6µs ± 1%     6.4µs ±13%  -39.62%  (p=0.000 n=7+10)
ListWithCursor/10_items-8     75.3µs ± 2%    30.9µs ±21%  -58.97%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     726µs ± 2%     274µs ±27%  -62.28%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       3.19kB ± 0%    2.26kB ± 0%  -29.21%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     20.7kB ± 0%    10.8kB ± 0%  -47.68%  (p=0.000 n=10+8)
ListWithCursor/100_items-8     196kB ± 0%      97kB ± 0%  -50.65%  (p=0.000 n=7+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         55.0 ± 0%      39.0 ± 0%  -29.09%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        346 ± 0%       192 ± 0%  -44.51%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     3.25k ± 0%     1.72k ± 0%  -47.13%  (p=0.000 n=9+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov 504f45e9ee [#1516] metabase: Add benchmark for `ListWithCursor`
```
name                        time/op
ListWithCursor/1_item-8     10.6µs ± 1%
ListWithCursor/10_items-8   75.3µs ± 2%
ListWithCursor/100_items-8   726µs ± 2%

name                        alloc/op
ListWithCursor/1_item-8     3.19kB ± 0%
ListWithCursor/10_items-8   20.7kB ± 0%
ListWithCursor/100_items-8   196kB ± 0%

name                        allocs/op
ListWithCursor/1_item-8       55.0 ± 0%
ListWithCursor/10_items-8      346 ± 0%
ListWithCursor/100_items-8   3.25k ± 0%
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00