Commit graph

498 commits

Author SHA1 Message Date
Evgenii Stratonikov
7b882b26d8 [#1559] shard: Remove public functions
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
34d319fed2 [#1559] metabase: Use Set prefix for parameter setting
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
f58234aa2f [#1559] metabase: Remove public functions
Reduce public interface of this package. Later each result will contain
an additional status, so it makes more sense to use the same functions
and result processing everywhere.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
a4adb79db7 [#1607] pilorama: Enable tree service explicitly
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
d62723f038 [#1505] pilorama: Provide timeout to bbolt.Open
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
26041f18bf [#1505] pilorama: Allow to customize database parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
c7b10598f9 [#1333] engine: Increase error counter for pilorama errors
1. Modifying operations are not expected to fail, unless the shard is
   read-only.
2. `Get*` operations should increase error counter too, unless the
   error is `ErrTreeNotFound`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
5e843a73f9 [#1333] services/control: Return pilorama info in ListShards RPC
Do not return backend type from the service for now, because memory
backend is expected to vanish.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8f4ee1aded [#1333] local_object_storage: Support ReadOnly mode in pilorama
The tricky part here is the engine itself: we stop iteration on
`ErrReadOnly` because it is better to synchronize the shard later than
to have partial trees stored in 2 shards.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
735931c842 [#1481] pilorama: Fix TreeApply
Current implementation prevents invalid operations to become valid at
some later point (consider adding a child to the non-existent parent and
then adding the parent). This seems to diverge from the paper algorithm
and complicates implementation. Make it simpler.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
ad3038d16d [#1444] pilorama: Fix TreeMove in bbolt backend
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8027b7bb6b [#1444] pilorama: Optimize internal encoding/decoding
```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     55.5µs ± 4%    55.5µs ± 3%     ~     (p=1.000 n=10+7)
ApplyReorderLast/bbolt-8     108µs ± 6%     112µs ± 8%     ~     (p=0.077 n=9+9)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     28.8kB ± 3%    27.7kB ± 6%   -3.79%  (p=0.005 n=10+10)
ApplyReorderLast/bbolt-8    41.4kB ± 5%    38.9kB ± 5%   -6.19%  (p=0.001 n=10+9)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        262 ± 2%       235 ±10%  -10.41%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8       684 ± 6%       616 ± 7%  -10.04%  (p=0.000 n=10+9)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
4437cd7113 [#1442] pilorama: Generate timestamp based on node position in the container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
3312924b82 [#1431] pilorama: Use Batch for write transactions
Helps a lot in case of concurrent request flow.

```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     78.0µs ± 9%    59.8µs ± 4%  -23.39%  (p=0.000 n=10+9)
ApplyReorderLast/bbolt-8     143µs ± 5%     113µs ±15%  -21.06%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     56.9kB ± 8%    28.9kB ± 3%  -49.22%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8    87.3kB ± 3%    40.9kB ±10%  -53.16%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        224 ±11%       262 ± 5%  +16.93%  (p=0.000 n=9+10)
ApplyReorderLast/bbolt-8       518 ± 4%       674 ±11%  +30.09%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
f0a67f948d [#1431] pilorama: Cache attributes in the index
Currently to find a node by path we iterate over all the children on
each level. This is far from optimal and scales badly with the number of
nodes on a single level. Thus we introduce "indexed attributes" for
which an additional information is stored and which can be use in
`*ByPath` operations. Currently this set only includes `FileName`
attribute but this may change in future.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
f5d35571d0 [#1431] engine: Add benchmark for Select vs TreeGetByPath
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
835170e452 [#1329] pilorama: Allow to benchmark all tree backends
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
536857ea5a [#1329] services/tree: Implement GetOpLog RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
7703dd5d7f [#1419] pilorama: Create new nodes in path if needed
Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add
a new node by path `["dir", "file.txt"]`, create a new intermediate node
with a single attribute.

`GetByPath` now also considers only nodes with a single attribute while building a path.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
ad48918a97 [#1406] pilorama: Return parent from TreeGetMeta
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8844d9b2db [#1344] pilorama: Document errors for Get* methods
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
d8ad68d613 [#1344] engine: Log errors in Tree* operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
910db42748 [#1344] pilorama: Use require.ErrorIs in tests
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
aea855e8f3 [#1326] services/tree: Implement GetSubTree RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
46f4ce2773 [#1324] engine: Implement Forest interface for storage engine
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8cf71b7f1c [#1324] local_object_storage: Implement tree service backend
In this commit we implement algorithm for CRDT trees from
https://martin.klepmann.com/papers/move-op.pdf

Each tree is identified by the ID of a container it belongs to
and the tree name itself. Essentially, it is a sequence of operations
which should be applied in chronological order to get a usual tree
representation.

There are 2 backends for now: bbolt database and in-memory.
In-memory backend is here for debugging and will eventually act
as a memory-cache for the on-disk database.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
11c9df6f00 [#1611] shard: Print shard ID in component logs
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-19 15:57:53 +03:00
Pavel Karpy
9f7a22e2aa [#1561] object: Return OUT_OF_RANGE status
Replace `ErrRangeOutOfBounds` error from `pkg/core/object` package with
`ObjectOutOfRange` from `apistatus` package. That error is returned by
storage node's server as NeoFS API statuses.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-19 13:29:45 +03:00
Pavel Karpy
cb0bb7207c [#1461] shard: Add a separate ErrLockObjectRemoval
Do not return `meta.ErrLockObjectRemoval` from shard's methods, add shard's
own error for that.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
558cc1193a [#1461] engine: Clarify force removal
Document force removal behaviour in all the Storage engine parts.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
51afcc1182 [#1461] engine, policer: Force remove objects w/o container
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
18ec5d7c8e [#1461] meta: Return error on lock object removal
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
b5c56d459a [#1461] engine: Add force lock removal tests
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
63c00e785d [#1461] shard: Fix option naming
`WitDeletedLockCallback` => `WithDeletedLockCallback`.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
5122be34e7 [#1461] shard: Add lock tests
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
eaf96bccf7 [#1461] meta: Add lock tests
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
c6cf8e5c0b [#1461] engine: Fix LOCK test
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
9c5ef3bab8 [#1461] node: Allow force LOCK removal
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
fed9e6679d [#1461] node: Unlock locked object on its lock removal
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Evgenii Stratonikov
72c044e2eb [#1599] engine: Parallelize shard initialization
Shard is intended to be used as a separate failure domain,
which usually resides on a separate disk. Thus, sequential
initialization is bound by IO and this change speeds up thing a bit.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-15 13:44:54 +03:00
Evgenii Stratonikov
633b4e7d2d [#1483] metabase: Add VERSION.md
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:48:28 +03:00
Evgenii Stratonikov
6f243a2a76 [#1483] metabase: Store version
The main problem is to distinguish the case of initial initialization
and update from version 0. We can't do this at `Open`, because of
`resync_metabase` flag. Thus, the following approach was taken:
1. During `Open` check whether the metabase was initialized.
2. Check for the version in `Init` or write the new one if the metabase
   is new.
3. Update version in `Reset`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:48:28 +03:00
Evgenii Stratonikov
7df50297cd [#1520] shard: Ignore errors on metabase refill
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 17:25:28 +03:00
Evgenii Stratonikov
78ea450c25 [#1502] shard: Process locks on metabase refill
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-21 11:07:26 +03:00
Evgenii Stratonikov
972ca83e23 [#1524] writecache: Add some bolt parameters to the configuration
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-20 17:04:35 +03:00
Evgenii Stratonikov
07e06249d5 [#1524] metabase: Add some bolt parameters to the configuration
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-20 17:04:35 +03:00
Evgenii Stratonikov
81f925d5a0 [#1516] metabase: Optimize ListWithCursor for long listings
Cache buckets outside of the main loop and allocate memory for the
resulting offset only once.

```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       6.45µs ±14%    5.79µs ±11%  -10.24%  (p=0.002 n=10+10)
ListWithCursor/10_items-8     20.9µs ±17%    17.3µs ± 9%  -17.27%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     153µs ±12%     131µs ± 9%  -14.63%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       2.31kB ± 0%    1.91kB ± 0%  -17.46%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     6.94kB ± 0%    5.50kB ± 0%  -20.78%  (p=0.000 n=8+8)
ListWithCursor/100_items-8    53.3kB ± 0%    41.5kB ± 0%  -22.18%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         40.0 ± 0%      34.0 ± 0%  -15.00%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        121 ± 0%       100 ± 0%  -17.36%  (p=0.000 n=10+10)
ListWithCursor/100_items-8       930 ± 0%       758 ± 0%  -18.49%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov
a93373fe71 [#1516] metabase: Cache graveyard buckets in ListWithCursor
```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       6.40µs ±13%    6.45µs ±14%     ~     (p=0.739 n=10+10)
ListWithCursor/10_items-8     30.9µs ±21%    20.9µs ±17%  -32.49%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     274µs ±27%     153µs ±12%  -44.09%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       2.26kB ± 0%    2.31kB ± 0%   +2.46%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     10.8kB ± 0%     6.9kB ± 0%  -36.07%  (p=0.000 n=8+8)
ListWithCursor/100_items-8    96.8kB ± 0%    53.3kB ± 0%  -44.98%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         39.0 ± 0%      40.0 ± 0%   +2.56%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        192 ± 0%       121 ± 0%  -36.98%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     1.72k ± 0%     0.93k ± 0%  -45.93%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>

name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       5.23µs ±19%    5.26µs ±15%     ~     (p=0.853 n=10+10)
ListWithCursor/10_items-8     27.2µs ±15%    18.0µs ±19%  -33.80%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     250µs ±13%     139µs ±15%  -44.27%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       1.99kB ± 0%    2.04kB ± 0%   +2.82%  (p=0.000 n=8+8)
ListWithCursor/10_items-8     10.3kB ± 0%     6.4kB ± 0%  -37.83%  (p=0.000 n=8+10)
ListWithCursor/100_items-8    93.9kB ± 0%    50.4kB ± 0%  -46.37%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         35.0 ± 0%      36.0 ± 0%   +2.86%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        184 ± 0%       113 ± 0%  -38.59%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     1.67k ± 0%     0.88k ± 0%  -47.29%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov
af4db8a73b [#1516] metabase: Cache address key and do not decode address twice
```
name                        old time/op    new time/op    delta
ListWithCursor/1_item-8       10.6µs ± 1%     6.4µs ±13%  -39.62%  (p=0.000 n=7+10)
ListWithCursor/10_items-8     75.3µs ± 2%    30.9µs ±21%  -58.97%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     726µs ± 2%     274µs ±27%  -62.28%  (p=0.000 n=10+10)

name                        old alloc/op   new alloc/op   delta
ListWithCursor/1_item-8       3.19kB ± 0%    2.26kB ± 0%  -29.21%  (p=0.000 n=10+10)
ListWithCursor/10_items-8     20.7kB ± 0%    10.8kB ± 0%  -47.68%  (p=0.000 n=10+8)
ListWithCursor/100_items-8     196kB ± 0%      97kB ± 0%  -50.65%  (p=0.000 n=7+10)

name                        old allocs/op  new allocs/op  delta
ListWithCursor/1_item-8         55.0 ± 0%      39.0 ± 0%  -29.09%  (p=0.000 n=10+10)
ListWithCursor/10_items-8        346 ± 0%       192 ± 0%  -44.51%  (p=0.000 n=10+10)
ListWithCursor/100_items-8     3.25k ± 0%     1.72k ± 0%  -47.13%  (p=0.000 n=9+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00
Evgenii Stratonikov
504f45e9ee [#1516] metabase: Add benchmark for ListWithCursor
```
name                        time/op
ListWithCursor/1_item-8     10.6µs ± 1%
ListWithCursor/10_items-8   75.3µs ± 2%
ListWithCursor/100_items-8   726µs ± 2%

name                        alloc/op
ListWithCursor/1_item-8     3.19kB ± 0%
ListWithCursor/10_items-8   20.7kB ± 0%
ListWithCursor/100_items-8   196kB ± 0%

name                        allocs/op
ListWithCursor/1_item-8       55.0 ± 0%
ListWithCursor/10_items-8      346 ± 0%
ListWithCursor/100_items-8   3.25k ± 0%
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-15 20:49:41 +03:00