Commit graph

1838 commits

Author SHA1 Message Date
Leonard Lyubich
263497a92b [#1549] shard: Turn to ModeDegraded on metabase failure
Make `Shard` to work in degraded mode if metabase is unavailable on
opening/init stage. Close metabase in non-degraded mode only.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-08 13:45:57 +03:00
Pavel Karpy
1684cd63fa [#1558] node: Do not put SHA256 hash as homomorphic
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 13:45:16 +03:00
Evgenii Stratonikov
e0e4f1f7ee engine: initialize shards in parallel
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:44:05 +03:00
Pavel Karpy
90b4820ee0 [#1365] morph: Do not return errors if config key is missing
Return default values instead of casting errors in `HomomorphicHashDisabled`
method.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 13:43:46 +03:00
Pavel Karpy
adcda361a7 [#1365] node: Calculate object homomorphic hash flexibly
Do not calculate and do not write homomorphic hash for containers that were
configured to store objects without hash.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 13:41:35 +03:00
Pavel Karpy
e9c534b0a0 [#1365] ir: Check homomorphic hash flexibly in audit
Do not perform that check if it was turned off for the container being
checked.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 13:39:19 +03:00
Pavel Karpy
455096ab53 [#1365] ir: Check homomorphic hash setting on ContainerPut
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 13:36:22 +03:00
Pavel Karpy
fdc934a360 [#1365] morph: Add HomomorphicHashDisabled config getter
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 13:34:40 +03:00
Pavel Karpy
ab749460cd [#1365] cli: Calculate homomorphic hash flexibly
Do not use homomorphic hash in storage group for containers that have
`homomorphic_hashing_disabled` set to `true`.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 13:34:39 +03:00
Evgenii Stratonikov
9857a20c0d [#1505] pilorama: Provide timeout to bbolt.Open
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:28:23 +03:00
Evgenii Stratonikov
1fed255c5b [#1505] pilorama: Allow to customize database parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:59 +03:00
Evgenii Stratonikov
2c8a87a469 [#1334] services/tree: Document *.proto files
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:57 +03:00
Evgenii Stratonikov
681df24547 [#1333] services/control: allow to synchronize local trees
Do not check that a node indeed belongs to the container, because the
synchronization will fail in this case anyway.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:49 +03:00
Evgenii Stratonikov
982cb987a3 [#1333] engine: Increase error counter for pilorama errors
1. Modifying operations are not expected to fail, unless the shard is
   read-only.
2. `Get*` operations should increase error counter too, unless the
   error is `ErrTreeNotFound`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:38 +03:00
Evgenii Stratonikov
5408efef82 [#1333] services/control: Return pilorama info in ListShards RPC
Do not return backend type from the service for now, because memory
backend is expected to vanish.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:17 +03:00
Evgenii Stratonikov
62b2769a66 [#1333] local_object_storage: Support ReadOnly mode in pilorama
The tricky part here is the engine itself: we stop iteration on
`ErrReadOnly` because it is better to synchronize the shard later than
to have partial trees stored in 2 shards.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:53 +03:00
Evgenii Stratonikov
199ee3a680 [#1481] pilorama: Fix TreeApply
Current implementation prevents invalid operations to become valid at
some later point (consider adding a child to the non-existent parent and
then adding the parent). This seems to diverge from the paper algorithm
and complicates implementation. Make it simpler.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:30 +03:00
Evgenii Stratonikov
73df95b8d3 [#1456] services/tree: wait some time before reconnecting after failure
In case node is down or failing for some reason, we can expect `Dial` to
fail. In case we actively try to replicate and `Dial` always takes 2
seconds, replication-related channels quickly become full. That affects
latency of all other write operations.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:26 +03:00
Evgenii Stratonikov
96277c650f [#1445] services/tree: Cache the list of container nodes
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:24 +03:00
Evgenii Stratonikov
879c1de59d [#1446] services/tree: Use grpc.WithInsecure only for nodes without TLS
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:23:45 +03:00
Evgenii Stratonikov
6b02df7b8c [#1444] pilorama: Fix TreeMove in bbolt backend
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:23:23 +03:00
Evgenii Stratonikov
578fbdca57 [#1427] services/tree: Parallelize replicator
Before this commit the replication channel was quickly filled under
heavy load. This lead to the continuously increasing latency for all
write operations. Now it looks better.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:23:19 +03:00
Evgenii Stratonikov
aec4f54a00 [#1444] pilorama: Optimize internal encoding/decoding
```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     55.5µs ± 4%    55.5µs ± 3%     ~     (p=1.000 n=10+7)
ApplyReorderLast/bbolt-8     108µs ± 6%     112µs ± 8%     ~     (p=0.077 n=9+9)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     28.8kB ± 3%    27.7kB ± 6%   -3.79%  (p=0.005 n=10+10)
ApplyReorderLast/bbolt-8    41.4kB ± 5%    38.9kB ± 5%   -6.19%  (p=0.001 n=10+9)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        262 ± 2%       235 ±10%  -10.41%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8       684 ± 6%       616 ± 7%  -10.04%  (p=0.000 n=10+9)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:22:00 +03:00
Evgenii Stratonikov
c9ddc8fbeb [#1446] services/tree: Cache connections to the container nodes
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:20:33 +03:00
Evgenii Stratonikov
06f2681178 [#1442] pilorama: Generate timestamp based on node position in the container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:50 +03:00
Evgenii Stratonikov
55a9a39f9e [#1442] services/tree: Fix log message for failed Apply
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:45 +03:00
Evgenii Stratonikov
d244b2658a [#1401] services/tree: Marshal public key once
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:31 +03:00
Evgenii Stratonikov
86c6c24b86 [#1401] services/tree: Retransmit queries to container nodes
Also fix a bug with replicator using the multiaddress instead of
<host>:<port> format expected by gRPC library.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:21 +03:00
Evgenii Stratonikov
fa57a8be44 [#1431] pilorama: Use Batch for write transactions
Helps a lot in case of concurrent request flow.

```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     78.0µs ± 9%    59.8µs ± 4%  -23.39%  (p=0.000 n=10+9)
ApplyReorderLast/bbolt-8     143µs ± 5%     113µs ±15%  -21.06%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     56.9kB ± 8%    28.9kB ± 3%  -49.22%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8    87.3kB ± 3%    40.9kB ±10%  -53.16%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        224 ±11%       262 ± 5%  +16.93%  (p=0.000 n=9+10)
ApplyReorderLast/bbolt-8       518 ± 4%       674 ±11%  +30.09%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:55 +03:00
Evgenii Stratonikov
d6d7e35454 [#1431] pilorama: Cache attributes in the index
Currently to find a node by path we iterate over all the children on
each level. This is far from optimal and scales badly with the number of
nodes on a single level. Thus we introduce "indexed attributes" for
which an additional information is stored and which can be use in
`*ByPath` operations. Currently this set only includes `FileName`
attribute but this may change in future.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:23 +03:00
Evgenii Stratonikov
241d4d6810 [#1431] engine: Add benchmark for Select vs TreeGetByPath
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:15 +03:00
Evgenii Stratonikov
b3ca9ce775 [#1329] services/tree: Synchronize from the last stored height
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:09 +03:00
Evgenii Stratonikov
35fa445195 [#1329] pilorama: Allow to benchmark all tree backends
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:01 +03:00
Evgenii Stratonikov
9cbd4271f1 [#1329] services/tree: Implement GetOpLog RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:17:22 +03:00
Evgenii Stratonikov
b19de6116f [#1426] services/tree: Do not replicate to a local node
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:17:15 +03:00
Evgenii Stratonikov
3cc67db083 [#1419] pilorama: Create new nodes in path if needed
Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add
a new node by path `["dir", "file.txt"]`, create a new intermediate node
with a single attribute.

`GetByPath` now also considers only nodes with a single attribute while building a path.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:16:16 +03:00
Evgenii Stratonikov
730f14e4eb [#1406] pilorama: Return parent from TreeGetMeta
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:15:52 +03:00
Denis Kirillov
7af3424bad [#1404] services/tree: fix nodeId in GetSubTree
Signed-off-by: Denis Kirillov <denis@nspcc.ru>
2022-07-08 13:15:48 +03:00
Evgenii Stratonikov
427f63e359 [#1328] services/tree: Fix grpc import path
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:15:43 +03:00
Evgenii Stratonikov
035963d147 [#1328] services/tree: Implement access control
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:15:41 +03:00
Evgenii Stratonikov
f6589331b6 [#1328] services/tree: Fix proto field numbers
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:53 +03:00
Evgenii Stratonikov
34cab7be82 [#1344] pilorama: Document errors for Get* methods
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:39 +03:00
Evgenii Stratonikov
59bd5ac973 [#1344] engine: Log errors in Tree* operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:32 +03:00
Evgenii Stratonikov
e2c88a9983 [#1344] pilorama: Use require.ErrorIs in tests
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:27 +03:00
Evgenii Stratonikov
dd7c4385c6 [#1326] services/tree: Implement GetSubTree RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:50:13 +03:00
Evgenii Stratonikov
375c30e687 [#1324] services/tree: Implement Object Tree Service
Object Tree Service allows changing trees assotiated with
the container in runtime.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:50:12 +03:00
Evgenii Stratonikov
4a65eb7e5f [#1324] engine: Implement Forest interface for storage engine
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:47:40 +03:00
Evgenii Stratonikov
cf73feb3f8 [#1324] local_object_storage: Implement tree service backend
In this commit we implement algorithm for CRDT trees from
https://martin.klepmann.com/papers/move-op.pdf

Each tree is identified by the ID of a container it belongs to
and the tree name itself. Essentially, it is a sequence of operations
which should be applied in chronological order to get a usual tree
representation.

There are 2 backends for now: bbolt database and in-memory.
In-memory backend is here for debugging and will eventually act
as a memory-cache for the on-disk database.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:47:40 +03:00
Pavel Karpy
1658242e00 [#1590] node: Smart memory allocation in GetRange
Allocate memory only if a node chosen as the forwarded request receiver
has responded with a successful status.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 12:04:03 +03:00
Pavel Karpy
fed5e76e7f [#1590] node: Check status of forwarded requests
After fixing version fields in forwarded requests, a node does not check
statuses since errors are not covered by direct call error checks.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 12:04:03 +03:00