Commit graph

1829 commits

Author SHA1 Message Date
Evgenii Stratonikov
9857a20c0d [] pilorama: Provide timeout to bbolt.Open
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:28:23 +03:00
Evgenii Stratonikov
1fed255c5b [] pilorama: Allow to customize database parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:59 +03:00
Evgenii Stratonikov
2c8a87a469 [] services/tree: Document *.proto files
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:57 +03:00
Evgenii Stratonikov
681df24547 [] services/control: allow to synchronize local trees
Do not check that a node indeed belongs to the container, because the
synchronization will fail in this case anyway.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:49 +03:00
Evgenii Stratonikov
982cb987a3 [] engine: Increase error counter for pilorama errors
1. Modifying operations are not expected to fail, unless the shard is
   read-only.
2. `Get*` operations should increase error counter too, unless the
   error is `ErrTreeNotFound`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:38 +03:00
Evgenii Stratonikov
5408efef82 [] services/control: Return pilorama info in ListShards RPC
Do not return backend type from the service for now, because memory
backend is expected to vanish.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:27:17 +03:00
Evgenii Stratonikov
62b2769a66 [] local_object_storage: Support ReadOnly mode in pilorama
The tricky part here is the engine itself: we stop iteration on
`ErrReadOnly` because it is better to synchronize the shard later than
to have partial trees stored in 2 shards.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:53 +03:00
Evgenii Stratonikov
199ee3a680 [] pilorama: Fix TreeApply
Current implementation prevents invalid operations to become valid at
some later point (consider adding a child to the non-existent parent and
then adding the parent). This seems to diverge from the paper algorithm
and complicates implementation. Make it simpler.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:30 +03:00
Evgenii Stratonikov
73df95b8d3 [] services/tree: wait some time before reconnecting after failure
In case node is down or failing for some reason, we can expect `Dial` to
fail. In case we actively try to replicate and `Dial` always takes 2
seconds, replication-related channels quickly become full. That affects
latency of all other write operations.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:26 +03:00
Evgenii Stratonikov
96277c650f [] services/tree: Cache the list of container nodes
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:26:24 +03:00
Evgenii Stratonikov
879c1de59d [] services/tree: Use grpc.WithInsecure only for nodes without TLS
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:23:45 +03:00
Evgenii Stratonikov
6b02df7b8c [] pilorama: Fix TreeMove in bbolt backend
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:23:23 +03:00
Evgenii Stratonikov
578fbdca57 [] services/tree: Parallelize replicator
Before this commit the replication channel was quickly filled under
heavy load. This lead to the continuously increasing latency for all
write operations. Now it looks better.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:23:19 +03:00
Evgenii Stratonikov
aec4f54a00 [] pilorama: Optimize internal encoding/decoding
```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     55.5µs ± 4%    55.5µs ± 3%     ~     (p=1.000 n=10+7)
ApplyReorderLast/bbolt-8     108µs ± 6%     112µs ± 8%     ~     (p=0.077 n=9+9)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     28.8kB ± 3%    27.7kB ± 6%   -3.79%  (p=0.005 n=10+10)
ApplyReorderLast/bbolt-8    41.4kB ± 5%    38.9kB ± 5%   -6.19%  (p=0.001 n=10+9)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        262 ± 2%       235 ±10%  -10.41%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8       684 ± 6%       616 ± 7%  -10.04%  (p=0.000 n=10+9)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:22:00 +03:00
Evgenii Stratonikov
c9ddc8fbeb [] services/tree: Cache connections to the container nodes
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:20:33 +03:00
Evgenii Stratonikov
06f2681178 [] pilorama: Generate timestamp based on node position in the container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:50 +03:00
Evgenii Stratonikov
55a9a39f9e [] services/tree: Fix log message for failed Apply
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:45 +03:00
Evgenii Stratonikov
d244b2658a [] services/tree: Marshal public key once
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:31 +03:00
Evgenii Stratonikov
86c6c24b86 [] services/tree: Retransmit queries to container nodes
Also fix a bug with replicator using the multiaddress instead of
<host>:<port> format expected by gRPC library.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:19:21 +03:00
Evgenii Stratonikov
fa57a8be44 [] pilorama: Use Batch for write transactions
Helps a lot in case of concurrent request flow.

```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     78.0µs ± 9%    59.8µs ± 4%  -23.39%  (p=0.000 n=10+9)
ApplyReorderLast/bbolt-8     143µs ± 5%     113µs ±15%  -21.06%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     56.9kB ± 8%    28.9kB ± 3%  -49.22%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8    87.3kB ± 3%    40.9kB ±10%  -53.16%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        224 ±11%       262 ± 5%  +16.93%  (p=0.000 n=9+10)
ApplyReorderLast/bbolt-8       518 ± 4%       674 ±11%  +30.09%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:55 +03:00
Evgenii Stratonikov
d6d7e35454 [] pilorama: Cache attributes in the index
Currently to find a node by path we iterate over all the children on
each level. This is far from optimal and scales badly with the number of
nodes on a single level. Thus we introduce "indexed attributes" for
which an additional information is stored and which can be use in
`*ByPath` operations. Currently this set only includes `FileName`
attribute but this may change in future.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:23 +03:00
Evgenii Stratonikov
241d4d6810 [] engine: Add benchmark for Select vs TreeGetByPath
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:15 +03:00
Evgenii Stratonikov
b3ca9ce775 [] services/tree: Synchronize from the last stored height
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:09 +03:00
Evgenii Stratonikov
35fa445195 [] pilorama: Allow to benchmark all tree backends
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:18:01 +03:00
Evgenii Stratonikov
9cbd4271f1 [] services/tree: Implement GetOpLog RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:17:22 +03:00
Evgenii Stratonikov
b19de6116f [] services/tree: Do not replicate to a local node
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:17:15 +03:00
Evgenii Stratonikov
3cc67db083 [] pilorama: Create new nodes in path if needed
Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add
a new node by path `["dir", "file.txt"]`, create a new intermediate node
with a single attribute.

`GetByPath` now also considers only nodes with a single attribute while building a path.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:16:16 +03:00
Evgenii Stratonikov
730f14e4eb [] pilorama: Return parent from TreeGetMeta
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:15:52 +03:00
Denis Kirillov
7af3424bad [] services/tree: fix nodeId in GetSubTree
Signed-off-by: Denis Kirillov <denis@nspcc.ru>
2022-07-08 13:15:48 +03:00
Evgenii Stratonikov
427f63e359 [] services/tree: Fix grpc import path
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:15:43 +03:00
Evgenii Stratonikov
035963d147 [] services/tree: Implement access control
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:15:41 +03:00
Evgenii Stratonikov
f6589331b6 [] services/tree: Fix proto field numbers
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:53 +03:00
Evgenii Stratonikov
34cab7be82 [] pilorama: Document errors for Get* methods
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:39 +03:00
Evgenii Stratonikov
59bd5ac973 [] engine: Log errors in Tree* operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:32 +03:00
Evgenii Stratonikov
e2c88a9983 [] pilorama: Use require.ErrorIs in tests
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 13:13:27 +03:00
Evgenii Stratonikov
dd7c4385c6 [] services/tree: Implement GetSubTree RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:50:13 +03:00
Evgenii Stratonikov
375c30e687 [] services/tree: Implement Object Tree Service
Object Tree Service allows changing trees assotiated with
the container in runtime.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:50:12 +03:00
Evgenii Stratonikov
4a65eb7e5f [] engine: Implement Forest interface for storage engine
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:47:40 +03:00
Evgenii Stratonikov
cf73feb3f8 [] local_object_storage: Implement tree service backend
In this commit we implement algorithm for CRDT trees from
https://martin.klepmann.com/papers/move-op.pdf

Each tree is identified by the ID of a container it belongs to
and the tree name itself. Essentially, it is a sequence of operations
which should be applied in chronological order to get a usual tree
representation.

There are 2 backends for now: bbolt database and in-memory.
In-memory backend is here for debugging and will eventually act
as a memory-cache for the on-disk database.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-08 12:47:40 +03:00
Pavel Karpy
1658242e00 [] node: Smart memory allocation in GetRange
Allocate memory only if a node chosen as the forwarded request receiver
has responded with a successful status.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 12:04:03 +03:00
Pavel Karpy
fed5e76e7f [] node: Check status of forwarded requests
After fixing version fields in forwarded requests, a node does not check
statuses since errors are not covered by direct call error checks.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-08 12:04:03 +03:00
Leonard Lyubich
fdf62e8562 [] Upgrade NeoFS SDK Go to rc#5
Error checkers now support wrapped errors so there is no need to
explicitly unwrap errors in `Policer`.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-07 14:23:41 +03:00
Leonard Lyubich
9a11a75b77 [] Upgrade NeoFS SDK Go with changed reputation API
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-06 18:21:24 +03:00
Pavel Karpy
9a6da336db [] node: Do not lose API version on forwarding
Forwarded requests contained zero version in their meta header. It did not
allow responding with API statuses (`v0.0` version considered to be older
than `v2.11`) to the forwarding node and, therefore, did not allow analyzing
responses.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-06 17:06:24 +03:00
Pavel Karpy
c8506b247e [] *: Fix linter warnings
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-06 11:56:01 +03:00
Pavel Karpy
a153e040cb [] ir: Do not use map for combining SG ID and SG object
Also, add corresponding structure for that.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-06 09:56:26 +03:00
Pavel Karpy
ec07fda97b [] pkg: Move SGSource to the core directory
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-06 09:56:26 +03:00
Pavel Karpy
27304455bf [] ir: Filter expired SGs in the audit process
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-06 09:56:26 +03:00
Pavel Karpy
7864959d0c [] ir: Move GetSG interface
`auditor` does not need to request SG: processor will fetch that info before
audit context initialization.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-06 09:56:26 +03:00
Pavel Karpy
6370c2e160 [] ir: Make ClientCache not depend on audit.Task
That allows using `ClientCache` for storage group searching before task
context is initialized. Also, that makes it more general purpose.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-06 09:56:26 +03:00