The tricky part here is the engine itself: we stop iteration on
`ErrReadOnly` because it is better to synchronize the shard later than
to have partial trees stored in 2 shards.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Current implementation prevents invalid operations to become valid at
some later point (consider adding a child to the non-existent parent and
then adding the parent). This seems to diverge from the paper algorithm
and complicates implementation. Make it simpler.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
In case node is down or failing for some reason, we can expect `Dial` to
fail. In case we actively try to replicate and `Dial` always takes 2
seconds, replication-related channels quickly become full. That affects
latency of all other write operations.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Before this commit the replication channel was quickly filled under
heavy load. This lead to the continuously increasing latency for all
write operations. Now it looks better.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Also fix a bug with replicator using the multiaddress instead of
<host>:<port> format expected by gRPC library.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Currently to find a node by path we iterate over all the children on
each level. This is far from optimal and scales badly with the number of
nodes on a single level. Thus we introduce "indexed attributes" for
which an additional information is stored and which can be use in
`*ByPath` operations. Currently this set only includes `FileName`
attribute but this may change in future.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add
a new node by path `["dir", "file.txt"]`, create a new intermediate node
with a single attribute.
`GetByPath` now also considers only nodes with a single attribute while building a path.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
In this commit we implement algorithm for CRDT trees from
https://martin.klepmann.com/papers/move-op.pdf
Each tree is identified by the ID of a container it belongs to
and the tree name itself. Essentially, it is a sequence of operations
which should be applied in chronological order to get a usual tree
representation.
There are 2 backends for now: bbolt database and in-memory.
In-memory backend is here for debugging and will eventually act
as a memory-cache for the on-disk database.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Replace `ErrRangeOutOfBounds` error from `pkg/core/object` package with
`ObjectOutOfRange` from `apistatus` package. That error is returned by
storage node's server as NeoFS API statuses.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Shard is intended to be used as a separate failure domain,
which usually resides on a separate disk. Thus, sequential
initialization is bound by IO and this change speeds up thing a bit.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Do not calculate and do not write homomorphic hash for containers that were
configured to store objects without hash.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Do not use homomorphic hash in storage group for containers that have
`homomorphic_hashing_disabled` set to `true`.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
If the container ID is not nil and not equal to the container ID in the
request, consider bearer token invalid.
See also nspcc-dev/neofs-api#207.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Allocate memory only if a node chosen as the forwarded request receiver
has responded with a successful status.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
After fixing version fields in forwarded requests, a node does not check
statuses since errors are not covered by direct call error checks.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Error checkers now support wrapped errors so there is no need to
explicitly unwrap errors in `Policer`.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Forwarded requests contained zero version in their meta header. It did not
allow responding with API statuses (`v0.0` version considered to be older
than `v2.11`) to the forwarding node and, therefore, did not allow analyzing
responses.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
`auditor` does not need to request SG: processor will fetch that info before
audit context initialization.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
That allows using `ClientCache` for storage group searching before task
context is initialized. Also, that makes it more general purpose.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Do not use `Marshal()` with object's payload. Use `ReadFromObject` func from
SDK instead. That allows checking both attributes and SG body's expiration
epoch.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
After recent changes in NeoFS SDK Go library session tokens aren't
embedded into `container.Container` and `eacl.Table` structures.
Group value, session token and signature in a structure for container
and eACL.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
The main problem is to distinguish the case of initial initialization
and update from version 0. We can't do this at `Open`, because of
`resync_metabase` flag. Thus, the following approach was taken:
1. During `Open` check whether the metabase was initialized.
2. Check for the version in `Init` or write the new one if the metabase
is new.
3. Update version in `Reset`.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
After recent changes `buildContainer` method returns two-dimensional
slice of `NodeInfo` so there is no need to flatten it to build slice of
`common.NodeInfo`.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Make `ReadNetworkConfiguration` method to return `NetworkConfiguration`
by value in order to follow storage engine improvements and prevent heap
escaping.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
`Netmap` contract exports enumeration of the node states.
Replace using literals and constants from NeoFS API Go V2 with the
values provided by contract.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Cache object that are being processed. That prevents concurrent
object handling when there is a few number of objects and object handling
takes more time that the policer needs for starting that object handling one
more time.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
If placement contains two vectors with intersecting nodes it was possible to
send the object to the nodes twice.
Also optimizes requests: do not ask about storing the object twice from the
same node.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
The node does not support asynchronous object replication anymore, so it
does not need to have replicator worker, channel and `AddTask` function.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>