We use cache to avoid policing the same object multiple times in a short
time span (< 30 seconds). If we have 200_000 objects in a blobstor, it is a bit useless
-- if it takes 1 second to process an object and we have `replicator.pool_size: 20`
in config, the next iteration will happen in 10_000 second which is much
larger than 30 second. However we still consume a lot of memory, so it
makes sense to use saner default.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Stop child objects collection if the last returned object (the most "left"
object in the collected chain) starts exactly from the `GETRANGE`'s `from`
value.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
That could happen if a node forwards request to a node that closed the
connection during the original object stream.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Could be related to "websocket users limit reached" on the `neo-go` server
side when an SN/IR is rebooting repeatedly.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
1. Replace `mn` function with a `sigCount`.
2. Use `notary.FakeMultisigAccount` for account creation.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Includes:
1. mode change read lock operation in every exported method that r/w the
underlying database;
2. returning `ErrDegradedMode` logical error if any exported method is
called in degraded (without a metabase) mode.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
```
2022/11/15 08:40:56 worker exits from a panic: runtime error: index out of range [0] with length 0
2022/11/15 08:40:56 worker exits from panic: goroutine 1188 [running]:
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
github.com/panjf2000/ants/v2@v2.4.0/worker.go:58 +0x10c
panic({0x1042b60, 0xc0015ae018})
runtime/panic.go:1038 +0x215
github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).shardPolicyWorker.func1()
github.com/nspcc-dev/neofs-node/pkg/services/policer/process.go:65 +0x366
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x68
```
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Includes extending listing methods in the Storage Engine with object types.
It allows tuning replication/policer algorithms: container nodes do
not remove `LOCK` objects as redundant and try to fulfill `LOCK` placement
on the ohter container nodes.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
It allows keeping all the locked objects safe after metabase
resynchronization. Currently, all `LOCK` objects are broadcast to all nodes
in a container, it guarantees `LOCK` object presence in a regular situation.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
1. Remove a layer of indirection for mutex, `ClientCache` is already
used by pointer.
2. Fix duplication of a `AllowExternal` field.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
A container node is expected to have full "get" access to assemble the
object.
A non-container node is expected to forward any request to a container node.
Any token is expected to be issued for an original request sender not for a
node so any new request is invalid by design with that token.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Do not lose meta information of the original requests: cache session and
bearer tokens of the original request b/w a new generated ones. Middle
request wrappers should not contain any meta information, since it is
useless (e.g. ACL service checks only the original tokens).
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
After presenting request statuses on the API level, all the errors are
unwrapped before sending to the caller side. It led to a losing invalid
request's context.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
It was needed before we started to flush during transition to
`degraded` mode. Now it is confusing.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Do not use node's local storage if it is clear that an object will be
removed anyway as a redundant. It requires moving the changing local storage
logic from the validation step to the local target implementation.
It allows performing any relations checks (e.g. object locking) only if a
node is considered as a valid container member and is expected to store
(stored previously) all the helper objects (e.g. `LOCK`, `TOMBSTONE`, etc).
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
It is not an error: removing virtual object is expected and should be just
skipped. Getting a virtual object with `raw` flag is considered as an
impossible action, all the virtual objects removals will be handled via
their children's removals implicitly.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Currently there is a possibility for modifying operations to fail
because of I/O errors and a new tree to be created on another shard.
This commit adds existence check for modifying operations.
Read operations remain as they are, not to slow things.
`TreeDrop` is an exception, because this is a tree removal and trying
multiple shards is not an unwanted behaviour.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
By default writecache puts the whole object to update storage ID.
This logic comes from the times when we needed to put objects
in the metabase by the writecache itself. Now this is done by the
blobstor at unmarshaling objects during flush only to update storage ID
is an overkill.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
All logic errors are wrapped in `logicerr.Logical` type and do not
affect shard error counter.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
In previous implementation `Shard.Delete` logged writecache's removal
failures in `error` level. There is a need to decrease severity of these
log records since they aren't critical and don't require individual
review.
Change level of the message to `info`.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
From the `Bucket.ForEach` doc:
```
The provided function must not modify the bucket; this will result in undefined behavior.
```
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
After the refactor there are new storage characteristics: a type and
a general storage id (that could be stringified).
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
In previous implementation turning to maintenance mode using NeoFS CLI
required NeoFS API endpoint. This was not convenient from the user
perspective. It's worth to move networks settings' check to the server
side.
Add `force_maintenance` field to `SetNetmapStatusRequest.Body` message
of Control API. Add `force` flag to `neofs-cli control set-status`
command which sets corresponding field in the requests body if status is
`maintenance`. Force flag is ignored for any other status.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
Iterate over every shard and search for the container's trees. Final result
is a concatenation of shards' results. It is considered that one fixed tree
is placed on one fixed shard but the different trees of a fixed container
could be placed on different shards.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Add `SynchronizeAllTrees` method of the Tree service. It allows fetching
tree IDs and sync all of them. Share common logic b/w the new method and
the `SynchronizeTree`.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Add the node position in a container and the container size to the CID
descriptor that is passed to the `TreeApply`. Previously, `checkValid` does
not allow any log operarations.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
In the 2nd version, there was a database format change: buckets have changed
their keys, so it becomes impossible to check the version in the 1 -> 2+
migrations because of different buckets that store info about the version.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Storage node should not provide NeoFS Object API service when it is
under maintenance.
Declare `Common` service that unifies behavior of all object operations.
The implementation pre-checks if node is under maintenance and returns
`apistatus.NodeUnderMaintenance` if so. Use `Common` service as a first
logical processor in object service pipeline.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
In some scenarios original session can be unrelated to the objects which
are read internally by the node. For example, node requests child
objects when removing the parent one.
Tune internal NeoFS API client used by node's Object API server to
ignore unrelated sessions in `GetObject` / `HeadObject` / `PayloadRange`
ops.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
If shard ID is stored in metabase (it is not the first time boot), read it,
set it, use it (not a generated one) in the metrics writer.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Make it store its internal `zap.Logger`'s level. Also, make all the
components to accept internal `logger.Logger` instead of `zap.Logger`; it
will simplify future refactor.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Current spec allows denying GET_RANGE requests from other storage nodes.
However, GET should always be allowed and it is enough to perform
GET_RANGE locally
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
In previous implementation `ObjectService.Get` RPC handler failed with
`parent address in child object differs` while assembling the "big"
object. This was caused by the child check which required parent
reference to be set in all child objects. The check was impracticable
because not all elements of the split-chain have a link to the parent.
Make `execCtx.isChild` to return `true` if parameterized object has no
parent header in its own header.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
Node response with `NODE_UNDER_MAINTENANCE` status signals that the node
was switched to maintenance mode. There is a delay between the actual
switch and the reflection in the network map of up to one epoch. To
speed up the reaction to the maintenance, it is required to recognize
such node responses in the Policer.
Make `Policer.processNodes` to exclude elements with shortage decreasing
on `NODE_UNDER_MAINTENANCE` status response.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
Nodes under maintenance SHOULD not respond to object requests. Based on
this, storage node's Policer SHOULD consider such nodes as problem ones.
However, to prevent spam with the new replicas, on the contrary, Policer
should consider them normal.
Make `Policer.processNodes` to exclude elements if `IsMaintenance()`
with shortage decreasing.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
Make `replicator.TaskResult` to accept `netmap.NodeInfo` type instead of
uint64 in order to clarify the meaning and prevent passing the random
numbers.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
It doesn't make sense to check object relation in session check of
`ObjectService.Put` RPC which has been spawned by `ObjectService.Delete`
with session. Session issuer can't predict identifier of the tombstone
object to be created.
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
Currently, when removing shard special care must be taken with respect
to shard numbering. `mode: disabled` allows to leave shard configuration
in place while also ignoring it during initialization. This makes
disk replacement much more convenient.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
In previous implementation of `neofs-node` app object session was not
checked for substitution of the object related to it. Also, for access
checks, the session object was substituted instead of the one from the
request. This, on the one hand, made it possible to inherit the session
from the parent object for authorization for certain actions. On the
other hand, it covered the mentioned object substitution, which is a
critical vulnerability.
Next changes are applied to processing of all Object service requests:
- check if object session relates to the requested object
- use requested object in access checks.
Disclosed problem of object context inheritance will be solved within
Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>