Commit graph

749 commits

Author SHA1 Message Date
Pavel Karpy
306609030a [#2159] node: Add tree replication timeout configuration
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
3bb5a320d7 [#2154] services/tree: Do not log an error when synchronizing container of 1 node
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
387d1e2977 [#2127] services/tree: Randomize node order for synchronization
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
b207dc424f [#2158] policer: Reduce default cache size
We use cache to avoid policing the same object multiple times in a short
time span (< 30 seconds). If we have 200_000 objects in a blobstor, it is a bit useless
-- if it takes 1 second to process an object and we have `replicator.pool_size: 20`
in config, the next iteration will happen in 10_000 second which is much
larger than 30 second. However we still consume a lot of memory, so it
makes sense to use saner default.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
b413094704 [#2095] node: Fix collecting child objects
Stop child objects collection if the last returned object (the most "left"
object in the collected chain) starts exactly from the `GETRANGE`'s `from`
value.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-12-30 11:07:35 +03:00
Pavel Karpy
350eecfa13 [#2095] node: Do not allow GETRANGE requests with zero length
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-12-30 11:07:35 +03:00
Pavel Karpy
923f84722a Move to frostfs-node
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-28 15:04:29 +03:00
Pavel Karpy
d54022eacc [#2047] node: Do not send chunk twice on request forwarding
That could happen if a node forwards request to a node that closed the
connection during the original object stream.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-12-02 11:27:48 +03:00
Evgenii Stratonikov
bd25db5d4a [#1984] metrics: Use separate metrics for success/failed requests
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-01 14:18:10 +03:00
Evgenii Stratonikov
e21c472dc7 [#1984] services/object: Increase put_req_count after the request is processed
As it is specified in metrics description.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-01 14:18:10 +03:00
Pavel Karpy
51963abce7 [#1972] node: Fix errors comments in the Put service
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-30 16:58:52 +03:00
Evgenii Stratonikov
660c38d07e [#2062] services/policer: Use a proper key for object cache
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
1779664644 [#2058] services/policer: Fix panic in shardPolicyWorker
```
2022/11/15 08:40:56 worker exits from a panic: runtime error: index out of range [0] with length 0
2022/11/15 08:40:56 worker exits from panic: goroutine 1188 [running]:
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:58 +0x10c
panic({0x1042b60, 0xc0015ae018})
	runtime/panic.go:1038 +0x215
github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).shardPolicyWorker.func1()
	github.com/nspcc-dev/neofs-node/pkg/services/policer/process.go:65 +0x366
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x68
```

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
9a20498f34 [#1940] Removing all trees by container ID if tree ID is empty in pilorama.Forest.TreeDrop
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-11-19 11:01:04 +03:00
Pavel Karpy
634792077e [#1502] node: Store lock object on every container node
Includes extending listing methods in the Storage Engine with object types.
It allows tuning replication/policer algorithms: container nodes do
not remove `LOCK` objects as redundant and try to fulfill `LOCK` placement
on the ohter container nodes.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
d5a14041e0 [#2040] node: Do not attach tokens in the assembly process
A container node is expected to have full "get" access to assemble the
object.
A non-container node is expected to forward any request to a container node.
Any token is expected to be issued for an original request sender not for a
node so any new request is invalid by design with that token.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
fd61bdadcb [#2040] node: Attach original meta to the spawned requests
Do not lose meta information of the original requests: cache session and
bearer tokens of the original request b/w a new generated ones. Middle
request wrappers should not contain any meta information, since it is
useless (e.g. ACL service checks only the original tokens).

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
481b48b942 [#2028] node: Check session token's NBF and IAT
ACL service did not check "Not Valid Before" and "Issued At" claims.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
aadd2ad050 [#2028] node: Do not wrap malformed request errors
After presenting request statuses on the API level, all the errors are
unwrapped before sending to the caller side. It led to a losing invalid
request's context.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
2522d924b9 [#2037] services/object: Fix concurrent map writes in traverser
```
fatal error: concurrent map writes

goroutine 4337 [running]:
github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*traversal).submitProcessed(...)
        github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:78
github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).iteratePlacement.func1()
        github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:198 +0x265
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
        github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
        github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x65
```

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
aa478f1def [#2024] services/object: Unify status errors
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
3875fef542 [#2024] services/object: Cover corner cases for children OutOfRange
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
aab398f4f5 [#1972] node: Do not save objects if node not in a container
Do not use node's local storage if it is clear that an object will be
removed anyway as a redundant. It requires moving the changing local storage
logic from the validation step to the local target implementation.
It allows performing any relations checks (e.g. object locking) only if a
node is considered as a valid container member and is expected to store
(stored previously) all the helper objects (e.g. `LOCK`, `TOMBSTONE`, etc).

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
a455ec18c3 [#2007] services/object: Allocate memory on-demand in GET_RANGE
For big objects we want to get OutOfRange error before all the memory is
allocated.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
ff5526038d [#2007] services/object: Fix comment
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
846ff515e6 [#1812] policer: Do not remove copies if there are maintenance nodes
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-26 19:13:17 +03:00
Leonard Lyubich
7b418c36b4 [#1930] services/session: Log calling Create RPC
There is a need to check if session is opened during system
testing/debug.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-24 17:45:22 +03:00
Leonard Lyubich
60e9de8d63 [#1916] control: Check maintenance allowance on Control server
In previous implementation turning to maintenance mode using NeoFS CLI
required NeoFS API endpoint. This was not convenient from the user
perspective. It's worth to move networks settings' check to the server
side.

Add `force_maintenance` field to `SetNetmapStatusRequest.Body` message
of Control API. Add `force` flag to `neofs-cli control set-status`
command which sets corresponding field in the requests body if status is
`maintenance`. Force flag is ignored for any other status.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-24 09:20:24 +04:00
Pavel Karpy
e1be0180f6 [#1329] tree: Sync trees when a node first time appears in a container
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-21 18:47:56 +03:00
Pavel Karpy
1766ca2039 [#1902] tree: Allow synchronize all the container trees
Add `SynchronizeAllTrees` method of the Tree service. It allows fetching
tree IDs and sync all of them. Share common logic b/w the new method and
the `SynchronizeTree`.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-20 16:17:57 +03:00
Pavel Karpy
6d4beea187 [#1902] tree: Extend grpc service with ListTrees method
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-20 16:17:57 +03:00
Pavel Karpy
3aa9938b8f [#1902] Update protoc to v3.21.7
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-20 16:17:57 +03:00
Evgenii Stratonikov
9ec01bb9c1 [#1931] control: Allow to clear errors in SetShardMode RPC
It hasn't been working since the initial implementation 7fb15fa1d0.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-20 15:51:31 +03:00
Evgenii Stratonikov
0d65888005 [#1910] .golangci.yml: Add predeclared linker
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-18 15:08:26 +03:00
Evgenii Stratonikov
1cb892c579 [#1910] .golangci.yml: Add misspell linker
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-18 15:08:26 +03:00
Evgenii Stratonikov
d772e35aba [#1910] .golangci.yml: Add godot linker
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-18 15:08:26 +03:00
Pavel Karpy
bc67f77d86 [#1329] tree: Fix sync applying
Add the node position in a container and the container size to the CID
descriptor that is passed to the `TreeApply`. Previously, `checkValid` does
not allow any log operarations.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-17 20:53:34 +03:00
Pavel Karpy
f5f560d903 [#1329] tree: Do not sync trees that are from external containers
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-17 20:53:34 +03:00
Leonard Lyubich
1406d096a2 [#1680] service/object: Fail all operations in maintenance mode
Storage node should not provide NeoFS Object API service when it is
under maintenance.

Declare `Common` service that unifies behavior of all object operations.
The implementation pre-checks if node is under maintenance and returns
`apistatus.NodeUnderMaintenance` if so. Use `Common` service as a first
logical processor in object service pipeline.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-14 13:54:32 +04:00
Leonard Lyubich
05420173cc [#1894] services/object: Ignore unrelated sessions in client
In some scenarios original session can be unrelated to the objects which
are read internally by the node. For example, node requests child
objects when removing the parent one.

Tune internal NeoFS API client used by node's Object API server to
ignore unrelated sessions in `GetObject` / `HeadObject` / `PayloadRange`
ops.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-14 12:37:31 +03:00
Pavel Karpy
13c4a9f4b8 [#1332] tree: Make SignMessage public
It will allow reusing signing routine in other components
(e.g. `neofs-cli`).

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-13 20:01:48 +03:00
8714fc42b5 [#1765] Use hex format to print storage node ID
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-10-13 12:55:21 +03:00
Pavel Karpy
f037022a7a [#1770] logger: Refactor Logger component
Make it store its internal `zap.Logger`'s level. Also, make all the
components to accept internal `logger.Logger` instead of `zap.Logger`; it
will simplify future refactor.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-12 18:11:05 +03:00
Evgenii Stratonikov
4baf00aa21 [#1884] services/object: Fallback to GET in GET_RANGE
Current spec allows denying GET_RANGE requests from other storage nodes.
However, GET should always be allowed and it is enough to perform
GET_RANGE locally

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-12 17:05:51 +03:00
Leonard Lyubich
dde4d4df2a [#1878] services/object: Fix child check in GET
In previous implementation `ObjectService.Get` RPC handler failed with
`parent address in child object differs` while assembling the "big"
object. This was caused by the child check which required parent
reference to be set in all child objects. The check was impracticable
because not all elements of the split-chain have a link to the parent.

Make `execCtx.isChild` to return `true` if parameterized object has no
parent header in its own header.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-12 16:56:37 +03:00
Evgenii Stratonikov
49eab6318c [#1867] control: Fix degraded-read-only mode parsing
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-12 11:20:48 +03:00
Evgenii Stratonikov
c0199dee93 [#1867] services/control: Interpret empty list of IDs as all shards
In neofs-cli the flag is still required, but `all` can be used to
process all shards.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-12 11:20:48 +03:00
Evgenii Stratonikov
19c0a74e94 [#1867] services/control: Allow to provide multiple shard IDs to some commands
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-12 11:20:48 +03:00
Leonard Lyubich
050ad2762c [#1680] replicator: Consider NODE_UNDER_MAINTENANCE as OK
Node response with `NODE_UNDER_MAINTENANCE` status signals that the node
was switched to maintenance mode. There is a delay between the actual
switch and the reflection in the network map of up to one epoch. To
speed up the reaction to the maintenance, it is required to recognize
such node responses in the Policer.

Make `Policer.processNodes` to exclude elements with shortage decreasing
on `NODE_UNDER_MAINTENANCE` status response.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Leonard Lyubich
e99e25b52f [#1680] replicator: Consider nodes under maintenance as OK
Nodes under maintenance SHOULD not respond to object requests. Based on
this, storage node's Policer SHOULD consider such nodes as problem ones.
However, to prevent spam with the new replicas, on the contrary, Policer
should consider them normal.

Make `Policer.processNodes` to exclude elements if `IsMaintenance()`
with shortage decreasing.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00