Commit graph

108 commits

Author SHA1 Message Date
Evgenii Stratonikov
1779664644 [#2058] services/policer: Fix panic in shardPolicyWorker
```
2022/11/15 08:40:56 worker exits from a panic: runtime error: index out of range [0] with length 0
2022/11/15 08:40:56 worker exits from panic: goroutine 1188 [running]:
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:58 +0x10c
panic({0x1042b60, 0xc0015ae018})
	runtime/panic.go:1038 +0x215
github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).shardPolicyWorker.func1()
	github.com/nspcc-dev/neofs-node/pkg/services/policer/process.go:65 +0x366
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x68
```

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
634792077e [#1502] node: Store lock object on every container node
Includes extending listing methods in the Storage Engine with object types.
It allows tuning replication/policer algorithms: container nodes do
not remove `LOCK` objects as redundant and try to fulfill `LOCK` placement
on the ohter container nodes.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
846ff515e6 [#1812] policer: Do not remove copies if there are maintenance nodes
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-26 19:13:17 +03:00
Evgenii Stratonikov
0d65888005 [#1910] .golangci.yml: Add predeclared linker
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-18 15:08:26 +03:00
Evgenii Stratonikov
d772e35aba [#1910] .golangci.yml: Add godot linker
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-18 15:08:26 +03:00
8714fc42b5 [#1765] Use hex format to print storage node ID
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-10-13 12:55:21 +03:00
Pavel Karpy
f037022a7a [#1770] logger: Refactor Logger component
Make it store its internal `zap.Logger`'s level. Also, make all the
components to accept internal `logger.Logger` instead of `zap.Logger`; it
will simplify future refactor.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-12 18:11:05 +03:00
Leonard Lyubich
050ad2762c [#1680] replicator: Consider NODE_UNDER_MAINTENANCE as OK
Node response with `NODE_UNDER_MAINTENANCE` status signals that the node
was switched to maintenance mode. There is a delay between the actual
switch and the reflection in the network map of up to one epoch. To
speed up the reaction to the maintenance, it is required to recognize
such node responses in the Policer.

Make `Policer.processNodes` to exclude elements with shortage decreasing
on `NODE_UNDER_MAINTENANCE` status response.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Leonard Lyubich
e99e25b52f [#1680] replicator: Consider nodes under maintenance as OK
Nodes under maintenance SHOULD not respond to object requests. Based on
this, storage node's Policer SHOULD consider such nodes as problem ones.
However, to prevent spam with the new replicas, on the contrary, Policer
should consider them normal.

Make `Policer.processNodes` to exclude elements if `IsMaintenance()`
with shortage decreasing.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Leonard Lyubich
df5d7bf729 [#1680] replicator: Work with netmap.NodeInfo in TaskResult
Make `replicator.TaskResult` to accept `netmap.NodeInfo` type instead of
uint64 in order to clarify the meaning and prevent passing the random
numbers.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Leonard Lyubich
e6f8904040 [#1680] policer: Refactor tracking the processed nodes
Add clear methods with docs. Use the methods instead of direct map
and bool instructions.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Evgenii Stratonikov
5834f9807e [#1847] services/policer: Provide container ID in logs
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-07 09:58:16 +03:00
Evgenii Stratonikov
898689ec14 [#1731] services/replicator: Unify Task interface with other parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-09-24 13:47:48 +03:00
Pavel Karpy
51afcc1182 [#1461] engine, policer: Force remove objects w/o container
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Leonard Lyubich
fdf62e8562 [#1586] Upgrade NeoFS SDK Go to rc#5
Error checkers now support wrapped errors so there is no need to
explicitly unwrap errors in `Policer`.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-07 14:23:41 +03:00
Leonard Lyubich
c165d1a9b5 [#1556] Upgrade NeoFS SDK Go with changed container API
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-05 11:26:06 +03:00
Leonard Lyubich
b67974a8d3 [#xxx] Upgrade NeoFS SDK Go with changed container sessions
After recent changes in NeoFS SDK Go library session tokens aren't
embedded into `container.Container` and `eacl.Table` structures.

Group value, session token and signature in a structure for container
and eACL.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-06-22 16:38:57 +03:00
Leonard Lyubich
21d2f8f861 [#1513] Upgrade NeoFS SDK Go with changed netmap package
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-06-17 15:53:18 +03:00
Pavel Karpy
36f4929e52 [#1507] node: Do not handle object concurrently by the policer
Cache object that are being processed. That prevents concurrent
object handling when there is a few number of objects and object handling
takes more time that the policer needs for starting that object handling one
more time.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-15 20:43:32 +03:00
Pavel Karpy
256165045b [#1508] node: Do not replicate object twice
If placement contains two vectors with intersecting nodes it was possible to
send the object to the nodes twice.
Also optimizes requests: do not ask about storing the object twice from the
same node.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-15 20:33:04 +03:00
Evgenii Stratonikov
2ae7c94cd6 [#1462] *: Remove log.With invocations
`log.With` is suitable during initialization, but in other places it induces
some overhead, even when branches with logging are not taken.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-03 14:30:00 +03:00
Pavel Karpy
babd382ba5 [#1418] engine: Do not use pointers as parameters
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-03 07:35:17 +03:00
Leonard Lyubich
1c30414a6c [#1454] Upgrade NeoFS SDK Go module with new IDs
Core changes:
 * avoid package-colliding variable naming
 * avoid using pointers to IDs where unnecessary
 * avoid using `idSDK` import alias pattern
 * use `EncodeToString` for protocol string calculation and `String` for
  printing

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-06-01 17:41:45 +03:00
Leonard Lyubich
96cdc04705 [#1449] policer: Unwrap status HEAD response
Helper function `client.IsErrObjectNotFound` doesn't support error
unwrapping, so we need to do it on caller side.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-30 17:59:16 +03:00
Leonard Lyubich
e96ea4635c [#1449] policer: Fix selection of new storage candidates
`Policer` should pass list of selected candidates into `WithNodes`
method of `replicator.Task`. In previous implementation `processNodes`
method passed an opposite list: failed nodes and/or the local one.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-30 17:59:16 +03:00
Leonard Lyubich
f8ac4632f8 [#1335] policer: Prevent potential object loss
In previous implementation `Policer` considered local object copy as
redundant on processing single placement vector.

Make `Policer` to call redundant copy callback after full placement
processing. Also fix 404 error parsing.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-23 15:24:23 +03:00
Leonard Lyubich
f15e6e888f [#1377] oid, cid: Upgrade SDK package
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-16 15:33:22 +03:00
Leonard Lyubich
967650f2ed [#1247] container: Return ContainerNotFound status error
Replace `core/container.ErrNotFound` error returned by `Source.Get`
interface method with `apistatus.ContainerNotFound` status error. This
error is returned by storage node's server as NeoFS API statuses.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-03-17 16:34:00 +03:00
Evgenii Stratonikov
050a4bb2b0 [#1115] *: link TODOs to corresponding issues
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-02-11 12:58:59 +03:00
Pavel Karpy
1667ec9e6d [#1131] *: Adopt SDK changes
`object.Address` has been moved to `object/address`
`object.ID` has been moved to `object/id`

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-02-08 09:45:38 +03:00
Alex Vanin
5d46035ae8 [#1052] Tidy INFO logs
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-12-27 14:28:01 +03:00
Alex Vanin
bca7cf9470 [#1047] policer: Check context before job selection
When application is being terminated, replicator routine
might be on the object picking phase. Storage is terminated
asynchronously, thus `Select()` may return corresponding
error. If we don't process `context.Done()` in this case,
then application freezes on shutdown.

Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-12-20 14:42:57 +03:00
Alex Vanin
011d0f605b [#965] replicator: Make HandleTask function public
Continues replication executed in separate pool of goroutines,
so there is no need in worker to handle replication tasks
asynchronously.

Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-11-26 15:39:38 +03:00
Alex Vanin
a74a402a7d [#965] policer: Implement continuous replication
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-11-26 15:39:38 +03:00
Evgenii Stratonikov
95893927aa *: replace neofs-api-go with neofs-sdk-go
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2021-11-12 17:29:09 +03:00
Evgenii Stratonikov
7cb3d0cb4a [#885] policer: remove objects for removed container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2021-10-18 12:14:14 +03:00
Evgenii Stratonikov
b8ba677c85 [#882] policer: add CID to the error message
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2021-10-08 08:21:01 +03:00
Leonard Lyubich
e473f3ac91 [#645] *: Use helper functions to build client.NodeInfo structures
Helper functions from core/client package allow to set public keys of
storage nodes.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-30 20:57:00 +03:00
Leonard Lyubich
73fb1a886c [#849] policer: Write message about redundant local object copy
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-27 11:27:41 +03:00
Leonard Lyubich
d613a856ce [#849] policer: Log object address in processNodes method
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-27 11:27:41 +03:00
Leonard Lyubich
358e3ed8c4 [#645] *: Change the locality condition of the node from the placement
Some software components regulate the way of working with placement arrays
when a local node enters it. In the previous implementation, the locality
criterion was the correspondence between the announced network address
(group) and the address with which the node was configured. However, by
design, network addresses are not unique identifiers of storage nodes in the
system.

Change comparisons by network addresses to comparisons by keys in all
packages with the logic described above. Implement `netmap.AnnouncedKeys`
interface on `cfg` type in the storage node application.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-07 09:53:18 +03:00
Leonard Lyubich
abfcc7498c [#715] services/policer: Select pseudo-random list of objects to check
In previous implementation of Policer's job queue the same list of objects for
processing was selected at each iteration. This was caused by consistent
return of `engine.List` function.

Use `rand.Shuffle` function to compose pseudo-random list of all objects in
order to approximately evenly distribute objects to work.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-08-25 14:40:12 +03:00
Leonard Lyubich
8ac3c62518 [#607] object/head: Make client constructor to work with group address
Make Object Head service to work with `AddressGroup` instead of `Address`
in order to support multiple addresses of the storage node.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-06-28 15:52:50 +03:00
Leonard Lyubich
6e5d7f84af [#607] network: Generalize LocalAddressSource to address group
Make `LocalAddressSource.LocalAddress` method to return `AddressGroup`. Make
`IsLocalAddress` function to accept parameter of type `AddressGroup`. Adopt
the application code with temporary `GroupFromAddress` helper.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-06-28 15:52:50 +03:00
Leonard Lyubich
adbbad0beb [#607] network: Do not work with Address pointers
`network.Address` structure in most cases created once and used read-only.

Replace `AddressFromString` function with `Address.FromString` method with
the same purpose and implementation. Make all libraries to work with value.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-06-18 18:09:50 +03:00
Leonard Lyubich
95ccbbc2f9 [#607] network: Accept value instead of pointer in IsLocalAddress
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-06-18 18:09:50 +03:00
Leonard Lyubich
5900975d58 [#217] object/policer: Leave readability instead of performance comment
Right now we pass redundant copy to callback outside the for-loop through
the helpful boolean variable instead of calling it deeply nested.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-02-24 20:59:14 +03:00
Leonard Lyubich
277e3ca20a [#217] policer: Handler redundant local copy of the object
Detect redundant local copy of the object in Object Policer. Add redundant
copy callback (`WithRedundantCopyCallback` option). Pass address of the
redundant copy to callback.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-02-24 20:59:14 +03:00
Alex Vanin
bbe700fa37 [#250] service/policer: Don't shrink node list at unknown error
Every unknown error must not decrease shortage counter and must not
exclude faulty node from the node list, because this list will be used
later for replication.

Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2020-12-14 21:49:50 +03:00
Alex Vanin
e0350efe00 [#231] services/policer: Use engine.List method
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2020-12-11 17:19:37 +03:00
Leonard Lyubich
aa9eb2eaf2 [#186] policer: Use new storage engine for work
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2020-12-11 17:19:37 +03:00
Alex Vanin
a14bb6292b [#182] Reuse search filter in policer
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2020-11-19 17:59:46 +03:00
Alex Vanin
2e605b2435 [#182] Limit policer object filter to physical stored objects
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2020-11-19 17:59:46 +03:00
Leonard Lyubich
3de8febe57 [#174] Update to latest neofs-api-go changes
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2020-11-17 11:56:00 +03:00
Alex Vanin
65be09d3db [#155] Update neofs-api-go with refactored pkg/netmap
Refactored pkg/netmap package provides JSON converters for
NodeInfo and PlacementPolicy structures, that has been used
by client applications.

It also updates Node structure itself so it is a part of
grpc <-> v2 <-> pkg conversion chain.

Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2020-11-06 09:55:05 +03:00
Leonard Lyubich
c0aa892161 [#136] localstorage: Make local storage to use new metabase
Replace meta Bucket with meta.DB instance in local storage implementation.
Adopt all dependent components to new local storage.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2020-11-03 18:42:32 +03:00
Leonard Lyubich
f66c7958e7 [#109] services/policer: Assign tasks to Replicator
Make Policer to call AddTask method of Replicator when an insufficient
number of copies of an object is detected in the container.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2020-10-23 15:23:22 +03:00
Leonard Lyubich
0dab4b7581 [#108] services: Implement Policer service
Implement Policer service that performs background work to check compliance
with the placement policy for local objects in the container. In the initial
implementation, the selection of the working queue of objects is
simplified, and there is no transfer of the result to the replicator.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2020-10-21 14:42:51 +03:00