64 commits

Author SHA1 Message Date
82f84662e5 [#6] services/policer: Reduce the amount of indirect pointers
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-03-01 15:29:54 +03:00
cb016d53a6 [#1] Fix comments and error messages
Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>
2023-02-06 17:41:14 +03:00
8f61cc1dcc [#5] policer: Use generic LRU client
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
b207dc424f [#2158] policer: Reduce default cache size
We use the cache to avoid policing the same object multiple times in a short
time span (< 30 seconds). If we have 200_000 objects in a blobstor, it is a bit useless
-- if it takes 1 second to process an object and we have `replicator.pool_size: 20`
in the config, the next iteration will happen in 10_000 seconds, which is much
longer than 30 seconds. However, we still consume a lot of memory, so it
makes sense to use a saner default.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
923f84722a Move to frostfs-node
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-28 15:04:29 +03:00
Evgenii Stratonikov
660c38d07e [#2062] services/policer: Use a proper key for object cache
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
1779664644 [#2058] services/policer: Fix panic in shardPolicyWorker
```
2022/11/15 08:40:56 worker exits from a panic: runtime error: index out of range [0] with length 0
2022/11/15 08:40:56 worker exits from panic: goroutine 1188 [running]:
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:58 +0x10c
panic({0x1042b60, 0xc0015ae018})
	runtime/panic.go:1038 +0x215
github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).shardPolicyWorker.func1()
	github.com/nspcc-dev/neofs-node/pkg/services/policer/process.go:65 +0x366
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x68
```

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
634792077e [#1502] node: Store lock object on every container node
Includes extending listing methods in the Storage Engine with object types.
It allows tuning replication/policer algorithms: container nodes do
not remove `LOCK` objects as redundant and try to fulfill `LOCK` placement
on the other container nodes.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
846ff515e6 [#1812] policer: Do not remove copies if there are maintenance nodes
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-26 19:13:17 +03:00
Evgenii Stratonikov
0d65888005 [#1910] .golangci.yml: Add predeclared linter
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-18 15:08:26 +03:00
Evgenii Stratonikov
d772e35aba [#1910] .golangci.yml: Add godot linter
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-18 15:08:26 +03:00
8714fc42b5 [#1765] Use hex format to print storage node ID
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-10-13 12:55:21 +03:00
Pavel Karpy
f037022a7a [#1770] logger: Refactor Logger component
Make it store its internal `zap.Logger`'s level. Also, make all the
components accept the internal `logger.Logger` instead of `zap.Logger`; it
will simplify future refactoring.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-12 18:11:05 +03:00
Leonard Lyubich
050ad2762c [#1680] replicator: Consider NODE_UNDER_MAINTENANCE as OK
Node response with `NODE_UNDER_MAINTENANCE` status signals that the node
was switched to maintenance mode. There is a delay between the actual
switch and the reflection in the network map of up to one epoch. To
speed up the reaction to the maintenance, it is required to recognize
such node responses in the Policer.

Make `Policer.processNodes` exclude such elements and decrease the
shortage on a `NODE_UNDER_MAINTENANCE` status response.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Leonard Lyubich
e99e25b52f [#1680] replicator: Consider nodes under maintenance as OK
Nodes under maintenance SHOULD NOT respond to object requests. Based on
this, the storage node's Policer SHOULD treat such nodes as problematic.
However, to avoid flooding the network with new replicas, the Policer
should instead treat them as normal.

Make `Policer.processNodes` exclude elements for which `IsMaintenance()`
is true and decrease the shortage.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
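The shortage rule described in e99e25b52f and 050ad2762c can be sketched as follows. This is a minimal illustration, assuming a hypothetical `node` type and a `remainingShortage` helper; the real `Policer.processNodes` is not shown on this page and may differ.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a placement vector element.
type node struct {
	name        string
	maintenance bool // node is (or reports being) under maintenance
	hasObject   bool // node confirmed that it stores the object
}

// remainingShortage returns how many copies are still missing after
// walking the nodes. A node under maintenance is counted as if it held
// a copy, so no new replica is scheduled because of it.
func remainingShortage(required int, nodes []node) int {
	shortage := required
	for _, n := range nodes {
		if shortage == 0 {
			break
		}
		if n.maintenance || n.hasObject {
			shortage--
		}
	}
	return shortage
}

func main() {
	nodes := []node{
		{name: "n1", hasObject: true},
		{name: "n2", maintenance: true},
		{name: "n3"},
	}
	fmt.Println("copies still missing:", remainingShortage(3, nodes))
}
```

Counting maintenance nodes as satisfied trades a possibly stale view for fewer redundant replicas, which is the motivation the commit messages give.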
Leonard Lyubich
df5d7bf729 [#1680] replicator: Work with netmap.NodeInfo in TaskResult
Make `replicator.TaskResult` accept the `netmap.NodeInfo` type instead of
uint64 in order to clarify the meaning and prevent passing arbitrary
numbers.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Leonard Lyubich
e6f8904040 [#1680] policer: Refactor tracking the processed nodes
Add clear methods with docs. Use the methods instead of direct map
and bool operations.

Signed-off-by: Leonard Lyubich <ctulhurider@gmail.com>
2022-10-11 12:54:27 +03:00
Evgenii Stratonikov
5834f9807e [#1847] services/policer: Provide container ID in logs
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-07 09:58:16 +03:00
Evgenii Stratonikov
898689ec14 [#1731] services/replicator: Unify Task interface with other parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-09-24 13:47:48 +03:00
Pavel Karpy
51afcc1182 [#1461] engine, policer: Force remove objects w/o container
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Leonard Lyubich
fdf62e8562 [#1586] Upgrade NeoFS SDK Go to rc#5
Error checkers now support wrapped errors so there is no need to
explicitly unwrap errors in `Policer`.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-07 14:23:41 +03:00
Leonard Lyubich
c165d1a9b5 [#1556] Upgrade NeoFS SDK Go with changed container API
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-07-05 11:26:06 +03:00
Leonard Lyubich
b67974a8d3 [#xxx] Upgrade NeoFS SDK Go with changed container sessions
After recent changes in NeoFS SDK Go library session tokens aren't
embedded into `container.Container` and `eacl.Table` structures.

Group value, session token and signature in a structure for container
and eACL.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-06-22 16:38:57 +03:00
Leonard Lyubich
21d2f8f861 [#1513] Upgrade NeoFS SDK Go with changed netmap package
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-06-17 15:53:18 +03:00
Pavel Karpy
36f4929e52 [#1507] node: Do not handle object concurrently by the policer
Cache objects that are being processed. This prevents concurrent
handling of the same object when there are few objects and handling
one takes longer than the policer needs to pick that object up
again.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-15 20:43:32 +03:00
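The idea in 36f4929e52 can be illustrated with a mutex-guarded set of in-flight addresses. This is a simplified sketch under assumptions: the type `inFlight` and its methods are hypothetical, and a plain set is used here instead of whatever cache the policer actually uses.

```go
package main

import (
	"fmt"
	"sync"
)

// inFlight tracks addresses that are currently being processed.
// Hypothetical type for illustration only.
type inFlight struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newInFlight() *inFlight {
	return &inFlight{seen: make(map[string]struct{})}
}

// tryAcquire reports whether the caller may process addr. It returns
// false if another worker has already taken the same address.
func (f *inFlight) tryAcquire(addr string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if _, busy := f.seen[addr]; busy {
		return false
	}
	f.seen[addr] = struct{}{}
	return true
}

// release marks addr as no longer being processed.
func (f *inFlight) release(addr string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	delete(f.seen, addr)
}

func main() {
	f := newInFlight()
	fmt.Println(f.tryAcquire("cid/oid")) // first worker takes the object
	fmt.Println(f.tryAcquire("cid/oid")) // second worker must skip it
	f.release("cid/oid")
	fmt.Println(f.tryAcquire("cid/oid")) // available again after release
}
```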
Pavel Karpy
256165045b [#1508] node: Do not replicate object twice
If the placement contained two vectors with intersecting nodes, it was
possible to send the object to the same nodes twice.
Also optimize requests: do not ask the same node twice whether it stores
the object.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-15 20:33:04 +03:00
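One way to avoid visiting a node twice across intersecting placement vectors, as 256165045b describes, is to keep a set of already checked nodes for the whole placement. The sketch below assumes nodes are identified by plain strings and uses a hypothetical `uniqueCandidates` helper; it is not the policer's actual code.

```go
package main

import "fmt"

// uniqueCandidates walks placement vectors in order and returns every
// node exactly once, at the position of its first appearance.
// Hypothetical helper for illustration.
func uniqueCandidates(vectors [][]string) []string {
	checked := make(map[string]struct{})
	var out []string
	for _, vec := range vectors {
		for _, node := range vec {
			if _, ok := checked[node]; ok {
				continue // already asked or replicated to via an earlier vector
			}
			checked[node] = struct{}{}
			out = append(out, node)
		}
	}
	return out
}

func main() {
	// Two vectors that intersect in node "b".
	vectors := [][]string{{"a", "b"}, {"b", "c"}}
	fmt.Println(uniqueCandidates(vectors))
}
```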
Evgenii Stratonikov
2ae7c94cd6 [#1462] *: Remove log.With invocations
`log.With` is suitable during initialization, but in other places it induces
some overhead, even when branches with logging are not taken.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-06-03 14:30:00 +03:00
Pavel Karpy
babd382ba5 [#1418] engine: Do not use pointers as parameters
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-06-03 07:35:17 +03:00
Leonard Lyubich
1c30414a6c [#1454] Upgrade NeoFS SDK Go module with new IDs
Core changes:
 * avoid package-colliding variable naming
 * avoid using pointers to IDs where unnecessary
 * avoid using `idSDK` import alias pattern
 * use `EncodeToString` for protocol string calculation and `String` for
  printing

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-06-01 17:41:45 +03:00
Leonard Lyubich
96cdc04705 [#1449] policer: Unwrap status HEAD response
The helper function `client.IsErrObjectNotFound` doesn't support error
unwrapping, so we need to do it on the caller side.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-30 17:59:16 +03:00
Leonard Lyubich
e96ea4635c [#1449] policer: Fix selection of new storage candidates
`Policer` should pass the list of selected candidates into the `WithNodes`
method of `replicator.Task`. In the previous implementation, the `processNodes`
method passed the opposite list: failed nodes and/or the local one.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-30 17:59:16 +03:00
Leonard Lyubich
f8ac4632f8 [#1335] policer: Prevent potential object loss
In the previous implementation, `Policer` considered the local object copy
redundant after processing a single placement vector.

Make `Policer` call the redundant copy callback after full placement
processing. Also fix 404 error parsing.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-23 15:24:23 +03:00
Leonard Lyubich
f15e6e888f [#1377] oid, cid: Upgrade SDK package
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-05-16 15:33:22 +03:00
Leonard Lyubich
967650f2ed [#1247] container: Return ContainerNotFound status error
Replace the `core/container.ErrNotFound` error returned by the `Source.Get`
interface method with the `apistatus.ContainerNotFound` status error. This
error is returned by the storage node's server as a NeoFS API status.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2022-03-17 16:34:00 +03:00
Evgenii Stratonikov
050a4bb2b0 [#1115] *: link TODOs to corresponding issues
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-02-11 12:58:59 +03:00
Pavel Karpy
1667ec9e6d [#1131] *: Adopt SDK changes
`object.Address` has been moved to `object/address`
`object.ID` has been moved to `object/id`

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-02-08 09:45:38 +03:00
Alex Vanin
5d46035ae8 [#1052] Tidy INFO logs
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-12-27 14:28:01 +03:00
Alex Vanin
bca7cf9470 [#1047] policer: Check context before job selection
When the application is being terminated, the replicator routine
might be in the object picking phase. Storage is terminated
asynchronously, thus `Select()` may return a corresponding
error. If we don't process `context.Done()` in this case,
the application freezes on shutdown.

Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-12-20 14:42:57 +03:00
Alex Vanin
011d0f605b [#965] replicator: Make HandleTask function public
Continuous replication is executed in a separate pool of goroutines,
so the worker does not need to handle replication tasks
asynchronously.

Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-11-26 15:39:38 +03:00
Alex Vanin
a74a402a7d [#965] policer: Implement continuous replication
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-11-26 15:39:38 +03:00
Evgenii Stratonikov
95893927aa *: replace neofs-api-go with neofs-sdk-go
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2021-11-12 17:29:09 +03:00
Evgenii Stratonikov
7cb3d0cb4a [#885] policer: remove objects for removed container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2021-10-18 12:14:14 +03:00
Evgenii Stratonikov
b8ba677c85 [#882] policer: add CID to the error message
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2021-10-08 08:21:01 +03:00
Leonard Lyubich
e473f3ac91 [#645] *: Use helper functions to build client.NodeInfo structures
Helper functions from the core/client package allow setting the public keys
of storage nodes.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-30 20:57:00 +03:00
Leonard Lyubich
73fb1a886c [#849] policer: Write message about redundant local object copy
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-27 11:27:41 +03:00
Leonard Lyubich
d613a856ce [#849] policer: Log object address in processNodes method
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-27 11:27:41 +03:00
Leonard Lyubich
358e3ed8c4 [#645] *: Change the locality condition of the node from the placement
Some software components regulate the way of working with placement arrays
when a local node enters it. In the previous implementation, the locality
criterion was the correspondence between the announced network address
(group) and the address with which the node was configured. However, by
design, network addresses are not unique identifiers of storage nodes in the
system.

Change comparisons by network addresses to comparisons by keys in all
packages with the logic described above. Implement `netmap.AnnouncedKeys`
interface on `cfg` type in the storage node application.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-09-07 09:53:18 +03:00
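The locality change in 358e3ed8c4 replaces address comparison with key comparison. A minimal sketch of that idea, assuming a hypothetical `nodeInfo` type and `isLocal` helper (the real `netmap.AnnouncedKeys` interface is not shown on this page and is not reproduced here):

```go
package main

import (
	"bytes"
	"fmt"
)

// nodeInfo is a hypothetical stand-in for a placement element.
type nodeInfo struct {
	publicKey []byte
	addresses []string // announced network addresses; not unique identifiers
}

// isLocal decides locality by public key only; addresses are ignored
// because several nodes may announce overlapping or changing addresses.
func isLocal(localKey []byte, n nodeInfo) bool {
	return bytes.Equal(localKey, n.publicKey)
}

func main() {
	local := []byte{0x02, 0xaa}
	sameAddrOtherKey := nodeInfo{publicKey: []byte{0x03, 0xbb}, addresses: []string{"/dns4/node/tcp/8080"}}
	otherAddrSameKey := nodeInfo{publicKey: []byte{0x02, 0xaa}, addresses: []string{"/dns4/alias/tcp/8080"}}

	fmt.Println(isLocal(local, sameAddrOtherKey))
	fmt.Println(isLocal(local, otherAddrSameKey))
}
```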
Leonard Lyubich
abfcc7498c [#715] services/policer: Select pseudo-random list of objects to check
In the previous implementation of the Policer's job queue, the same list of objects
was selected for processing at each iteration. This was caused by the consistent
output of the `engine.List` function.

Use the `rand.Shuffle` function to compose a pseudo-random list of all objects in
order to distribute the work over objects approximately evenly.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-08-25 14:40:12 +03:00
Leonard Lyubich
8ac3c62518 [#607] object/head: Make client constructor to work with group address
Make the Object Head service work with `AddressGroup` instead of `Address`
in order to support multiple addresses of the storage node.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-06-28 15:52:50 +03:00
Leonard Lyubich
6e5d7f84af [#607] network: Generalize LocalAddressSource to address group
Make the `LocalAddressSource.LocalAddress` method return `AddressGroup`. Make
the `IsLocalAddress` function accept a parameter of type `AddressGroup`. Adapt
the application code with a temporary `GroupFromAddress` helper.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-06-28 15:52:50 +03:00