Commit Graph

777 Commits (79ba34714ab83b76bdfe1946b7ac3395bdd4296f)

Author SHA1 Message Date
Dmitrii Stepanov 6925fb4c59 [TrueCloudLab/hrw#2] node: Use typed HRW methods
Update HRW lib and use typed HRW methods to sort shards and nodes

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-02-28 13:36:25 +03:00
Alejandro Lopez 73bb590cb1 [#64] node: Use pool_size_local and separate pool for local puts
Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>
2023-02-22 13:43:19 +03:00
Alejandro Lopez cb5468abb8 [#66] node: Replace interface{} with any
Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>
2023-02-21 16:47:07 +03:00
Anton Nikiforov 22f3c7d080 [#1868] Reload config for pprof and metrics on SIGHUP
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-02-20 13:53:27 +03:00
Evgenii Stratonikov 2567f8020e [#2260] services/object: Do not assemble object with TTL=1
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
Pavel Karpy 901d62567d [#57] node: Broadcast link objects
It boosts object assembling by an _average_ container node.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-17 11:58:27 +03:00
Stanislav Bogatyrev cb016d53a6 [#1] Fix comments and error messages
Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>
2023-02-06 17:41:14 +03:00
Stanislav Bogatyrev c761a95eef [#1] Fix project name in control service
Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>
2023-02-06 17:41:14 +03:00
Evgenii Stratonikov d65a95a2c6 [#28] pilorama: Remove `LogMove` struct
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov 46c62be7e8 [#28] Fix linter issues
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov 67c97c6804 [#2210] services/tree: Drop messages not in queue
Currently, under high load clients are blocked on channel send
and the number of goroutines can increase indefinitely.
In this commit we drop replication messages if send/recv queue is full
and rely on a background synchronization.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov 0d8366f475 [#2207] object/acl: Return status error for expired session token
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov 6efa93be0a [#1621] services/tree: Return `Apply` result asyncronously
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov 554b85411f [#2190] services/object: Log service error with INFO level
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov 3d1d2ee7b1 [#11] Regenerate proto files
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-12 08:55:47 +03:00
Evgenii Stratonikov f0be0befc5 [#5] services/object_manager: Use generic LRU cache
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
Evgenii Stratonikov 1b3374ac7f [#5] services/tree: User generic LRU cache
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
Evgenii Stratonikov 8f61cc1dcc [#5] policer: Use generic LRU client
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
Evgenii Stratonikov 6f5edac730 [#2164] network/cache: Do not reconnect to failed clients immediately
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov b4e90cdf51 [#2165] pilorama: Optimize `TreeApply` when used for synchronization
Because synchronization _most likely_ will have apply already existing
operations, it is much faster to check their presence in a read
transaction. However, always doing this will degrade the perfomance
for normal `Apply`. And, let's be honest, it is already not good.
Thus we add a separate parameter which specifies whether this logic is
enabled.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov f9fcd85363 [#2165] services/tree: Remember starting height for the synchronization
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov 06137dbf8e [#2165] services/tree: Do not export `synchronizeAllTrees`
It is used only privately.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov c299b98afe [#2165] services/tree: Parallelize synchronization
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov de9957e076 [#2165] services/tree: Always synchronize all containers
In case of split-brain we must synchronize everything.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy 6a4e5e6f0a [#2144] node: Try node's private key if dynamic token fetching failed
`GETRANGEHASH` request spawns `GETRANGE` requests if an object could not be
found locally. If the original request contains session, it can be static
and, therefore, fetching session key can not be performed successfully.
As the best effort a node could request object's range with its own key.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy 86a4fba571 [#2144] node: Clarify `KeyStorage.GetKey` method
Actualize the doc, fix API status error return.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov 04b5ec759b [#2139] object/put: Use `sync.Pool` for temporary payloads
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov 9e0decd12d [#2162] services/tree: Close connection after the syncronization
There was a goroutine leak here.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy 306609030a [#2159] node: Add tree replication timeout configuration
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov 3bb5a320d7 [#2154] services/tree: Do not log an error when synchronizing container of 1 node
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov 387d1e2977 [#2127] services/tree: Randomize node order for synchronization
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov b207dc424f [#2158] policer: Reduce default cache size
We use cache to avoid policing the same object multiple times in a short
time span (< 30 seconds). If we have 200_000 objects in a blobstor, it is a bit useless
-- if it takes 1 second to process an object and we have `replicator.pool_size: 20`
in config, the next iteration will happen in 10_000 second which is much
larger than 30 second. However we still consume a lot of memory, so it
makes sense to use saner default.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy b413094704 [#2095] node: Fix collecting child objects
Stop child objects collection if the last returned object (the most "left"
object in the collected chain) starts exactly from the `GETRANGE`'s `from`
value.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-12-30 11:07:35 +03:00
Pavel Karpy 350eecfa13 [#2095] node: Do not allow `GETRANGE` requests with zero length
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-12-30 11:07:35 +03:00
Pavel Karpy 923f84722a Move to frostfs-node
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-28 15:04:29 +03:00
Pavel Karpy d54022eacc [#2047] node: Do not send chunk twice on request forwarding
That could happen if a node forwards request to a node that closed the
connection during the original object stream.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-12-02 11:27:48 +03:00
Evgenii Stratonikov bd25db5d4a [#1984] metrics: Use separate metrics for success/failed requests
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-01 14:18:10 +03:00
Evgenii Stratonikov e21c472dc7 [#1984] services/object: Increase put_req_count after the request is processed
As it is specified in metrics description.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-01 14:18:10 +03:00
Pavel Karpy 51963abce7 [#1972] node: Fix errors comments in the Put service
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-30 16:58:52 +03:00
Evgenii Stratonikov 660c38d07e [#2062] services/policer: Use a proper key for object cache
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov 1779664644 [#2058] services/policer: Fix panic in `shardPolicyWorker`
```
2022/11/15 08:40:56 worker exits from a panic: runtime error: index out of range [0] with length 0
2022/11/15 08:40:56 worker exits from panic: goroutine 1188 [running]:
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:58 +0x10c
panic({0x1042b60, 0xc0015ae018})
	runtime/panic.go:1038 +0x215
github.com/nspcc-dev/neofs-node/pkg/services/policer.(*Policer).shardPolicyWorker.func1()
	github.com/nspcc-dev/neofs-node/pkg/services/policer/process.go:65 +0x366
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
	github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x68
```

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Anton Nikiforov 9a20498f34 [#1940] Removing all trees by container ID if tree ID is empty in `pilorama.Forest.TreeDrop`
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-11-19 11:01:04 +03:00
Pavel Karpy 634792077e [#1502] node: Store lock object on every container node
Includes extending listing methods in the Storage Engine with object types.
It allows tuning replication/policer algorithms: container nodes do
not remove `LOCK` objects as redundant and try to fulfill `LOCK` placement
on the ohter container nodes.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy d5a14041e0 [#2040] node: Do not attach tokens in the assembly process
A container node is expected to have full "get" access to assemble the
object.
A non-container node is expected to forward any request to a container node.
Any token is expected to be issued for an original request sender not for a
node so any new request is invalid by design with that token.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy fd61bdadcb [#2040] node: Attach original meta to the spawned requests
Do not lose meta information of the original requests: cache session and
bearer tokens of the original request b/w a new generated ones. Middle
request wrappers should not contain any meta information, since it is
useless (e.g. ACL service checks only the original tokens).

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy 481b48b942 [#2028] node: Check session token's NBF and IAT
ACL service did not check "Not Valid Before" and "Issued At" claims.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy aadd2ad050 [#2028] node: Do not wrap malformed request errors
After presenting request statuses on the API level, all the errors are
unwrapped before sending to the caller side. It led to a losing invalid
request's context.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov 2522d924b9 [#2037] services/object: Fix concurrent map writes in traverser
```
fatal error: concurrent map writes

goroutine 4337 [running]:
github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*traversal).submitProcessed(...)
        github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:78
github.com/nspcc-dev/neofs-node/pkg/services/object/put.(*distributedTarget).iteratePlacement.func1()
        github.com/nspcc-dev/neofs-node/pkg/services/object/put/distributed.go:198 +0x265
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
        github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
        github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x65
```

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov aa478f1def [#2024] services/object: Unify status errors
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov 3875fef542 [#2024] services/object: Cover corner cases for children OutOfRange
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00