engine: Optimize Inhume operation to improve speed with large object sets #1476

Open
a-savchuk wants to merge 2 commits from a-savchuk/frostfs-node:boost-inhume-speed into master
Member

Close #1450

For more details, please refer to the discussion in the issue

Close #1450 For more details, please refer to the discussion in the issue
a-savchuk added 3 commits 2024-11-07 10:06:14 +00:00
This function was a duplicate of the `(*StorageEngine).addShard`
but it didn't allow to set an engine's shard pool size what can
be very useful sometimes.

Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
[#1450] engine: Batch addresses in (*StorageEngine).Inhume
All checks were successful
Tests and linters / Run gofumpt (pull_request) Successful in 1m21s
DCO action / DCO (pull_request) Successful in 1m47s
Vulncheck / Vulncheck (pull_request) Successful in 2m18s
Pre-commit hooks / Pre-commit (pull_request) Successful in 2m24s
Build / Build Components (pull_request) Successful in 2m44s
Tests and linters / gopls check (pull_request) Successful in 3m0s
Tests and linters / Staticcheck (pull_request) Successful in 3m11s
Tests and linters / Lint (pull_request) Successful in 3m57s
Tests and linters / Tests (pull_request) Successful in 4m39s
Tests and linters / Tests with -race (pull_request) Successful in 6m6s
6f0519ec6a
Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
a-savchuk force-pushed boost-inhume-speed from 6f0519ec6a to 2e7ba57e3f 2024-11-07 13:26:46 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 2e7ba57e3f to bc0d61977d 2024-11-13 15:30:49 +00:00 Compare
a-savchuk changed title from WIP: engine: Speed up (*StorageEngine).Inhume to WIP: engine: Make Inhume operation handle objects in parallel 2024-11-13 15:32:30 +00:00
a-savchuk force-pushed boost-inhume-speed from bc0d61977d to f8b85a20bb 2024-11-13 15:33:07 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from f8b85a20bb to b19cf31440 2024-11-13 15:33:57 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from b19cf31440 to 59fc75de0b 2024-11-14 21:52:46 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 59fc75de0b to 59821fb271 2024-11-14 21:56:02 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 59821fb271 to 37dcdaf0e1 2024-11-14 22:00:55 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 37dcdaf0e1 to 147c467151 2024-11-14 22:02:04 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 147c467151 to 13256d80fb 2024-11-18 13:07:32 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 13256d80fb to ec0acc7cfc 2024-11-18 13:14:09 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from ec0acc7cfc to b947896005 2024-11-18 13:19:42 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from b947896005 to d8764c7652 2024-11-19 09:26:08 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from d8764c7652 to 69c983ee96 2024-11-19 09:36:59 +00:00 Compare
a-savchuk added 1 commit 2024-11-19 10:02:01 +00:00
[#1450] node/config: Allow to configure engine's InhumePoolSize
All checks were successful
Tests and linters / Run gofumpt (pull_request) Successful in 1m28s
DCO action / DCO (pull_request) Successful in 1m56s
Vulncheck / Vulncheck (pull_request) Successful in 2m30s
Pre-commit hooks / Pre-commit (pull_request) Successful in 2m35s
Build / Build Components (pull_request) Successful in 3m33s
Tests and linters / gopls check (pull_request) Successful in 3m30s
Tests and linters / Staticcheck (pull_request) Successful in 4m10s
Tests and linters / Lint (pull_request) Successful in 4m24s
Tests and linters / Tests (pull_request) Successful in 6m22s
Tests and linters / Tests with -race (pull_request) Successful in 6m24s
d69a90b9ce
Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
a-savchuk force-pushed boost-inhume-speed from d69a90b9ce to 2e3063c694 2024-11-19 10:04:33 +00:00 Compare
a-savchuk changed title from WIP: engine: Make Inhume operation handle objects in parallel to engine: Make Inhume operation handle objects in parallel 2024-11-19 10:17:15 +00:00
a-savchuk requested review from storage-core-developers 2024-11-19 10:17:23 +00:00
a-savchuk requested review from storage-core-committers 2024-11-19 10:17:24 +00:00
aarifullin reviewed 2024-11-19 12:09:48 +00:00
@ -99,0 +91,4 @@
for _, addr := range prm.addrs {
select {
case <-ctx.Done():
taskCancel(context.Cause(ctx))
Member

Probably I got it wrong, but does it really make sense to cancel inherited taskCtx with context deadline exceed. It seems taskCtx is "done" by the same reason?

Probably I got it wrong, but does it really make sense to cancel inherited `taskCtx` with `context deadline exceed`. It seems `taskCtx` is "done" by the same reason?
Author
Member

Yes, you're right. I fixed it

Yes, you're right. I fixed it
aarifullin marked this conversation as resolved
a-savchuk force-pushed boost-inhume-speed from 2e3063c694 to 8ea7b8a504 2024-11-19 12:17:55 +00:00 Compare
dstepanov-yadro reviewed 2024-11-19 14:11:44 +00:00
@ -99,0 +96,4 @@
}
wg.Add(1)
if err := e.inhumePool.Submit(func() {

To tell the truth, I don't like this approach. It turns out that one request with a large number of addresses to inhume can block other Inhume requests. I suggest using concurrent execution for each request.

To tell the truth, I don't like this approach. It turns out that one request with a large number of addresses to inhume can block other `Inhume` requests. I suggest using concurrent execution for each request.
Author
Member

Threw a worker pool away and use grouping objects by shard instead

Threw a worker pool away and use grouping objects by shard instead
a-savchuk force-pushed boost-inhume-speed from 8ea7b8a504 to 3fe0a84364 2024-11-19 20:10:49 +00:00 Compare
a-savchuk changed title from engine: Make Inhume operation handle objects in parallel to WIP: engine: Make Inhume operation handle objects in parallel 2024-11-21 12:13:06 +00:00
a-savchuk force-pushed boost-inhume-speed from 3fe0a84364 to 9ac33f8d1c 2024-11-21 12:13:28 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 9ac33f8d1c to d06418592e 2024-11-21 12:15:59 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from d06418592e to 220caa401a 2024-11-21 12:16:21 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 220caa401a to b2bb62fe9b 2024-11-21 13:56:28 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from b2bb62fe9b to 9823510de2 2024-11-21 14:12:55 +00:00 Compare
a-savchuk changed title from WIP: engine: Make Inhume operation handle objects in parallel to engine: Optimize Inhume operation to improve speed with large object sets 2024-11-21 14:17:48 +00:00
a-savchuk force-pushed boost-inhume-speed from 9823510de2 to c712742c4e 2024-11-21 15:04:21 +00:00 Compare
dstepanov-yadro reviewed 2024-11-22 10:50:58 +00:00
@ -131,0 +135,4 @@
//
// If checkLocked is set, [apistatus.ObjectLocked] will be returned if any of
// the objects are locked.
func (e *StorageEngine) groupObjectsByShard(ctx context.Context, addrs []oid.Address, checkLocked bool) (map[string][]int, error) {

I suggest to replace map[string][]int with map[string][]oid.Address. Anyway you do this in inhume method.

I suggest to replace `map[string][]int` with `map[string][]oid.Address`. Anyway you do this in `inhume` method.
Author
Member

The initial motivation was to use less space to store grouped addresses: suppose we store indexes not addresses, we create one address batch and inhume it, and so on, and Go's GC will be able to free memory of the already used address batches.

However, I think it's slightly pointless in this case. You're right, I fixed it

The initial motivation was to use less space to store grouped addresses: suppose we store indexes not addresses, we create one address batch and inhume it, and so on, and Go's GC will be able to free memory of the already used address batches. However, I think it's slightly pointless in this case. You're right, I fixed it
dstepanov-yadro reviewed 2024-11-22 10:54:00 +00:00
@ -100,0 +93,4 @@
var errLocked *apistatus.ObjectLocked
for shardID, addrIndexes := range addrsPerShard {

I suggest to perform Inhume on different shards concurrently. Or I missing something?

I suggest to perform Inhume on different shards concurrently. Or I missing something?
Author
Member

I discussed this with @fyrchik, and we decided not to do it concurrently because the grouping itself increases the inhume speed enough. At least, we won't be doing it for now

I discussed this with @fyrchik, and we decided not to do it concurrently because the grouping itself increases the inhume speed enough. At least, we won't be doing it for now
a-savchuk force-pushed boost-inhume-speed from c712742c4e to 86d127b76a 2024-11-22 13:19:24 +00:00 Compare
All checks were successful
Tests and linters / Run gofumpt (pull_request) Successful in 1m55s
Required
Details
DCO action / DCO (pull_request) Successful in 2m10s
Required
Details
Vulncheck / Vulncheck (pull_request) Successful in 2m36s
Required
Details
Pre-commit hooks / Pre-commit (pull_request) Successful in 2m59s
Required
Details
Build / Build Components (pull_request) Successful in 3m6s
Required
Details
Tests and linters / gopls check (pull_request) Successful in 3m25s
Required
Details
Tests and linters / Staticcheck (pull_request) Successful in 3m38s
Required
Details
Tests and linters / Lint (pull_request) Successful in 4m26s
Required
Details
Tests and linters / Tests (pull_request) Successful in 5m0s
Required
Details
Tests and linters / Tests with -race (pull_request) Successful in 6m33s
Required
Details
This pull request doesn't have enough approvals yet. 0 of 2 approvals granted.
This branch is out-of-date with the base branch
You are not authorized to merge this pull request.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u boost-inhume-speed:a-savchuk-boost-inhume-speed
git checkout a-savchuk-boost-inhume-speed
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-core-developers
TrueCloudLab/storage-core-committers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1476
No description provided.