engine: Optimize Inhume operation to improve speed with large object sets #1476

Merged
fyrchik merged 3 commits from a-savchuk/frostfs-node:boost-inhume-speed into master 2024-12-04 07:37:17 +00:00
Member

Close #1450

For more details, please refer to the discussion in the issue

Close #1450 For more details, please refer to the discussion in the issue
a-savchuk force-pushed boost-inhume-speed from 6f0519ec6a to 2e7ba57e3f 2024-11-07 13:26:46 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 2e7ba57e3f to bc0d61977d 2024-11-13 15:30:49 +00:00 Compare
a-savchuk changed title from WIP: engine: Speed up (*StorageEngine).Inhume to WIP: engine: Make Inhume operation handle objects in parallel 2024-11-13 15:32:30 +00:00
a-savchuk force-pushed boost-inhume-speed from bc0d61977d to f8b85a20bb 2024-11-13 15:33:07 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from f8b85a20bb to b19cf31440 2024-11-13 15:33:57 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from b19cf31440 to 59fc75de0b 2024-11-14 21:52:46 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 59fc75de0b to 59821fb271 2024-11-14 21:56:02 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 59821fb271 to 37dcdaf0e1 2024-11-14 22:00:55 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 37dcdaf0e1 to 147c467151 2024-11-14 22:02:04 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 147c467151 to 13256d80fb 2024-11-18 13:07:32 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 13256d80fb to ec0acc7cfc 2024-11-18 13:14:09 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from ec0acc7cfc to b947896005 2024-11-18 13:19:42 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from b947896005 to d8764c7652 2024-11-19 09:26:08 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from d8764c7652 to 69c983ee96 2024-11-19 09:36:59 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from d69a90b9ce to 2e3063c694 2024-11-19 10:04:33 +00:00 Compare
a-savchuk changed title from WIP: engine: Make Inhume operation handle objects in parallel to engine: Make Inhume operation handle objects in parallel 2024-11-19 10:17:15 +00:00
requested reviews from storage-core-developers, storage-core-committers 2024-11-19 10:17:24 +00:00
aarifullin reviewed 2024-11-19 12:09:48 +00:00
@ -99,0 +91,4 @@
for _, addr := range prm.addrs {
select {
case <-ctx.Done():
taskCancel(context.Cause(ctx))
Member

Probably I got it wrong, but does it really make sense to cancel inherited taskCtx with context deadline exceed. It seems taskCtx is "done" by the same reason?

Probably I got it wrong, but does it really make sense to cancel inherited `taskCtx` with `context deadline exceed`. It seems `taskCtx` is "done" by the same reason?
Author
Member

Yes, you're right. I fixed it

Yes, you're right. I fixed it
aarifullin marked this conversation as resolved
a-savchuk force-pushed boost-inhume-speed from 2e3063c694 to 8ea7b8a504 2024-11-19 12:17:55 +00:00 Compare
dstepanov-yadro reviewed 2024-11-19 14:11:44 +00:00
@ -99,0 +96,4 @@
}
wg.Add(1)
if err := e.inhumePool.Submit(func() {

To tell the truth, I don't like this approach. It turns out that one request with a large number of addresses to inhume can block other Inhume requests. I suggest using concurrent execution for each request.

To tell the truth, I don't like this approach. It turns out that one request with a large number of addresses to inhume can block other `Inhume` requests. I suggest using concurrent execution for each request.
Author
Member

Threw a worker pool away and use grouping objects by shard instead

Threw a worker pool away and use grouping objects by shard instead
dstepanov-yadro marked this conversation as resolved
a-savchuk force-pushed boost-inhume-speed from 8ea7b8a504 to 3fe0a84364 2024-11-19 20:10:49 +00:00 Compare
a-savchuk changed title from engine: Make Inhume operation handle objects in parallel to WIP: engine: Make Inhume operation handle objects in parallel 2024-11-21 12:13:06 +00:00
a-savchuk force-pushed boost-inhume-speed from 3fe0a84364 to 9ac33f8d1c 2024-11-21 12:13:28 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 9ac33f8d1c to d06418592e 2024-11-21 12:15:59 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from d06418592e to 220caa401a 2024-11-21 12:16:21 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 220caa401a to b2bb62fe9b 2024-11-21 13:56:28 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from b2bb62fe9b to 9823510de2 2024-11-21 14:12:55 +00:00 Compare
a-savchuk changed title from WIP: engine: Make Inhume operation handle objects in parallel to engine: Optimize Inhume operation to improve speed with large object sets 2024-11-21 14:17:48 +00:00
a-savchuk force-pushed boost-inhume-speed from 9823510de2 to c712742c4e 2024-11-21 15:04:21 +00:00 Compare
dstepanov-yadro reviewed 2024-11-22 10:50:58 +00:00
@ -131,0 +135,4 @@
//
// If checkLocked is set, [apistatus.ObjectLocked] will be returned if any of
// the objects are locked.
func (e *StorageEngine) groupObjectsByShard(ctx context.Context, addrs []oid.Address, checkLocked bool) (map[string][]int, error) {

I suggest to replace map[string][]int with map[string][]oid.Address. Anyway you do this in inhume method.

I suggest to replace `map[string][]int` with `map[string][]oid.Address`. Anyway you do this in `inhume` method.
Author
Member

The initial motivation was to use less space to store grouped addresses: suppose we store indexes not addresses, we create one address batch and inhume it, and so on, and Go's GC will be able to free memory of the already used address batches.

However, I think it's slightly pointless in this case. You're right, I fixed it

The initial motivation was to use less space to store grouped addresses: suppose we store indexes not addresses, we create one address batch and inhume it, and so on, and Go's GC will be able to free memory of the already used address batches. However, I think it's slightly pointless in this case. You're right, I fixed it
dstepanov-yadro marked this conversation as resolved
dstepanov-yadro reviewed 2024-11-22 10:54:00 +00:00
@ -100,0 +93,4 @@
var errLocked *apistatus.ObjectLocked
for shardID, addrIndexes := range addrsPerShard {

I suggest to perform Inhume on different shards concurrently. Or I missing something?

I suggest to perform Inhume on different shards concurrently. Or I missing something?
Author
Member

I discussed this with @fyrchik, and we decided not to do it concurrently because the grouping itself increases the inhume speed enough. At least, we won't be doing it for now

I discussed this with @fyrchik, and we decided not to do it concurrently because the grouping itself increases the inhume speed enough. At least, we won't be doing it for now
dstepanov-yadro marked this conversation as resolved
a-savchuk force-pushed boost-inhume-speed from c712742c4e to 86d127b76a 2024-11-22 13:19:24 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 86d127b76a to 8265462afa 2024-11-25 08:32:38 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 8265462afa to 90047c6cf4 2024-11-25 08:40:21 +00:00 Compare
dstepanov-yadro approved these changes 2024-11-25 10:55:32 +00:00
Dismissed
a-savchuk force-pushed boost-inhume-speed from 90047c6cf4 to 89edb31068 2024-11-26 07:18:31 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 89edb31068 to 434744ddb5 2024-11-27 08:24:25 +00:00 Compare
a-savchuk force-pushed boost-inhume-speed from 434744ddb5 to a49d6e2cce 2024-11-28 08:14:51 +00:00 Compare
a-savchuk dismissed dstepanov-yadro's review 2024-11-28 08:14:51 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

Author
Member

@acid-ant I left a comment after our recent discussion. I suggest to keep the previous behavior
a49d6e2cce/pkg/local_object_storage/engine/inhume.go (L174-L183)

@acid-ant I left a comment after our recent discussion. I suggest to keep the previous behavior https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/a49d6e2cceb0dca8404139de53a904355da96305/pkg/local_object_storage/engine/inhume.go#L174-L183
a-savchuk force-pushed boost-inhume-speed from a49d6e2cce to d4f4280662 2024-11-29 10:50:26 +00:00 Compare
dstepanov-yadro approved these changes 2024-11-29 13:19:12 +00:00
Dismissed
fyrchik reviewed 2024-12-02 06:47:18 +00:00
@ -98,3 +89,1 @@
return InhumeRes{}, new(apistatus.ObjectLocked)
}
}
var inhumePrm shard.InhumePrm
Owner

The rename of this parameters pollutes diff for not reason.
This hasn't changed, but it colored lines attract my attention.
Please, move this to a separate commit or revert.

var shPrm shard.InhumePrm
if prm.forceRemoval {
		shPrm.ForceRemoval()
}
The rename of this parameters pollutes diff for not reason. This hasn't changed, but it colored lines attract my attention. Please, move this to a separate commit or revert. ``` var shPrm shard.InhumePrm if prm.forceRemoval { shPrm.ForceRemoval() } ```
Owner

Other than that, lgtm/

Other than that, lgtm/
Author
Member

Done

Done
a-savchuk force-pushed boost-inhume-speed from d4f4280662 to 5a0f06449b 2024-12-02 07:23:28 +00:00 Compare
a-savchuk dismissed dstepanov-yadro's review 2024-12-02 07:23:29 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

fyrchik approved these changes 2024-12-02 08:15:13 +00:00
Dismissed
fyrchik approved these changes 2024-12-02 08:16:42 +00:00
Dismissed
@ -118,0 +113,4 @@
if _, err := sh.Inhume(ctx, shPrm); err != nil {
switch {
case errors.As(err, &errLocked):
case errors.Is(err, shard.ErrLockObjectRemoval):
Owner

The previous code replaced the error with meta.ErrLockObjectRemoval
Why has this changed?

The previous code replaced the error with `meta.ErrLockObjectRemoval` Why has this changed?
Author
Member

They're the same error

// ErrLockObjectRemoval is returned when inhume operation is being
// performed on lock object, and it is not a forced object removal.
var ErrLockObjectRemoval = meta.ErrLockObjectRemoval
They're the same error ```go // ErrLockObjectRemoval is returned when inhume operation is being // performed on lock object, and it is not a forced object removal. var ErrLockObjectRemoval = meta.ErrLockObjectRemoval ```
Owner

These kinds of things are better looking in a separate commit, far from obvious.

These kinds of things are better looking in a separate commit, far from obvious.
Author
Member

Done

Done
a-savchuk changed title from engine: Optimize Inhume operation to improve speed with large object sets to WIP: engine: Optimize Inhume operation to improve speed with large object sets 2024-12-02 08:17:27 +00:00
Author
Member

Set WIP because I need to check metabase resync will work correctly

Ref: 3b61cb4f49

Set WIP because I need to check metabase resync will work correctly Ref: 3b61cb4f49
a-savchuk force-pushed boost-inhume-speed from 5a0f06449b to 5fa80a7fa3 2024-12-02 12:32:43 +00:00 Compare
a-savchuk changed title from WIP: engine: Optimize Inhume operation to improve speed with large object sets to engine: Optimize Inhume operation to improve speed with large object sets 2024-12-03 08:50:17 +00:00
a-savchuk force-pushed boost-inhume-speed from 5fa80a7fa3 to 281d65435e 2024-12-04 07:09:54 +00:00 Compare
a-savchuk dismissed fyrchik's review 2024-12-04 07:09:54 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

a-savchuk dismissed fyrchik's review 2024-12-04 07:09:54 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

fyrchik approved these changes 2024-12-04 07:37:04 +00:00
fyrchik merged commit 281d65435e into master 2024-12-04 07:37:17 +00:00
acid-ant approved these changes 2024-12-04 07:41:57 +00:00
@ -131,2 +166,4 @@
objectExists bool
)
e.iterateOverSortedShards(addr, func(_ int, sh hashedShard) (stop bool) {
Member

stop is no more needed.

`stop` is no more needed.
Author
Member

Sure, but it makes the function's signature more explicit

Sure, but it makes the function's signature more explicit
a-savchuk deleted branch boost-inhume-speed 2024-12-04 07:44:19 +00:00
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-core-developers
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1476
No description provided.