engine: Optimize Inhume operation to improve speed with large object sets #1476
Close #1450
For more details, please refer to the discussion in the issue.
Force-pushed from 6f0519ec6a to 2e7ba57e3f
Force-pushed from 2e7ba57e3f to bc0d61977d
Changed title from "WIP: engine: Speed up (*StorageEngine).Inhume" to "WIP: engine: Make Inhume operation handle objects in parallel"
Force-pushed from bc0d61977d to f8b85a20bb
Force-pushed from f8b85a20bb to b19cf31440
Force-pushed from b19cf31440 to 59fc75de0b
Force-pushed from 59fc75de0b to 59821fb271
Force-pushed from 59821fb271 to 37dcdaf0e1
Force-pushed from 37dcdaf0e1 to 147c467151
Force-pushed from 147c467151 to 13256d80fb
Force-pushed from 13256d80fb to ec0acc7cfc
Force-pushed from ec0acc7cfc to b947896005
Force-pushed from b947896005 to d8764c7652
Force-pushed from d8764c7652 to 69c983ee96
Force-pushed from d69a90b9ce to 2e3063c694
Changed title from "WIP: engine: Make Inhume operation handle objects in parallel" to "engine: Make Inhume operation handle objects in parallel"

@ -99,0 +91,4 @@
for _, addr := range prm.addrs {
select {
case <-ctx.Done():
taskCancel(context.Cause(ctx))
Probably I got it wrong, but does it really make sense to cancel the inherited taskCtx with the context-deadline-exceeded cause? It seems taskCtx is "done" for the same reason?

Yes, you're right. I fixed it.
Force-pushed from 2e3063c694 to 8ea7b8a504
@ -99,0 +96,4 @@
}
wg.Add(1)
if err := e.inhumePool.Submit(func() {
To tell the truth, I don't like this approach. It turns out that one request with a large number of addresses to inhume can block other Inhume requests. I suggest using concurrent execution for each request.

I threw the worker pool away and use grouping of objects by shard instead.
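A rough sketch of the replacement approach described above, using stand-in types (string addresses and shard IDs, a hypothetical shardOf function) rather than the real engine types: addresses are first bucketed by shard, so each shard can later receive its whole batch in a single call instead of one pool task per address.

```go
// groupByShard buckets addresses by the shard they belong to.
// shardOf stands in for the engine's shard-selection logic.
func groupByShard(addrs []string, shardOf func(addr string) (shardID string)) map[string][]string {
	groups := make(map[string][]string)
	for _, addr := range addrs {
		id := shardOf(addr)
		groups[id] = append(groups[id], addr)
	}
	return groups
}
```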
Force-pushed from 8ea7b8a504 to 3fe0a84364
Changed title from "engine: Make Inhume operation handle objects in parallel" to "WIP: engine: Make Inhume operation handle objects in parallel"
Force-pushed from 3fe0a84364 to 9ac33f8d1c
Force-pushed from 9ac33f8d1c to d06418592e
Force-pushed from d06418592e to 220caa401a
Force-pushed from 220caa401a to b2bb62fe9b
Force-pushed from b2bb62fe9b to 9823510de2
Changed title from "WIP: engine: Make Inhume operation handle objects in parallel" to "engine: Optimize Inhume operation to improve speed with large object sets"
Force-pushed from 9823510de2 to c712742c4e
@ -131,0 +135,4 @@
//
// If checkLocked is set, [apistatus.ObjectLocked] will be returned if any of
// the objects are locked.
func (e *StorageEngine) groupObjectsByShard(ctx context.Context, addrs []oid.Address, checkLocked bool) (map[string][]int, error) {
I suggest replacing map[string][]int with map[string][]oid.Address. You do this in the inhume method anyway.

The initial motivation was to use less space to store the grouped addresses: suppose we store indexes rather than addresses; then we create one address batch, inhume it, and so on, and Go's GC can free the memory of the already processed address batches. However, I think that's slightly pointless in this case. You're right, I fixed it.
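For illustration only, here are the two return shapes being discussed, with a stand-in Address type and a hypothetical shardOf helper instead of the real oid.Address and shard-sorting code: one map stores indexes into the original slice, the other stores the addresses themselves.

```go
type Address = string // stand-in for oid.Address

// groupIndexesByShard records positions in the original slice (the initial variant).
func groupIndexesByShard(addrs []Address, shardOf func(Address) string) map[string][]int {
	res := make(map[string][]int)
	for i, a := range addrs {
		id := shardOf(a)
		res[id] = append(res[id], i)
	}
	return res
}

// groupAddressesByShard records the addresses themselves (the variant suggested in the review).
func groupAddressesByShard(addrs []Address, shardOf func(Address) string) map[string][]Address {
	res := make(map[string][]Address)
	for _, a := range addrs {
		id := shardOf(a)
		res[id] = append(res[id], a)
	}
	return res
}
```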
@ -100,0 +93,4 @@
var errLocked *apistatus.ObjectLocked
for shardID, addrIndexes := range addrsPerShard {
I suggest performing Inhume on different shards concurrently. Or am I missing something?

I discussed this with @fyrchik, and we decided not to do it concurrently because the grouping itself increases the inhume speed enough. At least, we won't be doing it for now.
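A minimal sketch, with hypothetical names, of the sequential shape that was kept: every shard processes its whole batch before the next shard is touched, so there is no fan-out across shards.

```go
// inhumeGrouped walks the grouped addresses shard by shard, sequentially.
// inhumeOnShard stands in for the per-shard Inhume call seen in the diffs above.
func inhumeGrouped(addrsPerShard map[string][]string, inhumeOnShard func(shardID string, addrs []string) error) error {
	for shardID, addrs := range addrsPerShard {
		if err := inhumeOnShard(shardID, addrs); err != nil {
			return err
		}
	}
	return nil
}
```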
Force-pushed from c712742c4e to 86d127b76a
Force-pushed from 86d127b76a to 8265462afa
Force-pushed from 8265462afa to 90047c6cf4
Force-pushed from 90047c6cf4 to 89edb31068
Force-pushed from 89edb31068 to 434744ddb5
Force-pushed from 434744ddb5 to a49d6e2cce
New commits pushed, approval review dismissed automatically according to repository settings
@acid-ant I left a comment after our recent discussion. I suggest keeping the previous behavior:
a49d6e2cce/pkg/local_object_storage/engine/inhume.go (L174-L183)
Force-pushed from a49d6e2cce to d4f4280662
@ -98,3 +89,1 @@
return InhumeRes{}, new(apistatus.ObjectLocked)
}
}
var inhumePrm shard.InhumePrm
The rename of these parameters pollutes the diff for no reason. This hasn't really changed, but its colored lines attract my attention. Please move this to a separate commit or revert it. Other than that, LGTM.

Done
Force-pushed from d4f4280662 to 5a0f06449b
New commits pushed, approval review dismissed automatically according to repository settings
@ -118,0 +113,4 @@
if _, err := sh.Inhume(ctx, shPrm); err != nil {
switch {
case errors.As(err, &errLocked):
case errors.Is(err, shard.ErrLockObjectRemoval):
The previous code replaced the error with meta.ErrLockObjectRemoval. Why has this changed?

They're the same error.

These kinds of things look better in a separate commit; it's far from obvious.

Done
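A self-contained sketch of why "they're the same error" makes the two checks equivalent, assuming (as the exchange above implies, not shown here) that shard.ErrLockObjectRemoval simply re-exports meta.ErrLockObjectRemoval: errors.Is matches on value identity, so either name works.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the two package-level sentinels discussed above.
var errMetaLockObjectRemoval = errors.New("lock object removal") // plays the role of meta.ErrLockObjectRemoval
var errShardLockObjectRemoval = errMetaLockObjectRemoval         // plays the role of shard.ErrLockObjectRemoval

func main() {
	// An error produced deeper in the stack, wrapping the meta sentinel.
	err := fmt.Errorf("inhume failed: %w", errMetaLockObjectRemoval)

	// Both checks succeed because both names refer to the same value.
	fmt.Println(errors.Is(err, errMetaLockObjectRemoval))  // true
	fmt.Println(errors.Is(err, errShardLockObjectRemoval)) // true
}
```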
Changed title from "engine: Optimize Inhume operation to improve speed with large object sets" to "WIP: engine: Optimize Inhume operation to improve speed with large object sets"

Set WIP because I need to check that metabase resync will work correctly.
Ref: 3b61cb4f49
Force-pushed from 5a0f06449b to 5fa80a7fa3
Changed title from "WIP: engine: Optimize Inhume operation to improve speed with large object sets" to "engine: Optimize Inhume operation to improve speed with large object sets"
Force-pushed from 5fa80a7fa3 to 281d65435e
New commits pushed, approval review dismissed automatically according to repository settings
New commits pushed, approval review dismissed automatically according to repository settings
@ -131,2 +166,4 @@
objectExists bool
)
e.iterateOverSortedShards(addr, func(_ int, sh hashedShard) (stop bool) {
stop is no longer needed.

Sure, but it makes the function's signature more explicit.
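For context, a tiny sketch of the callback shape under discussion, with a hypothetical iterator over string shard IDs instead of hashedShard: the iterator stops as soon as the handler returns true, which is why a named stop result can be kept purely to keep the signature self-documenting even when a particular caller never stops early.

```go
// iterateShards calls handler for each shard in order and stops early
// once handler reports stop == true.
func iterateShards(shards []string, handler func(idx int, sh string) (stop bool)) {
	for i, sh := range shards {
		if handler(i, sh) {
			return
		}
	}
}
```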