engine: Optimize Inhume operation to improve speed with large object sets #1476
Reference: TrueCloudLab/frostfs-node#1476
Branch: a-savchuk/frostfs-node:boost-inhume-speed
Close #1450
For more details, please refer to the discussion in that issue.
(*testEngineWrapper).setInitializedShards (251d17e36a)
Inhume benchmark for handling large number of addresses (b12968e959)
(*StorageEngine).Inhume
Force-pushed from 6f0519ec6a to 2e7ba57e3f
Force-pushed from 2e7ba57e3f to bc0d61977d
Changed title from "WIP: engine: Speed up (*StorageEngine).Inhume" to "WIP: engine: Make Inhume operation handle objects in parallel"
Force-pushed from bc0d61977d to f8b85a20bb
Force-pushed from f8b85a20bb to b19cf31440
Force-pushed from b19cf31440 to 59fc75de0b
Force-pushed from 59fc75de0b to 59821fb271
Force-pushed from 59821fb271 to 37dcdaf0e1
Force-pushed from 37dcdaf0e1 to 147c467151
Force-pushed from 147c467151 to 13256d80fb
Force-pushed from 13256d80fb to ec0acc7cfc
Force-pushed from ec0acc7cfc to b947896005
Force-pushed from b947896005 to d8764c7652
Force-pushed from d8764c7652 to 69c983ee96
InhumePoolSize
Force-pushed from d69a90b9ce to 2e3063c694
Changed title from "WIP: engine: Make Inhume operation handle objects in parallel" to "engine: Make Inhume operation handle objects in parallel"

@@ -99,0 +91,4 @@
for _, addr := range prm.addrs {
	select {
	case <-ctx.Done():
		taskCancel(context.Cause(ctx))
Comment: Probably I got it wrong, but does it really make sense to cancel the inherited taskCtx with "context deadline exceeded"? It seems taskCtx is "done" for the same reason.

Reply: Yes, you're right. I fixed it.
Force-pushed from 2e3063c694 to 8ea7b8a504
@@ -99,0 +96,4 @@
}
wg.Add(1)
if err := e.inhumePool.Submit(func() {
Comment: To tell the truth, I don't like this approach. It turns out that one request with a large number of addresses to inhume can block other Inhume requests. I suggest using concurrent execution for each request.

Reply: I threw the worker pool away and use grouping of objects by shard instead.
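A hedged sketch of the blocking concern raised in this thread, under assumed names (sharedPool and inhumeRequest are illustrative, not frostfs-node code): with a fixed-size pool shared by all requests, one request that submits a task per address can occupy every worker, so a later, smaller Inhume request has to wait for free slots.

package main

import (
	"fmt"
	"sync"
	"time"
)

// sharedPool stands in for an engine-wide worker pool such as the inhumePool
// from the diff; the buffered channel acts as a slot semaphore.
type sharedPool struct{ slots chan struct{} }

func newSharedPool(size int) *sharedPool {
	return &sharedPool{slots: make(chan struct{}, size)}
}

// Submit blocks until a worker slot is free, then runs the task in a goroutine.
func (p *sharedPool) Submit(task func()) {
	p.slots <- struct{}{}
	go func() {
		defer func() { <-p.slots }()
		task()
	}()
}

// inhumeRequest models one Inhume call that submits a task per address.
func inhumeRequest(pool *sharedPool, name string, addrs int) {
	var wg sync.WaitGroup
	for i := 0; i < addrs; i++ {
		wg.Add(1)
		pool.Submit(func() {
			defer wg.Done()
			time.Sleep(5 * time.Millisecond) // pretend to inhume one address
		})
	}
	wg.Wait()
	fmt.Println(name, "finished")
}

func main() {
	pool := newSharedPool(4)

	// A request with many addresses grabs the workers first...
	go inhumeRequest(pool, "large request (400 addresses)", 400)
	time.Sleep(time.Millisecond)

	// ...so this small request competes for the same four slots and is
	// delayed until the large request releases them.
	inhumeRequest(pool, "small request (2 addresses)", 2)
}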
Force-pushed from 8ea7b8a504 to 3fe0a84364
Changed title from "engine: Make Inhume operation handle objects in parallel" to "WIP: engine: Make Inhume operation handle objects in parallel"
Force-pushed from 3fe0a84364 to 9ac33f8d1c
Force-pushed from 9ac33f8d1c to d06418592e
Force-pushed from d06418592e to 220caa401a
Force-pushed from 220caa401a to b2bb62fe9b
Force-pushed from b2bb62fe9b to 9823510de2
Changed title from "WIP: engine: Make Inhume operation handle objects in parallel" to "engine: Optimize Inhume operation to improve speed with large object sets"
Force-pushed from 9823510de2 to c712742c4e
@@ -131,0 +135,4 @@
//
// If checkLocked is set, [apistatus.ObjectLocked] will be returned if any of
// the objects are locked.
func (e *StorageEngine) groupObjectsByShard(ctx context.Context, addrs []oid.Address, checkLocked bool) (map[string][]int, error) {
Comment: I suggest replacing map[string][]int with map[string][]oid.Address. You do this in the inhume method anyway.

Reply: The initial motivation was to use less space to store the grouped addresses: if we store indexes rather than addresses, we can create one address batch at a time, inhume it, and so on, and Go's GC can free the memory of the already processed batches. However, I think it's slightly pointless in this case. You're right, I fixed it.
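A rough sketch of the grouping step discussed above, using the map[string][]oid.Address shape the reviewer suggested. Address, shard, and ownerShard below are simplified placeholders rather than the real frostfs-node types, and the real method additionally checks whether any of the addresses are locked.

package main

import (
	"fmt"
	"hash/fnv"
)

// Address is a simplified placeholder for oid.Address.
type Address string

// shard is a simplified placeholder for a storage shard.
type shard struct{ id string }

// ownerShard is a hypothetical stand-in for the engine's shard-selection
// logic: it just hashes the address onto one of the shards.
func ownerShard(shards []shard, addr Address) string {
	h := fnv.New32a()
	h.Write([]byte(addr))
	return shards[h.Sum32()%uint32(len(shards))].id
}

// groupObjectsByShard maps a shard ID to the addresses to be inhumed on that
// shard, so that each shard later receives a single batched call instead of
// one call per address.
func groupObjectsByShard(shards []shard, addrs []Address) map[string][]Address {
	groups := make(map[string][]Address, len(shards))
	for _, addr := range addrs {
		id := ownerShard(shards, addr)
		groups[id] = append(groups[id], addr)
	}
	return groups
}

func main() {
	shards := []shard{{id: "shard-1"}, {id: "shard-2"}}
	addrs := []Address{"cnr1/obj-a", "cnr1/obj-b", "cnr2/obj-c"}
	fmt.Println(groupObjectsByShard(shards, addrs))
}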
@@ -100,0 +93,4 @@
var errLocked *apistatus.ObjectLocked
for shardID, addrIndexes := range addrsPerShard {
Comment: I suggest performing Inhume on different shards concurrently. Or am I missing something?

Reply: I discussed this with @fyrchik, and we decided not to do it concurrently because the grouping itself increases the inhume speed enough. At least, we won't be doing it for now.
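A hypothetical sketch of the consuming side as agreed in this thread: shards are handled one after another (kept sequential), each receives one batched call, and a locked-object error is returned to the caller unchanged. ObjectLocked, Address, shardInhumer, and memShard are placeholders, not the apistatus or engine types.

package main

import (
	"context"
	"errors"
	"fmt"
)

// Address is a simplified placeholder for oid.Address.
type Address string

// ObjectLocked stands in for apistatus.ObjectLocked.
type ObjectLocked struct{}

func (ObjectLocked) Error() string { return "object is locked" }

// shardInhumer stands in for a shard's batched Inhume call.
type shardInhumer interface {
	Inhume(ctx context.Context, addrs []Address) error
}

// inhumeGrouped walks the shard groups sequentially, issuing a single batched
// call per shard and propagating a locked-object error unchanged.
func inhumeGrouped(ctx context.Context, shards map[string]shardInhumer, groups map[string][]Address) error {
	for shardID, addrs := range groups {
		sh, ok := shards[shardID]
		if !ok {
			return fmt.Errorf("shard %s not found", shardID)
		}
		err := sh.Inhume(ctx, addrs)
		if err == nil {
			continue
		}
		var errLocked *ObjectLocked
		if errors.As(err, &errLocked) {
			return err // report the lock status to the caller as-is
		}
		return fmt.Errorf("inhume on shard %s: %w", shardID, err)
	}
	return nil
}

// memShard is a toy shard used only to exercise the sketch.
type memShard struct{ locked bool }

func (s memShard) Inhume(_ context.Context, _ []Address) error {
	if s.locked {
		return new(ObjectLocked)
	}
	return nil
}

func main() {
	shards := map[string]shardInhumer{
		"shard-1": memShard{},
		"shard-2": memShard{locked: true},
	}
	groups := map[string][]Address{
		"shard-1": {"obj-a"},
		"shard-2": {"obj-b", "obj-c"},
	}
	fmt.Println(inhumeGrouped(context.Background(), shards, groups))
}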
Force-pushed from c712742c4e to 86d127b76a