Optimize existence check in the metabase #874
To check whether an object exists, we check multiple buckets (locked, GC, graveyard, etc.).
Maybe it would be beneficial to store object status by key separately. That way both status and existence checks would consult a single bucket.
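A minimal sketch of the two shapes, assuming a bbolt-backed metabase. The bucket names (`graveyard`, `garbage`, `locked`, `primary`, `status`) and the one-byte status encoding are illustrative, not the actual frostfs-node schema:

```go
package meta

import bolt "go.etcd.io/bbolt"

// existsMultiBucket mirrors the current approach: one read transaction, but
// several independent B+tree lookups, one per bucket.
func existsMultiBucket(db *bolt.DB, addr []byte) (bool, error) {
	buckets := [][]byte{
		[]byte("graveyard"), []byte("garbage"), []byte("locked"), []byte("primary"),
	}
	var found bool
	err := db.View(func(tx *bolt.Tx) error {
		for _, name := range buckets {
			if b := tx.Bucket(name); b != nil && b.Get(addr) != nil {
				// The real metabase interprets a hit differently per bucket
				// (e.g. a graveyard hit means "removed"); elided here.
				found = true
				return nil
			}
		}
		return nil
	})
	return found, err
}

// existsStatusBucket is the proposed shape: a single bucket keyed by object
// address whose value encodes the status, so one lookup answers both
// "does the object exist?" and "what state is it in?".
func existsStatusBucket(db *bolt.DB, addr []byte) (status byte, found bool, err error) {
	err = db.View(func(tx *bolt.Tx) error {
		if b := tx.Bucket([]byte("status")); b != nil {
			if v := b.Get(addr); len(v) > 0 {
				status, found = v[0], true
			}
		}
		return nil
	})
	return
}
```

The single-bucket variant would trade an extra write per object (to keep the status bucket current) for a one-lookup read path.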
Right now the metabase existence check is pretty fast: 5 ms at the 95th percentile under 1.2K RPS. Is the multi-bucket check really a problem?
This task is about supporting >60 shards in a rack. We currently check each shard linearly, so even a tiny improvement is multiplied by the number of shards; benchmarks should be done in this setup.
With big objects I can already see the existence checks being noticeable in terms of the number of allocated objects in this scenario.
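For illustration, the linear pass looks roughly like this; `Shard` and its `Exists` method are hypothetical stand-ins for the real shard API:

```go
package storage

// Shard is a hypothetical stand-in for a single shard of the storage engine.
type Shard interface {
	Exists(addr []byte) (bool, error)
}

// existsLinear asks shards one by one. The worst case (object absent) pays
// the full metabase-lookup cost on every shard, so with >60 shards even a
// small per-check overhead is multiplied ~60x.
func existsLinear(shards []Shard, addr []byte) (bool, error) {
	for _, sh := range shards {
		ok, err := sh.Exists(addr)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}
```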
OK, but one more metabase bucket will increase write latency and space amplification; we need to keep that in mind.
Also, it is possible to run the existence check concurrently. I read the comment in the source code and in issue https://github.com/nspcc-dev/neofs-node/issues/1146, but I also checked it myself, and it looks like with many shards concurrency can lead to a significant performance improvement.
Benchmark results from my laptop:
master:
with concurrency:
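A minimal sketch of the concurrent variant, reusing the hypothetical `Shard` interface from the linear sketch above; a real implementation would bound parallelism and preserve per-shard error semantics:

```go
package storage

import (
	"context"
	"sync"
)

// existsConcurrent queries all shards in parallel and cancels the remaining
// lookups as soon as one shard reports the object as present.
func existsConcurrent(ctx context.Context, shards []Shard, addr []byte) (bool, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		found    bool
		firstErr error
	)
	for _, sh := range shards {
		wg.Add(1)
		go func(sh Shard) {
			defer wg.Done()
			select {
			case <-ctx.Done():
				return // another shard already answered
			default:
			}
			ok, err := sh.Exists(addr)
			mu.Lock()
			defer mu.Unlock()
			switch {
			case ok:
				found = true
				cancel() // skip lookups that have not started yet
			case err != nil && firstErr == nil:
				firstErr = err
			}
		}(sh)
	}
	wg.Wait()
	if found {
		return true, nil
	}
	return false, firstErr
}
```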
The existence check is now performed concurrently.