Simple object with lock and expiration time not removed after locks are expired #145
Reference: TrueCloudLab/frostfs-node#145
Let's say we have:
Current epoch is 100
A simple object with `__NEOFS__EXPIRATION_EPOCH=101`
The simple object is locked with a lock until epoch 103
Steps:
Tick epochs until current epoch = 103
Check object availability
Expected Behavior
The simple object should be removed, since the lock has expired and the object itself is expired
Current Behavior
The simple object remains present in storage even after waiting several extra epochs
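The condition the GC should be checking can be sketched with a minimal model (the types and names below are illustrative, not the actual frostfs-node code): an object becomes garbage only once both its own expiration epoch and its lock are behind the current epoch. The bug is that the expired object is never revisited after the lock itself expires.

```go
package main

import "fmt"

// Object is an illustrative stand-in for a stored object, not a real
// frostfs-node type.
type Object struct {
	ExpirationEpoch uint64 // from the __NEOFS__EXPIRATION_EPOCH attribute
	LockedUntil     uint64 // 0 means "not locked"
}

// isGarbage reports whether the object may be collected at the given
// epoch: it must be expired AND no lock may still be active. The lock
// boundary here follows the issue's expectation: the object becomes
// removable at the lock epoch itself (epoch 103 in the scenario above).
func isGarbage(obj Object, currentEpoch uint64) bool {
	expired := currentEpoch > obj.ExpirationEpoch
	locked := currentEpoch < obj.LockedUntil
	return expired && !locked
}

func main() {
	obj := Object{ExpirationEpoch: 101, LockedUntil: 103}
	for epoch := uint64(100); epoch <= 104; epoch++ {
		fmt.Printf("epoch %d: garbage=%v\n", epoch, isGarbage(obj, epoch))
	}
}
```

With the scenario's numbers, the object is protected through epoch 102 and becomes collectible from epoch 103 on; the reported bug is that the real GC never performs this re-check once the lock is gone.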
705aad7370 to 699485924e

Title changed from "WIP: Simple object with lock and expiration time not removed after locks are expired" to "Simple object with lock and expiration time not removed after locks are expired".

@@ -121,1 +120,3 @@
-	removerSleepInterval time.Duration
+	removerBatchSize          int
+	removerSleepInterval      time.Duration
+	expiredCollectorBatchSize int
With many parameters our system becomes harder to configure (fine-tuning and determining the best value is slow and error-prone), and we already have many complex dependencies between parameter values and system behaviour.
Can we just use the same batch size? Is there a reason why they should be different?
removerBatchSize is the batch size for removing objects; expiredCollectorBatchSize is the batch size for marking objects expired. These parameters relate to different processes, so collapsing them into a single parameter would create yet another complex dependency.
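As a rough sketch of why the two knobs are independent (illustrative types and numbers, not the real frostfs-node GC): the expired-object collector and the remover are separate passes over storage, so each batch size sets the granularity of a different loop and can be tuned on its own.

```go
package main

import "fmt"

// gcCfg mirrors the shape of the discussed configuration; the field
// names follow the diff above, everything else is illustrative.
type gcCfg struct {
	removerBatchSize          int // objects deleted per remover pass
	expiredCollectorBatchSize int // objects marked expired per collector pass
}

// batches returns how many passes of at most `size` items are needed
// to cover `n` items (rounding up).
func batches(n, size int) int {
	return (n + size - 1) / size
}

func main() {
	cfg := gcCfg{removerBatchSize: 100, expiredCollectorBatchSize: 500}
	const objects = 1000
	// The two processes walk the same object set at different granularities.
	fmt.Println("collector passes:", batches(objects, cfg.expiredCollectorBatchSize))
	fmt.Println("remover passes:", batches(objects, cfg.removerBatchSize))
}
```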
@@ -122,0 +120,4 @@
+	removerBatchSize int
+	removerSleepInterval time.Duration
+	expiredCollectorBatchSize int
+	expiredCollectorWorkersCount int
Don't forget to update docs/storage-node-configuration.md.
Done.
@@ -242,0 +268,4 @@
+	errGroup, egCtx := errgroup.WithContext(ctx)
+	errGroup.SetLimit(workersCount)
+	errGroup.Go(func() error {
Why do we need a goroutine here?
If s.getExpiredObjects fails with an error, then all the other goroutines executing s.handleExpiredObjects will be cancelled via egCtx, and vice versa. It is possible to write this code without the extra goroutine, but it would be much trickier.

699485924e to acf79b8425
@@ -237,0 +248,4 @@
+	s.collectExpiredObjects(ctx, e)
+}
+func (s *Shard) getExpiredObjectsParameters() (workersCount, batchSize int) {
Why not do it once on init?
getExpiredObjectsParameters() doesn't look expensive to run.
I somewhat agree, but here we add a function which looks unnecessary; batchSize := c.gcCfg.getExpiredObjectBatchSize doesn't make the code simpler to me. Also, how would SIGHUP handling become easier?
I guess I see: we could take the lock in the function and also take it during the SIGHUP reload.
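A minimal sketch of that locking scheme (illustrative names, not the actual frostfs-node code): the GC reads its parameters through a small locked accessor at the start of every cycle, so a SIGHUP handler can swap the values between cycles instead of the parameters being fixed once at init.

```go
package main

import (
	"fmt"
	"sync"
)

// gcCfg holds GC tuning parameters behind a lock so they can be
// reloaded at runtime. Field names follow the diff above; the rest
// is an illustrative sketch.
type gcCfg struct {
	mtx                          sync.RWMutex
	expiredCollectorWorkersCount int
	expiredCollectorBatchSize    int
}

// getExpiredObjectsParameters is called at the start of every GC cycle,
// so each cycle observes the latest reloaded values.
func (c *gcCfg) getExpiredObjectsParameters() (workersCount, batchSize int) {
	c.mtx.RLock()
	defer c.mtx.RUnlock()
	return c.expiredCollectorWorkersCount, c.expiredCollectorBatchSize
}

// reload is what a SIGHUP handler would call after re-reading the config.
func (c *gcCfg) reload(workers, batch int) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.expiredCollectorWorkersCount = workers
	c.expiredCollectorBatchSize = batch
}

func main() {
	cfg := &gcCfg{expiredCollectorWorkersCount: 5, expiredCollectorBatchSize: 500}
	w, b := cfg.getExpiredObjectsParameters()
	fmt.Println("before reload:", w, b)
	cfg.reload(10, 1000) // simulated SIGHUP
	w, b = cfg.getExpiredObjectsParameters()
	fmt.Println("after reload:", w, b)
}
```

Reading through the accessor costs one RLock per GC cycle, which is negligible next to the iteration work, and it is what makes the per-call (rather than once-on-init) design pay off under SIGHUP.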
0112da3974 to 7a31988a36