Fix writecache counters #595
No reviewers
Labels
No labels
P0
P1
P2
P3
badger
frostfs-adm
frostfs-cli
frostfs-ir
frostfs-lens
frostfs-node
good first issue
triage
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/frostfs-node#595
Loading…
Reference in a new issue
No description provided.
Delete branch "dstepanov-yadro/frostfs-node:fix/writecache_bbolt_db_counter"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #585
There are several problems with writecache counters:
If the big object was putted more than once and first object wasn't flushed before second put, than the counter will be always more than one.
Small object flush collector can collect objects more than once, so the counter can be negative.
This problems (especially second one) lead to invalid cache size estimation, so write cache can be logically overflown.
This PR fixes this problems.
FSTree benchmarks to check counter logic effect
FSTree master bench:
FSTree current bench:
9334ef5593
toe0acd24642
e0acd24642
to52d9e99115
83a2e8a58a
to44d42a6756
44d42a6756
to87c28d8bd8
87c28d8bd8
to8382f4b785
8382f4b785
toe3de76c494
e3de76c494
toa01676f9f1
5a08d5c137
tobf9f4d9318
bf9f4d9318
to019a1ac0cb
@ -397,4 +432,2 @@
}
// PutStream puts executes handler on a file opened for write.
func (t *FSTree) PutStream(addr oid.Address, handler func(*os.File) error) error {
Unused
@ -74,6 +74,7 @@ func (c *cache) Put(ctx context.Context, prm common.PutPrm) (common.PutRes, erro
if err == nil {
added = true
}
c.estimateCacheSize()
To update metrics after object put
@ -58,3 +60,4 @@
if err := c.fsTree.Open(readOnly); err != nil {
return fmt.Errorf("could not open FSTree: %w", err)
}
if err := c.fsTree.Init(); err != nil {
To init object counter
019a1ac0cb
tofcbce1becb
fcbce1becb
toad4a066ead
@ -67,2 +74,3 @@
b := tx.Bucket(defaultBucket)
return b.Delete([]byte(key))
key := []byte(key)
recordDeleted = !recordDeleted && b.Get(key) != nil
batch func can be called more than once, so check previous value is required
ad4a066ead
to3d775d5434
3d775d5434
to41837d2a10
41837d2a10
to81609b8259
WIP: Fix writecache countersto Fix writecache counters81609b8259
tod76a78eb5d
@ -371,0 +397,4 @@
t.counter.Inc()
var targetFileExists bool
if _, e := os.Stat(p); e == nil {
targetFileExists = true
It seems you can get rid of the flag like that:
:)
No, we have to os.Rename the file regardless of whether it exists.
We still have a race condition with
Stat
in normal scenario (and we dont if this is done under a mutex).No, this is done under a mutex:
@ -97,2 +100,3 @@
)
c.objCounters.IncDB()
if recordAdded {
c.objCounters.cDB.Add(1)
c.objCounters.cDB.Add(1)
->c.objCounters.IncDB()
?c.objCounters.IncDB()
was removed, as there is no need to have this method, all fields are available.@ -15,1 +13,3 @@
return db + fstree
dbCount := c.objCounters.DB()
fsCount := c.objCounters.FS()
if fsCount > 0 {
I suppose
dbCount
andfsCount
may be negative (despite they areuint64
). Don't you need to consider the such case?I mean
fsCount
can beMAX_UINT64_SIZE - n
meaning it's negative but still the check works out and decrementedYes, they may be negative. But what should we do?
Panic or error? Looks too much.
Use zero instead of negative value? Then we won't know for sure that we have a bug.
So negative values are good point to create a bug, but they shouldn't break the program flow.
@ -364,10 +394,21 @@ func (t *FSTree) writeAndRename(tmpPath, p string, data []byte) error {
case syscall.ENOSPC:
err = common.ErrNoSpace
_ = os.RemoveAll(tmpPath)
case syscall.EEXIST:
Why is it removed?
It looks like this
case
statement was used for retries. Retries were deleted, so thiscase
is redundant.@ -89,2 +90,3 @@
b := tx.Bucket(defaultBucket)
return b.Put([]byte(obj.addr), obj.data)
key := []byte(obj.addr)
recordAdded = !recordAdded && b.Get(key) == nil
Why, though? We should always check the actual value, because we cannot assume anything else.
See
boltdb.Batch
comment:I understand this as
function will be called at least once
. Ifkey
was inserted by first call, then it will be already already in db on second call.No, this violates transaction isolation: transaction either applies (nil error) or not (non-nil error, and can be retried in Batch in this case). So even if we applied everything on our first execution, I doubt the changes are persisted.
If what I wrote is false, we have much bigger problems (consider metabase and all its checks, for example)
Oh, I missed the
regardless
part.But no, still, we are talking about the data in the database, the function can be executed twice but under no circumstances should we apply a transaction twice.
Here is a call to batch trigger: https://github.com/etcd-io/bbolt/blob/master/db.go#L983
If any transaction in the batch returns false, all previous functions can be reexecuted with nil error, but not reapplied (we still return err from Update().
Also, how about avoiding PUT in this case?
It should reduce the amount of pages we flush during commit (I am not sure whether its worth the effort, though, the situation is quite rare).
Ok, you're right. Fixed.
@ -351,3 +370,2 @@
err = fmt.Errorf("couldn't read file after %d retries", retryCount)
// unreachable, but precaution never hurts, especially 1 day before release.
tmpPath := p + "#" + uuid.NewString()
Looks like comment above no more relevant.
Hm, which one? Comment was deleted.
Ah. Removed this cool story completely. Thx.
d76a78eb5d
tocf1e5fd529
cf1e5fd529
to193835b50b
@ -0,0 +17,4 @@
func (c *noopCounter) Inc() {}
func (c *noopCounter) Dec() {}
type SimpleCounter struct {
s/SimpleCounter/AtomicCounter/
?No,
atomic
is implementation detailAgree, can we mention in the interface comment that the counter implementation must be thread-safe?
@ -528,7 +534,6 @@ func (t *FSTree) NumberOfObjects() (uint64, error) {
if err != nil {
return 0, fmt.Errorf("could not walk through %s directory: %w", t.RootPath, err)
}
?
Is there some mistake here?
Unrelated style changes mixed with everything else lead to bigger diffs, which make it harder to review.
It is small, true, but I see it in almost every review -- It is not hard to avoid this.
@ -351,3 +348,1 @@
err = fmt.Errorf("couldn't read file after %d retries", retryCount)
// unreachable, but precaution never hurts, especially 1 day before release.
tmpPath := p + "#" + uuid.NewString()
We should not panic in fstree.
uuid replaced with atomic counter.
@ -371,0 +378,4 @@
}
err = os.Rename(tmpPath, p)
if err == nil && targetFileExists {
t.counter.Dec() //target already exists, tmp file removed, so no counter changes
Why do we
Inc
thenDec
instead of just not doingInc
in the first place?It is also nice to increase the counter after the file has been written.
You've answered your question:
It is also nice to increase the counter after the file has been written.
I mean when the temp file is written the object does not yet exists, it start to exist after rename, right?
But temp file exists, right?
Will
Iterate
see it?Yes
As I understand, for
Delete
it is ok, because the lock is already taken (not obvious, but works).Iterate
does not use any locks currently, and I don't see anything changed in this PR.Am I missing something?
renamed to file counter.
@ -72,4 +74,1 @@
inFS-- //small.bolt DB file
}
c.objCounters.cDB.Store(inDB)
Why do we update counters for db and fstree in different places?
db counter relates to db. fstree counter relates to fstree. But both of counters we update in
cache.Open
method.Ok, I don't mind, but to me it adds cognitive complexity, so at some point sth similar to #610 will happen.
193835b50b
to6f0a717af1
6f0a717af1
todec6e34ab2
dec6e34ab2
to5c85aa6cfa
5c85aa6cfa
to2efe9cc1be