writecache: Fix metric values #759

Merged
dstepanov-yadro merged 1 commit from dstepanov-yadro/frostfs-node:fix/wc_counter into master 2023-10-30 13:05:01 +00:00

Closes #725

Main problem is that c.metrics.Evict (also c.metrics.Delete and c.metrics.Add) changes actual count metric value, but the same object can be flushed twice, so in this case metric value can be negative.

Now only one place left with metrics update - c.estimateCacheSize().

Thanx to @elebedeva for troubleshooting, bug reproduce and fix suggestion.

Closes #725 Main problem is that `c.metrics.Evict` (also `c.metrics.Delete` and `c.metrics.Add`) changes actual count metric value, but the same object can be flushed twice, so in this case metric value can be negative. Now only one place left with metrics update - `c.estimateCacheSize()`. Thanx to @elebedeva for troubleshooting, bug reproduce and fix suggestion.
dstepanov-yadro reviewed 2023-10-27 09:20:58 +00:00
@ -60,6 +60,7 @@ func (c *cache) runFlushLoop(ctx context.Context) {
case <-tt.C:
c.flushSmallObjects(ctx)
tt.Reset(defaultFlushInterval)
c.estimateCacheSize()
Author
Member

Scenario

Cache DB contains two object
Goroutine 1: flush and delete object 1, decrements counter, counter value = 1
Goroutine 2: flush and delete object 2, decrements counter, counter value = 0, set metric value to 0
Goroutine 1: set metric value to 1

Now metric will be updated every 1 second in case of empty cache.

Scenario Cache DB contains two object Goroutine 1: flush and delete object 1, decrements counter, counter value = 1 Goroutine 2: flush and delete object 2, decrements counter, counter value = 0, set metric value to 0 Goroutine 1: set metric value to 1 Now metric will be updated every 1 second in case of empty cache.
Owner

In you scenario we have metric=1, when it should be 0. What is the scenario for the negative metric?

In you scenario we have metric=1, when it should be 0. What is the scenario for the negative metric?
Author
Member

Described in main comment.

When we flush the same object twice: we can put the same object in channel twice, if worker haven't deleted it yet. c.metrics.Evict is called twice, so metric counter also was decremented twice, because recordDeleted is not checked in c.metrics.Evict: https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/pkg/local_object_storage/writecache/writecachebbolt/storage.go#L81

Described in main comment. When we flush the same object twice: we can put the same object in channel twice, if worker haven't deleted it yet. `c.metrics.Evict` is called twice, so metric counter also was decremented twice, because `recordDeleted` is not checked in `c.metrics.Evict`: https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/pkg/local_object_storage/writecache/writecachebbolt/storage.go#L81
fyrchik marked this conversation as resolved
dstepanov-yadro requested review from storage-core-committers 2023-10-27 09:21:32 +00:00
dstepanov-yadro requested review from storage-core-developers 2023-10-27 09:21:32 +00:00
dstepanov-yadro force-pushed fix/wc_counter from d368a5aeba to d4b6ebe7e7 2023-10-27 09:23:04 +00:00 Compare
fyrchik approved these changes 2023-10-27 09:28:18 +00:00
@ -73,7 +73,7 @@ func (c *cache) deleteFromDB(key string) {
err := c.db.Batch(func(tx *bbolt.Tx) error {
b := tx.Bucket(defaultBucket)
key := []byte(key)
recordDeleted = !recordDeleted && b.Get(key) != nil
Owner

Why this change?

Why this change?
Author
Member

We discussed it in previous PR with metrics, so it was just not deleted.

We discussed it in previous PR with metrics, so it was just not deleted.
fyrchik marked this conversation as resolved
Owner

We also need this for the support branch.

We also need this for the support branch.
fyrchik added this to the v0.37.0 milestone 2023-10-30 07:04:05 +00:00
fyrchik modified the milestone from v0.37.0 to v0.38.0 2023-10-30 07:04:15 +00:00
dstepanov-yadro requested review from elebedeva 2023-10-30 10:38:41 +00:00
acid-ant approved these changes 2023-10-30 11:23:52 +00:00
elebedeva approved these changes 2023-10-30 12:29:23 +00:00
dstepanov-yadro merged commit d4b6ebe7e7 into master 2023-10-30 13:05:01 +00:00
dstepanov-yadro deleted branch fix/wc_counter 2023-10-30 13:05:02 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
4 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#759
No description provided.