[#1648] writecache: Fix race condition when reporting cache size metrics

There is a race condition when multiple cache operations try to report
the cache size metrics simultaneously. Consider the following example:
- the initial total size of objects stored in the cache is 2
- worker X deletes an object and reads the cache size, which is 1
- worker Y deletes an object and reads the cache size, which is 0
- worker Y reports the cache size it learnt, which is 0
- worker X reports the cache size it learnt, which is 1

As a result, the observed cache size is 1 (i.e. one object remains
in the cache), which is incorrect because the actual cache size is 0.

To fix this, let's report the metrics periodically in the flush loop.

Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
2025-02-18 10:51:43 +03:00
commit 02f3a7f65c
parent 9b29e7392f
1 changed file with 3 additions and 0 deletions
--- a/pkg/local_object_storage/writecache/flush.go
+++ b/pkg/local_object_storage/writecache/flush.go
@@ -87,6 +87,9 @@ func (c *cache) pushToFlushQueue(ctx context.Context, fl *flushLimiter) {
 			}

 			c.modeMtx.RUnlock()
+
+			// counter changed by fstree
+			c.estimateCacheSize()
 		case <-ctx.Done():
 			return
 		}
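
The interleaving described in the commit message can be replayed in a small standalone program. This is only a sketch under stated assumptions: the `gauge` type and the `racyReports` and `periodicReport` functions are hypothetical names invented for illustration and are not part of the writecache package, and the body of `estimateCacheSize` is not shown in this diff, so the sketch does not claim to mirror it. It contrasts reporting a value captured by each worker with having a single reporter read the current counter.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// gauge is a hypothetical stand-in for a metrics sink that keeps
// only the most recently reported value.
type gauge struct{ v atomic.Int64 }

func (g *gauge) Set(n int64) { g.v.Store(n) }
func (g *gauge) Get() int64  { return g.v.Load() }

// racyReports replays, step by step, the interleaving from the commit
// message: each worker reports the size it read right after its own
// delete, and the reports arrive in the opposite order.
func racyReports() int64 {
	var size atomic.Int64
	size.Store(2)
	var g gauge

	x := size.Add(-1) // worker X deletes an object and reads 1
	y := size.Add(-1) // worker Y deletes an object and reads 0
	g.Set(y)          // worker Y reports 0
	g.Set(x)          // worker X reports its stale 1 last
	return g.Get()
}

// periodicReport lets the workers only mutate the counter. A single
// reporter (the flush loop in the commit, assumed here to run one tick
// after the workers finish) reads the current value and reports it.
func periodicReport() int64 {
	var size atomic.Int64
	size.Store(2)
	var g gauge

	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			size.Add(-1) // delete an object, report nothing
		}()
	}
	wg.Wait()

	g.Set(size.Load()) // one reporter, always the current value
	return g.Get()
}

func main() {
	fmt.Println("per-operation reporting:", racyReports())
	fmt.Println("periodic reporting:", periodicReport())
}
```

With a single reporter, a stale value can no longer overwrite a newer one; at worst the metric lags behind the real size until the next report.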