[#1648] writecache: Fix race condition when reporting cache size metrics

There is a race condition when multiple cache operations try to report
the cache size metrics simultaneously. Consider the following example:
- the initial total size of objects stored in the cache is 2
- worker X deletes an object and reads the cache size, which is 1
- worker Y deletes an object and reads the cache size, which is 0
- worker Y reports the cache size it learnt, which is 0
- worker X reports the cache size it learnt, which is 1

As a result, the observed cache size is 1 (i.e. one object remains
in the cache), which is incorrect because the actual cache size is 0.
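
The interleaving above can be replayed deterministically. This is only an
illustrative sketch, not the writecache code: `sizeGauge` and
`replayInterleaving` are hypothetical names, and the gauge is assumed to keep
only the last value it was given, as a typical metrics gauge does.

```go
package main

import "fmt"

// sizeGauge is a hypothetical stand-in for a metrics gauge:
// it remembers only the most recently reported value.
type sizeGauge struct{ value uint64 }

func (g *sizeGauge) Set(v uint64) { g.value = v }

// replayInterleaving performs the steps from the example in the same order
// and returns the actual cache size and the value left in the gauge.
func replayInterleaving() (actual, reported uint64) {
	gauge := &sizeGauge{}
	actual = 2 // initial total size

	actual--          // worker X deletes an object
	seenByX := actual // worker X reads the size: 1
	actual--          // worker Y deletes an object
	seenByY := actual // worker Y reads the size: 0

	gauge.Set(seenByY) // worker Y reports first
	gauge.Set(seenByX) // worker X reports last and overwrites with a stale value
	return actual, gauge.value
}

func main() {
	actual, reported := replayInterleaving()
	fmt.Printf("actual=%d reported=%d\n", actual, reported) // actual=0 reported=1
}
```

The stale value wins because reading the size and reporting it are two
separate steps, so another worker can run between them.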

To fix this, let's report the metrics periodically in the flush loop.
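
A minimal sketch of the idea, assuming a single loop that owns the reporting:
workers only change the size, and the loop reads the current value right
before publishing it, so a stale reading cannot overwrite a fresh one. The
names `gauge`, `reportLoop` and `runDemo` are hypothetical and are not taken
from the writecache code; in real use the tick channel could come from
`time.NewTicker(interval).C`.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// gauge is a hypothetical metrics gauge holding the last reported value.
type gauge struct{ value atomic.Int64 }

func (g *gauge) Set(v int64) { g.value.Store(v) }
func (g *gauge) Get() int64  { return g.value.Load() }

// reportLoop is the only place that writes to the gauge. On every tick it
// reads the current size and reports it immediately.
func reportLoop(ctx context.Context, ticks <-chan time.Time, size *atomic.Int64, g *gauge) {
	for {
		select {
		case <-ticks:
			g.Set(size.Load())
		case <-ctx.Done():
			return
		}
	}
}

// runDemo deletes two objects, triggers one report, and returns the gauge value.
func runDemo() int64 {
	var size atomic.Int64
	size.Store(2)
	g := &gauge{}

	ctx, cancel := context.WithCancel(context.Background())
	ticks := make(chan time.Time) // unbuffered, so the demo controls each tick
	done := make(chan struct{})
	go func() {
		defer close(done)
		reportLoop(ctx, ticks, &size, g)
	}()

	size.Add(-1) // worker X deletes an object, reports nothing
	size.Add(-1) // worker Y deletes an object, reports nothing

	ticks <- time.Now() // one tick: the loop reads the size and reports it
	cancel()
	<-done // the loop has exited, so the report has completed
	return g.Get()
}

func main() {
	fmt.Println("reported:", runDemo()) // reported: 0
}
```

The trade-off is that the metric may lag behind the real size by up to one
reporting interval, but it no longer gets stuck on an outdated value.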

Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
Aleksey Savchuk 2025-02-18 10:51:43 +03:00
parent 9b29e7392f
commit 02f3a7f65c
Signed by: a-savchuk
GPG key ID: 70C0A7FF6F9C4639

@@ -87,6 +87,9 @@ func (c *cache) pushToFlushQueue(ctx context.Context, fl *flushLimiter) {
}
c.modeMtx.RUnlock()
// counter changed by fstree
c.estimateCacheSize()
case <-ctx.Done():
return
}