[#1648] writecache: Fix race condition when reporting cache size metrics

There is a race condition when multiple cache operations try to report
the cache size metrics simultaneously. Consider the following example:
- the initial total size of objects stored in the cache is 2
- worker X deletes an object and reads the cache size, which is 1
- worker Y deletes an object and reads the cache size, which is 0
- worker Y reports the cache size it learnt, which is 0
- worker X reports the cache size it learnt, which is 1

As a result, the observed cache size is 1 (i.e. one object remains
in the cache), which is incorrect because the actual cache size is 0.

To fix this, let's report the metrics periodically in the flush loop.

Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
Aleksey Savchuk 2025-02-18 10:51:43 +03:00
parent 9b29e7392f
commit 02f3a7f65c
Signed by: a-savchuk
GPG key ID: 70C0A7FF6F9C4639

@@ -87,6 +87,9 @@ func (c *cache) pushToFlushQueue(ctx context.Context, fl *flushLimiter) {
}
c.modeMtx.RUnlock()
// counter changed by fstree
c.estimateCacheSize()
case <-ctx.Done():
return
}