[#1648] writecache: Fix race condition when reporting cache size metrics
All checks were successful
Vulncheck / Vulncheck (push) Successful in 1m2s
Build / Build Components (push) Successful in 1m53s
Pre-commit hooks / Pre-commit (push) Successful in 2m7s
Tests and linters / Run gofumpt (push) Successful in 2m45s
Tests and linters / Tests with -race (push) Successful in 3m17s
Tests and linters / Lint (push) Successful in 3m20s
Tests and linters / gopls check (push) Successful in 3m15s
Tests and linters / Staticcheck (push) Successful in 3m21s
Tests and linters / Tests (push) Successful in 3m37s
OCI image / Build container images (push) Successful in 4m37s
DCO action / DCO (pull_request) Successful in 28s
Vulncheck / Vulncheck (pull_request) Successful in 50s
Pre-commit hooks / Pre-commit (pull_request) Successful in 1m28s
Build / Build Components (pull_request) Successful in 1m38s
Tests and linters / Run gofumpt (pull_request) Successful in 3m0s
Tests and linters / Tests with -race (pull_request) Successful in 3m10s
Tests and linters / Tests (pull_request) Successful in 3m14s
Tests and linters / Lint (pull_request) Successful in 3m21s
Tests and linters / Staticcheck (pull_request) Successful in 3m21s
Tests and linters / gopls check (pull_request) Successful in 3m34s
There is a race condition when multiple cache operations try to report
the cache size metrics simultaneously. Consider the following example:
- the initial total size of objects stored in the cache is 2
- worker X deletes an object and reads the cache size, which is 1
- worker Y deletes an object and reads the cache size, which is 0
- worker Y reports the cache size it learnt, which is 0
- worker X reports the cache size it learnt, which is 1

As a result, the observed cache size is 1 (i.e. one object remains in
the cache), which is incorrect because the actual cache size is 0.

To fix this, let's report the metrics periodically in the flush loop.

Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
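The interleaving described in the commit message can be replayed step by step in a single goroutine, which makes the stale overwrite easy to see. This is only an illustrative sketch: the `gauge` type and the `replayStaleReport` function are hypothetical stand-ins, not part of the writecache code, and the gauge is assumed to have last-write-wins semantics.

```go
package main

import "fmt"

// gauge is a hypothetical stand-in for a metrics gauge: the last Set wins.
type gauge struct{ value int64 }

func (g *gauge) Set(v int64) { g.value = v }

// replayStaleReport replays the interleaving from the commit message in
// program order and returns the actual size and the last reported size.
func replayStaleReport() (actual, reported int64) {
	var g gauge
	size := int64(2) // initial total size of objects stored in the cache

	size--          // worker X deletes an object...
	seenByX := size // ...and reads the cache size, which is 1

	size--          // worker Y deletes an object...
	seenByY := size // ...and reads the cache size, which is 0

	g.Set(seenByY) // worker Y reports 0
	g.Set(seenByX) // worker X reports 1, overwriting the fresher value

	return size, g.value
}

func main() {
	actual, reported := replayStaleReport()
	fmt.Printf("actual=%d reported=%d\n", actual, reported) // actual=0 reported=1
}
```

The problem is not the size counter itself but the gap between reading the size and publishing it: any worker can publish a value that is already out of date.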
parent 9b29e7392f
commit 02f3a7f65c
1 changed file with 3 additions and 0 deletions
@@ -87,6 +87,9 @@ func (c *cache) pushToFlushQueue(ctx context.Context, fl *flushLimiter) {
 			}
 
 			c.modeMtx.RUnlock()
+
+			// counter changed by fstree
+			c.estimateCacheSize()
 		case <-ctx.Done():
 			return
 		}
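The idea behind the hunk above, publishing the size from one periodic loop instead of from every worker, might be sketched as follows. This is a minimal sketch under stated assumptions, not the writecache implementation: `sizeReporter`, `run`, and `runDemo` are hypothetical names, the gauge is modelled as an atomic integer, and `-1` is used to mean "nothing reported yet".

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// sizeReporter is a hypothetical model of a cache with a size metric.
type sizeReporter struct {
	size     atomic.Int64 // current cache size, changed by workers
	reported atomic.Int64 // last value published to the (modelled) gauge
}

// run publishes the current size on every tick until ctx is cancelled.
// Only this goroutine writes to reported, so a stale reading taken by a
// worker can never overwrite a fresher one.
func (r *sizeReporter) run(ctx context.Context, interval time.Duration) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			r.reported.Store(r.size.Load())
		case <-ctx.Done():
			return
		}
	}
}

// runDemo starts the loop, lets two workers delete one object each, and
// waits (with a timeout) until the published value matches the size.
func runDemo() (size, reported int64) {
	r := &sizeReporter{}
	r.size.Store(2)
	r.reported.Store(-1) // assumption: -1 marks "not reported yet"

	ctx, cancel := context.WithCancel(context.Background())
	var loop sync.WaitGroup
	loop.Add(1)
	go func() { defer loop.Done(); r.run(ctx, time.Millisecond) }()

	var workers sync.WaitGroup
	for i := 0; i < 2; i++ {
		workers.Add(1)
		go func() { defer workers.Done(); r.size.Add(-1) }() // delete, no report
	}
	workers.Wait()

	deadline := time.Now().Add(2 * time.Second)
	for r.reported.Load() != r.size.Load() && time.Now().Before(deadline) {
		time.Sleep(time.Millisecond)
	}
	cancel()
	loop.Wait()
	return r.size.Load(), r.reported.Load()
}

func main() {
	size, reported := runDemo()
	fmt.Printf("actual=%d reported=%d\n", size, reported)
}
```

The trade-off is that the metric may lag behind the real size by up to one tick interval, but it converges to the actual value instead of staying wrong indefinitely.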