There is a race condition when multiple cache operations try to report
the cache size metrics simultaneously. Consider the following example:
- the initial cache size (the number of stored objects) is 2
- worker X deletes an object and reads the cache size, which is 1
- worker Y deletes an object and reads the cache size, which is 0
- worker Y reports the cache size it learnt, which is 0
- worker X reports the cache size it learnt, which is 1
As a result, the observed cache size is 1 (i.e. one object remains
in the cache), which is incorrect because the actual cache size is 0.
To fix this, let's report the metrics periodically in the flush loop.
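A minimal sketch of the fix, assuming hypothetical names (`flushLoop`,
`estimateCacheSize` and `SetEstimateSize` are illustrative, not the
actual API):

    func (c *cache) flushLoop(ctx context.Context) {
        t := time.NewTicker(c.flushInterval)
        defer t.Stop()
        for {
            select {
            case <-ctx.Done():
                return
            case <-t.C:
                c.flush(ctx)
                // A single reporter reads and reports the size in one
                // place, so stale reads cannot overwrite fresh ones.
                c.metrics.SetEstimateSize(c.estimateCacheSize())
            }
        }
    }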
Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
Since Go 1.22, a "for" statement with a "range" clause can iterate
over integer values from zero up to (but not including) an upper limit.
gopatch script:
@@
var i, e expression
@@
-for i := 0; i <= e - 1; i++ {
+for i := range e {
...
}
@@
var i, e expression
@@
-for i := 0; i <= e; i++ {
+for i := range e + 1 {
...
}
@@
var i, e expression
@@
-for i := 0; i < e; i++ {
+for i := range e {
...
}
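For example, a hypothetical loop spawning workers is rewritten as
follows:

    // before: classic three-clause loop
    for i := 0; i < workerCount; i++ {
        go worker(i)
    }

    // after Go 1.22: range over an integer, iterating 0..workerCount-1
    for i := range workerCount {
        go worker(i)
    }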
Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
The Badger implementation is untested and works poorly, yet it still
requires human resources to maintain.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
LRU `Peek`/`Contains` take the LRU mutex _inside_ of a `View`
transaction. The `View` transaction itself takes `mmapLock` [1], which
is lifted after the transaction finishes (in `tx.Commit()` ->
`tx.close()` -> `tx.db.removeTx`).
When we evict items from the LRU cache, the mutex order is reversed:
first we take the LRU mutex and then execute `Batch`, which _does_ take
`mmapLock` in case we need to remap. Thus the deadlock.
[1] 8f4a7e1f92/db.go (L708)
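A minimal sketch of the inverted lock order, with `mmapLock` and `lruMu`
standing in for bbolt's `mmaplock` and the LRU cache mutex (the names
are illustrative):

    var (
        mmapLock sync.RWMutex // stands in for bbolt's db.mmaplock
        lruMu    sync.Mutex   // stands in for the LRU cache mutex
    )

    func peekInsideView() {
        mmapLock.RLock() // View acquires mmapLock first...
        defer mmapLock.RUnlock()
        lruMu.Lock() // ...then Peek/Contains takes the LRU mutex
        defer lruMu.Unlock()
    }

    func evictAndBatch() {
        lruMu.Lock() // eviction holds the LRU mutex first...
        defer lruMu.Unlock()
        mmapLock.Lock() // ...then Batch takes mmapLock to remap
        defer mmapLock.Unlock()
    }

Run concurrently, `peekInsideView` can hold `mmapLock` while waiting
for `lruMu`, and `evictAndBatch` can hold `lruMu` while waiting for
`mmapLock`: a classic lock-order inversion.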
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
It was needed before we started to flush during the transition to
`degraded` mode. Now it is confusing.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Degraded mode allows us to operate without an SSD, thus the writecache
should be unavailable in this mode.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Allow a user to initiate flushing objects from the writecache.
We need this in two cases:
1. During a writecache storage schema update, the cache should be
   flushed with the old version of the node and started clean with the
   new one.
2. During SSD replacement, to avoid data loss.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Set the flush mark inside the flush worker, because writing to the
blobstor can fail. Since each evicted object must be deleted, it is
reasonable to do this in the evict callback.
The evict callback is protected by the LRU mutex and thus potentially
interferes with the `Get` and `Iterate` methods. This problem will be
addressed in the future.
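A sketch of the intended flow under these assumptions; `flushCh`,
`blobstor.Put` and `setFlushMark` are illustrative names, not the
actual API:

    func (c *cache) flushWorker() {
        for obj := range c.flushCh {
            if err := c.blobstor.Put(obj); err != nil {
                continue // the write failed, keep the object unmarked
            }
            // Set the flush mark only after a successful write;
            // the LRU evict callback then deletes the marked object.
            c.setFlushMark(obj)
        }
    }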
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
1. Remove the in-memory cache. It doesn't persist objects, and if we
   want more speed, the `NoSync` option can be used for the bolt DB.
2. Put to the metabase in a synchronous fashion. This considerably
   simplifies the overall logic and plays nicely with the metabase
   bolt DB batch settings.
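For reference, a minimal sketch of the `NoSync` trade-off mentioned in
point 1 (the path and mode are illustrative):

    db, err := bbolt.Open("writecache.db", 0o600, nil)
    if err != nil {
        return err
    }
    // Skip fsync on each commit: faster writes, but a crash may
    // lose recently written data, which is acceptable for a cache.
    db.NoSync = true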
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>