[#421] Try using badger for the write-cache #462

Merged
fyrchik merged 1 commit from ale64bit/frostfs-node:feature/421-write-cache-badger into master 2023-08-07 08:17:02 +00:00
Member

Signed-off-by: Alejandro Lopez a.lopez@yadro.com

Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>
ale64bit changed title from [#421] Try using badger for the write-cache to WIP: [#421] Try using badger for the write-cache 2023-06-22 12:03:24 +00:00
ale64bit force-pushed feature/421-write-cache-badger from 1f170cad91 to 3d7c992409 2023-06-23 09:55:44 +00:00 Compare
ale64bit changed title from WIP: [#421] Try using badger for the write-cache to [#421] Try using badger for the write-cache 2023-06-23 09:56:41 +00:00
ale64bit requested review from storage-core-committers 2023-06-23 09:56:54 +00:00
ale64bit requested review from storage-core-developers 2023-06-23 09:56:55 +00:00
ale64bit added the
discussion
label 2023-06-23 09:56:59 +00:00

https://dgraph.io/docs/badger/get-started/#garbage-collection

Badger relies on the client to perform garbage collection at a time of their choosing.

I think we should add some GC goroutine.

https://dgraph.io/docs/badger/get-started/#garbage-collection ``` Badger relies on the client to perform garbage collection at a time of their choosing. ``` I think we should add some GC goroutine.
ale64bit force-pushed feature/421-write-cache-badger from 3d7c992409 to 0df45009ed 2023-06-26 07:28:53 +00:00 Compare
ale64bit reviewed 2023-06-26 07:30:11 +00:00
@ -0,0 +13,4 @@
case <-c.closeCh:
return
case <-t.C:
for {
Author
Member

Unfortunately linter doesn't allow writing:

for c.db.RunValueLogGC(0.5) == nil {
}

because of empty body, even though the function call is stateful.

Unfortunately linter doesn't allow writing: ``` for c.db.RunValueLogGC(0.5) == nil { } ``` because of empty body, even though the function call is stateful.

hack the linter: add logging:)

hack the linter: add logging:)
Author
Member

done :)

done :)
Author
Member

https://dgraph.io/docs/badger/get-started/#garbage-collection

Badger relies on the client to perform garbage collection at a time of their choosing.

I think we should add some GC goroutine.

Done

> https://dgraph.io/docs/badger/get-started/#garbage-collection > > ``` > Badger relies on the client to perform garbage collection at a time of their choosing. > ``` > > I think we should add some GC goroutine. Done
ale64bit force-pushed feature/421-write-cache-badger from 0df45009ed to eb5c961501 2023-06-26 09:23:50 +00:00 Compare
dstepanov-yadro approved these changes 2023-06-26 11:09:37 +00:00
ale64bit requested review from storage-core-committers 2023-06-27 07:14:50 +00:00
dstepanov-yadro approved these changes 2023-06-28 06:32:07 +00:00
acid-ant approved these changes 2023-06-28 19:09:53 +00:00
@ -0,0 +39,4 @@
func (c *cache) runFlushLoop() {
ch := c.closeCh
c.wg.Add(1)
go func() {
Member

Looks redundant. Why we need to read from closeCh in routine and do nothing?

Looks redundant. Why we need to read from `closeCh` in routine and do nothing?
Author
Member

fixed

fixed
acid-ant marked this conversation as resolved
ale64bit force-pushed feature/421-write-cache-badger from eb5c961501 to 590192ce4a 2023-06-29 06:57:52 +00:00 Compare
Author
Member

Besides the benchmarks in this PR, I added some write test results in the original issue: #421 (comment)

Besides the benchmarks in this PR, I added some write test results in the original issue: https://git.frostfs.info/TrueCloudLab/frostfs-node/issues/421#issuecomment-14409
ale64bit force-pushed feature/421-write-cache-badger from 590192ce4a to 4e445deb59 2023-07-03 09:44:03 +00:00 Compare
Author
Member

Added a config to select the writecache implementation.

Added a config to select the writecache implementation.
ale64bit requested review from acid-ant 2023-07-03 09:44:43 +00:00
ale64bit requested review from dstepanov-yadro 2023-07-03 09:44:47 +00:00
acid-ant approved these changes 2023-07-03 11:29:15 +00:00
aarifullin approved these changes 2023-07-03 14:23:29 +00:00
dstepanov-yadro reviewed 2023-07-05 08:37:27 +00:00
@ -126,6 +128,7 @@ type shardCfg struct {
writecacheCfg struct {
enabled bool
typ writecacheconfig.Type

I hope this option is just for testing?

I hope this option is just for testing?
Author
Member

wdym?

wdym?

If we are going to support two DB's, we will have to provide an update procedure. If, for example, I changed bbolt to badger in the config, then the system must first flush objects from bbolt, and then change the storage to badger.

If we are going to support two DB's, we will have to provide an update procedure. If, for example, I changed bbolt to badger in the config, then the system must first flush objects from bbolt, and then change the storage to badger.
Author
Member

I see. Yes, I agree. This option is mostly intended to be able to use both during a transition period. But maybe @fyrchik knows better.

I see. Yes, I agree. This option is mostly intended to be able to use both during a transition period. But maybe @fyrchik knows better.
dstepanov-yadro marked this conversation as resolved
dstepanov-yadro approved these changes 2023-07-05 08:40:31 +00:00
ale64bit force-pushed feature/421-write-cache-badger from 4e445deb59 to e7358b9808 2023-07-07 08:24:14 +00:00 Compare
ale64bit force-pushed feature/421-write-cache-badger from e7358b9808 to c95d71346e 2023-07-07 11:24:01 +00:00 Compare
ale64bit force-pushed feature/421-write-cache-badger from c95d71346e to 3b53f8fbb1 2023-07-10 07:27:19 +00:00 Compare
Author
Member

@fyrchik what's the plan with this PR?

@fyrchik what's the plan with this PR?
dstepanov-yadro approved these changes 2023-07-28 12:51:00 +00:00
aarifullin reviewed 2023-08-01 11:09:25 +00:00
@ -0,0 +118,4 @@
var err error
if c.db != nil {
err = c.db.Close()
if err != nil {
Member

But if it is successfully closed, it is still non-nil and db can keep semi-closed state. Is it correct? Do not we need to assign c.db = nil always?

But if it is successfully closed, it is still non-nil and `db` can keep semi-closed state. Is it correct? Do not we need to assign `c.db = nil` always?
Author
Member

Hmm, that's interesting. I just copied it from the existing writecache:
https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/pkg/local_object_storage/writecache/writecache.go#L166-L173

I think you are right. But let's address it in a different PR, because I'm afraid @fyrchik will say "it's unrelated to the PR" or smth.

Hmm, that's interesting. I just copied it from the existing writecache: https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/pkg/local_object_storage/writecache/writecache.go#L166-L173 I think you are right. But let's address it in a different PR, because I'm afraid @fyrchik will say "it's unrelated to the PR" or smth.
aarifullin approved these changes 2023-08-01 11:23:39 +00:00
ale64bit force-pushed feature/421-write-cache-badger from 3b53f8fbb1 to 6ac9730106 2023-08-02 08:51:02 +00:00 Compare
ale64bit force-pushed feature/421-write-cache-badger from 6ac9730106 to 453ff301d1 2023-08-02 08:54:36 +00:00 Compare
fyrchik reviewed 2023-08-02 10:57:06 +00:00
@ -719,2 +714,2 @@
writecache.WithLogger(c.log),
)
switch wcRead.typ {
case writecacheconfig.TypeBBolt:
Owner

What do you think about hiding the implementation behind an interface?
Basically, we provide type and type specific options, the implementation picks option for type.

What do you think about hiding the implementation behind an interface? Basically, we provide type and type specific options, the implementation picks option for type.
Author
Member

I rather don't do that if we are not planning to maintain multiple implementations.

I rather don't do that if we are not planning to maintain multiple implementations.
fyrchik marked this conversation as resolved
@ -38,0 +51,4 @@
return writecacheconfig.TypeBadger
}
panic(fmt.Sprintf("invalid writecache type: %q", t))
Owner

empty string should be equal to bbolt for compatibility

empty string should be equal to `bbolt` for compatibility
Author
Member

done

done
fyrchik marked this conversation as resolved
@ -39,3 +41,3 @@
}
func newCustomShard(t testing.TB, rootPath string, enableWriteCache bool, wcOpts []writecache.Option, bsOpts []blobstor.Option, metaOptions []meta.Option) *shard.Shard {
func newCustomShard(t testing.TB, rootPath string, enableWriteCache bool, wcOpts writecacheconfig.Options, bsOpts []blobstor.Option, metaOptions []meta.Option) *shard.Shard {
Owner

Why not provide a pointer? (so we just use nil in the caller as before.

Why not provide a pointer? (so we just use `nil` in the caller as before.
Author
Member

I prefer the explicit type at the caller site.

I prefer the explicit type at the caller site.
Owner

Ok, but this makes me think in situations where I don't care (enabledWriteCache=false)

Ok, but this makes me _think_ in situations where I don't care (`enabledWriteCache=false`)
Author
Member

so, do you want me change it or ... ?

so, do you want me change it or ... ?
Owner

As you wish, not worth debating, just providing an argument.

As you wish, not worth debating, just providing an argument.
@ -0,0 +80,4 @@
continue
}
// We put objects in batches of fixed size to not interfere with main put cycle a lot.
Owner

This comment may no longer be valid. The problem with bolt is that in-progress View tx can prevent remapping. For badger this could be optimized.

This comment may no longer be valid. The problem with `bolt` is that in-progress `View` tx can prevent remapping. For badger this could be optimized.
Author
Member

done

done
Owner

Ok, we still have the old algorithm. Could you create a task to improve this then? (I believe we can send inside the transaction without intermediate slice).

Ok, we still have the old algorithm. Could you create a task to improve this then? (I believe we _can_ send inside the transaction without intermediate slice).
Author
Member
https://git.frostfs.info/TrueCloudLab/frostfs-node/issues/568
fyrchik marked this conversation as resolved
@ -0,0 +111,4 @@
check(t, mb, bs, objects)
})
t.Run("flush on moving to degraded mode", func(t *testing.T) {
Owner

These tests check logical behaviour which should hold true for any writecache implementation, what about making them generic (sth similar to blobstor/internal/blobstortest)?

These tests check logical behaviour which should hold true for any writecache implementation, what about making them generic (sth similar to `blobstor/internal/blobstortest`)?
Author
Member

done

done
fyrchik marked this conversation as resolved
@ -0,0 +198,4 @@
return obj, data
}
func initWC(t *testing.T, wc writecache.Cache) {
Owner

This is removed in master, can we remove it here too?
Initially it was a separate function because it had auxiliary logic, not it has not.

This is removed in master, can we remove it here too? Initially it was a separate function because it had auxiliary logic, not it has not.
Author
Member

done

done
fyrchik marked this conversation as resolved
@ -0,0 +14,4 @@
go func() {
defer c.wg.Done()
t := time.NewTicker(defaultGCInterval)
Owner

Why not make this a tunable parameter?

Why not make this a tunable parameter?
Author
Member

done

done
fyrchik marked this conversation as resolved
@ -0,0 +22,4 @@
case <-c.closeCh:
return
case <-t.C:
for c.db.RunValueLogGC(0.5) == nil {
Owner

Magic constant, can we move it to default... too?

Magic constant, can we move it to `default...` too?
Author
Member

it's not magic, added a comment.

it's not magic, added a comment.
Owner

Everything is magic until one finally understands :)

Everything is magic until one finally understands :)
fyrchik marked this conversation as resolved
@ -0,0 +18,4 @@
// Returns ErrNotInitialized if write-cache has not been initialized yet.
// Returns ErrOutOfSpace if saving an object leads to WC's size overflow.
// Returns ErrBigObject if an objects exceeds maximum object size.
func (c *cache) Put(ctx context.Context, prm common.PutPrm) (common.PutRes, error) {
Owner

So we no longer store object in the fstree here?

So we no longer store object in the fstree here?
Author
Member

that's right.

that's right.
fyrchik marked this conversation as resolved
@ -0,0 +10,4 @@
func (c *cache) estimateCacheSize() uint64 {
onDiskSize, _ := c.db.EstimateSize(nil)
c.metrics.SetEstimateSize(onDiskSize, 0)
Owner

So we update this before each put?
What about doing this after?

Also, do we need to do this after each GC cycle?

So we update this _before_ each put? What about doing this after? Also, do we need to do this after each GC cycle?
Author
Member

I kept the same behavior from the existing writecache. We can change it in a separate PR for both.
I'm not sure why to do it after GC, since it might not remove anything from disk until the compactor runs anyway, I believe.

I kept the same behavior from the existing writecache. We can change it in a separate PR for both. I'm not sure why to do it after GC, since it might not remove anything from disk until the compactor runs anyway, I believe.
fyrchik marked this conversation as resolved
@ -0,0 +41,4 @@
opts.PrefetchValues = false
it := tx.NewIterator(opts)
defer it.Close()
for it.Rewind(); it.Valid(); it.Next() {
Owner

Wow, this may affect startup time significantly. As I understand, badger doesn't have a key with a number of items?

Wow, this may affect startup time significantly. As I understand, badger doesn't have a key with a number of items?
Author
Member
Not that I know of. https://discuss.dgraph.io/t/count-of-items-in-db/7549
Owner

Ok, I have created #561

Ok, I have created https://git.frostfs.info/TrueCloudLab/frostfs-node/issues/561
fyrchik marked this conversation as resolved
@ -0,0 +6,4 @@
// OpenDB opens a badger instance for write-cache. Opens in read-only mode if ro is true.
func OpenDB(p string, ro bool) (*badger.DB, error) {
return badger.Open(badger.DefaultOptions(p).WithReadOnly(ro).WithLoggingLevel(badger.ERROR))
Owner

What kind of logger does it use? Can we use zap there? (we have many possible customizations, and currently we can make an assumption that zap is the only thing we need to configure).

What kind of logger does it use? Can we use `zap` there? (we have many possible customizations, and currently we can make an assumption that `zap` is the only thing we need to configure).
Author
Member

done

done
fyrchik marked this conversation as resolved
ale64bit force-pushed feature/421-write-cache-badger from 453ff301d1 to 4d88bdd4e5 2023-08-02 15:26:08 +00:00 Compare
fyrchik merged commit 1a0cb0f34a into master 2023-08-07 08:17:02 +00:00
fyrchik added the
badger
label 2023-08-09 10:51:56 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#462
No description provided.