Create a list of possible badger tweaks/restrictions/problems #555

Closed
opened 2023-08-02 07:26:38 +00:00 by fyrchik · 5 comments
  1. We would like to have a small but flexible configuration.
  2. We would like to have some guide on how to set the parameters properly (and how one parameter influences another).
  3. Possible problems we can encounter (we already have a GC one) and how to avoid them.

This must be done taking the whole system into account (some problems are OK because we have sharding, we have a limited number of in-flight put requests etc.).

fyrchik added the frostfs-node label 2023-08-02 07:26:47 +00:00
ale64bit was assigned by fyrchik 2023-08-02 07:26:52 +00:00
Collaborator

So far the only ones I'm aware of:

  1. GC: normally the way to avoid it is to have multiple shards to distribute the cost of compacting.
  2. Limitations on the max size of a single db (~1TB I think): there are some bugs related to this, and the normal fix is to increase the max level or max level multiplier, so as long as we can control that or tweak it by config it should be alright. See https://discuss.dgraph.io/t/badger-panics-after-db-reaches-1-1t/16092/1.
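
The ~1TB figure in item 2 can be sanity-checked with a rough model of how an LSM tree sizes its levels. This is only a sketch: it assumes level 1 holds `baseLevelSize` bytes, each further level holds `multiplier` times more, and L0 is not counted. The function name and the sample values (10 MiB base level, multiplier 10, 7 levels) are assumptions for illustration and should be checked against the badger version in use.

```go
package main

import "fmt"

// approxLSMCapacity sums the target sizes of levels 1..maxLevels-1.
// Simplified model (an assumption, not badger's exact accounting):
// level 1 holds baseLevelSize bytes and every further level holds
// multiplier times more than the previous one. L0 is ignored.
func approxLSMCapacity(baseLevelSize, multiplier int64, maxLevels int) int64 {
	var total int64
	size := baseLevelSize
	for lvl := 1; lvl < maxLevels; lvl++ {
		total += size
		size *= multiplier
	}
	return total
}

func main() {
	// Illustrative values only: 10 MiB base level, multiplier 10, 7 levels.
	c := approxLSMCapacity(10<<20, 10, 7)
	fmt.Printf("approx. capacity: %d bytes (%.2f TiB)\n", c, float64(c)/float64(int64(1)<<40))
}
```

Under this model, raising either the level count or the multiplier is what moves the ceiling, which is why both should stay reachable from the config.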
Poster
Owner

> Limitations on the max size of a single db (~1TB I think)

So we need to check `writecache.capacity` and tweak parameters based on its value?

Collaborator

> > Limitations on the max size of a single db (~1TB I think)
>
> So we need to check `writecache.capacity` and tweak parameters based on its value?

Yes. Although even if we can compute the max size, we shouldn't create such a large cache capacity (~1TB is the approximate max size with the default parameters), since it wouldn't work that well anyway.
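
One hedged way to derive a level setting from `writecache.capacity` is to ask how many levels a given capacity needs. The sketch below uses a simplified model in which level 1 holds `baseLevelSize` bytes and each further level holds `multiplier` times more. `minLevelsFor` and its parameters are hypothetical names, not part of badger or frostfs-node, and the sample values are assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// minLevelsFor returns the smallest level count (L0 included) whose
// approximate capacity is at least `capacity` bytes.
// It returns 0 for invalid parameters or when the capacity cannot be
// represented without overflowing int64.
func minLevelsFor(capacity, baseLevelSize, multiplier int64) int {
	if baseLevelSize <= 0 || multiplier < 2 {
		return 0
	}
	levels := 1 // L0 only; this model counts it as holding no data
	var total int64
	size := baseLevelSize
	for total < capacity {
		if size > math.MaxInt64-total {
			return 0 // adding another level would overflow
		}
		total += size
		levels++
		if size > math.MaxInt64/multiplier {
			size = math.MaxInt64 // saturate instead of overflowing
		} else {
			size *= multiplier
		}
	}
	return levels
}

func main() {
	// Illustrative values only: 1 TiB capacity, 10 MiB base level, multiplier 10.
	fmt.Println("levels needed:", minLevelsFor(1<<40, 10<<20, 10))
}
```

A check like this could run at config validation time and reject a capacity that the chosen level parameters cannot hold, rather than discovering the limit at runtime.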

fyrchik added the badger label 2023-08-09 10:51:30 +00:00
fyrchik added this to the v0.38.0 milestone 2023-08-23 10:41:58 +00:00
dstepanov-yadro self-assigned this 2023-11-29 15:42:21 +00:00
ale64bit was unassigned by dstepanov-yadro 2023-11-29 15:42:21 +00:00

PR #833 (https://git.frostfs.info/TrueCloudLab/frostfs-node/pulls/833) contains configuration with some explanations.

Main config options:

SyncWrites - False by default, but this may lead to data loss, so it needs to be set to True.

IndexCacheSize - stores bloom filters; by default all indices are kept in memory, but it's better to limit it to some value.

NumMemtables - max number of MemTables in memory before writes stall, so it is a possible cause of stalling.

NumLevelZeroTablesStall - another possible cause of stalling.

ValueThreshold and VLogPercentile - these parameters determine where a value is stored: in the LSM tree or in the vLog.

NumCompactors - number of compaction workers.

See https://github.com/dgraph-io/badger/blob/v20.07.0/options.go for the full config list.
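
The recommendations above can be turned into a small startup check. The sketch below uses a hypothetical `tuning` struct that only borrows the option names; it is not badger's `Options` type. The rule that values smaller than ValueThreshold stay in the LSM tree is an assumption about badger's behaviour and should be verified against the version in use.

```go
package main

import "fmt"

// tuning is a hypothetical stand-in that mirrors a few badger option
// names from the list above. It is not badger's Options type.
type tuning struct {
	SyncWrites     bool
	IndexCacheSize int64 // bytes; 0 is treated here as "no limit"
	ValueThreshold int64 // bytes
}

// warnings returns a note for every setting the list above flags as risky.
func (t tuning) warnings() []string {
	var w []string
	if !t.SyncWrites {
		w = append(w, "SyncWrites is false: data may be lost")
	}
	if t.IndexCacheSize <= 0 {
		w = append(w, "IndexCacheSize has no limit: all indices stay in memory")
	}
	return w
}

// storedInLSM reports whether a value of the given size would stay in the
// LSM tree, assuming values below ValueThreshold are kept there and all
// other values go to the vLog.
func (t tuning) storedInLSM(valueSize int64) bool {
	return valueSize < t.ValueThreshold
}

func main() {
	// Sample values are illustrative, not recommended settings.
	t := tuning{SyncWrites: false, IndexCacheSize: 0, ValueThreshold: 1 << 20}
	for _, w := range t.warnings() {
		fmt.Println("warning:", w)
	}
	fmt.Println("4 KiB value stays in LSM tree:", t.storedInLSM(4<<10))
}
```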

Reference: TrueCloudLab/frostfs-node#555