[#561] writecache: Make badger keys counting async #613
Reference: TrueCloudLab/frostfs-node#613
No description provided.
Signed-off-by: Airat Arifullin a.arifullin@yadro.com
Title changed from "[#561] writecache: Make badger keys counting async" to "WIP: [#561] writecache: Make badger keys counting async"
Force-pushed from 9ab9c29769 to d858677aff
Force-pushed from d858677aff to 5d76695294
Title changed from "WIP: [#561] writecache: Make badger keys counting async" to "[#561] writecache: Make badger keys counting async"
Force-pushed from 5d76695294 to d16417386e
@ -293,2 +293,4 @@
ShardCouldNotMarkObjectToDeleteInMetabase = "could not mark object to delete in metabase" // Debug in ../node/pkg/local_object_storage/shard/inhume.go
WritecacheBadgerInitExperimental = "initializing badger-backed experimental writecache"
WritecacheBadgerKeysCountByIteration = "failed to count keys from badger by iteration"
WritecacheBadgerKeysCountFlush = "failed to flush key_count to badger"
Do not use variable names from the code in the log messages. Otherwise, only the developer will be able to understand the log.
Fixed, please check it out.
I don't see it.
@ -128,0 +139,4 @@
v := make([]byte, 8)
cDB, _ := c.objCounters.DB()
binary.LittleEndian.PutUint64(v, cDB)
if err := c.putKeyValue(k[:], v); err != nil {
If writecache is full, then this fails. I don't think it is correct behaviour.
You're right, but writing the `keys_count` down to badger is advisory rather than mandatory. This advisory write makes the next `Open` invocation faster. So, do you have any suggestions for how to handle this case?
I think `key_count` could just be saved to the DB, without the full-check.
@ -83,0 +89,4 @@
return writecache.ErrOutOfSpace
}
wb := c.db.NewWriteBatch()
Since this method is used for flush, batch looks redundant.
To be honest, this definition has been taken from above.
@ale64bit Can you tell, please, whether the batch is really redundant here?
There's not much point in having a `WriteBatch` for a single update; `Update` can be used instead. This was introduced by me, but it doesn't really match what bbolt does internally via the `Batch` call. Feel free to correct it.
@ -11,2 +15,4 @@
var ErrInitializing = errors.New("db counter has been not initialized yet")
func (c *cache) estimateCacheSize() uint64 {
onDiskSize, _ := c.db.EstimateSize(nil)
The file size on disk is still used here, but bbolt uses the number of objects. But if we use a counter of objects, it is not clear how asynchronous counting will help us.
You're right, we don't need to use `DB()` if we want a rough estimate of the DB data size. But we cannot use `EstimateSize` if we need the exact number of keys in the DB. So we won't be able to count the exact number using `EstimateSize`, and a rough approximation of the key count may mislead if we use it for metrics or other purposes. cc @fyrchik @ale64bit
But such a rough approximation with size rounding is used for bbolt.
For example, imagine that you are a user and you see that writecache is not being used. You open the metrics and see that the limit has not been reached. Or vice versa.
Probably we are talking about different things. Let me clarify: `DB()` gives the exact number of keys currently saved in badger. We cannot use `EstimateSize()` to calculate the number of keys: `keys = EstimateSize() / EstimateKeySize()` is incorrect.
@ -55,0 +75,4 @@
if err = Delete(c.db, k[:]); err != nil {
c.log.Error(logs.WritecacheBadgerKeysCountDelete, zap.Error(err))
}
return
Here the function returns, but metrics and objCounter are not updated.
Right
But is it correct? Please explain, I probably don't understand the logic.
"Right" - I meant "you are right, I forgot to calculate metrics and objCounter" :)
I fixed this: I introduced a `deleteKey` method that deletes a raw-represented key. It handles trace spanning and calculates metrics.
Force-pushed from d16417386e to ee27bae294
@ -71,0 +73,4 @@
// deleteKey removes object from write-cache by raw-represented key.
//
// Returns an error of type apistatus.ObjectNotFound if object is missing in write-cache.
func (c *cache) deleteKey(ctx context.Context, key []byte) error {
This is duplicated from `Delete`. I am wondering how this definition can be generalized.
@ -82,1 +82,4 @@
}
// putKeyValue persists raw-represent key and value to the write-cache
// database and pushes the to the flush workers queue.
pushes the to the flush -> pushes them to the flush?
Force-pushed from ee27bae294 to 75ea68b246
Force-pushed from 75ea68b246 to 8302cf8f3c
Force-pushed from 8302cf8f3c to cc71a44eba
Force-pushed from cc71a44eba to a7de302632
Force-pushed from a7de302632 to bf45a3da36
Force-pushed from bf45a3da36 to a1fa15c535
Force-pushed from a1fa15c535 to 07383f823a
Force-pushed from 07383f823a to f45ebc0199
Force-pushed from f45ebc0199 to ea7af01d6f
Force-pushed from ea7af01d6f to ef2e8e1aea
Force-pushed from ef2e8e1aea to ad8c095391
@ -103,2 +127,4 @@
// Close closes db connection and stops services. Executes ObjectCounters.FlushAndClose op.
func (c *cache) Close() error {
c.initCounterWG.Wait()
Do we have a way to cancel this process?
I have introduced a wait method that is cancelled by a timeout.
So now we wait for some timeout, but it seems we still don't cancel anything.
The key counting process is completely auxiliary, and if we have it, this means we didn't have the key count on startup, thus we do not lose anything by cancelling it.
It seems wrong to block for any amount of time here.
Okay, it seems I misunderstood what you meant. I was confused, but now I've got your idea. I will implement cancellation for the counting.
@fyrchik Please check out how I have implemented soft cancellation.
And we still need `c.initCounterWG.Wait()` to correctly finish the goroutine. You can check the explanation for `cancelInitCountBySetModeAndWait`.
@ -96,0 +108,4 @@
return value, metaerr.Wrap(err)
}
func (c *cache) getRawKeyInternal(key []byte) ([]byte, error) {
Why are `getRawKey` and `getRawKeyInternal` two different functions?
@ -23,3 +41,4 @@
}
func (x *counters) IncDB() {
x.txWG.Wait()
So new operations will hang until counting is finished?
Until the db transaction starts :)
Removed this waitGroup and replaced it with an `isMutable` check.
P.S. I am going to resolve the conversation to make the comments readable.
@ -32,3 +52,3 @@
func (x *counters) DB() uint64 {
return x.cDB.Load()
_, ok := <-x.keysCountDone
Again, probably not a good idea: it is easy to miss that "give me the object number" can be blocked by some uncancellable process iterated through the whole database.
Hence the `DB` invocation cannot be blocked (I've removed `keysCountDone`), but it returns 0 if the value has not been initialized yet. I think returning 0 is the best option, better than returning an `ErrNotInitYet` error.
P.S. I am going to resolve the conversation to make the comments readable.
@ -55,0 +74,4 @@
if !c.options.recountKeys {
k := keyCountPrefix()
cDB, err := c.getRawKey(context.Background(), k[:])
Do we want auxiliary info retrieval affecting datapath metrics?
Sorry, what is the focus of this comment at this line? You don't like the idea that metrics are counted within `getRawKey`?
I mean that `getRawKey` contains metrics and traces, but what we get here is not user data we store.
I can agree about metrics, but I would like to offer a compromise option: make `getRawKey` trace a `writecache.getRawKey` event but not calculate user metrics. WDYT?
Seems ok, but why bother with traces? We have `context.Background()`, so it will be just a small trace unrelated to anything.
Alright. Initially I misunderstood how `StartSpanFromContext` works, and now I get the point that it does not make sense to pass a context until `initCounter <- Open ....` receives a context. So the refactoring is needed first. I will create an issue, and `getRawKey` and `deleteRawKey` no longer have a `ctx` parameter (for a while).
P.S. I am going to resolve the conversation to make the comments readable.
@ -42,6 +42,13 @@ func addr2key(addr oid.Address) internalKey {
return key
}
func keyCountPrefix() internalKey {
InternalKey is a fixed-length key denoting an address (so iteration is possibly affected). It is a valid address; can we explicitly make it a different size and type?
Good point. I'll fix it
I don't see it being changed.
Sorry, I forgot about the comment. It seems you accidentally resolved the conversation.
Force-pushed from ad8c095391 to 9d8583b946
Force-pushed from 9d8583b946 to aa87a3ea63
Force-pushed from aa87a3ea63 to 1f1302fd52
Force-pushed from 1f1302fd52 to 7b66d52ded
@ -21,0 +33,4 @@
init bool
// initMtx is mutex for init flag.
initMtx sync.RWMutex
We have a mutex for a single primitive variable, why not just use atomics? (a question, not a suggestion)
A mutex could give a stronger guarantee, but I don't see it being held for the whole operation duration (currently we have a race condition, but possibly do not exploit it).
Force-pushed from 7b66d52ded to 621170148e
Force-pushed from 621170148e to 7923b2fe0e
@ -24,0 +24,4 @@
const metaInformationKeyLength = 20
// metaKey is a type that defines a key for helpful purposes.
type metaKey [metaInformationKeyLength]byte
@fyrchik Here is a different type and length for the `keys_count` key. It seems it doesn't break anything (I hope so).
Now why 20? Isn't it just `len("keys_count") == 10`?
I mean we can have a global `var keyKey = []byte("keys_count")` without any functions (the name can be changed).
@ -29,3 +32,3 @@
store
}
var _ writecache.Cache = (*cache)(nil)
How is this related to the PR?
Removed
@ -104,1 +126,4 @@
func (c *cache) Close() error {
c.cancelInitCountByCloseAndWait()
c.flushKeysCount()
Is it correct to flush keys if we canceled counter iteration?
If you look at the `flushKeysCount()` definition, you can see that if objCounters is not ready to be flushed, nothing will happen. I can move these checks out of `flushKeysCount`.
@ -96,0 +100,4 @@
func (c *cache) getRawKey(key []byte) ([]byte, error) {
value, err := Get(c.db, key)
if err != nil {
return nil, metaerr.Wrap(new(apistatus.ObjectNotFound))
Is it always `NotFound`?
Sorry, you're right. The raw key has nothing in common with `ObjectNotFound`.
@ -55,0 +175,4 @@
// Recount keys if "key_count" has not been found
if err := c.db.View(func(tx *badger.Txn) error {
c.objCounters.cDB.setAsMutable()
I think it still may be invalid: the snapshot for `View` is fixed before the function executes, and we "set as mutable" inside the function, so some objects may possibly arrive and will not be in the counters. Am I right?
Say the snapshot is fixed at moment `T1`, and we have `N` objects at `T1`. If an object arrives at moment `T2`, it'll try to increase the object counter `objCounters`, and we have `M` such objects while counting is going on over the snapshot. Within `[T1, Tk]` we can inc/dec `objCounters`. Iterators are valid within `View`, so the result is `N + M` - a good result.
So, is something wrong?
Force-pushed from 7923b2fe0e to a537d0a62f
Blocked by #688
Badger is dropped in #887
Pull request closed