Fix writecache counters #595

dstepanov-yadro · 2023-08-10T13:17:04Z

dstepanov-yadro commented

2023-08-10 13:17:04 +00:00

Closes #585

There are several problems with writecache counters:

If the big object was putted more than once and first object wasn't flushed before second put, than the counter will be always more than one.
Small object flush collector can collect objects more than once, so the counter can be negative.

This problems (especially second one) lead to invalid cache size estimation, so write cache can be logically overflown.

This PR fixes this problems.

FSTree benchmarks to check counter logic effect

FSTree master bench:

goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/blobstor
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkSubstorageReadPerf/fstree-seq100-8         	  384123	      2823 ns/op	    3552 B/op	      45 allocs/op
BenchmarkSubstorageReadPerf/fstree-rand100-8        	  400687	      2940 ns/op	    3552 B/op	      45 allocs/op

BenchmarkSubstorageWritePerf/fstree-rand10-8         	    1333	    887592 ns/op	    5157 B/op	      74 allocs/op
BenchmarkSubstorageWritePerf/fstree-rand100-8        	    1219	    946909 ns/op	    5360 B/op	      74 allocs/op
BenchmarkSubstorageWritePerf/fstree-rand1000-8       	    1347	    861522 ns/op	    7219 B/op	      74 allocs/op
BenchmarkSubstorageWritePerf/fstree-overwrite10-8    	    1340	    856788 ns/op	    4665 B/op	      68 allocs/op
BenchmarkSubstorageWritePerf/fstree-overwrite100-8   	    1401	    923462 ns/op	    4833 B/op	      68 allocs/op
BenchmarkSubstorageWritePerf/fstree-overwrite1000-8  	    1299	    922424 ns/op	    6712 B/op	      68 allocs/op

FSTree current bench:

goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/blobstor
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkSubstorageReadPerf/fstree_without_object_counter-seq100-8         	  387762	      4366 ns/op	    3648 B/op	      45 allocs/op
BenchmarkSubstorageReadPerf/fstree_with_object_counter-seq100-8            	  448082	      2808 ns/op	    3600 B/op	      45 allocs/op
BenchmarkSubstorageReadPerf/fstree_without_object_counter-rand100-8        	  405159	      2937 ns/op	    3648 B/op	      45 allocs/op
BenchmarkSubstorageReadPerf/fstree_with_object_counter-rand100-8           	  380102	      2977 ns/op	    3630 B/op	      45 allocs/op

BenchmarkSubstorageWritePerf/fstree_without_object_counter-rand10-8         	    1366	    900213 ns/op	    5604 B/op	      76 allocs/op
BenchmarkSubstorageWritePerf/fstree_with_object_counter-rand10-8            	    1266	    894496 ns/op	    6044 B/op	      80 allocs/op
BenchmarkSubstorageWritePerf/fstree_without_object_counter-rand100-8        	    1332	    881587 ns/op	    5774 B/op	      76 allocs/op
BenchmarkSubstorageWritePerf/fstree_with_object_counter-rand100-8           	    1255	    925093 ns/op	    6237 B/op	      80 allocs/op
BenchmarkSubstorageWritePerf/fstree_without_object_counter-rand1000-8       	    1297	   1016633 ns/op	    7651 B/op	      76 allocs/op
BenchmarkSubstorageWritePerf/fstree_with_object_counter-rand1000-8          	    1368	    864864 ns/op	    8090 B/op	      80 allocs/op
BenchmarkSubstorageWritePerf/fstree_without_object_counter-overwrite10-8    	    1249	    901532 ns/op	    4928 B/op	      70 allocs/op
BenchmarkSubstorageWritePerf/fstree_with_object_counter-overwrite10-8       	    1364	    876430 ns/op	    5323 B/op	      73 allocs/op
BenchmarkSubstorageWritePerf/fstree_without_object_counter-overwrite100-8   	    1342	    958480 ns/op	    5101 B/op	      70 allocs/op
BenchmarkSubstorageWritePerf/fstree_with_object_counter-overwrite100-8      	    1351	    898750 ns/op	    5500 B/op	      73 allocs/op
BenchmarkSubstorageWritePerf/fstree_without_object_counter-overwrite1000-8  	    1362	    853337 ns/op	    6966 B/op	      70 allocs/op
BenchmarkSubstorageWritePerf/fstree_with_object_counter-overwrite1000-8     	    1318	    934879 ns/op	    7373 B/op	      73 allocs/op

Closes #585 There are several problems with writecache counters: 1. If the big object was putted more than once and first object wasn't flushed before second put, than the counter will be always more than one. 2. Small object flush collector can collect objects more than once, so the counter can be negative. This problems (especially second one) lead to invalid cache size estimation, so write cache can be logically overflown. This PR fixes this problems. **FSTree benchmarks to check counter logic effect** FSTree master bench: ``` goos: linux goarch: amd64 pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/blobstor cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz BenchmarkSubstorageReadPerf/fstree-seq100-8 384123 2823 ns/op 3552 B/op 45 allocs/op BenchmarkSubstorageReadPerf/fstree-rand100-8 400687 2940 ns/op 3552 B/op 45 allocs/op BenchmarkSubstorageWritePerf/fstree-rand10-8 1333 887592 ns/op 5157 B/op 74 allocs/op BenchmarkSubstorageWritePerf/fstree-rand100-8 1219 946909 ns/op 5360 B/op 74 allocs/op BenchmarkSubstorageWritePerf/fstree-rand1000-8 1347 861522 ns/op 7219 B/op 74 allocs/op BenchmarkSubstorageWritePerf/fstree-overwrite10-8 1340 856788 ns/op 4665 B/op 68 allocs/op BenchmarkSubstorageWritePerf/fstree-overwrite100-8 1401 923462 ns/op 4833 B/op 68 allocs/op BenchmarkSubstorageWritePerf/fstree-overwrite1000-8 1299 922424 ns/op 6712 B/op 68 allocs/op ``` FSTree current bench: ``` goos: linux goarch: amd64 pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/blobstor cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz BenchmarkSubstorageReadPerf/fstree_without_object_counter-seq100-8 387762 4366 ns/op 3648 B/op 45 allocs/op BenchmarkSubstorageReadPerf/fstree_with_object_counter-seq100-8 448082 2808 ns/op 3600 B/op 45 allocs/op BenchmarkSubstorageReadPerf/fstree_without_object_counter-rand100-8 405159 2937 ns/op 3648 B/op 45 allocs/op BenchmarkSubstorageReadPerf/fstree_with_object_counter-rand100-8 380102 2977 ns/op 3630 B/op 45 allocs/op BenchmarkSubstorageWritePerf/fstree_without_object_counter-rand10-8 1366 900213 ns/op 5604 B/op 76 allocs/op BenchmarkSubstorageWritePerf/fstree_with_object_counter-rand10-8 1266 894496 ns/op 6044 B/op 80 allocs/op BenchmarkSubstorageWritePerf/fstree_without_object_counter-rand100-8 1332 881587 ns/op 5774 B/op 76 allocs/op BenchmarkSubstorageWritePerf/fstree_with_object_counter-rand100-8 1255 925093 ns/op 6237 B/op 80 allocs/op BenchmarkSubstorageWritePerf/fstree_without_object_counter-rand1000-8 1297 1016633 ns/op 7651 B/op 76 allocs/op BenchmarkSubstorageWritePerf/fstree_with_object_counter-rand1000-8 1368 864864 ns/op 8090 B/op 80 allocs/op BenchmarkSubstorageWritePerf/fstree_without_object_counter-overwrite10-8 1249 901532 ns/op 4928 B/op 70 allocs/op BenchmarkSubstorageWritePerf/fstree_with_object_counter-overwrite10-8 1364 876430 ns/op 5323 B/op 73 allocs/op BenchmarkSubstorageWritePerf/fstree_without_object_counter-overwrite100-8 1342 958480 ns/op 5101 B/op 70 allocs/op BenchmarkSubstorageWritePerf/fstree_with_object_counter-overwrite100-8 1351 898750 ns/op 5500 B/op 73 allocs/op BenchmarkSubstorageWritePerf/fstree_without_object_counter-overwrite1000-8 1362 853337 ns/op 6966 B/op 70 allocs/op BenchmarkSubstorageWritePerf/fstree_with_object_counter-overwrite1000-8 1318 934879 ns/op 7373 B/op 73 allocs/op ```

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 9334ef5593 to e0acd24642

2023-08-10 14:58:32 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from e0acd24642 to 52d9e99115

2023-08-10 15:45:57 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 83a2e8a58a to 44d42a6756

2023-08-11 06:56:12 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 44d42a6756 to 87c28d8bd8

2023-08-11 07:07:49 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 87c28d8bd8 to 8382f4b785

2023-08-11 07:16:33 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 8382f4b785 to e3de76c494

2023-08-11 07:49:36 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from e3de76c494 to a01676f9f1

2023-08-11 08:33:20 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 5a08d5c137 to bf9f4d9318

2023-08-11 09:11:58 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from bf9f4d9318 to 019a1ac0cb

2023-08-11 09:13:14 +00:00

Compare

dstepanov-yadro reviewed 2023-08-11 09:14:22 +00:00

dstepanov-yadro reviewed 2023-08-11 09:14:28 +00:00

dstepanov-yadro reviewed 2023-08-11 09:14:31 +00:00

pkg/local_object_storage/blobstor/fstree/fstree.go Outdated

					
				@ -397,4 +432,2 @@

				}

				// PutStream puts executes handler on a file opened for write.

				func (t *FSTree) PutStream(addr oid.Address, handler func(*os.File) error) error {

dstepanov-yadro commented

2023-08-11 09:14:31 +00:00

Unused

dstepanov-yadro reviewed 2023-08-11 09:15:31 +00:00

pkg/local_object_storage/writecache/writecachebbolt/put.go Outdated

					
				@ -74,6 +74,7 @@ func (c *cache) Put(ctx context.Context, prm common.PutPrm) (common.PutRes, erro

					if err == nil {

						added = true

					}

					c.estimateCacheSize()

dstepanov-yadro commented

2023-08-11 09:15:31 +00:00

To update metrics after object put

dstepanov-yadro reviewed 2023-08-11 09:16:44 +00:00

pkg/local_object_storage/writecache/writecachebbolt/storage.go Outdated

					
				@ -58,3 +60,4 @@

					if err := c.fsTree.Open(readOnly); err != nil {

						return fmt.Errorf("could not open FSTree: %w", err)

					}

					if err := c.fsTree.Init(); err != nil {

dstepanov-yadro commented

2023-08-11 09:16:44 +00:00

To init object counter

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 019a1ac0cb to fcbce1becb

2023-08-11 09:24:52 +00:00

Compare

dstepanov-yadro reviewed 2023-08-11 09:26:07 +00:00

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from fcbce1becb to ad4a066ead

2023-08-11 09:27:36 +00:00

Compare

dstepanov-yadro reviewed 2023-08-11 09:28:20 +00:00

pkg/local_object_storage/writecache/writecachebbolt/storage.go Outdated

					
				@ -67,2 +74,3 @@

						b := tx.Bucket(defaultBucket)

						return b.Delete([]byte(key))

						key := []byte(key)

						recordDeleted = !recordDeleted && b.Get(key) != nil

dstepanov-yadro commented

2023-08-11 09:28:20 +00:00

batch func can be called more than once, so check previous value is required

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from ad4a066ead to 3d775d5434

2023-08-11 09:31:25 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 3d775d5434 to 41837d2a10

2023-08-11 09:53:00 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 41837d2a10 to 81609b8259

2023-08-11 12:21:06 +00:00

Compare

dstepanov-yadro changed title from ~~WIP: Fix writecache counters~~ to Fix writecache counters

2023-08-11 12:21:20 +00:00

requested reviews from storage-core-committers, storage-core-developers

2023-08-11 12:21:29 +00:00

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 81609b8259 to d76a78eb5d

2023-08-14 06:53:02 +00:00

Compare

aarifullin reviewed 2023-08-14 09:07:27 +00:00

pkg/local_object_storage/blobstor/fstree/fstree.go Outdated

					
				@ -371,0 +397,4 @@

						t.counter.Inc()

						var targetFileExists bool

						if _, e := os.Stat(p); e == nil {

							targetFileExists = true

aarifullin commented

2023-08-14 09:07:27 +00:00

It seems you can get rid of the flag like that:

if _, e := os.Stat(p); e == nil {
   if err = os.Rename(tmpPath, p); err == nil {
      t.counter.Dec() //target already exists, tmp file removed, so no counter changes
   }
}

:)

It seems you can get rid of the flag like that: ``` if _, e := os.Stat(p); e == nil { if err = os.Rename(tmpPath, p); err == nil { t.counter.Dec() //target already exists, tmp file removed, so no counter changes } } ``` :)

dstepanov-yadro commented

2023-08-14 10:01:37 +00:00

No, we have to os.Rename the file regardless of whether it exists.

fyrchik commented

2023-08-14 10:18:17 +00:00

We still have a race condition with Stat in normal scenario (and we dont if this is done under a mutex).

We still have a race condition with `Stat` in normal scenario (and we dont if this is done under a mutex).

dstepanov-yadro commented

2023-08-14 10:53:43 +00:00

No, this is done under a mutex:

func (t *FSTree) writeAndRename(tmpPath, p string, data []byte) error {
	if t.objCounterEnabled {
		t.fileGuard.Lock(p)
		defer t.fileGuard.Unlock(p)
	}

No, this is done under a mutex: ``` func (t *FSTree) writeAndRename(tmpPath, p string, data []byte) error { if t.objCounterEnabled { t.fileGuard.Lock(p) defer t.fileGuard.Unlock(p) } ```

aarifullin reviewed 2023-08-14 09:14:30 +00:00

pkg/local_object_storage/writecache/writecachebbolt/put.go Outdated

					
				@ -97,2 +100,3 @@

						)

						c.objCounters.IncDB()

						if recordAdded {

							c.objCounters.cDB.Add(1)

aarifullin commented

2023-08-14 09:14:30 +00:00

c.objCounters.cDB.Add(1) -> c.objCounters.IncDB() ?

`c.objCounters.cDB.Add(1)` -> `c.objCounters.IncDB()` ?

dstepanov-yadro commented

2023-08-14 09:55:43 +00:00

c.objCounters.IncDB() was removed, as there is no need to have this method, all fields are available.

`c.objCounters.IncDB()` was removed, as there is no need to have this method, all fields are available.

👍 1

aarifullin reviewed 2023-08-14 09:18:35 +00:00

pkg/local_object_storage/writecache/writecachebbolt/state.go Outdated

					
				@ -15,1 +13,3 @@

					return db + fstree

					dbCount := c.objCounters.DB()

					fsCount := c.objCounters.FS()

					if fsCount > 0 {

aarifullin commented

2023-08-14 09:18:35 +00:00

I suppose dbCount and fsCount may be negative (despite they are uint64). Don't you need to consider the such case?

I mean fsCount can be MAX_UINT64_SIZE - n meaning it's negative but still the check works out and decremented

I suppose `dbCount` and `fsCount` may be negative (despite they are `uint64`). Don't you need to consider the such case? I mean `fsCount` can be `MAX_UINT64_SIZE - n` meaning it's negative but still the check works out and decremented

dstepanov-yadro commented

2023-08-14 10:00:16 +00:00

Yes, they may be negative. But what should we do?

Panic or error? Looks too much.

Use zero instead of negative value? Then we won't know for sure that we have a bug.

So negative values are good point to create a bug, but they shouldn't break the program flow.

Yes, they may be negative. But what should we do? Panic or error? Looks too much. Use zero instead of negative value? Then we won't know for sure that we have a bug. So negative values are good point to create a bug, but they shouldn't break the program flow.

fyrchik reviewed 2023-08-14 09:23:19 +00:00

pkg/local_object_storage/blobstor/fstree/fstree.go Outdated

					
				@ -364,10 +394,21 @@ func (t *FSTree) writeAndRename(tmpPath, p string, data []byte) error {

							case syscall.ENOSPC:

								err = common.ErrNoSpace

								_ = os.RemoveAll(tmpPath)

							case syscall.EEXIST:

fyrchik commented

2023-08-11 07:07:27 +00:00

Why is it removed?

👍 1

dstepanov-yadro commented

2023-08-14 09:43:20 +00:00

It looks like this case statement was used for retries. Retries were deleted, so this case is redundant.

It looks like this `case` statement was used for retries. Retries were deleted, so this `case` is redundant.

fyrchik marked this conversation as resolved

pkg/local_object_storage/writecache/writecachebbolt/put.go Outdated

					
				@ -89,2 +90,3 @@

						b := tx.Bucket(defaultBucket)

						return b.Put([]byte(obj.addr), obj.data)

						key := []byte(obj.addr)

						recordAdded = !recordAdded && b.Get(key) == nil

fyrchik commented

2023-08-14 07:15:30 +00:00

Why, though? We should always check the actual value, because we cannot assume anything else.

Why, though? We should _always_ check the actual value, because we cannot assume anything else.

dstepanov-yadro commented

2023-08-14 09:38:24 +00:00

See boltdb.Batch comment:

// 2. the function passed to Batch may be called multiple times,
// regardless of whether it returns error or not.

I understand this as function will be called at least once. If key was inserted by first call, then it will be already already in db on second call.

See `boltdb.Batch` comment: ``` // 2. the function passed to Batch may be called multiple times, // regardless of whether it returns error or not. ``` I understand this as ` function will be called at least once`. If `key` was inserted by first call, then it will be already already in db on second call.

fyrchik commented

2023-08-14 11:46:27 +00:00

No, this violates transaction isolation: transaction either applies (nil error) or not (non-nil error, and can be retried in Batch in this case). So even if we applied everything on our first execution, I doubt the changes are persisted.

fyrchik commented

2023-08-14 11:47:24 +00:00

If what I wrote is false, we have much bigger problems (consider metabase and all its checks, for example)

fyrchik commented

2023-08-14 11:50:13 +00:00

Oh, I missed the regardless part.

Oh, I missed the `regardless` part.

fyrchik commented

2023-08-14 11:54:49 +00:00

But no, still, we are talking about the data in the database, the function can be executed twice but under no circumstances should we apply a transaction twice.
Here is a call to batch trigger: https://github.com/etcd-io/bbolt/blob/master/db.go#L983
If any transaction in the batch returns false, all previous functions can be reexecuted with nil error, but not reapplied (we still return err from Update().

But no, still, we are talking about the data in the database, the function can be executed twice but under no circumstances should we apply a transaction twice. Here is a call to batch trigger: https://github.com/etcd-io/bbolt/blob/master/db.go#L983 If any transaction in the batch returns false, all previous functions can be reexecuted with nil error, but not reapplied (we still return err from Update().

fyrchik commented

2023-08-14 13:28:44 +00:00

Also, how about avoiding PUT in this case?
It should reduce the amount of pages we flush during commit (I am not sure whether its worth the effort, though, the situation is quite rare).

Also, how about avoiding PUT in this case? It should reduce the amount of pages we flush during commit (I am not sure whether its worth the effort, though, the situation is quite rare).

dstepanov-yadro commented

2023-08-14 13:48:21 +00:00

Ok, you're right. Fixed.

fyrchik marked this conversation as resolved

acid-ant approved these changes 2023-08-14 11:51:31 +00:00

pkg/local_object_storage/blobstor/fstree/fstree.go Outdated

					
				@ -351,3 +370,2 @@

					err = fmt.Errorf("couldn't read file after %d retries", retryCount)

					// unreachable, but precaution never hurts, especially 1 day before release.

					tmpPath := p + "#" + uuid.NewString()

acid-ant commented

2023-08-14 08:11:25 +00:00

Looks like comment above no more relevant.

dstepanov-yadro commented

2023-08-14 13:49:13 +00:00

Hm, which one? Comment was deleted.

acid-ant commented

2023-08-14 14:08:31 +00:00

...
	// So here is a solution:
	// 1. Write a file to 'name + 1'.
	// 2. If it exists, retry with temporary name being 'name + 2'.
	// 3. Set some reasonable number of attempts.
...

``` ... // So here is a solution: // 1. Write a file to 'name + 1'. // 2. If it exists, retry with temporary name being 'name + 2'. // 3. Set some reasonable number of attempts. ... ```

dstepanov-yadro commented

2023-08-14 15:04:41 +00:00

Ah. Removed this cool story completely. Thx.

fyrchik marked this conversation as resolved

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from d76a78eb5d to cf1e5fd529

2023-08-14 13:46:55 +00:00

Compare

aarifullin approved these changes 2023-08-14 15:00:01 +00:00

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from cf1e5fd529 to 193835b50b

2023-08-14 15:04:16 +00:00

Compare

fyrchik reviewed 2023-08-15 06:34:09 +00:00

pkg/local_object_storage/blobstor/fstree/counter.go Outdated

					
				@ -0,0 +17,4 @@

				func (c *noopCounter) Inc()       {}

				func (c *noopCounter) Dec()       {}

				type SimpleCounter struct {

fyrchik commented

2023-08-15 06:30:54 +00:00

s/SimpleCounter/AtomicCounter/?

`s/SimpleCounter/AtomicCounter/`?

dstepanov-yadro commented

2023-08-15 07:05:36 +00:00

No, atomic is implementation detail

No, `atomic` is implementation detail

fyrchik commented

2023-08-15 07:30:14 +00:00

Agree, can we mention in the interface comment that the counter implementation must be thread-safe?

fyrchik marked this conversation as resolved

pkg/local_object_storage/blobstor/fstree/fstree.go Outdated

					
				@ -528,7 +534,6 @@ func (t *FSTree) NumberOfObjects() (uint64, error) {

					if err != nil {

						return 0, fmt.Errorf("could not walk through %s directory: %w", t.RootPath, err)

					}

fyrchik commented

2023-08-15 06:32:50 +00:00

?

dstepanov-yadro commented

2023-08-15 07:06:32 +00:00

Is there some mistake here?

fyrchik commented

2023-08-15 07:19:19 +00:00

Unrelated style changes mixed with everything else lead to bigger diffs, which make it harder to review.
It is small, true, but I see it in almost every review -- It is not hard to avoid this.

Unrelated style changes mixed with everything else lead to bigger diffs, which make it harder to review. It is small, true, but I see it in almost every review -- It is not hard to avoid this.

fyrchik marked this conversation as resolved

pkg/local_object_storage/blobstor/fstree/fstree.go Outdated

					
				@ -351,3 +348,1 @@

					err = fmt.Errorf("couldn't read file after %d retries", retryCount)

					// unreachable, but precaution never hurts, especially 1 day before release.

					tmpPath := p + "#" + uuid.NewString()

fyrchik commented

2023-08-15 06:18:50 +00:00

// NewString creates a new random UUID and returns it as a string or panics.

We should not panic in fstree.

>// NewString creates a new random UUID and returns it as a string or panics. We should not panic in fstree.

dstepanov-yadro commented

2023-08-15 08:04:28 +00:00

uuid replaced with atomic counter.

fyrchik marked this conversation as resolved

pkg/local_object_storage/blobstor/fstree/fstree.go Outdated

					
				@ -371,0 +378,4 @@

						}

						err = os.Rename(tmpPath, p)

						if err == nil && targetFileExists {

							t.counter.Dec() //target already exists, tmp file removed, so no counter changes

fyrchik commented

2023-08-15 06:20:55 +00:00

Why do we Inc then Dec instead of just not doing Inc in the first place?
It is also nice to increase the counter after the file has been written.

Why do we `Inc` then `Dec` instead of just not doing `Inc` in the first place? It is also nice to increase the counter _after_ the file has been written.

dstepanov-yadro commented

2023-08-15 07:11:39 +00:00

Why do we Inc then Dec instead of just not doing Inc in the first place?

You've answered your question: It is also nice to increase the counter after the file has been written.

	if t.objCounterEnabled {
		t.counter.Inc() // here we have temp file written, so we increase counter
...
		if err == nil && targetFileExists {
			t.counter.Dec() //here we have temp file deleted, so we decrease counter
    }

> Why do we Inc then Dec instead of just not doing Inc in the first place? You've answered your question: `It is also nice to increase the counter after the file has been written.` ``` if t.objCounterEnabled { t.counter.Inc() // here we have temp file written, so we increase counter ... if err == nil && targetFileExists { t.counter.Dec() //here we have temp file deleted, so we decrease counter } ```

fyrchik commented

2023-08-15 07:15:28 +00:00

I mean when the temp file is written the object does not yet exists, it start to exist after rename, right?

dstepanov-yadro commented

2023-08-15 07:18:17 +00:00

But temp file exists, right?

fyrchik commented

2023-08-15 07:20:17 +00:00

Will Iterate see it?

Will `Iterate` see it?

dstepanov-yadro commented

2023-08-15 07:22:26 +00:00

Yes

fyrchik commented

2023-08-15 07:28:36 +00:00

As I understand, for Delete it is ok, because the lock is already taken (not obvious, but works).
Iterate does not use any locks currently, and I don't see anything changed in this PR.
Am I missing something?

As I understand, for `Delete` it is ok, because the lock is already taken (not obvious, but works). `Iterate` does not use any locks currently, and I don't see anything changed in this PR. Am I missing something?

dstepanov-yadro commented

2023-08-15 07:58:24 +00:00

renamed to file counter.

fyrchik marked this conversation as resolved

pkg/local_object_storage/writecache/writecachebbolt/state.go Outdated

					
				@ -72,4 +74,1 @@

						inFS-- //small.bolt DB file

					}

					c.objCounters.cDB.Store(inDB)

fyrchik commented

2023-08-15 06:26:30 +00:00

Why do we update counters for db and fstree in different places?

dstepanov-yadro commented

2023-08-15 07:17:49 +00:00

db counter relates to db. fstree counter relates to fstree. But both of counters we update in cache.Open method.

db counter relates to db. fstree counter relates to fstree. But both of counters _we_ update in `cache.Open` method.

fyrchik commented

2023-08-15 07:25:04 +00:00

Ok, I don't mind, but to me it adds cognitive complexity, so at some point sth similar to #610 will happen.

Ok, I don't mind, but to me it adds cognitive complexity, so at some point sth similar to https://git.frostfs.info/TrueCloudLab/frostfs-node/issues/610 will happen.

fyrchik marked this conversation as resolved

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 193835b50b to 6f0a717af1

2023-08-15 07:40:57 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 6f0a717af1 to dec6e34ab2

2023-08-15 07:47:51 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from dec6e34ab2 to 5c85aa6cfa

2023-08-15 07:56:35 +00:00

Compare

dstepanov-yadro force-pushed fix/writecache_bbolt_db_counter from 5c85aa6cfa to 2efe9cc1be

2023-08-16 11:48:06 +00:00