Use FSTree only for writecache #1367
What has changed:
- flush and drop bbolt db step on init
- `small_object_size` and `max_object_size`

Hardware test results:
Test scenario: 5 min put, 100 containers, REP 2
@ -29,3 +31,2 @@
func NewSimpleCounter() *SimpleCounter {
return &SimpleCounter{}
func NewSimpleCounter(assert bool) *SimpleCounter {
The argument is unused.
What is the goal of having this assert? Just panic unconditionally.
Fixed
@ -225,0 +228,4 @@
}
type IterateInfoHandler func(ObjectInfo) error
func (t *FSTree) IterateInfo(ctx context.Context, handler IterateInfoHandler) error {
What do we add this method for? It seems we achieve nothing: we are limiting the amount of memory we consume by some criteria; it doesn't matter at which point we read the object.

This implementation is stricter: writecache doesn't read the object before the limiter allows it. Using `fstree.Iterate` is also possible, but the amount and complexity of the new code is small, so I chose the stricter implementation. Also, there is now more than one flush worker for fstree, so using `fstree.Iterate` from many goroutines would require some kind of synchronization to avoid reading the same objects.

Hardly disagree: we have a whole new method to support, and this is not the only place where this PR adds small complexity. I think the strictness doesn't matter here: the object is small and we do not perfectly control the amount of memory anyway (consider multiple slices for decompressed data and allocations during unmarshaling).

If there is more than one flush worker that uses `fstree.Iterate`, the same file may be read more than once and flushed to blobstore more than once, which costs additional memory and CPU.

If multiple flush workers iterate over the same directory, that is really bad design, unless they distribute the work somehow, in which case they do not have the problem you described.
@ -128,0 +124,4 @@
w.fileGuard.Lock(p)
defer w.fileGuard.Unlock(p)
stat, err := os.Stat(p)
Honestly, I do not like the idea of using a separate syscall just to be able to track size. Precise estimation is obviously a new feature with non-zero cost, do we need it?

The bbolt + fstree implementation estimates the writecache size as `dbCount * small_object_size + fstreeCount * max_object_size`. Such an estimate is not very accurate. If the same estimate were applied to the current implementation, the cache size limit would depend only on the object count, not on the actual size of the data.

In the writecache flush thread you actually read the file before removing it, so you know the size exactly. Do we need to extend fstree?

Without extending fstree (adding the file locker) we would have to move the file locker into writecache and add an `fstree.Exists` check on `writecache.Put` to handle saving the same object twice. `fstree.Exists` on `writecache.Put` adds an `os.Stat` call on a direct client request. So I prefer to have the extra syscall for background flushing, but not in the Put request handler. Also, writecache now passes the size to `fstree.Delete` so it does not have to call `os.Stat`.
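For illustration, a toy calculation of how far the count-based estimate can drift from the real size; the configuration values and object sizes below are made up:

```go
package main

import "fmt"

func main() {
	// Assumed configuration values, for illustration only.
	const (
		smallObjectSize = 128 * 1024       // small_object_size: 128 KiB
		maxObjectSize   = 64 * 1024 * 1024 // max_object_size: 64 MiB
	)
	var dbCount, fstreeCount uint64 = 0, 1000
	actualSize := fstreeCount * 4 * 1024 // say the cached objects are ~4 KiB each

	// Count-based estimate used by the bbolt+fstree writecache.
	estimate := dbCount*smallObjectSize + fstreeCount*maxObjectSize

	fmt.Printf("actual: %d B, estimate: %d B (%dx overestimate)\n",
		actualSize, estimate, estimate/actualSize)
}
```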
@ -51,6 +61,10 @@ func (w *linuxWriter) writeData(p string, data []byte) error {
}
func (w *linuxWriter) writeFile(p string, data []byte) error {
if w.fileCounterEnabled {
We didn't have this before, was it a bug?

No, because there was no `unix.Stat` call there.

Oh, sorry. I missed that the point is about `writeFile`.

I tested the object counter with a unit test, and without using the file guard for `writeFile` I got a panic about an invalid count.

Locking certainly fixes the bug, but why does it happen? We can only `Unlink` once with zero error, what is the problem?

If the object is deleted before `counter.Inc()` is called, the size goes negative.

The same situation could happen with `Inc()`/`Dec()`, no? We just add a different constant here.

Yes. It is just that before, an atomic was used and writecache had handling for negative values.
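A minimal sketch of the invariant being discussed, assuming the counter is updated right after the file operation. The names (`fileCounter`, `pathLocker`) are illustrative, and a single mutex stands in for the per-path file guard:

```go
package example

import (
	"os"
	"sync"
)

// fileCounter is a stand-in for the writecache size counter (assumed API).
type fileCounter struct {
	mtx  sync.Mutex
	size uint64
}

func (c *fileCounter) Inc(n uint64) { c.mtx.Lock(); c.size += n; c.mtx.Unlock() }
func (c *fileCounter) Dec(n uint64) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	if c.size < n {
		panic("file counter underflow") // broken invariant instead of a silently negative size
	}
	c.size -= n
}

// pathLocker stands in for the per-path file guard.
var pathLocker sync.Mutex

// Holding the guard around "write file + Inc" and "remove file + Dec" keeps
// each pair atomic with respect to the other, so Dec can never run before the
// matching Inc and the counter never goes negative.
func writeFile(c *fileCounter, p string, data []byte) error {
	pathLocker.Lock()
	defer pathLocker.Unlock()
	if err := os.WriteFile(p, data, 0o644); err != nil {
		return err
	}
	c.Inc(uint64(len(data)))
	return nil
}

func removeFile(c *fileCounter, p string, size uint64) error {
	pathLocker.Lock()
	defer pathLocker.Unlock()
	if err := os.Remove(p); err != nil {
		return err
	}
	c.Dec(size)
	return nil
}
```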
@ -52,2 +62,4 @@
func (w *linuxWriter) writeFile(p string, data []byte) error {
if w.fileCounterEnabled {
w.fileGuard.Lock(p)
The whole point of `linuxWriter` was to avoid locking and use native Linux facilities to ensure an atomic write. This purpose is now defeated: we have the generic writer, which locks and does the same thing.

As far as I remember, the main idea of `linuxWriter` was to automatically delete temporary files in case of failure. Anyway, locking doesn't affect it too much:

Unrelated to the commit.

Please retry the comment: I don't see it now.
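For context, the usual Linux facility behind both properties (atomic visibility and automatic cleanup of temporary files) is `O_TMPFILE` plus `linkat`. A minimal sketch, assuming that is roughly what `linuxWriter` relies on; this is not the PR's code:

```go
package example

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// writeAtomically writes data into an anonymous temp file inside dir and only
// then links it at path p. If anything fails before Linkat, no partial file is
// ever visible and the kernel reclaims the temp file when fd is closed.
func writeAtomically(dir, p string, data []byte, perm uint32) error {
	fd, err := unix.Open(dir, unix.O_TMPFILE|unix.O_WRONLY, perm)
	if err != nil {
		return err
	}
	defer unix.Close(fd)

	if _, err := unix.Write(fd, data); err != nil {
		return err
	}
	// Give the anonymous file a name; AT_SYMLINK_FOLLOW makes linkat resolve
	// the /proc/self/fd symlink to the underlying open file.
	return unix.Linkat(unix.AT_FDCWD, fmt.Sprintf("/proc/self/fd/%d", fd),
		unix.AT_FDCWD, p, unix.AT_SYMLINK_FOLLOW)
}
```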
@ -0,0 +132,4 @@
defer c.modeMtx.Unlock()
var err error
if c.fsTree != nil {
fsTree was here before, why was there no closing logic?

I don't know. `fstree.Close` just closes metrics, which are not defined for writecache. I decided to add the `Close` call anyway.

@ -197,9 +153,6 @@ func (c *cache) flushFSTree(ctx context.Context, ignoreErrors bool) error {
err = c.flushObject(ctx, &obj, e.ObjectData, StorageTypeFSTree)
if err != nil {
if ignoreErrors {
Why?

Good question! The bbolt + fstree writecache has different error-handling logic when flushing small and large objects:

if err := c.flushObject(ctx, &obj, item.data, StorageTypeDB); err != nil {
if err != nil {

There is also a unit test that checks that flush returns an error in case of a failed flush:

require.Error(t, wc.SetMode(mode.Degraded))

So I changed the flush implementation for FSTree to pass the unit test.

Ok, it seems the new implementation is correct: `flush` leads to the same result.

@ -172,0 +110,4 @@
Address: objInfo.addr,
})
if err != nil {
if !client.IsErrObjectNotFound(err) {
This looks like a kludge, so this function is more like `flushIfAnObjectExistsWorker`. Why do we delete in this thread and not in the one where we list?

I decided to split a `read` worker, which reads object addresses, from the `flush` workers, which read object data, save it to blobstore and delete the object. This is required to have more than one flush worker for fstree. Fixed.

I don't see how this is required, could you elaborate? It doesn't matter in which worker we read the file.

If there are 5 workers that read the same FSTree with the current implementation of `fstree.Iterate`, they could read the same file at the same time, so this results in read overhead.
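To make the split concrete, a sketch of flush workers draining a channel fed by a single listing goroutine (see the `listForFlush` sketch above). The names are illustrative; the point is that the channel hands each address to exactly one worker, so no file is read twice:

```go
package example

import (
	"context"
	"sync"
)

type objectInfo struct { // assumed shape: address plus size
	Address  string
	DataSize uint64
}

// runFlushWorkers starts n workers that drain flushCh. Because addresses are
// listed once by a single reader and each item is delivered to exactly one
// worker, no two workers ever read the same file, unlike n goroutines each
// running fstree.Iterate over the same directory tree.
func runFlushWorkers(ctx context.Context, n int, flushCh <-chan objectInfo,
	flushOne func(ctx context.Context, oi objectInfo) error) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for oi := range flushCh {
				if err := flushOne(ctx, oi); err != nil {
					continue // error handling elided in this sketch
				}
			}
		}()
	}
	wg.Wait()
}
```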
@ -305,0 +238,4 @@
address string
}
func (c *cache) flushAndDropBBoltDB(ctx context.Context) error {

Could you put all `flushAndDrop*` (i.e. something we would remove in the future) in a separate file?

Done
@ -321,3 +269,1 @@
continue
}
return err
return fmt.Errorf("unmarshal object from database: %w", err)
Why was `if ignoreErrors` dropped? Why was `reportFlushError` dropped?

This method is now used to flush and drop the `small.bolt` database file from `writecache.Init()`, so these flags are not defined for `Init`. I think we should not ignore errors, since we want to drop the whole file. Also, reporting the flush error is useless, as the shard will not be initialized in case of an error.

@ -0,0 +5,4 @@
"sync"
)
type flushLimiter struct {
Please provide a comment about what this struct does and what it is used for.
Done
@ -0,0 +18,4 @@
}
}
func (l *flushLimiter) acquire(ctx context.Context, size uint64) error {
Do we need to accept a context here? IMO it just makes it a lot harder to understand and use properly. All stdlib `sync` facilities can be modelled as state machines; the context adds a whole new level of complexity because it can expire at any place in the code, concurrently with the main logic. Signalling can be done with a variable: it is some condition you could wait for.

I think we need to use `ctx` here to stop flushing as soon as possible. As far as I remember, we already had a bug where the service could not be stopped within the timeout because it was flushing objects from writecache.

I agree we need to stop flushing by context. My question was about `ctx` being part of the `flushLimiter` interface vs being used in the `flush` itself.

Ok, I moved this out of the limiter. I also moved the limiter out of the writecache fields.
@ -0,0 +19,4 @@
}
func (l *flushLimiter) acquire(ctx context.Context, size uint64) error {
stopf := context.AfterFunc(ctx, func() {
Will we leak a goroutine if we call this with `context.Background()`? See https://pkg.go.dev/context#AfterFunc

If `ctx` is `context.Background()`, then `func()...` will not be called. Am I missing something?

It seems that even if the parent context (`context.Background()`) is never cancelled, `stopf` cancels the inherited registration and thus the goroutine stops.
@ -0,0 +49,4 @@
if l.size >= size {
l.size -= size
} else {
l.size = 0
How about a panic here? We seem to hide some broken invariant by silently failing here.
Done
@ -0,0 +32,4 @@
// it is allowed to overflow maxSize to allow flushing objects with size > maxSize
for l.count > 0 && l.size+size > l.maxSize {
l.cond.Wait()
This is an awesome approach to managing acquire/release across goroutines 👍
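A minimal self-contained sketch of a limiter along these lines. The field and method names follow the diff hunks quoted above, but the details are assumptions, not the exact PR code:

```go
package example

import "sync"

// flushLimiter bounds the total size of objects that flush workers may hold
// in memory at once. acquire blocks until the requested size fits under
// maxSize; release returns the capacity and wakes the waiters.
type flushLimiter struct {
	count   uint64 // objects currently acquired
	size    uint64 // total acquired size in bytes
	maxSize uint64
	cond    *sync.Cond
}

func newFlushLimiter(maxSize uint64) *flushLimiter {
	return &flushLimiter{maxSize: maxSize, cond: sync.NewCond(&sync.Mutex{})}
}

func (l *flushLimiter) acquire(size uint64) {
	l.cond.L.Lock()
	defer l.cond.L.Unlock()
	// It is allowed to overflow maxSize so that a single object larger than
	// maxSize can still be flushed (when nothing else is acquired).
	for l.count > 0 && l.size+size > l.maxSize {
		l.cond.Wait()
	}
	l.count++
	l.size += size
}

func (l *flushLimiter) release(size uint64) {
	l.cond.L.Lock()
	defer l.cond.L.Unlock()
	if l.size < size || l.count == 0 {
		panic("flushLimiter: release without matching acquire") // broken invariant
	}
	l.size -= size
	l.count--
	l.cond.Broadcast()
}
```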
@ -164,0 +69,4 @@
}:
return nil
case <-ctx.Done():
c.flushLimiter.release(oi.DataSize)
Added, missed this condition.
@ -0,0 +19,4 @@
defer l.release(single)
defer currSize.Add(-1)
l.acquire(single)
require.True(t, currSize.Add(1) <= 3)
I believe some of the `t` methods are not thread safe (remember the data races related to the `zaptest.New` logger), could you ensure `require.True` can be safely called here?

`require.True` calls `t.Fail()`, which is protected by a mutex.

@ -225,0 +278,4 @@
continue
}
path := filepath.Join(curPath...)
info, err := os.Stat(path)
`entries[i].Info()`?

fixed
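For context, a small sketch of taking file sizes from the `fs.DirEntry` values returned by `os.ReadDir` rather than calling `os.Stat` on a re-joined path; the function is illustrative only:

```go
package example

import (
	"os"
	"path/filepath"
)

// sizesIn collects file sizes using DirEntry.Info() on each listed entry
// instead of a separate os.Stat(path) call on a re-joined path.
func sizesIn(dir string) (map[string]int64, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	sizes := make(map[string]int64, len(entries))
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		info, err := e.Info() // reuse the directory entry returned by ReadDir
		if err != nil {
			return nil, err
		}
		sizes[filepath.Join(dir, e.Name())] = info.Size()
	}
	return sizes, nil
}
```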
@ -119,4 +114,3 @@
} else {
err = os.Remove(p)
}
unrelated
fixed