engine: Check object existence concurrently #912

Merged
fyrchik merged 1 commit from dstepanov-yadro/frostfs-node:feat/engine_existance_concurrently into master 2024-01-23 07:16:33 +00:00

Relates #874

Now existence check for storage engine performs concurrently.

Benchmarks:

branch: 

goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkExists/2_shards-8         	   46248	     25421 ns/op	    6172 B/op	     130 allocs/op
BenchmarkExists/4_shards-8         	   34814	     34121 ns/op	    9968 B/op	     212 allocs/op
BenchmarkExists/8_shards-8         	   30361	     38975 ns/op	   12450 B/op	     262 allocs/op
BenchmarkExists/12_shards-8        	   26116	     45483 ns/op	   15443 B/op	     324 allocs/op
BenchmarkExists/16_shards-8        	   24202	     45179 ns/op	   15401 B/op	     323 allocs/op
PASS
ok  	git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine	58.800s

master:

goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkExists/2_shards-8         	   54979	     20443 ns/op	    5456 B/op	     120 allocs/op
BenchmarkExists/4_shards-8         	   28765	     39547 ns/op	   10510 B/op	     230 allocs/op
BenchmarkExists/8_shards-8         	   14846	     78923 ns/op	   20611 B/op	     450 allocs/op
BenchmarkExists/12_shards-8        	   10362	    120944 ns/op	   30667 B/op	     670 allocs/op
BenchmarkExists/16_shards-8        	   10677	    116825 ns/op	   30667 B/op	     670 allocs/op
PASS
ok  	git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine	59.553s
Relates #874 Now existence check for storage engine performs concurrently. Benchmarks: ``` branch: goos: linux goarch: amd64 pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz BenchmarkExists/2_shards-8 46248 25421 ns/op 6172 B/op 130 allocs/op BenchmarkExists/4_shards-8 34814 34121 ns/op 9968 B/op 212 allocs/op BenchmarkExists/8_shards-8 30361 38975 ns/op 12450 B/op 262 allocs/op BenchmarkExists/12_shards-8 26116 45483 ns/op 15443 B/op 324 allocs/op BenchmarkExists/16_shards-8 24202 45179 ns/op 15401 B/op 323 allocs/op PASS ok git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine 58.800s master: goos: linux goarch: amd64 pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz BenchmarkExists/2_shards-8 54979 20443 ns/op 5456 B/op 120 allocs/op BenchmarkExists/4_shards-8 28765 39547 ns/op 10510 B/op 230 allocs/op BenchmarkExists/8_shards-8 14846 78923 ns/op 20611 B/op 450 allocs/op BenchmarkExists/12_shards-8 10362 120944 ns/op 30667 B/op 670 allocs/op BenchmarkExists/16_shards-8 10677 116825 ns/op 30667 B/op 670 allocs/op PASS ok git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/engine 59.553s ```
dstepanov-yadro requested review from storage-core-committers 2024-01-16 09:43:20 +00:00
dstepanov-yadro requested review from storage-core-developers 2024-01-16 09:43:21 +00:00
acid-ant reviewed 2024-01-16 10:24:06 +00:00
@ -47,1 +37,4 @@
}
if client.IsErrObjectAlreadyRemoved(err) {
return err
Member

Why not to perform cancel here too? Looks like the result will be obtained faster in this case.

Why not to perform `cancel` here too? Looks like the result will be obtained faster in this case.
Author
Member

Fixed, thanks

Fixed, thanks
Author
Member

Oh, no. It's ok. If eg.Go returns error, than egCtx will be cancelled by errgroup.

Oh, no. It's ok. If `eg.Go` returns error, than egCtx will be cancelled by errgroup.
dstepanov-yadro force-pushed feat/engine_existance_concurrently from 2d312d8c6a to d73ae9866c 2024-01-16 10:39:12 +00:00 Compare
dstepanov-yadro force-pushed feat/engine_existance_concurrently from d73ae9866c to 075984a07a 2024-01-16 10:40:49 +00:00 Compare
acid-ant approved these changes 2024-01-16 10:42:59 +00:00
acid-ant approved these changes 2024-01-16 10:44:36 +00:00
elebedeva approved these changes 2024-01-16 11:44:17 +00:00
fyrchik changed title from engine: Check object existance concurrently to engine: Check object existence concurrently 2024-01-16 11:48:37 +00:00
Owner

Can we also either perform benchmarks on some actual hardware (server-grade CPU) or attach some k6 results (local.js is enough)?
The change is obviously on the critical path, so we don't want to acidentally add regressions.

Can we also either perform benchmarks on some actual hardware (server-grade CPU) or attach some k6 results (local.js is enough)? The change is obviously on the critical path, so we don't want to acidentally add regressions.
aarifullin approved these changes 2024-01-17 10:25:09 +00:00
dstepanov-yadro force-pushed feat/engine_existance_concurrently from 075984a07a to 4b82c6b35c 2024-01-18 08:23:39 +00:00 Compare
Author
Member

Can we also either perform benchmarks on some actual hardware (server-grade CPU) or attach some k6 results (local.js is enough)?
The change is obviously on the critical path, so we don't want to acidentally add regressions.

Test write 8kb 600th 15min

k6 output:

original:

     data_received..............: 0 B     0 B/s
     data_sent..................: 19 GB   21 MB/s
     frostfs_obj_put_duration...: avg=35.12ms  min=4.77ms   med=36.21ms  max=100.73ms p(90)=43.43ms  p(95)=46.87ms 
     frostfs_obj_put_total......: 2283168 2536.235573/s
     iteration_duration.........: avg=236.52ms min=124.72µs med=237.08ms max=301.43ms p(90)=245.66ms p(95)=251.49ms
     iterations.................: 2283168 2536.235573/s
     vus........................: 600     min=600       max=600

with fix

     data_received..............: 0 B     0 B/s
     data_sent..................: 19 GB   21 MB/s
     frostfs_obj_put_duration...: avg=34.88ms  min=4.58ms  med=35.98ms  max=120.21ms p(90)=42.53ms  p(95)=45.82ms 
     frostfs_obj_put_total......: 2284622 2537.753398/s
     iteration_duration.........: avg=236.36ms min=139.1µs med=236.86ms max=320.93ms p(90)=244.71ms p(95)=251.52ms
     iterations.................: 2284622 2537.753398/s
     vus........................: 600     min=600       max=600

On grafana boards there is no significant changes.

> Can we also either perform benchmarks on some actual hardware (server-grade CPU) or attach some k6 results (local.js is enough)? > The change is obviously on the critical path, so we don't want to acidentally add regressions. Test `write 8kb 600th 15min` k6 output: original: ``` data_received..............: 0 B 0 B/s data_sent..................: 19 GB 21 MB/s frostfs_obj_put_duration...: avg=35.12ms min=4.77ms med=36.21ms max=100.73ms p(90)=43.43ms p(95)=46.87ms frostfs_obj_put_total......: 2283168 2536.235573/s iteration_duration.........: avg=236.52ms min=124.72µs med=237.08ms max=301.43ms p(90)=245.66ms p(95)=251.49ms iterations.................: 2283168 2536.235573/s vus........................: 600 min=600 max=600 ``` with fix ``` data_received..............: 0 B 0 B/s data_sent..................: 19 GB 21 MB/s frostfs_obj_put_duration...: avg=34.88ms min=4.58ms med=35.98ms max=120.21ms p(90)=42.53ms p(95)=45.82ms frostfs_obj_put_total......: 2284622 2537.753398/s iteration_duration.........: avg=236.36ms min=139.1µs med=236.86ms max=320.93ms p(90)=244.71ms p(95)=251.52ms iterations.................: 2284622 2537.753398/s vus........................: 600 min=600 max=600 ``` On grafana boards there is no significant changes.
dstepanov-yadro force-pushed feat/engine_existance_concurrently from 4b82c6b35c to 05838c26f7 2024-01-22 15:52:35 +00:00 Compare
fyrchik approved these changes 2024-01-22 16:04:11 +00:00
@ -48,0 +54,4 @@
}
return nil
}
if exists.CompareAndSwap(false, res.Exists()) {
Owner

How is it different from Store(true)? I would expect Store to have less overhead and more obvious, calling cancel() twice is allowed too. Don't mind using your approach too.

How is it different from `Store(true)`? I would expect `Store` to have less overhead and more obvious, calling `cancel()` twice is allowed too. Don't mind using your approach too.
Author
Member

cancel should be called only if object found, so we have to check res.Exists(). Or I don't understand your point.

`cancel` should be called only if object found, so we have to check `res.Exists()`. Or I don't understand your point.
Owner

I mean if res.Exists() { exists.Store(true); cancel() } instead of CompareAndSwap
In my approach we have estimated number of atomic operations equal to 1.

I mean `if res.Exists() { exists.Store(true); cancel() }` instead of `CompareAndSwap` In my approach we have estimated number of atomic operations equal to 1.
Author
Member

fixed.
Also dropped comment about concurrent exists.

fixed. Also dropped comment about concurrent exists.
@ -67,6 +67,12 @@ func (db *DB) Exists(ctx context.Context, prm ExistsPrm) (res ExistsRes, err err
return res, ErrDegradedMode
}
select {
Owner

Do all these select optimizations affect running speed? Asking because context expiring in Exists() is not likely, it is the first step in put, much faster than the expected operation time.

Do all these `select` optimizations affect running speed? Asking because context expiring in `Exists()` is not likely, it is the first step in put, much faster than the expected operation time.
Author
Member

These select are most required for blobovnicza tree.
I think it is good practice to check ctx.Done() before any IO operation.

These `select` are most required for blobovnicza tree. I think it is good practice to check `ctx.Done()` before any IO operation.
fyrchik marked this conversation as resolved
dstepanov-yadro force-pushed feat/engine_existance_concurrently from 05838c26f7 to f526f49995 2024-01-23 06:29:12 +00:00 Compare
fyrchik approved these changes 2024-01-23 07:16:18 +00:00
fyrchik merged commit f526f49995 into master 2024-01-23 07:16:33 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#912
No description provided.