Shard OPS limiter #1636

Merged
dstepanov-yadro merged 12 commits from dstepanov-yadro/frostfs-node:feat/shard_io_limiter into master 2025-03-03 10:52:34 +00:00

A limiter is used to limit the rate of shard operations.
If resources are exceeded, a special error is returned.

A limiter is used to limit the rate of shard operations. If resources are exceeded, a special error is returned.
dstepanov-yadro added 1 commit 2025-02-05 07:51:35 +00:00
[#9999] config: Add shard.limits config
Some checks failed
DCO action / DCO (pull_request) Successful in 43s
Vulncheck / Vulncheck (pull_request) Successful in 50s
Pre-commit hooks / Pre-commit (pull_request) Failing after 1m18s
Build / Build Components (pull_request) Successful in 1m26s
Tests and linters / Lint (pull_request) Failing after 1m28s
Tests and linters / Run gofumpt (pull_request) Successful in 1m33s
Tests and linters / Staticcheck (pull_request) Successful in 1m57s
Tests and linters / gopls check (pull_request) Failing after 2m6s
Tests and linters / Tests (pull_request) Successful in 3m13s
Tests and linters / Tests with -race (pull_request) Successful in 3m52s
6686448d86
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
dstepanov-yadro force-pushed feat/shard_io_limiter from 6686448d86 to 619a115f66 2025-02-05 08:02:50 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 619a115f66 to a3e271cc23 2025-02-05 08:13:23 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from a3e271cc23 to 77f67f837d 2025-02-05 08:16:44 +00:00 Compare
dstepanov-yadro added 1 commit 2025-02-05 09:31:27 +00:00
[#9999] qos: Add Limiter
Some checks failed
DCO action / DCO (pull_request) Successful in 43s
Vulncheck / Vulncheck (pull_request) Successful in 56s
Build / Build Components (pull_request) Failing after 1m9s
Tests and linters / Tests (pull_request) Failing after 1m4s
Tests and linters / Staticcheck (pull_request) Failing after 1m8s
Tests and linters / Lint (pull_request) Failing after 1m16s
Pre-commit hooks / Pre-commit (pull_request) Successful in 1m33s
Tests and linters / Run gofumpt (pull_request) Successful in 2m29s
Tests and linters / gopls check (pull_request) Successful in 3m6s
Tests and linters / Tests with -race (pull_request) Failing after 3m15s
44218bf4f6
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
dstepanov-yadro force-pushed feat/shard_io_limiter from 44218bf4f6 to bb4241472d 2025-02-05 09:37:10 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from bb4241472d to ac30e6f324 2025-02-05 12:58:09 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from ac30e6f324 to f383ab1fe6 2025-02-05 13:27:58 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from f383ab1fe6 to c82056f963 2025-02-06 14:27:11 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from c82056f963 to da05997a6d 2025-02-07 14:26:26 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from da05997a6d to 3f7ef8ee01 2025-02-10 11:02:57 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 3f7ef8ee01 to 519cf71a27 2025-02-10 13:25:16 +00:00 Compare
dstepanov-yadro added 1 commit 2025-02-10 14:57:02 +00:00
[#9999] object: Fix IO tag adjustment for Put/Patch
Some checks failed
DCO action / DCO (pull_request) Successful in 44s
Vulncheck / Vulncheck (pull_request) Successful in 1m1s
Build / Build Components (pull_request) Successful in 1m33s
Pre-commit hooks / Pre-commit (pull_request) Failing after 1m34s
Tests and linters / gopls check (pull_request) Successful in 2m46s
Tests and linters / Tests with -race (pull_request) Successful in 3m9s
Tests and linters / Run gofumpt (pull_request) Successful in 4m27s
Tests and linters / Lint (pull_request) Successful in 4m45s
Tests and linters / Staticcheck (pull_request) Successful in 4m47s
Tests and linters / Tests (pull_request) Successful in 4m56s
6da33cd724
There was no tag adjustment for CloseAndRecv.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
dstepanov-yadro force-pushed feat/shard_io_limiter from 6da33cd724 to 9553f19670 2025-02-12 13:46:31 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 9553f19670 to 8b7517cb6d 2025-02-12 13:51:50 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 8b7517cb6d to 5453747d80 2025-02-13 08:35:51 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 5453747d80 to 5e60db6705 2025-02-14 07:06:52 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 5e60db6705 to 083299742f 2025-02-14 07:11:03 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 083299742f to ab5b48e28b 2025-02-14 07:18:31 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from ab5b48e28b to 49c384ef9d 2025-02-14 07:30:48 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 49c384ef9d to f22519cef7 2025-02-14 07:37:01 +00:00 Compare
dstepanov-yadro changed title from WIP: Shard OPS limiter to Shard OPS limiter 2025-02-14 07:56:23 +00:00
requested reviews from storage-core-committers, storage-core-developers 2025-02-14 07:56:24 +00:00
a-savchuk reviewed 2025-02-18 13:08:30 +00:00
@ -125,6 +126,14 @@ func (x *Config) GC() *gcconfig.Config {
)
}
// Limits returns "limits" subsection as a gcconfig.Config.
Member

gcconfig?

gcconfig?
Author
Member

fixed

fixed
Author
Member

fixed

fixed
a-savchuk marked this conversation as resolved
@ -139,7 +139,10 @@
"skip_session_token_issuer_verification": true
},
"get": {
"priority": ["$attribute:ClusterName", "$attribute:UN-LOCODE"]
Member

Please revert it

Please revert it
Author
Member

fixed

fixed
a-savchuk marked this conversation as resolved
@ -303,1 +304,4 @@
### `limits` subsection
### `limits` subsection
Member

You defined this section twice

You defined this section twice
Author
Member

fixed

fixed
Author
Member

fixed

fixed
a-savchuk marked this conversation as resolved
@ -0,0 +57,4 @@
return &semaphore{limit: int64(config.MaxRunningOps)}, nil
}
return scheduling.NewMClock(
uint64(config.MaxRunningOps), uint64(config.MaxWaitingOps),
Member

You do a cast from int64 to int when reading a config, then you cast int to uint64 here. What about making config.MaxRunningOps and config.MaxWaitingOps of type uint64?

You do a cast from `int64` to `int` when reading a config, then you cast `int` to `uint64` here. What about making `config.MaxRunningOps` and `config.MaxWaitingOps` of type `uint64`?
Author
Member

Fixed: now config variables have int64 type.

Fixed: now config variables have int64 type.
Author
Member

Fixed: now config variables have int64 type.

Fixed: now config variables have int64 type.
a-savchuk marked this conversation as resolved
dstepanov-yadro force-pushed feat/shard_io_limiter from f22519cef7 to 865c919bda 2025-02-19 11:48:50 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 865c919bda to a9d777da11 2025-02-21 07:26:11 +00:00 Compare
a-savchuk reviewed 2025-02-21 13:28:33 +00:00
@ -372,0 +381,4 @@
return err
}
if newConfig.limiter != nil {
newConfig.limiter.Close()
Member

You close a limiter of the new config is closed, not the old one. Is it right?

You close a limiter of the _new_ config is closed, not the _old_ one. Is it right?
Author
Member

It is right. Performed refactoring: renamed newConfig -> target, oldConfig -> source.

It is right. Performed refactoring: renamed `newConfig` -> `target`, `oldConfig` -> `source`.
a-savchuk marked this conversation as resolved
@ -0,0 +40,4 @@
}
readScheduler, err := createScheduler(c.Read())
if err != nil {
return nil, fmt.Errorf("failed to create read scheduler: %w", err)
Member

Should be without "failed to"

Should be without "failed to"
Author
Member

ok, fixed

ok, fixed
a-savchuk marked this conversation as resolved
@ -0,0 +78,4 @@
return fmt.Errorf("invalid limit_ops for tag %s: must be positive value", t.String())
}
if float64Value(v.Reserved) < 0 || math.IsNaN(float64Value(v.Reserved)) {
return fmt.Errorf("invalid limit_ops for tag %s: must be positive value", t.String())
Member

limit_ops -> reserved_ops

`limit_ops` -> `reserved_ops`
Author
Member

fixed

fixed
a-savchuk marked this conversation as resolved
dstepanov-yadro force-pushed feat/shard_io_limiter from a9d777da11 to cef12da660 2025-02-21 13:41:23 +00:00 Compare
dstepanov-yadro added 1 commit 2025-02-24 13:07:26 +00:00
[#1636] shard: Change ops limiter on shard reload
Some checks failed
DCO action / DCO (pull_request) Successful in 43s
Tests and linters / Tests (pull_request) Failing after 58s
Vulncheck / Vulncheck (pull_request) Successful in 56s
Pre-commit hooks / Pre-commit (pull_request) Successful in 1m24s
Build / Build Components (pull_request) Successful in 1m28s
Tests and linters / Tests with -race (pull_request) Failing after 2m6s
Tests and linters / Run gofumpt (pull_request) Successful in 2m21s
Tests and linters / Lint (pull_request) Successful in 2m49s
Tests and linters / Staticcheck (pull_request) Successful in 2m44s
Tests and linters / gopls check (pull_request) Successful in 3m41s
3e319e6789
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
dstepanov-yadro force-pushed feat/shard_io_limiter from 3e319e6789 to 1ef6b5c4cd 2025-02-24 13:39:40 +00:00 Compare
fyrchik reviewed 2025-02-26 11:21:39 +00:00
@ -0,0 +71,4 @@
MaxRunningOps int64
// IdleTimeout returns the value of "idle_timeout" config parameter.
//
// Returns DefaultIdleTimeout if the value is not a valid duration.
Owner

Struct field doesn't return anything. Sth about default value here?

Struct field doesn't return anything. Sth about default value here?
Author
Member

Fixed

Fixed
fyrchik marked this conversation as resolved
fyrchik reviewed 2025-02-26 11:24:38 +00:00
@ -0,0 +2,4 @@
import "strings"
func Cond(cond bool, details ...string) {
Owner

Don't mind the naming, but do you have any examples of it being used somewhere?

assert.True is what I have seen in e.g. testify library.

Don't mind the naming, but do you have any examples of it being used somewhere? `assert.True` is what I have seen in e.g. testify library.
Author
Member

Fixed

Fixed
fyrchik marked this conversation as resolved
fyrchik reviewed 2025-02-26 11:25:26 +00:00
@ -0,0 +2,4 @@
import "strings"
func Cond(cond bool, details ...string) {
Owner

Btw, good call with introducing this function!

Btw, good call with introducing this function!
fyrchik marked this conversation as resolved
fyrchik reviewed 2025-02-26 11:32:46 +00:00
@ -0,0 +27,4 @@
default:
}
v := s.count.Add(1)
Owner

I have seen similar code in frostfs-qos/limiting/semaphore by @a-savchuk , have you not reused it deliberately?

I have seen similar code in `frostfs-qos/limiting/semaphore` by @a-savchuk , have you not reused it deliberately?
Author
Member

I just didn't wait for merge.
Fixed.

I just didn't wait for merge. Fixed.
fyrchik marked this conversation as resolved
@ -195,0 +175,4 @@
type rebuildLimiter struct {
concurrencyLimiter common.ConcurrencyLimiter
rateLimiter qos.Limiter
Owner

I think rebuild test (and maybe all tests, acutally) should check that limiter is in some ground state after tests.
Current limiter API makes it possible to forget calling release.
WIth tests we can ensure that it is called at least in the code we have tested.

I think rebuild test (and maybe all tests, acutally) should check that limiter is in some ground state after tests. Current limiter API makes it possible to forget calling release. WIth tests we can ensure that it is called at least in the code we have tested.
Owner

Hm, may be we can do this in storageEngine.Close()?
It is already called in Cleanup() and after all wg.Wait() I would expect limiter to be empty.
The problem is that there we have an interface, casting looks nice only in tests.

Hm, may be we can do this in `storageEngine.Close()`? It is already called in `Cleanup()` and after all `wg.Wait()` I would expect limiter to be empty. The problem is that there we have an interface, casting looks nice only in tests.
Author
Member

Added validation to rebuild tests.

Current limiter API makes it possible to forget calling release.

I hope that there will be no such cases, because the compiler should highlight this as an unused variable.

Added validation to rebuild tests. > Current limiter API makes it possible to forget calling release. I hope that there will be no such cases, because the compiler should highlight this as an unused variable.
Author
Member

Also added validation to engine tests.

Also added validation to engine tests.
Owner

the compiler should highlight this as an unused variable

Hm, good point

>the compiler should highlight this as an unused variable Hm, good point
fyrchik marked this conversation as resolved
dstepanov-yadro force-pushed feat/shard_io_limiter from 1ef6b5c4cd to 9b91325160 2025-02-28 06:28:54 +00:00 Compare
dstepanov-yadro added 1 commit 2025-02-28 07:05:23 +00:00
[#1636] blobovniczatree: Validate limiter release in rebuild unit tests
Some checks failed
DCO action / DCO (pull_request) Successful in 38s
Vulncheck / Vulncheck (pull_request) Successful in 1m5s
Pre-commit hooks / Pre-commit (pull_request) Failing after 1m29s
Build / Build Components (pull_request) Successful in 1m32s
Tests and linters / Run gofumpt (pull_request) Successful in 1m48s
Tests and linters / Staticcheck (pull_request) Successful in 2m1s
Tests and linters / Tests (pull_request) Successful in 2m32s
Tests and linters / gopls check (pull_request) Successful in 2m32s
Tests and linters / Lint (pull_request) Successful in 3m0s
Tests and linters / Tests with -race (pull_request) Successful in 4m25s
fa48aabdf3
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
dstepanov-yadro force-pushed feat/shard_io_limiter from fa48aabdf3 to 5740fe3418 2025-02-28 07:28:53 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 5740fe3418 to 5e6764b51e 2025-02-28 07:30:34 +00:00 Compare
dstepanov-yadro force-pushed feat/shard_io_limiter from 5e6764b51e to 303490c34e 2025-02-28 11:38:42 +00:00 Compare
fyrchik approved these changes 2025-02-28 14:17:21 +00:00
Dismissed
dstepanov-yadro force-pushed feat/shard_io_limiter from 303490c34e to 4685afb1dc 2025-02-28 14:25:58 +00:00 Compare
dstepanov-yadro dismissed fyrchik's review 2025-02-28 14:25:58 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

requested reviews from storage-core-committers, storage-core-developers 2025-02-28 14:33:58 +00:00
fyrchik approved these changes 2025-03-03 05:35:51 +00:00
fyrchik added this to the v0.45.0 milestone 2025-03-03 05:36:04 +00:00
aarifullin approved these changes 2025-03-03 10:11:14 +00:00
aarifullin left a comment
Member

awesome

awesome
dstepanov-yadro merged commit 4685afb1dc into master 2025-03-03 10:52:34 +00:00
dstepanov-yadro deleted branch feat/shard_io_limiter 2025-03-03 10:52:36 +00:00
dstepanov-yadro referenced this pull request from a commit 2025-03-03 10:52:37 +00:00
acid-ant approved these changes 2025-03-03 15:26:33 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1636
No description provided.