Support active RPC limiting #1639
Adopt TrueCloudLab/frostfs-qos#4 to limit the number of active RPCs
I used a virtual cluster to check that the limits are applied; the limits config looks as follows:
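A minimal sketch of such a section, assuming the `rpc.limits` schema that `rpcconfig.Limits` reads later in this PR (method names and numbers are illustrative only, not the config used for the test):

```yaml
rpc:
  limits:
    - methods:
        - /neo.fs.v2.object.ObjectService/Put
        - /neo.fs.v2.object.ObjectService/Get
      max_ops: 1000
    - methods:
        - /neo.fs.v2.container.ContainerService/List
      max_ops: 100
```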
Since this limiter limits the number of active RPCs rather than the rate, I tried to estimate that number using the existing metrics, calculating it as the product of RPC rate and RPC latency. You can see the results below.
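(In effect this estimate is Little's law: the average number of in-flight RPCs L ≈ λ · W, where λ is the RPC rate and W is the mean RPC latency.)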
I also checked the ability to change limits on a service reload, changing the config from this
to this
See results below.
replicator.pool_size from other settings (12839e0087)
Changed title from "WIP: node: Support active RPC limiting" to "WIP: Support active RPC limiting"
Force-pushed from bc8616323d to 36844001c3
Force-pushed from 36844001c3 to 94c2d596c1
Force-pushed from 94c2d596c1 to 81ba3de84a
Force-pushed from 81ba3de84a to 084218f100
Force-pushed from 084218f100 to bbc56251cb
Force-pushed from bbc56251cb to da330ff5cb
Force-pushed from da330ff5cb to 02bbacb071
Changed title from "WIP: Support active RPC limiting" to "Support active RPC limiting"
@@ -52,0 +79,4 @@
if !ok {
	return new(apistatus.ResourceExhausted)
}
defer release()
The current behavior is as follows: the counter increases, the handler is called, the counter decreases, but the `ss grpc.ServerStream` will continue to work. You need to make a wrapper over `ss grpc.ServerStream` that will decrease the counter when the stream is closed.

Are you sure? I always thought an interceptor's handler is called only once. I don't need to track each message, so I can reduce the counter on the handler exit, can't I?
Now I'm not sure :) So if you checked and it works as expected, then it is great!
I'll check it one more time
I think the handler calls `RecvMsg` and other stream methods itself, so if I don't need to process each stream message, I can simply call the handler.

65c6718afb/server.go (L1570-L1579)
65c6718afb/server.go (L1694-L1703)
65c6718afb/examples/features/interceptor/server/main.go
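For reference, a minimal sketch of how such a stream interceptor can work without wrapping `ss grpc.ServerStream`, assuming a `Limiter` with an `Acquire(key) (release, ok)` method (the interface and the error returned here are stand-ins; the actual PR returns `apistatus.ResourceExhausted`):

```go
package qos

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Limiter is a stand-in for the limiting.Limiter interface: Acquire reserves
// a slot for the given key (the full RPC method name) and reports whether the
// limit allows another active call; the returned function releases the slot.
type Limiter interface {
	Acquire(key string) (release func(), ok bool)
}

// NewMaxActiveRPCLimiterStreamServerInterceptor rejects a streaming RPC when
// the per-method limit of active calls is reached. The generated gRPC handler
// drives the stream (RecvMsg/SendMsg) itself and returns only after the RPC
// has finished, so releasing the slot on handler return covers the whole
// lifetime of the stream.
func NewMaxActiveRPCLimiterStreamServerInterceptor(getLimiter func() Limiter) grpc.StreamServerInterceptor {
	return func(srv any, ss grpc.ServerStream, info *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
		release, ok := getLimiter().Acquire(info.FullMethod)
		if !ok {
			return status.Error(codes.ResourceExhausted, "active RPC limit reached")
		}
		defer release()
		return handler(srv, ss)
	}
}
```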
@@ -20,4 +20,0 @@
// LogWorkerPoolError writes debug error message of object worker pool to provided logger.
func LogWorkerPoolError(ctx context.Context, l *logger.Logger, req string, err error) {
	l.Error(ctx, logs.UtilCouldNotPushTaskToWorkerPool,
Also drop `logs.UtilCouldNotPushTaskToWorkerPool`
Done
Force-pushed from 02bbacb071 to 72685e35ee
@@ -29,2 +31,4 @@
// PoolSize returns the value of "pool_size" config parameter
// from "replicator" section.
//
// Returns PoolSizeDefault if the value is not positive integer.
not positive -> non-positive :)
Fixed
@@ -0,0 +22,4 @@
var limits []LimitConfig
var i uint64
for ; ; i++ {
Optional change request
How about `for i := uint64(0); ; i++` as we don't use this variable outside?
Done
Force-pushed from 72685e35ee to ad3b32eade
New commits pushed, approval review dismissed automatically according to repository settings
@@ -134,11 +136,13 @@ func getGrpcServerOpts(ctx context.Context, c *cfg, sc *grpcconfig.Config) ([]gr
qos.NewUnaryServerInterceptor(),
metrics.NewUnaryServerInterceptor(),
tracing.NewUnaryServerInterceptor(),
qosInternal.NewMaxActiveRPCLimiterUnaryServerInterceptor(func() limiting.Limiter { return c.cfgGRPC.limiter.Load() }),
We could use `c.cfgGRPC.limiter.Load` instead of `func() limiting.Limiter { return c.cfgGRPC.limiter.Load() }`. Was your choice deliberate?

`NewMaxActiveRPCLimiterUnaryServerInterceptor` receives the `Limiter` interface, but `c.cfgGRPC.limiter.Load` returns the `*SemaphoreLimiter`.
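A small illustration of why the wrapper closure is needed (type names are stand-ins mirroring the PR, not the actual frostfs-node code):

```go
package main

import "sync/atomic"

// Stand-ins for the limiting.Limiter interface and its concrete
// *SemaphoreLimiter implementation.
type Limiter interface {
	Acquire(key string) (release func(), ok bool)
}

type SemaphoreLimiter struct{}

func (*SemaphoreLimiter) Acquire(string) (func(), bool) { return func() {}, true }

// newInterceptor mimics a constructor that expects a getter returning the interface.
func newInterceptor(getLimiter func() Limiter) { _ = getLimiter() }

func main() {
	var limiter atomic.Pointer[SemaphoreLimiter]
	limiter.Store(&SemaphoreLimiter{})

	// newInterceptor(limiter.Load) // does not compile: limiter.Load is a
	// func() *SemaphoreLimiter, and Go function types are not convertible to
	// func() Limiter even though *SemaphoreLimiter implements Limiter.

	newInterceptor(func() Limiter { return limiter.Load() }) // the closure performs the conversion
}
```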
@@ -52,0 +68,4 @@
}
}
//nolint:contextcheck (false positive)
`false positive` implies a bug in the linter. Here we have no bug in the linter, just a cumbersome gRPC API.
Made the message clearer
@@ -52,0 +69,4 @@
}
//nolint:contextcheck (false positive)
func NewMaxActiveRPCLimiterStreamServerInterceptor(getLimiter func() limiting.Limiter) grpc.StreamServerInterceptor {
Refs #1656.
@@ -1416,0 +1413,4 @@
func (c *cfg) reloadLimits() error {
	var limits []limiting.KeyLimit
	for _, l := range rpcconfig.Limits(c.appCfg) {
		limits = append(limits, limiting.KeyLimit{Keys: l.Methods, Limit: l.MaxOps})
We currently do not check for `Keys` uniqueness inside the `limits` array. This may be either a feature or a misconfiguration.

I don't think we need such a feature; at the very least, the last item should take priority. And without wildcards, such a feature has very limited use.

So, what about checking for uniqueness?

Also, there may be typos, and proto names are not short. Do we have any solution for this?
`c.cfgGRPC.servers[0].Server.GetServiceInfo()` could be of use. As an example of where this will be useful: the future transition from the `neo.fs` to the `frostfs` namespace.
namespace.The limiter already do this check
func (lr *SemaphoreLimiter) addLimit(limit *KeyLimit, sem *semaphore.Semaphore) error {
	for _, key := range limit.Keys {
		if _, exists := lr.m[key]; exists {
			return fmt.Errorf("duplicate key %q", key)
		}
		lr.m[key] = sem
	}
	return nil
}
I'll add it
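A sketch of what such a check against `GetServiceInfo()` could look like (a hypothetical helper, not code from this PR; `KeyLimit` is redefined here only to keep the sketch self-contained):

```go
package rpclimits

import (
	"fmt"

	"google.golang.org/grpc"
)

// KeyLimit mirrors the limiting.KeyLimit shape used in the PR.
type KeyLimit struct {
	Keys  []string
	Limit int64
}

// validateLimitKeys reports an error for any configured key that does not match
// a "/package.Service/Method" name registered on the server, catching typos and
// stale names (e.g. a neo.fs method name after a move to the frostfs namespace).
func validateLimitKeys(srv *grpc.Server, limits []KeyLimit) error {
	known := make(map[string]struct{})
	for svc, info := range srv.GetServiceInfo() {
		for _, m := range info.Methods {
			known["/"+svc+"/"+m.Name] = struct{}{}
		}
	}
	for _, l := range limits {
		for _, key := range l.Keys {
			if _, ok := known[key]; !ok {
				return fmt.Errorf("limits config references unknown method %q", key)
			}
		}
	}
	return nil
}
```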
Force-pushed from ad3b32eade to 611b96f54b