Fix connection leak and panic at close operation #171
No reviewers
Labels
No labels
P0
P1
P2
P3
good first issue
pool
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/frostfs-sdk-go#171
Loading…
Reference in a new issue
No description provided.
Delete branch "alexvanin/frostfs-sdk-go:fix/pool-panic-leak"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This PR fixes two issues.
Connection leak at restarts
Consider endpoint which accepts pool connections, but returns error on all RPCs like getting node info or session token.
After successful
dial()
, client keeps an open connection, but it is marked as unhealthy. After rebalance interval, unhealthy connection is re-established inrestartIfUnhealthy
function by creating new client and callingdial()
once again. While this is correct behaviour, pool should close unhealthy connection before that.To do that, pool should maintain dialing status. The easiest way is to modify health status to store three states:
Then
restartIfUnhealthy
may close all dialed connection and avoid connection leak during restarts.Panic at connection closing
If some pool connection are dead at initialization or during rebalance, then
dial()
returns error and connection is not established. Pool can't callClose()
on such connections, because it leads to panics.New health status allows to close only dialed connections therefore avoid panic.
9a950b4195
tode65d27636
@ -2780,3 +2812,3 @@
for _, pools := range p.innerPools {
for _, cli := range pools.clients {
_ = cli.close()
if cli.isDialed() {
Wouldn't it be better from the client POV to just allow closing even undialed clients?
It would, but simple solution down below didn't work, so I suppose changes should be done on api-go level as well.
Function comment clearly tells us
MUST NOT be called before successful Dial
. So invoker (pool) should check dial state, which it does now.@ -86,6 +86,8 @@ type client interface {
type clientStatus interface {
// isHealthy checks if the connection can handle requests.
isHealthy() bool
// isDialled checks if the connection was created.
isDialled -> isDialed
de65d27636
to60463871db