Introduce retry mechanism for event subscriber #685
Reference: TrueCloudLab/frostfs-node#685
Consider the subscriber that is used to get specific events from the blockchain. The subscriber wraps the morph client.
The subscriber is able to reconnect to the notificator endpoint if the connection has been lost/reset.
The problem arises when the endpoint list contains only one entry (`len(c.endpoints.list) == 1`): SwitchRPC then fails if that single endpoint is unavailable for a while. With several endpoints this may be fine, because we have a good chance of switching to a working endpoint.
That happens because the websocket client constructor does not attempt to reconnect after a failure: `DialTimeout` for the WS client is used as `HandshakeTimeout`, which does not help us at all because the connection won't be established until the peer is up.
There are two ways to fix this problem:
Title changed from "Introduce retry mechanism for morph client used by subscriber" to "Introduce retry mechanism for event subscriber".
The neo-go client shouldn't be changed; retry logic should be implemented in morph only.
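For illustration only, a retry loop around the connection attempt to a single endpoint might look roughly like the sketch below. The names `dialFunc`, `retryDial` and `simulate` are hypothetical and are not part of the morph or neo-go code; the doubling delay with a cap is an assumption, not a decision made in this issue.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// dialFunc is a hypothetical stand-in for whatever establishes the WS client
// connection to one endpoint; it is not a real morph or neo-go API.
type dialFunc func(ctx context.Context) error

// retryDial calls dial until it succeeds, the attempts run out, or ctx is
// cancelled. The delay doubles after every failed attempt, capped at maxDelay.
func retryDial(ctx context.Context, dial dialFunc, attempts int, delay, maxDelay time.Duration) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = dial(ctx); err == nil {
			return nil
		}
		if i == attempts-1 {
			break // no point in sleeping after the last attempt
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("retry cancelled: %w", ctx.Err())
		case <-time.After(delay):
		}
		delay *= 2
		if delay > maxDelay {
			delay = maxDelay
		}
	}
	return fmt.Errorf("endpoint still unavailable after %d attempts: %w", attempts, err)
}

// simulate runs retryDial against a fake endpoint that is down for the first
// `failures` calls and reports how many dial calls were made.
func simulate(failures, attempts int) (calls int, err error) {
	dial := func(context.Context) error {
		calls++
		if calls <= failures {
			return errors.New("connection refused")
		}
		return nil
	}
	err = retryDial(context.Background(), dial, attempts, time.Millisecond, 8*time.Millisecond)
	return calls, err
}

func main() {
	// The fake endpoint comes back on the third call, within the 5 attempts.
	calls, err := simulate(2, 5)
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

Keeping the loop in a wrapper like this leaves the underlying client constructor untouched, which matches the requirement that neo-go is not modified. The attempt count and delays would presumably come from the node configuration.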