Update gRPC dependency #1268

Closed
opened 2024-07-23 12:52:23 +00:00 by fyrchik · 3 comments
Owner

`DialContext` is now deprecated: https://github.com/grpc/grpc-go/blob/0231b0d9429d46c8c5e534450f7dd97a4c53812f/clientconn.go#L221

It uses `NewClient` internally, so we should move to it too.
There is one important case that must continue to work:

  1. Dial timeout is X
  2. Operation timeout is Y, where X << Y

We need this to work properly during failover: if a host is down, we want to give up after the short dial timeout, not the long operation timeout.
The operation timeout is comparable to the client timeout, so the scenario above helps prevent client errors.

As can be seen from the DialContext source, WaitForStateChange may be of help.

fyrchik added the frostfs-node and internal labels 2024-07-23 12:52:52 +00:00
fyrchik added this to the v0.43.0 milestone 2024-07-23 12:53:13 +00:00
Author
Owner

In this task, also check api-go and the SDK; we might want to create a generic helper somewhere.
However, blocking on dial is an anti-pattern (`WithBlock`, see https://github.com/grpc/grpc-go/blob/master/Documentation/anti-patterns.md), so another solution for the failover cases may be possible.

Author
Owner

Another approach to failover cases:

  1. Get rid of dial timeouts.
  2. Do not reconnect to the clients for some time.

Currently this "some time" is constant. The suggestion is to create a background thread that waits for a state change; new RPCs can then be sent to that thread. It seems that this way we can handle the "node is permanently offline" situation well.

dstepanov-yadro was assigned by fyrchik 2024-09-13 13:17:18 +00:00
Author
Owner

Closed via #1374

Reference: TrueCloudLab/frostfs-node#1268