Update gRPC dependency #1268

Closed

opened 2024-07-23 12:52:23 +00:00 by fyrchik · 3 comments

fyrchik commented

2024-07-23 12:52:23 +00:00

Owner

`DialContext` is now deprecated: https://github.com/grpc/grpc-go/blob/0231b0d9429d46c8c5e534450f7dd97a4c53812f/clientconn.go#L221

It uses `NewClient` internally, so we should move to it too.
There is one important case that must keep working:

1. Dial timeout is X
2. Operation timeout is Y, where X << Y

We need this to work properly during failover: if the host is down, we want to give up after the short dial timeout X instead of waiting for the long operation timeout Y.
The operation timeout is comparable to the client timeout, so failing fast on dial somewhat prevents client errors.

As can be seen from the `DialContext` source, `WaitForStateChange` may be of help.
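
One way to keep the short dial timeout X after moving to `NewClient` might be to start the connection explicitly and wait on state changes, roughly the way a blocking `DialContext` does. The sketch below is only an illustration under assumptions, not the project's implementation: `connectWithin`, `stateConn`, `State` and `fakeConn` are hypothetical names, and `stateConn` merely mirrors the `Connect`, `GetState` and `WaitForStateChange` methods of `*grpc.ClientConn` so that the example runs without the gRPC module.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// State is a local stand-in for grpc-go's connectivity.State.
type State int32

const (
	Idle State = iota
	Connecting
	Ready
)

// stateConn lists the *grpc.ClientConn methods this sketch relies on.
type stateConn interface {
	Connect()
	GetState() State
	WaitForStateChange(ctx context.Context, source State) bool
}

// connectWithin (hypothetical helper) starts connecting and waits until the
// connection is Ready, giving up after the short dial timeout.
func connectWithin(conn stateConn, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	for {
		s := conn.GetState()
		if s == Ready {
			return nil
		}
		if s == Idle {
			conn.Connect() // a NewClient-style connection does not dial by itself
		}
		if !conn.WaitForStateChange(ctx, s) {
			return fmt.Errorf("not ready within %v, last state %d", timeout, s)
		}
	}
}

// fakeConn simulates a host that turns Ready after readyAfter (never, if negative).
type fakeConn struct {
	state      atomic.Int32
	readyAfter time.Duration
}

func (c *fakeConn) Connect() {
	if c.state.CompareAndSwap(int32(Idle), int32(Connecting)) && c.readyAfter >= 0 {
		time.AfterFunc(c.readyAfter, func() { c.state.Store(int32(Ready)) })
	}
}

func (c *fakeConn) GetState() State { return State(c.state.Load()) }

// WaitForStateChange polls the fake state until it differs from source or ctx ends.
func (c *fakeConn) WaitForStateChange(ctx context.Context, source State) bool {
	for c.GetState() == source {
		select {
		case <-ctx.Done():
			return false
		case <-time.After(time.Millisecond):
		}
	}
	return true
}

func main() {
	fmt.Println("down host:", connectWithin(&fakeConn{readyAfter: -1}, 50*time.Millisecond))
	fmt.Println("live host:", connectWithin(&fakeConn{readyAfter: 10 * time.Millisecond}, time.Second))
}
```

With a real connection, the same loop would presumably take the `*grpc.ClientConn` returned by `grpc.NewClient` and compare against `connectivity.State` values, while the operation timeout Y would still be applied per RPC through its own context.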

fyrchik added the

frostfs-node

internal

labels 2024-07-23 12:52:52 +00:00

fyrchik added this to the v0.43.0 milestone 2024-07-23 12:53:13 +00:00

fyrchik commented

2024-07-23 12:54:28 +00:00

Author

Owner

As part of this task, also check api-go and the SDK; we might want to create a generic helper somewhere.
However, blocking on dial (`WithBlock`) is an anti-pattern, see https://github.com/grpc/grpc-go/blob/master/Documentation/anti-patterns.md, so another solution for the failover case may be possible.

fyrchik commented

2024-07-23 15:18:49 +00:00

Author

Owner

Another approach to failover cases:

1. Get rid of dial timeouts.
2. Do not reconnect to the clients for some time.

Currently this "some time" is a constant. The suggestion is to create a background thread that waits for a state change; new RPCs can then be sent to that thread. It seems that this way we can handle the "node is permanently offline" situation well.
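
The background-thread idea could look roughly like the sketch below: a goroutine follows the connection state and callers fail fast while the node is not ready, so no dial timeout is involved. Everything here is an assumption made for illustration: `tracker`, `startTracker`, `call`, `eventually`, `stateConn`, `State` and `fakeConn` are hypothetical names, `stateConn` only mirrors three methods of `*grpc.ClientConn`, and callers consult a flag maintained by the goroutine instead of handing RPCs to it over a channel, which is a simplification of the suggestion above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// State is a local stand-in for grpc-go's connectivity.State.
type State int32

const (
	Idle State = iota
	Connecting
	Ready
	TransientFailure
)

// stateConn lists the *grpc.ClientConn methods this sketch relies on.
type stateConn interface {
	Connect()
	GetState() State
	WaitForStateChange(ctx context.Context, source State) bool
}

var errUnavailable = errors.New("node is unavailable")

// tracker (hypothetical) mirrors the connection state into a flag.
type tracker struct{ ready atomic.Bool }

// startTracker launches the background watcher; stop cancels it.
func startTracker(conn stateConn) (t *tracker, stop func()) {
	ctx, cancel := context.WithCancel(context.Background())
	t = &tracker{}
	go func() {
		for {
			s := conn.GetState()
			t.ready.Store(s == Ready)
			if s == Idle {
				conn.Connect()
			}
			if !conn.WaitForStateChange(ctx, s) {
				t.ready.Store(false) // stopped: do not serve a stale flag
				return
			}
		}
	}()
	return t, cancel
}

// call fails fast unless the node was last seen Ready; no dial timeout.
func (t *tracker) call(rpc func() error) error {
	if !t.ready.Load() {
		return errUnavailable
	}
	return rpc()
}

// fakeConn is a polling test double; its state is driven by set.
type fakeConn struct{ state atomic.Int32 }

func (c *fakeConn) set(s State) { c.state.Store(int32(s)) }

func (c *fakeConn) Connect() { c.state.CompareAndSwap(int32(Idle), int32(Connecting)) }

func (c *fakeConn) GetState() State { return State(c.state.Load()) }

func (c *fakeConn) WaitForStateChange(ctx context.Context, source State) bool {
	for c.GetState() == source {
		select {
		case <-ctx.Done():
			return false
		case <-time.After(time.Millisecond):
		}
	}
	return true
}

// eventually polls until the tracker flag equals want (demo helper).
func eventually(t *tracker, want bool) bool {
	for i := 0; i < 1000 && t.ready.Load() != want; i++ {
		time.Sleep(time.Millisecond)
	}
	return t.ready.Load() == want
}

func main() {
	conn := &fakeConn{}
	t, stop := startTracker(conn)
	defer stop()
	rpc := func() error { return nil }
	fmt.Println("node down:", t.call(rpc))
	conn.set(Ready)
	fmt.Println("flag updated:", eventually(t, true))
	fmt.Println("node up:", t.call(rpc))
}
```

In this shape a permanently offline node costs callers only a flag check, and the constant "some time" disappears because the watcher reacts to state changes instead of a timer. Whether RPCs should instead be queued to the goroutine is left open here.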

fyrchik referenced this issue from TrueCloudLab/frostfs-api-go

2024-08-01 05:50:15 +00:00

rpc: Accept interface in place of ClientConn #98

fyrchik referenced this issue from a commit

2024-08-01 05:50:41 +00:00

[#98] rpc: Accept interface in place of ClientConn

fyrchik referenced this issue

2024-08-14 11:39:41 +00:00

ir: Add health status reporting on reconfiguration #1311