`context.Cancelled` and `context.DeadlineExceeded` reasons for client sleep #24

Closed

opened 2023-01-23 13:59:54 +00:00 by carpawell · 0 comments

(Migrated from github.com)

Context

After https://github.com/nspcc-dev/neofs-node/pull/2164, _any_ `error` returned by a client (meaning a real `error`, not a status; `SplitInfo` is an exception) makes all subsequent requests to that client fail. That could help in some high-load (or failover) scenarios.

Problem

On the other hand, I do not fully agree that such a solution should be considered our best effort. At the very least, it is not clear to me why `context.Canceled` should be a reason to put the client to sleep.

Thoughts

I guess we could tune network communication, turn the `Replicator` off for some time, add some feedback mechanism for our components, etc., rather than just sleeping for `30s` and hoping that everything will be fine. Moreover, the current implementation will still read objects from disk and then fail at almost the last step (`HEAD`, before the final `PUT`). Also, has anybody ever considered a "pull" replication mechanism instead of the current "push"?
