ir: Add health status reporting on reconfiguration #1311

Merged
fyrchik merged 2 commits from elebedeva/frostfs-node:fix/ir-reload-notify-systemd into master 2024-09-04 19:51:11 +00:00
Member

Fix #1135
Relates to #1262

frostfs-ir service reconfigures correctly and service's statuses are being reported to systemd. However, since we replaced go:linkname & nanotime() with time.Since(), systemd refuses to accept reload signal response from frostfs-ir. To maintain correct behaviour it was decided to revevrt systemd-related changes until a better solution is found.

Signed-off-by: Ekaterina Lebedeva ekaterina.lebedeva@yadro.com

Fix #1135 Relates to #1262 `frostfs-ir` service reconfigures correctly and service's statuses are being reported to systemd. However, since we replaced `go:linkname` & `nanotime()` with `time.Since()`, systemd refuses to accept reload signal response from `frostfs-ir`. To maintain correct behaviour it was decided to revevrt[ systemd-related changes](https://git.frostfs.info/elebedeva/frostfs-node/commit/327d364f34de730879f330ea51d8801f5c6bddc9) until a better solution is found. Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
elebedeva added the
frostfs-ir
internal
labels 2024-08-14 11:28:59 +00:00
elebedeva requested review from storage-core-committers 2024-08-14 11:29:54 +00:00
elebedeva requested review from storage-core-developers 2024-08-14 11:29:58 +00:00
fyrchik reviewed 2024-08-14 11:43:18 +00:00
go.mod Outdated
@ -45,2 +44,2 @@
google.golang.org/grpc v1.63.2
google.golang.org/protobuf v1.33.0
golang.org/x/term v0.21.0
google.golang.org/grpc v1.65.0
Owner

no, too early #1268

no, too early https://git.frostfs.info/TrueCloudLab/frostfs-node/issues/1268
Author
Member

fixed

fixed
fyrchik marked this conversation as resolved
@ -17,7 +16,6 @@ const (
var (
socket *net.UnixAddr
start = time.Now()
Owner

Have you tried using start time.Time (i.e. the default value)?

Have you tried using `start time.Time` (i.e. the default value)?
Author
Member

The time.Time default value is 0001-01-01 00:00:00 +0000 UTC, time.Since() returns time.Duration which is an alias of int64. According to Go doc:

The representation limits the largest representable duration to approximately 290 years.

time.Since(time.Time{}) overflows.

The `time.Time` default value is `0001-01-01 00:00:00 +0000 UTC`, `time.Since()` returns `time.Duration` which is an alias of `int64`. According to [Go doc](https://pkg.go.dev/time#Duration): > The representation limits the largest representable duration to approximately 290 years. `time.Since(time.Time{})` overflows.
fyrchik marked this conversation as resolved
@ -53,7 +51,7 @@ func FlagAndStatus(status string) error {
// must be sent, containing "READY=1".
//
// For MONOTONIC_USEC format refer to https://www.man7.org/linux/man-pages/man3/sd_notify.3.html
Owner

Maybe in IR SIGHUP is so fast that we send the same MONOTONIC_USEC? This would've explained the problem.

Maybe in IR SIGHUP is so fast that we send the same `MONOTONIC_USEC`? This would've explained the problem.
Author
Member

Receiving the same MONOTONIC_USEC doesn't seem like a problem to systemd. I tried sending time.Since(time.Time{}) (always the same value), systemd is OK with it, service reload is successful.

It appears as if systemd does not accept values less than some minimum and greater than some maximum, and time in us since start is deemed as being too small. Time in ns since start works fine most of the time but not always (i've got hang-ups a couple times). Haven't found those min & max values in systemd source code yet.

Passing a math.MaxInt64 as MONOTONIC_USEC works fine but math.MaxUint64 is not accepted.

Receiving the same `MONOTONIC_USEC` doesn't seem like a problem to `systemd`. I tried sending `time.Since(time.Time{})` (always the same value), `systemd` is OK with it, service reload is successful. It appears as if `systemd` does not accept values less than some minimum and greater than some maximum, and time in us since start is deemed as being too small. Time in ns since start works fine most of the time but not always (i've got hang-ups a couple times). Haven't found those min & max values in systemd source code yet. Passing a `math.MaxInt64` as `MONOTONIC_USEC` works fine but `math.MaxUint64` is not accepted.
fyrchik marked this conversation as resolved
elebedeva force-pushed fix/ir-reload-notify-systemd from bb3945f62a to 5da41f1fe5 2024-08-15 13:55:17 +00:00 Compare
fyrchik approved these changes 2024-08-15 13:57:23 +00:00
fyrchik merged commit 5da41f1fe5 into master 2024-08-15 14:00:10 +00:00
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-core-developers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1311
No description provided.