ir: Add health status reporting on reconfiguration #1311

Merged
fyrchik merged 1 commit from elebedeva/frostfs-node:fix/ir-reload-notify-systemd into master 2024-09-04 19:51:11 +00:00
3 changed files with 13 additions and 3 deletions

@@ -0,0 +1,10 @@
+package sdnotify
+
+import (
+	// For go:linkname to work.
+	_ "unsafe"
+)
+
+//go:noescape
+//go:linkname nanotime runtime.nanotime
+func nanotime() int64

@@ -0,0 +1,2 @@
+// The file is intentionally empty.
+// It is a workaround for https://github.com/golang/go/issues/15006

@@ -6,7 +6,6 @@ import (
 	"net"
 	"os"
 	"strings"
-	"time"
 )
 
 const (
@@ -17,7 +16,6 @@ const (
 var (
 	socket *net.UnixAddr
-	start  = time.Now()
fyrchik marked this conversation as resolved Outdated

Have you tried using `start time.Time` (i.e. the default value)?

The `time.Time` default value is `0001-01-01 00:00:00 +0000 UTC`, and `time.Since()` returns a `time.Duration`, whose underlying type is `int64`. According to the [Go doc](https://pkg.go.dev/time#Duration):

> The representation limits the largest representable duration to approximately 290 years.

So `time.Since(time.Time{})` exceeds that range (the result saturates at the maximum `Duration`).
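
To make the point above concrete, here is a minimal sketch (not part of this PR; the helper names `sinceZero` and `sinceZeroSaturates` are hypothetical). It relies on the documented behaviour of `Time.Sub`, which returns the maximum `Duration` when the true difference does not fit, so `time.Since(time.Time{})` yields a constant rather than a meaningful elapsed time:

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// sinceZero returns the elapsed time since the zero time.Time value
// (0001-01-01 00:00:00 UTC), which is far more than ~290 years ago.
func sinceZero() time.Duration {
	return time.Since(time.Time{})
}

// sinceZeroSaturates reports whether that result equals the largest
// representable Duration, i.e. whether the subtraction was clamped.
func sinceZeroSaturates() bool {
	return sinceZero() == time.Duration(math.MaxInt64)
}

func main() {
	d := sinceZero()
	fmt.Println(sinceZeroSaturates()) // true
	// The same conversion the old code applied before formatting MONOTONIC_USEC.
	fmt.Println(uint64(d) / 1e3)
}
```

With a zero-valued `start`, every notification would therefore carry the same `MONOTONIC_USEC` value instead of a monotonic timestamp.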
 	errSocketVariableIsNotPresent = errors.New("\"NOTIFY_SOCKET\" environment variable is not present")
 	errSocketIsNotInitialized     = errors.New("socket is not initialized")
@@ -53,7 +51,7 @@ func FlagAndStatus(status string) error {
 		// must be sent, containing "READY=1".
 		//
 		// For MONOTONIC_USEC format refer to https://www.man7.org/linux/man-pages/man3/sd_notify.3.html
fyrchik marked this conversation as resolved
Review

Maybe SIGHUP handling in IR is so fast that we send the same `MONOTONIC_USEC`? That would explain the problem.
Review

Receiving the same `MONOTONIC_USEC` doesn't seem to be a problem for `systemd`. I tried sending `time.Since(time.Time{})` (always the same value); `systemd` is OK with it and the service reload succeeds.

It appears that `systemd` rejects values below some minimum or above some maximum, and the time in µs since start is deemed too small. Time in ns since start works most of the time but not always (I got hang-ups a couple of times). I haven't found those min and max values in the systemd source code yet.

Passing `math.MaxInt64` as `MONOTONIC_USEC` works fine, but `math.MaxUint64` is not accepted.
-		status += fmt.Sprintf("\nMONOTONIC_USEC=%d", uint64(time.Since(start))/1e3 /* microseconds in nanoseconds */)
+		status += fmt.Sprintf("\nMONOTONIC_USEC=%d", uint64(nanotime())/1e3 /* microseconds in nanoseconds */)
 	}
 	status += "\nSTATUS=" + strings.TrimSuffix(status, "=1")
 	return Send(status)
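
For readers unfamiliar with the payload format, the following is a simplified, standalone sketch of how a reload notification can be assembled. It is not the `FlagAndStatus` function from this PR: `reloadingFlag` and `reloadingMessage` are hypothetical names, the flag value `RELOADING=1` is taken from sd_notify(3), and the `monoNanos` argument is assumed to be a `CLOCK_MONOTONIC` reading in nanoseconds (what `nanotime()` is expected to provide). Unlike the diff above, the sketch derives `STATUS` from the flag alone.

```go
package main

import (
	"fmt"
	"strings"
)

// reloadingFlag is the field sd_notify(3) describes for the start of a reload.
// Assumption: the package constant used in the PR has this value.
const reloadingFlag = "RELOADING=1"

// reloadingMessage builds a newline-separated notification payload.
// monoNanos is assumed to be a CLOCK_MONOTONIC timestamp in nanoseconds;
// systemd expects MONOTONIC_USEC in microseconds as a decimal string.
func reloadingMessage(monoNanos int64) string {
	var b strings.Builder
	b.WriteString(reloadingFlag)
	// Integer division by 1e3 converts nanoseconds to microseconds.
	fmt.Fprintf(&b, "\nMONOTONIC_USEC=%d", uint64(monoNanos)/1e3)
	// Human-readable status derived from the flag name, e.g. "RELOADING".
	b.WriteString("\nSTATUS=" + strings.TrimSuffix(reloadingFlag, "=1"))
	return b.String()
}

func main() {
	// 1.5 ms on the monotonic clock becomes 1500 µs in the payload.
	fmt.Printf("%q\n", reloadingMessage(1_500_000))
	// "RELOADING=1\nMONOTONIC_USEC=1500\nSTATUS=RELOADING"
}
```

The sketch keeps the payload construction separate from sending so that the unit conversion can be checked without a systemd socket.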