Revise logger levels #41
Reference: TrueCloudLab/frostfs-node#41
I suggest a general rule: "info" for external events AND for accepting events to share them with others (e.g. pushing a "new epoch" event to the chain); "debug" for internal changes (a node gets "new epoch" and logs it at "info", but its internal handling can be logged at "debug").
Also, as was discussed before, we have some quite rare but really important events that deserve the "info"/"warn" level (since we store objects, we can log their removal with "warn" to prevent/investigate errors).
Also, some important events can move from "debug" to "info":
Also, I would expand "debug" with logs about every step inside every main process: before/after every contract info fetch; before/after every network communication; before/after every disk operation.
Suggestions are appreciated.
Also, it would be nice to have info about endpoints in the log entry when the node communicates with external services, not only in the node config.
Also, I would consider moving some important data to our metrics. It could help us find important system changes without scrolling through logs for minutes/hours (see https://github.com/TrueCloudLab/frostfs-node/issues/17 for some of my thoughts).
"sync cycle took 5m" is desirable.
"new epoch" and bootstrap queries are debug.
Instead of having INFO for external events, I would suggest having INFO for events that imply a state transition. The first person who will be reading them is a service engineer. By "state" I mean the node's view of the network + storage. Basically, INFO logs should be readable sequentially to build a good approximation of what happens with the node: which peers are connected, what shards are online, etc.
So the network part is the epoch, morph client switches, dropping/creating connections to other storage nodes (?), and shard mode changes (non-automatic ones; automatic ones can be WARN). GC/sync cycles can also be INFO as they are included in this "state": no new sync cycle can start until the old one finishes.