Improve debug logging for APE check failures #1655
No reviewers
Labels
No labels
P0
P1
P2
P3
badger
frostfs-adm
frostfs-cli
frostfs-ir
frostfs-lens
frostfs-node
good first issue
triage
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/frostfs-node#1655
Loading…
Add table
Reference in a new issue
No description provided.
Delete branch "aarifullin/frostfs-node:feat/improve_ape_log"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This pull-request is a compromise between two suggestion #1 and #2
Request
andResource
types return properties withProperties()
ChainRouterError
is extended. The error message has not been changed, but the type provides more fieldsChainRouterError
occurredChainRouterError
occurredClose #1574
Properties()
method forRequest
andResource
00faa9f854ChainRouterError
b83bce1435LGTM, but what about container operations?
@dkirillov wrote in #1655 (comment):
Oh, true. Forgot about this service. Let me fix this
@ -31,0 +38,4 @@
logger.Debug(ctx, logs.APECheckDeniedRequest,
zap.Stringer("status", chainRouterErr.Status()),
zap.String("operation", chainRouterErr.Request().Operation()),
zap.Any("target", chainRouterErr.Target()),
It might be easier to implement custom encoder for
*apecommon.ChainRouterError
type (or another custom type defined in this package).Sth like
"request": {"properties": {...}, "target": ...}, "resource": {"name": ..., "properties": ...}
.Have you considered this?
6d482535bd/zapcore/marshaler.go (L30)
Why?
.Any
or.Stringer
allocates (likely) even when debug is enabled..Request().Properties()
may allocate too.I have implemented an encoder for
*apecommon.ChainRouterError
. Please, check.I agreed with your suggestion
Looks neat!
@ -181,2 +220,4 @@
if err := p.apeChecker.CheckAPE(ctx, prm); err != nil {
var chainRouterErr *apecommon.ChainRouterError
if errors.As(err, &chainRouterErr) {
I think this is a bad practice:
Logging an error is one way to handle an error.
At the same time we return this error too, so it may be handled twice.
Do you have any other solutions in mind?
Are you talking about auto-tests? Because I have no idea who else might be processing error from debug logs.
A client should receive a status going along with concise reason. We can't make an error message that contains thorough details about failure - it may cause a degradation as the server will send big messages to the client on unsuccessful checks.
The idea "client gets the chain router check error. We go to server logs to find out the true reason" is absolutely OK with me
Logging an error is perfectly fine.
Logging it twice is not.
In your example it is logged, then returned (on the caller it can also be logged (again), then returned etc.).
The easiest rule of thumb to solve this problem is to handle errors once.
Thus, my comment.
As an example of how it might be solved: provide details for all requests and hide it behind logging tag.
This way it is not an error and no garbage is seen.
I caught your idea for now. Yes, I agreed. This point is absolutely fair
Let me explain why this is barely possible and justify the current approach
ChainRouterError
won't be processed from calling side ever. For instance, object service request executor (analyzeStatus
) getsObjectAccessDenied
error with a reason. It still gets the same message ("access to operation <method> is denied by access policy engine: <status>"
). So,ChainRouterError
is kind of an intermediate error and its encoder used to log messages only in APE middleware. Please, check out howChainRouterError
implements Error() and how it goes to apeErr.Put
andPatch
(grpc-streaming and all)"operation finished with error"
areGet
,Delete
andSearch
.Put
andPatch
are logged withaudit
middleware onInfo
leveljournalctl
I'd like to leave this as it is and we'll be fine
That we don't see how to do better, doesn't imply the current approach is either helpful or fine.
Agree. But should it be an error then? What we log is operation context.
We shouldn't, it specifies a place in code and is not
context.Context
-dependent.log.WithTag(TagAPEOperationContext).Debug()
will do the job.See #1619.
Why, though? #1574 talks about understanding "what goes wrong". But unexpected ALLOW could also happen, and they also need to be debugged. Having someone access my private container is definitely a bug, even a more dangerous one.
As enabling debug level already do. Having a separate tag looks like a perfectly valid usecase for tags to me.
That PR is in review, but if we had a single function that logs verbose info reused between container and object services, that would help tag adoption. And, of course, it depends on the load, consider node living in the old epoch spamming HEAD requests to all nodes in the container.
Currently, container and object services do the same thing here, but in different ways.
The common denominator seems to be
router
(used directly incontainer
and indirectly viaCheckAPE
inobject
).Maybe have a custom router that logs everything?
@ -68,1 +84,3 @@
return &Service{
func NewService(apeChecker Checker, next objectSvc.ServiceServer, opts ...func(s *Service)) *Service {
s := &Service{
logger: logger.NewLoggerWrapper(zap.L()),
The default should be
zap.NewNop()
, because we don't want to log sensitive data somewhere, we want to precisely control all the sinks.Or, even better, make this argument explicit and remove variadic
opts
array.Fixed
060e4b2f76
tobd30e4838f
New commits pushed, approval review dismissed automatically according to repository settings
@dkirillov wrote in #1655 (comment):
Added for container service
bd30e4838f
toab3922488a
@ -34,0 +85,4 @@
func (e *ChainRouterError) marshalRequest(enc zapcore.ObjectEncoder) error {
enc.AddString("operation", e.request.Operation())
if err := enc.AddObject("target", zapcore.ObjectMarshalerFunc(
enc.AddObject("target", zapcore.ObjectMarshalerFunc(e.marshalTarget))
should work too and takes less space.Also, in some other places this could be simplified too.
fixed
@ -110,1 +123,4 @@
if err := g.apeChecker.CheckAPE(g.Context(), prm); err != nil {
var chainRouterErr *apecommon.ChainRouterError
if errors.As(err, &chainRouterErr) {
logChainRouterError(g.Context(), g.logger, chainRouterErr)
Why don't we log error just once in
g.apeChecker.CheckAPE
func (c *checkerImpl) CheckAPE(ctx context.Context, prm Prm) error {
?
ab3922488a
to86a0292145
View command line instructions
Checkout
From your project repository, check out a new branch and test the changes.