plugin/loop: show from -> to (#2400)

Show the from and to addresses when a loop is detected; they may aid in
debugging.

It is hard to create a unit test for this, but here is a startup run with a
self-induced loop:

~~~ corefile
.:1053 {
    loop
    log
    forward . 127.0.0.1:1053
}
~~~

~~~
:1053
2018-12-16T10:11:03.695Z [INFO] CoreDNS-1.3.0
2018-12-16T10:11:03.695Z [INFO] linux/amd64, go1.11,
CoreDNS-1.3.0
linux/amd64, go1.11,
2018-12-16T10:11:03.696Z [FATAL] plugin/loop: Loop (127.0.0.1:51384 -> :1053) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 2781022615773629442.4133547885299871809."
~~~
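
As a rough illustration of how the addresses end up in that message, here is a minimal sketch. The `probe` type and its `message` method are hypothetical stand-ins, not the plugin's actual `Loop` type; only the mutex-guarded `setAddress`/`address` pair and the format string are taken from the diff below.

```go
package main

import (
	"fmt"
	"sync"
)

// probe is a hypothetical stand-in for the plugin's Loop type. It keeps only
// the fields needed to build the log message.
type probe struct {
	zone  string
	qname string

	sync.RWMutex
	addr string
}

// setAddress records the address the probe query is sent to.
func (p *probe) setAddress(addr string) {
	p.Lock()
	defer p.Unlock()
	p.addr = addr
}

// address returns the recorded address under a read lock.
func (p *probe) address() string {
	p.RLock()
	defer p.RUnlock()
	return p.addr
}

// message formats the loop report. remote is assumed to be the address the
// looping query arrived from (state.RemoteAddr() in the real handler).
func (p *probe) message(remote string) string {
	return fmt.Sprintf(`Loop (%s -> %s) detected for zone %q, see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO %s"`,
		remote, p.address(), p.zone, p.qname)
}

func main() {
	p := &probe{zone: ".", qname: "2781022615773629442.4133547885299871809."}
	p.setAddress(":1053")
	fmt.Println(p.message("127.0.0.1:51384"))
}
```

The address is guarded by the lock because, in the plugin, it is written from the startup probe loop while the request handler may read it concurrently.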

Updated the docs and polished them a bit as well.

Signed-off-by: Miek Gieben <miek@miek.nl>
Miek Gieben 2018-12-16 21:48:09 +00:00 committed by Yong Tang
parent 6b8c154441
commit 775cf92f03
3 changed files with 28 additions and 11 deletions

@@ -7,10 +7,10 @@
 ## Description
 The *loop* plugin will send a random probe query to ourselves and will then keep track of how many times
-we see it. If we see it more than twice, we assume CoreDNS is looping and we halt the process.
+we see it. If we see it more than twice, we assume CoreDNS has seen a forwarding loop and we halt the process.
 The plugin will try to send the query for up to 30 seconds. This is done to give CoreDNS enough time
-to start up. Once a query has been successfully sent *loop* disables itself to prevent a query of
+to start up. Once a query has been successfully sent, *loop* disables itself to prevent a query of
 death.
 The query sent is `<random number>.<random number>.zone` with type set to HINFO.
@@ -36,22 +36,24 @@ forwards to it self.
 After CoreDNS has started it stops the process while logging:
 ~~~ txt
-plugin/loop: Forwarding loop detected in "." zone. Exiting. See https://coredns.io/plugins/loop#troubleshooting. Probe query: "HINFO 5577006791947779410.8674665223082153551.".
+plugin/loop: Loop (127.0.0.1:55953 -> :1053) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 4547991504243258144.3688648895315093531."
 ~~~
 ## Limitations
-This plugin only attempts to find simple static forwarding loops at start up time. To detect a loop, all of the following must be true
+This plugin only attempts to find simple static forwarding loops at start up time. To detect a loop,
+the following must be true:
-* the loop must be present at start up time.
-* the loop must occur for at least the `HINFO` query type.
+* the loop must be present at start up time.
+* the loop must occur for the `HINFO` query type.
 ## Troubleshooting
-When CoreDNS logs contain the message `Forwarding loop detected ...`, this means that
-the `loop` detection plugin has detected an infinite forwarding loop in one of the upstream
-DNS servers. This is a fatal error because operating with an infinite loop will consume
-memory and CPU until eventual out of memory death by the host.
+When CoreDNS logs contain the message `Loop ... detected ...`, this means that the `loop` detection
+plugin has detected an infinite forwarding loop in one of the upstream DNS servers. This is a fatal
+error because operating with an infinite loop will consume memory and CPU until eventual out of
+memory death by the host.
 A forwarding loop is usually caused by:
@@ -64,6 +66,7 @@ to another DNS server that is forwarding requests back to CoreDNS. If `proxy` or
 using a file (e.g. `/etc/resolv.conf`), make sure that file does not contain local addresses.
 ### Troubleshooting Loops In Kubernetes Clusters
 When a CoreDNS Pod deployed in Kubernetes detects a loop, the CoreDNS Pod will start to "CrashLoopBackOff".
 This is because Kubernetes will try to restart the Pod every time CoreDNS detects the loop and exits.

@@ -19,6 +19,7 @@ type Loop struct {
 	zone string
 	qname string
+	addr string
 	sync.RWMutex
 	i int
@@ -49,7 +50,7 @@ func (l *Loop) ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg) (
 	}
 	if l.seen() > 2 {
-		log.Fatalf("Forwarding loop detected in \"%s\" zone. Exiting. See https://coredns.io/plugins/loop#troubleshooting. Probe query: \"HINFO %s\".", l.zone, l.qname)
+		log.Fatalf(`Loop (%s -> %s) detected for zone %q, see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO %s"`, state.RemoteAddr(), l.address(), l.zone, l.qname)
 	}
 	return plugin.NextOrFailure(l.Name(), l.Next, ctx, w, r)
@@ -94,3 +95,15 @@ func (l *Loop) disabled() bool {
 	defer l.RUnlock()
 	return l.off
 }
+
+func (l *Loop) setAddress(addr string) {
+	l.Lock()
+	defer l.Unlock()
+	l.addr = addr
+}
+
+func (l *Loop) address() string {
+	l.RLock()
+	defer l.RUnlock()
+	return l.addr
+}

@@ -41,6 +41,7 @@ func setup(c *caddy.Controller) error {
 	addr := net.JoinHostPort(lh, conf.Port)
 	for time.Now().Before(deadline) {
+		l.setAddress(addr)
 		if _, err := l.exchange(addr); err != nil {
 			l.reset()
 			time.Sleep(1 * time.Second)