* Add forwardcrd plugin README.md
Co-authored-by: Aidan Obley <aobley@vmware.com>
Signed-off-by: Christian Ang <angc@vmware.com>
* Create forwardcrd plugin
- Place forwardcrd before the forward plugin in the plugin list. This keeps
forward from preventing the forwardcrd plugin from handling any queries
when a server block has a default upstream forwarder (as is the case in
the default kubernetes Corefile).
Co-authored-by: Aidan Obley <aobley@vmware.com>
Signed-off-by: Christian Ang <angc@vmware.com>
* Add Forward CRD
Signed-off-by: Christian Ang <angc@vmware.com>
* Add NewWithConfig to forward plugin
- allows external packages to instantiate forward plugins
Co-authored-by: Aidan Obley <aobley@vmware.com>
Signed-off-by: Christian Ang <angc@vmware.com>
* ForwardCRD plugin handles requests for Forward CRs
- add a Kubernetes controller that can read Forward CRs
- instances of the forward plugin are created based on Forward CRs from
the Kubernetes controller
- DNS requests are handled by calling matching Forward plugin instances
based on zone name
- Defaults to the kube-system namespace to align with Corefile RBAC
Signed-off-by: Christian Ang <angc@vmware.com>
Use klog v2 in forwardcrd plugin
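The zone-based dispatch described in this commit can be sketched roughly as follows. This is an illustration only: `matchZone` is a hypothetical helper, and picking the longest matching zone is an assumption, since the commit message only says that instances are matched "based on zone name".

```go
package main

import (
	"fmt"
	"strings"
)

// matchZone returns the most specific zone that contains qname.
// Both qname and the zones are expected to be fully qualified (trailing dot).
// Longest-match selection is an assumption made for this sketch.
func matchZone(qname string, zones []string) (string, bool) {
	qname = strings.ToLower(qname)
	best, found := "", false
	for _, z := range zones {
		z = strings.ToLower(z)
		// A name is in a zone if it equals the zone or ends with it on a
		// label boundary; the root zone "." contains every name.
		inZone := z == "." || qname == z || strings.HasSuffix(qname, "."+z)
		if inZone && (!found || len(z) > len(best)) {
			best, found = z, true
		}
	}
	return best, found
}

func main() {
	zones := []string{".", "org.", "example.org."}
	for _, q := range []string{"www.example.org.", "badexample.org.", "example.com."} {
		zone, ok := matchZone(q, zones)
		fmt.Println(q, zone, ok)
	}
}
```

Matching on a label boundary keeps a name such as `badexample.org.` out of the `example.org.` zone.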
* Refactor forward setup to use NewWithConfig
Co-authored-by: Christian Ang <angc@vmware.com>
Signed-off-by: Edwin Xie <exie@vmware.com>
* Use ParseInt instead of Atoi
- to ensure that the bitsize is 32 for later casting to uint32
Signed-off-by: Christian Ang <angc@vmware.com>
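A small sketch of why the bit size matters: `strconv.Atoi` parses into a platform-sized `int`, so on a 64-bit platform a value such as 4294967296 parses fine and would silently wrap when cast to `uint32`. `strconv.ParseInt` with a bit size of 32 rejects it instead. The helper name `parseUint32` and the negative-value check are assumptions for this example, not code from the change.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseUint32 parses s as a base-10 integer that must fit in 32 bits,
// so the later conversion to uint32 cannot silently wrap around.
func parseUint32(s string) (uint32, error) {
	n, err := strconv.ParseInt(s, 10, 32)
	if err != nil {
		return 0, err // includes out-of-range values such as 4294967296
	}
	if n < 0 {
		// Assumption for this sketch: negative values are not meaningful.
		return 0, fmt.Errorf("value %d must not be negative", n)
	}
	return uint32(n), nil
}

func main() {
	for _, s := range []string{"3", "4294967296", "-1"} {
		n, err := parseUint32(s)
		fmt.Println(s, n, err)
	}
}
```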
* Add @christianang to CODEOWNERS for forwardcrd
Signed-off-by: Christian Ang <angc@vmware.com>
Co-authored-by: Edwin Xie <exie@vmware.com>
* Speed up testing
* make notification run in the background, this reduces the test_readme
time from 18s to 0.10s
* reduce time for zone reload
* TestServeDNSConcurrent: remove entirely. This took a whopping 58s for
... ? A few minutes of staring didn't reveal what it is actually testing.
Making values smaller revealed race conditions in the tests, so remove it
entirely.
* Move many interval values to variables so we can reset them to short
values for the tests.
* test_large_axfr: make the zone smaller. The number used, 64K, has no
rationale; make it 64/10 to speed up.
* TestProxyThreeWay: use client with shorter timeout
A few random tidbits in other tests.
Total time saved: 177s (almost 3m) - which makes it worthwhile again to
run the test locally:
this branch:
~~~
ok github.com/coredns/coredns/test 10.437s
cd plugin; time go t ./...
5,51s user 7,51s system 11,15s elapsed 744%CPU (
~~~
master:
~~~
ok github.com/coredns/coredns/test 35.252s
cd plugin; time go t ./...
157,64s user 15,39s system 50,05s elapsed 345%CPU ()
~~~
tests/ -25s
plugins/ -40s
This brings the total to 20s, and another 10s can be saved by fixing
dnstapio. Moving this to 5s would be even better, but 10s is also nice.
Signed-off-by: Miek Gieben <miek@miek.nl>
* Also 0.01
Signed-off-by: Miek Gieben <miek@miek.nl>
* plugin/dnstap: various cleanups
A recent issue made me look into this plugin, I suspect various other
cleanups (hopefully deletion of code) can be made as well
Remove identical functions ToClientQuery etc, and just use tap.Message
as the base type in plugin. Keep msg/ for a few helper functions that
may prove useful.
This removes the whole test directory as we will just check the things we
are interested in, which gives much better feedback and keeps that code
closer together.
tapwr dir is also not needed, writer_test.go was just duplicating the
tests already done. This moves writer.go to the top directory.
Make the only user of dnstap, the forward plugin, use the newer code.
Also remove the test; a better test there would be a full e2e test to
see that the correct thing happens.
Clean up the Tapper interface and move it to dnstapio where it belongs,
remove higher level interfaces that are not used. This removes
dnstap.Tapper and dnstap.IORoutines.
Use the standard mechanism for getting access to a plugin and remove
shuffling the plugin into the context.
Signed-off-by: Miek Gieben <miek@miek.nl>
* use opts to get the correct proto
Signed-off-by: Miek Gieben <miek@miek.nl>
* Various fixes
Signed-off-by: Miek Gieben <miek@miek.nl>
* Remove bad addr test, as dnstap is only called from within coredns where these fields have been preparsed
Signed-off-by: Miek Gieben <miek@miek.nl>
* dnstap: remove saving the error
all these fields have been preparsed, no need for dnstap to be pedantic
and check (and save!) this error again.
Simplifies it a bit more.
Signed-off-by: Miek Gieben <miek@miek.nl>
* Update plugin/forward/dnstap.go
Co-authored-by: Ruslan Drozhdzh <30860269+rdrozhdzh@users.noreply.github.com>
* Code review
Signed-off-by: Miek Gieben <miek@miek.nl>
* add back in preferUDP
Signed-off-by: Miek Gieben <miek@miek.nl>
* nit
Signed-off-by: Miek Gieben <miek@miek.nl>
Co-authored-by: Ruslan Drozhdzh <30860269+rdrozhdzh@users.noreply.github.com>
* revert de-dup
Signed-off-by: Chris O'Haver <cohaver@infoblox.com>
* unit test
Signed-off-by: Chris O'Haver <cohaver@infoblox.com>
* use roundrobin policy in test
Signed-off-by: Chris O'Haver <cohaver@infoblox.com>
* Make the RD-flag in health-checks in the Forward-plugin configurable
Introduces a new configuration flag; `health_check_non_recursive`. This
flag makes the health-checker do non-recursive requests when checking
the health of upstream servers.
Signed-off-by: Geir Haugom <ghagit@haugom.org>
Signed-off-by: Christian Tryti <ctryti@gmail.com>
* Changes after feedback from reviewer
* Better tests of health-checks with and without recursion
* Removed the health_check_non_recursive configuration in favor of
extending the existing health_check configuration. Now supports an
optional `no_rec` argument.
Signed-off-by: Christian Tryti <ctryti@gmail.com>
* Add new test that checks setup of health_check.
Signed-off-by: Christian Tryti <ctryti@gmail.com>
Upgrade to new dns lib version; that saw multiple improvements; some
patch releases are in the pipeline.
The big thing here is the removal of ErrTruncated, so we need to deal
with this slightly differently in the forward plugin. This removes the
entire truncated.go logic and just checks the message for .Truncated (if
there is a message) and retries with tcp.
Signed-off-by: Miek Gieben <miek@miek.nl>
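The retry described above might look roughly like this. The sketch deliberately avoids the real dns library: `reply`, `exchangeFunc` and `exchange` are stand-in names invented for illustration, and only the control flow (check the truncated flag if a message came back, then retry over tcp) is taken from the commit message.

```go
package main

import "fmt"

// reply is a stand-in for a DNS message; only the truncated flag matters here.
type reply struct {
	Truncated bool
	Answer    string
}

// exchangeFunc is a hypothetical transport hook: it sends the query over
// the given protocol ("udp" or "tcp") and returns the upstream's reply.
type exchangeFunc func(proto string) (*reply, error)

// exchange tries udp first and retries once over tcp when a reply came
// back with the truncated flag set.
func exchange(ex exchangeFunc) (*reply, error) {
	r, err := ex("udp")
	if r != nil && r.Truncated {
		return ex("tcp")
	}
	return r, err
}

func main() {
	// Fake upstream: truncates over udp, answers in full over tcp.
	upstream := func(proto string) (*reply, error) {
		return &reply{Truncated: proto == "udp", Answer: "answered over " + proto}, nil
	}
	r, err := exchange(upstream)
	fmt.Println(r.Answer, err)
}
```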
Every plugin needs to deal with EDNS0 and should call Scrub to make a
message fit the client's buffer. Move this functionality into the server
by wrapping the ResponseWriter in a ScrubWriter that handles these
bits for us. Result:
Less code and faster, because multiple chained plugins could all be
calling scrub and SizeAndDo - now there is just one place.
Most tests in file/* and dnssec/* needed adjusting because in those unit
tests you don't see OPT RRs anymore. The DNSSEC signer was also looking
at the returned OPT RR to see if it needed to sign - as those are now
added by the server (and thus later), this needed to change slightly.
Scrub itself still exists (for backward compat reasons), but has been
made a noop. Scrub has been renamed to scrub as it should not be used by
external plugins.
Fixes: #2010
Signed-off-by: Miek Gieben <miek@miek.nl>
Allow plugins to dump messages in text pcap to the log. The forward
plugin does this when a reply does not match the query.
If the debug plugin isn't loaded Hexdump and Hexdumpf are noop.
Signed-off-by: Miek Gieben <miek@miek.nl>
After several experiments at SoundCloud we found that the current
minimum read timeout of 10ms is too low. A single request against a
slow/unavailable authoritative server can cause all TCP connections to
get closed. We record a 50th percentile forward/proxy latency of <5ms,
and a 99th percentile latency of 60ms. Using a minimum timeout of 200ms
seems to be a fair trade-off between avoiding unnecessarily high
connection churn and reacting to upstream failures in a timely manner.
This change also renames hcDuration to hcInterval to reflect its usage,
and removes the duplicated timeout constant to make code comprehension
easier.
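As a rough illustration of the new floor, the sketch below clamps a read timeout to a minimum of 200ms. The commit message does not say how the timeout is computed; deriving it from an observed average round-trip time, the `readTimeout` name, and the 2s upper bound are all assumptions for this example.

```go
package main

import (
	"fmt"
	"time"
)

const (
	minTimeout = 200 * time.Millisecond // new minimum from this change
	maxTimeout = 2 * time.Second        // assumed upper bound, illustration only
)

// readTimeout clamps a latency-derived timeout to [minTimeout, maxTimeout].
// Feeding it an average round-trip time is an assumption of this sketch.
func readTimeout(avgRTT time.Duration) time.Duration {
	if avgRTT < minTimeout {
		return minTimeout
	}
	if avgRTT > maxTimeout {
		return maxTimeout
	}
	return avgRTT
}

func main() {
	for _, rtt := range []time.Duration{5 * time.Millisecond, 60 * time.Millisecond, 500 * time.Millisecond} {
		fmt.Println(rtt, "->", readTimeout(rtt))
	}
}
```

With the floor at 200ms, both the 5ms median and the 60ms 99th percentile latencies mentioned above end up with the same timeout.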
* Remove Compress by default
Set Compress = true in Scrub only when the message does not fit the
advertised buffer. Doing compression is expensive, so try to avoid it.
Master vs this branch
pkg: github.com/coredns/coredns/plugin/cache
BenchmarkCacheResponse-2 50000 24774 ns/op
pkg: github.com/coredns/coredns/plugin/cache
BenchmarkCacheResponse-2 100000 21960 ns/op
* and make it compile
Rework the TestProxyClose - close the proxy in the *same* goroutine
as where we started it. Close channels as long as we don't get dataraces
(this may need another fix).
Move the Dial goroutine out of the connManager - this simplifies things
*and* makes another goroutine go away and removes the need for connErr
channels - can now just be dns.Conn.
Also:
Revert "plugin/forward: gracefull stop (#1701)"
This reverts commit 135377bf77.
Revert "rework TestProxyClose (#1735)"
This reverts commit 9e8893a0b5.
* plugin/forward: gracefull stop
- stop connection manager only when no queries in progress
* minor improvement
* prevent healthcheck on stopped proxy
* revert closing channels
* use standard context
* global: move to context
Move from golang.org/x/net/context to std lib's context.
Change done with:
for i in $(grep -l '/context' **/*.go); do sed -e 's|golang.org/x/net/context|context|' -i $i; echo $i; done
for i in **/*.go; do goimports -w $i; done
* drop from dns.pb.go as well
* plugin/forward: TCP conns can be closed
Only when we read and get an io.EOF do we know the conn is closed (for TCP).
If this is the case Dial (again) and retry. Note that this new
connection can also be closed by the upstream; we may want to add a
DialForceNew or something to get a new TCP connection.
Similar to #1624, *but* this is by (TCP) design. We also don't have to
wait for a timeout which makes it easier to reason about.
* Move to forward.go
* doesn't need changing
* plugin/{cache,forward,proxy}: don't allow responses that are bogus
Responses that are not matching what we've been querying for should be
dropped. They are converted into FormErrs by forward and proxy; as a 2nd
backstop cache will also not cache these.
* plug
* add explicit test
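A minimal sketch of what "matching what we've been querying for" could mean. The `question` and `message` types are simplified stand-ins rather than the dns library's types, and the exact set of fields compared (ID, response bit, and the single question's name, type and class) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// question and message are simplified stand-ins for DNS wire structures.
type question struct {
	Name  string
	Type  uint16
	Class uint16
}

type message struct {
	ID       uint16
	Response bool
	Question []question
}

// matches reports whether resp looks like a reply to req: it is marked as
// a response, carries the same ID, and echoes the same single question.
func matches(req, resp message) bool {
	if !resp.Response || resp.ID != req.ID {
		return false
	}
	if len(req.Question) != 1 || len(resp.Question) != 1 {
		return false
	}
	q, r := req.Question[0], resp.Question[0]
	// DNS names compare case-insensitively.
	return strings.EqualFold(q.Name, r.Name) && q.Type == r.Type && q.Class == r.Class
}

func main() {
	req := message{ID: 7, Question: []question{{"example.org.", 1, 1}}}
	good := message{ID: 7, Response: true, Question: []question{{"EXAMPLE.org.", 1, 1}}}
	bogus := message{ID: 7, Response: true, Question: []question{{"example.net.", 1, 1}}}
	fmt.Println(matches(req, good), matches(req, bogus))
}
```

A caller would turn a non-matching reply into a FormErr, as the commit message describes, instead of passing it on.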
* plugin/forward: on demand healthchecking
Only start doing health checks when we encounter an error (any error).
This uses the new plugin/pkg/up package to abstract away the actual
checking. This reduces the LOC quite a bit; does need more testing, unit
testing and tcpdumping a bit.
* fix tests
* Fix readme
* Use pkg/up for healthchecks
* remove unused channel
* more cleanups
* update readme
* Again do go generate and go build; still referencing the wrong forward
repo? Anyway fixed.
* Use pkg/up for doing the healthchecks to cut back on unwanted queries
* Change up.Func to return an error instead of a boolean.
* Drop the string target argument as it doesn't make sense.
* Add healthcheck test on failing to get an upstream answer.
TODO(miek): double check Forward and Lookup and how they interact with
HC, and if we correctly call close() on those
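The on-demand idea (start probing only after an error, and stop once the upstream answers again) can be sketched as below. This is not the pkg/up implementation: the `probe` type, its `Do` method and the retry interval are hypothetical. The only detail taken from the notes above is that the check function returns an error rather than a boolean.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

var errDown = errors.New("upstream still down")

// probe runs at most one background check loop at a time.
type probe struct {
	active int32
}

// Do starts a check loop unless one is already running. The loop calls
// check until it returns nil, sleeping for interval between attempts.
// It reports whether this call started a new loop.
func (p *probe) Do(check func() error, interval time.Duration) bool {
	if !atomic.CompareAndSwapInt32(&p.active, 0, 1) {
		return false // a loop is already probing this upstream
	}
	go func() {
		defer atomic.StoreInt32(&p.active, 0)
		for check() != nil {
			time.Sleep(interval)
		}
	}()
	return true
}

func main() {
	p := &probe{}
	done := make(chan struct{})
	attempts := 0
	// Fake check: fails twice, then succeeds and signals completion.
	check := func() error {
		attempts++
		if attempts < 3 {
			return errDown
		}
		close(done)
		return nil
	}
	// A caller would invoke Do after a failed exchange with the upstream.
	fmt.Println("started:", p.Do(check, 10*time.Millisecond))
	<-done
	fmt.Println("attempts:", attempts)
}
```

Because no loop runs while the upstream is healthy, a working upstream receives no health-check queries at all, which is the reduction in unwanted queries mentioned above.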
* actual test
* Tests here
* more tests
* try getting rid of host
* Get rid of the host indirection
* Finish removing hosts
* moar testing
* import fmt
* field is not used
* docs
* move some stuff
* bring back health_check
* maxfails=0 test
* git and merging, bah
* review
* plugin/forward: add it
This moves coredns/forward into CoreDNS. Fixes a few bugs, adds a
policy option and more tests to the plugin.
Update the documentation, test IPv6 address and add persistent tests.
* Always use random policy when spraying
* include scrub fix here as well
* use correct var name
* Code review
* go vet
* Move logging to metrics
* Small readme updates
* Fix readme