Commit graph

24 commits

Author SHA1 Message Date
Miek Gieben
c840caf1ef
Speed up testing (#4239)
* Speed up testing

* make notification run in the background, this recudes the test_readme
time from 18s to 0.10s
* reduce time for zone reload

* TestServeDNSConcurrent remove entirely. This took a whopping 58s for
  ... ? A few minutes staring didn't reveal wth it is actually testing.
  Making values smaller revealed race conditions in the tests. Remove
  entirely.

* Move many interval values to variables so we can reset them to short
  values for the tests.

* test_large_axfr: make the zone smaller. The number used 64K has no
  rational, make it 64/10 to speed up.
* TestProxyThreeWay: use client with shorter timeout

A few random tidbits in other tests.

Total time saved: 177s (almost 3m) - which makes it worthwhile again to
run the test locally:

this branch:

~~~
ok  	github.com/coredns/coredns/test	10.437s
cd plugin; time go t ./...
5,51s user 7,51s system 11,15s elapsed 744%CPU (
~~~

master:

~~~
ok  	github.com/coredns/coredns/test	35.252s
cd plugin; time go t ./...
157,64s user 15,39s system 50,05s elapsed 345%CPU ()
~~~
tests/ -25s
plugins/ -40s

This brings the total on 20s, and another 10s can be saved by fixing
dnstapio. Moving this to 5s would be even better, but 10s is also nice.

Signed-off-by: Miek Gieben <miek@miek.nl>

* Also 0.01

Signed-off-by: Miek Gieben <miek@miek.nl>
2020-10-30 10:27:04 +01:00
Christian Tryti
116bda4d27
Add configuration flag to set if RecursionDesired should be set on health checkers in Forward-plugin (#3679)
* Make the RD-flag in health-checks in the Forward-plugin configurable

Introduces a new configuration flag; `health_check_non_recursive`. This
flag makes the health-checker do non-recursive requests when checking
the health of upstream servers.

Signed-off-by: Geir Haugom <ghagit@haugom.org>
Signed-off-by: Christian Tryti <ctryti@gmail.com>

* Changes after feedback from reviewer

* Better tests of health-checks with and without recursion
* Removed the health_check_non_recursive configuration in favor of
extending the existing health_check configuration. Now supports an
optional `no_rec` argument.

Signed-off-by: Christian Tryti <ctryti@gmail.com>

* Add new test that checks setup of health_check.

Signed-off-by: Christian Tryti <ctryti@gmail.com>
2020-03-06 11:52:43 +01:00
Miek Gieben
2d98d520b5
plugin/forward: make Yield not block (#3336)
* plugin/forward: may Yield not block

Yield may block when we're super busy with creating (and looking) for
connection. Set a small timeout on Yield, to skip putting the connection
back in the queue.

Use persistentConn troughout the socket handling code to be more
consistent.

Signed-off-by: Miek Gieben <miek@miek.nl>

Dont do

Signed-off-by: Miek Gieben <miek@miek.nl>

* Set used in Yield

This gives one central place where we update used in the persistConns

Signed-off-by: Miek Gieben <miek@miek.nl>
2019-10-01 16:39:42 +01:00
Miek Gieben
dbd1c047cb
Run gostaticheck (#3325)
* Run gostaticheck

Run gostaticcheck on the codebase and fix almost all flagged items.

Only keep

* coremain/run.go:192:2: var appVersion is unused (U1000)
* plugin/chaos/setup.go:54:3: the surrounding loop is unconditionally terminated (SA4004)
* plugin/etcd/setup.go:103:3: the surrounding loop is unconditionally terminated (SA4004)
* plugin/pkg/replacer/replacer.go:274:13: argument should be pointer-like to avoid allocations (SA6002)
* plugin/route53/setup.go:124:28: session.New is deprecated: Use NewSession functions to create sessions instead. NewSession has the same functionality as New except an error can be returned when the func is called instead of waiting to receive an error until a request is made.  (SA1019)
* test/grpc_test.go:25:69: grpc.WithTimeout is deprecated: use DialContext and context.WithTimeout instead.  Will be supported throughout 1.x.  (SA1019)

The first one isn't true, as this is set via ldflags. The rest is
minor. The deprecation should be fixed at some point; I'll file some
issues.

Signed-off-by: Miek Gieben <miek@miek.nl>

* Make sure to plug in the plugins

import the plugins, that file that did this was removed, put it in the
reload test as this requires an almost complete coredns server.

Signed-off-by: Miek Gieben <miek@miek.nl>
2019-10-01 07:41:29 +01:00
li mengyang
50bac4d3c3 fix: delete unused var and const (#3294)
Signed-off-by: hwdef <hwdef97@gmail.com>
2019-09-24 07:06:37 +01:00
Miek Gieben
a1d92c51cd
plugin/forward: remove dynamic read timeout (#2319)
* plugin/forward: remove dynamic read timeout

We care about an upstream being there, so we still have a dynamic dial
time out (by way higher then 200ms) of 1s; this should be fairly stable
for an upstream. The read timeout if more variable because of cached and
non cached responses. As such remove his logic entirely.

Drop to 2s read timeout.

Fixes #2306

Signed-off-by: Miek Gieben <miek@miek.nl>
2018-11-20 08:48:56 +01:00
Ruslan Drozhdzh
298b860a97 plugin/forward: fix healthchecker crash (#2165) 2018-10-09 20:50:30 +01:00
Miek Gieben
c349446a23
Cleanup ParseHostOrFile (#2100)
Create plugin/pkg/transport that holds the transport related functions.
This needed to be a new pkg to prevent cyclic import errors.

This cleans up a bunch of duplicated code in core/dnsserver that also
tried to parse a transport (now all done in transport.Parse).

Signed-off-by: Miek Gieben <miek@miek.nl>
2018-09-19 07:29:37 +01:00
Miek Gieben
a536833546
plugin/forward: add HealthChecker interface (#1950)
* plugin/forward: add HealthChecker interface

Make the HealthChecker interface and morph the current DNS health
checker into that interface.

Remove all whole bunch of method on Forward that didn't make sense.

This is done in preparation of adding a DoH client to forward - which
requires a completely different healthcheck implementation (and more,
but lets start here)

Signed-off-by: Miek Gieben <miek@miek.nl>

* Use protocol

Signed-off-by: Miek Gieben <miek@miek.nl>

* Dial doesnt need to be method an Forward either

Signed-off-by: Miek Gieben <miek@miek.nl>

* Address comments

Address various comments on the PR.

Signed-off-by: Miek Gieben <miek@miek.nl>
2018-07-09 15:14:55 +01:00
Tobias Schmidt
422aec5f5f plugin/forward: Increase minimum read timeout to 200ms (#1889)
After several experiments at SoundCloud we found that the current
minimum read timeout of 10ms is too low. A single request against a
slow/unavailable authoritative server can cause all TCP connections to
get closed. We record a 50th percentile forward/proxy latency of <5ms,
and a 99th percentile latency of 60ms. Using a minimum timeout of 200ms
seems to be a fair trade-off between avoiding unnecessary high
connection churn and reacting to upstream failures in a timely manner.

This change also renames hcDuration to hcInterval to reflect its usage,
and removes the duplicated timeout constant to make code comprehension
easier.
2018-06-21 11:40:19 +01:00
Francois Tur
70c957d885 Plugin/Forward - autotune the dialTimeout for connection (#1852)
* - implement an auto-tunable dialTimeout for fallback.

* - fix gofmt

* - factorized timeout computation with readTimeout / updated readme /

* - fix comment
2018-06-15 07:37:22 +01:00
Miek Gieben
751a08d6a2
plugin/forward: fix alignment for sync.Atomic (#1855)
These must be alligned on 8 bytes, in Go this means putting them first
in the struct (AFAICT).
2018-06-05 17:21:09 +01:00
Ruslan Drozhdzh
833e3ddaf0 plugin/forward: erase expired connections by timer (#1782)
* plugin/forward: erase expired connection by timer

 - in previous implementation, the expired connections resided in
   cache until new request to the same upstream/protocol came. In
   case if the upstream was unhealthy new request may come long time
   later or may not come at all. All this time expired connections
   held system resources (file descriptors, ephemeral ports). In my
   fix the expired connections and related resources are released
   by timer
 - decreased the complexity of taking connection from cache. The list
   of connections is treated as stack (LIFO queue), i.e. the connection
   is taken from the end of queue (the most fresh connection) and
   returned to the end (as it was implemented before). The remarkable
   thing is that all connections in the stack appear to be ordered by
   'used' field
 - the cleanup() method finds the first good (not expired) connection
   in stack with binary search, since all connections are ordered by
   'used' field

* fix race conditions

* minor enhancement

* add comments
2018-05-25 23:00:11 +01:00
Ruslan Drozhdzh
7ac507d9ff plugin/forward: close connection manager in proxy finalizer (#1768)
- connManager() goroutine will stop when Proxy is about to be
   garbage collected. This means that no queries are in progress,
   and no queries are going to come
2018-05-18 07:46:14 +01:00
Eugen Kleiner
b9f0d55fc9 plugin/forward: expose TLSConfig and error messages to public (#1781)
* plugin/forward: expose TLSConfig and error messages to public

* Add IsTLS() instead of TLSConfig()
2018-05-09 12:41:14 +01:00
Eugen Kleiner
be8fcc484a plugin/forward: expose few methods and attributes to public (#1766)
* plugin/forward: expose few methods and attributes to public

* Update comments
2018-05-04 07:47:26 +02:00
Miek Gieben
270da82995
plugin/forward: move Dial goroutine out (#1738)
Rework the TestProxyClose - close the proxy in the *same* goroutine
as where we started it. Close channels as long as we don't get dataraces
(this may need another fix).

Move the Dial goroutine out of the connManager - this simplifies things
*and* makes another goroutine go away and removes the need for connErr
channels - can now just be dns.Conn.

Also:

Revert "plugin/forward: gracefull stop (#1701)"
This reverts commit 135377bf77.

Revert "rework TestProxyClose (#1735)"
This reverts commit 9e8893a0b5.
2018-04-26 09:34:58 +01:00
Miek Gieben
ce084012df
plugin/forward: fix TLS setup (#1714)
* plugin/forward: fix TLS setup

Way smaller PR than #1679. Fixes same thing.

* remove println

* put overwritten test back

* context

* update tests
2018-04-24 18:18:26 +01:00
Ruslan Drozhdzh
135377bf77
plugin/forward: gracefull stop (#1701)
* plugin/forward: gracefull stop

 - stop connection manager only when no queries in progress

* minor improvement

* prevent healthcheck on stopped proxy

* revert closing channels

* use standard context
2018-04-20 17:47:46 +03:00
Miek Gieben
573ad62b77 plugin/forward: min and max for avgRTT (#1680)
* Move to readtimeout

* lets compile

* address comment

* comment from pr

* much smaller minimum
2018-04-16 14:51:49 -04:00
Ruslan Drozhdzh
a20b4fe2de plugin/forward: use dynamic read timeout (#1659)
- each proxy stores average RTT (round trip time) of last rttCount queries.
   For now, I assigned the value 4 to rttCount
 - the read timeout is calculated as doubled average RTT, but it cannot
   exceed default timeout
 - initial avg RTT is set to a half of default timeout, so initial timeout
   is equal to default timeout
 - the RTT for failed read is considered equal to default timeout, so any
   failed read will lead to increasing average RTT (up to default timeout)
 - dynamic timeouts will let us react faster on lost UDP packets
 - in future, we may develop a low-latency forward policy based on
   collected RTT values of proxies
2018-04-11 07:50:06 +01:00
Ruslan Drozhdzh
e46ee9d9cc plugin/forward: retry on cached tcp connection closed by peer (#1655)
* plugin/forward: retry on cached tcp connection closed by peer

* fix linter warnings

* fixed unit test

* modify error message
2018-04-06 13:41:48 +01:00
Miek Gieben
16504234e5
plugin/forward using pkg/up (#1493)
* plugin/forward: on demand healtchecking

Only start doing health checks when we encouner an error (any error).
This uses the new pluing/pkg/up package to abstract away the actual
checking. This reduces the LOC quite a bit; does need more testing, unit
testing and tcpdumping a bit.

* fix tests

* Fix readme

* Use pkg/up for healthchecks

* remove unused channel

* more cleanups

* update readme

* * Again do go generate and go build; still referencing the wrong forward
  repo? Anyway fixed.
* Use pkg/up for doing the healtchecks to cut back on unwanted queries
  * Change up.Func to return an error instead of a boolean.
  * Drop the string target argument as it doesn't make sense.
* Add healthcheck test on failing to get an upstream answer.

TODO(miek): double check Forward and Lookup and how they interact with
HC, and if we correctly call close() on those

* actual test

* Tests here

* more tests

* try getting rid of host

* Get rid of the host indirection

* Finish removing hosts

* moar testing

* import fmt

* field is not used

* docs

* move some stuff

* bring back health_check

* maxfails=0 test

* git and merging, bah

* review
2018-02-15 10:21:57 +01:00
Miek Gieben
5b844b5017
plugin/forward: add it (#1447)
* plugin/forward: add it

This moves coredns/forward into CoreDNS. Fixes as a few bugs, adds a
policy option and more tests to the plugin.

Update the documentation, test IPv6 address and add persistent tests.

* Always use random policy when spraying

* include scrub fix here as well

* use correct var name

* Code review

* go vet

* Move logging to metrcs

* Small readme updates

* Fix readme
2018-02-05 22:00:47 +00:00