Not sure why this is proving so difficult.. pointers are hard? [Was
tempted to rollback all tweaks here, but the original issue we're fixing
it too important to not have a proper fix].
But we need to make a copy of the message at the earliest point in the
handler because we are changing it (adding an opt rr). If we do this on
the original message (which is a pointer) we change it (obvs). When
undoing those changes we do work on a copy.
Re: testing. There isn't a explicit test for this, so I've added on to
the top-level test/ directory, which indeed makes the issue visible:
master:
~~~
go test -v -run=TestLookupCacheWithoutEdns
=== RUN TestLookupCacheWithoutEdns
cache_test.go:154: Expected no OPT RR, but got:
;; OPT PSEUDOSECTION:
; EDNS: version 0; flags: do; udp: 2048
--- FAIL: TestLookupCacheWithoutEdns (0.01s)
FAIL
~~~
This branch:
~~~
% go test -v -run=TestLookupCacheWithoutEdns
=== RUN TestLookupCacheWithoutEdns
--- PASS: TestLookupCacheWithoutEdns (0.01s)
PASS
ok github.com/coredns/coredns/test 0.109s
~~~
Signed-off-by: Miek Gieben <miek@miek.nl>
By checking state.Do() were are checking if the request had DO, but
we are _always_ adding Do now - do we need to save the DO from the
ORIGINAL request, which must be done in the ResponseWriter.
Also skip OPT records in filterDNSSEC as we can't set the TTL on those
records, this prevents writing a number to OPT's MBZ.
Note none of the tests have changed and still PASS. This is due to
the fact that CoreDNSServerAndPorts isn't a full server as we start in
main, it lacks the scrubwriter for instance. This is not bad per se, but
should be documented in the test code.
Signed-off-by: Miek Gieben <miek@miek.nl>
The filtering of DNSSEC records in the cache plugin was not done
correctly. Also the change to introduced this bug didn't take into
account that the cache - by virtue of differentiating between DNSSEC and
no-DNSSEC - relied on not copying the data from the cache.
This change copies and then filters the data and factors the filtering
into a function that is used in two places (albeit with on ugly boolean
parameters to prevent copying things twice).
Add tests, do_test.go is moved to test/cache_test.go because the OPT
handing is done outside of the cache plugin. The core server re-attaches
the correct OPT when replying, so that makes for a better e2e test.
Added small unit test for filterRRslice and an explicit test that asks
for DNSSEC first and then plain, and vice versa to test cache behavior.
Fixes: #4146
Signed-off-by: Miek Gieben <miek@miek.nl>
* cache: default to DNSSEC
This change does away with the DNS/DNSSEC distinction the cache
currently makes. Cache will always make coredns perform a DNSSEC query
and store that result. If a client just needs plain DNS, the DNSSEC
records are stripped from the response.
It should also be more memory efficient, because we store a reply once
and not one DNS and another for DNSSEC.
Fixes: #3836
Signed-off-by: Miek Gieben <miek@miek.nl>
* Change OPT RR when one is present in the msg.
Signed-off-by: Miek Gieben <miek@miek.nl>
* Fix comment for isDNSSEC
Signed-off-by: Miek Gieben <miek@miek.nl>
* Update plugin/cache/handler.go
Co-authored-by: Chris O'Haver <cohaver@infoblox.com>
* Update plugin/cache/item.go
Co-authored-by: Chris O'Haver <cohaver@infoblox.com>
* Code review; fix comment for isDNSSEC
Signed-off-by: Miek Gieben <miek@miek.nl>
* Update doc and set AD to false
Set Authenticated Data to false when DNSSEC was not wanted. Also update
the readme with the new behavior.
Signed-off-by: Miek Gieben <miek@miek.nl>
* Update plugin/cache/handler.go
Co-authored-by: Chris O'Haver <cohaver@infoblox.com>
Co-authored-by: Chris O'Haver <cohaver@infoblox.com>
* Fix some typos
Corect some words for reading more easily
* Update NOERROR response code
NOERROR is a response code so I revert the typo checking for it
Remove some optimization and lowercasing of the qname (in the end
miekg/dns should provide a fast and OK function for it).
* remove the make([]byte, 2) allocation in the key()
* use already lowercased qname in hash key calculation.
% benchcmp old.txt new.txt
benchmark old ns/op new ns/op delta
BenchmarkCacheResponse-4 9599 8735 -9.00%
Signed-off-by: Miek Gieben <miek@miek.nl>
* Fix max-age in http server
Move the minMsgTTL to dnsutil and rename it MinimalTTL, move some
constants there as well.
Use these new function in server_https to correctly set the max-age
HTTP header.
Fixes: #1823
* Linter
The default dns.Response implementation of a dns.ResponseWriter will
panic if RemoteAddr() is called after the connection to the client has
been closed already. The current cache implementation doesn't create a
new request+responsewriter during an asynchronous prefetch, but
piggybacks on the request triggering the prefetch.
This change copies the RemoteAddr first, so that it's safe to use it
later during the actual prefetch request.
A better implementation would be to completely decouple the prefetch
request from the client triggering a request.
* plugin/cache: per server metrics
Use per server metrics in the cache plugin as well. This required
some plumbing changes. Also use request.Request more.
* fix cherry-pick
* update docs
* plugins: use plugin specific logging
Hooking up pkg/log also changed NewWithPlugin to just take a string
instead of a plugin.Handler as that is more flexible and for instance
the Root "plugin" doesn't implement it fully.
Same logging from the reload plugin:
.:1043
2018/04/22 08:56:37 [INFO] CoreDNS-1.1.1
2018/04/22 08:56:37 [INFO] linux/amd64, go1.10.1,
CoreDNS-1.1.1
linux/amd64, go1.10.1,
2018/04/22 08:56:37 [INFO] plugin/reload: Running configuration MD5 = ec4c9c55cd19759ea1c46b8c45742b06
2018/04/22 08:56:54 [INFO] Reloading
2018/04/22 08:56:54 [INFO] plugin/reload: Running configuration MD5 = 9e2bfdd85bdc9cceb740ba9c80f34c1a
2018/04/22 08:56:54 [INFO] Reloading complete
* update docs
* better doc
* Update all plugins to use plugin/pkg/log
I wish this could have been done with sed. Alas manually changed all
callers to use the new plugin/pkg/log package.
* Error -> Info
* Add docs to debug plugin as well
* plugin/{cache,forward,proxy}: don't allow responses that are bogus
Responses that are not matching what we've been querying for should be
dropped. They are converted into FormErrs by forward and proxy; as a 2nd
backstop cache will also not cache these.
* plug
* add explicit test
* Revert "pkg/typify: empty messages are OtherError (#1531)"
This reverts commit fc1d73ffa9.
* plugin/cache: add failsafeTTL
If we can not see what TTL we should put on a message to be cached, use
5 seconds as minimal TTL. We used to apply the maximum TTL to these
messages.
Messages with nothing in them are considered OtherError, they can not
serve any purpose for normal clients (i.e. dyn update or notifies might
have a use for them).
Also update a test in the cache plugin, so that we explicitaly test for
this case.
* Improve plugin/cache metrics
* Add coredns_cache_prefetch_total metric to track number of prefetches.
* Remove unnecessary Cache.get() call which would incorrectly increment
cache counters.
* Initialize all counters and gauges at zero.
* Allow prefetching of a single request per ttl
The original implementation didn't allow prefetching queries which are
only requested once during the duration of a TTL. The minimum amount of
queries which had to be seen was therefore capped at 2.
This change also implements a real prefetch test. The existing test was
a noop and always passed regardless of any prefetch implementation.
* Fix prefetching for items with a short TTL
The default prefetch threshold (percentage) is 10% of the lifetime of a
cache item. With the previous implementation, this disabled prefetching
for all items with a TTL < 10s (the resulting percentage would be 0, at
which point a cached item is already discarded).
This change uses a time based threshold calculation and ensures that
a prefetch is triggered at a TTL of 1 at the latest.
* Fix wrong duration reporting of cached responses
The logging and metrics plugins (among others) included the duration of
a cache prefetch in the request latency of client request. This change
fixes this wrong reporting and executes the prefetch request in a
goroutine in the background.
The cache plugin always returned a minimum TTL of 5 seconds, regardless
of the actual TTL of the records. A cache is not authoritative for the
record TTL and should not extend it.
Cache would let the first response through and would then cap subsequent
ones to whatever the cache duration was. This would lead to huge drops
in TTL values: 3600 -> 20 for instance, which is not only bad, but can
mess up your careful TTL planning business.
This PR fixes that and applies the cache duration to all replies. As a
bonus I could remove a time.Sleep() from the cache test and just check
for the cache duration as the TTL on the reply.
Fixes#1038
* Rename middleware to plugin
first pass; mostly used 'sed', few spots where I manually changed
text.
This still builds a coredns binary.
* fmt error
* Rename AddMiddleware to AddPlugin
* Readd AddMiddleware to remain backwards compat