Set the NativeHistogramBucketFactor option in the call to
`NewHistogramVec` to enable Prometheus Native Histograms.
This will store automatically computed sparse buckets in CoreDNS.
If a compatible Prometheus requests native histograms, this data will be
returned instead of the static buckets.
The default factor of 1.05 should provide high-resolution data.
Signed-off-by: SuperQ <superq@gmail.com>
Defaulting to localhost makes things explicit in the CoreDNS code and gives
us valid URIs in the logs.
Signed-off-by: W. Trevor King <wking@tremily.us>
The health endpoint histogram has high cardinality for such a simple
endpoint. Introduce a new "Slim" set of buckets for `/health` to reduce
the metrics load on large deployments, especially those that run
per-node DNS caching services.
Add a metric to count internal health check failures rather than use the
timeout value as a side-effect monitor of the check error. This avoids
incorrectly recording the timeout value when the error is not a timeout
(e.g. connection refused).
Signed-off-by: SuperQ <superq@gmail.com>
Small, trivial cleanup, triggered by a comment claiming that the health
plugin polls other plugins, which isn't true.
* Remove useless newHealth function
* healthParse -> parse
* Remove useless constants
Net deletion of code.
Signed-off-by: Miek Gieben <miek@miek.nl>
* plugin/health: add 'overloaded metrics'
Query our own health endpoint and record (and export as a metric) the
time it takes. The Get has a 5s timeout; when it is reached, the metric
duration is set to 5s. The actual "am I overloaded?" decision is left
to an external entity.
* README
* golint and govet
* and the tests