* Metrics: expand coredns_dns_responses_total with plugin label
This adds (somewhat hacky?) code to add a plugin label to the
coredns_dns_responses_total metric. It's completely oblivious to the
plugin as we just check who called the *recorder.WriteMsg method. We use
runtime.Caller with skip values 1, 2 and 3 to get multiple levels of
callers; this should be deep enough, but it depends on the
dns.ResponseWriter wrapping that's occurring.
The metrics README.md is updated and a test is added in
test/metrics_test.go to check that the label is set.
I went through the plugin to see what metrics could be removed, but
actually didn't find any; the plugin pushes out metrics that make sense.
Due to the path fiddling to figure out the plugin name I doubt this
works (out-of-the-box) for external plugins, but I haven't tested that.
Signed-off-by: Miek Gieben <miek@miek.nl>
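As an illustration of the caller-inspection idea described above, here is a minimal sketch. It assumes the plugin name is simply the directory that follows "/plugin/" in the caller's source path; the helper names pluginFromPath and callerPlugin are hypothetical and are not the functions used in CoreDNS.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// pluginFromPath returns the directory name that follows "/plugin/" in a
// source file path, or "" when the path holds no such directory.
// Hypothetical helper, for illustration only.
func pluginFromPath(file string) string {
	const marker = "/plugin/"
	i := strings.LastIndex(file, marker)
	if i < 0 {
		return ""
	}
	rest := file[i+len(marker):]
	j := strings.Index(rest, "/")
	if j < 0 {
		// A file directly under plugin/, not inside a plugin directory.
		return ""
	}
	return rest[:j]
}

// callerPlugin inspects up to three stack frames above its own caller and
// returns the first plugin name it can derive from a frame's file path.
func callerPlugin() string {
	for skip := 1; skip <= 3; skip++ {
		_, file, _, ok := runtime.Caller(skip)
		if !ok {
			break
		}
		if name := pluginFromPath(file); name != "" {
			return name
		}
	}
	return ""
}

func main() {
	fmt.Println(pluginFromPath("/go/src/github.com/coredns/coredns/plugin/whoami/whoami.go"))
	fmt.Printf("%q\n", callerPlugin())
}
```

The fixed depth of three frames is the fragile part: if more dns.ResponseWriter wrappers sit between the plugin and the recorder, the plugin's frame falls outside the window and no label is found.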
* better comment
Signed-off-by: Miek Gieben <miek@miek.nl>
* Update core/dnsserver/server.go
Co-authored-by: dilyevsky <ilyevsky@gmail.com>
* Use [3]string
Signed-off-by: Miek Gieben <miek@miek.nl>
* imports
Signed-off-by: Miek Gieben <miek@miek.nl>
* remove dnstest changes
Signed-off-by: Miek Gieben <miek@miek.nl>
* revert
Signed-off-by: Miek Gieben <miek@miek.nl>
* Add some sleeps to make it less flaky
Signed-off-by: Miek Gieben <miek@miek.nl>
* Revert "Add some sleeps to make it less flaky"
This reverts commit b5c6655196.
* Remove forward when not needed
Signed-off-by: Miek Gieben <miek@miek.nl>
* remove newline
Signed-off-by: Miek Gieben <miek@miek.nl>
Co-authored-by: dilyevsky <ilyevsky@gmail.com>
Clean up a variety of metric issues.
* Eliminate department of redundancy "count_total" naming.
* Use the plural of the unit when appropriate (e.g. "requests").
* Remove label names from metric names where appropriate (e.g. "rcode").
* Simplify request metrics by consolidating the type label into the base
request counter.
* Re-generate man pages.
Signed-off-by: Ben Kochie <superq@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
This fixes a data race on the listener(s) that get started in the
metrics plugins.
It also restores pkg/uniq to its former glory and removes any state being
carried in there; for metrics this means that registry.go has to
replicate that behavior *with* locking (which pkg/uniq doesn't do, or
need).
Also renamed uniqAddr to just u, to make it slightly shorter.
Signed-off-by: Miek Gieben <miek@miek.nl>
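A minimal sketch of what "replicating that behavior with locking" can look like: a mutex-protected set of listener addresses, so that only the first goroutine to claim an address starts a listener for it. The type and method names (addrRegistry, claim) are hypothetical and do not mirror registry.go.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// addrRegistry tracks which listen addresses have been claimed.
// Hypothetical type, for illustration only.
type addrRegistry struct {
	mu    sync.Mutex
	addrs map[string]struct{}
}

func newAddrRegistry() *addrRegistry {
	return &addrRegistry{addrs: make(map[string]struct{})}
}

// claim records addr and reports whether this call was the first to do so.
func (r *addrRegistry) claim(addr string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.addrs[addr]; ok {
		return false
	}
	r.addrs[addr] = struct{}{}
	return true
}

func main() {
	r := newAddrRegistry()
	var wg sync.WaitGroup
	var first int32
	// Several server blocks race to start a listener on the same address.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if r.claim("localhost:9153") {
				atomic.AddInt32(&first, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt32(&first)) // prints 1: exactly one goroutine wins
}
```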
Fix the metrics endpoint on a failed reload; this follows the same lines
as the previous PRs, see e.g. 076b8d4f. Tested with a Corefile with 2
server blocks and metrics enabled, then introducing a syntax error:
~~~
[ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk'
[ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk'
~~~
And then curl-ing the metrics endpoint.
See #2659; this is the last one.
Fixes: #2659
Getting this all right turns out to be tricky; it's also not easily
testable, which is something I should fix.
Signed-off-by: Miek Gieben <miek@miek.nl>
* Add a GaugeVec for enabled plugins monitoring.
Signed-off-by: Jiacheng Xu <xjcmaxwellcjx@gmail.com>
* Add server label and zone label for enable_plugin metric.
* Add a test for PluginEnabled metric
* Add description for enabledPlugin metric.
* Change the description for the enabledPlugin metric.
* Reset the enabledPlugin metric when restarting the server.
* Add the bug section for enabledPlugin metric.
* Remove the resolveTCPAddr
This clears out the remaining map[x]bool usage and moves the bool to an
empty struct.
Two noteworthy other changes:
* EnableChaos in the server is now also exported to make it show up in
the documentation.
* The auto plugin is left as is, because there the boolean is
explicitly set to false to signal 'to-be-deleted' and the key is left
as-is.
Signed-off-by: Miek Gieben <miek@miek.nl>
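For readers unfamiliar with the idiom, a small sketch of a set built on map[string]struct{} instead of map[string]bool. The function name uniqueZones and the zone names are made up for the example.

```go
package main

import "fmt"

// uniqueZones builds a set of zone names. The empty struct value takes no
// space and makes it clear that only key presence carries meaning.
func uniqueZones(zones []string) map[string]struct{} {
	seen := make(map[string]struct{}, len(zones))
	for _, z := range zones {
		seen[z] = struct{}{}
	}
	return seen
}

func main() {
	seen := uniqueZones([]string{"example.org.", "example.com.", "example.org."})
	_, ok := seen["example.org."]
	fmt.Println(len(seen), ok) // prints "2 true"
}
```

A map[string]bool remains the better fit where the value itself is meaningful, as in the auto plugin case noted above, where false marks an entry for deletion.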
* - UT on metrics verifying that all plugins of all blocks have their metrics collectors declared
* - fix error msg
* - redirect the metrics Registry to the one that handles the listener
- allow duplicate metrics collectors on the same Registry (case of the same plugin in 2 blocks listening for metrics on the same address)
* - fix change of signature
* - ensure metrics are cleaned before starting the test (metrics collectors are global vars and are re-used by several tests)
* - I think I fixed this test. Ensure the correct number of hits and clean metrics before the test.
* - fix typo in error msg - proposed at review
* - fix typo in comment
* - remove ResetMetrics functions
- change the way the numeric metrics are tested: take the diff between the beginning and end of the test
* - oops. removing debug logs
* Create test to verify correct listener behavior
* Create Unset function to remove todo items
* Reset address for prometheus listener before restarting
* Add inline documentation for Unset function
* Make shutdownTimeout a constant and change to five seconds
* Revert ForEach behavior in uniq package
* update docs
* plugins: use plugin specific logging
Hooking up pkg/log also changed NewWithPlugin to just take a string
instead of a plugin.Handler, as that is more flexible; for instance,
the Root "plugin" doesn't implement it fully.
Sample logging from the reload plugin:
.:1043
2018/04/22 08:56:37 [INFO] CoreDNS-1.1.1
2018/04/22 08:56:37 [INFO] linux/amd64, go1.10.1,
CoreDNS-1.1.1
linux/amd64, go1.10.1,
2018/04/22 08:56:37 [INFO] plugin/reload: Running configuration MD5 = ec4c9c55cd19759ea1c46b8c45742b06
2018/04/22 08:56:54 [INFO] Reloading
2018/04/22 08:56:54 [INFO] plugin/reload: Running configuration MD5 = 9e2bfdd85bdc9cceb740ba9c80f34c1a
2018/04/22 08:56:54 [INFO] Reloading complete
* update docs
* better doc
* reload: use OnRestart
Close the listener on OnRestart for health and metrics so the default
setup function can set up the listener when the plugin is "starting up".
Lightly tested with some SIGUSR1-ing. Also checked the reload plugin with
this, seems fine:
.com.:1043
.:1043
2018/04/20 15:01:25 [INFO] CoreDNS-1.1.1
2018/04/20 15:01:25 [INFO] linux/amd64, go1.10,
CoreDNS-1.1.1
linux/amd64, go1.10,
2018/04/20 15:01:25 [INFO] Running configuration MD5 = aa8b3f03946fb60546ca1f725d482714
2018/04/20 15:02:01 [INFO] Reloading
2018/04/20 15:02:01 [INFO] Running configuration MD5 = b34a96d99e01db4015a892212560155f
2018/04/20 15:02:01 [INFO] Reloading complete
^C2018/04/20 15:02:06 [INFO] SIGINT: Shutting down
With this corefile:
.com {
    proxy . 127.0.0.1:53
    prometheus :9054
    whoami
    reload
}
. {
    proxy . 127.0.0.1:53
    prometheus :9054
    whoami
    reload
}
The prometheus port was 9053; changed that to 9054 so reload would pick it
up.
From a cursory look it seems this also fixes:
Fixes #1604 #1618 #1686 #1492
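The close-on-restart, listen-on-startup lifecycle described in this commit can be sketched with the standard library alone. The endpoint type and its onStartup/onRestart methods are hypothetical stand-ins for the hooks a plugin registers; they are not the health or metrics plugin code.

```go
package main

import (
	"fmt"
	"net"
)

// endpoint is a hypothetical stand-in for a plugin that owns a listener.
type endpoint struct {
	addr string
	ln   net.Listener
}

// onStartup opens the listener; it runs on first start and after a reload.
func (e *endpoint) onStartup() error {
	ln, err := net.Listen("tcp", e.addr)
	if err != nil {
		return err
	}
	e.ln = ln
	return nil
}

// onRestart closes the listener so the next onStartup can bind again.
func (e *endpoint) onRestart() error {
	if e.ln == nil {
		return nil
	}
	err := e.ln.Close()
	e.ln = nil
	return err
}

func main() {
	// Port 0 asks the kernel for a free port, which keeps the sketch
	// independent of whatever is already listening on the machine.
	e := &endpoint{addr: "127.0.0.1:0"}
	for i := 0; i < 2; i++ {
		if err := e.onStartup(); err != nil {
			fmt.Println("startup failed:", err)
			return
		}
		fmt.Println("listening on", e.ln.Addr())
		if err := e.onRestart(); err != nil {
			fmt.Println("restart failed:", err)
			return
		}
	}
}
```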
* At least make it test
* Use onfinalshutdown
* reload: add reload test
This tests #1604 and right now fails.
* Address review comments
* Add bug section explaining things a bit
* compile tests
* Fix tests
* fixes
* slightly less crazy
* try to make prometheus setup less confusing
* Use ephemeral port for test
* Don't use the listener
* These are shared between goroutines, just use the boolean in the main
structure.
* Fix text in the reload README.
* Set addr to TODO once stopping it
* Morph fturb's comment into test, to test reload and scrape health and
metric endpoint
* Update all plugins to use plugin/pkg/log
I wish this could have been done with sed. Alas manually changed all
callers to use the new plugin/pkg/log package.
* Error -> Info
* Add docs to debug plugin as well
* plugin/metrics: set server address in context
Allow cross server block metrics to co-exist; for this we should label
each metric with the server label. Put this information in the context
and provide a helper function to get it out.
Abstracting this entirely away is difficult, as the released client_go
(0.8.0) doesn't have the CurryWith functions yet. So current use is like
so:
define metric, with server label:
    RcodeCount = prometheus.NewCounterVec(prometheus.CounterOpts{
        Namespace: plugin.Namespace,
        Subsystem: "forward",
        Name:      "response_rcode_count_total",
        Help:      "Counter of requests made per upstream.",
    }, []string{"server", "rcode", "to"})
And report it with the helper function metrics.WithServer:
RcodeCount.WithLabelValues(metrics.WithServer(ctx), rc, p.addr).Add(1)
This leaves most of the code intact, but we need to stop vendoring
prometheus because, again, plugins want to use it. Not vendoring
prometheus makes my forward metrics show up again. The code looks a bit
convoluted, but works:
~~~
c.OnStartup(func() error {
	// Register the collectors only once, even with multiple server blocks.
	once.Do(func() {
		m := dnsserver.GetConfig(c).Handler("prometheus")
		if m == nil {
			return
		}
		if x, ok := m.(*metrics.Metrics); ok {
			x.MustRegister(RequestCount)
			x.MustRegister(RcodeCount)
			x.MustRegister(RequestDuration)
			x.MustRegister(HealthcheckFailureCount)
			x.MustRegister(SocketGauge)
		}
	})
	return nil
})
~~~
prometheus.Handler is deprecated according to the godoc for the package so
instead we're using promhttp.
Additionally, we are exposing the Registry that metrics is using so other
plugins that are not inside of coredns can read the registry. Otherwise, if
we kept using the Default one, there's no way to access that from outside
of the coredns repo since it is vendored.
If external plugins wanted to extend metrics there was no way since
zoneNames couldn't be initialized. Now plugins can call New to get an
instance of Metrics that they can extend.
* Rename middleware to plugin
first pass; mostly used 'sed', few spots where I manually changed
text.
This still builds a coredns binary.
* fmt error
* Rename AddMiddleware to AddPlugin
* Re-add AddMiddleware to remain backwards compatible
2017-09-14 09:36:06 +01:00
Renamed from middleware/metrics/metrics.go