24 commits

Author SHA1 Message Date
Zou Nengren
4166dcc2fe
using promauto package to ensure all created metrics are properly registered (#4025)
Signed-off-by: zounengren <zounengren@cmss.chinamobile.com>
2020-07-25 08:06:28 -07:00
Miek Gieben
19cfa2960c
Cleanup metrics (#3776)
Cleanup a variety of metric issues.
* Eliminate department of redundancy "count_total" naming.
* Use the plural of the unit when appropriate. (ex, "requests")
* Remove label names from metric names where appropriate. (ex, "rcode")
* Simplify request metrics by consolidating type label into the base
request counter.
* Re-generate man pages.

Signed-off-by: Ben Kochie <superq@gmail.com>

Co-authored-by: Ben Kochie <superq@gmail.com>
2020-03-26 09:17:33 +01:00
Zou Nengren
13fca02316 use pkg/reuseport in rest plugins (#3492)
Automatically submitted.
2019-12-06 10:55:40 +00:00
Miek Gieben
118b0c9408
plugin/metrics: fix data race on listeners (#2835)
This fixes a data race on the listener(s) that get started in the
metrics plugins.

It also restores pkg/uniq to its former glory and removes any state being
carried in there; this means that for metrics, registry.go has to
replicate that behavior *with* locking (which pkg/uniq doesn't do, or
need).

Also renamed uniqAddr to just u, to make it slightly shorter.

Signed-off-by: Miek Gieben <miek@miek.nl>
2019-05-18 18:34:46 +01:00
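The locking described in the message above can be sketched roughly as follows. This is a minimal illustration, not the code in registry.go: the names `addrRegistry` and `getOrAdd`, and the string payload, are hypothetical. It only shows a mutex guarding a map keyed by listen address, so that concurrent setup calls do not race.

```go
package main

import (
	"fmt"
	"sync"
)

// addrRegistry is a hypothetical stand-in for a per-address registry map.
// The mutex makes concurrent lookups and inserts safe.
type addrRegistry struct {
	mu sync.Mutex
	m  map[string]string
}

func newAddrRegistry() *addrRegistry {
	return &addrRegistry{m: make(map[string]string)}
}

// getOrAdd returns the value stored for addr, storing v first if addr is new.
func (r *addrRegistry) getOrAdd(addr, v string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	if old, ok := r.m[addr]; ok {
		return old
	}
	r.m[addr] = v
	return v
}

func main() {
	r := newAddrRegistry()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// All goroutines target the same address; only one value is kept.
			r.getOrAdd("localhost:9153", fmt.Sprintf("registry-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println("entries:", len(r.m))
}
```

Running a program like this under `go run -race` is one way to check that the lock covers every access to the map.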
Miek Gieben
2ef55f805e plugin/metrics: fix failed reload (#2816)
Fix metrics endpoint on a failed reload; this follows the same lines as
the previous PRs, see e.g. 076b8d4f. Test with a Corefile with 2 server
blocks and metrics enabled and then introducing a syntax error:

~~~
[ERROR] Restart failed: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk'
[ERROR] SIGUSR1: starting with listener file descriptors: Corefile:5 - Error during parsing: Unknown directive 'jfkdjk'
~~~

And then curl-ing the metrics endpoint.

See #2659; this is the last one.

Fixes: #2659

Getting this all right turns out to be tricky; also, it's not easily
testable, which is something I should fix.

Signed-off-by: Miek Gieben <miek@miek.nl>
2019-05-13 04:26:05 -07:00
Jiacheng Xu
0e137b23f1 plugin/metrics: Add a metric to monitor which plugin(s) is(are) enabled (#2700)
* Add a GaugeVec for enabled plugins monitoring.

Signed-off-by: Jiacheng Xu <xjcmaxwellcjx@gmail.com>

* Add server label and zone label for enable_plugin metric.

* Add a test for PluginEnabled metric

* Add description for enabledPlugin metric.

* Change the description for the enabledPlugin metric.

* Reset the enabledPlugin metric when restarting the server.

* Add the bug section for enabledPlugin metric.

* Remove the resolveTCPAddr
2019-03-23 09:43:15 +00:00
Miek Gieben
9abbf4a4a0 map bool -> map struct{} (#2386)
This clears out the remaining map[x]bool usage and moves the bool to an
empty struct.

Two noteworthy other changes:

* EnableChaos in the server is now also exported to make it show up in
  the documentation.
* The auto plugin is left as is, because there the boolean is
  explicitly set to false to signal 'to-be-deleted' and the key is left
  as-is.

Signed-off-by: Miek Gieben <miek@miek.nl>
2018-12-10 02:17:15 -08:00
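For illustration, the map[x]bool to map[x]struct{} change described above follows the usual Go set idiom. The type and method names below (`set`, `add`, `has`) are hypothetical and not taken from the CoreDNS code.

```go
package main

import "fmt"

// set is a string set; the empty struct value occupies no storage.
type set map[string]struct{}

// add inserts k into the set.
func (s set) add(k string) { s[k] = struct{}{} }

// has reports membership using the two-value map lookup.
func (s set) has(k string) bool {
	_, ok := s[k]
	return ok
}

func main() {
	zones := set{}
	zones.add("example.org.")
	fmt.Println(zones.has("example.org."), zones.has("example.net."))
}
```

A set like this can only say "present" or "absent", which is why a map that stores an explicit false for a key that stays in the map, as noted for the auto plugin, cannot be converted the same way.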
Yong Tang
e5f5da4297 Update Prometheus to 0.9.1 (#2360)
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2018-12-01 22:38:03 +00:00
Francois Tur
05204ef142 Metrics registered on wrong prometheus registry (#2246)
* - UT on metrics verifying that all plugins of all blocks have their metrics collectors declared

* - fix error msg

* - redirect Registry of metric to the one that handles the listener
- allow duplicate of metrics collector on the same Registry (case of same plugin in 2 blocks listening metrics on the same address)

* - fix change of signature

* - ensure cleaning metrics before starting the test (metrics collectors are global vars .. and re-used by several tests)

* - I think I fixed this test. Ensure correct number of hits and clean metrics before test.

* - fix typo in error msg - proposed at review

* - fix typo in comment

* - remove ResetMetrics functions
- change a way to test the numeric metrics : get the diff between begin and end of test

* - oops. removing debug logs
2018-11-01 19:56:00 +00:00
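The last change in the list above, testing numeric metrics by taking the difference between the beginning and end of a test, can be sketched as below. This is an assumption-laden illustration using a plain atomic counter. The names `hits`, `serve` and `delta` are hypothetical, and the real tests read Prometheus collectors instead.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// hits is a hypothetical global counter, standing in for a metrics
// collector that is a global variable shared by several tests.
var hits int64

// serve simulates handling one request.
func serve() { atomic.AddInt64(&hits, 1) }

// delta runs f and reports how much the counter grew while f ran, so the
// result does not depend on what earlier tests left behind.
func delta(f func()) int64 {
	before := atomic.LoadInt64(&hits)
	f()
	return atomic.LoadInt64(&hits) - before
}

func main() {
	serve() // leftover count from an "earlier test"
	fmt.Println(delta(func() { serve(); serve() }))
}
```

Comparing deltas avoids the need for a reset function on shared collectors, which matches the removal of ResetMetrics mentioned in the same commit.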
Zach Eddy
8aa55c5ff2 Metrics listener fix (#2036)
* Create test to verify correct listener behavior

* Create Unset function to remove todo items

* Reset address for prometheus listener before restarting

* Add inline documentation for Unset function

* Make shutdownTimeout a constant and change to five seconds

* Revert ForEach behavior in uniq package
2018-08-21 11:52:25 -04:00
Miek Gieben
7c27577707
plugin/metrics: add panic counter (#1778)
Count and export number of panics we see.

Fixes #1294
2018-05-05 19:47:41 +02:00
Miek Gieben
5e6114b797
plugin/pkg/uniq: add (#1733)
Spin this out of the metrics package so we can use it in the health
one as well, to fix some reload bugs.
2018-04-25 11:45:09 +01:00
Miek Gieben
12b2ff9740
Use logging (#1718)
* update docs

* plugins: use plugin specific logging

Hooking up pkg/log also changed NewWithPlugin to just take a string
instead of a plugin.Handler as that is more flexible and for instance
the Root "plugin" doesn't implement it fully.

Same logging from the reload plugin:

.:1043
2018/04/22 08:56:37 [INFO] CoreDNS-1.1.1
2018/04/22 08:56:37 [INFO] linux/amd64, go1.10.1,
CoreDNS-1.1.1
linux/amd64, go1.10.1,
2018/04/22 08:56:37 [INFO] plugin/reload: Running configuration MD5 = ec4c9c55cd19759ea1c46b8c45742b06
2018/04/22 08:56:54 [INFO] Reloading
2018/04/22 08:56:54 [INFO] plugin/reload: Running configuration MD5 = 9e2bfdd85bdc9cceb740ba9c80f34c1a
2018/04/22 08:56:54 [INFO] Reloading complete

* update docs

* better doc
2018-04-22 21:40:33 +01:00
Miek Gieben
a466bb6fc6
Export metrics in setup; so it also works after reload (#1715)
* brr; a sleep

* Shouldn't need a query
2018-04-21 18:59:35 +01:00
Miek Gieben
acbcad7b4e
reload: use OnRestart (#1709)
* reload: use OnRestart

Close the listener on OnRestart for health and metrics so the default
setup function can set up the listener when the plugin is "starting up".

Lightly tested with some SIGUSR1-ing. Also checked the reload plugin with
this, seems fine:

.com.:1043
.:1043
2018/04/20 15:01:25 [INFO] CoreDNS-1.1.1
2018/04/20 15:01:25 [INFO] linux/amd64, go1.10,
CoreDNS-1.1.1
linux/amd64, go1.10,
2018/04/20 15:01:25 [INFO] Running configuration MD5 = aa8b3f03946fb60546ca1f725d482714
2018/04/20 15:02:01 [INFO] Reloading
2018/04/20 15:02:01 [INFO] Running configuration MD5 = b34a96d99e01db4015a892212560155f
2018/04/20 15:02:01 [INFO] Reloading complete
^C2018/04/20 15:02:06 [INFO] SIGINT: Shutting down

With this corefile:
.com {
  proxy . 127.0.0.1:53
  prometheus :9054
  whoami
  reload
}

. {
  proxy . 127.0.0.1:53
  prometheus :9054
  whoami
  reload
}

The prometheus port was 9053, changed that to 54 so reload would pick it
up.

From a cursory look it seems this also fixes:
Fixes #1604 #1618 #1686 #1492

* At least make it test

* Use onfinalshutdown

* reload: add reload test

This tests #1604 and right now fails.

* Address review comments

* Add bug section explaining things a bit

* compile tests

* Fix tests

* fixes

* slightly less crazy

* try to make prometheus setup less confusing

* Use ephemeral port for test

* Don't use the listener

* These are shared between goroutines, just use the boolean in the main
  structure.
* Fix text in the reload README,
* Set addr to TODO once stopping it
* Morph fturb's comment into test, to test reload and scrape health and
  metric endpoint
2018-04-21 17:43:02 +01:00
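The close-on-restart idea from the reload commit above can be sketched like this. It is a simplified illustration, not the plugin code: `endpoint`, `onStartup` and `onRestart` are hypothetical names that mimic the startup and restart hooks, and the loopback address is an assumption made for the demo.

```go
package main

import (
	"fmt"
	"net"
)

// endpoint is a hypothetical HTTP-style endpoint that owns one listener.
type endpoint struct {
	addr string
	ln   net.Listener
}

// onStartup binds the listener; setup would run this on every (re)start.
func (e *endpoint) onStartup() error {
	ln, err := net.Listen("tcp", e.addr)
	if err != nil {
		return err
	}
	e.ln = ln
	return nil
}

// onRestart releases the listener so the next onStartup can bind again.
func (e *endpoint) onRestart() error {
	if e.ln == nil {
		return nil
	}
	err := e.ln.Close()
	e.ln = nil
	return err
}

func main() {
	e := &endpoint{addr: "127.0.0.1:0"}
	if err := e.onStartup(); err != nil {
		fmt.Println("startup:", err)
		return
	}
	e.addr = e.ln.Addr().String() // keep the port the kernel picked
	if err := e.onRestart(); err != nil {
		fmt.Println("restart:", err)
		return
	}
	// The address is free again, so a fresh startup can reuse it.
	fmt.Println("rebind error:", e.onStartup())
	e.onRestart()
}
```

If the listener were left open across the restart, the second bind on the same address would normally fail with an "address already in use" error.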
Miek Gieben
26d1432ae6
Update all plugins to use plugin/pkg/log (#1694)
* Update all plugins to use plugin/pkg/log

I wish this could have been done with sed. Alas manually changed all
callers to use the new plugin/pkg/log package.

* Error -> Info

* Add docs to debug plugin as well
2018-04-19 07:41:56 +01:00
Miek Gieben
4df416ca1d
Metrics (#1579)
* plugin/metrics: set server address in context

Allow cross server block metrics to co-exist; for this we should label
each metric with the server label. Put this information in the context
and provide a helper function to get it out.

Abstracting this entirely away is difficult as the released client_go
(0.8.0) doesn't have the CurryWith functions yet. So current use is like
so:

define metric, with server label:

	RcodeCount = prometheus.NewCounterVec(prometheus.CounterOpts{
		Namespace: plugin.Namespace,
		Subsystem: "forward",
		Name:      "response_rcode_count_total",
		Help:      "Counter of requests made per upstream.",
	}, []string{"server", "rcode", "to"})

And report it with the helper function metrics.WithServer:

	RcodeCount.WithLabelValues(metrics.WithServer(ctx), rc, p.addr).Add(1)
2018-04-01 13:57:03 +01:00
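The "put the server address in the context and provide a helper to get it out" approach from the Metrics (#1579) message can be sketched with the standard library alone. The key type and the functions `requestContext` and `serverLabel` are hypothetical. The actual helper is metrics.WithServer, whose implementation is not shown in this log.

```go
package main

import (
	"context"
	"fmt"
)

// serverKey is a hypothetical context key type. An unexported struct type
// avoids collisions with keys from other packages.
type serverKey struct{}

// requestContext builds the per-request context a server would hand to
// plugins, carrying the server address.
func requestContext(addr string) context.Context {
	return context.WithValue(context.Background(), serverKey{}, addr)
}

// serverLabel returns the stored address, or "" when none was set.
func serverLabel(ctx context.Context) string {
	s, _ := ctx.Value(serverKey{}).(string)
	return s
}

func main() {
	ctx := requestContext("dns://:53")
	// A plugin would pass this value as the "server" label value.
	fmt.Println(serverLabel(ctx))
}
```

With a helper of this shape, each plugin only needs the request context to fill in the server label, which is how the WithLabelValues call in the commit message uses metrics.WithServer(ctx).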
Miek Gieben
acf823cd78
Metrics2 (#1588)
* plugin/metrics: still need nil check for shutdown

the second prometheus statement will trigger: (on control-C)

[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x94f45a]

goroutine 25 [running]:
github.com/coredns/coredns/plugin/metrics.(*Metrics).OnShutdown(0xc420252000, 0x0, 0x0)
	/home/miek/g/src/github.com/coredns/coredns/plugin/metrics/metrics.go:107 +0x2a
github.com/coredns/coredns/plugin/metrics.(*Metrics).OnShutdown-fm(0x0, 0x0)
	/home/miek/g/src/github.com/coredns/coredns/plugin/metrics/setup.go:39 +0x2a
github.com/mholt/caddy.(*Instance).ShutdownCallbacks(0xc4202c81e0, 0x0, 0x0, 0x0)
	/home/miek/g/src/github.com/mholt/caddy/caddy.go:164 +0xb3
github.com/mholt/caddy.allShutdownCallbacks(0x1743935, 0x8, 0x14a1b40)
	/home/miek/g/src/github.com/mholt/caddy/sigtrap.go:95 +0x10d
github.com/mholt/caddy.executeShutdownCallbacks.func1()
	/home/miek/g/src/github.com/mholt/caddy/sigtrap.go:75 +0x8f
sync.(*Once).Do(0x2256b80, 0xc42036df88)
	/home/miek/upstream/go/src/sync/once.go:44 +0xbe
github.com/mholt/caddy.executeShutdownCallbacks(0x174033f, 0x6, 0x0)
	/home/miek/g/src/github.com/mholt/caddy/sigtrap.go:71 +0x73
github.com/mholt/caddy.trapSignalsCrossPlatform.func1.1()
	/home/miek/g/src/github.com/mholt/caddy/sigtrap.go:61 +0x36
created by github.com/mholt/caddy.trapSignalsCrossPlatform.func1
	/home/miek

* comments on why
2018-03-02 18:17:05 -08:00
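The nil check this commit restores can be illustrated as follows. This is a sketch under assumptions: `exporter`, `fakeCloser` and `onShutdown` are hypothetical names, and the premise that only one plugin instance owns the listener is inferred from the remark about the second prometheus statement.

```go
package main

import (
	"fmt"
	"io"
)

// exporter is a hypothetical plugin instance. srv stays nil for instances
// that never started a listener themselves.
type exporter struct {
	srv io.Closer
}

// fakeCloser records whether Close was called; it stands in for a listener.
type fakeCloser struct{ closed bool }

func (f *fakeCloser) Close() error {
	f.closed = true
	return nil
}

// onShutdown closes the listener if there is one. Without the nil check,
// calling Close on the unset field would panic during shutdown.
func (e *exporter) onShutdown() error {
	if e.srv == nil {
		return nil
	}
	return e.srv.Close()
}

func main() {
	idle := &exporter{} // never started anything
	active := &exporter{srv: &fakeCloser{}}
	fmt.Println(idle.onShutdown(), active.onShutdown())
}
```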
Miek Gieben
6f3a7af548
Metrics reload (#1586)
* wip

* plugin/metrics: fix reload behavior

Fixes #1472
2018-03-02 17:16:25 -08:00
Tobias Schmidt
b707438534 Add coredns_build_info metric (#1418)
In order to track the rollout status of CoreDNS versions, add the common
build_info metric.
2018-01-23 20:10:55 +00:00
Miek Gieben
99047aee9b
plugin/metrics: convenience MustRegister function (#1332)
This leaves most of the code intact, but we need to stop vendoring
prometheus, because, again, plugins that want to use it. Not vendoring
prometheus makes my forward metrics show up again. Code looks a bit
convoluted, but works:

~~~
	c.OnStartup(func() error {
		// Register the collectors only once, even with several server blocks.
		once.Do(func() {
			m := dnsserver.GetConfig(c).Handler("prometheus")
			if m == nil {
				return
			}
			if x, ok := m.(*metrics.Metrics); ok {
				x.MustRegister(RequestCount)
				x.MustRegister(RcodeCount)
				x.MustRegister(RequestDuration)
				x.MustRegister(HealthcheckFailureCount)
				x.MustRegister(SocketGauge)
			}
		})
		return nil
	})
~~~
2017-12-27 14:14:53 +00:00
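The once.Do guard in the snippet above can be tried in isolation with the standard library. In this sketch the slice `registered` stands in for a metrics registry, and `onStartup` is a hypothetical hook. Neither is CoreDNS code.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	once sync.Once
	// registered records collector names; it stands in for a registry.
	registered []string
)

// onStartup is called once per server block, but the registration inside
// once.Do runs only on the first call.
func onStartup() error {
	once.Do(func() {
		registered = append(registered, "RequestCount", "RcodeCount")
	})
	return nil
}

func main() {
	for i := 0; i < 3; i++ { // three server blocks using the same plugin
		if err := onStartup(); err != nil {
			fmt.Println(err)
		}
	}
	fmt.Println(len(registered))
}
```

Registering the same collector twice on a Prometheus registry via MustRegister panics, which is presumably why the startup hook is wrapped this way.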
James Hartig
671d170619 plugin/metrics: Switch to using promhttp instead of deprecated Handler (#1312)
prometheus.Handler is deprecated according to the godoc for the package so
instead we're using promhttp.

Additionally, we are exposing the Registry that metrics is using so other
plugins that are not inside of coredns can read the registry. Otherwise, if
we kept using the Default one, there's no way to access that from outside
of the coredns repo since it is vendored.
2017-12-14 18:19:03 +00:00
James Hartig
1919913c98
plugin/metrics: Added New func (#1309)
If external plugins wanted to extend metrics there was no way since
zoneNames couldn't be initialized. Now plugins can call New to get an
instance of Metrics that they can extend.
2017-12-13 16:59:10 -05:00
Miek Gieben
d8714e64e4 Remove the word middleware (#1067)
* Rename middleware to plugin

first pass; mostly used 'sed', few spots where I manually changed
text.

This still builds a coredns binary.

* fmt error

* Rename AddMiddleware to AddPlugin

* Readd AddMiddleware to remain backwards compat
2017-09-14 09:36:06 +01:00
Renamed from middleware/metrics/metrics.go