node: Add metrics for the error counter in the engine #418

Merged
realloc merged 1 commit from acid-ant/frostfs-node:feature/372-add-err-counter-shard into master 2023-06-07 13:04:48 +00:00
Member

Close #372

Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com

Close #372 Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
acid-ant requested review from storage-core-committers 2023-06-01 14:32:31 +00:00
acid-ant requested review from storage-core-developers 2023-06-01 14:32:32 +00:00
dstepanov-yadro requested changes 2023-06-02 06:55:49 +00:00
@ -174,6 +186,8 @@ func (e *StorageEngine) removeShards(ids ...string) {
continue
}
sh.DeleteErrorCounter()

Scenario: There were 5 shards, one of them did not work due to errors. I deleted that shard so the server would continue to work normal. After that, I decided to investigate, but there is no metrics for this shard.

So i think we should not delete metrics when delete shard.

Scenario: There were 5 shards, one of them did not work due to errors. I deleted that shard so the server would continue to work normal. After that, I decided to investigate, but there is no metrics for this shard. So i think we should not delete metrics when delete shard.
dstepanov-yadro marked this conversation as resolved
ale64bit reviewed 2023-06-02 07:01:29 +00:00
@ -132,6 +132,7 @@ func (e *StorageEngine) reportShardErrorBackground(id string, msg string, err er
}
errCount := sh.errorCount.Add(1)
sh.Shard.AddToErrorCounter(1)
Member

consider using Inc if it's always 1

consider using `Inc` if it's always 1
Author
Member

Thanks, renamed.

Thanks, renamed.
Member

Sorry, what I meant was that if the increment is always one, we can simply remove the delta everywhere and use the Inc method, e.g.:


func (m engineMetrics) IncErrorCounter(shardID string) {
	m.errorCounter.value.With(prometheus.Labels{shardIDLabelKey: shardID}).Inc()
}

But up to you, whether we will need the delta later.

Sorry, what I meant was that if the increment is always one, we can simply remove the delta everywhere and use the `Inc` method, e.g.: ``` func (m engineMetrics) IncErrorCounter(shardID string) { m.errorCounter.value.With(prometheus.Labels{shardIDLabelKey: shardID}).Inc() } ``` But up to you, whether we will need the `delta` later.
Author
Member

Agree, renamed in all places.

Agree, renamed in all places.
ale64bit marked this conversation as resolved
@ -71,0 +73,4 @@
m.errCounter += delta
}
func (m *metricsStore) ClearErrorCounter() {
Member

do we really need two functions that do the same with different names?

do we really need two functions that do the same with different names?
Author
Member

It is in tests, and it is necessary to implement two methods ClearErrorCounter and DeleteErrorCounter for type metricsStore somehow.

It is in tests, and it is necessary to implement two methods `ClearErrorCounter` and `DeleteErrorCounter` for type `metricsStore` somehow.
ale64bit marked this conversation as resolved
dstepanov-yadro approved these changes 2023-06-02 07:03:21 +00:00
acid-ant force-pushed feature/372-add-err-counter-shard from 9268c1a3d4 to 0272740848 2023-06-02 08:21:41 +00:00 Compare
ale64bit approved these changes 2023-06-02 08:30:06 +00:00
acid-ant force-pushed feature/372-add-err-counter-shard from 0272740848 to f12e080310 2023-06-02 09:01:13 +00:00 Compare
acid-ant force-pushed feature/372-add-err-counter-shard from f12e080310 to 92de822be1 2023-06-02 12:29:52 +00:00 Compare
acid-ant force-pushed feature/372-add-err-counter-shard from 92de822be1 to 05bf2a885e 2023-06-02 12:30:20 +00:00 Compare
acid-ant force-pushed feature/372-add-err-counter-shard from 05bf2a885e to 7f8f7702fa 2023-06-02 12:37:35 +00:00 Compare
acid-ant requested review from dstepanov-yadro 2023-06-02 12:40:13 +00:00
acid-ant requested review from ale64bit 2023-06-02 12:40:15 +00:00
dstepanov-yadro approved these changes 2023-06-02 14:40:01 +00:00
acid-ant requested review from storage-core-developers 2023-06-05 06:33:40 +00:00
acid-ant requested review from dstepanov-yadro 2023-06-05 06:33:51 +00:00
dstepanov-yadro approved these changes 2023-06-05 06:53:01 +00:00
ale64bit approved these changes 2023-06-05 10:53:29 +00:00
aarifullin approved these changes 2023-06-05 10:59:44 +00:00
acid-ant scheduled this pull request to auto merge when all checks succeed 2023-06-05 11:20:59 +00:00
alexvanin approved these changes 2023-06-05 11:23:13 +00:00
dstepanov-yadro approved these changes 2023-06-07 07:54:36 +00:00
dstepanov-yadro scheduled this pull request to auto merge when all checks succeed 2023-06-07 07:55:11 +00:00
acid-ant scheduled this pull request to auto merge when all checks succeed 2023-06-07 09:35:32 +00:00
acid-ant canceled auto merging this pull request when all checks succeed 2023-06-07 09:37:16 +00:00
realloc merged commit 263c6fdc50 into master 2023-06-07 13:04:48 +00:00
realloc deleted branch feature/372-add-err-counter-shard 2023-06-07 13:04:50 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#418
No description provided.