Commit graph

69 commits

Author SHA1 Message Date
4dc9a1b300 [#1413] engine: Remove error counting methods from Shard
All checks were successful
Tests and linters / Run gofumpt (pull_request) Successful in 2m4s
DCO action / DCO (pull_request) Successful in 2m22s
Pre-commit hooks / Pre-commit (pull_request) Successful in 4m10s
Vulncheck / Vulncheck (pull_request) Successful in 4m5s
Build / Build Components (pull_request) Successful in 4m31s
Tests and linters / Staticcheck (pull_request) Successful in 4m21s
Tests and linters / gopls check (pull_request) Successful in 4m43s
Tests and linters / Lint (pull_request) Successful in 4m58s
Tests and linters / Tests (pull_request) Successful in 6m36s
Tests and linters / Tests with -race (pull_request) Successful in 7m41s
All error counting and hangling logic is present on the engine level.
Currently, we pass engine metrics with shard ID metric to shard, then
export 3 methods to manipulate these metrics.
In this commits all methods are removed and error counter is tracked on
the engine level exlusively.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-10-04 15:10:17 +03:00
963faa615a [#1413] engine: Cleanup shard error reporting
- `reportShardErrorBackground()` no longer differs from
  `reportShardError()`, reflect this in its name;
- reuse common pieces of code to make it simpler.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-10-04 15:10:17 +03:00
9a87acb87a [#1410] engine: Provide the default implementation to MetricsRegister
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-10-03 08:23:06 +00:00
a61201a987 [#1337] config: Move rebuild_worker_count to shard section
This makes it simple to limit performance degradation for every shard
because of rebuild.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-09-06 13:57:27 +03:00
108e4e07be [#1349] node: Evacuate objects without setting mode to MAINTENANCE
All checks were successful
DCO action / DCO (pull_request) Successful in 1m31s
Vulncheck / Vulncheck (pull_request) Successful in 1m32s
Pre-commit hooks / Pre-commit (pull_request) Successful in 2m12s
Tests and linters / Run gofumpt (pull_request) Successful in 2m21s
Build / Build Components (pull_request) Successful in 2m32s
Tests and linters / gopls check (pull_request) Successful in 2m36s
Tests and linters / Staticcheck (pull_request) Successful in 2m55s
Tests and linters / Tests with -race (pull_request) Successful in 3m26s
Tests and linters / Lint (pull_request) Successful in 3m31s
Tests and linters / Tests (pull_request) Successful in 3m40s
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-09-05 16:08:27 +03:00
03976c6ed5 [#1341] .golangci.yml: Replace exportloopref with copyloopvar
exportloopref is deprecated.
gopatch:
```
@@
var index, value identifier
var slice expression
@@
for index, value := range slice {
...
-value := value
...
}

@@
var index, value identifier
var slice expression
@@
for index, value := range slice {
...
-index := index
...
}

@@
var value identifier
var channel expression
@@
for value := range channel {
...
-value := value
...
}
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-08-28 15:44:41 +00:00
3bf6e6dde6 [#1238] engine: Add non-blocking send in the shard's notification channel
All checks were successful
DCO action / DCO (pull_request) Successful in 17m26s
Build / Build Components (1.21) (pull_request) Successful in 19m19s
Vulncheck / Vulncheck (pull_request) Successful in 19m46s
Build / Build Components (1.22) (pull_request) Successful in 23m58s
Pre-commit hooks / Pre-commit (pull_request) Successful in 32m59s
Tests and linters / Staticcheck (pull_request) Successful in 5m9s
Tests and linters / Lint (pull_request) Successful in 5m47s
Tests and linters / gopls check (pull_request) Successful in 5m22s
Tests and linters / Tests (1.21) (pull_request) Successful in 8m37s
Tests and linters / Tests with -race (pull_request) Successful in 8m38s
Tests and linters / Tests (1.22) (pull_request) Successful in 8m56s
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-07-10 11:37:11 +03:00
1b17258c04 [#1029] metabase: Add refill metrics
All checks were successful
DCO action / DCO (pull_request) Successful in 1m22s
Vulncheck / Vulncheck (pull_request) Successful in 3m11s
Build / Build Components (1.21) (pull_request) Successful in 3m56s
Build / Build Components (1.20) (pull_request) Successful in 3m59s
Tests and linters / Staticcheck (pull_request) Successful in 5m31s
Tests and linters / gopls check (pull_request) Successful in 5m26s
Tests and linters / Lint (pull_request) Successful in 6m13s
Tests and linters / Tests (1.20) (pull_request) Successful in 8m54s
Tests and linters / Tests (1.21) (pull_request) Successful in 9m13s
Tests and linters / Tests with -race (pull_request) Successful in 9m30s
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-04-10 13:05:44 +03:00
d5194ab2a6 [#949] metabase: fix shard.UpdateID()
All checks were successful
Vulncheck / Vulncheck (pull_request) Successful in 1m20s
DCO action / DCO (pull_request) Successful in 1m59s
Build / Build Components (1.21) (pull_request) Successful in 3m25s
Build / Build Components (1.20) (pull_request) Successful in 4m46s
Tests and linters / Staticcheck (pull_request) Successful in 6m5s
Tests and linters / gopls check (pull_request) Successful in 6m17s
Tests and linters / Lint (pull_request) Successful in 7m7s
Tests and linters / Tests (1.20) (pull_request) Successful in 8m38s
Tests and linters / Tests with -race (pull_request) Successful in 8m51s
Tests and linters / Tests (1.21) (pull_request) Successful in 8m56s
metabase.Open() now reports metabase mode metric. shard.UpdateID()
needs to read shard ID from metabase => needs to open metabase.
It caused reporting 'shard undefined' metrics. To avoid reporting
wrong metrics metabase.GetShardID() was added which also opens
metabase and does not report metrics.

Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
2024-04-01 17:27:34 +03:00
2ad433dbcb [#1005] engine: Drop shards weights
All checks were successful
DCO action / DCO (pull_request) Successful in 4m7s
Vulncheck / Vulncheck (pull_request) Successful in 4m53s
Build / Build Components (1.21) (pull_request) Successful in 5m46s
Build / Build Components (1.20) (pull_request) Successful in 6m21s
Tests and linters / Staticcheck (pull_request) Successful in 7m45s
Tests and linters / Lint (pull_request) Successful in 8m44s
Tests and linters / Tests (1.21) (pull_request) Successful in 13m1s
Tests and linters / Tests (1.20) (pull_request) Successful in 15m42s
Tests and linters / Tests with -race (pull_request) Successful in 16m10s
Unused.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-26 17:25:05 +03:00
9ba48c582d [#917] engine: Allow to detach shards
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-06 14:49:47 +03:00
e3573de6db [#930] gc: Stop internal activity by context
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-01-31 08:30:34 +00:00
931a5e9aaf [#918] engine: Move shard to degraded mode if metabase open failed
All checks were successful
DCO action / DCO (pull_request) Successful in 1m55s
Build / Build Components (1.20) (pull_request) Successful in 2m14s
Vulncheck / Vulncheck (pull_request) Successful in 2m15s
Build / Build Components (1.21) (pull_request) Successful in 4m6s
Tests and linters / Tests (1.21) (pull_request) Successful in 5m58s
Tests and linters / Tests (1.20) (pull_request) Successful in 6m24s
Tests and linters / Staticcheck (pull_request) Successful in 6m11s
Tests and linters / Lint (pull_request) Successful in 6m37s
Tests and linters / Tests with -race (pull_request) Successful in 6m35s
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-01-23 11:16:40 +03:00
4b8b4da681 [#864] engine: Drop container count metric if container removed
All checks were successful
DCO action / DCO (pull_request) Successful in 1m30s
Build / Build Components (1.21) (pull_request) Successful in 3m29s
Build / Build Components (1.20) (pull_request) Successful in 3m53s
Tests and linters / Lint (pull_request) Successful in 4m31s
Tests and linters / Tests (1.20) (pull_request) Successful in 5m1s
Tests and linters / Staticcheck (pull_request) Successful in 4m51s
Tests and linters / Tests (1.21) (pull_request) Successful in 5m13s
Tests and linters / Tests with -race (pull_request) Successful in 8m34s
Vulncheck / Vulncheck (pull_request) Successful in 58s
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-01-10 10:45:32 +03:00
d75e7e9a21 [#864] engine: Drop container size metric if container deleted
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-01-10 10:44:54 +03:00
f1c7905263 [#661] blobovniczatree: Make Rebuild concurrent
Different DBs can be rebuild concurrently.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-12-07 15:37:33 +03:00
d30ab5f29e [#838] metabase: Count user objects
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-12-06 15:44:21 +03:00
70ab1ebd54 [#763] metrics: Add container_objects_total metric
All checks were successful
DCO action / DCO (pull_request) Successful in 3m54s
Build / Build Components (1.20) (pull_request) Successful in 4m58s
Build / Build Components (1.21) (pull_request) Successful in 5m16s
Vulncheck / Vulncheck (pull_request) Successful in 9m54s
Tests and linters / Lint (pull_request) Successful in 10m57s
Tests and linters / Tests (1.21) (pull_request) Successful in 12m40s
Tests and linters / Staticcheck (pull_request) Successful in 12m34s
Tests and linters / Tests with -race (pull_request) Successful in 12m48s
Tests and linters / Tests (1.20) (pull_request) Successful in 13m19s
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-11-08 12:30:57 +03:00
8a81af5a3b [#653] Add context parameter to Open functions
All checks were successful
DCO action / DCO (pull_request) Successful in 1m38s
Build / Build Components (1.20) (pull_request) Successful in 4m22s
Build / Build Components (1.21) (pull_request) Successful in 4m25s
Vulncheck / Vulncheck (pull_request) Successful in 4m56s
Tests and linters / Lint (pull_request) Successful in 6m1s
Tests and linters / Tests (1.20) (pull_request) Successful in 7m43s
Tests and linters / Staticcheck (pull_request) Successful in 8m1s
Tests and linters / Tests (1.21) (pull_request) Successful in 8m14s
Tests and linters / Tests with -race (pull_request) Successful in 8m32s
Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
2023-09-07 18:03:29 +03:00
1a0cb0f34a [#421] Try using badger for the write-cache
Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>
2023-08-07 08:16:57 +00:00
028d4a8058 [#373] blobovnicza: Add metrics
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-06-21 15:13:26 +03:00
01a0c97760 [#453] engine: Set Disabled mode to deleted shard
All checks were successful
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/push/pre-commit Pipeline was successful
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-06-20 12:04:07 +03:00
4449006862 [#424] metrics: Use mode value as metric value for shard
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-06-14 18:26:19 +03:00
1b364d8cf4 [#424] metrics: Refactor engine metrics
Use histogram vector to measure request duration.
Fix naming like in Prometheus best practice.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-06-14 14:53:32 +03:00
41ab4d070e [#423] *: Use hrw.StringHash() where possible
All checks were successful
ci/woodpecker/push/pre-commit Pipeline was successful
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-06-13 07:18:25 +00:00
263c6fdc50 [#372] node: Add metrics for the error counter in the engine
All checks were successful
ci/woodpecker/push/pre-commit Pipeline was successful
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-06-07 13:04:47 +00:00
3220c4df9f [#376] metrics: Add GC metrics
All checks were successful
ci/woodpecker/push/pre-commit Pipeline was successful
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-05-31 10:22:12 +00:00
2ce43935f9 [#312] metrics: Add writecache metrcis
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-05-24 10:18:39 +00:00
4b768fd115 [#381] *: Move to sync/atomic
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-05-23 08:18:01 +03:00
0e31c12e63 [#240] logs: Move log messages to constants
Drop duplicate entities.
Format entities.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-04-14 05:06:09 +00:00
dbc3811ff4 [#191] engine: Allow to remove redundant object copies
RemoveDuplicates() removes all duplicate object copies stored on
multiple shards. All shards are processed and the command tries to leave
a copy on the best shard according to HRW.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-04-07 17:25:50 +00:00
20de74a505 Rename package name
Due to source code relocation from GitHub.

Signed-off-by: Alex Vanin <a.vanin@yadro.com>
2023-03-07 16:38:26 +03:00
6925fb4c59 [TrueCloudLab/hrw#2] node: Use typed HRW methods
Update HRW lib and use typed HRW methods to sort shards and nodes

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-02-28 13:36:25 +03:00
c3a7039801 [TrueCloudLab/hrw#2] node: Optimize shard hash
Compute shard hash only once

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-02-28 13:36:25 +03:00
Pavel Karpy
89a0266f5e [#1794] metrics: Track physical object capacity per shard
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-01-26 20:06:28 +03:00
Evgenii Stratonikov
9513f163aa [#2116] metrics: Track physical object capacity in the container
Currently we track based on `PayloadSize`, because it is already stored
in the metabase and it is easier to calculate without slowing down the
whole system.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-01-26 20:06:28 +03:00
edb1428248 [#2022] Add metric readonly to get shards mode
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
923f84722a Move to frostfs-node
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-28 15:04:29 +03:00
Evgenii Stratonikov
777fd32d4f [#1818] writecache: Increase error counter on background errors
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-02 14:24:02 +03:00
Evgenii Stratonikov
56de2f1363 [#1969] local_object_storage: Simplify logic error construction
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-31 11:41:24 +03:00
Evgenii Stratonikov
fcdbf5e509 [#1969] local_object_storage: Add a type for logical errors
All logic errors are wrapped in `logicerr.Logical` type and do not
affect shard error counter.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-31 11:41:24 +03:00
Pavel Karpy
31c623636d [#1863] node: Fix shard id in the object counter metrics
If shard ID is stored in metabase (it is not the first time boot), read it,
set it, use it (not a generated one) in the metrics writer.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-13 13:06:41 +03:00
Pavel Karpy
8ebe95747e [#1770] node: Do not lock on shard's Close call
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-04 10:08:55 +03:00
Pavel Karpy
887afeaddb [#1770] engine: Do not lock on shard init
Init can take a lot of time. Because the mutex is taken, all new operations
are blocked.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-04 10:08:55 +03:00
Pavel Karpy
fbd5bc8c38 [#1770] engine: Support configuration reload
Currently, it only supports changing the compound of the shards.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-04 10:08:55 +03:00
Pavel Karpy
2d7166f8d0 [#1770] shard: Move NewEpoch event routing on SE level
It will allow dynamic shard management. Closing a shard does not allow
removing event handlers.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-04 10:08:55 +03:00
Pavel Karpy
edef26a4fd [#1658] engine: Update metrics interfaces
It now supports typed object counter metrics.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-09-13 21:32:37 +04:00
Pavel Karpy
9c7b3ce799 [#1658] node: Read metrics from meta on startup
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-08-25 19:20:33 +03:00
Pavel Karpy
745f72fff0 [#1658] node: Support object counter metrics
Increment shard's object counter on successful `Put` calls and decrement on
`Delete`.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-08-25 19:20:33 +03:00
Evgenii Stratonikov
339864b720 [#1559] local_object_storage: Move shard.Mode to a separate package
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00