c2d855aedd
[ #1636 ] qos: Return Resource Exhausted error
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:28 +03:00
9b113c3156
[ #1613 ] morph: Add tracing for morph queries to neo-go
...
DCO action / DCO (pull_request) Successful in 59s
Vulncheck / Vulncheck (pull_request) Successful in 1m4s
Pre-commit hooks / Pre-commit (pull_request) Successful in 1m55s
Build / Build Components (pull_request) Successful in 2m4s
Tests and linters / Staticcheck (pull_request) Successful in 2m38s
Tests and linters / Lint (pull_request) Successful in 3m16s
Tests and linters / Run gofumpt (pull_request) Successful in 3m54s
Tests and linters / Tests (pull_request) Successful in 4m12s
Tests and linters / gopls check (pull_request) Successful in 4m31s
Tests and linters / Tests with -race (pull_request) Successful in 4m38s
OCI image / Build container images (push) Failing after 18s
Vulncheck / Vulncheck (push) Successful in 1m2s
Pre-commit hooks / Pre-commit (push) Successful in 1m39s
Build / Build Components (push) Successful in 1m45s
Tests and linters / Staticcheck (push) Successful in 2m18s
Tests and linters / Run gofumpt (push) Successful in 2m46s
Tests and linters / Lint (push) Successful in 3m5s
Tests and linters / Tests with -race (push) Successful in 3m23s
Tests and linters / Tests (push) Successful in 3m52s
Tests and linters / gopls check (push) Successful in 4m18s
Signed-off-by: Alexander Chuprov <a.chuprov@yadro.com>
2025-02-05 16:38:20 +03:00
f0c43c8d80
[ #1502 ] Use zap.Error
for logging errors
...
Vulncheck / Vulncheck (pull_request) Successful in 3m1s
Pre-commit hooks / Pre-commit (pull_request) Successful in 3m29s
Tests and linters / gopls check (pull_request) Successful in 3m50s
Tests and linters / Lint (pull_request) Successful in 4m35s
DCO action / DCO (pull_request) Successful in 5m12s
Tests and linters / Run gofumpt (pull_request) Successful in 5m33s
Build / Build Components (pull_request) Successful in 5m45s
Tests and linters / Tests with -race (pull_request) Successful in 6m37s
Tests and linters / Tests (pull_request) Successful in 7m17s
Tests and linters / Staticcheck (pull_request) Successful in 7m36s
Tests and linters / Run gofumpt (push) Successful in 1m22s
Tests and linters / Staticcheck (push) Successful in 3m19s
Tests and linters / Lint (push) Successful in 4m35s
Vulncheck / Vulncheck (push) Successful in 5m20s
Build / Build Components (push) Successful in 6m16s
Pre-commit hooks / Pre-commit (push) Successful in 6m37s
Tests and linters / Tests (push) Successful in 6m48s
Tests and linters / Tests with -race (push) Successful in 7m15s
Tests and linters / gopls check (push) Successful in 7m27s
Use `zap.Error` instead of `zap.String` for logging errors: change all expressions like
`zap.String("error", err.Error())` or `zap.String("err", err.Error())` to `zap.Error(err)`.
Leave similar expressions with other messages unchanged, for example,
`zap.String("last_error", lastErr.Error())` or `zap.String("reason", ctx.Err().Error())`.
This change was made by applying the following patch:
```diff
@@
var err expression
@@
-zap.String("error", err.Error())
+zap.Error(err)
@@
var err expression
@@
-zap.String("err", err.Error())
+zap.Error(err)
```
Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
2024-12-16 11:13:42 +03:00
7429553266
[ #1437 ] node: Fix contextcheck linter
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-11-13 10:36:10 +03:00
16598553d9
[ #1437 ] shard: Fix contextcheck linter
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-11-13 10:36:09 +03:00
62b5181618
[ #1437 ] blobovnicza: Fix contextcheck linter
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-11-13 10:36:08 +03:00
6db46257c0
[ #1437 ] node: Use ctx for logging
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-11-13 10:36:07 +03:00
4dc9a1b300
[ #1413 ] engine: Remove error counting methods from Shard
...
Tests and linters / Run gofumpt (pull_request) Successful in 2m4s
DCO action / DCO (pull_request) Successful in 2m22s
Pre-commit hooks / Pre-commit (pull_request) Successful in 4m10s
Vulncheck / Vulncheck (pull_request) Successful in 4m5s
Build / Build Components (pull_request) Successful in 4m31s
Tests and linters / Staticcheck (pull_request) Successful in 4m21s
Tests and linters / gopls check (pull_request) Successful in 4m43s
Tests and linters / Lint (pull_request) Successful in 4m58s
Tests and linters / Tests (pull_request) Successful in 6m36s
Tests and linters / Tests with -race (pull_request) Successful in 7m41s
All error counting and hangling logic is present on the engine level.
Currently, we pass engine metrics with shard ID metric to shard, then
export 3 methods to manipulate these metrics.
In this commits all methods are removed and error counter is tracked on
the engine level exlusively.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-10-04 15:10:17 +03:00
963faa615a
[ #1413 ] engine: Cleanup shard error reporting
...
- `reportShardErrorBackground()` no longer differs from
`reportShardError()`, reflect this in its name;
- reuse common pieces of code to make it simpler.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-10-04 15:10:17 +03:00
9a87acb87a
[ #1410 ] engine: Provide the default implementation to MetricsRegister
...
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-10-03 08:23:06 +00:00
a61201a987
[ #1337 ] config: Move rebuild_worker_count
to shard section
...
This makes it simple to limit performance degradation for every shard
because of rebuild.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-09-06 13:57:27 +03:00
9d73f9c2c6
Reapply "[ #446 ] engine: Move to read-only on blobstor errors"
...
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-06-13 07:35:22 +00:00
40781b3a20
[ #1086 ] engine: Change mode in case of errors async
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-04-10 12:29:43 +00:00
f23e38c285
Revert "[ #446 ] engine: Move to read-only on blobstor errors"
...
DCO action / DCO (pull_request) Successful in 2m14s
Build / Build Components (1.20) (pull_request) Successful in 4m7s
Vulncheck / Vulncheck (pull_request) Successful in 3m30s
Build / Build Components (1.21) (pull_request) Successful in 4m15s
Tests and linters / Staticcheck (pull_request) Successful in 5m41s
Tests and linters / Lint (pull_request) Successful in 6m6s
Tests and linters / gopls check (pull_request) Successful in 6m42s
Tests and linters / Tests (1.20) (pull_request) Successful in 7m47s
Tests and linters / Tests (1.21) (pull_request) Successful in 8m21s
Tests and linters / Tests with -race (pull_request) Successful in 8m20s
This reverts commit 69df0d21c2
.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-04-01 12:48:30 +03:00
d75e7e9a21
[ #864 ] engine: Drop container size metric if container deleted
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-01-10 10:44:54 +03:00
f1c7905263
[ #661 ] blobovniczatree: Make Rebuild concurrent
...
Different DBs can be rebuild concurrently.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-12-07 15:37:33 +03:00
79088baa06
[ #772 ] node: Apply gofumpt
...
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-10-31 17:03:03 +03:00
d15199c5d8
[ #596 ] engine: Consider context errors as logical
...
Vulncheck / Vulncheck (pull_request) Successful in 2m59s
DCO action / DCO (pull_request) Successful in 2m54s
Build / Build Components (1.20) (pull_request) Successful in 4m2s
Build / Build Components (1.21) (pull_request) Successful in 4m51s
Tests and linters / Staticcheck (pull_request) Successful in 14m8s
Tests and linters / Tests (1.20) (pull_request) Failing after 14m56s
Tests and linters / Lint (pull_request) Successful in 15m27s
Tests and linters / Tests (1.21) (pull_request) Failing after 15m36s
Tests and linters / Tests with -race (pull_request) Failing after 16m18s
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-08-16 10:39:41 +03:00
cac4ed93d6
[ #428 ] engine: Add low_mem config parameter
...
Concurrent initialization in case of the metabase resync leads to
high memory consumption and potential OOM.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-06-26 13:29:39 +00:00
69df0d21c2
[ #446 ] engine: Move to read-only on blobstor errors
...
ci/woodpecker/pr/pre-commit Pipeline was successful
ci/woodpecker/push/pre-commit Pipeline was successful
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-06-16 14:53:32 +03:00
20b84f183a
[ #446 ] engine: Simplify logs for shard mode change
...
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-06-16 14:51:29 +03:00
263c6fdc50
[ #372 ] node: Add metrics for the error counter in the engine
...
ci/woodpecker/push/pre-commit Pipeline was successful
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-06-07 13:04:47 +00:00
faca861451
[ #411 ] Remove unnecessary pointers for sync objects
...
ci/woodpecker/push/pre-commit Pipeline was successful
Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>
2023-05-31 10:19:14 +00:00
4b768fd115
[ #381 ] *: Move to sync/atomic
...
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-05-23 08:18:01 +03:00
e4889e06ba
[ #329 ] node: Make evacuate async
...
Now it's possible to run evacuate shard in async.
Also only one evacuate process can be in progress.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-05-19 08:43:52 +00:00
0e31c12e63
[ #240 ] logs: Move log messages to constants
...
Drop duplicate entities.
Format entities.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-04-14 05:06:09 +00:00
dbc3811ff4
[ #191 ] engine: Allow to remove redundant object copies
...
RemoveDuplicates() removes all duplicate object copies stored on
multiple shards. All shards are processed and the command tries to leave
a copy on the best shard according to HRW.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-04-07 17:25:50 +00:00
20de74a505
Rename package name
...
Due to source code relocation from GitHub.
Signed-off-by: Alex Vanin <a.vanin@yadro.com>
2023-03-07 16:38:26 +03:00
c3a7039801
[ TrueCloudLab/hrw#2 ] node: Optimize shard hash
...
Compute shard hash only once
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-02-28 13:36:25 +03:00
cb016d53a6
[ #1 ] Fix comments and error messages
...
Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>
2023-02-06 17:41:14 +03:00
Pavel Karpy
923f84722a
Move to frostfs-node
...
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-28 15:04:29 +03:00
Pavel Karpy
b673d9e472
[ #2053 ] engine: Do not switch mode because of logical errors
...
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
f2d7e65e39
[ #2035 ] engine: Allow moving to degraded from background workers
...
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
777fd32d4f
[ #1818 ] writecache: Increase error counter on background errors
...
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-02 14:24:02 +03:00
Evgenii Stratonikov
fcdbf5e509
[ #1969 ] local_object_storage: Add a type for logical errors
...
All logic errors are wrapped in `logicerr.Logical` type and do not
affect shard error counter.
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-31 11:41:24 +03:00
Evgenii Stratonikov
3b939d190c
[ #1957 ] engine: Move shard to read-only if cannot move to degraded
...
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-10-26 08:20:53 +03:00
Pavel Karpy
f037022a7a
[ #1770 ] logger: Refactor Logger
component
...
Make it store its internal `zap.Logger`'s level. Also, make all the
components to accept internal `logger.Logger` instead of `zap.Logger`; it
will simplify future refactor.
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-10-12 18:11:05 +03:00
Evgenii Stratonikov
4944490ffb
[ #1559 ] local_object_storage: Move shard to the DegradedReadOnly
mode
...
`Degraded` mode can be set by the administrator if needed.
Modifying operations in this mode can lead node into an inconsistent state
because metabase checks such as lock checking are not performed.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
339864b720
[ #1559 ] local_object_storage: Move shard.Mode
to a separate package
...
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Elizaveta Chichindaeva
cc7a723d77
[ #1320 ] English Check
...
Signed-off-by: Elizaveta Chichindaeva <elizaveta@nspcc.ru>
2022-05-11 10:40:02 +03:00
Evgenii Stratonikov
6472a170eb
[ #1143 ] shard: Introduce explicit Degraded
mode
...
`Degraded` mode is set automatically after error counter is over the
threshold. `ReadOnly` mode can still be set by an administrator.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-03-31 15:33:22 +03:00
Evgenii Stratonikov
6ad2624552
[ #1118 ] engine: allow to set error threshold
...
There are certain errors which are not expected during usual node
operation and which tell us that something is wrong with the shard.
To prevent possible data corruption, move shard in read-only mode after
amount of errors exceeded some threshold. By default no actions are performed.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-02-03 15:14:27 +03:00
Leonard Lyubich
ec04e787aa
[ #922 ] storage engine: Support operation blocking
...
There is a need to disable execution of local data operation on storage
engine in runtime. If storage engine ops are blocked, node will act like
always but all local object operations will be denied.
Implement `BlockExecution` / `ResumeExecution` methods on `StorageEngine`
which blocks / resumes the execution of data ops. Wait for the completion of
all operations executed at the time of the call. Return error passed to
`BlockExecution` from all data-related methods until `ResumeExecution` call.
Make `Close` to block operations as well.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-11-12 17:28:38 +03:00
Leonard Lyubich
5b1975d52a
[ #674 ] storage engine: Use per-shard worker pools for PUT operation
...
Make `StorageEngine` to use non-blocking worker pools with the same
(configurable) size for PUT operation. This allows you to switch to using
more free shards when overloading others, thereby more evenly distributing
the write load.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2021-10-14 10:20:39 +03:00
Alex Vanin
b8e10571c6
[ #426 ] Put prometheus behind pkg/metrics
...
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-03-17 10:58:00 +03:00
Alex Vanin
980b774af2
[ #426 ] engine: Support duration metrics
...
With `enable metrics` option, engine will collect
durations for all public methods.
Signed-off-by: Alex Vanin <alexey@nspcc.ru>
2021-03-17 10:58:00 +03:00
Leonard Lyubich
09750484f9
[ #176 ] localstore: Draft storage engine structure and ops
...
Implement the primary structure and operation of the local object
storage engine.
Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
2020-12-11 17:19:37 +03:00