Anton Nikiforov
fb9219af39
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
# Shard modes description

## List of modes

Each mode is characterized by two important properties:
1. Whether modifying operations are allowed.
2. Whether the metabase and write-cache are available.

The expected deployment scenario is to place both the metabase and the write-cache on an SSD, so the `degraded` and
`degraded-read-only` modes can be approximately described as no-SSD modes.

| Mode                 | Description |
|----------------------|-------------|
| `read-write`         | Default mode, all operations are allowed. |
| `read-only`          | Read-only mode, only read operations are allowed, metabase is available. |
| `degraded`           | Degraded mode in which the metabase and write-cache are disabled. It shouldn't be used in normal operation, because the metabase can contain important indices, such as LOCK objects info, and modifying operations in this mode can lead to unexpected behaviour. The purpose of this mode is to allow PUT/DELETE operations without the metabase if really necessary. |
| `degraded-read-only` | Same as `degraded`, but with only read operations allowed. This mode is used during SSD replacement and/or when the metabase error counter exceeds the threshold. |
| `disabled`           | Currently used only in the config file to temporarily disable a shard. |

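
The two properties above can be read as a pair of predicates over the mode. The following is a minimal sketch only: the `Mode` type, its constants, and the method names are hypothetical and are not taken from the actual codebase. `disabled` is left out because it is used only in the config file.

```go
package main

import "fmt"

// Mode is a hypothetical stand-in for a shard mode; the real type may differ.
type Mode string

const (
	ReadWrite        Mode = "read-write"
	ReadOnly         Mode = "read-only"
	Degraded         Mode = "degraded"
	DegradedReadOnly Mode = "degraded-read-only"
)

// AllowsWrites reports whether modifying operations are allowed (property 1).
func (m Mode) AllowsWrites() bool {
	return m == ReadWrite || m == Degraded
}

// HasMetabase reports whether the metabase is available (property 2).
// Per the table, only the two degraded modes run without it.
func (m Mode) HasMetabase() bool {
	return m == ReadWrite || m == ReadOnly
}

func main() {
	for _, m := range []Mode{ReadWrite, ReadOnly, Degraded, DegradedReadOnly} {
		fmt.Printf("%-18s writes=%-5t metabase=%t\n", m, m.AllowsWrites(), m.HasMetabase())
	}
}
```

Keeping the two properties as separate predicates mirrors the table: the four operational modes are the four combinations of "writes allowed" and "metabase available".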
## Transition order

Because each shard consists of multiple components, changing its mode is not an atomic operation.
Instead, each component changes its mode independently.

For transitions to `read-write` mode the order is:
1. `metabase`
2. `blobstor`
3. `writecache`
4. `pilorama`

For transitions to all other modes the order is:
1. `writecache`
2. `blobstor`
3. `metabase`
4. `pilorama`

The motivation is to avoid transient errors, because the write-cache can write to both blobstor and metabase.
Thus, when we want to _stop_ write operations, the write-cache needs to change mode before them.
On the other hand, when we want to _allow_ them, blobstor and metabase should be writable before the write-cache is.

If anything goes wrong in the middle, the mode of some components can differ from the actual mode of the shard.
However, all mode-changing operations are idempotent, so the same transition can be repeated.

## Automatic mode changes

A shard can automatically switch to `degraded-read-only` mode in 3 cases:
1. If the metabase was not available or couldn't be opened/initialized during shard startup.
2. If the shard error counter exceeds the threshold.
3. If the metabase couldn't be reopened during SIGHUP handling.

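
The three cases can be summarized as a single check. The sketch below is purely illustrative: the `shardStatus` struct, its fields, and `degradeReason` are hypothetical, and the strict `>` comparison simply follows the word "exceeds" above rather than any confirmed implementation detail.

```go
package main

import "fmt"

// shardStatus is a hypothetical snapshot of the conditions listed above.
type shardStatus struct {
	metabaseInitFailed   bool   // metabase unavailable or failed to open/init at startup
	metabaseReopenFailed bool   // metabase failed to reopen during SIGHUP handling
	errorCount           uint32 // current shard error counter
	errorThreshold       uint32 // configured threshold
}

// degradeReason reports whether the shard should switch to
// degraded-read-only mode and, if so, why.
func degradeReason(s shardStatus) (string, bool) {
	switch {
	case s.metabaseInitFailed:
		return "metabase unavailable at startup", true
	case s.errorCount > s.errorThreshold:
		return "error counter exceeded threshold", true
	case s.metabaseReopenFailed:
		return "metabase reopen failed on SIGHUP", true
	}
	return "", false
}

func main() {
	s := shardStatus{errorCount: 11, errorThreshold: 10}
	if reason, ok := degradeReason(s); ok {
		fmt.Println("switch to degraded-read-only:", reason)
	}
}
```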
# Detach shard

To detach a shard, use the `frostfs-cli control shards detach` command. This command removes the shards from the storage
engine and closes all resources associated with the shards.
Limitation: after `SIGHUP` or a storage node restart, a detached shard comes back online.