forked from TrueCloudLab/frostfs-node
Dmitrii Stepanov
e39378b1c3
To limit memory usage by background flush. Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
446 lines
36 KiB
Markdown
446 lines
36 KiB
Markdown
# FrostFS Storage node configuration file
|
|
|
|
This section contains detailed FrostFS Storage node configuration file description
|
|
including default config values and some tips to set up configurable values.
|
|
|
|
There are some custom types used for brevity:
|
|
1. `duration` -- string consisting of a number and a suffix. Suffix examples include `s` (seconds), `m` (minutes), `ms` (milliseconds).
|
|
2. `size` -- string consisting of a number and a suffix. Suffix examples include `b` (bytes, default), `k` (kibibytes), `m` (mebibytes), `g` (gibibytes).
|
|
3. `file mode` -- octal number. Usually, it starts with `0` and contain 3 digits, corresponding to file access permissions for user, group and others.
|
|
4. `public key` -- hex-encoded public key
|
|
5. `hash160` -- hex-encoded 20-byte hash of a deployed contract.
|
|
|
|
# Structure
|
|
|
|
| Section | Description |
|
|
|------------------------|---------------------------------------------------------------------|
|
|
| `logger` | [Logging parameters](#logger-section) |
|
|
| `pprof` | [PProf configuration](#pprof-section) |
|
|
| `prometheus` | [Prometheus metrics configuration](#prometheus-section) |
|
|
| `control` | [Control service configuration](#control-section) |
|
|
| `contracts` | [Override FrostFS contracts hashes](#contracts-section) |
|
|
| `morph` | [N3 blockchain client configuration](#morph-section) |
|
|
| `apiclient` | [FrostFS API client configuration](#apiclient-section) |
|
|
| `policer` | [Policer service configuration](#policer-section) |
|
|
| `replicator` | [Replicator service configuration](#replicator-section) |
|
|
| `storage` | [Storage engine configuration](#storage-section) |
|
|
| `runtime` | [Runtime configuration](#runtime-section) |
|
|
| `audit` | [Audit configuration](#audit-section) |
|
|
|
|
|
|
# `control` section
|
|
```yaml
|
|
control:
|
|
authorized_keys:
|
|
- 035839e45d472a3b7769a2a1bd7d54c4ccd4943c3b40f547870e83a8fcbfb3ce11
|
|
- 028f42cfcb74499d7b15b35d9bff260a1c8d27de4f446a627406a382d8961486d6
|
|
grpc:
|
|
endpoint: 127.0.0.1:8090
|
|
```
|
|
| Parameter | Type | Default value | Description |
|
|
|-------------------|----------------|---------------|----------------------------------------------------------------------------------|
|
|
| `authorized_keys` | `[]public key` | empty | List of public keys which are used to authorize requests to the control service. |
|
|
| `grpc.endpoint` | `string` | empty | Address that control service listener binds to. |
|
|
|
|
# `grpc` section
|
|
```yaml
|
|
grpc:
|
|
- endpoint: localhost:8080
|
|
tls:
|
|
enabled: true
|
|
certificate: /path/to/cert.pem
|
|
key: /path/to/key.pem
|
|
- endpoint: internal.ip:8080
|
|
- endpoint: external.ip:8080
|
|
tls:
|
|
enabled: true
|
|
use_insecure_crypto: true
|
|
```
|
|
Contains an array of gRPC endpoint configurations. The following table describes the format of each
|
|
element.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|---------------------------|-------------------------------|---------------|---------------------------------------------------------------------------|
|
|
| `endpoint` | `[]string` | empty | Address that service listener binds to. |
|
|
| `tls` | [TLS config](#tls-subsection) | | Address that control service listener binds to. |
|
|
|
|
## `tls` subsection
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------------------|----------|---------------|---------------------------------------------------------------------------|
|
|
| `enabled` | `bool` | `false` | Address that control service listener binds to. |
|
|
| `certificate` | `string` | | Path to the TLS certificate. |
|
|
| `key` | `string` | | Path to the key. |
|
|
| `use_insecure_crypto` | `bool` | `false` | If true, ciphers considered insecure by Go stdlib are allowed to be used. |
|
|
|
|
# `pprof` section
|
|
|
|
Contains configuration for the `pprof` profiler.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|--------------------|-----------------------------------|---------------|-----------------------------------------|
|
|
| `enabled` | `bool` | `false` | Flag to enable the service. |
|
|
| `address` | `string` | | Address that service listener binds to. |
|
|
| `shutdown_timeout` | `duration` | `30s` | Time to wait for a graceful shutdown. |
|
|
| `debug` | [Debug config](#debug-subsection) | | Optional profiles configuration |
|
|
|
|
|
|
## `debug` subsection
|
|
|
|
Contains optional profiles configuration.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|--------------|-------|---------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|
| `block_rate` | `int` | `0` | Controls the block profiler. Non-positive values disable profiler reports. For more information: https://pkg.go.dev/runtime@go1.20.3#SetBlockProfileRate. |
|
|
| `mutex_rate` | `int` | `0` | Controls the mutex profiler. Non-positive values disable profiler reports. For more information: https://pkg.go.dev/runtime@go1.20.3#SetMutexProfileFraction. |
|
|
|
|
# `prometheus` section
|
|
|
|
Contains configuration for the `prometheus` metrics service.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|--------------------|------------|---------------|-----------------------------------------|
|
|
| `enabled` | `bool` | `false` | Flag to enable the service. |
|
|
| `address` | `string` | | Address that service listener binds to. |
|
|
| `shutdown_timeout` | `duration` | `30s` | Time to wait for a graceful shutdown. |
|
|
|
|
# `logger` section
|
|
Contains logger parameters.
|
|
|
|
```yaml
|
|
logger:
|
|
level: info
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------|----------|---------------|---------------------------------------------------------------------------------------------------|
|
|
| `level` | `string` | `info` | Logging level.<br/>Possible values: `debug`, `info`, `warn`, `error`, `dpanic`, `panic`, `fatal` |
|
|
|
|
# `contracts` section
|
|
Contains override values for FrostFS side-chain contract hashes. Most of the time contract
|
|
hashes are fetched from the NNS contract, so this section can be omitted.
|
|
|
|
```yaml
|
|
contracts:
|
|
balance: 5263abba1abedbf79bb57f3e40b50b4425d2d6cd
|
|
container: 5d084790d7aa36cea7b53fe897380dab11d2cd3c
|
|
netmap: 0cce9e948dca43a6b592efe59ddb4ecb89bdd9ca
|
|
proxy: ad7c6b55b737b696e5c82c85445040964a03e97f
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|--------------|-----------|---------------|---------------------------|
|
|
| `balance` | `hash160` | | Balance contract hash. |
|
|
| `container` | `hash160` | | Container contract hash. |
|
|
| `netmap` | `hash160` | | Netmap contract hash. |
|
|
|
|
# `morph` section
|
|
|
|
```yaml
|
|
morph:
|
|
dial_timeout: 30s
|
|
cache_ttl: 15s
|
|
ape_chain_cache_size: 10000
|
|
rpc_endpoint:
|
|
- address: wss://rpc1.morph.frostfs.info:40341/ws
|
|
priority: 1
|
|
- address: wss://rpc2.morph.frostfs.info:40341/ws
|
|
priority: 2
|
|
switch_interval: 2m
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
| ---------------------- | --------------------------------------------------------- | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
| `dial_timeout` | `duration` | `5s` | Timeout for dialing connections to N3 RPCs. |
|
|
| `cache_ttl` | `duration` | Morph block time | Sidechain cache TTL value (min interval between similar calls).<br/>Negative value disables caching.<br/>Cached entities: containers, container lists, eACL tables. |
|
|
| `rpc_endpoint` | list of [endpoint descriptions](#rpc_endpoint-subsection) | | Array of endpoint descriptions. |
|
|
| `switch_interval` | `duration` | `2m` | Time interval between the attempts to connect to the highest priority RPC node if the connection is not established yet. |
|
|
| `ape_chain_cache_size` | `int` | `10000` | Size of the morph cache for APE chains. |
|
|
|
|
## `rpc_endpoint` subsection
|
|
| Parameter | Type | Default value | Description |
|
|
|------------|----------|---------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|
| `address` | `string` | | _WebSocket_ N3 endpoint. |
|
|
| `priority` | `int` | `1` | Priority of an endpoint. Endpoint with a higher priority (lower configuration value) has more chance of being used. Endpoints with equal priority are iterated over randomly; a negative priority is interpreted as `1`. |
|
|
|
|
# `storage` section
|
|
|
|
Local storage engine configuration.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|----------------------------|-----------------------------------|---------------|------------------------------------------------------------------------------------------------------------------|
|
|
| `shard_pool_size` | `int` | `20` | Pool size for shard workers. Limits the amount of concurrent `PUT` operations on each shard. |
|
|
| `shard_ro_error_threshold` | `int` | `0` | Maximum amount of storage errors to encounter before shard automatically moves to `Degraded` or `ReadOnly` mode. |
|
|
| `low_mem` | `bool` | `false` | Reduce memory consumption by reducing performance. |
|
|
| `shard` | [Shard config](#shard-subsection) | | Configuration for separate shards. |
|
|
|
|
## `shard` subsection
|
|
|
|
Contains configuration for each shard. Keys must be consecutive numbers starting from zero.
|
|
`default` subsection has the same format and specifies defaults for missing values.
|
|
The following table describes configuration for each shard.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
| ------------------------------------------------ | ------------------------------------------- | ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
| `compress` | `bool` | `false` | Flag to enable compression. |
|
|
| `compression_exclude_content_types` | `[]string` | | List of content-types to disable compression for. Content-type is taken from `Content-Type` object attribute. Each element can contain a star `*` as a first (last) character, which matches any prefix (suffix). |
|
|
| `compression_estimate_compressibility` | `bool` | `false` | If `true`, then noramalized compressibility estimation is used to decide compress data or not. |
|
|
| `compression_estimate_compressibility_threshold` | `float` | `0.1` | Normilized compressibility estimate threshold: data will compress if estimation if greater than this value. |
|
|
| `mode` | `string` | `read-write` | Shard Mode.<br/>Possible values: `read-write`, `read-only`, `degraded`, `degraded-read-only`, `disabled` |
|
|
| `resync_metabase` | `bool` | `false` | Flag to enable metabase resync on start. |
|
|
| `resync_metabase_worker_count` | `int` | `1000` | Count of concurrent workers to resync metabase. |
|
|
| `writecache` | [Writecache config](#writecache-subsection) | | Write-cache configuration. |
|
|
| `metabase` | [Metabase config](#metabase-subsection) | | Metabase configuration. |
|
|
| `blobstor` | [Blobstor config](#blobstor-subsection) | | Blobstor configuration. |
|
|
| `small_object_size` | `size` | `1M` | Maximum size of an object stored in blobovnicza tree. |
|
|
| `gc` | [GC config](#gc-subsection) | | GC configuration. |
|
|
|
|
### `blobstor` subsection
|
|
|
|
Contains a list of substorages each with it's own type.
|
|
Currently only 2 types are supported: `fstree` and `blobovnicza`.
|
|
|
|
```yaml
|
|
blobstor:
|
|
- type: blobovnicza
|
|
path: /path/to/blobstor
|
|
depth: 1
|
|
width: 4
|
|
- type: fstree
|
|
path: /path/to/blobstor/blobovnicza
|
|
perm: 0644
|
|
size: 4194304
|
|
depth: 1
|
|
width: 4
|
|
opened_cache_capacity: 50
|
|
opened_cache_ttl: 5m
|
|
opened_cache_exp_interval: 15s
|
|
```
|
|
|
|
#### Common options for sub-storages
|
|
| Parameter | Type | Default value | Description |
|
|
|-------------------------------------|-----------------------------------------------|---------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|
| `path` | `string` | | Path to the root of the blobstor. |
|
|
| `perm` | file mode | `0660` | Default permission for created files and directories. |
|
|
|
|
#### `fstree` type options
|
|
| Parameter | Type | Default value | Description |
|
|
|---------------------|-----------|---------------|-------------------------------------------------------|
|
|
| `path` | `string` | | Path to the root of the blobstor. |
|
|
| `perm` | file mode | `0660` | Default permission for created files and directories. |
|
|
| `depth` | `int` | `4` | File-system tree depth. |
|
|
|
|
#### `blobovnicza` type options
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------------------------| ---------- |---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|
| `path` | `string` | | Path to the root of the blobstor. |
|
|
| `perm` | file mode | `0660` | Default permission for created files and directories. |
|
|
| `size` | `size` | `1 G` | Maximum size of a single blobovnicza |
|
|
| `depth` | `int` | `2` | Blobovnicza tree depth. |
|
|
| `width` | `int` | `16` | Blobovnicza tree width. |
|
|
| `opened_cache_capacity` | `int` | `16` | Maximum number of simultaneously opened blobovniczas. |
|
|
| `opened_cache_ttl` | `duration` | `0` | TTL in cache for opened blobovniczas(disabled by default). In case of heavy random-read and 10 shards each with 10_000 databases and accessing 400 objects per-second we will access each db approximately once per ((10 * 10_000 / 400) = 250 seconds <= 300 seconds = 5 min). Also take in mind that in this scenario they will probably be closed earlier because of the cache capacity, so bigger values are likely to be of no use. |
|
|
| `opened_cache_exp_interval` | `duration` | `15s` | Cache cleanup interval for expired blobovnicza's. |
|
|
| `init_worker_count` | `int` | `5` | Maximum number of concurrent initialization workers. |
|
|
| `rebuild_drop_timeout` | `duration` | `10s` | Timeout before drop empty blobovnicza file during rebuild. |
|
|
|
|
### `gc` subsection
|
|
|
|
Contains garbage-collection service configuration. It iterates over the blobstor and removes object the node no longer needs.
|
|
|
|
```yaml
|
|
gc:
|
|
remover_batch_size: 200
|
|
remover_sleep_interval: 5m
|
|
expired_collector_batch_size: 500
|
|
expired_collector_worker_count: 5
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------------------------------|------------|---------------|----------------------------------------------------------|
|
|
| `remover_batch_size` | `int` | `100` | Amount of objects to grab in a single batch. |
|
|
| `remover_sleep_interval` | `duration` | `1m` | Time to sleep between iterations. |
|
|
| `expired_collector_batch_size` | `int` | `500` | Max amount of expired objects to grab in a single batch. |
|
|
| `expired_collector_worker_count` | `int` | `5` | Max amount of concurrent expired objects workers. |
|
|
|
|
### `metabase` subsection
|
|
|
|
```yaml
|
|
metabase:
|
|
path: /path/to/meta.db
|
|
perm: 0644
|
|
max_batch_size: 200
|
|
max_batch_delay: 20ms
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-------------------|------------|---------------|------------------------------------------------------------------------|
|
|
| `path` | `string` | | Path to the metabase file. |
|
|
| `perm` | file mode | `0660` | Permissions to set for the database file. |
|
|
| `max_batch_size` | `int` | `1000` | Maximum amount of write operations to perform in a single transaction. |
|
|
| `max_batch_delay` | `duration` | `10ms` | Maximum delay before a batch starts. |
|
|
|
|
### `writecache` subsection
|
|
|
|
```yaml
|
|
writecache:
|
|
enabled: true
|
|
path: /path/to/writecache
|
|
capacity: 4294967296
|
|
small_object_size: 16384
|
|
max_object_size: 134217728
|
|
flush_worker_count: 30
|
|
page_size: '4k'
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
| --------------------------- | ---------- | ------------- | ----------------------------------------------------------------------------------------------------------------------------- |
|
|
| `path` | `string` | | Path to the metabase file. |
|
|
| `capacity` | `size` | `1G` | Approximate maximum size of the writecache. If the writecache is full, objects are written to the blobstor directly. |
|
|
| `max_object_count` | `int` | unrestricted | Approximate maximum objects count in the writecache. If the writecache is full, objects are written to the blobstor directly. |
|
|
| `small_object_size` | `size` | `32K` | Maximum object size for "small" objects. This objects are stored in a key-value database instead of a file-system. |
|
|
| `max_object_size` | `size` | `64M` | Maximum object size allowed to be stored in the writecache. |
|
|
| `flush_worker_count` | `int` | `20` | Amount of background workers that move data from the writecache to the blobstor. |
|
|
| `max_flushing_objects_size` | `size` | `512M` | Max total size of background flushing objects. |
|
|
| `max_batch_size` | `int` | `1000` | Maximum amount of small object `PUT` operations to perform in a single transaction. |
|
|
| `max_batch_delay` | `duration` | `10ms` | Maximum delay before a batch starts. |
|
|
| `page_size` | `size` | `0` | Page size overrides the default OS page size for small objects storage. Does not affect the existing storage. |
|
|
|
|
|
|
# `node` section
|
|
|
|
```yaml
|
|
node:
|
|
wallet:
|
|
path: /path/to/wallet.json
|
|
address: NcpJzXcSDrh5CCizf4K9Ro6w4t59J5LKzz
|
|
password: password
|
|
addresses:
|
|
- grpc://external.ip:8082
|
|
attribute:
|
|
- "Price:11"
|
|
- "UN-LOCODE:RU MSK"
|
|
- "key:value"
|
|
relay: false
|
|
persistent_sessions:
|
|
path: /sessions
|
|
persistent_state:
|
|
path: /state
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------------------|---------------------------------------------------------------|---------------|-------------------------------------------------------------------------|
|
|
| `key` | `string` | | Path to the binary-encoded private key. |
|
|
| `wallet` | [Wallet config](#wallet-subsection) | | Wallet configuration. Has no effect if `key` is provided. |
|
|
| `addresses` | `[]string` | | Addresses advertised in the netmap. |
|
|
| `attribute` | `[]string` | | Node attributes as a list of key-value pairs in `<key>:<value>` format. |
|
|
| `relay` | `bool` | | Enable relay mode. |
|
|
| `persistent_sessions` | [Persistent sessions config](#persistent_sessions-subsection) | | Persistent session token store configuration. |
|
|
| `persistent_state` | [Persistent state config](#persistent_state-subsection) | | Persistent state configuration. |
|
|
|
|
## `wallet` subsection
|
|
N3 wallet configuration.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|------------|----------|---------------|------------------------------|
|
|
| `path` | `string` | | Path to the wallet file. |
|
|
| `address` | `string` | | Wallet address to use. |
|
|
| `password` | `string` | | Password to open the wallet. |
|
|
|
|
## `persistent_sessions` subsection
|
|
|
|
Contains persistent session token store configuration. By default sessions do not persist between restarts.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------|----------|---------------|-----------------------|
|
|
| `path` | `string` | | Path to the database. |
|
|
|
|
## `persistent_state` subsection
|
|
Configures persistent storage for auxiliary information, such as last seen block height.
|
|
It is used to correctly handle node restarts or crashes.
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------|----------|------------------------|------------------------|
|
|
| `path` | `string` | `.frostfs-storage-state` | Path to the database. |
|
|
|
|
# `apiclient` section
|
|
Configuration for the FrostFS API client used for communication with other FrostFS nodes.
|
|
|
|
```yaml
|
|
apiclient:
|
|
dial_timeout: 15s
|
|
stream_timeout: 20s
|
|
reconnect_timeout: 30s
|
|
```
|
|
| Parameter | Type | Default value | Description |
|
|
|-------------------|----------|---------------|-----------------------------------------------------------------------|
|
|
| dial_timeout | duration | `5s` | Timeout for dialing connections to other storage or inner ring nodes. |
|
|
| stream_timeout | duration | `15s` | Timeout for individual operations in a streaming RPC. |
|
|
| reconnect_timeout | duration | `30s` | Time to wait before reconnecting to a failed node. |
|
|
|
|
# `policer` section
|
|
|
|
Configuration for the Policer service. It ensures that object is stored according to the intended policy.
|
|
|
|
```yaml
|
|
policer:
|
|
head_timeout: 15s
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|----------------|------------|---------------|----------------------------------------------|
|
|
| `head_timeout` | `duration` | `5s` | Timeout for performing the `HEAD` operation. |
|
|
|
|
# `replicator` section
|
|
|
|
Configuration for the Replicator service.
|
|
|
|
```yaml
|
|
replicator:
|
|
put_timeout: 15s
|
|
pool_size: 10
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|---------------|------------|----------------------------------------|---------------------------------------------|
|
|
| `put_timeout` | `duration` | `5s` | Timeout for performing the `PUT` operation. |
|
|
| `pool_size` | `int` | Equal to `object.put.remote_pool_size` | Maximum amount of concurrent replications. |
|
|
|
|
# `object` section
|
|
Contains object-service related parameters.
|
|
|
|
```yaml
|
|
object:
|
|
put:
|
|
remote_pool_size: 100
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|-----------------------------|-------|---------------|------------------------------------------------------------------------------------------------|
|
|
| `delete.tombstone_lifetime` | `int` | `5` | Tombstone lifetime for removed objects in epochs. |
|
|
| `put.remote_pool_size` | `int` | `10` | Max pool size for performing remote `PUT` operations. Used by Policer and Replicator services. |
|
|
| `put.local_pool_size` | `int` | `10` | Max pool size for performing local `PUT` operations. Used by Policer and Replicator services. |
|
|
|
|
# `runtime` section
|
|
Contains runtime parameters.
|
|
|
|
```yaml
|
|
runtime:
|
|
soft_memory_limit: 1GB
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|---------------------|--------|---------------|--------------------------------------------------------------------------|
|
|
| `soft_memory_limit` | `size` | 0 | Soft memory limit for the runtime. Zero or no value stands for no limit. If `GOMEMLIMIT` environment variable is set, the value from the configuration file will be ignored. |
|
|
|
|
# `audit` section
|
|
Contains audit parameters.
|
|
|
|
```yaml
|
|
audit:
|
|
enabled: true
|
|
```
|
|
|
|
| Parameter | Type | Default value | Description |
|
|
|---------------------|--------|---------------|---------------------------------------------------|
|
|
| `soft_memory_limit` | `bool` | false | If `true` then audit event logs will be recorded. |
|