Unify and refactor config handling #1610

Open
opened 2025-01-22 12:45:55 +00:00 by fyrchik · 4 comments
Owner

viper library has a beautiful Unmarshal() function 8b223a45ce/README.md (L523)

It allows to take viper as-is (i.e. with all different config sources) and map it to a struct.
This should help us tremendously:

  1. Instead of custom Config with hand-rolled hierarchy handling, just use struct fields.
  2. Comments can be used to autogenerate documentation from the struct. Currently, it is harder, if possible at all.
  3. Richer types could be used to add additional validation.
  4. Finally, it is possible to warn user about non-existent fields.

To give an example:
This section in the markdown doc https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/docs/storage-node-configuration.md#tls-subsection can be autogenerated from

type TLS struct {
   // Address that control service listener binds to.
   enabled bool `mapstructure:"enabled" default:"false"`
   // Path to the TLS certificate.
   certificate string `mapstructure:"certificate"`
   // Path to the key.
   key string `mapstructure:"certificate"`
   // If true, ciphers considered insecure by Go stdlib are allowed to be used.
   insecureCrypto bool `mapstructure:"use_insecure_crypto" default:"false"`
   // We can write anything here, because it won't be visible in doc.
   skipVerification bool `mapstructure:"insecure_skip_verify" omitdoc:"true"`
}

I suggest making a POC with IR and then support it in node, cli and adm.

`viper` library has a beautiful `Unmarshal()` function https://github.com/spf13/viper/blob/8b223a45cef1badfd02317591a316095fb15a5d2/README.md?plain=1#L523 It allows to take viper as-is (i.e. with all different config sources) and map it to a struct. This should help us tremendously: 1. Instead of custom `Config` with hand-rolled hierarchy handling, just use struct fields. 2. Comments can be used to autogenerate documentation from the struct. Currently, it is harder, if possible at all. 3. Richer types could be used to add additional validation. 4. Finally, it is possible to warn user about non-existent fields. To give an example: This section in the markdown doc https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/docs/storage-node-configuration.md#tls-subsection can be autogenerated from ```golang type TLS struct { // Address that control service listener binds to. enabled bool `mapstructure:"enabled" default:"false"` // Path to the TLS certificate. certificate string `mapstructure:"certificate"` // Path to the key. key string `mapstructure:"certificate"` // If true, ciphers considered insecure by Go stdlib are allowed to be used. insecureCrypto bool `mapstructure:"use_insecure_crypto" default:"false"` // We can write anything here, because it won't be visible in doc. skipVerification bool `mapstructure:"insecure_skip_verify" omitdoc:"true"` } ``` I suggest making a POC with IR and then support it in node, cli and adm.
fyrchik added the
frostfs-ir
frostfs-node
refactoring
discussion
labels 2025-01-22 12:46:04 +00:00
Author
Owner

How to handle shard.default is not obvious, but is possible.

How to handle `shard.default` is not obvious, but is possible.
a-savchuk self-assigned this 2025-02-24 08:36:36 +00:00
Member

Both frostfs-node and frostfs-ir can read configuration from files and environment variables. Unmarshal can use environment variables but this feature is still experimental and hidden behind a build tag. I'd like to know, is that okay for us to enable an experimental feature?

Both `frostfs-node` and `frostfs-ir` can read configuration from files and environment variables. `Unmarshal` [can](https://github.com/spf13/viper/pull/1429) use environment variables but this feature is still experimental and [hidden](https://github.com/spf13/viper/pull/1429#issuecomment-1870976604) behind a build tag. I'd like to know, is that okay for us to enable an experimental feature?
Member

I compared the configuration from one of our virtual clusters and the example configuration. I found out the example is missing two sections: audit and kludge_compatibility_mode.

audit:
    enabled: ...
    pdp:
        max_sleep_interval: ...

kludge_compatibility_mode: ...

Moreover, frostfs-ir tries to read node.kludge_compatibility_mode instead of just kludge_compatibility_mode. I have several questions:

  • Are these two section still in use?
  • If so, what should the default value max_sleep_interval be?
I compared the configuration from one of our virtual clusters and the example configuration. I found out the example is missing two sections: `audit` and `kludge_compatibility_mode`. ```yaml audit: enabled: ... pdp: max_sleep_interval: ... kludge_compatibility_mode: ... ``` Moreover, `frostfs-ir` tries to read `node.kludge_compatibility_mode` instead of just `kludge_compatibility_mode`. I have several questions: - Are these two section still in use? - If so, what should the default value `max_sleep_interval` be?
Member

TL;DR: viper.Unmarshal doesn't work well with slices when they come from environment variables. A workaround is needed.

Problem

Consider this config file:

control:
	authorized_keys:
		- key: 03583...
		- key: 028f4...
grpc:
	endpoint: localhost:8090

I want to load it into this Go struct:

type Config struct {
	Control struct {
		AuthorizedKeys []struct {
			Key string `mapstructure:"key"`
		} `mapstructure:"authorized_keys"`
		Grpc struct {
			Endpoint string `mapstructure:"endpoint"`
		} `mapstructure:"grpc"`
	} `mapstructure:"control"`
}

How viper.Unmarshal works:

  1. Reads the config from a yaml/json file into a map[string]any.
  2. Uses the mapstructure library to turn the map into a struct.

This works fine for files. Now I want do define config via environment variables:

CONTROL_AUTHORIZED_KEYS_0_KEY="03583..."
CONTROL_AUTHORIZED_KEYS_1_KEY="028f4..."
CONTORL_GRPC_ENDPOINT="localhost:8090"

viper has an experimental feature that lets it read environment variables while unmarshalling. When the viper_bind_struct flag is on, it works as follows:

  1. Converts a zero-value struct into a temporary map using mapstructure, then flattens that map to generate a list of config keys, for example, control.grpc.endpoint.
  2. Reads the config from a yaml/json file into a map[string]any and populates it with the keys got on step 1, using their corresponding values from environment variables.
  3. Uses the mapstructure library to turn the map into the struct.

However, AuthorizedKeys is a slice, and configuration keys for slice elements (like control.authorized_keys.0.key) are missing because there's no information about the slice length in step 1.

This is the main problem. viper.Unmarshal works well for

  • config defined with yaml/json files
  • config with structs, nested structs defined via environment variables
    But it does not work for slices defined via environment variables.

Possible solutions

0. Do nothing

Keep things as they are and don't use viper.Unmarshal.

1. Don't use environment variables for defining slices

Define slices only in config files.

2. Use decode hooks

mapstructure allows defining custom hooks. A custom decode hook may look like this:

type AuthorizedKeys []struct{
	Address string
}

func decodeHook(from, to reflect.Value) (any, error) {
	if to.Type() == reflect.TypeOf(AuthorizedKeys{}) {
        // decode it
	}
}

But I think these hooks will be hard to write and maintain.

3. Specify keys for environment variable manually

I think we can read all environment variables with os.Environ and create a map with config variables to use when decoding. However, this can be hard because environment variables lack information about config hierarchy. For example, CONTROL_AUTHORIZED_KEYS could be interpreted as control.authorized_keys, control.authorized.keys, control_authorized.keys, or control_authorized_keys. So, we'd need to preprocess the config struct with Go's reflect to get correct keys. I think the solution will be hard to implement and cumbersome.

4. Unify the way we read configs and write a custom decoder

We use the same approach when reading configs. We manually call viper.Get for each config key. I think we can move config keys to struct tags, then walk the struct and do all viper.Get calls based on the struct tags. For example, consider the following struct:

type Config struct {
	Control struct {
		AuthorizedKeys []struct {
			Key string `key:"key"`
		} `key:"authorized_keys"`
		Grpc struct {
			Endpoint string `key:"endpoint"`
		} `key:"grpc"`
	} `key:"control"`
}

after walking this struct with Go's reflection, we'll do all needed viper.Get calls like these

viper.Get("control.authorized_keys.0.key")
viper.Get("control.authorized_keys.1.key")
viper.Get("control.authorized_keys.2.key")
// do it until viper.Get returns nil
viper.Get("control.grpc.endpoint")

I made an example of how this can be implemented. It doesn't support slices of structs because it's still a bit challenge, but it already supports:

  • structure fields of all primitive types (string, int, uint, bool and etc.)
  • nested structures
  • custom hooks for decoding non-primitive types, to show it I made two hooks for time.Duration and net.IP
  • default values defined with a struct tag, for example:
type config struct {
	Pprof struct {
		Enable bool `key:"enable" default:"true"`
		ShutdownTimeout time.Duration `key:"shutdown_timeout", default:"15s"`
	} `key:"pprof"`
}

Also it can returns errors if:

  • a struct field is unused, i. e. it doesn't have a key tag)
  • a struct field is unset, i. e. it wasn't defined in a config file or via environment variable and doesn't have a default tag)
TL;DR: `viper.Unmarshal` doesn't work well with slices when they come from environment variables. A workaround is needed. ### Problem Consider this config file: ```yaml control: authorized_keys: - key: 03583... - key: 028f4... grpc: endpoint: localhost:8090 ``` I want to load it into this Go struct: ```go type Config struct { Control struct { AuthorizedKeys []struct { Key string `mapstructure:"key"` } `mapstructure:"authorized_keys"` Grpc struct { Endpoint string `mapstructure:"endpoint"` } `mapstructure:"grpc"` } `mapstructure:"control"` } ``` How `viper.Unmarshal` works: 1. Reads the config from a `yaml`/`json` file into a `map[string]any`. 2. Uses the `mapstructure` library to turn the map into a struct. This works fine for files. Now I want do define config via environment variables: ``` CONTROL_AUTHORIZED_KEYS_0_KEY="03583..." CONTROL_AUTHORIZED_KEYS_1_KEY="028f4..." CONTORL_GRPC_ENDPOINT="localhost:8090" ``` `viper` has [an experimental feature](https://github.com/spf13/viper/pull/1429) that lets it read environment variables while unmarshalling. When the `viper_bind_struct` flag is on, it works as follows: 1. Converts a zero-value struct into a temporary map using `mapstructure`, then flattens that map to generate a list of config keys, for example, `control.grpc.endpoint`. 2. Reads the config from a `yaml`/`json` file into a `map[string]any` and populates it with the keys got on step 1, using their corresponding values from environment variables. 3. Uses the `mapstructure` library to turn the map into the struct. However, `AuthorizedKeys` is a slice, and configuration keys for slice elements (like `control.authorized_keys.0.key`) are missing because there's no information about the slice length in step 1. This is the main problem. `viper.Unmarshal` works well for - config defined with `yaml`/`json` files - config with structs, nested structs defined via environment variables But it does not work for slices defined via environment variables. ### Possible solutions #### 0. Do nothing Keep things as they are and don't use `viper.Unmarshal`. #### 1. Don't use environment variables for defining slices Define slices only in config files. #### 2. Use decode hooks `mapstructure` allows defining custom hooks. A custom decode hook may look like this: ```go type AuthorizedKeys []struct{ Address string } func decodeHook(from, to reflect.Value) (any, error) { if to.Type() == reflect.TypeOf(AuthorizedKeys{}) { // decode it } } ``` But I think these hooks will be hard to write and maintain. #### 3. Specify keys for environment variable manually I think we can read all environment variables with `os.Environ` and create a map with config variables to use when decoding. However, this can be hard because environment variables lack information about config hierarchy. For example, `CONTROL_AUTHORIZED_KEYS` could be interpreted as `control.authorized_keys`, `control.authorized.keys`, `control_authorized.keys`, or `control_authorized_keys`. So, we'd need to preprocess the config struct with Go's reflect to get correct keys. I think the solution will be hard to implement and cumbersome. #### 4. Unify the way we read configs and write a custom decoder We use the same approach when reading configs. We manually call `viper.Get` for each config key. I think we can move config keys to struct tags, then walk the struct and do all `viper.Get` calls based on the struct tags. For example, consider the following struct: ```go type Config struct { Control struct { AuthorizedKeys []struct { Key string `key:"key"` } `key:"authorized_keys"` Grpc struct { Endpoint string `key:"endpoint"` } `key:"grpc"` } `key:"control"` } ``` after walking this struct with Go's reflection, we'll do all needed `viper.Get` calls like these ```go viper.Get("control.authorized_keys.0.key") viper.Get("control.authorized_keys.1.key") viper.Get("control.authorized_keys.2.key") // do it until viper.Get returns nil viper.Get("control.grpc.endpoint") ``` I made [an example](https://git.frostfs.info/a-savchuk/config/src/commit/6b1db46a37c0f1eda0092033ec49083c7a6b9515) of how this can be implemented. It doesn't support slices of structs because it's still a bit challenge, but it already supports: - structure fields of all primitive types (`string`, `int`, `uint`, `bool` and etc.) - nested structures - custom hooks for decoding non-primitive types, to show it I made two hooks for `time.Duration` and `net.IP` - default values defined with a struct tag, for example: ```go type config struct { Pprof struct { Enable bool `key:"enable" default:"true"` ShutdownTimeout time.Duration `key:"shutdown_timeout", default:"15s"` } `key:"pprof"` } ``` Also it can returns errors if: - a struct field is unused, i. e. it doesn't have a `key` tag) - a struct field is unset, i. e. it wasn't defined in a config file or via environment variable and doesn't have a `default` tag)
Sign in to join this conversation.
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1610
No description provided.