Add runtime.memory_limit config parameter #537

Merged
fyrchik merged 1 commits from dstepanov-yadro/frostfs-node:fix/memlimit into master 2023-07-26 15:36:10 +00:00

This parameter allows to set soft memory limit for Go runtime's GC. It can be useful for GC to work more aggressively in the case when the amount of memory occupied is close to the amount of RAM in the system.

This value can also be set via GOMEMLIMIT environment variable , but I put it in the config for several reasons:

  • user does not have to know that frostfs-node is written in Go;
  • a smart user will not change the environment variables of the Go runtime environment, since he does not know whether the environment parameters in the code change, otherwise he can get the wrong configuration;
  • the configuration file must be a single point of configuration;
  • value change is supported without rebooting via SIGHUP.
This parameter allows to set soft memory limit for Go runtime's GC. It can be useful for GC to work more aggressively in the case when the amount of memory occupied is close to the amount of RAM in the system. This value can also be set via `GOMEMLIMIT` environment variable , but I put it in the config for several reasons: - user does not have to know that frostfs-node is written in Go; - a smart user will not change the environment variables of the Go runtime environment, since he does not know whether the environment parameters in the code change, otherwise he can get the wrong configuration; - the configuration file must be a single point of configuration; - value change is supported without rebooting via SIGHUP.
dstepanov-yadro force-pushed fix/memlimit from c511716000 to 1ee2f68038 2023-07-21 09:29:40 +00:00 Compare
dstepanov-yadro requested review from storage-core-committers 2023-07-21 09:30:42 +00:00
dstepanov-yadro requested review from storage-core-developers 2023-07-21 09:30:42 +00:00
fyrchik reviewed 2023-07-21 16:09:57 +00:00
fyrchik left a comment
Owner

I am ok with this PR, as long as we agree to support it like this across all our stack.
@alexvanin @realloc @TrueCloudLab/storage-core-developers @TrueCloudLab/storage-core-committers

My concerns (the 3rd is the biggest one):

  1. To me this doesn't look like an application parameter and is tricky to explain.
  2. This is the code we have for the feature that already exists and is already supported (and it is not inconvenient to use IMO).
  3. Previously you could specify some limits for CPU/memory in the systemd unit file: GOMEMLIMIT looks nice somewhere near. It seems we lose this possibility in the current implementation: missing value is interpreted as "ignore everything and disable the limit". I believe this is unexpected for the user and should be changed (also it is uniform with 4, if we do not support it).
  4. GOGC is another parameter we might want to change, but do not support currently. Shall we? What is our possible policy on this?
  5. We already have runtime settings for pprof, but they are in another section, could be misleading.

One of the benefits I see is the ability to have more clever options in future (like memory_limit: adaptive ref #106) or to have simple settings for a combination of parameters (like storage.lowmem: true which is used in a number of places).

I am ok with this PR, as long as we agree to support it like this across all our stack. @alexvanin @realloc @TrueCloudLab/storage-core-developers @TrueCloudLab/storage-core-committers My concerns (the 3rd is the biggest one): 1. To me this doesn't look like an application parameter and is tricky to explain. 2. This is the code we _have_ for the feature that already exists and is already supported (and it is not inconvenient to use IMO). 3. Previously you could specify some limits for CPU/memory in the systemd unit file: `GOMEMLIMIT` looks nice somewhere near. It seems we _lose_ this possibility in the current implementation: missing value is interpreted as "ignore everything and disable the limit". I believe this is unexpected for the user and should be changed (also it is uniform with 4, if we do not support it). 4. GOGC is another parameter we might want to change, but do not support currently. Shall we? What is our possible policy on this? 5. We already have runtime settings for pprof, but they are in another section, could be misleading. One of the benefits I see is the ability to have more clever options in future (like `memory_limit: adaptive` ref #106) or to have simple settings for a combination of parameters (like `storage.lowmem: true` which is used in a number of places).
@ -429,0 +433,4 @@
```yaml
runtime:
memory_limit: 1GB

I think memory limit is misleading (no one reads the docs), can we add soft_?

I think memory limit is misleading (no one reads the docs), can we add `soft_`?
Poster
Collaborator

ok, fixed

ok, fixed
fyrchik marked this conversation as resolved
@ -493,4 +493,5 @@ const (
FrostFSNodeNodeIsUnderMaintenanceSkipInitialBootstrap = "the node is under maintenance, skip initial bootstrap"
EngineCouldNotChangeShardModeToDisabled = "could not change shard mode to disabled"
NetmapNodeAlreadyInCandidateListOnlineSkipInitialBootstrap = "the node is already in candidate list with online state, skip initial bootstrap"
RuntimeMemoryLimitUpdated = "runtime memory limit value updated"

And here too

And here too
Poster
Collaborator

fixed

fixed
fyrchik marked this conversation as resolved
dstepanov-yadro force-pushed fix/memlimit from 1ee2f68038 to 7573346869 2023-07-24 07:01:53 +00:00 Compare
fyrchik requested review from alexvanin 2023-07-24 07:56:17 +00:00
aarifullin approved these changes 2023-07-24 09:11:02 +00:00
Poster
Collaborator

I am ok with this PR, as long as we agree to support it like this across all our stack.
@alexvanin @realloc @TrueCloudLab/storage-core-developers @TrueCloudLab/storage-core-committers

My concerns (the 3rd is the biggest one):

  1. To me this doesn't look like an application parameter and is tricky to explain.
  2. This is the code we have for the feature that already exists and is already supported (and it is not inconvenient to use IMO).
  3. Previously you could specify some limits for CPU/memory in the systemd unit file: GOMEMLIMIT looks nice somewhere near. It seems we lose this possibility in the current implementation: missing value is interpreted as "ignore everything and disable the limit". I believe this is unexpected for the user and should be changed (also it is uniform with 4, if we do not support it).
  4. GOGC is another parameter we might want to change, but do not support currently. Shall we? What is our possible policy on this?
  5. We already have runtime settings for pprof, but they are in another section, could be misleading.

One of the benefits I see is the ability to have more clever options in future (like memory_limit: adaptive ref #106) or to have simple settings for a combination of parameters (like storage.lowmem: true which is used in a number of places).

  1. This is a soft memory limit that the application tries not to exceed.
  2. I didn't understand this point.
  3. As far as I remember, neither the specification nor the documentation says that we support the GOMEMLIMIT environment parameter. In addition, this environment variable is only available from version 1.19 of the Go language. Speaking generally, I would consider the case you described using an undocumented feature at my own risk.
  4. Right now I don't see the need to change GOGCPercent. If it was necessary to add a parameter to the config for GOGC, I would suggest runtime.memory_free_frequency or so on.
  5. They are in pprof section, because they pprof related, i think.
> I am ok with this PR, as long as we agree to support it like this across all our stack. > @alexvanin @realloc @TrueCloudLab/storage-core-developers @TrueCloudLab/storage-core-committers > > My concerns (the 3rd is the biggest one): > 1. To me this doesn't look like an application parameter and is tricky to explain. > 2. This is the code we _have_ for the feature that already exists and is already supported (and it is not inconvenient to use IMO). > 3. Previously you could specify some limits for CPU/memory in the systemd unit file: `GOMEMLIMIT` looks nice somewhere near. It seems we _lose_ this possibility in the current implementation: missing value is interpreted as "ignore everything and disable the limit". I believe this is unexpected for the user and should be changed (also it is uniform with 4, if we do not support it). > 4. GOGC is another parameter we might want to change, but do not support currently. Shall we? What is our possible policy on this? > 5. We already have runtime settings for pprof, but they are in another section, could be misleading. > > One of the benefits I see is the ability to have more clever options in future (like `memory_limit: adaptive` ref #106) or to have simple settings for a combination of parameters (like `storage.lowmem: true` which is used in a number of places). 1. This is a soft memory limit that the application tries not to exceed. 2. I didn't understand this point. 3. As far as I remember, neither the specification nor the documentation says that we support the `GOMEMLIMIT` environment parameter. In addition, this environment variable is only available from version 1.19 of the Go language. Speaking generally, I would consider the case you described using an undocumented feature at my own risk. 4. Right now I don't see the need to change GOGCPercent. If it was necessary to add a parameter to the config for `GOGC`, I would suggest runtime.memory_free_frequency or so on. 5. They are in `pprof` section, because they `pprof` related, i think.

Proposal looks good for me and we can support the same configuration in protocol gates.

Although I would rather keep support of the GO environment variables. While users don't know about Go in the backend, as a developer I expect GOMEMLIMIT to work the same regardless of the application configuration.

func setRuntimeParameters(c *cfg) {
	if len(os.Getenv("GOMEMLIMIT")) != 0 {	
		// default limit < yaml limit < app env limit < GOMEMLIMIT
		return
	}
	memLimitBytes := runtime.GCMemoryLimitBytes(c.appCfg)
	previous := debug.SetMemoryLimit(memLimitBytes)
	if memLimitBytes != previous {
		c.log.Info(logs.RuntimeMemoryLimitUpdated,
			zap.Int64("new_value", memLimitBytes),
			zap.Int64("old_value", previous))
	}
}
Proposal looks good for me and we can support the same configuration in protocol gates. Although I would rather keep support of the GO environment variables. While users don't know about Go in the backend, as a developer I expect `GOMEMLIMIT` to work the same regardless of the application configuration. ```go func setRuntimeParameters(c *cfg) { if len(os.Getenv("GOMEMLIMIT")) != 0 { // default limit < yaml limit < app env limit < GOMEMLIMIT return } memLimitBytes := runtime.GCMemoryLimitBytes(c.appCfg) previous := debug.SetMemoryLimit(memLimitBytes) if memLimitBytes != previous { c.log.Info(logs.RuntimeMemoryLimitUpdated, zap.Int64("new_value", memLimitBytes), zap.Int64("old_value", previous)) } } ```
alexvanin approved these changes 2023-07-24 10:16:17 +00:00
dstepanov-yadro force-pushed fix/memlimit from 7573346869 to a2c9e813c7 2023-07-24 11:59:46 +00:00 Compare
Poster
Collaborator

Proposal looks good for me and we can support the same configuration in protocol gates.

Although I would rather keep support of the GO environment variables. While users don't know about Go in the backend, as a developer I expect GOMEMLIMIT to work the same regardless of the application configuration.

func setRuntimeParameters(c *cfg) {
	if len(os.Getenv("GOMEMLIMIT")) != 0 {	
		// default limit < yaml limit < app env limit < GOMEMLIMIT
		return
	}
	memLimitBytes := runtime.GCMemoryLimitBytes(c.appCfg)
	previous := debug.SetMemoryLimit(memLimitBytes)
	if memLimitBytes != previous {
		c.log.Info(logs.RuntimeMemoryLimitUpdated,
			zap.Int64("new_value", memLimitBytes),
			zap.Int64("old_value", previous))
	}
}

Agree, fixed

> Proposal looks good for me and we can support the same configuration in protocol gates. > > Although I would rather keep support of the GO environment variables. While users don't know about Go in the backend, as a developer I expect `GOMEMLIMIT` to work the same regardless of the application configuration. > > ```go > func setRuntimeParameters(c *cfg) { > if len(os.Getenv("GOMEMLIMIT")) != 0 { > // default limit < yaml limit < app env limit < GOMEMLIMIT > return > } > memLimitBytes := runtime.GCMemoryLimitBytes(c.appCfg) > previous := debug.SetMemoryLimit(memLimitBytes) > if memLimitBytes != previous { > c.log.Info(logs.RuntimeMemoryLimitUpdated, > zap.Int64("new_value", memLimitBytes), > zap.Int64("old_value", previous)) > } > } > ``` Agree, fixed
fyrchik reviewed 2023-07-24 16:15:58 +00:00
@ -429,0 +438,4 @@
| Parameter | Type | Default value | Description |
|---------------------|--------|---------------|--------------------------------------------------------------------------|
| `soft_memory_limit` | `size` | 0 | Soft memory limit for the runtime. Zero or no value stands for no limit. |

Can we also mention that GOMEMLIMIT has more priority?

Can we also mention that `GOMEMLIMIT` has more priority?
Poster
Collaborator

Done

Done
fyrchik approved these changes 2023-07-24 16:16:01 +00:00
dstepanov-yadro force-pushed fix/memlimit from a2c9e813c7 to 805a4b3fae 2023-07-25 07:02:14 +00:00 Compare
dstepanov-yadro force-pushed fix/memlimit from 805a4b3fae to 588370e34a 2023-07-25 07:08:31 +00:00 Compare
fyrchik approved these changes 2023-07-25 09:04:54 +00:00
@ -0,0 +11,4 @@
memoryLimitDefault = math.MaxInt64
)
// GCMemoryLimitBytes returns the value of "memory_limit" config parameter from "runtime" section.

Also soft here.

Also `soft` here.
Poster
Collaborator

fixed

fixed
fyrchik marked this conversation as resolved
dstepanov-yadro force-pushed fix/memlimit from 588370e34a to 32c77f3a23 2023-07-25 14:28:33 +00:00 Compare
fyrchik approved these changes 2023-07-26 08:24:18 +00:00
fyrchik merged commit 32c77f3a23 into master 2023-07-26 15:36:10 +00:00
Sign in to join this conversation.
No Milestone
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#537
There is no content yet.