object: Sort nodes by priority metrics to compute GET requests #1439

Merged
fyrchik merged 2 commits from acid-ant/frostfs-node:feat/get-prioritization into master 2024-10-29 08:05:10 +00:00
Member

Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com

Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
acid-ant force-pushed feat/get-prioritization from 64bbae5369 to 78e4d71cc0 2024-10-18 14:04:18 +00:00 Compare
acid-ant force-pushed feat/get-prioritization from 78e4d71cc0 to 1de4a7d5fe 2024-10-21 06:02:52 +00:00 Compare
acid-ant changed title from WIP: object: Sort nodes by priority metrics to compute GET/SEARCH requests to object: Sort nodes by priority metrics to compute GET/SEARCH requests 2024-10-21 06:24:24 +00:00
acid-ant requested review from storage-core-committers 2024-10-21 06:24:30 +00:00
acid-ant requested review from storage-core-developers 2024-10-21 06:24:30 +00:00
acid-ant force-pushed feat/get-prioritization from 1de4a7d5fe to 8c4df77878 2024-10-21 08:45:22 +00:00 Compare
Author
Member

Remove usage of priority metrics from SEARCH service.

Remove usage of priority metrics from SEARCH service.
fyrchik requested changes 2024-10-21 08:55:29 +00:00
Dismissed
@ -0,0 +12,4 @@
)
type Metric interface {
CalculateValue(*netmap.NodeInfo, *netmap.NodeInfo) []byte
Owner

What does the resulting type represent?
I expect metric to be a number (int or float)

What does the resulting type represent? I expect metric to be a number (int or float)
Author
Member

The idea was to use already existed api for sort instead of writing the custom for comparing []int/float lexicographically. For geoDistance metric, just convert distance to uint32 and then to []byte. The same is for any other metrics with one thing in mind - need to compare the resulting []byte lexicographically.

The idea was to use already existed api for sort instead of writing the custom for comparing `[]int/float` lexicographically. For `geoDistance` metric, just convert distance to `uint32` and then to `[]byte`. The same is for any other metrics with one thing in mind - need to compare the resulting `[]byte` lexicographically.
Owner

It is not that hard to implement. And we have slices.Compare.

It is not that hard to implement. And we have `slices.Compare`.
Author
Member

Thanks, fixed.

Thanks, fixed.
fyrchik marked this conversation as resolved
@ -0,0 +22,4 @@
return errors.New("unsupported priority metric")
}
func ParseMetric(raw string) Metric {
Owner

Honestly, I don't like separating Parse* and Validate*. Parsing performs validation automatically.
It may bite us in unexpected places, like silent errors on SIGHUP. And now we need to be sure we always validate.
Can we stay with ParseMetric() (Metric, error)?

Honestly, I don't like separating `Parse*` and `Validate*`. Parsing performs validation _automatically_. It may bite us in unexpected places, like silent errors on SIGHUP. And now we need to be sure we always validate. Can we stay with `ParseMetric() (Metric, error)`?
Author
Member

That was intentionally, because we don't have an ability to exit gracefully when reading config - only panic is possible.

That was intentionally, because we don't have an ability to exit gracefully when reading config - only [panic](https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/8b6ec57c6147e5b784d78bc891144dd55493503d/cmd/frostfs-node/config.go#L677) is possible.
Owner

I don't see how this is related.
If we currently panic on invalid config -- be it.
SIGHUP doesn't panic anyway.

I don't see how this is related. If we currently panic on invalid config -- be it. SIGHUP doesn't panic anyway.
Author
Member

Updated.

Updated.
fyrchik marked this conversation as resolved
@ -0,0 +46,4 @@
}
func NewAttributeMetric(raw string) Metric {
attr, _ := strings.CutPrefix(raw, attrPrefix)
Owner

It seems you CutPrefix already in ParseMetric. It CutPrefix here intentional?

It seems you `CutPrefix` already in `ParseMetric`. It `CutPrefix` here intentional?
Author
Member

Thanks, fixed.

Thanks, fixed.
@ -55,1 +68,3 @@
const invalidOptsMsg = "invalid traverser options"
const (
invalidOptsMsg = "invalid traverser options"
uint16Bytes = 2
Owner
`uint16Size` ? https://github.com/golang/go/blob/c1d9303d82de0bae1b861b17ec4f9812392ea3cb/src/math/bits/bits.go#L20
Author
Member

No more relevant.

No more relevant.
fyrchik marked this conversation as resolved
@ -160,0 +208,4 @@
for _, m := range cfg.metrics {
metrics[i] = append(metrics[i], m.CalculateValue(&node, &unsortedVector[i])...)
}
binary.LittleEndian.PutUint16(b, uint16(i))
Owner

There is binary.LittleEndian.AppendUint16

There is `binary.LittleEndian.AppendUint16`
Author
Member

Updated.

Updated.
fyrchik marked this conversation as resolved
@ -160,0 +209,4 @@
metrics[i] = append(metrics[i], m.CalculateValue(&node, &unsortedVector[i])...)
}
binary.LittleEndian.PutUint16(b, uint16(i))
metrics[i] = append(metrics[i], b...)
Owner

Why do we need to append an index?

Why do we need to append an index?
Author
Member

We need to store somewhere the relation between metric value and node. It is impossible to use slice as a key for map.

We need to store somewhere the relation between metric value and node. It is impossible to use slice as a key for map.
Owner

Can we use a struct with 2 fields then?
Don't see how []byte with overloaded meaning is better.

Can we use a struct with 2 fields then? Don't see how `[]byte` with overloaded meaning is better.
Author
Member

Updated, please review.

Updated, please review.
fyrchik marked this conversation as resolved
@ -277,1 +280,4 @@
}
type nodeState struct {
node *netmapAPI.NodeInfo
Owner

We try to avoid explicit API imports.
Can we use the one from sdk?

We try to avoid explicit API imports. Can we use the one from sdk?
Author
Member

Ok, but for that I need to refactor a bit config and netmap service.

Ok, but for that I need to refactor a bit [config](https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/8b6ec57c6147e5b784d78bc891144dd55493503d/cmd/frostfs-node/config.go#L1179) and netmap [service](https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/8b6ec57c6147e5b784d78bc891144dd55493503d/pkg/services/netmap/executor.go#L79).
Author
Member

Refactored.

Refactored.
fyrchik marked this conversation as resolved
@ -278,0 +324,4 @@
next := tr.Next()
require.NotNil(t, next)
require.Equal(t, 3, len(next))
require.Equal(t, "/ip4/0.0.0.0/tcp/2", string(next[0].PublicKey()))
Owner

This 2, 0, 1 seems completely random. What is the reason for this exact sequence?

This `2, 0, 1` seems completely random. What is the reason for this exact sequence?
Author
Member

Added comments.

Added comments.
aarifullin reviewed 2024-10-21 12:41:14 +00:00
@ -160,0 +209,4 @@
metrics[i] = append(metrics[i], m.CalculateValue(&node, &unsortedVector[i])...)
}
binary.LittleEndian.PutUint16(b, uint16(i))
metrics[i] = append(metrics[i], b...)
Member

Could we consider to use

type nodeMetrics struct 
   index int
   metrics []byte
}

for i := range unsortedVector {
  nm := nodeMetrics{index: i}
  for _, m := range cfg.metrics {
     nm.metrics = append(nm.metrics, m.CalculateValue(&node, &unsortedVector[i])...)
  }
  nms = append(nms, nm) 
}

instead of using the suffix - honestly, I spent some time to understand these steps

Could we consider to use ```go type nodeMetrics struct index int metrics []byte } for i := range unsortedVector { nm := nodeMetrics{index: i} for _, m := range cfg.metrics { nm.metrics = append(nm.metrics, m.CalculateValue(&node, &unsortedVector[i])...) } nms = append(nms, nm) } ``` instead of using the suffix - honestly, I spent some time to understand these steps
Author
Member

Yes, I'll rewrite this part to be more readable and simpler, thanks.

Yes, I'll rewrite this part to be more readable and simpler, thanks.
Author
Member

Updated.

Updated.
aarifullin marked this conversation as resolved
aarifullin reviewed 2024-10-21 13:14:41 +00:00
@ -103,0 +123,4 @@
unsortedVector = append(unsortedVector, ns[i][:rem[i]]...)
regularVector = append(regularVector, ns[i][rem[i]:]...)
}
rem = make([]int, 2)
Member

I believe compile-time defined []int{-1, -1} must be better :) The same is fair for the lines below

I believe compile-time defined `[]int{-1, -1}` must be better :) The same is fair for the lines below
Author
Member

Fixed.

Fixed.
aarifullin marked this conversation as resolved
fyrchik reviewed 2024-10-22 07:10:19 +00:00
@ -101,2 +117,3 @@
var rem []int
if cfg.flatSuccess != nil {
if len(cfg.metrics) > 0 {
rem = defaultCopiesVector(cfg.policy)
Owner

Have you checked this behaves properly with EC?

Have you checked this behaves properly with EC?
Author
Member

Thanks, added test for EC container. Works as expected.

Thanks, added test for EC container. Works as expected.
fyrchik marked this conversation as resolved
acid-ant force-pushed feat/get-prioritization from 8c4df77878 to 6f1b5ca613 2024-10-22 07:56:14 +00:00 Compare
acid-ant force-pushed feat/get-prioritization from 6f1b5ca613 to f0d0050634 2024-10-22 07:56:33 +00:00 Compare
acid-ant changed title from object: Sort nodes by priority metrics to compute GET/SEARCH requests to WIP: object: Sort nodes by priority metrics to compute GET/SEARCH requests 2024-10-22 08:07:26 +00:00
acid-ant changed title from WIP: object: Sort nodes by priority metrics to compute GET/SEARCH requests to WIP: object: Sort nodes by priority metrics to compute GET requests 2024-10-22 08:07:36 +00:00
acid-ant force-pushed feat/get-prioritization from f0d0050634 to 2f62a05f6a 2024-10-22 08:34:02 +00:00 Compare
Author
Member

Updated sort function.

Updated sort function.
acid-ant changed title from WIP: object: Sort nodes by priority metrics to compute GET requests to object: Sort nodes by priority metrics to compute GET requests 2024-10-22 08:34:53 +00:00
acid-ant force-pushed feat/get-prioritization from 2f62a05f6a to df784dd93d 2024-10-22 08:36:59 +00:00 Compare
acid-ant force-pushed feat/get-prioritization from df784dd93d to cfc3855922 2024-10-22 08:43:54 +00:00 Compare
fyrchik reviewed 2024-10-22 08:58:03 +00:00
@ -278,0 +316,4 @@
}),
WithoutSuccessTracking(),
WithPriorityMetrics(m),
WithNodeState(&nodeState{
Owner

You provide this interface each time the traverser is created.
Why have you decided to use an interface instead of directly providing local node info?

You provide this interface each time the traverser is created. Why have you decided to use an interface instead of directly providing local node info?
Author
Member

Because we are changing this value at each epoch. Also, that will be helpful when we have an ability to change the list of the storage node attributes in runtime.

Because we are changing this value at each epoch. Also, that will be helpful when we have an ability to change the list of the storage node attributes in runtime.
Owner

New traverser is created for each request, so we would rather like this value not to be changed inside the traverser.

New traverser is created for each request, so we would rather like this value _not_ to be changed inside the traverser.
Author
Member

For each call of LocalNodeInfo() we are creating copy of NodeInfo.
If we want to reject using of interface in traverser we need to check here that we need to set the value for local node info, it looks ugly.

For each [call](https://git.frostfs.info/TrueCloudLab/frostfs-node/commit/e17837b51289980653d707cadf9c2d277aee21c0#diff-6fd77bc0dc6ba5fa64f50dad79b0be734983db72) of `LocalNodeInfo()` we are creating copy of `NodeInfo`. If we want to reject using of interface in traverser we need to check [here](https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/8b6ec57c6147e5b784d78bc891144dd55493503d/pkg/services/object/util/placement.go#L125) that we need to set the value for local node info, it looks ugly.
Owner

What do you mean? We have g.cnrSrc.Get() and g.netMapSrc.GetNetMapByEpoch(), I suggest having g.nodeInfoSrc.Get() along the lines.

What do you mean? We have `g.cnrSrc.Get()` and `g.netMapSrc.GetNetMapByEpoch()`, I suggest having `g.nodeInfoSrc.Get()` along the lines.
Author
Member

When we create traverser there is no flag which indicates that we need to set local node info or not, because we set priority metrics via a list of options. I thought we don't need to set it always for each traverser.

When we create `traverser` there is no flag which indicates that we need to set local node info or not, because we set priority metrics via a list of options. I thought we don't need to set it always for each traverser.
fyrchik marked this conversation as resolved
acid-ant force-pushed feat/get-prioritization from cfc3855922 to 6b3886c1a7 2024-10-22 09:22:11 +00:00 Compare
dstepanov-yadro requested changes 2024-10-22 11:09:38 +00:00
Dismissed
@ -324,1 +380,4 @@
}
// WithPriorityMetrics use provided priority metrics to sort nodes.
func WithPriorityMetrics(m []Metric) Option {

There is only usage in tests. Is it ok?

There is only usage in tests. Is it ok?
Author
Member

It is used when GET service initialized.
#1439/files

It is used when GET service initialized. https://git.frostfs.info/TrueCloudLab/frostfs-node/pulls/1439/files#diff-c477b7caf5394d02ccd1310a83d619293d7de526
dstepanov-yadro marked this conversation as resolved
aarifullin approved these changes 2024-10-22 11:24:08 +00:00
Dismissed
dstepanov-yadro approved these changes 2024-10-22 11:43:32 +00:00
Dismissed
fyrchik reviewed 2024-10-22 11:55:35 +00:00
@ -53,3 +63,3 @@
}
const invalidOptsMsg = "invalid traverser options"
const (
Owner

irrelevant change

irrelevant change
Author
Member

Revert it.

Revert it.
fyrchik reviewed 2024-10-22 12:08:56 +00:00
@ -0,0 +32,4 @@
func (am *attributeMetric) CalculateValue(from *netmap.NodeInfo, to *netmap.NodeInfo) int {
fromAttr := from.Attribute(am.attribute)
toAttr := to.Attribute(am.attribute)
if len(fromAttr) > 0 && len(toAttr) > 0 && fromAttr == toAttr {
Owner

Why does an empty pair of attributes have distance 1?
Not that I oppose, but do we have this in the RFC?

Why does an empty pair of attributes have distance 1? Not that I oppose, but do we have this in the RFC?
Author
Member

Yes, in RFC we have this line:

If target storage node and local storage node have the same attribute and the value of this attribute is equal, the value of priority metric is 0, in other cases - 1.
Yes, in RFC we have this line: ``` If target storage node and local storage node have the same attribute and the value of this attribute is equal, the value of priority metric is 0, in other cases - 1. ```
Author
Member

Is it ok that a storage node contains an empty attribute?

Is it ok that a storage node contains an empty attribute?
Owner

It may not contain an attribute at all, that is ok.

It may not contain an attribute at all, that is ok.
fyrchik marked this conversation as resolved
@ -278,0 +330,4 @@
)
require.NoError(t, err)
// Without priority metric `ClusterName` the order will be:
Owner

That's what I was talking about -- we build the test based on the HRW result (after we know it)
What about sth similar that you have in the SDK?

We just need to check that first nodes are ordered by clustername and the rest of them are equal to the traverser result without node state.

Not important, feel free to dismiss.

That's what I was talking about -- we build the test based on the HRW result (after we know it) What about sth similar that you have in the SDK? https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/src/commit/79f387317a1bbd53ae41b97bbfaad3a5fc9c8553/netmap/selector_test.go#L336 We just need to check that first nodes are ordered by clustername and the rest of them are equal to the traverser result without node state. Not important, feel free to dismiss.
Author
Member

Next return slice of structs, which contains only addresses and key. There are no way to check attributes.

`Next` return slice of structs, which contains only addresses and key. There are no way to check attributes.
fyrchik requested changes 2024-10-23 06:28:04 +00:00
Dismissed
@ -41,3 +43,3 @@
}
func testPlacement(ss, rs []int) ([][]netmap.NodeInfo, container.Container) {
func testPlacement(ss []int, rs [][]int) ([][]netmap.NodeInfo, container.Container) {
Owner

Could you explain, what is the reasoning behing this refactoring?
Replica is a number, placement policy contains a list of replicas. I don't see why do we need a 2-dimensional list here.

Could you explain, what is the reasoning behing this refactoring? Replica is a number, placement policy contains a list of replicas. I don't see why do we need a 2-dimensional list here.
Author
Member

That was done for EC container. Reduce amount of changes.

That was done for EC container. Reduce amount of changes.
Owner

I don't like such overloading, it overcomplicates the code for no real reason.
So we have rs (replicas) 2-dimensional slice, but it is really 1-dimensional, but each slice can contain only 1 or 2 elements and if it contains 2, it is suddenly a EC policy, not a replica.
testECPlacement is not that hard to write and common parts can be reused.

I don't like such overloading, it overcomplicates the code for no real reason. So we have `rs` (replicas) 2-dimensional slice, but it is really 1-dimensional, but each slice can contain only 1 or 2 elements and if it contains 2, it is suddenly a EC policy, not a replica. `testECPlacement` is not that hard to write and common parts can be reused.
Author
Member

You are right, changes were reverted yesterday, please review.

You are right, changes were reverted yesterday, please review.
acid-ant force-pushed feat/get-prioritization from 6b3886c1a7 to d2f36e7726 2024-10-23 07:14:53 +00:00 Compare
acid-ant dismissed aarifullin's review 2024-10-23 07:14:53 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

acid-ant dismissed dstepanov-yadro's review 2024-10-23 07:14:53 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

acid-ant force-pushed feat/get-prioritization from d2f36e7726 to 744c3947a7 2024-10-24 06:50:42 +00:00 Compare
acid-ant force-pushed feat/get-prioritization from 744c3947a7 to efe1143ce2 2024-10-28 09:41:03 +00:00 Compare
fyrchik approved these changes 2024-10-28 10:59:44 +00:00
Dismissed
fyrchik approved these changes 2024-10-29 08:05:05 +00:00
fyrchik merged commit 81f4cdbb91 into master 2024-10-29 08:05:10 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
4 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1439
No description provided.