object: Sort nodes by priority metrics to compute GET requests #1439
No reviewers
Labels
No labels
P0
P1
P2
P3
badger
frostfs-adm
frostfs-cli
frostfs-ir
frostfs-lens
frostfs-node
good first issue
triage
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/frostfs-node#1439
Loading…
Reference in a new issue
No description provided.
Delete branch "acid-ant/frostfs-node:feat/get-prioritization"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com
64bbae5369
to78e4d71cc0
78e4d71cc0
to1de4a7d5fe
WIP: object: Sort nodes by priority metrics to compute GET/SEARCH requeststo object: Sort nodes by priority metrics to compute GET/SEARCH requests1de4a7d5fe
to8c4df77878
Remove usage of priority metrics from SEARCH service.
@ -0,0 +12,4 @@
)
type Metric interface {
CalculateValue(*netmap.NodeInfo, *netmap.NodeInfo) []byte
What does the resulting type represent?
I expect metric to be a number (int or float)
The idea was to use already existed api for sort instead of writing the custom for comparing
[]int/float
lexicographically. ForgeoDistance
metric, just convert distance touint32
and then to[]byte
. The same is for any other metrics with one thing in mind - need to compare the resulting[]byte
lexicographically.It is not that hard to implement. And we have
slices.Compare
.Thanks, fixed.
@ -0,0 +22,4 @@
return errors.New("unsupported priority metric")
}
func ParseMetric(raw string) Metric {
Honestly, I don't like separating
Parse*
andValidate*
. Parsing performs validation automatically.It may bite us in unexpected places, like silent errors on SIGHUP. And now we need to be sure we always validate.
Can we stay with
ParseMetric() (Metric, error)
?That was intentionally, because we don't have an ability to exit gracefully when reading config - only panic is possible.
I don't see how this is related.
If we currently panic on invalid config -- be it.
SIGHUP doesn't panic anyway.
Updated.
@ -0,0 +46,4 @@
}
func NewAttributeMetric(raw string) Metric {
attr, _ := strings.CutPrefix(raw, attrPrefix)
It seems you
CutPrefix
already inParseMetric
. ItCutPrefix
here intentional?Thanks, fixed.
@ -55,1 +68,3 @@
const invalidOptsMsg = "invalid traverser options"
const (
invalidOptsMsg = "invalid traverser options"
uint16Bytes = 2
uint16Size
?c1d9303d82/src/math/bits/bits.go (L20)
No more relevant.
@ -160,0 +208,4 @@
for _, m := range cfg.metrics {
metrics[i] = append(metrics[i], m.CalculateValue(&node, &unsortedVector[i])...)
}
binary.LittleEndian.PutUint16(b, uint16(i))
There is
binary.LittleEndian.AppendUint16
Updated.
@ -160,0 +209,4 @@
metrics[i] = append(metrics[i], m.CalculateValue(&node, &unsortedVector[i])...)
}
binary.LittleEndian.PutUint16(b, uint16(i))
metrics[i] = append(metrics[i], b...)
Why do we need to append an index?
We need to store somewhere the relation between metric value and node. It is impossible to use slice as a key for map.
Can we use a struct with 2 fields then?
Don't see how
[]byte
with overloaded meaning is better.Updated, please review.
@ -277,1 +280,4 @@
}
type nodeState struct {
node *netmapAPI.NodeInfo
We try to avoid explicit API imports.
Can we use the one from sdk?
Ok, but for that I need to refactor a bit config and netmap service.
Refactored.
@ -278,0 +324,4 @@
next := tr.Next()
require.NotNil(t, next)
require.Equal(t, 3, len(next))
require.Equal(t, "/ip4/0.0.0.0/tcp/2", string(next[0].PublicKey()))
This
2, 0, 1
seems completely random. What is the reason for this exact sequence?Added comments.
@ -160,0 +209,4 @@
metrics[i] = append(metrics[i], m.CalculateValue(&node, &unsortedVector[i])...)
}
binary.LittleEndian.PutUint16(b, uint16(i))
metrics[i] = append(metrics[i], b...)
Could we consider to use
instead of using the suffix - honestly, I spent some time to understand these steps
Yes, I'll rewrite this part to be more readable and simpler, thanks.
Updated.
@ -103,0 +123,4 @@
unsortedVector = append(unsortedVector, ns[i][:rem[i]]...)
regularVector = append(regularVector, ns[i][rem[i]:]...)
}
rem = make([]int, 2)
I believe compile-time defined
[]int{-1, -1}
must be better :) The same is fair for the lines belowFixed.
@ -101,2 +117,3 @@
var rem []int
if cfg.flatSuccess != nil {
if len(cfg.metrics) > 0 {
rem = defaultCopiesVector(cfg.policy)
Have you checked this behaves properly with EC?
Thanks, added test for EC container. Works as expected.
8c4df77878
to6f1b5ca613
6f1b5ca613
tof0d0050634
object: Sort nodes by priority metrics to compute GET/SEARCH requeststo WIP: object: Sort nodes by priority metrics to compute GET/SEARCH requestsWIP: object: Sort nodes by priority metrics to compute GET/SEARCH requeststo WIP: object: Sort nodes by priority metrics to compute GET requestsf0d0050634
to2f62a05f6a
Updated sort function.
WIP: object: Sort nodes by priority metrics to compute GET requeststo object: Sort nodes by priority metrics to compute GET requests2f62a05f6a
todf784dd93d
df784dd93d
tocfc3855922
@ -278,0 +316,4 @@
}),
WithoutSuccessTracking(),
WithPriorityMetrics(m),
WithNodeState(&nodeState{
You provide this interface each time the traverser is created.
Why have you decided to use an interface instead of directly providing local node info?
Because we are changing this value at each epoch. Also, that will be helpful when we have an ability to change the list of the storage node attributes in runtime.
New traverser is created for each request, so we would rather like this value not to be changed inside the traverser.
For each call of
LocalNodeInfo()
we are creating copy ofNodeInfo
.If we want to reject using of interface in traverser we need to check here that we need to set the value for local node info, it looks ugly.
What do you mean? We have
g.cnrSrc.Get()
andg.netMapSrc.GetNetMapByEpoch()
, I suggest havingg.nodeInfoSrc.Get()
along the lines.When we create
traverser
there is no flag which indicates that we need to set local node info or not, because we set priority metrics via a list of options. I thought we don't need to set it always for each traverser.cfc3855922
to6b3886c1a7
@ -324,1 +380,4 @@
}
// WithPriorityMetrics use provided priority metrics to sort nodes.
func WithPriorityMetrics(m []Metric) Option {
There is only usage in tests. Is it ok?
It is used when GET service initialized.
#1439/files
@ -53,3 +63,3 @@
}
const invalidOptsMsg = "invalid traverser options"
const (
irrelevant change
Revert it.
@ -0,0 +32,4 @@
func (am *attributeMetric) CalculateValue(from *netmap.NodeInfo, to *netmap.NodeInfo) int {
fromAttr := from.Attribute(am.attribute)
toAttr := to.Attribute(am.attribute)
if len(fromAttr) > 0 && len(toAttr) > 0 && fromAttr == toAttr {
Why does an empty pair of attributes have distance 1?
Not that I oppose, but do we have this in the RFC?
Yes, in RFC we have this line:
Is it ok that a storage node contains an empty attribute?
It may not contain an attribute at all, that is ok.
@ -278,0 +330,4 @@
)
require.NoError(t, err)
// Without priority metric `ClusterName` the order will be:
That's what I was talking about -- we build the test based on the HRW result (after we know it)
What about sth similar that you have in the SDK?
require.Equal(t, 2, len(n[0]))
We just need to check that first nodes are ordered by clustername and the rest of them are equal to the traverser result without node state.
Not important, feel free to dismiss.
Next
return slice of structs, which contains only addresses and key. There are no way to check attributes.@ -41,3 +43,3 @@
}
func testPlacement(ss, rs []int) ([][]netmap.NodeInfo, container.Container) {
func testPlacement(ss []int, rs [][]int) ([][]netmap.NodeInfo, container.Container) {
Could you explain, what is the reasoning behing this refactoring?
Replica is a number, placement policy contains a list of replicas. I don't see why do we need a 2-dimensional list here.
That was done for EC container. Reduce amount of changes.
I don't like such overloading, it overcomplicates the code for no real reason.
So we have
rs
(replicas) 2-dimensional slice, but it is really 1-dimensional, but each slice can contain only 1 or 2 elements and if it contains 2, it is suddenly a EC policy, not a replica.testECPlacement
is not that hard to write and common parts can be reused.You are right, changes were reverted yesterday, please review.
6b3886c1a7
tod2f36e7726
New commits pushed, approval review dismissed automatically according to repository settings
New commits pushed, approval review dismissed automatically according to repository settings
d2f36e7726
to744c3947a7
744c3947a7
toefe1143ce2