Sort nodes by Filename in GetSubTree #507

Merged
fyrchik merged 1 commit from dstepanov-yadro/frostfs-node:feat/sort-tree-nodes into master 2023-07-26 21:07:59 +00:00

S3 requires only ascending ordering, so only asc was supported: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html

To support desc need to change traversing way.

Closes #335

S3 requires only ascending ordering, so only asc was supported: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html To support desc need to change traversing way. Closes #335
dstepanov-yadro force-pushed feat/sort-tree-nodes from 812ab4c543 to a3d1e3cda4 2023-07-11 14:23:05 +00:00 Compare
requested reviews from storage-core-committers, storage-core-developers 2023-07-11 14:32:28 +00:00
dstepanov-yadro changed title from WIP: Sort nodes by Filename in GetSubTree to Sort nodes by Filename in GetSubTree 2023-07-11 14:32:33 +00:00
dstepanov-yadro force-pushed feat/sort-tree-nodes from a3d1e3cda4 to 7247dd271e 2023-07-11 14:32:59 +00:00 Compare
requested reviews from alexvanin, dkirillov 2023-07-11 15:47:13 +00:00
fyrchik reviewed 2023-07-12 07:46:06 +00:00
@ -474,6 +475,10 @@ func getSubTree(ctx context.Context, srv TreeService_GetSubTreeServer, cid cidSD
if err != nil {
return err
}
children, err = sortByFilename(ctx, cid, b.GetTreeId(), children, b.GetOrderBy().GetDirection(), forest)
Owner

Why not do it in GetChildren?

Why not do it in `GetChildren`?
Author
Member

Because we have memory and boltdb forest implementation, but ordering will be the same.

Because we have memory and boltdb forest implementation, but ordering will be the same.
Owner

So we can sort in both bolt and memory implementation, right?

So we can sort in both bolt and memory implementation, right?
Author
Member

Now get children returns nodeID, parentID and meta, so i removed TreeGetMeta call from service.

Now `get children` returns nodeID, parentID and meta, so i removed `TreeGetMeta` call from service.
dstepanov-yadro force-pushed feat/sort-tree-nodes from 7247dd271e to e258202bd4 2023-07-12 09:12:49 +00:00 Compare
dkirillov reviewed 2023-07-12 11:28:00 +00:00
@ -485,0 +508,4 @@
if filenameBytes, found := filenames[rhs]; found {
rightFilename = string(filenameBytes)
}
return leftFilename < rightFilename
Member

Can we compare using bytes.Compare?

Can we compare using `bytes.Compare`?
Author
Member

looks so, thx. fixed.

looks so, thx. fixed.
dkirillov marked this conversation as resolved
dstepanov-yadro force-pushed feat/sort-tree-nodes from e258202bd4 to 8ae09e69c6 2023-07-12 11:56:37 +00:00 Compare
dstepanov-yadro force-pushed feat/sort-tree-nodes from 8ae09e69c6 to 60af2616d3 2023-07-12 12:09:47 +00:00 Compare
dstepanov-yadro force-pushed feat/sort-tree-nodes from 60af2616d3 to 335641a364 2023-07-12 12:11:32 +00:00 Compare
dkirillov reviewed 2023-07-13 06:25:17 +00:00
@ -174,0 +191,4 @@
break
}
if !errors.Is(err, pilorama.ErrTreeNotFound) {
e.reportShardError(sh, "can't perform `TreeGetMeta`", err,
Member

Probably the error message should be can't perform 'TreeGetAttributeValues'

Probably the error message should be `can't perform 'TreeGetAttributeValues'`
Author
Member

method deleted

method deleted
dkirillov marked this conversation as resolved
dstepanov-yadro force-pushed feat/sort-tree-nodes from 335641a364 to 102a9df67e 2023-07-13 08:54:07 +00:00 Compare
dstepanov-yadro reviewed 2023-07-13 08:55:59 +00:00
@ -69,2 +69,4 @@
// do performs a single move operation on a tree.
func (s *memoryTree) do(op *Move) move {
m := op.Meta
if m.Items == nil {
Author
Member

To have the same behaviour as bolt forest.

To have the same behaviour as bolt forest.
dstepanov-yadro force-pushed feat/sort-tree-nodes from 102a9df67e to 2cc6976829 2023-07-17 14:54:22 +00:00 Compare
dstepanov-yadro force-pushed feat/sort-tree-nodes from 2cc6976829 to 8e92e46463 2023-07-19 07:06:53 +00:00 Compare
ale64bit reviewed 2023-07-19 08:16:50 +00:00
@ -970,2 +971,4 @@
children = append(children, binary.LittleEndian.Uint64(k[9:]))
}
for _, childID := range children {
Member

children is not needed outside the transaction now. You can probably merge this loop with the previous one and get rid of children.

`children` is not needed outside the transaction now. You can probably merge this loop with the previous one and get rid of `children`.
Author
Member

fixed

fixed
ale64bit marked this conversation as resolved
@ -156,2 +156,2 @@
testGetChildren(t, 0, []uint64{10, 2, 7})
testGetChildren(t, 10, []uint64{3, 6})
testGetChildren(t, 0, []NodeInfo{
{NodeID: 10, Meta: Meta{Time: 1, Items: []KeyValue{}}},
Member

do we actually need to specify these empty slices? If yes, it might be worth changing the handling code instead so that it doesn't differentiate between empty or nil slice.

do we actually need to specify these empty slices? If yes, it might be worth changing the handling code instead so that it doesn't differentiate between empty or nil slice.
Author
Member

Well, nil and empty slice are different values actually. When writing tests, a bug was found (and fixed): #507 (comment)
So we expect now that it will be empty slices if not metadata exists.

Well, nil and empty slice are different values actually. When writing tests, a bug was found (and fixed): https://git.frostfs.info/TrueCloudLab/frostfs-node/pulls/507#issuecomment-15709 So we expect now that it will be empty slices if not metadata exists.
Member

Normally this is a bit of an anti-pattern: https://google.github.io/styleguide/go/decisions#nil-slices. e.g. in this case, when no metadata exists, there's no real difference between empty slice or nil slice. If the entity containing the metadata doesn't exist in the first place, that can be handled with an error.

Normally this is a bit of an anti-pattern: https://google.github.io/styleguide/go/decisions#nil-slices. e.g. in this case, when no metadata exists, there's no real difference between empty slice or nil slice. If the entity containing the metadata doesn't exist in the first place, that can be handled with an `error`.
@ -57,1 +57,4 @@
}
type NodeInfo struct {
NodeID Node
Member

stutters a bit. Just ID?

stutters a bit. Just `ID`?
Author
Member

fixed

fixed
ale64bit marked this conversation as resolved
@ -122,0 +152,4 @@
acc := subTreeAcc{errIndex: -1}
err := getSubTree(context.Background(), &acc, d.CID, &GetSubTreeRequest_Body{
TreeId: treeID,
RootId: 0,
Member

no need to specify zero values (RootId and Depth)

no need to specify zero values (`RootId` and `Depth`)
Author
Member

fixed

fixed
ale64bit marked this conversation as resolved
@ -485,0 +493,4 @@
return nodes
}
less := func(i, j int) bool {
Member

mm, this might break if we add more sorting criteria later (i.e. more values of Order_Direction). Maybe add an explicit switch?

mm, this might break if we add more sorting criteria later (i.e. more values of `Order_Direction`). Maybe add an explicit `switch`?
Author
Member

fixed

fixed
ale64bit marked this conversation as resolved
dstepanov-yadro force-pushed feat/sort-tree-nodes from 8e92e46463 to 613ab8d90b 2023-07-19 08:51:53 +00:00 Compare
ale64bit approved these changes 2023-07-19 08:58:13 +00:00
fyrchik approved these changes 2023-07-19 11:43:22 +00:00
@ -971,0 +971,4 @@
childInfo := NodeInfo{
ID: childID,
}
parentID, _, metaBytes, found := t.getState(b, stateKey(key, childID))
Owner

Hm, so we cannot possibly do any paging there if we wanted to? (receive the first 100 items from the db). Do attributes stored by internalKey help here?

Hm, so we cannot possibly do any paging there if we wanted to? (receive the first 100 items from the db). Do attributes stored by `internalKey` help here?
Author
Member

internalKey has definition // 'i' + attribute name (string) + attribute value (string) + parent (id) + node (id) -> 0/1.
To seek by this index we need to have attribute value.

`internalKey` has definition `// 'i' + attribute name (string) + attribute value (string) + parent (id) + node (id) -> 0/1.` To seek by this index we need to have attribute value.
fyrchik marked this conversation as resolved
@ -971,0 +972,4 @@
ID: childID,
}
parentID, _, metaBytes, found := t.getState(b, stateKey(key, childID))
if found {
Owner

When can this found be false? Isn't it an error?

When can this `found` be false? Isn't it an error?
Author
Member

Perhaps. But in the previous implementation, this was not considered an error.

Perhaps. But in the previous implementation, this was not considered an error.
fyrchik marked this conversation as resolved
dstepanov-yadro force-pushed feat/sort-tree-nodes from 613ab8d90b to b1342f1ce2 2023-07-19 15:39:27 +00:00 Compare
aarifullin reviewed 2023-07-19 20:16:39 +00:00
@ -483,2 +492,4 @@
}
func sortByFilename(nodes []pilorama.NodeInfo, d GetSubTreeRequest_Body_Order_Direction) ([]pilorama.NodeInfo, error) {
if len(nodes) == 0 {
Member

[Optional]

It's really not big deal but from my point view this would be better like that:

func sortByFilename(nodes []pilorama.NodeInfo, d GetSubTreeRequest_Body_Order_Direction) ([]pilorama.NodeInfo, error) {
    switch d {
	case GetSubTreeRequest_Body_Order_None:
		return nodes, nil
	case GetSubTreeRequest_Body_Order_Asc:
		less := func(i, j int) bool {
			return bytes.Compare(nodes[i].Meta.GetAttr(pilorama.AttributeFilename), nodes[j].Meta.GetAttr(pilorama.AttributeFilename)) < 0
		}
		sort.Slice(nodes, less)
		return nodes, nil
	default:
        if len(nodes) > 0 {
		   return nil, fmt.Errorf("unsupported order direction: %s", d.String())
        }
        return nodes, nil
	}
}
[Optional] It's really not big deal but from my point view this would be better like that: ```golang func sortByFilename(nodes []pilorama.NodeInfo, d GetSubTreeRequest_Body_Order_Direction) ([]pilorama.NodeInfo, error) { switch d { case GetSubTreeRequest_Body_Order_None: return nodes, nil case GetSubTreeRequest_Body_Order_Asc: less := func(i, j int) bool { return bytes.Compare(nodes[i].Meta.GetAttr(pilorama.AttributeFilename), nodes[j].Meta.GetAttr(pilorama.AttributeFilename)) < 0 } sort.Slice(nodes, less) return nodes, nil default: if len(nodes) > 0 { return nil, fmt.Errorf("unsupported order direction: %s", d.String()) } return nodes, nil } } ```
Member

isn't this swallowing the error about invalid order direction when there are no nodes?

isn't this swallowing the error about invalid order direction when there are no nodes?
Author
Member

you are right, fixed

you are right, fixed
aarifullin approved these changes 2023-07-19 20:21:53 +00:00
dstepanov-yadro force-pushed feat/sort-tree-nodes from b1342f1ce2 to eed594431f 2023-07-20 07:15:56 +00:00 Compare
fyrchik approved these changes 2023-07-20 08:33:59 +00:00
fyrchik merged commit eed594431f into master 2023-07-20 08:34:08 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#507
No description provided.