container: Reduce iterations through container list #1577

Merged
fyrchik merged 3 commits from elebedeva/frostfs-node:feat/container-batching into master 2024-12-28 10:05:35 +00:00
Member

Relates to #1453
Close #1452

When listing containers we used to iterate through the the whole list of containers twice: first when reading from a contract, then when sending them. Now we can send batches of containers when reading from the contract.

Signed-off-by: Ekaterina Lebedeva ekaterina.lebedeva@yadro.com

Relates to #1453 Close #1452 When listing containers we used to iterate through the the whole list of containers twice: first when reading from a contract, then when sending them. Now we can send batches of containers when reading from the contract. Signed-off-by: Ekaterina Lebedeva ekaterina.lebedeva@yadro.com
elebedeva added 2 commits 2024-12-20 18:59:28 +00:00
add prm to to pass stream and resp
Some checks failed
DCO action / DCO (pull_request) Failing after 3m37s
Tests and linters / Run gofumpt (pull_request) Failing after 3m31s
Vulncheck / Vulncheck (pull_request) Successful in 3m43s
Pre-commit hooks / Pre-commit (pull_request) Successful in 4m30s
Build / Build Components (pull_request) Successful in 4m43s
Tests and linters / Lint (pull_request) Failing after 4m54s
Tests and linters / Staticcheck (pull_request) Successful in 4m54s
Tests and linters / gopls check (pull_request) Successful in 5m9s
Tests and linters / Tests (pull_request) Successful in 5m33s
Tests and linters / Tests with -race (pull_request) Successful in 5m36s
2084a3d7c5
elebedeva force-pushed feat/container-batching from 2084a3d7c5 to 90ec3d23c6 2024-12-23 14:41:48 +00:00 Compare
elebedeva changed title from WIP: container: Reduce iterations through container list to container: Reduce iterations through container list 2024-12-23 14:55:08 +00:00
elebedeva requested review from storage-core-committers 2024-12-23 14:55:09 +00:00
elebedeva requested review from storage-core-developers 2024-12-23 14:55:09 +00:00
fyrchik requested changes 2024-12-24 07:13:57 +00:00
Dismissed
@ -16,3 +18,3 @@
//
// If remote RPC does not support neo-go session API, fallback to List() method.
func (c *Client) ContainersOf(idUser *user.ID) ([]cid.ID, error) {
func (c *Client) ContainersOf(prm *container.ContainersOfClientPrm) ([]cid.ID, error) {
Owner

If prm.Stream != nil what are the return values for this function?

If `prm.Stream != nil` what are the return values for this function?
Author
Member

nil, error or nil, nil

`nil, error` or `nil, nil`
Owner

That by itself is a good reason to have a separate function.
The code could be reused (ContainersOf can use IterateContainersOf)

That by itself is a good reason to have a separate function. The code could be reused (`ContainersOf` can use `IterateContainersOf`)
Author
Member

Moved iteration over containers from ContainersOf to separate function and added new function which can perform container batch processing.

Moved iteration over containers from `ContainersOf` to separate function and added new function which can perform container batch processing.
fyrchik marked this conversation as resolved
@ -25,2 +42,3 @@
var cidList []cid.ID
cb := func(item stackitem.Item) error {
appendCIDtoList := func(item stackitem.Item) error {
Owner

Unrelated to the commit.

Unrelated to the commit.
fyrchik marked this conversation as resolved
fyrchik reviewed 2024-12-24 12:19:05 +00:00
@ -5,6 +5,8 @@ import (
"fmt"
"git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/client"
container "git.frostfs.info/TrueCloudLab/frostfs-node/pkg/services/container/morph"
Owner

This dependency should not be here.
pkg/morph/* communicates with neo-go, pkg/services/container/* implements buisness logic.

This dependency should not be here. `pkg/morph/*` communicates with neo-go, `pkg/services/container/*` implements buisness logic.
Author
Member

Ok, fixed.

Ok, fixed.
fyrchik marked this conversation as resolved
elebedeva force-pushed feat/container-batching from 90ec3d23c6 to 505e52a55b 2024-12-25 11:29:13 +00:00 Compare
elebedeva force-pushed feat/container-batching from 505e52a55b to ce7fc30350 2024-12-25 14:46:15 +00:00 Compare
elebedeva force-pushed feat/container-batching from ce7fc30350 to d84060a44c 2024-12-25 15:32:13 +00:00 Compare
dstepanov-yadro reviewed 2024-12-25 15:32:29 +00:00
@ -63,0 +105,4 @@
return err
}
if !batchProcessed {

Coould be replaced with if len(cidList) > 0 ?

Coould be replaced with `if len(cidList) > 0` ?
Author
Member

Sure, done

Sure, done
dstepanov-yadro marked this conversation as resolved
dstepanov-yadro reviewed 2024-12-25 15:33:38 +00:00
@ -223,0 +226,4 @@
}
return s.rdr.ContainersOfWithProcessing(&id, processCIDList)
}

Woulb be nice to check if ctx is cancelled inside this cb.

Woulb be nice to check if ctx is cancelled inside this cb.
Author
Member

Added context check.

Added context check.
fyrchik marked this conversation as resolved
elebedeva force-pushed feat/container-batching from d84060a44c to 69f8ccc794 2024-12-25 15:35:47 +00:00 Compare
fyrchik reviewed 2024-12-25 18:20:03 +00:00
@ -30,6 +30,7 @@ type Reader interface {
// to the specified user of FrostFS system. Returns the identifiers
// of all FrostFS containers if pointer to owner identifier is nil.
ContainersOf(*user.ID) ([]cid.ID, error)
ContainersOfWithProcessing(*user.ID, func([]cid.ID) error) error
Owner
  1. WithProcessing is something unfamiliar, Iterate is more common.
  2. func([]cid.ID) error looks like unnecessary complexity, why not func(cid.ID) error?
1. `WithProcessing` is something unfamiliar, `Iterate` is more common. 2. `func([]cid.ID) error` looks like unnecessary complexity, why not `func(cid.ID) error`?
Author
Member

The idea was to make it obvious from the method signature that just like ContainersOf returns cid list, new method allows to perform some action on this cid list. E.g. if in func([]cid.ID) error I would append passed cid list to my list, I would get the same result as if I used ContainersOf - []cid.ID and error.

The idea was to make it obvious from the method signature that just like `ContainersOf` returns `cid list`, new method allows to perform some action on this `cid list`. E.g. if in `func([]cid.ID) error` I would append passed `cid list` to my list, I would get the same result as if I used `ContainersOf` - `[]cid.ID` and `error`.
Author
Member

I did as you suggested, code looks more clean now.

I did as you suggested, code looks more clean now.
fyrchik marked this conversation as resolved
elebedeva force-pushed feat/container-batching from 69f8ccc794 to f1a1f3c568 2024-12-26 22:44:20 +00:00 Compare
elebedeva force-pushed feat/container-batching from f1a1f3c568 to 155bad3dbe 2024-12-26 22:50:25 +00:00 Compare
elebedeva reviewed 2024-12-26 22:55:58 +00:00
@ -220,0 +228,4 @@
default:
}
cidList = append(cidList, id)
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
Author
Member

What should we use as batch size here?
Or we could send ids one by one but for a large number of containers it seems like a bad idea.

What should we use as `batch size` here? Or we could send ids one by one but for a large number of containers it seems like a bad idea.
elebedeva marked this conversation as resolved
fyrchik requested changes 2024-12-27 05:40:13 +00:00
Dismissed
@ -63,0 +105,4 @@
// This method uses `(*Client).TestInvokeIterator()` which can return
// `unwrap.ErrNoSessionID` if the remote neo-go node does not support sessions.
// So providing handler for this type of errors is required.
func (c *Client) iterateContainers(idUser *user.ID, cb func(item stackitem.Item) error, handleSessionError func() error) error {
Owner

Using callbacks to handle errors is not something I see often.
Now you have 2 places where list is formed and in ContainerOf handleSessionError() is closed over cidList variable, so cb might not be called. The code is hard to understand.

Note that c.list() already does some parsing, so my suggestion:

  1. rename c.list() to c.iterate() and add cb func(...) argument to it.
  2. remove handleSessionError argument from here and make cb always be called.

Now, the cb in (1) can accept either stackitem.Item or cid.ID, I believe cid.ID is what it should accept, so that all parsing is in 1 place.

Using callbacks to handle errors is not something I see often. Now you have 2 places where list is formed and in `ContainerOf` handleSessionError() is closed over cidList variable, so `cb` might not be called. The code is hard to understand. Note that `c.list()` already does some parsing, so my suggestion: 1. rename `c.list()` to `c.iterate()` and add `cb func(...)` argument to it. 2. remove `handleSessionError` argument from here and make `cb` always be called. Now, the `cb` in (1) can accept either `stackitem.Item` or `cid.ID`, I believe `cid.ID` is what it should accept, so that all parsing is in 1 place.
Author
Member

Did as you suggested, thanks!

Moved stackitem.Item parsing to a separate function because it would be used in 2 places anyway: TestInvokeIterator accepts cb func(stackitem.Item) error.

Did as you suggested, thanks! Moved `stackitem.Item` parsing to a separate function because it would be used in 2 places anyway: `TestInvokeIterator` accepts `cb func(stackitem.Item) error`.
fyrchik marked this conversation as resolved
aarifullin reviewed 2024-12-27 07:05:39 +00:00
@ -220,0 +223,4 @@
var cidList []cid.ID
processCID := func(id cid.ID) error {
select {
case <-stream.Context().Done():
Member
ListStream(_ context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream)

->

ListStream(ctx context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream)

->

case <-ctx.Done()

:)

```go ListStream(_ context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream) ``` -> ```go ListStream(ctx context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream) ``` -> ```go case <-ctx.Done() ``` :)
Author
Member

Fixed, thanks!

Fixed, thanks!
aarifullin marked this conversation as resolved
elebedeva force-pushed feat/container-batching from 155bad3dbe to 2ba4215205 2024-12-27 08:09:52 +00:00 Compare
fyrchik reviewed 2024-12-27 08:36:39 +00:00
@ -202,3 +203,3 @@
}
func (s *morphExecutor) ListStream(_ context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream) error {
func (s *morphExecutor) ListStream(ctx context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream) error {
Owner

ListStream signature doesn't have any context.

func (s *signService) ListStream(req *container.ListStreamRequest, stream ListStream) error {
func (ac *apeChecker) ListStream(req *container.ListStreamRequest, stream ListStream) error {

How had the context appeared here?

`ListStream` signature doesn't have any context. ``` func (s *signService) ListStream(req *container.ListStreamRequest, stream ListStream) error { func (ac *apeChecker) ListStream(req *container.ListStreamRequest, stream ListStream) error { ``` How had the context appeared here?
Author
Member

That's because ContainerServiceClient requires context and ContainerServiceServer does not.
Please see https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/src/branch/master/api/container/grpc/service_grpc.pb.go

type ContainerServiceClient interface {
  // ...
  	ListStream(ctx context.Context, in *ListStreamRequest, opts ...grpc.CallOption) (ContainerService_ListStreamClient, error)
}
type ContainerServiceServer interface {
  // ...
  	ListStream(*ListStreamRequest, ContainerService_ListStreamServer) error
}
That's because `ContainerServiceClient` requires context and `ContainerServiceServer` does not. Please see https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/src/branch/master/api/container/grpc/service_grpc.pb.go ```go type ContainerServiceClient interface { // ... ListStream(ctx context.Context, in *ListStreamRequest, opts ...grpc.CallOption) (ContainerService_ListStreamClient, error) } ``` ```go type ContainerServiceServer interface { // ... ListStream(*ListStreamRequest, ContainerService_ListStreamServer) error } ```
fyrchik marked this conversation as resolved
elebedeva force-pushed feat/container-batching from 2ba4215205 to 691ae8bbbb 2024-12-27 10:56:33 +00:00 Compare
elebedeva force-pushed feat/container-batching from 691ae8bbbb to 76692f019f 2024-12-27 11:03:29 +00:00 Compare
aarifullin reviewed 2024-12-27 11:07:54 +00:00
@ -58,0 +55,4 @@
func getCIDfromStackItem(item stackitem.Item) (cid.ID, error) {
rawID, err := client.BytesFromStackItem(item)
if err != nil {
return [32]byte{}, fmt.Errorf("get byte array from stack item (%s): %w", listMethod, err)
Member

[32]byte{} -> cid.ID{}? :)

`[32]byte{}` -> `cid.ID{}`? :)
Author
Member

Fixed!

Fixed!
Author
Member

Fixed!

Fixed!
aarifullin marked this conversation as resolved
elebedeva force-pushed feat/container-batching from 76692f019f to 1609329f53 2024-12-27 11:10:06 +00:00 Compare
aarifullin reviewed 2024-12-27 11:10:19 +00:00
@ -220,0 +231,4 @@
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
r.GetBody().SetContainerIDs(getRefCIDList(cidList))
cidList = make([]cid.ID, 0, 512)
return stream.Send(r)
Member

Nice. That's what I actually expected from list container streaming 👍

Nice. That's what I actually expected from list container streaming 👍
elebedeva force-pushed feat/container-batching from 1609329f53 to 8e167bc67c 2024-12-27 11:13:03 +00:00 Compare
aarifullin approved these changes 2024-12-27 11:15:06 +00:00
Dismissed
fyrchik requested changes 2024-12-27 12:08:46 +00:00
Dismissed
@ -220,0 +228,4 @@
default:
}
cidList = append(cidList, id)
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
Owner

This batch size should be different, though.
It is more related to the transport level.
I would use a separate constant here.

This batch size should be different, though. It is more related to the transport level. I would use a separate constant here.
Author
Member

Added new local constant.

Added new local constant.
fyrchik marked this conversation as resolved
@ -220,0 +230,4 @@
cidList = append(cidList, id)
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
r.GetBody().SetContainerIDs(getRefCIDList(cidList))
cidList = make([]cid.ID, 0, 512)
Owner

After getRefCIDList the old cidList is no longer used, so you can just have cidList = cidList[:0] here.
This saves us an allocation, removes magic constant and allows us to reuse memory.

After `getRefCIDList` the old `cidList` is no longer used, so you can just have `cidList = cidList[:0]` here. This saves us an allocation, removes magic constant and allows us to reuse memory.
Owner

Also, you can have var cidList []refs.ContainerID here, and parse each ID separately.
This saves us another slice allocation for conversion stage.

Also, you can have `var cidList []refs.ContainerID` here, and parse each ID separately. This saves us another slice allocation for conversion stage.
Author
Member

You're right, too many unnecessary actions here. Fixed!

You're right, too many unnecessary actions here. Fixed!
fyrchik marked this conversation as resolved
elebedeva force-pushed feat/container-batching from 8e167bc67c to 64dca5219c 2024-12-27 12:32:04 +00:00 Compare
elebedeva dismissed aarifullin's review 2024-12-27 12:32:09 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

fyrchik reviewed 2024-12-28 05:43:36 +00:00
@ -25,3 +39,1 @@
var cidList []cid.ID
cb := func(item stackitem.Item) error {
rawID, err := client.BytesFromStackItem(item)
cnrHash := c.client.ContractAddress()
Owner

This line was right before the TestInvokeIterator previously, why has it moved?

This line was right before the `TestInvokeIterator` previously, why has it moved?
Author
Member

Accidentally. Moved it back.

Accidentally. Moved it back.
elebedeva force-pushed feat/container-batching from 64dca5219c to e545cc65b1 2024-12-28 09:01:53 +00:00 Compare
elebedeva force-pushed feat/container-batching from e545cc65b1 to c0221d76e6 2024-12-28 09:05:10 +00:00 Compare
elebedeva requested review from storage-core-committers 2024-12-28 09:16:43 +00:00
elebedeva requested review from storage-core-developers 2024-12-28 09:16:44 +00:00
elebedeva removed review request for storage-core-committers 2024-12-28 09:16:56 +00:00
aarifullin approved these changes 2024-12-28 09:38:14 +00:00
fyrchik approved these changes 2024-12-28 10:05:29 +00:00
fyrchik merged commit c0221d76e6 into master 2024-12-28 10:05:35 +00:00
fyrchik referenced this pull request from a commit 2024-12-28 10:05:37 +00:00
fyrchik referenced this pull request from a commit 2024-12-28 10:05:37 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
4 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1577
No description provided.