container: Reduce iterations through container list #1577
No reviewers
Labels
No labels
P0
P1
P2
P3
badger
frostfs-adm
frostfs-cli
frostfs-ir
frostfs-lens
frostfs-node
good first issue
triage
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/frostfs-node#1577
Loading…
Reference in a new issue
No description provided.
Delete branch "elebedeva/frostfs-node:feat/container-batching"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Relates to #1453
Close #1452
When listing containers we used to iterate through the the whole list of containers twice: first when reading from a contract, then when sending them. Now we can send batches of containers when reading from the contract.
Signed-off-by: Ekaterina Lebedeva ekaterina.lebedeva@yadro.com
2084a3d7c5
to90ec3d23c6
WIP: container: Reduce iterations through container listto container: Reduce iterations through container list@ -16,3 +18,3 @@
//
// If remote RPC does not support neo-go session API, fallback to List() method.
func (c *Client) ContainersOf(idUser *user.ID) ([]cid.ID, error) {
func (c *Client) ContainersOf(prm *container.ContainersOfClientPrm) ([]cid.ID, error) {
If
prm.Stream != nil
what are the return values for this function?nil, error
ornil, nil
That by itself is a good reason to have a separate function.
The code could be reused (
ContainersOf
can useIterateContainersOf
)Moved iteration over containers from
ContainersOf
to separate function and added new function which can perform container batch processing.@ -25,2 +42,3 @@
var cidList []cid.ID
cb := func(item stackitem.Item) error {
appendCIDtoList := func(item stackitem.Item) error {
Unrelated to the commit.
@ -5,6 +5,8 @@ import (
"fmt"
"git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/client"
container "git.frostfs.info/TrueCloudLab/frostfs-node/pkg/services/container/morph"
This dependency should not be here.
pkg/morph/*
communicates with neo-go,pkg/services/container/*
implements buisness logic.Ok, fixed.
90ec3d23c6
to505e52a55b
505e52a55b
toce7fc30350
ce7fc30350
tod84060a44c
@ -63,0 +105,4 @@
return err
}
if !batchProcessed {
Coould be replaced with
if len(cidList) > 0
?Sure, done
@ -223,0 +226,4 @@
}
return s.rdr.ContainersOfWithProcessing(&id, processCIDList)
}
Woulb be nice to check if ctx is cancelled inside this cb.
Added context check.
d84060a44c
to69f8ccc794
@ -30,6 +30,7 @@ type Reader interface {
// to the specified user of FrostFS system. Returns the identifiers
// of all FrostFS containers if pointer to owner identifier is nil.
ContainersOf(*user.ID) ([]cid.ID, error)
ContainersOfWithProcessing(*user.ID, func([]cid.ID) error) error
WithProcessing
is something unfamiliar,Iterate
is more common.func([]cid.ID) error
looks like unnecessary complexity, why notfunc(cid.ID) error
?The idea was to make it obvious from the method signature that just like
ContainersOf
returnscid list
, new method allows to perform some action on thiscid list
. E.g. if infunc([]cid.ID) error
I would append passedcid list
to my list, I would get the same result as if I usedContainersOf
-[]cid.ID
anderror
.I did as you suggested, code looks more clean now.
69f8ccc794
tof1a1f3c568
f1a1f3c568
to155bad3dbe
@ -220,0 +228,4 @@
default:
}
cidList = append(cidList, id)
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
What should we use as
batch size
here?Or we could send ids one by one but for a large number of containers it seems like a bad idea.
@ -63,0 +105,4 @@
// This method uses `(*Client).TestInvokeIterator()` which can return
// `unwrap.ErrNoSessionID` if the remote neo-go node does not support sessions.
// So providing handler for this type of errors is required.
func (c *Client) iterateContainers(idUser *user.ID, cb func(item stackitem.Item) error, handleSessionError func() error) error {
Using callbacks to handle errors is not something I see often.
Now you have 2 places where list is formed and in
ContainerOf
handleSessionError() is closed over cidList variable, socb
might not be called. The code is hard to understand.Note that
c.list()
already does some parsing, so my suggestion:c.list()
toc.iterate()
and addcb func(...)
argument to it.handleSessionError
argument from here and makecb
always be called.Now, the
cb
in (1) can accept eitherstackitem.Item
orcid.ID
, I believecid.ID
is what it should accept, so that all parsing is in 1 place.Did as you suggested, thanks!
Moved
stackitem.Item
parsing to a separate function because it would be used in 2 places anyway:TestInvokeIterator
acceptscb func(stackitem.Item) error
.@ -220,0 +223,4 @@
var cidList []cid.ID
processCID := func(id cid.ID) error {
select {
case <-stream.Context().Done():
->
->
:)
Fixed, thanks!
155bad3dbe
to2ba4215205
@ -202,3 +203,3 @@
}
func (s *morphExecutor) ListStream(_ context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream) error {
func (s *morphExecutor) ListStream(ctx context.Context, req *container.ListStreamRequest, stream containerSvc.ListStream) error {
ListStream
signature doesn't have any context.How had the context appeared here?
That's because
ContainerServiceClient
requires context andContainerServiceServer
does not.Please see https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/src/branch/master/api/container/grpc/service_grpc.pb.go
2ba4215205
to691ae8bbbb
691ae8bbbb
to76692f019f
@ -58,0 +55,4 @@
func getCIDfromStackItem(item stackitem.Item) (cid.ID, error) {
rawID, err := client.BytesFromStackItem(item)
if err != nil {
return [32]byte{}, fmt.Errorf("get byte array from stack item (%s): %w", listMethod, err)
[32]byte{}
->cid.ID{}
? :)Fixed!
Fixed!
76692f019f
to1609329f53
@ -220,0 +231,4 @@
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
r.GetBody().SetContainerIDs(getRefCIDList(cidList))
cidList = make([]cid.ID, 0, 512)
return stream.Send(r)
Nice. That's what I actually expected from list container streaming 👍
1609329f53
to8e167bc67c
@ -220,0 +228,4 @@
default:
}
cidList = append(cidList, id)
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
This batch size should be different, though.
It is more related to the transport level.
I would use a separate constant here.
Added new local constant.
@ -220,0 +230,4 @@
cidList = append(cidList, id)
if len(cidList) == 512 { // 512 is batch size from pkg/morph/client/container/containers_of.go
r.GetBody().SetContainerIDs(getRefCIDList(cidList))
cidList = make([]cid.ID, 0, 512)
After
getRefCIDList
the oldcidList
is no longer used, so you can just havecidList = cidList[:0]
here.This saves us an allocation, removes magic constant and allows us to reuse memory.
Also, you can have
var cidList []refs.ContainerID
here, and parse each ID separately.This saves us another slice allocation for conversion stage.
You're right, too many unnecessary actions here. Fixed!
8e167bc67c
to64dca5219c
New commits pushed, approval review dismissed automatically according to repository settings
@ -25,3 +39,1 @@
var cidList []cid.ID
cb := func(item stackitem.Item) error {
rawID, err := client.BytesFromStackItem(item)
cnrHash := c.client.ContractAddress()
This line was right before the
TestInvokeIterator
previously, why has it moved?Accidentally. Moved it back.
64dca5219c
toe545cc65b1
e545cc65b1
toc0221d76e6