[#469] List multipart uploads streaming #527

Open
nzinkevich wants to merge 1 commit from nzinkevich/frostfs-s3-gw:multiparts_list_streaming into master
Member

Signed-off-by: Nikita Zinkevich n.zinkevich@yadro.com

Signed-off-by: Nikita Zinkevich <n.zinkevich@yadro.com>
nzinkevich self-assigned this 2024-10-30 07:07:07 +00:00
nzinkevich added 1 commit 2024-10-30 07:07:08 +00:00
[#469] List multipart uploads streaming
Some checks failed
/ DCO (pull_request) Successful in 1m21s
/ Vulncheck (pull_request) Successful in 1m44s
/ Lint (pull_request) Failing after 1m35s
/ Tests (pull_request) Failing after 1m24s
/ Builds (pull_request) Successful in 1m18s
7a5333fe35
Signed-off-by: Nikita Zinkevich <n.zinkevich@yadro.com>
nzinkevich requested review from storage-services-committers 2024-10-30 07:08:28 +00:00
nzinkevich requested review from storage-services-developers 2024-10-30 07:08:31 +00:00
nzinkevich force-pushed multiparts_list_streaming from 7a5333fe35 to 1cb1299bb1 2024-10-30 13:56:58 +00:00 Compare
nzinkevich changed title from WIP: [#469] List multipart uploads streaming to [#469] List multipart uploads streaming 2024-10-30 13:57:06 +00:00
nzinkevich force-pushed multiparts_list_streaming from 1cb1299bb1 to 6970f52d2d 2024-11-01 06:44:33 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from 6970f52d2d to c726e2f058 2024-11-01 06:59:37 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from c726e2f058 to e673d727e9 2024-11-02 07:14:28 +00:00 Compare
dkirillov reviewed 2024-11-08 14:30:47 +00:00
@ -20,0 +23,4 @@
}
type MultipartInfoStream interface {
Next(marker, uploadID string) (*MultipartInfo, error)
Member

Why do we pass marker and uploadID?

Why do we pass `marker` and `uploadID`?
dkirillov marked this conversation as resolved
@ -515,0 +516,4 @@
isTruncated := true
var last data.MultipartInfo
var info *data.MultipartInfo
for uploadsCount <= p.MaxUploads {
Member

Why don't just for uploadsCount < p.MaxUploads { ?

Why don't just `for uploadsCount < p.MaxUploads {` ?
Author
Member

I need to check next element to determine Truncated state. Extracted this check outside the loop

I need to check next element to determine Truncated state. Extracted this check outside the loop
dkirillov marked this conversation as resolved
@ -515,0 +531,4 @@
last = *info
}
upload := uploadInfoFromMultipartInfo(info, p.Prefix, p.Delimiter)
if upload != nil {
Member

It seems now we can skip checking prefix inside uploadInfoFromMultipart and result always be non nil

It seems now we can skip checking prefix inside `uploadInfoFromMultipart` and result always be non nil
Member

This isn't changed

This isn't changed
Author
Member

Without checking there is a fail in a test - a call with prefix "/my" returns also item with "/zzz" prefix. Because GetSubTreeStream trims prefix to "/"

Without checking there is a fail in a [test](https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/src/branch/master/api/handler/multipart_upload_test.go#L262-L267) - a call with prefix "/my" returns also item with "/zzz" prefix. Because GetSubTreeStream trims prefix to "/"
Member

Then we should fix streaming in tree, because we it must return only nodes that have provided prefix. (/my in this case, and object /zzz/object/name3 doesn't have such prefix )

Then we should fix streaming in tree, because we it must return only nodes that have provided prefix. (`/my` in this case, and object `/zzz/object/name3` doesn't have such prefix )
@ -556,0 +574,4 @@
owner := n.BearerOwner(ctx)
cacheKey := cache.CreateListMultipartSessionCacheKey(p.Bkt.CID, p.Prefix, p.KeyMarker, p.UploadIDMarker)
session = n.cache.GetListMultipartSession(owner, cacheKey)
if session == nil {
Member

Why do this if differ from the similar from object listing?

if session == nil || session.Acquired.Swap(true) {

Why do this if differ from the similar from object listing? https://git.frostfs.info/nzinkevich/frostfs-s3-gw/src/commit/e673d727e9c9486e0b70127fb224ed58b6842fb3/api/layer/listing.go#L309
Author
Member

Changed method to be more similar

Changed method to be more similar
dkirillov marked this conversation as resolved
@ -556,0 +575,4 @@
cacheKey := cache.CreateListMultipartSessionCacheKey(p.Bkt.CID, p.Prefix, p.KeyMarker, p.UploadIDMarker)
session = n.cache.GetListMultipartSession(owner, cacheKey)
if session == nil {
ctx, cancel := context.WithCancel(ctx)
Member

We cannot use ctx from current request. It will be canceled after first request be finished

We cannot use `ctx` from current request. It will be canceled after first request be finished
dkirillov marked this conversation as resolved
@ -44,3 +44,3 @@
GetVersions(ctx context.Context, bktInfo *data.BucketInfo, objectName string) ([]*data.NodeVersion, error)
GetLatestVersion(ctx context.Context, bktInfo *data.BucketInfo, objectName string) (*data.NodeVersion, error)
InitVersionsByPrefixStream(ctx context.Context, bktInfo *data.BucketInfo, prefix string, latestOnly bool) (data.VersionsStream, error)
InitVersionsByPrefixStream(ctx context.Context, bktInfo *data.BucketInfo, prefix, treeID string, latestOnly bool) (data.VersionsStream, error)
Member

Why?

Why?
dkirillov marked this conversation as resolved
@ -900,28 +900,10 @@ func (s *DummySubTreeStream) Next() (NodeResponse, error) {
return s.data, nil
}
type MultiID []uint64
Member

Why did we move this to data package?

Why did we move this to `data` package?
dkirillov marked this conversation as resolved
@ -1084,0 +1084,4 @@
}, nil
}
func (c *Tree) InitMultipartsByPrefixStream(ctx context.Context, bktInfo *data.BucketInfo, prefix, treeID string, latestOnly bool) (data.VersionsStream, error) {
Member

This method isn't used

This method isn't used
dkirillov marked this conversation as resolved
Member

Please, rebase branch

Please, rebase branch
nzinkevich force-pushed multiparts_list_streaming from e673d727e9 to 9f4840528f 2024-11-11 12:14:53 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from 9f4840528f to c6dabf62bf 2024-11-11 12:25:42 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from c6dabf62bf to aaca4f84b8 2024-11-11 12:29:27 +00:00 Compare
dkirillov reviewed 2024-11-12 08:55:27 +00:00
@ -556,0 +577,4 @@
cacheKey := cache.CreateListMultipartSessionCacheKey(p.Bkt.CID, p.Prefix, p.KeyMarker, p.UploadIDMarker)
session = n.cache.GetListMultipartSession(owner, cacheKey)
if session == nil || session.Acquired.Swap(true) {
ctx, cancel := context.WithCancel(context.Background())
Member

We also have to add AccessBox to this context to be able to get access to tree service. Otherwise currently we get

2024-11-12T11:52:19.895+0300    error   request failed  {"request_id": "ef743d2a-0f81-4f02-ae61-82afac3db746", "method": "ListMultipartUploads", "bucket": "test", "object": "", "description": "could not list multipart uploads", "user": "NbUgTSFvPmsRxmGeWpuuGeJUoRoi6PErcM", "error": "access denied: address s01.frostfs.devenv:8080: access denied: rpc error: code = Unknown desc = status: code = 2048 message = access to object operation denied", "status": 403}

See how this done for regular listing

if bd, err := middleware.GetBoxData(ctx); err == nil {

We also have to add AccessBox to this context to be able to get access to tree service. Otherwise currently we get ``` 2024-11-12T11:52:19.895+0300 error request failed {"request_id": "ef743d2a-0f81-4f02-ae61-82afac3db746", "method": "ListMultipartUploads", "bucket": "test", "object": "", "description": "could not list multipart uploads", "user": "NbUgTSFvPmsRxmGeWpuuGeJUoRoi6PErcM", "error": "access denied: address s01.frostfs.devenv:8080: access denied: rpc error: code = Unknown desc = status: code = 2048 message = access to object operation denied", "status": 403} ``` See how this done for regular listing https://git.frostfs.info/nzinkevich/frostfs-s3-gw/src/commit/aaca4f84b8ba29031481c54f6bdcae90909287a0/api/layer/listing.go#L324
Author
Member

Fixed

Fixed
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-11-12 08:58:16 +00:00
@ -74,3 +74,3 @@
}
commonVersionsListingParams struct {
commonListingParams struct {
Member

Why do we change this?

It seems we don't use it for multipart listing

Why do we change this? It seems we don't use it for multipart listing
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-11-12 08:59:34 +00:00
@ -80,3 +80,3 @@
MaxKeys int
Marker string
Bookmark string
// key to store session in cache
Member

Not only. It also contains Marker for listing v1 and ContinuationToken for listing v2

Not only. It also contains `Marker` for listing v1 and `ContinuationToken` for listing v2
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-11-12 09:00:38 +00:00
@ -0,0 +31,4 @@
// DefaultListMultipartSessionCacheLifetime is a default lifetime of entries in cache of ListMultipartUploads.
DefaultListMultipartSessionCacheLifetime = time.Second * 60
// DefaultListMultipartSessionCacheSize is a default size of cache of ListMultipartUploads.
DefaultListMultipartSessionCacheSize = 100
Member

If we add new cache we also should be able configure this

If we add new cache we also should be able configure this
Author
Member

Added config params

Added config params
Member
Please mention these params in https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/src/branch/master/config/config.yaml and https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/src/branch/master/config/config.env Also we can write `time.Minute` instead of `time.Second * 60`
dkirillov reviewed 2024-11-12 09:20:32 +00:00
@ -540,3 +553,1 @@
uploads = uploads[:p.MaxUploads]
result.NextUploadIDMarker = uploads[len(uploads)-1].UploadID
result.NextKeyMarker = uploads[len(uploads)-1].Key
result.NextUploadIDMarker = info.UploadID
Member

It seems here we must use next rather than info.
Please add tests for that case and others

It seems here we must use `next` rather than `info`. Please add tests for that case and others
Author
Member

In previous implementation NextKeyMarker and NextUploadIDMarker was brought from the last element of current list. And the new one acts the same. So using info in this case is correct. Anyway, I'm going to write tests for this

In previous implementation `NextKeyMarker` and `NextUploadIDMarker` was brought from the last element of current list. And the new one acts the same. So using `info` in this case is correct. Anyway, I'm going to write tests for this
Member

Well, maybe I point to different error. Output sometime doesn't contain NextUploadIDMarker

$ AWS_MAX_ATTEMPTS=1 aws s3api --no-verify-ssl --endpoint http://localhost:8084   list-multipart-uploads --bucket test   | grep UploadId
            "UploadId": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9",
            "UploadId": "8fd7e720-a5c4-4e56-86c6-dc99146199c7",
            "UploadId": "8f60cdb1-7e62-41bb-98c1-567498be6dc2",
            "UploadId": "75988c0a-d94f-4667-9087-056af63acefc",
            "UploadId": "dda0fa13-cc27-4c52-aba6-2aa561d39b19",


$ AWS_MAX_ATTEMPTS=1 aws s3api --no-verify-ssl --endpoint http://localhost:8084   list-multipart-uploads --bucket test   --max-uploads 1
{
    "Bucket": "test",
    "KeyMarker": "",
    "NextKeyMarker": "dir/dir/obj",
    "Prefix": "",
    "NextUploadIdMarker": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9",
    "MaxUploads": 1,
    "IsTruncated": true,
    "Uploads": [
        {
            "UploadId": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9",
            "Key": "dir/dir/obj",
            "Initiated": "2024-11-12T09:34:43+00:00",
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt",
                "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt"
            },
            "Initiator": {
                "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt",
                "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt"
            }
        }
    ]
}

$ AWS_MAX_ATTEMPTS=1 aws s3api --no-verify-ssl --endpoint http://localhost:8084   list-multipart-uploads --bucket test   --max-uploads 1 --key-marker dir/dir/obj --upload-id-marker bd50e12d-fc4b-450b-afb1-ed082f5e2ef9
{
    "Bucket": "test",
    "KeyMarker": "dir/dir/obj",
    "UploadIdMarker": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9",
    "Prefix": "",
    "MaxUploads": 1,
    "IsTruncated": false,
    "Uploads": [
        {
            "UploadId": "8fd7e720-a5c4-4e56-86c6-dc99146199c7",
            "Key": "dir/dir/obj",
            "Initiated": "2024-11-08T14:21:51+00:00",
            "StorageClass": "STANDARD",
            "Owner": {
                "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt",
                "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt"
            },
            "Initiator": {
                "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt",
                "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt"
            }
        }
    ]
}

Well, maybe I point to different error. Output sometime doesn't contain NextUploadIDMarker ``` $ AWS_MAX_ATTEMPTS=1 aws s3api --no-verify-ssl --endpoint http://localhost:8084 list-multipart-uploads --bucket test | grep UploadId "UploadId": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9", "UploadId": "8fd7e720-a5c4-4e56-86c6-dc99146199c7", "UploadId": "8f60cdb1-7e62-41bb-98c1-567498be6dc2", "UploadId": "75988c0a-d94f-4667-9087-056af63acefc", "UploadId": "dda0fa13-cc27-4c52-aba6-2aa561d39b19", $ AWS_MAX_ATTEMPTS=1 aws s3api --no-verify-ssl --endpoint http://localhost:8084 list-multipart-uploads --bucket test --max-uploads 1 { "Bucket": "test", "KeyMarker": "", "NextKeyMarker": "dir/dir/obj", "Prefix": "", "NextUploadIdMarker": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9", "MaxUploads": 1, "IsTruncated": true, "Uploads": [ { "UploadId": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9", "Key": "dir/dir/obj", "Initiated": "2024-11-12T09:34:43+00:00", "StorageClass": "STANDARD", "Owner": { "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt", "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt" }, "Initiator": { "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt", "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt" } } ] } $ AWS_MAX_ATTEMPTS=1 aws s3api --no-verify-ssl --endpoint http://localhost:8084 list-multipart-uploads --bucket test --max-uploads 1 --key-marker dir/dir/obj --upload-id-marker bd50e12d-fc4b-450b-afb1-ed082f5e2ef9 { "Bucket": "test", "KeyMarker": "dir/dir/obj", "UploadIdMarker": "bd50e12d-fc4b-450b-afb1-ed082f5e2ef9", "Prefix": "", "MaxUploads": 1, "IsTruncated": false, "Uploads": [ { "UploadId": "8fd7e720-a5c4-4e56-86c6-dc99146199c7", "Key": "dir/dir/obj", "Initiated": "2024-11-08T14:21:51+00:00", "StorageClass": "STANDARD", "Owner": { "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt", "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt" }, "Initiator": { "ID": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt", "DisplayName": "NUUb82KR2JrVByHs2YSKgtK29gKnF5q6Vt" } } ] } ```
Author
Member

Fixed and added test for this scenario

Fixed and added test for this scenario
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-11-12 09:27:55 +00:00
@ -515,0 +521,4 @@
break
}
n.reqLogger(ctx).Warn(logs.CouldNotGetMultipartUploadInfo, zap.Error(err))
continue
Member

Probably we should return error rather than continue. And consider cancel context (also for regular listing) @alexvanin

Probably we should return error rather than continue. And consider cancel context (also for regular listing) @alexvanin
dkirillov reviewed 2024-11-12 09:40:50 +00:00
@ -553,6 +567,40 @@ func (n *Layer) ListMultipartUploads(ctx context.Context, p *ListMultipartUpload
return &result, nil
Member

Result must be sorted by upload-id if some uploads have the same key

Result must be sorted by upload-id if some uploads have the same key
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-11-12 09:56:15 +00:00
@ -1340,0 +1344,4 @@
}
func (c *Tree) GetMultipartUploadsByPrefix(ctx context.Context, bktInfo *data.BucketInfo, params data.MultipartStreamParams) (data.MultipartInfoStream, error) {
stream, err := c.getSubTreeByPrefixStream(ctx, bktInfo, systemTree, params.Prefix)
Member

Consider using approach like here

subTree, err := c.service.GetSubTreeStream(ctx, bktInfo, treeID, rootID, 2)


Otherwise if we will have objects:

  • dir/objdir1/obj
  • dir/objdir1/obj2
  • dir/obj1/obj1
  • ...
  • dir/obj1/obj1000000

and request will contain prefix: dir/objdir we will list that greater than 1000000 objects and just filter them (but we will get them anyway from storage) despite we are interested in only 2 of them

Consider using approach like here https://git.frostfs.info/nzinkevich/frostfs-s3-gw/src/commit/aaca4f84b8ba29031481c54f6bdcae90909287a0/pkg/service/tree/tree.go#L1114 Otherwise if we will have objects: * `dir/objdir1/obj` * `dir/objdir1/obj2` * `dir/obj1/obj1` * ... * `dir/obj1/obj1000000` and request will contain `prefix: dir/objdir` we will list that greater than 1000000 objects and just filter them (but we will get them anyway from storage) despite we are interested in only 2 of them
Author
Member

Changed the implementation. Now, at first, c.service.GetSubTreeStream(ctx, bktInfo, treeID, rootID, 2) is called, and then the GetSubTreeStream with max depth is applied to the children as needed. Which should prevents from getting redundant nodes

Changed the implementation. Now, at first, `c.service.GetSubTreeStream(ctx, bktInfo, treeID, rootID, 2)` is called, and then the `GetSubTreeStream` with max depth is applied to the children as needed. Which should prevents from getting redundant nodes
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-11-12 09:59:44 +00:00
@ -1253,6 +1211,26 @@ func formFilePath(node NodeResponse, fileName string, namesMap map[uint64]string
return filepath, nil
}
func formFilePathV2(node treeNode, filename string, namesMap map[uint64]*treeNode) (string, error) {
Member

It's better to use more meaningful name.
By the way why do we need separate function? And why do we use only 0th parentID below (namesMap[curNode.ParentID[0]])?

It's better to use more meaningful name. By the way why do we need separate function? And why do we use only 0th parentID below (`namesMap[curNode.ParentID[0]]`)?
Author
Member

It seems to me that multipart info can't be split so I use only first element
https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/src/branch/master/pkg/service/tree/tree.go#L332-L334

It seems to me that multipart info can't be split so I use only first element https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/src/branch/master/pkg/service/tree/tree.go#L332-L334
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-11-12 10:00:06 +00:00
@ -1329,0 +1313,4 @@
continue
}
m.nodeNames[tNode.ID[0]] = tNode
Member

Why do we use only tNode.ID[0]?

Why do we use only `tNode.ID[0]`?
dkirillov marked this conversation as resolved
nzinkevich force-pushed multiparts_list_streaming from aaca4f84b8 to fe9d7fbe7d 2024-11-15 15:20:34 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from fe9d7fbe7d to 844b4d92be 2024-11-15 15:21:10 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from 844b4d92be to 4a32fde2bf 2024-11-15 15:37:56 +00:00 Compare
dkirillov reviewed 2024-11-21 08:26:19 +00:00
@ -531,2 +542,2 @@
if p.UploadIDMarker != "" {
uploads = trimAfterUploadIDAndKey(p.KeyMarker, p.UploadIDMarker, uploads)
isTruncated := true
next, err := session.Stream.Next(ctx)
Member

I'm not sure if it's ok to separate this invocation. At least as it's done now.
See test:

diff --git a/api/handler/multipart_upload_test.go b/api/handler/multipart_upload_test.go
index 6836fc5d..dcff5e5b 100644
--- a/api/handler/multipart_upload_test.go
+++ b/api/handler/multipart_upload_test.go
@@ -266,6 +266,30 @@ func TestListMultipartUploads(t *testing.T) {
                require.Equal(t, uploadInfo2.UploadID, listUploads.Uploads[1].UploadID)
        })
 
+       t.Run("check delimiter", func(t *testing.T) {
+               t.Run("not truncated", func(t *testing.T) {
+                       listUploads := listMultipartUploads(hc, bktName, "/", "/", "", "", -1)
+                       require.Len(t, listUploads.Uploads, 0)
+                       require.Len(t, listUploads.CommonPrefixes, 2)
+                       require.Equal(t, "/my/", listUploads.CommonPrefixes[0].Prefix)
+                       require.Equal(t, "/zzz/", listUploads.CommonPrefixes[1].Prefix)
+               })
+
+               t.Run("truncated", func(t *testing.T) {
+                       listUploads := listMultipartUploads(hc, bktName, "/", "/", "", "", 1)
+                       require.Len(t, listUploads.Uploads, 0)
+                       require.Len(t, listUploads.CommonPrefixes, 1)
+                       require.Equal(t, "/my/", listUploads.CommonPrefixes[0].Prefix)
+                       require.True(t, listUploads.IsTruncated)
+
+                       listUploads = listMultipartUploads(hc, bktName, "/", "/", listUploads.NextUploadIDMarker, listUploads.NextKeyMarker, 1)
+                       require.Len(t, listUploads.Uploads, 0)
+                       require.Len(t, listUploads.CommonPrefixes, 1)
+                       require.Equal(t, "/zzz/", listUploads.CommonPrefixes[0].Prefix)
+                       require.False(t, listUploads.IsTruncated)
+               })
+       })
+

I'm not sure if it's ok to separate this invocation. At least as it's done now. See test: ```diff diff --git a/api/handler/multipart_upload_test.go b/api/handler/multipart_upload_test.go index 6836fc5d..dcff5e5b 100644 --- a/api/handler/multipart_upload_test.go +++ b/api/handler/multipart_upload_test.go @@ -266,6 +266,30 @@ func TestListMultipartUploads(t *testing.T) { require.Equal(t, uploadInfo2.UploadID, listUploads.Uploads[1].UploadID) }) + t.Run("check delimiter", func(t *testing.T) { + t.Run("not truncated", func(t *testing.T) { + listUploads := listMultipartUploads(hc, bktName, "/", "/", "", "", -1) + require.Len(t, listUploads.Uploads, 0) + require.Len(t, listUploads.CommonPrefixes, 2) + require.Equal(t, "/my/", listUploads.CommonPrefixes[0].Prefix) + require.Equal(t, "/zzz/", listUploads.CommonPrefixes[1].Prefix) + }) + + t.Run("truncated", func(t *testing.T) { + listUploads := listMultipartUploads(hc, bktName, "/", "/", "", "", 1) + require.Len(t, listUploads.Uploads, 0) + require.Len(t, listUploads.CommonPrefixes, 1) + require.Equal(t, "/my/", listUploads.CommonPrefixes[0].Prefix) + require.True(t, listUploads.IsTruncated) + + listUploads = listMultipartUploads(hc, bktName, "/", "/", listUploads.NextUploadIDMarker, listUploads.NextKeyMarker, 1) + require.Len(t, listUploads.Uploads, 0) + require.Len(t, listUploads.CommonPrefixes, 1) + require.Equal(t, "/zzz/", listUploads.CommonPrefixes[0].Prefix) + require.False(t, listUploads.IsTruncated) + }) + }) + ```
Author
Member

Modified a bit this test (because there are two objects in /my/ folder so this common prefix will appear twice and only on the third call it will be /zzz/. And fixed this scenario. But I'm not quite sure whether I understand the problem with separate invocations you proposed, could you clarify a bit?

Modified a bit this test (because there are two objects in `/my/` folder so this common prefix will appear twice and only on the third call it will be `/zzz/`. And fixed this scenario. But I'm not quite sure whether I understand the problem with `separate invocations` you proposed, could you clarify a bit?
dkirillov reviewed 2024-11-21 09:17:24 +00:00
@ -1340,0 +1316,4 @@
if filePath, err = formFilePath(tNode, fileName, m.nodePaths); err != nil {
if errors.Is(err, errParentPathNotFound) {
filePath = fileName
m.nodePaths[tNode.ID[0]] = filePath
Member

I suppose we should check if tNode.IsSplit() and skip if necessary. And only after that use tNode.ID[0]

I suppose we should check if `tNode.IsSplit()` and skip if necessary. And only after that use `tNode.ID[0]`
dkirillov reviewed 2024-11-21 09:20:02 +00:00
@ -1340,3 +1341,4 @@
if err != nil {
return nil, err
}
if node.ID[0] == m.rootID[0] {
Member

Please, use the similar check

if !s.rootID.Equal(node.GetNodeID()) && strings.HasPrefix(getFilename(node), s.tailPrefix) {

Please, use the similar check https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/src/commit/a12fea8a5b1e72cdd8cc10eb7278033305a24a3b/pkg/service/tree/tree.go#L987
dkirillov reviewed 2024-11-21 09:23:04 +00:00
@ -1344,3 +1355,1 @@
var parentPrefix string
if parentFilePath != "" { // The root of subTree can also have a parent
parentPrefix = strings.TrimSuffix(parentFilePath, separator) + separator // To avoid 'foo//bar'
func (m *multipartInfoStream) getTreeNodeFromStream(stream SubTreeStream) (*treeNode, error) {
Member

This function should be
func getTreeNodeFromStream(stream SubTreeStream) (*treeNode, error)
or
func (m *multipartInfoStream) getTreeNodeFromStream() (*treeNode, error)

This function should be `func getTreeNodeFromStream(stream SubTreeStream) (*treeNode, error)` or `func (m *multipartInfoStream) getTreeNodeFromStream() (*treeNode, error)`
dkirillov reviewed 2024-11-21 09:25:01 +00:00
@ -515,0 +520,4 @@
for uploadsCount < p.MaxUploads {
info, err = session.Stream.Next(ctx)
if err != nil {
if err == io.EOF {
Member

I would use errors.Is(err, io.EOF). In service/tree also

I would use `errors.Is(err, io.EOF)`. In `service/tree` also
dkirillov reviewed 2024-11-21 09:28:18 +00:00
@ -1340,0 +1300,4 @@
tNode, err = m.getTreeNodeFromStream(m.currentStream)
if err != nil {
if err == io.EOF {
var err2 error
Member

Do we really need new variable err2?
It seems we can write

if m.currentStream, err = m.openNewStream(ctx); err != nil {
	return nil, err
}

because of continue below

Do we really need new variable `err2`? It seems we can write ```golang if m.currentStream, err = m.openNewStream(ctx); err != nil { return nil, err } ``` because of `continue` below
dkirillov reviewed 2024-11-21 09:29:22 +00:00
@ -1340,0 +1309,4 @@
return nil, err
}
var ok bool
fileName, ok = tNode.FileName()
Member

Why not just fileName, ok := tNode.FileName() ?

Why not just `fileName, ok := tNode.FileName()` ?
dkirillov reviewed 2024-11-21 09:42:26 +00:00
@ -1340,0 +1314,4 @@
continue
}
if filePath, err = formFilePath(tNode, fileName, m.nodePaths); err != nil {
if errors.Is(err, errParentPathNotFound) {
Member

Why do we treat this error special? It seems normally we traverse node in right order and if we don't see parent it's a bug in storage node. Root node filepath we can set initially to this map

Why do we treat this error special? It seems normally we traverse node in right order and if we don't see parent it's a bug in storage node. Root node filepath we can set initially to this map
dkirillov reviewed 2024-11-21 09:45:30 +00:00
@ -1340,0 +1326,4 @@
continue
}
if id, ok := tNode.Meta[uploadIDKV]; ok {
if m.keyMarker == "" || filePath > m.keyMarker || (filePath == m.keyMarker && m.uploadID != "" && id > m.uploadID) {
Member

It seems we can simplify this to

if m.keyMarker < filePath || (m.keyMarker == filePath && m.uploadID < id) {
It seems we can simplify this to ```golang if m.keyMarker < filePath || (m.keyMarker == filePath && m.uploadID < id) { ```
nzinkevich force-pushed multiparts_list_streaming from 4a32fde2bf to 5586b8fb96 2024-11-21 12:00:32 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from 5586b8fb96 to cea8d4b324 2024-11-21 14:05:27 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from cea8d4b324 to 88c4c117e5 2024-11-22 15:18:27 +00:00 Compare
nzinkevich force-pushed multiparts_list_streaming from 88c4c117e5 to 0f5a2e0a15 2024-11-22 15:19:27 +00:00 Compare
Member

We have some problem with streaming #561

We have some problem with streaming https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/issues/561
All checks were successful
/ DCO (pull_request) Successful in 1m49s
Required
Details
/ Builds (pull_request) Successful in 2m14s
Required
Details
/ Vulncheck (pull_request) Successful in 2m23s
Required
Details
/ Lint (pull_request) Successful in 3m24s
Required
Details
/ Tests (pull_request) Successful in 2m17s
Required
Details
This pull request has changes conflicting with the target branch.
  • api/handler/handlers_test.go
  • api/layer/cache.go
  • cmd/s3-gw/app.go
  • cmd/s3-gw/app_settings.go
  • docs/configuration.md
  • pkg/service/tree/tree_test.go
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u multiparts_list_streaming:nzinkevich-multiparts_list_streaming
git checkout nzinkevich-multiparts_list_streaming
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-services-committers
TrueCloudLab/storage-services-developers
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-s3-gw#527
No description provided.