[#585] ListBuckets pagination support #591

Merged
alexvanin merged 2 commits from nzinkevich/frostfs-s3-gw:feature/list_buckets_pagination into master 2025-01-21 07:49:20 +00:00
Member

Add support for new ListBuckets query params for pagination. To test, use --prefix, --bucket-region, --continuation-token, --max-buckets flags in aws cli.

Add support for new ListBuckets query params for pagination. To test, use `--prefix`, `--bucket-region`, `--continuation-token`, `--max-buckets` flags in `aws cli`.
nzinkevich self-assigned this 2024-12-20 06:59:10 +00:00
nzinkevich requested review from storage-services-developers 2024-12-20 06:59:10 +00:00
nzinkevich requested review from storage-services-committers 2024-12-20 06:59:10 +00:00
nzinkevich force-pushed feature/list_buckets_pagination from fa66f9a135 to 8466f66067 2024-12-20 07:05:14 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from 8466f66067 to 9789edd296 2024-12-20 11:38:48 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from 9789edd296 to d11ace207a 2024-12-20 11:39:22 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from d11ace207a to fdbd3823bb 2024-12-20 12:14:25 +00:00 Compare
dkirillov reviewed 2024-12-20 13:24:31 +00:00
@ -299,3 +299,2 @@
func createBucket(hc *handlerContext, bktName string) *createBucketInfo {
box, key := createAccessBox(hc.t)
func createBucket(hc *handlerContext, prm bucketPrm) *createBucketInfo {
Member

Consider params like:

func createBucket(hc *handlerContext, bktName string, locationConstraint string) *createBucketInfo {

or even don't change this function but create new one

func createBucketWithContraint(hc *handlerContext, bktName string, locationConstraint string) *createBucketInfo {
Consider params like: ```golang func createBucket(hc *handlerContext, bktName string, locationConstraint string) *createBucketInfo { ``` or even don't change this function but create new one ```golang func createBucketWithContraint(hc *handlerContext, bktName string, locationConstraint string) *createBucketInfo { ```
nzinkevich marked this conversation as resolved
@ -0,0 +15,4 @@
var (
own user.ID
res *ListBucketsResponse
ctx = r.Context()
Member

nitpick: Please use

diff --git a/api/handler/bucket_list.go b/api/handler/bucket_list.go
index 1b067a18..35e2ce7e 100644
--- a/api/handler/bucket_list.go
+++ b/api/handler/bucket_list.go
@@ -12,12 +12,8 @@ import (
 
 // ListBucketsHandler handles bucket listing requests.
 func (h *handler) ListBucketsHandler(w http.ResponseWriter, r *http.Request) {
-       var (
-               own     user.ID
-               res     *ListBucketsResponse
-               ctx     = r.Context()
-               reqInfo = middleware.GetReqInfo(ctx)
-       )
+       ctx := r.Context()
+       reqInfo := middleware.GetReqInfo(ctx)
 
        params, err := parseListBucketParams(r)
        if err != nil {
@@ -31,11 +27,12 @@ func (h *handler) ListBucketsHandler(w http.ResponseWriter, r *http.Request) {
                return
        }
 
+       var own user.ID
        if len(result.Containers) > 0 {
                own = result.Containers[0].Owner
        }
 
-       res = &ListBucketsResponse{
+       res := &ListBucketsResponse{
                Owner: Owner{
                        ID:          own.String(),
                        DisplayName: own.String(),
nitpick: Please use ```diff diff --git a/api/handler/bucket_list.go b/api/handler/bucket_list.go index 1b067a18..35e2ce7e 100644 --- a/api/handler/bucket_list.go +++ b/api/handler/bucket_list.go @@ -12,12 +12,8 @@ import ( // ListBucketsHandler handles bucket listing requests. func (h *handler) ListBucketsHandler(w http.ResponseWriter, r *http.Request) { - var ( - own user.ID - res *ListBucketsResponse - ctx = r.Context() - reqInfo = middleware.GetReqInfo(ctx) - ) + ctx := r.Context() + reqInfo := middleware.GetReqInfo(ctx) params, err := parseListBucketParams(r) if err != nil { @@ -31,11 +27,12 @@ func (h *handler) ListBucketsHandler(w http.ResponseWriter, r *http.Request) { return } + var own user.ID if len(result.Containers) > 0 { own = result.Containers[0].Owner } - res = &ListBucketsResponse{ + res := &ListBucketsResponse{ Owner: Owner{ ID: own.String(), DisplayName: own.String(), ```
nzinkevich marked this conversation as resolved
@ -0,0 +35,4 @@
own = result.Containers[0].Owner
}
res = &ListBucketsResponse{
Member

Let's extract this to method/function encodeListbuckets or something like that (similar to object listing handlers).

Let's extract this to method/function `encodeListbuckets` or something like that (similar to object listing handlers).
nzinkevich marked this conversation as resolved
@ -0,0 +38,4 @@
res = &ListBucketsResponse{
Owner: Owner{
ID: own.String(),
DisplayName: own.String(),
Member

Probably we can use reqInfo.User

Probably we can use `reqInfo.User`
nzinkevich marked this conversation as resolved
@ -0,0 +58,4 @@
}
func parseListBucketParams(r *http.Request) (params layer.ListBucketParams, err error) {
strMaxBuckets := r.URL.Query().Get("max-buckets")
Member

It's better to use consts.
Probably even from middleware (middleware.QueryMaxBuckets)

It's better to use consts. Probably even from middleware (`middleware.QueryMaxBuckets`)
nzinkevich marked this conversation as resolved
@ -0,0 +60,4 @@
func parseListBucketParams(r *http.Request) (params layer.ListBucketParams, err error) {
strMaxBuckets := r.URL.Query().Get("max-buckets")
if strMaxBuckets != "" {
if params.MaxBuckets, err = strconv.Atoi(r.URL.Query().Get("max-buckets")); err != nil {
Member

We must check parsed value for negativeness and add test for that. Otherwise we will get panic, see:

diff --git a/api/handler/bucket_list_test.go b/api/handler/bucket_list_test.go
index b6c700b1..b4fc24e9 100644
--- a/api/handler/bucket_list_test.go
+++ b/api/handler/bucket_list_test.go
@@ -94,7 +94,7 @@ func TestHandler_ListBucketsHandler(t *testing.T) {
 
        t.Run("pagination", func(t *testing.T) {
                t.Run("happy path", func(t *testing.T) {
-                       params := bucketPrm{query: url.Values{"max-buckets": []string{"1"}}}
+                       params := bucketPrm{query: url.Values{"max-buckets": []string{"-1"}}}
                        resp := listBuckets(hc, params)
                        require.Len(t, resp.Buckets.Buckets, 1)
                        require.NotEmpty(t, resp.ContinuationToken)

We must check parsed value for negativeness and add test for that. Otherwise we will get panic, see: ```diff diff --git a/api/handler/bucket_list_test.go b/api/handler/bucket_list_test.go index b6c700b1..b4fc24e9 100644 --- a/api/handler/bucket_list_test.go +++ b/api/handler/bucket_list_test.go @@ -94,7 +94,7 @@ func TestHandler_ListBucketsHandler(t *testing.T) { t.Run("pagination", func(t *testing.T) { t.Run("happy path", func(t *testing.T) { - params := bucketPrm{query: url.Values{"max-buckets": []string{"1"}}} + params := bucketPrm{query: url.Values{"max-buckets": []string{"-1"}}} resp := listBuckets(hc, params) require.Len(t, resp.Buckets.Buckets, 1) require.NotEmpty(t, resp.ContinuationToken) ```
nzinkevich marked this conversation as resolved
@ -0,0 +120,4 @@
})
}
func listBuckets(hc *handlerContext, prm bucketPrm) ListBucketsResponse {
Member

Consider using function signature like in listObjectsV1. Or at least create new param (don't reuse bucketPrm since a half of that fields aren't used in this function)

Consider using function signature like in `listObjectsV1`. Or at least create new param (don't reuse `bucketPrm` since a half of that fields aren't used in this function)
nzinkevich marked this conversation as resolved
@ -397,0 +406,4 @@
bktName string
query url.Values
box *accessbox.Box
key *keys.PrivateKey
Member

Why do we need box and key in this struct if in createBucket we always overwrite them?

Why do we need `box` and `key` in this struct if in `createBucket` we always overwrite them?
Author
Member

Remove key from params but box is needed for createBucketBase and for now is not overwritten

Remove `key` from params but `box` is needed for `createBucketBase` and for now is not overwritten
dkirillov marked this conversation as resolved
@ -397,0 +407,4 @@
query url.Values
box *accessbox.Box
key *keys.PrivateKey
body any
Member

This should be more specific

This should be more specific
nzinkevich marked this conversation as resolved
@ -105,0 +110,4 @@
if info.CID.EncodeToString() == listParams.ContinuationToken {
tokenMet = true
}
if shouldSkipBucket(tokenMet, info, listParams) {
Member

It seems before skip bucket we should list all and sort by bucket name

It seems before skip bucket we should list all and sort by bucket name
nzinkevich marked this conversation as resolved
@ -79,3 +114,3 @@
objectErrors map[string]error
objectPutErrors map[string]error
containers map[string]*container.Container
containers containers
Member

Why do we need this change?

Why do we need this change?
Author
Member

I wanted to test list-bucket pagination, but map[string]*container.Container in mock has undetermined iterate order. When I call ListBucketHandler sequentially with continuation token, it's very likely that the test will fail

UPD: if we sorting containers now, it is not necessary anymore

I wanted to test list-bucket pagination, but `map[string]*container.Container` in mock has undetermined iterate order. When I call ListBucketHandler sequentially with continuation token, it's very likely that the test will fail UPD: if we sorting containers now, it is not necessary anymore
nzinkevich marked this conversation as resolved
@ -376,0 +387,4 @@
var result ListBucketResult
var err error
result.Containers, err = n.containerList(ctx, params)
Member

Before listing we should check max-buckets. It it's 0 then just return empty result

Before listing we should check max-buckets. It it's 0 then just return empty result
nzinkevich marked this conversation as resolved
nzinkevich force-pushed feature/list_buckets_pagination from fdbd3823bb to 19c771bb97 2024-12-25 14:18:21 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from 19c771bb97 to 8fb6139774 2024-12-25 14:20:57 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from 8fb6139774 to 601ff75d71 2024-12-26 12:36:35 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from 601ff75d71 to e0fa9e27d0 2024-12-26 12:37:07 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from e0fa9e27d0 to a2a84b398d 2024-12-26 14:24:29 +00:00 Compare
nzinkevich force-pushed feature/list_buckets_pagination from a2a84b398d to 29a7e7908f 2025-01-09 13:45:45 +00:00 Compare
dkirillov reviewed 2025-01-10 06:40:50 +00:00
@ -108,0 +113,4 @@
var tokenCID cid.ID
tokenMet := false
if listParams.ContinuationToken == "" {
Member

Despite that in AWS ContinuationToken is not a real key I suppose we can use real key. Technically the bevavior be the same. So I suggest consider this:

diff --git a/api/layer/container.go b/api/layer/container.go
index 83d312f1..9006f700 100644
--- a/api/layer/container.go
+++ b/api/layer/container.go
@@ -15,7 +15,6 @@ import (
 	"git.frostfs.info/TrueCloudLab/frostfs-s3-gw/internal/logs"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/client"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container"
-	cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id"
 	"go.uber.org/zap"
 )
 
@@ -104,6 +103,11 @@ func (n *Layer) containerList(ctx context.Context, listParams ListBucketsParams)
 			n.reqLogger(ctx).Error(logs.CouldNotFetchContainerInfo, zap.Error(err))
 			continue
 		}
+
+		if shouldSkipBucket(info, listParams) {
+			continue
+		}
+
 		list = append(list, info)
 	}
 
@@ -111,34 +115,18 @@ func (n *Layer) containerList(ctx context.Context, listParams ListBucketsParams)
 		return list[i].Name < list[j].Name
 	})
 
-	var tokenCID cid.ID
-	tokenMet := false
-	if listParams.ContinuationToken == "" {
-		// skip token check if empty
-		tokenMet = true
-	} else if err = tokenCID.DecodeString(listParams.ContinuationToken); err != nil {
-		return nil, err
-	}
-
-	k := 0
-	count := len(list)
-	for i := 0; i < count; i++ {
-		if list[k].CID.Equals(tokenCID) {
-			tokenMet = true
-		}
-		if shouldSkipBucket(tokenMet, list[k], listParams) {
-			list = append(list[:k], list[k+1:]...)
+	for i, info := range list {
+		if info.Name != listParams.ContinuationToken && listParams.ContinuationToken != "" {
 			continue
 		}
-		k++
+		return list[i:], nil
 	}
 
-	return list, nil
+	return nil, nil
 }
 
-func shouldSkipBucket(tokenMet bool, info *data.BucketInfo, prm ListBucketsParams) bool {
-	if !tokenMet ||
-		!strings.HasPrefix(info.Name, prm.Prefix) ||
+func shouldSkipBucket(info *data.BucketInfo, prm ListBucketsParams) bool {
+	if !strings.HasPrefix(info.Name, prm.Prefix) ||
 		(prm.BucketRegion != "" && info.LocationConstraint != prm.BucketRegion) {
 		return true
 	}
diff --git a/api/layer/layer.go b/api/layer/layer.go
index 2ff74da1..643461d4 100644
--- a/api/layer/layer.go
+++ b/api/layer/layer.go
@@ -396,7 +396,7 @@ func (n *Layer) ListBuckets(ctx context.Context, params ListBucketsParams) (List
 		return ListBucketsResult{}, err
 	}
 	if params.MaxBuckets != 0 && len(result.Containers) > params.MaxBuckets {
-		result.ContinuationToken = result.Containers[params.MaxBuckets].CID.EncodeToString()
+		result.ContinuationToken = result.Containers[params.MaxBuckets].Name
 		result.Containers = result.Containers[:params.MaxBuckets]
 	}
 

Despite that in AWS `ContinuationToken` is not a real key I suppose we can use real key. Technically the bevavior be the same. So I suggest consider this: ```diff diff --git a/api/layer/container.go b/api/layer/container.go index 83d312f1..9006f700 100644 --- a/api/layer/container.go +++ b/api/layer/container.go @@ -15,7 +15,6 @@ import ( "git.frostfs.info/TrueCloudLab/frostfs-s3-gw/internal/logs" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/client" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container" - cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id" "go.uber.org/zap" ) @@ -104,6 +103,11 @@ func (n *Layer) containerList(ctx context.Context, listParams ListBucketsParams) n.reqLogger(ctx).Error(logs.CouldNotFetchContainerInfo, zap.Error(err)) continue } + + if shouldSkipBucket(info, listParams) { + continue + } + list = append(list, info) } @@ -111,34 +115,18 @@ func (n *Layer) containerList(ctx context.Context, listParams ListBucketsParams) return list[i].Name < list[j].Name }) - var tokenCID cid.ID - tokenMet := false - if listParams.ContinuationToken == "" { - // skip token check if empty - tokenMet = true - } else if err = tokenCID.DecodeString(listParams.ContinuationToken); err != nil { - return nil, err - } - - k := 0 - count := len(list) - for i := 0; i < count; i++ { - if list[k].CID.Equals(tokenCID) { - tokenMet = true - } - if shouldSkipBucket(tokenMet, list[k], listParams) { - list = append(list[:k], list[k+1:]...) + for i, info := range list { + if info.Name != listParams.ContinuationToken && listParams.ContinuationToken != "" { continue } - k++ + return list[i:], nil } - return list, nil + return nil, nil } -func shouldSkipBucket(tokenMet bool, info *data.BucketInfo, prm ListBucketsParams) bool { - if !tokenMet || - !strings.HasPrefix(info.Name, prm.Prefix) || +func shouldSkipBucket(info *data.BucketInfo, prm ListBucketsParams) bool { + if !strings.HasPrefix(info.Name, prm.Prefix) || (prm.BucketRegion != "" && info.LocationConstraint != prm.BucketRegion) { return true } diff --git a/api/layer/layer.go b/api/layer/layer.go index 2ff74da1..643461d4 100644 --- a/api/layer/layer.go +++ b/api/layer/layer.go @@ -396,7 +396,7 @@ func (n *Layer) ListBuckets(ctx context.Context, params ListBucketsParams) (List return ListBucketsResult{}, err } if params.MaxBuckets != 0 && len(result.Containers) > params.MaxBuckets { - result.ContinuationToken = result.Containers[params.MaxBuckets].CID.EncodeToString() + result.ContinuationToken = result.Containers[params.MaxBuckets].Name result.Containers = result.Containers[:params.MaxBuckets] } ```
dkirillov marked this conversation as resolved
@ -376,0 +395,4 @@
if err != nil {
return ListBucketsResult{}, err
}
if params.MaxBuckets != 0 && len(result.Containers) > params.MaxBuckets {
Member

Can be just

if len(result.Containers) > params.MaxBuckets {
Can be just ``` if len(result.Containers) > params.MaxBuckets { ```
nzinkevich marked this conversation as resolved
nzinkevich force-pushed feature/list_buckets_pagination from 29a7e7908f to 551b7343bd 2025-01-10 07:50:32 +00:00 Compare
dkirillov approved these changes 2025-01-10 09:52:27 +00:00
Dismissed
nzinkevich force-pushed feature/list_buckets_pagination from 551b7343bd to 7dd308b466 2025-01-16 13:15:49 +00:00 Compare
nzinkevich dismissed dkirillov's review 2025-01-16 13:15:49 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

dkirillov requested review from storage-services-developers 2025-01-17 06:12:52 +00:00
dkirillov requested review from storage-services-committers 2025-01-17 06:12:55 +00:00
dkirillov approved these changes 2025-01-17 06:15:00 +00:00
alexvanin approved these changes 2025-01-21 07:48:41 +00:00
alexvanin added this to the v0.33.0 milestone 2025-01-21 07:48:51 +00:00
alexvanin merged commit 619385836d into master 2025-01-21 07:49:20 +00:00
alexvanin deleted branch feature/list_buckets_pagination 2025-01-21 07:49:21 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-s3-gw#591
No description provided.