[#151] Index page for FrostFS containers #153

Open
nzinkevich wants to merge 1 commit from nzinkevich/frostfs-http-gw:feature/index_page_native into master
Member

Closes #151. Make index page work not only via s3-like requests like /get/<bucketName>/<prefix> but also with ordinary http-gw requests with cid and oid: /get/<cid>/<oid>. Native protocol index page returns when cid (as a FrostFS container ID) parses correctly but oid is empty, incorrect or not found

Closes #151. Make index page work not only via s3-like requests like `/get/<bucketName>/<prefix>` but also with ordinary http-gw requests with `cid` and `oid`: `/get/<cid>/<oid>`. Native protocol index page returns when `cid` (as a FrostFS container ID) parses correctly but `oid` is empty, incorrect or not found
nzinkevich added 2 commits 2024-10-11 13:46:41 +00:00
Signed-off-by: Nikita Zinkevich <n.zinkevich@yadro.com>
[#151] index page: Move getDirObjects handlers to browse.go
Some checks failed
/ DCO (pull_request) Failing after 1m2s
/ Vulncheck (pull_request) Successful in 1m30s
/ Builds (pull_request) Successful in 1m0s
/ Lint (pull_request) Successful in 2m20s
/ Tests (pull_request) Successful in 1m7s
7d4c03712a
nzinkevich changed title from feature/index_page_native to [#151] feature/index_page_native 2024-10-11 13:46:59 +00:00
nzinkevich requested review from storage-services-committers 2024-10-11 13:47:44 +00:00
nzinkevich requested review from storage-services-developers 2024-10-11 13:47:46 +00:00
nzinkevich changed title from [#151] feature/index_page_native to [#151] Index page for FrostFS containers 2024-10-11 13:49:13 +00:00
nzinkevich force-pushed feature/index_page_native from 7d4c03712a to 9c1842f899 2024-10-14 06:27:57 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from 9c1842f899 to 33b3c8ac9d 2024-10-14 06:40:34 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from 33b3c8ac9d to 675c76a9c8 2024-10-14 06:45:40 +00:00 Compare
alexvanin reviewed 2024-10-14 13:54:02 +00:00
@ -366,0 +369,4 @@
| Parameter | Type | SIGHUP reload | Default value | Description |
|------------------------|----------|---------------|---------------|----------------------------------------------------------------------------------------------|
| `enabled` | `bool` | yes | `false` | Flag to enable index_page return if no object with specified S3-name was found. |
| `template_path.s3` | `string` | yes | `""` | Path to .gotmpl file with html templates for S3 index_page (which is using tree service fr . |
Owner

Unfinished sentence in description?

Unfinished sentence in description?
Author
Member

I placed this info above the table, but forgot to delete it here. Fixed

I placed this info above the table, but forgot to delete it here. Fixed
nzinkevich force-pushed feature/index_page_native from 675c76a9c8 to 8845571b2c 2024-10-15 06:34:27 +00:00 Compare
dkirillov reviewed 2024-10-16 11:19:28 +00:00
@ -209,2 +210,3 @@
v.SetDefault(cfgIndexPageEnabled, false)
v.SetDefault(cfgIndexPageTemplatePath, "")
v.SetDefault(cfgIndexPageS3TemplatePath, "")
v.SetDefault(cfgIndexPageNativeTemplatePath, "")
Member

Actually we can skip setting such defaults.

Actually we can skip setting such defaults.
dkirillov marked this conversation as resolved
@ -41,0 +58,4 @@
func newListObjectsResponseNative(attrs map[string]string) ResponseObject {
return ResponseObject{
OID: attrs[attrOID],
Created: attrs[object.AttributeTimestamp] + "000",
Member

Maybe it's better to save to fields value in appropriate format and don't invoke any functions from template?

Maybe it's better to save to fields value in appropriate format and don't invoke any functions from template?
Member

Will we change this? It still invokes formatTimestamp from template

Will we change this? It still invokes `formatTimestamp` from template
dkirillov marked this conversation as resolved
@ -117,0 +172,4 @@
basePath := strings.TrimRightFunc(prefix, func(r rune) bool {
return r != '/'
})
objectIDs, err := h.search(ctx, bucketInfo.CID, object.AttributeFilePath, prefix, object.MatchCommonPrefix)
Member

Can we search all object if prefix is empty to be able to handle objects that have FileName rather than FilePath?

Can we search all object if prefix is empty to be able to handle objects that have `FileName` rather than `FilePath`?
Author
Member

Fixed

Fixed
Member

I meant write something like this:

filters := object.NewSearchFilters()
filters.AddRootFilter()

if prefix != "" {
    filters.AddFilter(object.AttributeFilePath, prefix, object.MatchCommonPrefix)
}
I meant write something like this: ```golang filters := object.NewSearchFilters() filters.AddRootFilter() if prefix != "" { filters.AddFilter(object.AttributeFilePath, prefix, object.MatchCommonPrefix) } ```
dkirillov marked this conversation as resolved
@ -117,0 +189,4 @@
const initialSliceCapacity = 100
var (
log = h.log.With(
Member

Please use := syntax where possible

Please use `:=` syntax where possible
dkirillov marked this conversation as resolved
@ -117,0 +222,4 @@
go func() {
defer wg.Done()
addr.SetObject(id)
Member

We cannot use addr from other goroutine.

We cannot use `addr` from other goroutine.
dkirillov marked this conversation as resolved
@ -117,0 +240,4 @@
dirname := getNextDir(attrs[object.AttributeFilePath], basePath)
if dirname == "" {
mu.Lock()
objects = append(objects, newListObjectsResponseNative(attrs))
Member

Why don't we use channel for this?

Why don't we use channel for this?
dkirillov marked this conversation as resolved
@ -117,0 +254,4 @@
}
}()
return false
Member

If I understand correctly, we stop iteration if any error is occurred. So here we should return true if context is canceled

If I understand correctly, we stop iteration if any error is occurred. So here we should `return true` if context is canceled
dkirillov marked this conversation as resolved
@ -28,2 +26,2 @@
err := id.DecodeString(test)
if err != nil {
testCID, _ := c.UserValue("cid").(string)
var cntID cid.ID
Member

We usually use cnrID name. So let's write:

cnrIDStr, _ := c.UserValue("cid").(string)
var cnrID cid.ID
We usually use `cnrID` name. So let's write: ```golang cnrIDStr, _ := c.UserValue("cid").(string) var cnrID cid.ID ```
dkirillov marked this conversation as resolved
@ -205,6 +218,28 @@ func (h *Handler) byAddress(c *fasthttp.RequestCtx, f func(context.Context, requ
addr.SetContainer(bktInfo.CID)
addr.SetObject(*objID)
_, err = h.frostfs.GetObject(ctx, PrmObjectGet{
Member

We do the same GetObject request in f function. It's better to use HeadObject.

By the way, do we really need this? If oid is actual object id why do we list objects?

We do the same `GetObject` request in `f` function. It's better to use `HeadObject`. By the way, do we really need this? If `oid` is actual object id why do we list objects?
Author
Member

We only checked before that idObjstring is decodable to oid type, but I think we should also output index page if no object found. That's why I try to GetObject (Head is better, I agree)

We only checked before that `idObj`string is decodable to oid type, but I think we should also output index page if no object found. That's why I try to `GetObject` (Head is better, I agree)
Member

I think we should also output index page if no object found

Could you explain why?

> I think we should also output index page if no object found Could you explain why?
Author
Member

Well, I thought the S3 index page worked similarly, but I forgot that I removed this feature in the final implementation. Apache also does not return an index page if page not found. So I agree, it shouldn't return index page in that case

Well, I thought the S3 index page worked similarly, but I forgot that I removed this feature in the final implementation. Apache also does not return an index page if page not found. So I agree, it shouldn't return index page in that case
dkirillov marked this conversation as resolved
@ -0,0 +1,101 @@
{{$container := .BucketInfo.CID}}
Member

Do we really need separate template?

Do we really need separate template?
dkirillov marked this conversation as resolved
nzinkevich force-pushed feature/index_page_native from 8845571b2c to 4fe34909aa 2024-10-18 08:14:48 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from 4fe34909aa to be8b89c487 2024-10-18 08:18:38 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from be8b89c487 to e0a0f53fb8 2024-10-18 08:21:13 +00:00 Compare
dkirillov reviewed 2024-10-21 12:23:12 +00:00
@ -222,2 +221,4 @@
path := a.cfg.GetString(cfgIndexPageTemplatePath)
tmpl, err := a.readTemplate(path)
if err != nil {
a.settings.setIndexTemplate("")
Member

By the way, why do we set default template if we couldn't load new one? Maybe it's worth don't change current template?

By the way, why do we set default template if we couldn't load new one? Maybe it's worth don't change current template?
dkirillov marked this conversation as resolved
@ -117,0 +191,4 @@
zap.String("cid", bucketInfo.CID.EncodeToString()),
zap.String("prefix", prefix),
)
basePath := strings.TrimRightFunc(prefix, func(r rune) bool {
Member

Matter of taste but

	var basePath string
	if ind := strings.LastIndex(prefix, "/"); ind != -1 {
		basePath = prefix[:ind+1]
	}
Matter of taste but ```golang var basePath string if ind := strings.LastIndex(prefix, "/"); ind != -1 { basePath = prefix[:ind+1] } ```
dkirillov marked this conversation as resolved
@ -117,0 +250,4 @@
func (h *Handler) headDirObjects(ctx context.Context, p headDirParams) {
wg := sync.WaitGroup{}
dirs := sync.Map{}
Member

Please, write:

var wg sync.WaitGroup
var dirs sync.Map

Or

var (
    wg sync.WaitGroup
    dirs sync.Map
)
Please, write: ```golang var wg sync.WaitGroup var dirs sync.Map ``` Or ```golang var ( wg sync.WaitGroup dirs sync.Map ) ```
dkirillov marked this conversation as resolved
@ -117,0 +256,4 @@
err := p.objectIDs.Iterate(func(id oid.ID) bool {
wg.Add(1)
go func(id oid.ID) {
Member

It's better to restrict number of goroutines. See for example s3-gw listing

It's better to restrict number of goroutines. See for example s3-gw listing https://git.frostfs.info/TrueCloudLab/frostfs-s3-gw/src/commit/e35b582fe25a97c018367ef9244dfe93d4500ad9/api/layer/listing.go#L530
dkirillov marked this conversation as resolved
@ -194,3 +194,3 @@
idCnr, _ = c.UserValue("cid").(string)
idObj, _ = c.UserValue("oid").(string)
idObj, _ = url.PathUnescape(c.UserValue("oid").(string))
log = h.log.With(zap.String("cid", idCnr), zap.String("oid", idObj))
Member

Please, write:

		idCnr, _ := c.UserValue("cid").(string)
		idObj, _ := url.PathUnescape(c.UserValue("oid").(string))
		log      := h.log.With(zap.String("cid", idCnr), zap.String("oid", idObj))
Please, write: ```golang idCnr, _ := c.UserValue("cid").(string) idObj, _ := url.PathUnescape(c.UserValue("oid").(string)) log := h.log.With(zap.String("cid", idCnr), zap.String("oid", idObj)) ```
dkirillov marked this conversation as resolved
@ -207,2 +207,4 @@
objID := new(oid.ID)
if err = objID.DecodeString(idObj); err != nil {
if h.config.IndexPageEnabled() {
var addr oid.Address
Member

This variable is unused

This variable is unused
dkirillov marked this conversation as resolved
@ -61,3 +61,3 @@
return ""
}
func (c *configMock) IndexPageTemplate() string {
func (c *configMock) IndexPageNativeTemplate() string {
Member

Please, add new line

Please, add new line
dkirillov marked this conversation as resolved
@ -59,4 +72,3 @@
</tr>
{{end}}
{{range .Objects}}
<tr>
Member

I don't understand why do we need conditiond If $isNative. We show objects and dirs, in both cases objects and dirs have the same attibutes (name, size, timestamp, OID)

I don't understand why do we need conditiond `If $isNative`. We show objects and dirs, in both cases objects and dirs have the same attibutes (name, size, timestamp, OID)
Member

I meant something like

diff --git a/internal/handler/browse.go b/internal/handler/browse.go
index a2373d1..b55d349 100644
--- a/internal/handler/browse.go
+++ b/internal/handler/browse.go
@@ -45,12 +45,14 @@ type (
 		FilePath string
 		Size     string
 		IsDir    bool
+		GetURL   string
 	}
 )
 
 func newListObjectsResponseS3(attrs map[string]string) ResponseObject {
 	return ResponseObject{
 		Created:  formatTimestamp(attrs[attrCreated]),
+		OID:      attrs[attrOID],
 		FileName: attrs[attrFileName],
 		Size:     attrs[attrSize],
 		IsDir:    attrs[attrOID] == "",
@@ -171,6 +173,7 @@ func (h *Handler) getDirObjectsS3(ctx context.Context, bucketInfo *data.BucketIn
 		}
 		obj := newListObjectsResponseS3(attrs)
 		obj.FilePath = prefix + obj.FileName
+		obj.GetURL = "/get/" + bucketInfo.Name + "/" + urlencode(obj.FilePath)
 		objects = append(objects, obj)
 	}
 
@@ -202,7 +205,15 @@ func (h *Handler) getDirObjectsNative(ctx context.Context, bucketInfo *data.Buck
 	}
 	defer objectIDs.Close()
 
-	return h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath)
+	res, err := h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath)
+	if err != nil {
+		return nil, err
+	}
+	for i := range res {
+		res[i].GetURL = "/get/" + bucketInfo.CID.EncodeToString() + "/" + res[i].OID
+	}
+
+	return res, nil
 }
 
 type workerParams struct {
diff --git a/internal/templates/index.gotmpl b/internal/templates/index.gotmpl
index a7ea5a9..d20fb13 100644
--- a/internal/templates/index.gotmpl
+++ b/internal/templates/index.gotmpl
@@ -71,17 +71,12 @@
             <td>
                 {{if .IsDir}}
                     🗀
-                    <a href="/get/{{$container}}{{ urlencode .FilePath }}/">
+                    <a href="{{ .GetURL }}/">
                         {{.FileName}}/
                     </a>
-                {{else if .OID}}
-                    🗎
-                    <a href="/get/{{$container}}/{{ .OID }}">
-                        {{.FileName}}
-                    </a>
                 {{else}}
                     🗎
-                    <a href="/get/{{$container}}{{ urlencode .FilePath }}">
+                    <a href="{{ .GetURL }}">
                         {{.FileName}}
                     </a>
                 {{end}}
@@ -90,15 +85,9 @@
             <td>{{if .Size}}{{ formatSize .Size }}{{end}}</td>
             <td>{{ .Created }}</td>
             <td>
-                {{if .OID}}
-                    <a href="/get/{{$container}}/{{ .OID }}?download=true">
-                        Link
-                    </a>
-                {{else}}
-                    <a href="/get/{{$container}}{{ urlencode .FilePath }}?download=true">
-                        Link
-                    </a>
-                {{ end }}
+                <a href="{{ .GetURL }}?download=true">
+                    Link
+                </a>
             </td>
         </tr>
     {{end}}

I meant something like ```diff diff --git a/internal/handler/browse.go b/internal/handler/browse.go index a2373d1..b55d349 100644 --- a/internal/handler/browse.go +++ b/internal/handler/browse.go @@ -45,12 +45,14 @@ type ( FilePath string Size string IsDir bool + GetURL string } ) func newListObjectsResponseS3(attrs map[string]string) ResponseObject { return ResponseObject{ Created: formatTimestamp(attrs[attrCreated]), + OID: attrs[attrOID], FileName: attrs[attrFileName], Size: attrs[attrSize], IsDir: attrs[attrOID] == "", @@ -171,6 +173,7 @@ func (h *Handler) getDirObjectsS3(ctx context.Context, bucketInfo *data.BucketIn } obj := newListObjectsResponseS3(attrs) obj.FilePath = prefix + obj.FileName + obj.GetURL = "/get/" + bucketInfo.Name + "/" + urlencode(obj.FilePath) objects = append(objects, obj) } @@ -202,7 +205,15 @@ func (h *Handler) getDirObjectsNative(ctx context.Context, bucketInfo *data.Buck } defer objectIDs.Close() - return h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath) + res, err := h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath) + if err != nil { + return nil, err + } + for i := range res { + res[i].GetURL = "/get/" + bucketInfo.CID.EncodeToString() + "/" + res[i].OID + } + + return res, nil } type workerParams struct { diff --git a/internal/templates/index.gotmpl b/internal/templates/index.gotmpl index a7ea5a9..d20fb13 100644 --- a/internal/templates/index.gotmpl +++ b/internal/templates/index.gotmpl @@ -71,17 +71,12 @@ <td> {{if .IsDir}} 🗀 - <a href="/get/{{$container}}{{ urlencode .FilePath }}/"> + <a href="{{ .GetURL }}/"> {{.FileName}}/ </a> - {{else if .OID}} - 🗎 - <a href="/get/{{$container}}/{{ .OID }}"> - {{.FileName}} - </a> {{else}} 🗎 - <a href="/get/{{$container}}{{ urlencode .FilePath }}"> + <a href="{{ .GetURL }}"> {{.FileName}} </a> {{end}} @@ -90,15 +85,9 @@ <td>{{if .Size}}{{ formatSize .Size }}{{end}}</td> <td>{{ .Created }}</td> <td> - {{if .OID}} - <a href="/get/{{$container}}/{{ .OID }}?download=true"> - Link - </a> - {{else}} - <a href="/get/{{$container}}{{ urlencode .FilePath }}?download=true"> - Link - </a> - {{ end }} + <a href="{{ .GetURL }}?download=true"> + Link + </a> </td> </tr> {{end}} ```
dkirillov reviewed 2024-10-21 13:55:42 +00:00
@ -117,0 +198,4 @@
if basePath == "" {
filters = append(filters, object.MatchNotPresent)
}
objCh := make(chan ResponseObject)
Member

It's better to create such channel somewhere in headDirObjects and close it there when no more objects be available

It's better to create such channel somewhere in `headDirObjects` and close it there when no more objects be available
dkirillov marked this conversation as resolved
nzinkevich force-pushed feature/index_page_native from e0a0f53fb8 to 3f421c6d12 2024-10-23 08:41:32 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from 3f421c6d12 to b900a915dd 2024-10-23 09:19:29 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from b900a915dd to f437d74273 2024-10-23 09:31:07 +00:00 Compare
nzinkevich self-assigned this 2024-10-24 07:19:19 +00:00
dkirillov reviewed 2024-10-28 06:44:51 +00:00
@ -230,3 +231,1 @@
a.settings.setIndexTemplate("")
a.log.Warn(logs.FailedToReadIndexPageTemplate, zap.Error(err))
return
a.log.Fatal(logs.FailedToReadIndexPageTemplate, zap.Error(err))
Member

Please, don't use fatal. Otherwise by sighup server can be stopped

Please, don't use fatal. Otherwise by sighup server can be stopped
dkirillov reviewed 2024-10-28 06:51:03 +00:00
@ -116,0 +252,4 @@
done <- struct{}{}
}()
pool, err := ants.NewPool(runtime.NumCPU())
Member

Pool size should be configurable. And default should be at least 100 (maybe 1000)

Pool size should be configurable. And default should be at least 100 (maybe 1000)
dkirillov reviewed 2024-10-28 08:15:17 +00:00
@ -116,0 +276,4 @@
close(p.errCh)
close(p.objCh)
<-done
<-done
Member

This 4 lines seems confusing to me.
Can we rewrite this code to make it more strightforward?
For example:

diff --git a/internal/handler/browse.go b/internal/handler/browse.go
index a2373d1..d3f84ae 100644
--- a/internal/handler/browse.go
+++ b/internal/handler/browse.go
@@ -12,7 +12,6 @@ import (
 	"time"
 
 	"git.frostfs.info/TrueCloudLab/frostfs-http-gw/internal/data"
-	"git.frostfs.info/TrueCloudLab/frostfs-http-gw/internal/logs"
 	"git.frostfs.info/TrueCloudLab/frostfs-http-gw/utils"
 	cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object"
@@ -202,116 +201,118 @@ func (h *Handler) getDirObjectsNative(ctx context.Context, bucketInfo *data.Buck
 	}
 	defer objectIDs.Close()
 
-	return h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath)
-}
+	ctx, cancel := context.WithCancel(ctx)
+	defer cancel()
 
-type workerParams struct {
-	cnrID     cid.ID
-	objectIDs ResObjectSearch
-	basePath  string
-	errCh     chan error
-	objCh     chan ResponseObject
-	cancel    context.CancelFunc
-}
+	resp, err := h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath)
+	if err != nil {
+		return nil, err
+	}
 
-func (h *Handler) headDirObjects(ctx context.Context, cnrID cid.ID, objectIDs ResObjectSearch, basePath string) ([]ResponseObject, error) {
-	const initialSliceCapacity = 100
+	log := utils.GetReqLogOrDefault(ctx, h.log)
 
-	var wg sync.WaitGroup
-	var dirs sync.Map
-	log := h.log.With(
-		zap.String("cid", cnrID.EncodeToString()),
-		zap.String("path", basePath),
-	)
-	done := make(chan struct{})
-	objects := make([]ResponseObject, 0, initialSliceCapacity)
-	ctx, cancel := context.WithCancel(ctx)
-	p := workerParams{
-		cnrID:     cnrID,
-		objectIDs: objectIDs,
-		basePath:  basePath,
-		errCh:     make(chan error, 1),
-		objCh:     make(chan ResponseObject, 1),
-		cancel:    cancel,
-	}
-	defer cancel()
+	dirs := make(map[string]struct{})
+	objects := make([]ResponseObject, 0, 100)
+	for objExt := range resp {
+		if objExt.Error != nil {
+			log.Error("error", zap.Error(objExt.Error))
+			cancel()
+			continue
+		}
 
-	go func() {
-		for err := range p.errCh {
-			if err != nil {
-				log.Error(logs.FailedToHeadObject, zap.Error(err))
+		if objExt.Object.IsDir {
+			if _, ok := dirs[objExt.Object.FileName]; ok {
+				continue
 			}
+			dirs[objExt.Object.FileName] = struct{}{}
 		}
-		done <- struct{}{}
-	}()
 
-	go func() {
-		for obj := range p.objCh {
-			objects = append(objects, obj)
-		}
-		done <- struct{}{}
-	}()
+		objects = append(objects, objExt.Object)
+	}
 
+	return objects, nil
+}
+
+type ResponseObjectExtended struct {
+	Object ResponseObject
+	Error  error
+}
+
+func (h *Handler) headDirObjects(ctx context.Context, cnrID cid.ID, objectIDs ResObjectSearch, basePath string) (chan ResponseObjectExtended, error) {
 	pool, err := ants.NewPool(runtime.NumCPU())
 	if err != nil {
 		return nil, err
 	}
-	defer pool.Release()
-	err = objectIDs.Iterate(func(id oid.ID) bool {
-		wg.Add(1)
-		if err = pool.Submit(func() {
-			defer wg.Done()
-			h.headDirObject(ctx, id, &dirs, p)
-		}); err != nil {
-			p.errCh <- err
-		}
-		select {
-		case <-ctx.Done():
-			return true
-		default:
-			return false
+
+	res := make(chan ResponseObjectExtended)
+
+	go func() {
+		defer func() {
+			pool.Release()
+			close(res)
+		}()
+
+		log := utils.GetReqLogOrDefault(ctx, h.log).With(
+			zap.String("cid", cnrID.EncodeToString()),
+			zap.String("path", basePath),
+		)
+
+		var wg sync.WaitGroup
+		err = objectIDs.Iterate(func(id oid.ID) bool {
+			wg.Add(1)
+			err = pool.Submit(func() {
+				defer wg.Done()
+
+				var obj ResponseObjectExtended
+				obj.Object, obj.Error = h.headDirObject(ctx, cnrID, id, basePath)
+				res <- obj
+			})
+			if err != nil {
+				wg.Done()
+				log.Warn("FailedToSubmitTaskToPool", zap.Error(err))
+			}
+
+			select {
+			case <-ctx.Done():
+				return true
+			default:
+				return false
+			}
+		})
+		if err != nil {
+			log.Error("iterate", zap.Error(err))
 		}
-	})
-	wg.Wait()
-	close(p.errCh)
-	close(p.objCh)
-	<-done
-	<-done
 
-	if err != nil {
-		return nil, err
-	}
+		wg.Wait()
+	}()
 
-	return objects, nil
+	return res, nil
 }
 
-func (h *Handler) headDirObject(ctx context.Context, objID oid.ID, dirs *sync.Map, p workerParams) {
-	addr := newAddress(p.cnrID, objID)
+func (h *Handler) headDirObject(ctx context.Context, cnrID cid.ID, objID oid.ID, basePath string) (ResponseObject, error) {
+	addr := newAddress(cnrID, objID)
 	obj, err := h.frostfs.HeadObject(ctx, PrmObjectHead{
 		PrmAuth: PrmAuth{BearerToken: bearerToken(ctx)},
 		Address: addr,
 	})
 	if err != nil {
-		p.errCh <- err
-		p.cancel()
-		return
+		return ResponseObject{}, err
 	}
 
 	attrs := loadAttributes(obj.Attributes())
 	attrs[attrOID] = objID.EncodeToString()
 	attrs[attrSize] = strconv.FormatUint(obj.PayloadSize(), 10)
 
-	dirname := getNextDir(attrs[object.AttributeFilePath], p.basePath)
+	dirname := getNextDir(attrs[object.AttributeFilePath], basePath)
 	if dirname == "" {
-		p.objCh <- newListObjectsResponseNative(attrs)
-	} else if _, ok := dirs.Load(dirname); !ok {
-		p.objCh <- ResponseObject{
-			FileName: dirname,
-			FilePath: p.basePath + dirname,
-			IsDir:    true,
-		}
-		dirs.Store(dirname, true)
+		return newListObjectsResponseNative(attrs), nil
 	}
+
+	return ResponseObject{
+		FileName: dirname,
+		FilePath: basePath + dirname,
+		IsDir:    true,
+	}, nil
 }
 
 type browseParams struct {

This 4 lines seems confusing to me. Can we rewrite this code to make it more strightforward? For example: ```diff diff --git a/internal/handler/browse.go b/internal/handler/browse.go index a2373d1..d3f84ae 100644 --- a/internal/handler/browse.go +++ b/internal/handler/browse.go @@ -12,7 +12,6 @@ import ( "time" "git.frostfs.info/TrueCloudLab/frostfs-http-gw/internal/data" - "git.frostfs.info/TrueCloudLab/frostfs-http-gw/internal/logs" "git.frostfs.info/TrueCloudLab/frostfs-http-gw/utils" cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object" @@ -202,116 +201,118 @@ func (h *Handler) getDirObjectsNative(ctx context.Context, bucketInfo *data.Buck } defer objectIDs.Close() - return h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath) -} + ctx, cancel := context.WithCancel(ctx) + defer cancel() -type workerParams struct { - cnrID cid.ID - objectIDs ResObjectSearch - basePath string - errCh chan error - objCh chan ResponseObject - cancel context.CancelFunc -} + resp, err := h.headDirObjects(ctx, bucketInfo.CID, objectIDs, basePath) + if err != nil { + return nil, err + } -func (h *Handler) headDirObjects(ctx context.Context, cnrID cid.ID, objectIDs ResObjectSearch, basePath string) ([]ResponseObject, error) { - const initialSliceCapacity = 100 + log := utils.GetReqLogOrDefault(ctx, h.log) - var wg sync.WaitGroup - var dirs sync.Map - log := h.log.With( - zap.String("cid", cnrID.EncodeToString()), - zap.String("path", basePath), - ) - done := make(chan struct{}) - objects := make([]ResponseObject, 0, initialSliceCapacity) - ctx, cancel := context.WithCancel(ctx) - p := workerParams{ - cnrID: cnrID, - objectIDs: objectIDs, - basePath: basePath, - errCh: make(chan error, 1), - objCh: make(chan ResponseObject, 1), - cancel: cancel, - } - defer cancel() + dirs := make(map[string]struct{}) + objects := make([]ResponseObject, 0, 100) + for objExt := range resp { + if objExt.Error != nil { + log.Error("error", zap.Error(objExt.Error)) + cancel() + continue + } - go func() { - for err := range p.errCh { - if err != nil { - log.Error(logs.FailedToHeadObject, zap.Error(err)) + if objExt.Object.IsDir { + if _, ok := dirs[objExt.Object.FileName]; ok { + continue } + dirs[objExt.Object.FileName] = struct{}{} } - done <- struct{}{} - }() - go func() { - for obj := range p.objCh { - objects = append(objects, obj) - } - done <- struct{}{} - }() + objects = append(objects, objExt.Object) + } + return objects, nil +} + +type ResponseObjectExtended struct { + Object ResponseObject + Error error +} + +func (h *Handler) headDirObjects(ctx context.Context, cnrID cid.ID, objectIDs ResObjectSearch, basePath string) (chan ResponseObjectExtended, error) { pool, err := ants.NewPool(runtime.NumCPU()) if err != nil { return nil, err } - defer pool.Release() - err = objectIDs.Iterate(func(id oid.ID) bool { - wg.Add(1) - if err = pool.Submit(func() { - defer wg.Done() - h.headDirObject(ctx, id, &dirs, p) - }); err != nil { - p.errCh <- err - } - select { - case <-ctx.Done(): - return true - default: - return false + + res := make(chan ResponseObjectExtended) + + go func() { + defer func() { + pool.Release() + close(res) + }() + + log := utils.GetReqLogOrDefault(ctx, h.log).With( + zap.String("cid", cnrID.EncodeToString()), + zap.String("path", basePath), + ) + + var wg sync.WaitGroup + err = objectIDs.Iterate(func(id oid.ID) bool { + wg.Add(1) + err = pool.Submit(func() { + defer wg.Done() + + var obj ResponseObjectExtended + obj.Object, obj.Error = h.headDirObject(ctx, cnrID, id, basePath) + res <- obj + }) + if err != nil { + wg.Done() + log.Warn("FailedToSubmitTaskToPool", zap.Error(err)) + } + + select { + case <-ctx.Done(): + return true + default: + return false + } + }) + if err != nil { + log.Error("iterate", zap.Error(err)) } - }) - wg.Wait() - close(p.errCh) - close(p.objCh) - <-done - <-done - if err != nil { - return nil, err - } + wg.Wait() + }() - return objects, nil + return res, nil } -func (h *Handler) headDirObject(ctx context.Context, objID oid.ID, dirs *sync.Map, p workerParams) { - addr := newAddress(p.cnrID, objID) +func (h *Handler) headDirObject(ctx context.Context, cnrID cid.ID, objID oid.ID, basePath string) (ResponseObject, error) { + addr := newAddress(cnrID, objID) obj, err := h.frostfs.HeadObject(ctx, PrmObjectHead{ PrmAuth: PrmAuth{BearerToken: bearerToken(ctx)}, Address: addr, }) if err != nil { - p.errCh <- err - p.cancel() - return + return ResponseObject{}, err } attrs := loadAttributes(obj.Attributes()) attrs[attrOID] = objID.EncodeToString() attrs[attrSize] = strconv.FormatUint(obj.PayloadSize(), 10) - dirname := getNextDir(attrs[object.AttributeFilePath], p.basePath) + dirname := getNextDir(attrs[object.AttributeFilePath], basePath) if dirname == "" { - p.objCh <- newListObjectsResponseNative(attrs) - } else if _, ok := dirs.Load(dirname); !ok { - p.objCh <- ResponseObject{ - FileName: dirname, - FilePath: p.basePath + dirname, - IsDir: true, - } - dirs.Store(dirname, true) + return newListObjectsResponseNative(attrs), nil } + + return ResponseObject{ + FileName: dirname, + FilePath: basePath + dirname, + IsDir: true, + }, nil } type browseParams struct { ```
dkirillov reviewed 2024-10-28 08:15:46 +00:00
@ -116,0 +263,4 @@
defer wg.Done()
h.headDirObject(ctx, id, &dirs, p)
}); err != nil {
p.errCh <- err
Member

Here we also should add wg.Done

Here we also should add `wg.Done`
Member

Please, rebase branch

Please, rebase branch
nzinkevich force-pushed feature/index_page_native from f437d74273 to c56eafac0e 2024-10-31 12:42:08 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from c56eafac0e to 1972f69099 2024-10-31 12:49:40 +00:00 Compare
alexvanin reviewed 2024-11-02 09:04:42 +00:00
@ -29,1 +26,3 @@
if err != nil {
cnrIDStr, _ := c.UserValue("cid").(string)
var cnrID cid.ID
if err := cnrID.DecodeString(cnrIDStr); err != nil {
Owner

What's the difference between checking oid before and cid now?

What's the difference between checking `oid` before and `cid` now?
alexvanin requested changes 2024-11-02 09:30:31 +00:00
@ -116,0 +242,4 @@
}
func (h *Handler) headDirObjects(ctx context.Context, cnrID cid.ID, objectIDs ResObjectSearch, basePath string) (chan ResponseObjectExtended, error) {
pool, err := ants.NewPool(h.config.IndexPagePoolSize())
Owner

As far as I understand, every simultaneous browse request will produce new pool with up to 1000 routines by default. That might increase goroutine usage very very fast under the load.

I suggest to use one pool of workers in the handler.

As far as I understand, every simultaneous browse request will produce new pool with up to 1000 routines by default. That might increase goroutine usage very very fast under the load. I suggest to use one pool of workers in the handler.
alexvanin marked this conversation as resolved
nzinkevich force-pushed feature/index_page_native from 1972f69099 to f8369145c9 2024-11-02 10:24:03 +00:00 Compare
nzinkevich requested review from alexvanin 2024-11-02 10:27:08 +00:00
nzinkevich force-pushed feature/index_page_native from f8369145c9 to 558609d311 2024-11-02 10:30:42 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from 558609d311 to 6622110f8c 2024-11-02 10:44:13 +00:00 Compare
nzinkevich force-pushed feature/index_page_native from 6622110f8c to 12819b6bd3 2024-11-02 10:47:10 +00:00 Compare
alexvanin approved these changes 2024-11-02 13:57:08 +00:00
@ -22,2 +28,4 @@
attrFileName = "FileName"
attrSize = "Size"
defaultPoolSize = 1000
Owner

I think this is basically unused and can be moved to test file. In the app you defined default pool size in settings.go

I think this is basically unused and can be moved to test file. In the app you defined default pool size in settings.go
All checks were successful
/ DCO (pull_request) Successful in 1m3s
/ Vulncheck (pull_request) Successful in 1m38s
/ Builds (pull_request) Successful in 1m10s
/ Lint (pull_request) Successful in 2m2s
/ Tests (pull_request) Successful in 1m13s
This pull request doesn't have enough approvals yet. 1 of 2 approvals granted.
You are not authorized to merge this pull request.
View command line instructions

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u feature/index_page_native:nzinkevich-feature/index_page_native
git checkout nzinkevich-feature/index_page_native
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-services-developers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-http-gw#153
No description provided.