[#151] Index page for FrostFS containers #153
No reviewers
TrueCloudLab/storage-services-developers
Labels
No labels
P0
P1
P2
P3
good first issue
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/frostfs-http-gw#153
Loading…
Reference in a new issue
No description provided.
Delete branch "nzinkevich/frostfs-http-gw:feature/index_page_native"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #151. Make index page work not only via s3-like requests like
/get/<bucketName>/<prefix>
but also with ordinary http-gw requests withcid
andoid
:/get/<cid>/<oid>
. Native protocol index page returns whencid
(as a FrostFS container ID) parses correctly butoid
is empty, incorrect or not foundfeature/index_page_nativeto [#151] feature/index_page_native[#151] feature/index_page_nativeto [#151] Index page for FrostFS containers7d4c03712a
to9c1842f899
9c1842f899
to33b3c8ac9d
33b3c8ac9d
to675c76a9c8
@ -366,0 +369,4 @@
| Parameter | Type | SIGHUP reload | Default value | Description |
|------------------------|----------|---------------|---------------|----------------------------------------------------------------------------------------------|
| `enabled` | `bool` | yes | `false` | Flag to enable index_page return if no object with specified S3-name was found. |
| `template_path.s3` | `string` | yes | `""` | Path to .gotmpl file with html templates for S3 index_page (which is using tree service fr . |
Unfinished sentence in description?
I placed this info above the table, but forgot to delete it here. Fixed
675c76a9c8
to8845571b2c
@ -209,2 +210,3 @@
v.SetDefault(cfgIndexPageEnabled, false)
v.SetDefault(cfgIndexPageTemplatePath, "")
v.SetDefault(cfgIndexPageS3TemplatePath, "")
v.SetDefault(cfgIndexPageNativeTemplatePath, "")
Actually we can skip setting such defaults.
@ -41,0 +58,4 @@
func newListObjectsResponseNative(attrs map[string]string) ResponseObject {
return ResponseObject{
OID: attrs[attrOID],
Created: attrs[object.AttributeTimestamp] + "000",
Maybe it's better to save to fields value in appropriate format and don't invoke any functions from template?
Will we change this? It still invokes
formatTimestamp
from template@ -117,0 +172,4 @@
basePath := strings.TrimRightFunc(prefix, func(r rune) bool {
return r != '/'
})
objectIDs, err := h.search(ctx, bucketInfo.CID, object.AttributeFilePath, prefix, object.MatchCommonPrefix)
Can we search all object if prefix is empty to be able to handle objects that have
FileName
rather thanFilePath
?Fixed
I meant write something like this:
@ -117,0 +189,4 @@
const initialSliceCapacity = 100
var (
log = h.log.With(
Please use
:=
syntax where possible@ -117,0 +222,4 @@
go func() {
defer wg.Done()
addr.SetObject(id)
We cannot use
addr
from other goroutine.@ -117,0 +240,4 @@
dirname := getNextDir(attrs[object.AttributeFilePath], basePath)
if dirname == "" {
mu.Lock()
objects = append(objects, newListObjectsResponseNative(attrs))
Why don't we use channel for this?
@ -117,0 +254,4 @@
}
}()
return false
If I understand correctly, we stop iteration if any error is occurred. So here we should
return true
if context is canceled@ -28,2 +26,2 @@
err := id.DecodeString(test)
if err != nil {
testCID, _ := c.UserValue("cid").(string)
var cntID cid.ID
We usually use
cnrID
name. So let's write:@ -205,6 +218,28 @@ func (h *Handler) byAddress(c *fasthttp.RequestCtx, f func(context.Context, requ
addr.SetContainer(bktInfo.CID)
addr.SetObject(*objID)
_, err = h.frostfs.GetObject(ctx, PrmObjectGet{
We do the same
GetObject
request inf
function. It's better to useHeadObject
.By the way, do we really need this? If
oid
is actual object id why do we list objects?We only checked before that
idObj
string is decodable to oid type, but I think we should also output index page if no object found. That's why I try toGetObject
(Head is better, I agree)Could you explain why?
Well, I thought the S3 index page worked similarly, but I forgot that I removed this feature in the final implementation. Apache also does not return an index page if page not found. So I agree, it shouldn't return index page in that case
@ -0,0 +1,101 @@
{{$container := .BucketInfo.CID}}
Do we really need separate template?
8845571b2c
to4fe34909aa
4fe34909aa
tobe8b89c487
be8b89c487
toe0a0f53fb8
@ -222,2 +221,4 @@
path := a.cfg.GetString(cfgIndexPageTemplatePath)
tmpl, err := a.readTemplate(path)
if err != nil {
a.settings.setIndexTemplate("")
By the way, why do we set default template if we couldn't load new one? Maybe it's worth don't change current template?
@ -117,0 +191,4 @@
zap.String("cid", bucketInfo.CID.EncodeToString()),
zap.String("prefix", prefix),
)
basePath := strings.TrimRightFunc(prefix, func(r rune) bool {
Matter of taste but
@ -117,0 +250,4 @@
func (h *Handler) headDirObjects(ctx context.Context, p headDirParams) {
wg := sync.WaitGroup{}
dirs := sync.Map{}
Please, write:
Or
@ -117,0 +256,4 @@
err := p.objectIDs.Iterate(func(id oid.ID) bool {
wg.Add(1)
go func(id oid.ID) {
It's better to restrict number of goroutines. See for example s3-gw listing
err = pool.Submit(func() {
@ -194,3 +194,3 @@
idCnr, _ = c.UserValue("cid").(string)
idObj, _ = c.UserValue("oid").(string)
idObj, _ = url.PathUnescape(c.UserValue("oid").(string))
log = h.log.With(zap.String("cid", idCnr), zap.String("oid", idObj))
Please, write:
@ -207,2 +207,4 @@
objID := new(oid.ID)
if err = objID.DecodeString(idObj); err != nil {
if h.config.IndexPageEnabled() {
var addr oid.Address
This variable is unused
@ -61,3 +61,3 @@
return ""
}
func (c *configMock) IndexPageTemplate() string {
func (c *configMock) IndexPageNativeTemplate() string {
Please, add new line
@ -59,4 +72,3 @@
</tr>
{{end}}
{{range .Objects}}
<tr>
I don't understand why do we need conditiond
If $isNative
. We show objects and dirs, in both cases objects and dirs have the same attibutes (name, size, timestamp, OID)I meant something like
@ -117,0 +198,4 @@
if basePath == "" {
filters = append(filters, object.MatchNotPresent)
}
objCh := make(chan ResponseObject)
It's better to create such channel somewhere in
headDirObjects
and close it there when no more objects be availablee0a0f53fb8
to3f421c6d12
3f421c6d12
tob900a915dd
b900a915dd
tof437d74273
@ -230,3 +231,1 @@
a.settings.setIndexTemplate("")
a.log.Warn(logs.FailedToReadIndexPageTemplate, zap.Error(err))
return
a.log.Fatal(logs.FailedToReadIndexPageTemplate, zap.Error(err))
Please, don't use fatal. Otherwise by sighup server can be stopped
@ -116,0 +252,4 @@
done <- struct{}{}
}()
pool, err := ants.NewPool(runtime.NumCPU())
Pool size should be configurable. And default should be at least 100 (maybe 1000)
@ -116,0 +276,4 @@
close(p.errCh)
close(p.objCh)
<-done
<-done
This 4 lines seems confusing to me.
Can we rewrite this code to make it more strightforward?
For example:
@ -116,0 +263,4 @@
defer wg.Done()
h.headDirObject(ctx, id, &dirs, p)
}); err != nil {
p.errCh <- err
Here we also should add
wg.Done
Please, rebase branch
f437d74273
toc56eafac0e
c56eafac0e
to1972f69099
@ -29,1 +26,3 @@
if err != nil {
cnrIDStr, _ := c.UserValue("cid").(string)
var cnrID cid.ID
if err := cnrID.DecodeString(cnrIDStr); err != nil {
What's the difference between checking
oid
before andcid
now?@ -116,0 +242,4 @@
}
func (h *Handler) headDirObjects(ctx context.Context, cnrID cid.ID, objectIDs ResObjectSearch, basePath string) (chan ResponseObjectExtended, error) {
pool, err := ants.NewPool(h.config.IndexPagePoolSize())
As far as I understand, every simultaneous browse request will produce new pool with up to 1000 routines by default. That might increase goroutine usage very very fast under the load.
I suggest to use one pool of workers in the handler.
1972f69099
tof8369145c9
f8369145c9
to558609d311
558609d311
to6622110f8c
6622110f8c
to12819b6bd3
@ -22,2 +28,4 @@
attrFileName = "FileName"
attrSize = "Size"
defaultPoolSize = 1000
I think this is basically unused and can be moved to test file. In the app you defined default pool size in settings.go
@ -76,2 +84,2 @@
<td>{{if not .IsDir}}{{ formatSize .Size }}{{end}}</td>
<td>{{if not .IsDir}}{{ formatTimestamp .Created }}{{end}}</td>
<td>{{.OID}}</td>
<td>{{if .Size}}{{ formatSize .Size }}{{end}}</td>
Why do we change condition from
if not .IsDir
toif .Size
?Oh, .IsDir would be better to mark empty files with '0B' size
@ -27,3 +26,1 @@
var id oid.ID
err := id.DecodeString(test)
if err != nil {
cnrIDStr, _ := c.UserValue("cid").(string)
Why did we change
oid
tocid
?Oh. It seems I got it. Then we should rename
h.byObjectName
andh.byAddress
so that it reflect reality.And/or be able to handle urls like
get/<bucket-name>/<oid>
@ -31,6 +32,7 @@ type Config interface {
ZipCompression() bool
ClientCut() bool
IndexPageEnabled() bool
WorkerPoolSize() int
This is unused
@ -245,0 +248,4 @@
func (s *appSettings) WorkerPoolSize() int {
s.mu.RLock()
defer s.mu.RUnlock()
return s.workerPoolSize
Is this parameter reloadable? It seems not. Then we don't need this getter. And we should place the
s.workerPoolSize
field aboves.mu
field inappSettings
struct@ -116,0 +240,4 @@
Error error
}
func (h *Handler) headDirObjects(ctx context.Context, cnrID cid.ID, objectIDs ResObjectSearch, basePath string) (chan ResponseObjectExtended, error) {
Let's write
@ -116,0 +217,4 @@
objects := make([]ResponseObject, 0, 100)
for objExt := range resp {
if objExt.Error != nil {
log.Error(logs.FailedToHeadObject, zap.Error(objExt.Error))
What do we want to do on such error?
cc @alexvanin
12819b6bd3
to264e095137
New commits pushed, approval review dismissed automatically according to repository settings
264e095137
tob9e77b64c1
b9e77b64c1
toe18e5b4fc5
New commits pushed, approval review dismissed automatically according to repository settings
e18e5b4fc5
to181568f442
@ -96,6 +97,7 @@ type (
clientCut bool
returnIndexPage bool
indexPageTemplate string
workerPoolSize int
Please place this field right below
dialerSource
@ -116,0 +165,4 @@
return nil, err
}
var result = &GetObjectsResponse{
Please don't use
var
when we initialize struct:Additionally, we can skip setting
hasErrors: false
Overall LGTM (see minor comments)
@ -191,2 +193,3 @@
// byNativeAddress is a wrapper for function (e.g. request.headObject, request.receiveFile) that
// prepares request and object address to it.
func (h *Handler) byAddress(c *fasthttp.RequestCtx, f func(context.Context, request, oid.Address)) {
func (h *Handler) byNativeAddress(c *fasthttp.RequestCtx, f func(context.Context, request, oid.Address)) {
After internal demo, let's modify this code a bit.
We want to determine which browse function to use based on tree data.
If tree data is present, then use
h.getDirObjectsS3
,If tree data is absent, then use
h.getDirObjectsNative
.New commits pushed, approval review dismissed automatically according to repository settings
339d944bef
to3b26cf921c
3b26cf921c
toec30e9c008
@ -33,0 +27,4 @@
downloadQueryParam := c.QueryArgs().GetBool("download")
switch {
case isObjectID(oidURLParam):
Why don't we check
cid
also?There may be also container identifier from NNS, and we want to resolve them too. So I check only
oid
, as it was beforeThen if we create object
5rnoerMhK3kSxDEUh2FrSQvSW86P2deahSifFLxFBF9v
into bucketmy-bucket
and I wanted to get it by http-gw likecurl http://<http-gw>/get/my-bucket/5rnoerMhK3kSxDEUh2FrSQvSW86P2deahSifFLxFBF9v
I wil got404
. Is it ok?@ -415,3 +414,3 @@
}
return objects, nil
c.SetStatusCode(fasthttp.StatusOK)
Why do we need this if we have the same on line 433 ?
And why we set status here rather than in
h.browseObjects
?By the way, why do we ever need to set success status?
ec30e9c008
to6c21977ccb
New commits pushed, approval review dismissed automatically according to repository settings
6c21977ccb
to8ef4ac1f9b
@ -75,18 +75,21 @@ request_timeout: 5s
rebalance_timer: 30s
pool_error_threshold: 100
reconnect_interval: 1m
worker_pool_size: 1000
It should be added to config files too I suppose
8ef4ac1f9b
toaa0a7d86c6
New commits pushed, approval review dismissed automatically according to repository settings
aa0a7d86c6
to43764772aa