feature/integrate_tree_service #8
Labels
No labels
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
3 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/distribution#8
Loading…
Reference in a new issue
No description provided.
Delete branch ":feature/integrate_tree_service"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
close #3
8d2fe3a6cf
to51fd77f1b9
51fd77f1b9
tobb533eeec1
@ -652,28 +683,28 @@ func (d *driver) List(ctx context.Context, path string) ([]string, error) {
}
func (d *driver) Move(ctx context.Context, sourcePath string, destPath string) error {
As far as I see, move operation does not modify object payload. I guess it is sufficient to change tree node in tree service and do not fetch and reupload object into storage. Is it correct?
Currently we store filepath in object attributes, so we have to reupload object to change this attribute
Aside of restoring tree in case of pilorama faillure on all container nodes, do we need filepath attribute in the object?
Probably not
We've decided to keep filepath attribute in the object in case of global tree failures. At the same time,
Move
operation will not update filepath attribute, therefore will not reupload object.@ -513,4 +551,4 @@
var prmHead pool.PrmObjectHead
prmHead.SetAddress(addr)
obj, err := d.sdkPool.HeadObject(ctx, prmHead)
Seems like we can store payload size in k/v pair in
treeObj
meta, so we can remove this head request. The same is applicable for other head requests.Maybe I am wrong, because we upload object part-by-part and we don't know total size unless we use head request. Need to look into it.
If object isn't fully uploaded we shouldn't read it I suppose
@ -0,0 +16,4 @@
GetListOIDBySplitID(ctx context.Context, containerID cid.ID, path string, splitID *object.SplitID) ([]oid.ID, error)
AddObject(ctx context.Context, containerID cid.ID, path string, oid oid.ID) (uint64, error)
AddPHYObject(ctx context.Context, containerID cid.ID, path string, oid oid.ID, splitID *object.SplitID) (uint64, error)
@dkirillov What you think about having different tree IDs for high-level regular object and system-related physical objects? Is it worth it?
Yes, I think it's ok. The same mechanics in s3-gw for multipart upload
@ -591,52 +629,45 @@ func (d *driver) Stat(ctx context.Context, path string) (storagedriver.FileInfo,
return newFileInfoDir(path), nil
}
id, ok, err := d.searchOneBase(ctx, path, true)
Methods
d.searchOneBase
andd.searchOne
are not used anymore. Can be dropped.@ -706,3 +741,1 @@
prmSearch.SetFilters(filters)
res, err := d.sdkPool.SearchObjects(ctx, prmSearch)
treeObjList, err := d.treeService.GetListObjectByPrefix(ctx, d.containerID, path)
I'm not sure this is correct because we get all object and
deleting "/a" does delete "/ab"
but must not.It's better to get all nodes by the exact names and list sub tree if one of that node is intermediate.
Not quite. In this case, when requesting deletion with the path
/a
, the file/ab
will not be deleted, because:GetListObjectByPrefix
will first find the root node (intermediate) and then take a subtree with no depth limit. Then a separate request is made to get a node for the/a
file viagetObjectByPath
(in whichGetNodes
is called along the/a
path). As a result,/ab
will not be in the final list. But if, despite this, it is better to use your approach, no problem.Please check the following test:
This test will not pass due to the fact that paths must start with a slash (distribution, when calling driver methods, always passes paths starting with a slash). But now I have added path validation to the driver methods. I also made a change regarding skipping the root node in
GetListObjectByPrefix
.An example of a test that works and considers the situation you described above:
@ -0,0 +24,4 @@
ServiceClient interface {
GetNodes(ctx context.Context, p *GetNodesParams) ([]NodeResponse, error)
GetSubTree(ctx context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) ([]NodeResponse, error)
GetSubTreeStream(ctx context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) (SubTreeStream, error)
This is unused method (some others too). Can we drop it then?
@ -0,0 +72,4 @@
sizeKV = "Size"
splitIDKV = "SplitID"
versionTree = "version"
@alexvanin Did we decide use the same tree as for s3 objects?
I think so, but there is no strong opinion on this.
I have some concerns about using the same container/bucket in both ways: via s3 and distribution
Decided to use different tree names for distribution.
@ -0,0 +9,4 @@
"github.com/distribution/distribution/v3/registry/storage/driver/frostfs/tree"
)
type TreeService interface {
I'm not sure there is a sense in this interface. Let's drop it. For tests we have low-level interface and its mock implementation
@ -0,0 +9,4 @@
cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id"
treepool "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree"
grpcService "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree/service"
errorsFrost "github.com/distribution/distribution/v3/registry/storage/driver/frostfs/errors"
It seems there isn't such package
bb533eeec1
to6d62809fb6
6d62809fb6
to7b97b95c32
7b97b95c32
to951d7af1fd
951d7af1fd
to7a74e42e07
@ -0,0 +75,4 @@
modificationTime = "ModificationTime"
objectTree = "object"
splitTree = "split"
Can we name this more distribution specific?
Like
distribution-object
,distribution-split
for example@ -0,0 +196,4 @@
nodes := make([]NodeResponse, 0)
for _, node := range subTree {
if node.GetNodeID() == 0 {
Actually we have to check for
rootID
:@ -0,0 +251,4 @@
return s.res[s.offset-1], nil
}
func (c *ServiceClientMemory) GetSubTreeStream(_ context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) (SubTreeStream, error) {
Unused
7a74e42e07
to248fee64cc