feature/integrate_tree_service #8

Merged
alexvanin merged 1 commit from :feature/integrate_tree_service into tcl/master 2024-09-04 19:51:22 +00:00
Member

close #3

close #3
r.loginov self-assigned this 2024-03-27 10:59:27 +00:00
r.loginov requested review from alexvanin 2024-03-27 10:59:57 +00:00
r.loginov requested review from dkirillov 2024-03-27 11:00:04 +00:00
r.loginov force-pushed feature/integrate_tree_service from 8d2fe3a6cf to 51fd77f1b9 2024-03-27 13:43:12 +00:00 Compare
r.loginov force-pushed feature/integrate_tree_service from 51fd77f1b9 to bb533eeec1 2024-03-27 14:08:32 +00:00 Compare
alexvanin reviewed 2024-03-28 08:17:24 +00:00
@ -652,28 +683,28 @@ func (d *driver) List(ctx context.Context, path string) ([]string, error) {
}
func (d *driver) Move(ctx context.Context, sourcePath string, destPath string) error {
Owner

As far as I see, move operation does not modify object payload. I guess it is sufficient to change tree node in tree service and do not fetch and reupload object into storage. Is it correct?

As far as I see, move operation does not modify object payload. I guess it is sufficient to change tree node in tree service and do not fetch and reupload object into storage. Is it correct?
Member

Currently we store filepath in object attributes, so we have to reupload object to change this attribute

Currently we store filepath in object attributes, so we have to reupload object to change this attribute
Owner

Aside of restoring tree in case of pilorama faillure on all container nodes, do we need filepath attribute in the object?

Aside of restoring tree in case of pilorama faillure on all container nodes, do we need filepath attribute in the object?
Member

Probably not

Probably not
Owner

We've decided to keep filepath attribute in the object in case of global tree failures. At the same time, Move operation will not update filepath attribute, therefore will not reupload object.

We've decided to keep filepath attribute in the object in case of global tree failures. At the same time, `Move` operation will not update filepath attribute, therefore will not reupload object.
alexvanin marked this conversation as resolved
@ -513,4 +551,4 @@
var prmHead pool.PrmObjectHead
prmHead.SetAddress(addr)
obj, err := d.sdkPool.HeadObject(ctx, prmHead)
Owner

Seems like we can store payload size in k/v pair in treeObj meta, so we can remove this head request. The same is applicable for other head requests.

Seems like we can store payload size in k/v pair in `treeObj` meta, so we can remove this head request. The same is applicable for other head requests.
Owner

Maybe I am wrong, because we upload object part-by-part and we don't know total size unless we use head request. Need to look into it.

Maybe I am wrong, because we upload object part-by-part and we don't know total size unless we use head request. Need to look into it.
Member

If object isn't fully uploaded we shouldn't read it I suppose

If object isn't fully uploaded we shouldn't read it I suppose
alexvanin marked this conversation as resolved
@ -0,0 +16,4 @@
GetListOIDBySplitID(ctx context.Context, containerID cid.ID, path string, splitID *object.SplitID) ([]oid.ID, error)
AddObject(ctx context.Context, containerID cid.ID, path string, oid oid.ID) (uint64, error)
AddPHYObject(ctx context.Context, containerID cid.ID, path string, oid oid.ID, splitID *object.SplitID) (uint64, error)
Owner

@dkirillov What you think about having different tree IDs for high-level regular object and system-related physical objects? Is it worth it?

@dkirillov What you think about having different tree IDs for high-level regular object and system-related physical objects? Is it worth it?
Member

Yes, I think it's ok. The same mechanics in s3-gw for multipart upload

Yes, I think it's ok. The same mechanics in s3-gw for multipart upload
alexvanin marked this conversation as resolved
dkirillov reviewed 2024-03-28 09:26:06 +00:00
@ -591,52 +629,45 @@ func (d *driver) Stat(ctx context.Context, path string) (storagedriver.FileInfo,
return newFileInfoDir(path), nil
}
id, ok, err := d.searchOneBase(ctx, path, true)
Member

Methods d.searchOneBase and d.searchOne are not used anymore. Can be dropped.

Methods `d.searchOneBase` and `d.searchOne` are not used anymore. Can be dropped.
dkirillov marked this conversation as resolved
@ -706,3 +741,1 @@
prmSearch.SetFilters(filters)
res, err := d.sdkPool.SearchObjects(ctx, prmSearch)
treeObjList, err := d.treeService.GetListObjectByPrefix(ctx, d.containerID, path)
Member

I'm not sure this is correct because we get all object and deleting "/a" does delete "/ab" but must not.
It's better to get all nodes by the exact names and list sub tree if one of that node is intermediate.

I'm not sure this is correct because we get all object and `deleting "/a" does delete "/ab"` but must not. It's better to get all nodes by the exact names and list sub tree if one of that node is intermediate.
Author
Member

Not quite. In this case, when requesting deletion with the path /a, the file /ab will not be deleted, because:
GetListObjectByPrefix will first find the root node (intermediate) and then take a subtree with no depth limit. Then a separate request is made to get a node for the /a file via getObjectByPath (in which GetNodes is called along the /a path). As a result, /ab will not be in the final list. But if, despite this, it is better to use your approach, no problem.

Not quite. In this case, when requesting deletion with the path `/a`, the file `/ab` will not be deleted, because: `GetListObjectByPrefix` will first find the root node (intermediate) and then take a subtree with no depth limit. Then a separate request is made to get a node for the `/a` file via `getObjectByPath` (in which `GetNodes` is called along the `/a` path). As a result, `/ab` will not be in the final list. But if, despite this, it is better to use your approach, no problem.
Member

Please check the following test:

diff --git a/registry/storage/driver/frostfs/frostfs.go b/registry/storage/driver/frostfs/frostfs.go
index 2d899ff9..ac32f6ad 100644
--- a/registry/storage/driver/frostfs/frostfs.go
+++ b/registry/storage/driver/frostfs/frostfs.go
@@ -17,6 +17,7 @@ import (
 	resolver "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/ns"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object"
 	oid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id"
+	oidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id/test"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/transformer"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool"
 	treepool "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree"
@@ -527,11 +528,7 @@ func (d *driver) PutContent(ctx context.Context, path string, content []byte) er
 	var prm pool.PrmObjectPut
 	prm.SetHeader(*obj)
 
-	id, err := d.sdkPool.PutObject(ctx, prm)
-	if err != nil {
-		return fmt.Errorf("couldn't put object '%s': %w", path, err)
-	}
-	_, err = d.treeService.AddObject(ctx, d.containerID, path, id, uint64(len(content)))
+	_, err := d.treeService.AddObject(ctx, d.containerID, path, oidtest.ID(), uint64(len(content)))
 	if err != nil {
 		return fmt.Errorf("can't add object to tree: %w", err)
 	}
@@ -690,10 +687,6 @@ func (d *driver) Delete(ctx context.Context, path string) error {
 	}
 
 	for _, treeObj := range treeObjList {
-		if err := d.delete(ctx, treeObj.ObjID); err != nil {
-			return fmt.Errorf("error in delete object oid: %s : %w", treeObj.ObjID, err)
-		}
-
 		if err = d.treeService.DeleteObject(ctx, d.containerID, treeObj.ID); err != nil {
 			return fmt.Errorf("couldn't delete object from tree: %w", err)
 		}
diff --git a/registry/storage/driver/frostfs/frostfs_test.go b/registry/storage/driver/frostfs/frostfs_test.go
index 201c97af..ed8fc934 100644
--- a/registry/storage/driver/frostfs/frostfs_test.go
+++ b/registry/storage/driver/frostfs/frostfs_test.go
@@ -12,16 +12,19 @@ import (
 
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container"
 	cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id"
+	cidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id/test"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/netmap"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/user"
 	storagedriver "github.com/distribution/distribution/v3/registry/storage/driver"
+	"github.com/distribution/distribution/v3/registry/storage/driver/frostfs/tree"
 	"github.com/google/uuid"
 	"github.com/nspcc-dev/neo-go/pkg/crypto/keys"
 	"github.com/nspcc-dev/neo-go/pkg/wallet"
 	"github.com/stretchr/testify/require"
 	"github.com/testcontainers/testcontainers-go"
 	"github.com/testcontainers/testcontainers-go/wait"
+	"go.uber.org/zap/zaptest"
 )
 
 const (
@@ -44,6 +47,42 @@ func params(walletPath string, containerID cid.ID) map[string]interface{} {
 	}
 }
 
+func TestDelete(t *testing.T) {
+	ctx := context.Background()
+
+	key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb")
+	require.NoError(t, err)
+
+	var owner user.ID
+	user.IDFromKey(&owner, key.PrivateKey.PublicKey)
+
+	log := zaptest.NewLogger(t)
+
+	treeCli, err := tree.NewTreeServiceClientMemory()
+	require.NoError(t, err)
+
+	d := driver{
+		treeService: tree.NewTree(treeCli, log),
+		owner:       &owner,
+		key:         &key.PrivateKey,
+		containerID: cidtest.ID(),
+	}
+
+	err = d.PutContent(ctx, "a", []byte("test"))
+	require.NoError(t, err)
+
+	err = d.PutContent(ctx, "ab", []byte("test"))
+	require.NoError(t, err)
+
+	err = d.Delete(ctx, "a")
+	require.NoError(t, err)
+
+	res, err := d.List(ctx, "")
+	require.NoError(t, err)
+	fmt.Println(res)
+	require.Contains(t, res, "ab")
+}
+
 func TestIntegration(t *testing.T) {
 	f, err := os.CreateTemp("", "wallet")
 	require.NoError(t, err)
diff --git a/registry/storage/driver/frostfs/tree/tree.go b/registry/storage/driver/frostfs/tree/tree.go
index a8e18337..0cccb787 100644
--- a/registry/storage/driver/frostfs/tree/tree.go
+++ b/registry/storage/driver/frostfs/tree/tree.go
@@ -196,6 +196,9 @@ func (c *Tree) GetListObjectByPrefix(ctx context.Context, containerID cid.ID, pr
 	nodes := make([]NodeResponse, 0)
 
 	for _, node := range subTree {
+		if node.GetNodeID() == 0 {
+			continue
+		}
 		if !isIntermediate(node) {
 			nodes = append(nodes, node)
 		}
Please check the following test: ```diff diff --git a/registry/storage/driver/frostfs/frostfs.go b/registry/storage/driver/frostfs/frostfs.go index 2d899ff9..ac32f6ad 100644 --- a/registry/storage/driver/frostfs/frostfs.go +++ b/registry/storage/driver/frostfs/frostfs.go @@ -17,6 +17,7 @@ import ( resolver "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/ns" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object" oid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id" + oidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id/test" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/transformer" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool" treepool "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree" @@ -527,11 +528,7 @@ func (d *driver) PutContent(ctx context.Context, path string, content []byte) er var prm pool.PrmObjectPut prm.SetHeader(*obj) - id, err := d.sdkPool.PutObject(ctx, prm) - if err != nil { - return fmt.Errorf("couldn't put object '%s': %w", path, err) - } - _, err = d.treeService.AddObject(ctx, d.containerID, path, id, uint64(len(content))) + _, err := d.treeService.AddObject(ctx, d.containerID, path, oidtest.ID(), uint64(len(content))) if err != nil { return fmt.Errorf("can't add object to tree: %w", err) } @@ -690,10 +687,6 @@ func (d *driver) Delete(ctx context.Context, path string) error { } for _, treeObj := range treeObjList { - if err := d.delete(ctx, treeObj.ObjID); err != nil { - return fmt.Errorf("error in delete object oid: %s : %w", treeObj.ObjID, err) - } - if err = d.treeService.DeleteObject(ctx, d.containerID, treeObj.ID); err != nil { return fmt.Errorf("couldn't delete object from tree: %w", err) } diff --git a/registry/storage/driver/frostfs/frostfs_test.go b/registry/storage/driver/frostfs/frostfs_test.go index 201c97af..ed8fc934 100644 --- a/registry/storage/driver/frostfs/frostfs_test.go +++ b/registry/storage/driver/frostfs/frostfs_test.go @@ -12,16 +12,19 @@ import ( "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container" cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id" + cidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id/test" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/netmap" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/user" storagedriver "github.com/distribution/distribution/v3/registry/storage/driver" + "github.com/distribution/distribution/v3/registry/storage/driver/frostfs/tree" "github.com/google/uuid" "github.com/nspcc-dev/neo-go/pkg/crypto/keys" "github.com/nspcc-dev/neo-go/pkg/wallet" "github.com/stretchr/testify/require" "github.com/testcontainers/testcontainers-go" "github.com/testcontainers/testcontainers-go/wait" + "go.uber.org/zap/zaptest" ) const ( @@ -44,6 +47,42 @@ func params(walletPath string, containerID cid.ID) map[string]interface{} { } } +func TestDelete(t *testing.T) { + ctx := context.Background() + + key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb") + require.NoError(t, err) + + var owner user.ID + user.IDFromKey(&owner, key.PrivateKey.PublicKey) + + log := zaptest.NewLogger(t) + + treeCli, err := tree.NewTreeServiceClientMemory() + require.NoError(t, err) + + d := driver{ + treeService: tree.NewTree(treeCli, log), + owner: &owner, + key: &key.PrivateKey, + containerID: cidtest.ID(), + } + + err = d.PutContent(ctx, "a", []byte("test")) + require.NoError(t, err) + + err = d.PutContent(ctx, "ab", []byte("test")) + require.NoError(t, err) + + err = d.Delete(ctx, "a") + require.NoError(t, err) + + res, err := d.List(ctx, "") + require.NoError(t, err) + fmt.Println(res) + require.Contains(t, res, "ab") +} + func TestIntegration(t *testing.T) { f, err := os.CreateTemp("", "wallet") require.NoError(t, err) diff --git a/registry/storage/driver/frostfs/tree/tree.go b/registry/storage/driver/frostfs/tree/tree.go index a8e18337..0cccb787 100644 --- a/registry/storage/driver/frostfs/tree/tree.go +++ b/registry/storage/driver/frostfs/tree/tree.go @@ -196,6 +196,9 @@ func (c *Tree) GetListObjectByPrefix(ctx context.Context, containerID cid.ID, pr nodes := make([]NodeResponse, 0) for _, node := range subTree { + if node.GetNodeID() == 0 { + continue + } if !isIntermediate(node) { nodes = append(nodes, node) } ```
Author
Member

This test will not pass due to the fact that paths must start with a slash (distribution, when calling driver methods, always passes paths starting with a slash). But now I have added path validation to the driver methods. I also made a change regarding skipping the root node in GetListObjectByPrefix.

An example of a test that works and considers the situation you described above:

func TestDelete(t *testing.T) {  
    ctx := context.Background()  
  
    key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb")  
    require.NoError(t, err)  
  
    var owner user.ID  
    user.IDFromKey(&owner, key.PrivateKey.PublicKey)  
  
    treeCli, err := tree.NewTreeServiceClientMemory()  
    require.NoError(t, err)  
  
    d := &driver{  
       treeService: tree.NewTree(treeCli, nil),  
       owner:       &owner,  
       key:         &key.PrivateKey,  
       containerID: cidtest.ID(),  
    }  
  
    err = d.PutContent(ctx, "/a", []byte("test"))  
    require.NoError(t, err)  
  
    err = d.PutContent(ctx, "/ab", []byte("test"))  
    require.NoError(t, err)  
  
    err = d.Delete(ctx, "/a")  
    require.NoError(t, err)  
  
    res, err := d.List(ctx, "/")  
    require.NoError(t, err)  
    require.Contains(t, res, "/ab")  
}
This test will not pass due to the fact that paths must start with a slash (distribution, when calling driver methods, always passes paths starting with a slash). But now I have added path validation to the driver methods. I also made a change regarding skipping the root node in `GetListObjectByPrefix`. An example of a test that works and considers the situation you described above: ``` func TestDelete(t *testing.T) { ctx := context.Background() key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb") require.NoError(t, err) var owner user.ID user.IDFromKey(&owner, key.PrivateKey.PublicKey) treeCli, err := tree.NewTreeServiceClientMemory() require.NoError(t, err) d := &driver{ treeService: tree.NewTree(treeCli, nil), owner: &owner, key: &key.PrivateKey, containerID: cidtest.ID(), } err = d.PutContent(ctx, "/a", []byte("test")) require.NoError(t, err) err = d.PutContent(ctx, "/ab", []byte("test")) require.NoError(t, err) err = d.Delete(ctx, "/a") require.NoError(t, err) res, err := d.List(ctx, "/") require.NoError(t, err) require.Contains(t, res, "/ab") } ```
dkirillov marked this conversation as resolved
@ -0,0 +24,4 @@
ServiceClient interface {
GetNodes(ctx context.Context, p *GetNodesParams) ([]NodeResponse, error)
GetSubTree(ctx context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) ([]NodeResponse, error)
GetSubTreeStream(ctx context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) (SubTreeStream, error)
Member

This is unused method (some others too). Can we drop it then?

This is unused method (some others too). Can we drop it then?
dkirillov marked this conversation as resolved
@ -0,0 +72,4 @@
sizeKV = "Size"
splitIDKV = "SplitID"
versionTree = "version"
Member

@alexvanin Did we decide use the same tree as for s3 objects?

@alexvanin Did we decide use the same tree as for s3 objects?
Owner

I think so, but there is no strong opinion on this.

I think so, but there is no strong opinion on this.
Member

I have some concerns about using the same container/bucket in both ways: via s3 and distribution

I have some concerns about using the same container/bucket in both ways: via s3 and distribution
Owner

Decided to use different tree names for distribution.

Decided to use different tree names for distribution.
dkirillov marked this conversation as resolved
@ -0,0 +9,4 @@
"github.com/distribution/distribution/v3/registry/storage/driver/frostfs/tree"
)
type TreeService interface {
Member

I'm not sure there is a sense in this interface. Let's drop it. For tests we have low-level interface and its mock implementation

I'm not sure there is a sense in this interface. Let's drop it. For tests we have low-level interface and its mock implementation
dkirillov marked this conversation as resolved
@ -0,0 +9,4 @@
cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id"
treepool "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree"
grpcService "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree/service"
errorsFrost "github.com/distribution/distribution/v3/registry/storage/driver/frostfs/errors"
Member

It seems there isn't such package

It seems there isn't such package
dkirillov marked this conversation as resolved
r.loginov force-pushed feature/integrate_tree_service from bb533eeec1 to 6d62809fb6 2024-04-01 13:43:41 +00:00 Compare
r.loginov force-pushed feature/integrate_tree_service from 6d62809fb6 to 7b97b95c32 2024-04-01 14:08:02 +00:00 Compare
r.loginov force-pushed feature/integrate_tree_service from 7b97b95c32 to 951d7af1fd 2024-04-02 10:45:06 +00:00 Compare
r.loginov force-pushed feature/integrate_tree_service from 951d7af1fd to 7a74e42e07 2024-04-02 12:52:30 +00:00 Compare
dkirillov reviewed 2024-04-02 14:56:52 +00:00
@ -0,0 +75,4 @@
modificationTime = "ModificationTime"
objectTree = "object"
splitTree = "split"
Member

Can we name this more distribution specific?
Like distribution-object, distribution-split for example

Can we name this more distribution specific? Like `distribution-object`, `distribution-split` for example
dkirillov marked this conversation as resolved
@ -0,0 +196,4 @@
nodes := make([]NodeResponse, 0)
for _, node := range subTree {
if node.GetNodeID() == 0 {
Member

Actually we have to check for rootID:

if node.GetNodeID() == rootID {
Actually we have to check for `rootID`: ```golang if node.GetNodeID() == rootID { ```
dkirillov marked this conversation as resolved
@ -0,0 +251,4 @@
return s.res[s.offset-1], nil
}
func (c *ServiceClientMemory) GetSubTreeStream(_ context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) (SubTreeStream, error) {
Member

Unused

Unused
dkirillov marked this conversation as resolved
r.loginov force-pushed feature/integrate_tree_service from 7a74e42e07 to 248fee64cc 2024-04-02 15:11:34 +00:00 Compare
alexvanin approved these changes 2024-04-02 15:24:24 +00:00
dkirillov approved these changes 2024-04-03 06:25:39 +00:00
alexvanin merged commit 248fee64cc into tcl/master 2024-04-03 07:11:44 +00:00
alexvanin deleted branch feature/integrate_tree_service 2024-04-03 07:11:44 +00:00
Sign in to join this conversation.
No description provided.