feature/integrate_tree_service #8

r.loginov · 2024-03-27T10:59:27Z

r.loginov commented

2024-03-27 10:59:27 +00:00

close #3

r.loginov self-assigned this 2024-03-27 10:59:27 +00:00

requested reviews from alexvanin, dkirillov

2024-03-27 11:00:04 +00:00

r.loginov force-pushed feature/integrate_tree_service from 8d2fe3a6cf to 51fd77f1b9

2024-03-27 13:43:12 +00:00

Compare

r.loginov force-pushed feature/integrate_tree_service from 51fd77f1b9 to bb533eeec1

2024-03-27 14:08:32 +00:00

Compare

alexvanin reviewed 2024-03-28 08:17:24 +00:00

registry/storage/driver/frostfs/frostfs.go Outdated

					
				@ -652,28 +683,28 @@ func (d *driver) List(ctx context.Context, path string) ([]string, error) {

				}

				func (d *driver) Move(ctx context.Context, sourcePath string, destPath string) error {

alexvanin commented

2024-03-28 08:11:18 +00:00

As far as I see, move operation does not modify object payload. I guess it is sufficient to change tree node in tree service and do not fetch and reupload object into storage. Is it correct?

dkirillov commented

2024-03-28 09:05:55 +00:00

Currently we store filepath in object attributes, so we have to reupload object to change this attribute

alexvanin commented

2024-03-29 05:45:39 +00:00

Aside of restoring tree in case of pilorama faillure on all container nodes, do we need filepath attribute in the object?

dkirillov commented

2024-03-29 06:23:49 +00:00

Probably not

alexvanin commented

2024-04-01 08:23:01 +00:00

We've decided to keep filepath attribute in the object in case of global tree failures. At the same time, Move operation will not update filepath attribute, therefore will not reupload object.

We've decided to keep filepath attribute in the object in case of global tree failures. At the same time, `Move` operation will not update filepath attribute, therefore will not reupload object.

alexvanin marked this conversation as resolved

registry/storage/driver/frostfs/frostfs.go Outdated

					
				@ -513,4 +551,4 @@

					var prmHead pool.PrmObjectHead

					prmHead.SetAddress(addr)

					obj, err := d.sdkPool.HeadObject(ctx, prmHead)

alexvanin commented

2024-03-28 08:05:30 +00:00

Seems like we can store payload size in k/v pair in treeObj meta, so we can remove this head request. The same is applicable for other head requests.

Seems like we can store payload size in k/v pair in `treeObj` meta, so we can remove this head request. The same is applicable for other head requests.

👍 1

alexvanin commented

2024-03-28 08:21:25 +00:00

Maybe I am wrong, because we upload object part-by-part and we don't know total size unless we use head request. Need to look into it.

dkirillov commented

2024-03-28 09:25:28 +00:00

If object isn't fully uploaded we shouldn't read it I suppose

👍 1

alexvanin marked this conversation as resolved

registry/storage/driver/frostfs/tree_service.go Outdated

					
				@ -0,0 +16,4 @@

					GetListOIDBySplitID(ctx context.Context, containerID cid.ID, path string, splitID *object.SplitID) ([]oid.ID, error)

					AddObject(ctx context.Context, containerID cid.ID, path string, oid oid.ID) (uint64, error)

					AddPHYObject(ctx context.Context, containerID cid.ID, path string, oid oid.ID, splitID *object.SplitID) (uint64, error)

alexvanin commented

2024-03-28 08:16:17 +00:00

@dkirillov What you think about having different tree IDs for high-level regular object and system-related physical objects? Is it worth it?

dkirillov commented

2024-03-28 09:26:43 +00:00

Yes, I think it's ok. The same mechanics in s3-gw for multipart upload

alexvanin marked this conversation as resolved

dkirillov reviewed 2024-03-28 09:26:06 +00:00

registry/storage/driver/frostfs/frostfs.go Outdated

					
				@ -591,52 +629,45 @@ func (d *driver) Stat(ctx context.Context, path string) (storagedriver.FileInfo,

						return newFileInfoDir(path), nil

					}

					id, ok, err := d.searchOneBase(ctx, path, true)

dkirillov commented

2024-03-28 08:47:08 +00:00

Methods d.searchOneBase and d.searchOne are not used anymore. Can be dropped.

Methods `d.searchOneBase` and `d.searchOne` are not used anymore. Can be dropped.

dkirillov marked this conversation as resolved

registry/storage/driver/frostfs/frostfs.go Outdated

					
				@ -706,3 +741,1 @@

					prmSearch.SetFilters(filters)

					res, err := d.sdkPool.SearchObjects(ctx, prmSearch)

					treeObjList, err := d.treeService.GetListObjectByPrefix(ctx, d.containerID, path)

dkirillov commented

2024-03-28 09:18:52 +00:00

I'm not sure this is correct because we get all object and deleting "/a" does delete "/ab" but must not.
It's better to get all nodes by the exact names and list sub tree if one of that node is intermediate.

I'm not sure this is correct because we get all object and `deleting "/a" does delete "/ab"` but must not. It's better to get all nodes by the exact names and list sub tree if one of that node is intermediate.

r.loginov commented

2024-04-01 06:46:34 +00:00

Not quite. In this case, when requesting deletion with the path /a, the file /ab will not be deleted, because:
GetListObjectByPrefix will first find the root node (intermediate) and then take a subtree with no depth limit. Then a separate request is made to get a node for the /a file via getObjectByPath (in which GetNodes is called along the /a path). As a result, /ab will not be in the final list. But if, despite this, it is better to use your approach, no problem.

Not quite. In this case, when requesting deletion with the path `/a`, the file `/ab` will not be deleted, because: `GetListObjectByPrefix` will first find the root node (intermediate) and then take a subtree with no depth limit. Then a separate request is made to get a node for the `/a` file via `getObjectByPath` (in which `GetNodes` is called along the `/a` path). As a result, `/ab` will not be in the final list. But if, despite this, it is better to use your approach, no problem.

dkirillov commented

2024-04-01 14:55:05 +00:00

Please check the following test:

diff --git a/registry/storage/driver/frostfs/frostfs.go b/registry/storage/driver/frostfs/frostfs.go
index 2d899ff9..ac32f6ad 100644
--- a/registry/storage/driver/frostfs/frostfs.go
+++ b/registry/storage/driver/frostfs/frostfs.go
@@ -17,6 +17,7 @@ import (
 	resolver "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/ns"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object"
 	oid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id"
+	oidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id/test"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/transformer"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool"
 	treepool "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree"
@@ -527,11 +528,7 @@ func (d *driver) PutContent(ctx context.Context, path string, content []byte) er
 	var prm pool.PrmObjectPut
 	prm.SetHeader(*obj)
 
-	id, err := d.sdkPool.PutObject(ctx, prm)
-	if err != nil {
-		return fmt.Errorf("couldn't put object '%s': %w", path, err)
-	}
-	_, err = d.treeService.AddObject(ctx, d.containerID, path, id, uint64(len(content)))
+	_, err := d.treeService.AddObject(ctx, d.containerID, path, oidtest.ID(), uint64(len(content)))
 	if err != nil {
 		return fmt.Errorf("can't add object to tree: %w", err)
 	}
@@ -690,10 +687,6 @@ func (d *driver) Delete(ctx context.Context, path string) error {
 	}
 
 	for _, treeObj := range treeObjList {
-		if err := d.delete(ctx, treeObj.ObjID); err != nil {
-			return fmt.Errorf("error in delete object oid: %s : %w", treeObj.ObjID, err)
-		}
-
 		if err = d.treeService.DeleteObject(ctx, d.containerID, treeObj.ID); err != nil {
 			return fmt.Errorf("couldn't delete object from tree: %w", err)
 		}
diff --git a/registry/storage/driver/frostfs/frostfs_test.go b/registry/storage/driver/frostfs/frostfs_test.go
index 201c97af..ed8fc934 100644
--- a/registry/storage/driver/frostfs/frostfs_test.go
+++ b/registry/storage/driver/frostfs/frostfs_test.go
@@ -12,16 +12,19 @@ import (
 
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container"
 	cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id"
+	cidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id/test"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/netmap"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool"
 	"git.frostfs.info/TrueCloudLab/frostfs-sdk-go/user"
 	storagedriver "github.com/distribution/distribution/v3/registry/storage/driver"
+	"github.com/distribution/distribution/v3/registry/storage/driver/frostfs/tree"
 	"github.com/google/uuid"
 	"github.com/nspcc-dev/neo-go/pkg/crypto/keys"
 	"github.com/nspcc-dev/neo-go/pkg/wallet"
 	"github.com/stretchr/testify/require"
 	"github.com/testcontainers/testcontainers-go"
 	"github.com/testcontainers/testcontainers-go/wait"
+	"go.uber.org/zap/zaptest"
 )
 
 const (
@@ -44,6 +47,42 @@ func params(walletPath string, containerID cid.ID) map[string]interface{} {
 	}
 }
 
+func TestDelete(t *testing.T) {
+	ctx := context.Background()
+
+	key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb")
+	require.NoError(t, err)
+
+	var owner user.ID
+	user.IDFromKey(&owner, key.PrivateKey.PublicKey)
+
+	log := zaptest.NewLogger(t)
+
+	treeCli, err := tree.NewTreeServiceClientMemory()
+	require.NoError(t, err)
+
+	d := driver{
+		treeService: tree.NewTree(treeCli, log),
+		owner:       &owner,
+		key:         &key.PrivateKey,
+		containerID: cidtest.ID(),
+	}
+
+	err = d.PutContent(ctx, "a", []byte("test"))
+	require.NoError(t, err)
+
+	err = d.PutContent(ctx, "ab", []byte("test"))
+	require.NoError(t, err)
+
+	err = d.Delete(ctx, "a")
+	require.NoError(t, err)
+
+	res, err := d.List(ctx, "")
+	require.NoError(t, err)
+	fmt.Println(res)
+	require.Contains(t, res, "ab")
+}
+
 func TestIntegration(t *testing.T) {
 	f, err := os.CreateTemp("", "wallet")
 	require.NoError(t, err)
diff --git a/registry/storage/driver/frostfs/tree/tree.go b/registry/storage/driver/frostfs/tree/tree.go
index a8e18337..0cccb787 100644
--- a/registry/storage/driver/frostfs/tree/tree.go
+++ b/registry/storage/driver/frostfs/tree/tree.go
@@ -196,6 +196,9 @@ func (c *Tree) GetListObjectByPrefix(ctx context.Context, containerID cid.ID, pr
 	nodes := make([]NodeResponse, 0)
 
 	for _, node := range subTree {
+		if node.GetNodeID() == 0 {
+			continue
+		}
 		if !isIntermediate(node) {
 			nodes = append(nodes, node)
 		}

Please check the following test: ```diff diff --git a/registry/storage/driver/frostfs/frostfs.go b/registry/storage/driver/frostfs/frostfs.go index 2d899ff9..ac32f6ad 100644 --- a/registry/storage/driver/frostfs/frostfs.go +++ b/registry/storage/driver/frostfs/frostfs.go @@ -17,6 +17,7 @@ import ( resolver "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/ns" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object" oid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id" + oidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/id/test" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/object/transformer" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool" treepool "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree" @@ -527,11 +528,7 @@ func (d *driver) PutContent(ctx context.Context, path string, content []byte) er var prm pool.PrmObjectPut prm.SetHeader(*obj) - id, err := d.sdkPool.PutObject(ctx, prm) - if err != nil { - return fmt.Errorf("couldn't put object '%s': %w", path, err) - } - _, err = d.treeService.AddObject(ctx, d.containerID, path, id, uint64(len(content))) + _, err := d.treeService.AddObject(ctx, d.containerID, path, oidtest.ID(), uint64(len(content))) if err != nil { return fmt.Errorf("can't add object to tree: %w", err) } @@ -690,10 +687,6 @@ func (d *driver) Delete(ctx context.Context, path string) error { } for _, treeObj := range treeObjList { - if err := d.delete(ctx, treeObj.ObjID); err != nil { - return fmt.Errorf("error in delete object oid: %s : %w", treeObj.ObjID, err) - } - if err = d.treeService.DeleteObject(ctx, d.containerID, treeObj.ID); err != nil { return fmt.Errorf("couldn't delete object from tree: %w", err) } diff --git a/registry/storage/driver/frostfs/frostfs_test.go b/registry/storage/driver/frostfs/frostfs_test.go index 201c97af..ed8fc934 100644 --- a/registry/storage/driver/frostfs/frostfs_test.go +++ b/registry/storage/driver/frostfs/frostfs_test.go @@ -12,16 +12,19 @@ import ( "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container" cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id" + cidtest "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id/test" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/netmap" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool" "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/user" storagedriver "github.com/distribution/distribution/v3/registry/storage/driver" + "github.com/distribution/distribution/v3/registry/storage/driver/frostfs/tree" "github.com/google/uuid" "github.com/nspcc-dev/neo-go/pkg/crypto/keys" "github.com/nspcc-dev/neo-go/pkg/wallet" "github.com/stretchr/testify/require" "github.com/testcontainers/testcontainers-go" "github.com/testcontainers/testcontainers-go/wait" + "go.uber.org/zap/zaptest" ) const ( @@ -44,6 +47,42 @@ func params(walletPath string, containerID cid.ID) map[string]interface{} { } } +func TestDelete(t *testing.T) { + ctx := context.Background() + + key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb") + require.NoError(t, err) + + var owner user.ID + user.IDFromKey(&owner, key.PrivateKey.PublicKey) + + log := zaptest.NewLogger(t) + + treeCli, err := tree.NewTreeServiceClientMemory() + require.NoError(t, err) + + d := driver{ + treeService: tree.NewTree(treeCli, log), + owner: &owner, + key: &key.PrivateKey, + containerID: cidtest.ID(), + } + + err = d.PutContent(ctx, "a", []byte("test")) + require.NoError(t, err) + + err = d.PutContent(ctx, "ab", []byte("test")) + require.NoError(t, err) + + err = d.Delete(ctx, "a") + require.NoError(t, err) + + res, err := d.List(ctx, "") + require.NoError(t, err) + fmt.Println(res) + require.Contains(t, res, "ab") +} + func TestIntegration(t *testing.T) { f, err := os.CreateTemp("", "wallet") require.NoError(t, err) diff --git a/registry/storage/driver/frostfs/tree/tree.go b/registry/storage/driver/frostfs/tree/tree.go index a8e18337..0cccb787 100644 --- a/registry/storage/driver/frostfs/tree/tree.go +++ b/registry/storage/driver/frostfs/tree/tree.go @@ -196,6 +196,9 @@ func (c *Tree) GetListObjectByPrefix(ctx context.Context, containerID cid.ID, pr nodes := make([]NodeResponse, 0) for _, node := range subTree { + if node.GetNodeID() == 0 { + continue + } if !isIntermediate(node) { nodes = append(nodes, node) } ```

r.loginov commented

2024-04-02 12:58:47 +00:00

This test will not pass due to the fact that paths must start with a slash (distribution, when calling driver methods, always passes paths starting with a slash). But now I have added path validation to the driver methods. I also made a change regarding skipping the root node in GetListObjectByPrefix.

An example of a test that works and considers the situation you described above:

func TestDelete(t *testing.T) {  
    ctx := context.Background()  
  
    key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb")  
    require.NoError(t, err)  
  
    var owner user.ID  
    user.IDFromKey(&owner, key.PrivateKey.PublicKey)  
  
    treeCli, err := tree.NewTreeServiceClientMemory()  
    require.NoError(t, err)  
  
    d := &driver{  
       treeService: tree.NewTree(treeCli, nil),  
       owner:       &owner,  
       key:         &key.PrivateKey,  
       containerID: cidtest.ID(),  
    }  
  
    err = d.PutContent(ctx, "/a", []byte("test"))  
    require.NoError(t, err)  
  
    err = d.PutContent(ctx, "/ab", []byte("test"))  
    require.NoError(t, err)  
  
    err = d.Delete(ctx, "/a")  
    require.NoError(t, err)  
  
    res, err := d.List(ctx, "/")  
    require.NoError(t, err)  
    require.Contains(t, res, "/ab")  
}

This test will not pass due to the fact that paths must start with a slash (distribution, when calling driver methods, always passes paths starting with a slash). But now I have added path validation to the driver methods. I also made a change regarding skipping the root node in `GetListObjectByPrefix`. An example of a test that works and considers the situation you described above: ``` func TestDelete(t *testing.T) { ctx := context.Background() key, err := keys.NewPrivateKeyFromHex("1dd37fba80fec4e6a6f13fd708d8dcb3b29def768017052f6c930fa1c5d90bbb") require.NoError(t, err) var owner user.ID user.IDFromKey(&owner, key.PrivateKey.PublicKey) treeCli, err := tree.NewTreeServiceClientMemory() require.NoError(t, err) d := &driver{ treeService: tree.NewTree(treeCli, nil), owner: &owner, key: &key.PrivateKey, containerID: cidtest.ID(), } err = d.PutContent(ctx, "/a", []byte("test")) require.NoError(t, err) err = d.PutContent(ctx, "/ab", []byte("test")) require.NoError(t, err) err = d.Delete(ctx, "/a") require.NoError(t, err) res, err := d.List(ctx, "/") require.NoError(t, err) require.Contains(t, res, "/ab") } ```

dkirillov marked this conversation as resolved

registry/storage/driver/frostfs/tree/tree.go Outdated

					
				@ -0,0 +24,4 @@

					ServiceClient interface {

						GetNodes(ctx context.Context, p *GetNodesParams) ([]NodeResponse, error)

						GetSubTree(ctx context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) ([]NodeResponse, error)

						GetSubTreeStream(ctx context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) (SubTreeStream, error)

dkirillov commented

2024-03-28 08:37:19 +00:00

This is unused method (some others too). Can we drop it then?

dkirillov marked this conversation as resolved

registry/storage/driver/frostfs/tree/tree.go Outdated

					
				@ -0,0 +72,4 @@

					sizeKV      = "Size"

					splitIDKV   = "SplitID"

					versionTree = "version"

dkirillov commented

2024-03-28 08:25:36 +00:00

@alexvanin Did we decide use the same tree as for s3 objects?

alexvanin commented

2024-03-29 05:43:38 +00:00

I think so, but there is no strong opinion on this.

dkirillov commented

2024-03-29 06:25:15 +00:00

I have some concerns about using the same container/bucket in both ways: via s3 and distribution

alexvanin commented

2024-04-01 08:23:40 +00:00

Decided to use different tree names for distribution.

dkirillov marked this conversation as resolved

registry/storage/driver/frostfs/tree_service.go Outdated

					
				@ -0,0 +9,4 @@

					"github.com/distribution/distribution/v3/registry/storage/driver/frostfs/tree"

				)

				type TreeService interface {

dkirillov commented

2024-03-28 08:36:09 +00:00

I'm not sure there is a sense in this interface. Let's drop it. For tests we have low-level interface and its mock implementation

dkirillov marked this conversation as resolved

registry/storage/driver/frostfs/wrapper/pool_wrapper.go Outdated

					
				@ -0,0 +9,4 @@

					cid "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/container/id"

					treepool "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree"

					grpcService "git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool/tree/service"

					errorsFrost "github.com/distribution/distribution/v3/registry/storage/driver/frostfs/errors"

dkirillov commented

2024-03-28 08:30:52 +00:00

It seems there isn't such package

dkirillov marked this conversation as resolved

r.loginov force-pushed feature/integrate_tree_service from bb533eeec1 to 6d62809fb6

2024-04-01 13:43:41 +00:00

Compare

r.loginov force-pushed feature/integrate_tree_service from 6d62809fb6 to 7b97b95c32

2024-04-01 14:08:02 +00:00

Compare

r.loginov force-pushed feature/integrate_tree_service from 7b97b95c32 to 951d7af1fd

2024-04-02 10:45:06 +00:00

Compare

r.loginov force-pushed feature/integrate_tree_service from 951d7af1fd to 7a74e42e07

2024-04-02 12:52:30 +00:00

Compare

dkirillov reviewed 2024-04-02 14:56:52 +00:00

registry/storage/driver/frostfs/tree/tree.go Outdated

					
				@ -0,0 +75,4 @@

					modificationTime = "ModificationTime"

					objectTree = "object"

					splitTree  = "split"

dkirillov commented

2024-04-02 14:56:49 +00:00

Can we name this more distribution specific?
Like distribution-object, distribution-split for example

Can we name this more distribution specific? Like `distribution-object`, `distribution-split` for example

👍 1

dkirillov marked this conversation as resolved

registry/storage/driver/frostfs/tree/tree.go Outdated

					
				@ -0,0 +196,4 @@

					nodes := make([]NodeResponse, 0)

					for _, node := range subTree {

						if node.GetNodeID() == 0 {

dkirillov commented

2024-04-02 14:38:37 +00:00

Actually we have to check for rootID:

if node.GetNodeID() == rootID {

Actually we have to check for `rootID`: ```golang if node.GetNodeID() == rootID { ```

👍 1

dkirillov marked this conversation as resolved

registry/storage/driver/frostfs/tree/tree_client_in_memory.go Outdated

					
				@ -0,0 +251,4 @@

					return s.res[s.offset-1], nil

				}

				func (c *ServiceClientMemory) GetSubTreeStream(_ context.Context, containerID cid.ID, treeID string, rootID uint64, depth uint32) (SubTreeStream, error) {