[#466] Implement PATCH for multipart objects #466

Merged
alexvanin merged 1 commit from mbiryukova/frostfs-s3-gw:feature/patch_multipart into feature/patch 2024-08-30 06:55:47 +00:00
Member
No description provided.
mbiryukova self-assigned this 2024-08-20 09:50:29 +00:00
mbiryukova added 2 commits 2024-08-20 09:50:32 +00:00
[#462] Implement PATCH for simple objects
All checks were successful
/ Builds (1.21) (pull_request) Successful in 2m18s
/ Builds (1.22) (pull_request) Successful in 2m16s
/ DCO (pull_request) Successful in 2m13s
/ Vulncheck (pull_request) Successful in 2m13s
/ Lint (pull_request) Successful in 3m17s
/ Tests (1.21) (pull_request) Successful in 2m24s
/ Tests (1.22) (pull_request) Successful in 2m21s
40866df3a9
Signed-off-by: Marina Biryukova <m.biryukova@yadro.com>
[#xxx] Implement PATCH for multipart objects
Some checks failed
/ DCO (pull_request) Failing after 1m33s
/ Builds (1.21) (pull_request) Successful in 2m11s
/ Builds (1.22) (pull_request) Successful in 2m10s
/ Vulncheck (pull_request) Successful in 2m5s
/ Lint (pull_request) Successful in 2m58s
/ Tests (1.21) (pull_request) Successful in 2m9s
/ Tests (1.22) (pull_request) Successful in 2m8s
324665e77a
Signed-off-by: Marina Biryukova <m.biryukova@yadro.com>
mbiryukova force-pushed feature/patch_multipart from 324665e77a to 2d6a8516b6 2024-08-20 09:50:59 +00:00 Compare
mbiryukova changed title from WIP: [#xxx] Implement PATCH for multipart objects to WIP: [#466] Implement PATCH for multipart objects 2024-08-20 09:51:33 +00:00
mbiryukova force-pushed feature/patch_multipart from 2d6a8516b6 to c01772149e 2024-08-20 09:55:08 +00:00 Compare
mbiryukova changed title from WIP: [#466] Implement PATCH for multipart objects to [#466] Implement PATCH for multipart objects 2024-08-21 13:49:50 +00:00
mbiryukova requested review from storage-services-committers 2024-08-21 13:50:09 +00:00
mbiryukova requested review from storage-services-developers 2024-08-21 13:50:09 +00:00
dkirillov reviewed 2024-08-23 13:23:32 +00:00
@ -604,0 +624,4 @@
}
allPatchLen := p.Range.End - p.Range.Start + 1
patchReader := newPartialReader(p.NewBytes, 64*1024)
Member

Should we configure buffer size?

Should we configure buffer size?
dkirillov marked this conversation as resolved
@ -604,0 +641,4 @@
prmPatch.Object = part.OID
prmPatch.ObjectSize = part.Size
prmPatch.Range = *p.Range
Member

Can we use brand new range param. It's not easy to catch idea with p.Range.Start -= part.Size etc.

Can we use brand new range param. It's not easy to catch idea with `p.Range.Start -= part.Size` etc.
dkirillov marked this conversation as resolved
@ -604,0 +646,4 @@
var patchLen uint64
if i != len(parts)-1 {
if prmPatch.Range.End >= part.Size-1 {
prmPatch.Range.End = part.Size - 1
Member

If we have to patch all part it's better to use single put method rather than patch + head

If we have to patch all part it's better to use single `put` method rather than `patch + head`
dkirillov marked this conversation as resolved
@ -604,0 +652,4 @@
} else {
patchLen = allPatchLen - patchedLen
}
prmPatch.Payload = patchReader.partReader(patchLen)
Member

Can we use io.LimitReader(p.NewBytes, patchLen) ?

Can we use `io.LimitReader(p.NewBytes, patchLen)` ?
dkirillov marked this conversation as resolved
@ -604,0 +655,4 @@
prmPatch.Payload = patchReader.partReader(patchLen)
patchedLen += patchLen
objID, err := n.frostFS.PatchObject(ctx, prmPatch)
Member

Can we move to separate method patching part?

Can we move to separate method patching part?
dkirillov marked this conversation as resolved
@ -604,0 +712,4 @@
a.SetValue(headerParts.String())
prmPatch.Attributes = append(prmPatch.Attributes, a)
objID, err := n.frostFS.PatchObject(ctx, prmPatch)
Member

Why do we use patch to update "combined" object? It seems we can use regular put

Why do we use patch to update "combined" object? It seems we can use regular put
dkirillov marked this conversation as resolved
@ -604,0 +733,4 @@
Size: obj.PayloadSize(),
Created: &p.Object.Created,
Owner: &n.gateOwner,
// TODO: Add creation epoch
Member

Probably we should rebase feature/patch before merge this PR?

Probably we should rebase `feature/patch` before merge this PR?
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-08-26 13:21:22 +00:00
@ -604,0 +632,4 @@
continue
}
if p.Range.Start > part.Size || (p.Range.Start == part.Size && i != len(parts)-1) {
Member

I don't understand why do we need || (p.Range.Start == part.Size && i != len(parts)-1). If we really need this, let's add test for this condition. Currently without this condition tests pass

I don't understand why do we need `|| (p.Range.Start == part.Size && i != len(parts)-1)`. If we really need this, let's add test for this condition. Currently without this condition tests pass
Author
Member

If range starts after the last part - we have to patch the last part. Otherwise, we move on to the next part. I will add test for this condition

If range starts after the last part - we have to patch the last part. Otherwise, we move on to the next part. I will add test for this condition
Member

I still can freely delete this condition

I still can freely delete this condition
Author
Member

Seems that it doesn't affect the result, but reduces number of patch requests. If it's ok, I can count their number and compare it with expected one

Seems that it doesn't affect the result, but reduces number of patch requests. If it's ok, I can count their number and compare it with expected one
dkirillov marked this conversation as resolved
mbiryukova force-pushed feature/patch_multipart from c01772149e to dde5e04461 2024-08-27 15:59:46 +00:00 Compare
mbiryukova force-pushed feature/patch_multipart from dde5e04461 to 48ca6cb146 2024-08-27 16:04:47 +00:00 Compare
mbiryukova force-pushed feature/patch_multipart from 48ca6cb146 to 3c504e4661 2024-08-28 11:14:14 +00:00 Compare
mbiryukova force-pushed feature/patch_multipart from 3c504e4661 to eb80d14923 2024-08-28 13:32:17 +00:00 Compare
mbiryukova force-pushed feature/patch_multipart from eb80d14923 to 9f9ef38fdc 2024-08-28 15:57:14 +00:00 Compare
mbiryukova force-pushed feature/patch_multipart from 9f9ef38fdc to d1a5403b1f 2024-08-29 07:41:06 +00:00 Compare
dkirillov reviewed 2024-08-29 09:22:10 +00:00
@ -110,0 +115,4 @@
bktName, objName, partSize := "bucket-for-multipart-patch", "object-for-multipart-patch", 5*1024*1024
createTestBucket(tc, bktName)
// patch beginning of the first part
Member

Maybe it's better to use t.Run

t.Run("patch beginning of the first part", func(t *testing.T) {
	multipartInfo := createMultipartUpload(tc, bktName, objName, map[string]string{})
	etag1, data1 := uploadPart(tc, bktName, objName, multipartInfo.UploadID, 1, partSize)
	etag2, data2 := uploadPart(tc, bktName, objName, multipartInfo.UploadID, 2, partSize)
	etag3, data3 := uploadPart(tc, bktName, objName, multipartInfo.UploadID, 3, partSize)
	completeMultipartUpload(tc, bktName, objName, multipartInfo.UploadID, []string{etag1, etag2, etag3})

	patchObject(t, tc, bktName, objName, "bytes 0-"+strconv.Itoa(patchSize-1)+"/*", patchBody, nil)
	object, header := getObject(tc, bktName, objName)
	contentLen, err := strconv.Atoi(header.Get(api.ContentLength))
	require.NoError(t, err)
	equalDataSlices(t, bytes.Join([][]byte{patchBody, data1[patchSize:], data2, data3}, []byte("")), object)
	require.Equal(t, partSize*3, contentLen)
	require.True(t, strings.HasSuffix(data.UnQuote(header.Get(api.ETag)), "-3"))
})
Maybe it's better to use `t.Run` ```golang t.Run("patch beginning of the first part", func(t *testing.T) { multipartInfo := createMultipartUpload(tc, bktName, objName, map[string]string{}) etag1, data1 := uploadPart(tc, bktName, objName, multipartInfo.UploadID, 1, partSize) etag2, data2 := uploadPart(tc, bktName, objName, multipartInfo.UploadID, 2, partSize) etag3, data3 := uploadPart(tc, bktName, objName, multipartInfo.UploadID, 3, partSize) completeMultipartUpload(tc, bktName, objName, multipartInfo.UploadID, []string{etag1, etag2, etag3}) patchObject(t, tc, bktName, objName, "bytes 0-"+strconv.Itoa(patchSize-1)+"/*", patchBody, nil) object, header := getObject(tc, bktName, objName) contentLen, err := strconv.Atoi(header.Get(api.ContentLength)) require.NoError(t, err) equalDataSlices(t, bytes.Join([][]byte{patchBody, data1[patchSize:], data2, data3}, []byte("")), object) require.Equal(t, partSize*3, contentLen) require.True(t, strings.HasSuffix(data.UnQuote(header.Get(api.ETag)), "-3")) }) ```
dkirillov marked this conversation as resolved
dkirillov reviewed 2024-08-29 09:24:45 +00:00
@ -579,0 +641,4 @@
allPatchLen := p.Range.End - p.Range.Start + 1
var multipartObjectSize, patchedLen uint64
for i, part := range parts {
Member

Probably it's matter of taste but what do you think about writing something like this

diff --git a/api/layer/layer.go b/api/layer/layer.go
index b8c2923f..ead72703 100644
--- a/api/layer/layer.go
+++ b/api/layer/layer.go
@@ -639,36 +639,22 @@ func (n *Layer) patchMultipartObject(ctx context.Context, p *PatchObjectParams)
 	}
 	n.prepareAuthParameters(ctx, &prmPatch.PrmAuth, p.BktInfo.Owner)
 
-	allPatchLen := p.Range.End - p.Range.Start + 1
-	var multipartObjectSize, patchedLen uint64
+	off, ln := p.Range.Start, p.Range.End-p.Range.Start+1
+	var multipartObjectSize uint64
 	for i, part := range parts {
-		if patchedLen == allPatchLen {
+		if off > part.Size || (off == part.Size && i != len(parts)-1) || ln == 0 {
 			multipartObjectSize += part.Size
+			if ln != 0 {
+				off -= part.Size
+			}
 			continue
 		}
 
-		if p.Range.Start > part.Size || (p.Range.Start == part.Size && i != len(parts)-1) {
-			multipartObjectSize += part.Size
-			p.Range.Start -= part.Size
-			p.Range.End -= part.Size
-			continue
-		}
-
-		prmPatch.Object = part.OID
-		prmPatch.ObjectSize = part.Size
-		prmPatch.Offset = p.Range.Start
-		prmPatch.Length = p.Range.End - p.Range.Start + 1
-
-		if i != len(parts)-1 && prmPatch.Length+prmPatch.Offset > part.Size {
-			prmPatch.Length = part.Size - prmPatch.Offset
-		}
-		patchedLen += prmPatch.Length
-
 		var createdObj *data.CreatedObjectInfo
-		if prmPatch.Offset == 0 && prmPatch.Length >= part.Size {
+		if off == 0 && ln > part.Size && part.Size != 0  {
 			prm := PrmObjectCreate{
 				Container:    p.BktInfo.CID,
-				Payload:      io.LimitReader(p.NewBytes, int64(prmPatch.Length)),
+				Payload:      io.LimitReader(p.NewBytes, int64(part.Size)),
 				CreationTime: part.Created,
 				CopiesNumber: p.CopiesNumbers,
 			}
@@ -677,13 +663,25 @@ func (n *Layer) patchMultipartObject(ctx context.Context, p *PatchObjectParams)
 			if err != nil {
 				return nil, fmt.Errorf("put new part object '%s': %w", part.OID.EncodeToString(), err)
 			}
+			ln -= part.Size
 		} else {
+			curLen := ln
+			if off+curLen > part.Size && i != len(parts)-1 {
+				curLen = part.Size - off
+			}
+			prmPatch.Object = part.OID
+			prmPatch.ObjectSize = part.Size
+			prmPatch.Offset = off
+			prmPatch.Length = curLen
 			prmPatch.Payload = io.LimitReader(p.NewBytes, int64(prmPatch.Length))
 
 			createdObj, err = n.patchObject(ctx, prmPatch)
 			if err != nil {
 				return nil, fmt.Errorf("patch part object '%s': %w", part.OID.EncodeToString(), err)
 			}
+
+			ln -= curLen
+			off = 0
 		}
 
 		parts[i].OID = createdObj.ID
@@ -692,11 +690,6 @@ func (n *Layer) patchMultipartObject(ctx context.Context, p *PatchObjectParams)
 		parts[i].ETag = hex.EncodeToString(createdObj.HashSum)
 
 		multipartObjectSize += createdObj.Size
-
-		if p.Range.Start > 0 {
-			p.Range.Start = 0
-		}
-		p.Range.End -= part.Size
 	}
 
 	newParts, err := json.Marshal(parts)

Probably it's matter of taste but what do you think about writing something like this ```diff diff --git a/api/layer/layer.go b/api/layer/layer.go index b8c2923f..ead72703 100644 --- a/api/layer/layer.go +++ b/api/layer/layer.go @@ -639,36 +639,22 @@ func (n *Layer) patchMultipartObject(ctx context.Context, p *PatchObjectParams) } n.prepareAuthParameters(ctx, &prmPatch.PrmAuth, p.BktInfo.Owner) - allPatchLen := p.Range.End - p.Range.Start + 1 - var multipartObjectSize, patchedLen uint64 + off, ln := p.Range.Start, p.Range.End-p.Range.Start+1 + var multipartObjectSize uint64 for i, part := range parts { - if patchedLen == allPatchLen { + if off > part.Size || (off == part.Size && i != len(parts)-1) || ln == 0 { multipartObjectSize += part.Size + if ln != 0 { + off -= part.Size + } continue } - if p.Range.Start > part.Size || (p.Range.Start == part.Size && i != len(parts)-1) { - multipartObjectSize += part.Size - p.Range.Start -= part.Size - p.Range.End -= part.Size - continue - } - - prmPatch.Object = part.OID - prmPatch.ObjectSize = part.Size - prmPatch.Offset = p.Range.Start - prmPatch.Length = p.Range.End - p.Range.Start + 1 - - if i != len(parts)-1 && prmPatch.Length+prmPatch.Offset > part.Size { - prmPatch.Length = part.Size - prmPatch.Offset - } - patchedLen += prmPatch.Length - var createdObj *data.CreatedObjectInfo - if prmPatch.Offset == 0 && prmPatch.Length >= part.Size { + if off == 0 && ln > part.Size && part.Size != 0 { prm := PrmObjectCreate{ Container: p.BktInfo.CID, - Payload: io.LimitReader(p.NewBytes, int64(prmPatch.Length)), + Payload: io.LimitReader(p.NewBytes, int64(part.Size)), CreationTime: part.Created, CopiesNumber: p.CopiesNumbers, } @@ -677,13 +663,25 @@ func (n *Layer) patchMultipartObject(ctx context.Context, p *PatchObjectParams) if err != nil { return nil, fmt.Errorf("put new part object '%s': %w", part.OID.EncodeToString(), err) } + ln -= part.Size } else { + curLen := ln + if off+curLen > part.Size && i != len(parts)-1 { + curLen = part.Size - off + } + prmPatch.Object = part.OID + prmPatch.ObjectSize = part.Size + prmPatch.Offset = off + prmPatch.Length = curLen prmPatch.Payload = io.LimitReader(p.NewBytes, int64(prmPatch.Length)) createdObj, err = n.patchObject(ctx, prmPatch) if err != nil { return nil, fmt.Errorf("patch part object '%s': %w", part.OID.EncodeToString(), err) } + + ln -= curLen + off = 0 } parts[i].OID = createdObj.ID @@ -692,11 +690,6 @@ func (n *Layer) patchMultipartObject(ctx context.Context, p *PatchObjectParams) parts[i].ETag = hex.EncodeToString(createdObj.HashSum) multipartObjectSize += createdObj.Size - - if p.Range.Start > 0 { - p.Range.Start = 0 - } - p.Range.End -= part.Size } newParts, err := json.Marshal(parts) ```
Author
Member

Looks good, but in some cases we will make unnecessary requests (which are not made with the condition discussed in the comment above)

Looks good, but in some cases we will make unnecessary requests (which are not made with the condition discussed in the comment above)
Member

I've updated diff in previous comment

UPD: again

I've updated diff in previous comment **UPD**: again
dkirillov marked this conversation as resolved
mbiryukova force-pushed feature/patch_multipart from d1a5403b1f to e8f22ed77e 2024-08-29 12:21:20 +00:00 Compare
alexvanin approved these changes 2024-08-29 13:47:34 +00:00
dkirillov approved these changes 2024-08-30 06:37:28 +00:00
dkirillov left a comment
Member

LGTM

LGTM
alexvanin merged commit e8f22ed77e into feature/patch 2024-08-30 06:55:47 +00:00
alexvanin deleted branch feature/patch_multipart 2024-08-30 06:55:49 +00:00
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-services-developers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-s3-gw#466
No description provided.