client: Introduce ObjectPatch method #249

Merged
fyrchik merged 4 commits from aarifullin/frostfs-sdk-go:feat/patch_cli into master 2024-09-04 19:51:16 +00:00
Member
  1. go.mod: bump frostfs-api-go/v2 version
  2. object: Introduce SplitByMaxChunkLength method
  3. client: Introduce ObjectPatch method
  4. session: Support patch verb
  5. pool: Introduce objectPatch verb
1. go.mod: bump frostfs-api-go/v2 version 2. object: Introduce `SplitByMaxChunkLength` method 3. client: Introduce `ObjectPatch` method 4. session: Support `patch` verb 5. pool: Introduce `objectPatch` verb
aarifullin added the
discussion
label 2024-08-05 20:23:14 +00:00
aarifullin added 5 commits 2024-08-05 20:23:20 +00:00
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
* Add unit-test to test `SplitByMaxChunkLength`.

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
[#248] pool: Introduce objectPatch method
Some checks failed
DCO / DCO (pull_request) Failing after 25s
Tests and linters / Tests (1.22) (pull_request) Successful in 52s
Tests and linters / Tests (1.21) (pull_request) Successful in 57s
Tests and linters / Lint (pull_request) Successful in 1m59s
c0109f796a
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
aarifullin force-pushed feat/patch_cli from c0109f796a to 6c28bf0dbd 2024-08-05 20:24:19 +00:00 Compare
aarifullin reviewed 2024-08-05 20:24:57 +00:00
@ -0,0 +35,4 @@
// into a few patches. After split each `PayloadPatch` has a `Chunk` with the size <= `MaxChunkLength`.
//
// Result means success. Failure reason can be received via Close.
Patch(_ context.Context, patch *object.Patch) bool
Author
Member

@fyrchik, please, let's go on discussing here

(from #248 (comment))

@fyrchik, please, let's go on discussing here (from https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pulls/248#issuecomment-46339)
Owner

will need some decoration for patcher in the future

let's introduce complexity when it is really needed

>will need some decoration for patcher in the future let's introduce complexity when it is really needed
Owner

Oh, I see now, it is similar to ObjectWriter. Then OK.

Oh, I see now, it is similar to `ObjectWriter`. Then OK.
Owner

Can we make it more similar to ObjectWriter then? It has WriteHeader and WriteChunk as separate methods, similar to what you have done for patcher implementation.

Can we make it more similar to `ObjectWriter` then? It has `WriteHeader` and `WriteChunk` as separate methods, similar to what you have done for patcher implementation.
Author
Member

Please, check the refactored interface out

Please, check the refactored interface out
aarifullin requested review from mbiryukova 2024-08-05 20:25:10 +00:00
aarifullin requested review from storage-core-committers 2024-08-05 20:25:10 +00:00
aarifullin requested review from storage-core-developers 2024-08-05 20:25:23 +00:00
aarifullin requested review from storage-services-committers 2024-08-05 20:25:24 +00:00
aarifullin requested review from storage-services-developers 2024-08-05 20:26:17 +00:00
fyrchik reviewed 2024-08-06 08:34:14 +00:00
@ -0,0 +147,4 @@
maxChunkLen int
}
func (x *objectPatcher) Patch(_ context.Context, patch *object.Patch) bool {
Owner

The interface requires us to perform splitting, even though it also performs it internally.
Can we avoid the duplication here?
We might have Init function return a more convenient interface on top of it (with PatchHeader and PatchAttributes) and make this patch private (or allow the user to cast to this "raw" interface via another method).

The interface requires us to perform splitting, even though it also performs it internally. Can we avoid the duplication here? We might have `Init` function return a more convenient interface on top of it (with `PatchHeader` and `PatchAttributes`) and make this `patch` private (or allow the user to cast to this "raw" interface via another method).
object/patch.go Outdated
@ -80,1 +80,4 @@
// SplitByMaxChunkLength splits a payload patch into a few payload patches if its `Chunk` size
// exceeds `maxChunkLen`.
func (p *PayloadPatch) SplitByMaxChunkLength(maxChunkLen int) []*PayloadPatch {
Owner

I have concerns about this method:

  1. This method has restricted use: big chunks are not stored in memory and small chunks can be put in a single gRPC message.
  2. The semantics of PayloadPatch is a patch from a single gRPC message.
  3. We store the whole resulting slice in memory.
  4. This is a very specific helper for application <-> transport level conversion, I do not see the reason for it to be exported.
I have concerns about this method: 1. This method has restricted use: big chunks are not stored in memory and small chunks can be put in a single gRPC message. 2. The semantics of `PayloadPatch` _is_ a patch from a single gRPC message. 3. We store the whole resulting slice in memory. 4. This is a very specific helper for application <-> transport level conversion, I do not see the reason for it to be exported.
Author
Member

Removed this method, the logic is moved to object_patch.go

Removed this method, the logic is moved to `object_patch.go`
dstepanov-yadro requested changes 2024-08-06 09:18:50 +00:00
object/patch.go Outdated
@ -81,0 +81,4 @@
// SplitByMaxChunkLength splits a payload patch into a few payload patches if its `Chunk` size
// exceeds `maxChunkLen`.
func (p *PayloadPatch) SplitByMaxChunkLength(maxChunkLen int) []*PayloadPatch {
if len(p.Chunk) <= maxChunkLen {

If it is expected that patch chunk will too large for single message, then Chunk must be something like io.Reader to not to store all data in memory.

If it is expected that patch chunk will too large for single message, then `Chunk` must be something like `io.Reader` to not to store all data in memory.
Author
Member

Please, check the refactored interface out

Please, check the refactored interface out
@ -599,6 +599,7 @@ func TestObject_ForVerb(t *testing.T) {
session.VerbObjectRangeHash: v2session.ObjectVerbRangeHash,
session.VerbObjectRange: v2session.ObjectVerbRange,
session.VerbObjectDelete: v2session.ObjectVerbDelete,
session.VerbObjectPatch: v2session.ObjectPatch,

Why v2session.ObjectPatch, but not v2session.ObjectVerbPatch like other methods?

Why `v2session.ObjectPatch`, but not `v2session.ObjectVerbPatch` like other methods?
Author
Member
https://git.frostfs.info/TrueCloudLab/frostfs-api-go/pulls/100#issuecomment-46447
dstepanov-yadro marked this conversation as resolved
aarifullin force-pushed feat/patch_cli from 6c28bf0dbd to cf2254a147 2024-08-06 13:58:08 +00:00 Compare
aarifullin force-pushed feat/patch_cli from cf2254a147 to c642c42b07 2024-08-06 14:00:44 +00:00 Compare
aarifullin force-pushed feat/patch_cli from c642c42b07 to 5a121098f6 2024-08-06 14:04:13 +00:00 Compare
aarifullin requested review from dstepanov-yadro 2024-08-06 14:05:46 +00:00
aarifullin force-pushed feat/patch_cli from 5a121098f6 to 27cff41f36 2024-08-06 14:20:56 +00:00 Compare
aarifullin force-pushed feat/patch_cli from 27cff41f36 to 5fc7ebc322 2024-08-06 15:16:32 +00:00 Compare
dstepanov-yadro approved these changes 2024-08-07 10:35:29 +00:00
aarifullin force-pushed feat/patch_cli from 5fc7ebc322 to 6835affbb1 2024-08-08 08:24:13 +00:00 Compare
aarifullin requested review from dstepanov-yadro 2024-08-08 08:24:36 +00:00
mbiryukova approved these changes 2024-08-08 08:26:59 +00:00
fyrchik reviewed 2024-08-08 10:45:28 +00:00
@ -0,0 +190,4 @@
rngPart.SetOffset(offset)
if originalLength == 0 || offset-rng.GetOffset() >= originalLength {
rngPart.SetLength(0)
Owner

It is the default value, no need to set.

It is the default value, no need to set.
Author
Member

Fixed

Fixed
fyrchik marked this conversation as resolved
@ -0,0 +197,4 @@
if !x.patch(&object.Patch{
Address: x.addr,
Owner

Why are there spaces in every struct literal?

Why are there spaces in every struct literal?
Author
Member

Fixed

Fixed
fyrchik marked this conversation as resolved
@ -0,0 +209,4 @@
offset += uint64(n)
if length > 0 {
length -= patchLength
Owner

The whole offset tracking thing is rather tedious.
I suggest using (offset, length) for the first message and (offset+length,0) for the rest. Should be less code, because we don't need any calculations.

The whole offset tracking thing is rather tedious. I suggest using `(offset, length)` for the first message and `(offset+length,0)` for the rest. Should be less code, because we don't need any calculations.
Author
Member

I suggest using (offset, length) for the first message and (offset+length,0)

Sorry but I don't get the idea and how it helps to recude code. These calculations are needed because we can't just simply break down to (offset, length) and (offset + length, 0) because chunk can be very large. This leads to the point when we need to split (offset, length - max_buf_size) and (offset + length - max_buf_size, length - max_buf_size)

> I suggest using (offset, length) for the first message and (offset+length,0) Sorry but I don't get the idea and how it helps to recude code. These calculations are needed because we can't just simply break down to `(offset, length)` and `(offset + length, 0)` because `chunk` can be very large. This leads to the point when we need to split `(offset, length - max_buf_size)` and `(offset + length - max_buf_size, length - max_buf_size)`
Owner

we can't just simply break down to (offset, length) and (offset + length, 0) because chunk can be very large

I do not see any problem here.
You replace (offset,length) with the first part (of any possible length, you pick).
You replace (offset+length,0) with the second part (of any possible length, you pick).

Where do we need to take chunk into account?

>we can't just simply break down to (offset, length) and (offset + length, 0) because chunk can be very large I do not see any problem here. You replace `(offset,length)` with the first part (of any possible length, you pick). You replace `(offset+length,0)` with the second part (of any possible length, you pick). Where do we need to take `chunk` into account?
Author
Member

Let's consider such a patch:

{ .offset: 0, .length: 4, .payload: '0123456789!@' }

Let maxChunkLen = 2 -> the size for the buff to read data

then the patch should be splitted like that:

{ .offset = 0,  .length = 2, .payload = '01' } -> replaces 'length' bytes of the original payload
{ .offset = 2,  .length = 2, .payload = '23' } -> replaces 'length' bytes of the original payload
{ .offset = 4,  .length = 0, .payload = '45' } -> inserts bytes at the original payload 
{ .offset = 6,  .length = 0, .payload = '67' } -> inserts bytes at the original payload 
{ .offset = 8,  .length = 0, .payload = '89' } -> inserts bytes at the original payload 
{ .offset = 10, .length = 0, .payload = '!@' } -> inserts bytes at the original payload 

if the first read message is like (offset, length) and the rest are (offset + length, 0), then

{ .offset = 0, .length = 2, .payload = '01' }
{ .offset = 2, .length = 0, .payload = '23' } -> incorrect

So, we need to track the length because it's not always 0 in (offset + length, 0)

Let's consider such a patch: ``` { .offset: 0, .length: 4, .payload: '0123456789!@' } ``` Let `maxChunkLen = 2` -> the size for the buff to read data then the patch should be splitted like that: ``` { .offset = 0, .length = 2, .payload = '01' } -> replaces 'length' bytes of the original payload { .offset = 2, .length = 2, .payload = '23' } -> replaces 'length' bytes of the original payload { .offset = 4, .length = 0, .payload = '45' } -> inserts bytes at the original payload { .offset = 6, .length = 0, .payload = '67' } -> inserts bytes at the original payload { .offset = 8, .length = 0, .payload = '89' } -> inserts bytes at the original payload { .offset = 10, .length = 0, .payload = '!@' } -> inserts bytes at the original payload ``` if the first read message is like `(offset, length)` and the rest are `(offset + length, 0)`, then ``` { .offset = 0, .length = 2, .payload = '01' } { .offset = 2, .length = 0, .payload = '23' } -> incorrect ``` So, we need to track the length because it's not always `0` in `(offset + length, 0)`
Owner

Why is this correct { .offset = 4, .length = 0, .payload = '45' } -> inserts bytes at the original payload
And this is not { .offset = 2, .length = 0, .payload = '23' } -> incorrect ?

Why is this correct `{ .offset = 4, .length = 0, .payload = '45' } -> inserts bytes at the original payload ` And this is not `{ .offset = 2, .length = 0, .payload = '23' } -> incorrect` ?
Owner

The misunderstanding can stem from the fact that patch size CAN differ from the part in the original payload, so you can replace X bytes with another Y bytes, and there is no relation between X and Y.

The misunderstanding can stem from the fact that patch size CAN differ from the part in the original payload, so you can replace `X` bytes with another `Y` bytes, and there is no relation between X and Y.
Author
Member
  1. Currently it works like that:
[0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 }

[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d]

We replacing first .length = 4 bytes and just insert the rest

[0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 }
 ↓  ↓  ↓  ↓
[o][r][i][g] _  _  _  _  _  _  _  _ [i][n][a][l][p][a][y][l][o][a][d]

[0][1][2][3][4][5][6][7][8][9][!][@][i][n][a][l][p][a][y][l][o][a][d]

So, consider the buffer size is 2, so each subpatch's payload can be only <=2

[0][1] { .offset = 0,  .length = 2, .payload = '01' }
 ↓  ↓
[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][g][i][n][a][l][p][a][y][l][o][a][d]

      [2][3] { .offset = 2,  .length = 2, .payload = '23' }
       ↓  ↓
[0][1][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][n][a][l][p][a][y][l][o][a][d]

            [4][5] { .offset = 2,  .length = 0, .payload = '45' }
             ↓  ↓
[0][1][2][3] _  _ [i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][4][5][i][n][a][l][p][a][y][l][o][a][d]

...so on...
  1. If { .offset = 2, .length = 0, .payload = '23' } whould be correct

then we insert 23, not replace

[0][1] { .offset = 0,  .length = 2, .payload = '01' }
 ↓  ↓
[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][g][i][n][a][l][p][a][y][l][o][a][d]

      [2][3] { .offset = 2,  .length = 0, .payload = '23' }
       ↓  ↓
[0][1] _  _ [i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][g][i][n][a][l][p][a][y][l][o][a][d]

So, that how I assumed how it should work

1. Currently it works like that: ``` [0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 } [o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] ``` We replacing first `.length = 4` bytes and just insert the rest ``` [0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 } ↓ ↓ ↓ ↓ [o][r][i][g] _ _ _ _ _ _ _ _ [i][n][a][l][p][a][y][l][o][a][d] [0][1][2][3][4][5][6][7][8][9][!][@][i][n][a][l][p][a][y][l][o][a][d] ``` So, consider the buffer size is `2`, so each subpatch's payload can be only `<=2` ``` [0][1] { .offset = 0, .length = 2, .payload = '01' } ↓ ↓ [o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][g][i][n][a][l][p][a][y][l][o][a][d] [2][3] { .offset = 2, .length = 2, .payload = '23' } ↓ ↓ [0][1][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][n][a][l][p][a][y][l][o][a][d] [4][5] { .offset = 2, .length = 0, .payload = '45' } ↓ ↓ [0][1][2][3] _ _ [i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][4][5][i][n][a][l][p][a][y][l][o][a][d] ...so on... ``` 2. If `{ .offset = 2, .length = 0, .payload = '23' }` whould be correct then we insert `23`, not replace ``` [0][1] { .offset = 0, .length = 2, .payload = '01' } ↓ ↓ [o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][g][i][n][a][l][p][a][y][l][o][a][d] [2][3] { .offset = 2, .length = 0, .payload = '23' } ↓ ↓ [0][1] _ _ [i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][g][i][n][a][l][p][a][y][l][o][a][d] ``` So, that how I assumed how it should work
Owner

You need to replace { .offset = 0, .length = 2, .payload = '01' } with { .offset = 0, .length = 4, .payload = '01' }.
It should work according to spec, irregardless of what is implemented now.

After { .offset = 0, .length = 4, .payload = '01' }:
[0][1][i][n][a][l][p][a][y][l][o][a][d]

After {.offset = 4, .length = 0, .payload = '23' }:
[0][1][2][3][i][n][a][l][p][a][y][l][o][a][d]

You need to replace `{ .offset = 0, .length = 2, .payload = '01' }` with `{ .offset = 0, .length = 4, .payload = '01' }`. It _should_ work according to spec, irregardless of what is implemented now. After `{ .offset = 0, .length = 4, .payload = '01' }`: [0][1][i][n][a][l][p][a][y][l][o][a][d] After `{.offset = 4, .length = 0, .payload = '23' }`: [0][1][2][3][i][n][a][l][p][a][y][l][o][a][d]
Author
Member

Finally, I get your point. Alright, let me check patcher then, because I didn't take this fact in account when I was designing it

Finally, I get your point. Alright, let me check `patcher` then, because I didn't take this fact in account when I was designing it
Author
Member

I've checked patcher. It correctly works. So, I fixed PatchPayload according to your suggestion - it's really perfect

I've checked `patcher`. It correctly works. So, I fixed `PatchPayload` according to your suggestion - it's really perfect
Author
Member

For those who don't get the idea we have been discussing:

ε - empty character, it erases a character with the same position

The original payload
[0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 }

[0][1] ε  ε  { .offset = 0,  .length = 4, .payload = '01' } -- length is greater than payload
 ↓  ↓  ↓  ↓
[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][n][a][l][p][a][y][l][o][a][d]

           [2][3] { .offset = 0 + 4,  .length = 0, .payload = '23' }
            ↓  
____________[i][n][a][l][p][a][y][l][o][a][d]


replaced  inserted
 ↓  ↓        ↓ ↓ 
[0][1] ε ε [2][3] [i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][n][a][l][p][a][y][l][o][a][d]

For those who don't get the idea we have been discussing: `ε - empty character, it erases a character with the same position` ``` The original payload [0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 } [0][1] ε ε { .offset = 0, .length = 4, .payload = '01' } -- length is greater than payload ↓ ↓ ↓ ↓ [o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][n][a][l][p][a][y][l][o][a][d] [2][3] { .offset = 0 + 4, .length = 0, .payload = '23' } ↓ ____________[i][n][a][l][p][a][y][l][o][a][d] replaced inserted ↓ ↓ ↓ ↓ [0][1] ε ε [2][3] [i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][n][a][l][p][a][y][l][o][a][d] ```
Owner

I don't see where it is changed, you still have length -= patchLength and patchLength depends on the number of items you have read (n, err)

I don't see where it is changed, you still have `length -= patchLength` and patchLength depends on the number of items you have read (`n, err`)
Author
Member

Um... sorry for the confusion - forgot to push after local rebase. Please, check this out

Um... sorry for the confusion - forgot to push after local rebase. Please, check this out
@ -0,0 +38,4 @@
}
func TestObjectPatcher(t *testing.T) {
t.Run("no split payload patch", func(t *testing.T) {
Owner

It seems this suit of tests could be less in size by using table tests pattern: we have payload, we have range and that's all.
What do you think?

It seems this suit of tests could be less in size by using table tests pattern: we have payload, we have range and that's all. What do you think?
Author
Member

Fixed

Fixed
fyrchik marked this conversation as resolved
pool/pool.go Outdated
@ -164,6 +166,7 @@ const (
methodAPEManagerAddChain
methodAPEManagerRemoveChain
methodAPEManagerListChains
methodObjectPatch
Owner

@dkirillov do we need backwards compatibility in this field?

If no, @aarifullin, could you move it to the object group?
It seems we have stringer for the type, and it is only used for metrics.

@dkirillov do we need backwards compatibility in this field? If no, @aarifullin, could you move it to the `object` group? It seems we have stringer for the type, and it is only used for metrics.
Member

I suppose no

I suppose no
Author
Member

Fixed

Fixed
fyrchik marked this conversation as resolved
aarifullin force-pushed feat/patch_cli from 6835affbb1 to f0a8e54735 2024-08-08 12:19:26 +00:00 Compare
aarifullin force-pushed feat/patch_cli from f0a8e54735 to 5b77cbdc79 2024-08-08 12:23:22 +00:00 Compare
dstepanov-yadro approved these changes 2024-08-08 14:04:02 +00:00
aarifullin requested review from mbiryukova 2024-08-08 15:30:28 +00:00
aarifullin force-pushed feat/patch_cli from 5b77cbdc79 to 5d58519253 2024-08-09 11:19:54 +00:00 Compare
mbiryukova approved these changes 2024-08-09 12:19:17 +00:00
fyrchik approved these changes 2024-08-09 12:48:09 +00:00
fyrchik approved these changes 2024-08-09 12:49:00 +00:00
fyrchik merged commit 5d58519253 into master 2024-08-09 12:49:07 +00:00
fyrchik referenced this pull request from a commit 2024-08-09 12:49:10 +00:00
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-core-developers
TrueCloudLab/storage-services-committers
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-sdk-go#249
No description provided.