client: Introduce ObjectPatch method #249

aarifullin · 2024-08-05T20:23:14Z

aarifullin commented

2024-08-05 20:23:14 +00:00

1. go.mod: bump frostfs-api-go/v2 version
2. object: Introduce `SplitByMaxChunkLength` method
3. client: Introduce `ObjectPatch` method
4. session: Support `patch` verb
5. pool: Introduce `objectPatch` verb

aarifullin added the

discussion

label 2024-08-05 20:23:14 +00:00

aarifullin added 5 commits 2024-08-05 20:23:20 +00:00

[#248 ] go.mod: Bump frostfs-api-go/v2 version 5e9cf6a8ae

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>

[#248 ] object: Introduce SplitByMaxChunkLength method 895a2ef0e2

* Add unit-test to test `SplitByMaxChunkLength`.

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>

[#248 ] object: Introduce object patcher 8b21c804df

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>

[#248 ] session: Support patch verb fcbf7de470

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>

[#248 ] pool: Introduce objectPatch method c0109f796a

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>

DCO / DCO (pull_request) Failing after 25s

Tests and linters / Tests (1.22) (pull_request) Successful in 52s

Tests and linters / Tests (1.21) (pull_request) Successful in 57s

Tests and linters / Lint (pull_request) Successful in 1m59s

aarifullin force-pushed feat/patch_cli from c0109f796a to 6c28bf0dbd

2024-08-05 20:24:19 +00:00


aarifullin reviewed 2024-08-05 20:24:57 +00:00

client/object_patch.go Outdated

					
				@ -0,0 +35,4 @@

					// into a few patches. After split each `PayloadPatch` has a `Chunk` with the size <= `MaxChunkLength`.

					//

					// Result means success. Failure reason can be received via Close.

					Patch(_ context.Context, patch *object.Patch) bool

aarifullin commented

2024-08-05 20:24:57 +00:00

@fyrchik, please, let's continue the discussion here (from https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pulls/248#issuecomment-46339)

fyrchik commented

2024-08-06 06:20:10 +00:00

> will need some decoration for patcher in the future

Let's introduce complexity when it is really needed.

fyrchik commented

2024-08-06 08:28:23 +00:00

Oh, I see now, it is similar to `ObjectWriter`. Then OK.

fyrchik commented

2024-08-06 08:31:18 +00:00

Can we make it more similar to `ObjectWriter` then? It has `WriteHeader` and `WriteChunk` as separate methods, similar to what you have done for the patcher implementation.

aarifullin commented

2024-08-06 14:05:04 +00:00

Please check out the refactored interface.

requested reviews from mbiryukova, storage-core-committers, storage-core-developers, storage-services-committers, storage-services-developers

2024-08-05 20:26:17 +00:00

fyrchik reviewed 2024-08-06 08:34:14 +00:00

client/object_patch.go Outdated

					
				@ -0,0 +147,4 @@

					maxChunkLen int

				}

				func (x *objectPatcher) Patch(_ context.Context, patch *object.Patch) bool {

fyrchik commented

2024-08-06 08:34:10 +00:00

The interface requires us to perform splitting, even though it also performs it internally.
Can we avoid the duplication here?
We might have the `Init` function return a more convenient interface on top of it (with `PatchHeader` and `PatchAttributes`) and make this `patch` private (or allow the user to cast to this "raw" interface via another method).

object/patch.go Outdated

					
				@ -80,1 +80,4 @@

				// SplitByMaxChunkLength splits a payload patch into a few payload patches if its `Chunk` size

				// exceeds `maxChunkLen`.

				func (p *PayloadPatch) SplitByMaxChunkLength(maxChunkLen int) []*PayloadPatch {

fyrchik commented

2024-08-06 08:17:19 +00:00

I have concerns about this method:

1. This method has restricted use: big chunks are not stored in memory and small chunks can be put in a single gRPC message.
2. The semantics of `PayloadPatch` _is_ a patch from a single gRPC message.
3. We store the whole resulting slice in memory.
4. This is a very specific helper for application <-> transport level conversion, I do not see the reason for it to be exported.

👍 1

aarifullin commented

2024-08-06 14:05:40 +00:00

Removed this method, the logic is moved to `object_patch.go`.

dstepanov-yadro requested changes 2024-08-06 09:18:50 +00:00

object/patch.go Outdated

					
				@ -81,0 +81,4 @@

				// SplitByMaxChunkLength splits a payload patch into a few payload patches if its `Chunk` size

				// exceeds `maxChunkLen`.

				func (p *PayloadPatch) SplitByMaxChunkLength(maxChunkLen int) []*PayloadPatch {

					if len(p.Chunk) <= maxChunkLen {

dstepanov-yadro commented

2024-08-06 09:14:28 +00:00

If it is expected that a patch chunk will be too large for a single message, then `Chunk` must be something like `io.Reader` so that all the data is not stored in memory.
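A minimal Go sketch of the idea in this comment: reading the patch payload from an `io.Reader` with one reusable buffer, so that at most `maxChunkLen` bytes are held in memory at a time. The names `readChunks`, `chunkStrings` and the `emit` callback are hypothetical and are not part of the SDK; only `io.ReadFull` and its EOF behaviour come from the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// readChunks drains r and calls emit once per piece of at most maxChunkLen bytes.
// The same buffer is reused for every piece, so emit must not retain the slice.
func readChunks(r io.Reader, maxChunkLen int, emit func(chunk []byte) error) error {
	if maxChunkLen <= 0 {
		return errors.New("maxChunkLen must be positive")
	}
	buf := make([]byte, maxChunkLen)
	for {
		// ReadFull returns io.EOF when nothing was read and
		// io.ErrUnexpectedEOF when the last piece is shorter than buf.
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			if emitErr := emit(buf[:n]); emitErr != nil {
				return emitErr
			}
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// chunkStrings is a small helper for demonstration: it collects the pieces as strings.
func chunkStrings(s string, maxChunkLen int) ([]string, error) {
	var out []string
	err := readChunks(strings.NewReader(s), maxChunkLen, func(chunk []byte) error {
		out = append(out, string(chunk)) // the conversion copies the bytes
		return nil
	})
	return out, err
}

func main() {
	chunks, err := chunkStrings("0123456789!@", 5)
	fmt.Println(chunks, err)
}
```

In a real client each piece would presumably be sent as its own message from inside `emit` instead of being collected.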

👍 1

aarifullin commented

2024-08-06 14:05:21 +00:00

Please check out the refactored interface.

session/object_test.go Outdated

					
				@ -599,6 +599,7 @@ func TestObject_ForVerb(t *testing.T) {

						session.VerbObjectRangeHash: v2session.ObjectVerbRangeHash,

						session.VerbObjectRange:     v2session.ObjectVerbRange,

						session.VerbObjectDelete:    v2session.ObjectVerbDelete,

						session.VerbObjectPatch:     v2session.ObjectPatch,

dstepanov-yadro commented

2024-08-06 09:15:31 +00:00

Why `v2session.ObjectPatch`, but not `v2session.ObjectVerbPatch` like other methods?

aarifullin commented

2024-08-06 09:32:38 +00:00

See TrueCloudLab/frostfs-api-go#100 (comment): https://git.frostfs.info/TrueCloudLab/frostfs-api-go/pulls/100#issuecomment-46447

dstepanov-yadro marked this conversation as resolved

aarifullin force-pushed feat/patch_cli from 6c28bf0dbd to cf2254a147

2024-08-06 13:58:08 +00:00


aarifullin force-pushed feat/patch_cli from cf2254a147 to c642c42b07

2024-08-06 14:00:44 +00:00


aarifullin force-pushed feat/patch_cli from c642c42b07 to 5a121098f6

2024-08-06 14:04:13 +00:00


requested review from dstepanov-yadro

2024-08-06 14:05:46 +00:00

aarifullin force-pushed feat/patch_cli from 5a121098f6 to 27cff41f36

2024-08-06 14:20:56 +00:00


aarifullin force-pushed feat/patch_cli from 27cff41f36 to 5fc7ebc322

2024-08-06 15:16:32 +00:00


dstepanov-yadro approved these changes 2024-08-07 10:35:29 +00:00

aarifullin force-pushed feat/patch_cli from 5fc7ebc322 to 6835affbb1

2024-08-08 08:24:13 +00:00


requested review from dstepanov-yadro

2024-08-08 08:24:36 +00:00

mbiryukova approved these changes 2024-08-08 08:26:59 +00:00

fyrchik reviewed 2024-08-08 10:45:28 +00:00

client/object_patch.go Outdated

					
				@ -0,0 +190,4 @@

						rngPart.SetOffset(offset)

						if originalLength == 0 || offset-rng.GetOffset() >= originalLength {

							rngPart.SetLength(0)

fyrchik commented

2024-08-08 10:40:28 +00:00

It is the default value, no need to set.

aarifullin commented

2024-08-08 12:20:21 +00:00

Fixed

fyrchik marked this conversation as resolved

client/object_patch.go Outdated

					
				@ -0,0 +197,4 @@

						if !x.patch(&object.Patch{

							Address: x.addr,

fyrchik commented

2024-08-08 10:40:13 +00:00

Why are there spaces in every struct literal?

aarifullin commented

2024-08-08 12:20:11 +00:00

Fixed

fyrchik marked this conversation as resolved

client/object_patch.go Outdated

					
				@ -0,0 +209,4 @@

						offset += uint64(n)

						if length > 0 {

							length -= patchLength

fyrchik commented

2024-08-08 10:43:05 +00:00

The whole offset tracking thing is rather tedious.
I suggest using `(offset, length)` for the first message and `(offset+length,0)` for the rest. Should be less code, because we don't need any calculations.

aarifullin commented

2024-08-08 12:17:45 +00:00

> I suggest using (offset, length) for the first message and (offset+length,0)

Sorry, but I don't get the idea or how it helps to reduce the code. These calculations are needed because we can't simply break the patch down into `(offset, length)` and `(offset + length, 0)`: `chunk` can be very large. This leads to the point where we need to split into `(offset, length - max_buf_size)` and `(offset + length - max_buf_size, length - max_buf_size)`.

fyrchik commented

2024-08-08 12:28:13 +00:00

> we can't just simply break down to (offset, length) and (offset + length, 0) because chunk can be very large

I do not see any problem here.
You replace `(offset,length)` with the first part (of any length you pick).
You replace `(offset+length,0)` with the second part (of any length you pick).

Where do we need to take `chunk` into account?

aarifullin commented

2024-08-08 13:27:59 +00:00

Let's consider such a patch:

```
{ .offset: 0, .length: 4, .payload: '0123456789!@' }
```

Let `maxChunkLen = 2` be the size of the buffer used to read the data. Then the patch should be split like this:

```
{ .offset = 0,  .length = 2, .payload = '01' } -> replaces 'length' bytes of the original payload
{ .offset = 2,  .length = 2, .payload = '23' } -> replaces 'length' bytes of the original payload
{ .offset = 4,  .length = 0, .payload = '45' } -> inserts bytes at the original payload
{ .offset = 6,  .length = 0, .payload = '67' } -> inserts bytes at the original payload
{ .offset = 8,  .length = 0, .payload = '89' } -> inserts bytes at the original payload
{ .offset = 10, .length = 0, .payload = '!@' } -> inserts bytes at the original payload
```

If the first message is `(offset, length)` and the rest are `(offset + length, 0)`, then

```
{ .offset = 0, .length = 2, .payload = '01' }
{ .offset = 2, .length = 0, .payload = '23' } -> incorrect
```

So we need to track the length, because it is not always `0` in `(offset + length, 0)`.

fyrchik commented

2024-08-08 14:07:42 +00:00

Why is this correct: `{ .offset = 4, .length = 0, .payload = '45' } -> inserts bytes at the original payload`
and this is not: `{ .offset = 2, .length = 0, .payload = '23' } -> incorrect`?

fyrchik commented

2024-08-08 14:30:56 +00:00

The misunderstanding can stem from the fact that the patch size CAN differ from the size of the replaced part of the original payload, so you can replace `X` bytes with another `Y` bytes, and there is no relation between `X` and `Y`.

aarifullin commented

2024-08-08 14:40:08 +00:00

1. Currently it works like this:

```
[0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 }

[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d]
```

We replace the first `.length = 4` bytes and just insert the rest:

```
[0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 }
 ↓  ↓  ↓  ↓
[o][r][i][g] _  _  _  _  _  _  _  _ [i][n][a][l][p][a][y][l][o][a][d]

[0][1][2][3][4][5][6][7][8][9][!][@][i][n][a][l][p][a][y][l][o][a][d]
```

Now assume the buffer size is `2`, so each subpatch's payload can only be `<=2`:

```
[0][1] { .offset = 0,  .length = 2, .payload = '01' }
 ↓  ↓
[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][g][i][n][a][l][p][a][y][l][o][a][d]

      [2][3] { .offset = 2,  .length = 2, .payload = '23' }
       ↓  ↓
[0][1][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][n][a][l][p][a][y][l][o][a][d]

            [4][5] { .offset = 4,  .length = 0, .payload = '45' }
             ↓  ↓
[0][1][2][3] _  _ [i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][4][5][i][n][a][l][p][a][y][l][o][a][d]

...so on...
```

2. If `{ .offset = 2, .length = 0, .payload = '23' }` were correct, then we would insert `23`, not replace:

```
[0][1] { .offset = 0,  .length = 2, .payload = '01' }
 ↓  ↓
[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][g][i][n][a][l][p][a][y][l][o][a][d]

      [2][3] { .offset = 2,  .length = 0, .payload = '23' }
       ↓  ↓
[0][1] _  _ [i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][g][i][n][a][l][p][a][y][l][o][a][d]
```

So that is how I assumed it should work.

fyrchik commented

2024-08-08 14:44:00 +00:00

You need to replace `{ .offset = 0, .length = 2, .payload = '01' }` with `{ .offset = 0, .length = 4, .payload = '01' }`.
It _should_ work according to the spec, regardless of what is implemented now.

After `{ .offset = 0, .length = 4, .payload = '01' }`:
[0][1][i][n][a][l][p][a][y][l][o][a][d]

After `{ .offset = 4, .length = 0, .payload = '23' }`:
[0][1][2][3][i][n][a][l][p][a][y][l][o][a][d]

aarifullin commented

2024-08-08 14:53:10 +00:00

Finally, I get your point. Alright, let me check `patcher` then, because I didn't take this fact into account when I was designing it.

aarifullin commented

2024-08-08 15:30:07 +00:00

I've checked `patcher`. It works correctly. So, I fixed `PatchPayload` according to your suggestion - it's really perfect.

aarifullin commented

2024-08-09 07:40:58 +00:00

For those who did not follow the idea we have been discussing:

`ε - empty character, it erases a character with the same position`

```
The original payload
[0][1][2][3][4][5][6][7][8][9][!][@] {.offset = 0, .length = 4 }

[0][1] ε  ε  { .offset = 0,  .length = 4, .payload = '01' } -- length is greater than payload
 ↓  ↓  ↓  ↓
[o][r][i][g][i][n][a][l][p][a][y][l][o][a][d] -> [0][1][i][n][a][l][p][a][y][l][o][a][d]

           [2][3] { .offset = 0 + 4,  .length = 0, .payload = '23' }
            ↓
____________[i][n][a][l][p][a][y][l][o][a][d]


replaced  inserted
 ↓  ↓        ↓ ↓
[0][1] ε ε [2][3] [i][n][a][l][p][a][y][l][o][a][d] -> [0][1][2][3][i][n][a][l][p][a][y][l][o][a][d]
```
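The scheme agreed on in this thread can be sketched in Go as follows. This is an illustration under stated assumptions, not the code from `client/object_patch.go`: `subPatch`, `splitPatch`, `applyPatches` and `patchString` are hypothetical names, and `applyPatches` assumes (as the examples above suggest) that offsets refer to the original payload and that sub-patches arrive in order without overlapping.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// subPatch is a hypothetical, simplified stand-in for one payload patch message:
// replace Length bytes of the original object starting at Offset with Chunk.
type subPatch struct {
	Offset, Length uint64
	Chunk          []byte
}

// splitPatch reads the patch payload from r in pieces of at most maxChunkLen bytes.
// The first sub-patch keeps the caller's (offset, length); every later one uses
// (offset+length, 0), so no per-chunk length bookkeeping is needed.
func splitPatch(r io.Reader, offset, length uint64, maxChunkLen int, send func(subPatch) error) error {
	if maxChunkLen <= 0 {
		return errors.New("maxChunkLen must be positive")
	}
	buf := make([]byte, maxChunkLen)
	for first := true; ; first = false {
		n, err := io.ReadFull(r, buf)
		if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
			return err
		}
		// The first sub-patch is sent even with an empty chunk,
		// because it carries the range to remove from the original.
		if n > 0 || first {
			p := subPatch{Offset: offset + length, Chunk: append([]byte(nil), buf[:n]...)}
			if first {
				p.Offset, p.Length = offset, length
			}
			if sendErr := send(p); sendErr != nil {
				return sendErr
			}
		}
		if err != nil { // io.EOF or io.ErrUnexpectedEOF: the reader is drained
			return nil
		}
	}
}

// applyPatches applies ordered, non-overlapping sub-patches whose offsets
// refer to the original payload (an assumption of this sketch).
func applyPatches(orig []byte, patches []subPatch) []byte {
	var res []byte
	var pos uint64 // read position in the original payload
	for _, p := range patches {
		res = append(res, orig[pos:p.Offset]...)
		res = append(res, p.Chunk...)
		pos = p.Offset + p.Length
	}
	return append(res, orig[pos:]...)
}

// patchString splits the payload, applies the sub-patches and reports how many were produced.
func patchString(orig, payload string, offset, length uint64, maxChunkLen int) (string, int, error) {
	var patches []subPatch
	err := splitPatch(strings.NewReader(payload), offset, length, maxChunkLen, func(p subPatch) error {
		patches = append(patches, p)
		return nil
	})
	if err != nil {
		return "", 0, err
	}
	return string(applyPatches([]byte(orig), patches)), len(patches), nil
}

func main() {
	res, n, err := patchString("originalpayload", "0123456789!@", 0, 4, 2)
	fmt.Println(res, n, err) // prints "0123456789!@inalpayload 6 <nil>"
}
```

With a buffer large enough to hold the whole payload the same call yields a single sub-patch and the same result, which is the property the discussion relies on.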

fyrchik commented

2024-08-09 10:16:28 +00:00

I don't see where it is changed, you still have `length -= patchLength` and `patchLength` depends on the number of items you have read (`n, err`).

aarifullin commented

2024-08-09 11:22:05 +00:00

Um... sorry for the confusion - forgot to push after local rebase. Please, check this out

client/object_patch_test.go Outdated

					
				@ -0,0 +38,4 @@

				}

				func TestObjectPatcher(t *testing.T) {

					t.Run("no split payload patch", func(t *testing.T) {

fyrchik commented

2024-08-08 10:45:24 +00:00

It seems this suite of tests could be made smaller by using the table tests pattern: we have a payload, we have a range, and that's all.
What do you think?
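A rough illustration of the table-driven shape suggested here, written as a plain program so it runs without the `testing` package. Everything in it is hypothetical: `rng`, `splitRanges`, `cases` and `runCases` are illustrative names, and `splitRanges` only computes the `(offset, length)` pair of each sub-patch for a payload of a given size, following the "first message keeps the range, the rest use `(offset+length, 0)`" scheme from this review.

```go
package main

import "fmt"

// rng is a hypothetical (offset, length) pair of a sub-patch.
type rng struct{ Offset, Length uint64 }

// splitRanges returns the range of every sub-patch for a payload of payloadLen
// bytes sent in pieces of at most maxChunkLen bytes.
func splitRanges(r rng, payloadLen, maxChunkLen int) []rng {
	if maxChunkLen <= 0 {
		return nil
	}
	out := []rng{r} // the first sub-patch keeps the original range
	for sent := maxChunkLen; sent < payloadLen; sent += maxChunkLen {
		out = append(out, rng{Offset: r.Offset + r.Length})
	}
	return out
}

// cases is the table: each row is a payload size, a range and the expected split.
var cases = []struct {
	name        string
	rng         rng
	payloadLen  int
	maxChunkLen int
	want        []rng
}{
	{"no split", rng{0, 4}, 3, 4, []rng{{0, 4}}},
	{"pure insert split in two", rng{10, 0}, 4, 2, []rng{{10, 0}, {10, 0}}},
	{"replace then insert", rng{0, 4}, 6, 2, []rng{{0, 4}, {4, 0}, {4, 0}}},
	{"empty payload", rng{5, 3}, 0, 2, []rng{{5, 3}}},
}

// runCases returns the names of the rows whose result differs from the expectation.
func runCases() []string {
	var failed []string
	for _, tc := range cases {
		got := splitRanges(tc.rng, tc.payloadLen, tc.maxChunkLen)
		if fmt.Sprint(got) != fmt.Sprint(tc.want) {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failed cases:", runCases())
}
```

In a `_test.go` file the loop body would normally be wrapped in `t.Run(tc.name, ...)` so that each row is reported as its own subtest.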

aarifullin commented

2024-08-08 12:19:47 +00:00

Fixed

fyrchik marked this conversation as resolved

pool/pool.go Outdated

					
				@ -164,6 +166,7 @@ const (

					methodAPEManagerAddChain

					methodAPEManagerRemoveChain

					methodAPEManagerListChains

					methodObjectPatch

fyrchik commented

2024-08-08 10:37:44 +00:00

@dkirillov do we need backwards compatibility in this field?

If not, @aarifullin, could you move it to the `object` group?
It seems we have a stringer for the type, and it is only used for metrics.
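The compatibility question arises because moving a constant inside an `iota` block renumbers every constant after it. A small Go sketch with hypothetical names (they are not the constants from `pool/pool.go`):

```go
package main

import "fmt"

// methodIndex is a hypothetical enum type used only for this illustration.
type methodIndex int

// Variant 1: the new method is appended after the existing ones.
const (
	appendedObjectPut methodIndex = iota
	appendedObjectGet
	appendedAPEManagerListChains
	appendedObjectPatch
)

// Variant 2: the new method is placed next to the other object methods.
const (
	groupedObjectPut methodIndex = iota
	groupedObjectGet
	groupedObjectPatch
	groupedAPEManagerListChains
)

func main() {
	// The numeric value of the APE manager constant shifts from 2 to 3
	// once the patch method is moved into the object group.
	fmt.Println(int(appendedAPEManagerListChains), int(groupedAPEManagerListChains))
}
```

If the numeric values are never persisted or exchanged and only the string names reach the metrics, the renumbering should be harmless, which seems to be what the question above is checking.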

dkirillov commented

2024-08-08 12:21:58 +00:00

I suppose not.

aarifullin commented

2024-08-08 12:23:30 +00:00

Fixed

fyrchik marked this conversation as resolved

aarifullin force-pushed feat/patch_cli from 6835affbb1 to f0a8e54735

2024-08-08 12:19:26 +00:00


aarifullin force-pushed feat/patch_cli from f0a8e54735 to 5b77cbdc79

2024-08-08 12:23:22 +00:00