Introduce patcher
package #247
No reviewers
Labels
No labels
P0
P1
P2
P3
good first issue
pool
Infrastructure
blocked
bug
config
discussion
documentation
duplicate
enhancement
go
help wanted
internal
invalid
kludge
observability
perfomance
question
refactoring
wontfix
No milestone
No project
No assignees
5 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: TrueCloudLab/frostfs-sdk-go#247
Loading…
Reference in a new issue
No description provided.
Delete branch "aarifullin/frostfs-sdk-go:feat/patch/1"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Introduce
Patch
typePatch
andPayloadPatch
;object: Introduce
patcher
packagepatcher
package that contains such interfaces to be implemented:PatchApplier
- the main patching engine that merges the stream of patches and the stream of original object payload divided by ranges. The merged streams result is output toChunkedObjectWriter
;RangeProvider
- provides the original object payload by ranges;HeaderProvider
- provides the original object header.patcher
that implementsPatchApplier
;patcher
packagee4aa512c1f
toe4d85fa865
e4d85fa865
to2fc0854a48
2fc0854a48
to7d29c58d7f
@ -0,0 +13,4 @@
var (
ErrOffsetExceedsSize = errors.New("patch offset exceeds object size")
ErrEmptyPayloadPatch = errors.New("patch must contain payload")
Changing only attributes is allowed, when do we throw this message?
According to RFC:
That means
ApplyPatch
can be invoked withpatch
that must set onlyPayloadPatch
if this invocation is not first. I think the error message should be slightly fixedRefactored to
ErrPayloadPatchIsNil
@ -0,0 +39,4 @@
// HeaderProvider is the interface that provides a method to get an original's object header.
type HeaderProvider interface {
// GetObjectHeader gets an original's object header.
GetObjectHeader(ctx context.Context, addr oid.Address) (*objectSDK.Object, error)
This is an interface with method called only once (exactly once) for a single address.
Why not provide the header as a parameter?
Briefly, it's kind of trick for
patcher
's initialization. If we agree about this interface,HeaderProvider
will be not needed@ -0,0 +48,4 @@
// ReadRange reads an original object payload by the given range.
// The method returns io.Reader over the data range only. This means if the data is read out,
// then ReadRange has to be invoked to provide reader over the next range.
ReadRange(ctx context.Context, addr oid.Address, rng *objectSDK.Range) io.Reader
We have a single address for patcher, I think it should accept addressless interfaces, there is no need in duplication.
Fixed
@ -0,0 +81,4 @@
}
}
func (p *patcher) ApplyPatch(ctx context.Context, currPatch *objectSDK.Patch) bool {
This API mimics the gRPC restrictions, by accepting
objectSDK.Patch
which can contain both attributes and payload.I believe the right way is to have two methods: one for attributes
func(attr []Attribute, replace bool)
, another for payload patchesfunc(PayloadPatch)
.We then need only to validate that the former is called once, and we won't need multiple nested ifs as in this method.
We agreed that
patch
stream won't get an initialization message (like forPut
), butpatcher
requires some initialization path.Let's consider such an interface:
(first of all I'd like to introduce header patching
func(attr []Attribute, replace bool)
likeApplyHeaderPatch
)So, a client like
frostfs-storage
is responsible to follow the invocation orderInit
->ApplyHeaderPatch
for the same patch ->ApplyPayloadPatch
->Close
. Is it ok?Why does
ApplyHeaderPatch
andApplyPayloadPatch
have the same argument?What do we need
Init
for?I've already reconsidered this approach. So:
Nevermind, we don't. We just need to refactor the consturctor
ApplyAttributesPatch
will be applied for originalhdr
,addr
is used byRangeProvider
This one looks almost perfect!
Have refactored the interface. Please, check this out
@ -0,0 +83,4 @@
func (p *patcher) ApplyPatch(ctx context.Context, currPatch *objectSDK.Patch) bool {
if currPatch == nil {
return true
Why do we return
bool
instead oferror
here. We do this in node inobject
services, frankly this was a costly mistake.TBH, I used this approach because I wanted to test
ApplyPatch
. If we expect an error fromApplyPatch
but check this withoutClose
then we also need to expect on which iteration we expect the error.The method is no longer relevant. Let me resolve the comment
@ -0,0 +161,4 @@
rdr := p.rangeProvider.ReadRange(ctx, p.addr, rng)
for {
remain := make([]byte, 1024)
Magic constant. Can we make it a parameter? This
rdr.Read
could translate to the gRPC call.Yes, that makes sense
Could you clarify, please, how is this point related to buffer size?
The number of times you call read is inversely proportional to the buffer size.
Fixed. I have introduced a configurable parameter.
Also fixed the consturctor
@ -0,0 +194,4 @@
rdr := p.rangeProvider.ReadRange(ctx, p.addr, rng)
for {
orig := make([]byte, 1024)
Magic constant.
@ -0,0 +238,4 @@
}
var mergedAttrs []objectSDK.Attribute
for key, value := range attrMap {
I would leave the order of initial attributes as-is.
Iterating over the map is unstable.
Fixed, please, check this out
@ -0,0 +316,4 @@
},
},
patchPayloads: []*objectSDK.PayloadPatch{
nil,
This should be an invalid message, otherwise we could send an infinite stream of
nil
.Note, that this is not
patch := &Patch{}
itself.PayloadPatch
can be nil if patch is targeted to object attributesAs unit-test have been fixed, that's no longer relevant
@ -0,0 +198,4 @@
var n int
n, err = rdr.Read(orig)
if err != nil {
if err == io.EOF {
What is
err == io.EOF
, butn != 0
.It seems you tests this, but I don't see why it works.
Reader
is explicitly allowed to behave like this:From https://pkg.go.dev/io#Reader
Yes, you're right. It's implementation based and I couldn't catch this with
io.Pipe
andbytes.NewReader
. n > 0` should be proccessed anyway. I'll fix thatFixed, check
copyPayload
out, please@ -0,0 +53,4 @@
var _ RangeProvider = (*mockRangeProvider)(nil)
func (m *mockRangeProvider) ReadRange(_ context.Context, addr oid.Address, rng *objectSDK.Range) io.Reader {
This whole function is rather big.
It seems to me it could be reduced to a
bytes.NewReader(m.originalPayload[offset:offset+length])
.Or no?
Okay, I'll fix that to
bytes.NewReader
. I just wanted to check how it's gonna work withio.Pipe
Fixed
7d29c58d7f
to13dc1d3a01
13dc1d3a01
to950b4b7b20
950b4b7b20
toc11741807b
c11741807b
to3b63e58a09
3b63e58a09
tofd75436d21
@ -0,0 +81,4 @@
}
}
func (p *patcher) ApplyAttributesPatch(ctx context.Context, newAttrs []objectSDK.Attribute, replaceAttrs bool) error {
We need to validate that this function is called once (and an appropriate test).
Fixed. Test - added
@ -0,0 +128,4 @@
rng.SetOffset(p.currOffset)
rng.SetLength(p.originalPayloadSize - p.currOffset)
rdr := p.rangeProvider.ReadRange(ctx, rng)
applyPatch
andClose
intersect in this part (non-surprising, given that both can copy data from the original object).What about reusing this code between them? Should be identical.
Fixed, check
copyPayload
out, please@ -0,0 +453,4 @@
require.NoError(t, err)
require.Equal(t, test.patched, patchedObject.Payload())
// attrOfPatchedObject := patchedObject.Attributes()
Oh, my bad
fd75436d21
tod92e7e4f84
@ -0,0 +116,4 @@
p.firstPayloadPatchCall = false
}()
if !p.attrPatchAlreadyApplied {
We can skip
ApplyAttributesPatch
completely if we want.No longer relevant. The reason I wrote this:
#247 (comment)
@ -0,0 +121,4 @@
}
if payloadPatch == nil {
if p.firstPayloadPatchCall {
What does this if achieve?
Nevermind, I wanted to impose on the client the order of
Apply...
invocation but I understood that it only makes things unreasonably compilcated. I've removed this flagd92e7e4f84
to73ff62bbe1
73ff62bbe1
to1a00fb9a05
1a00fb9a05
to4899a8925f
4899a8925f
to48f4d6933c
48f4d6933c
toc000b255ea
@ -0,0 +25,4 @@
// PatchApplier is the interface that provides method to apply header and payload patches.
type PatchApplier interface {
// ApplyAttributesPatch applies the patch only for the object's attributes.
// This method must be ALWAYS invoked first as attribute patch must income ONLY with the first patch
Can we avoid this restriction? It seems unnecessary.
Sure. The comment is outdated, just forgot to remove this line
@ -0,0 +49,4 @@
// ReadRange reads an original object payload by the given range.
// The method returns io.Reader over the data range only. This means if the data is read out,
// then ReadRange has to be invoked to provide reader over the next range.
ReadRange(ctx context.Context, rng *objectSDK.Range) io.Reader
We have
Range
andGetRange
in node, why add the 3rd way to spell this?Renamed to
GetRange
@ -0,0 +119,4 @@
if replaceAttrs {
p.hdr.SetAttributes(newAttrs...)
} else if len(newAttrs) > 0 {
mergedAttrs := mergeAttributes(newAttrs, p.hdr.Attributes())
Does
p.hdr.Attributes()
copy the slice?It can be unexpected for the user if no (patch will change the header we have provided in constructor).
Sorry, didn't get your question
So,
mergeAttrs
equals top.hdr.Attributes()
iflen(newAttrs) == 0
So
Attributes
allocates, then we can reuse the slice, that was the question.@ -0,0 +157,4 @@
// copy remaining originial payload
if err := p.copyRange(ctx, rng); err != nil {
return PatchRes{}, fmt.Errorf("copy payload error: %w", err)
if you wrap it multiple times we will have
patch error: copy payload error: read error:
. The error conveys no meaning and is probably mentioned somewhere else in the logging level.What about removing it, compare to
applyPatch: copy payload: read: ...
Good point, I'm gonna fix this
Fixed
@ -0,0 +219,4 @@
}
func mergeAttributes(newAttrs, oldAttrs []objectSDK.Attribute) []objectSDK.Attribute {
attrMap := make(map[string]string)
We know it has
len(newAttrs)
elements, could prealloc.Fixed
@ -0,0 +556,4 @@
require.NoError(t, err)
require.Equal(t, test.patched, patchedObject.Payload())
newAttrs := append([]objectSDK.Attribute{}, test.patches[0].NewAttributes...)
Does original object has no headers, though?
This is a trick for
require.Equal
becasue it comparesnil
with empty slice and throws the error although logically they are both emptyNo, I mean do we have the test which patches and object with already existing headers and replaces some of them?
The test should check correctness and the order.
Oh, my bad. I have added
Works fine. The order
key2
,key3
is correct. The value is replacedc000b255ea
tofe2c45671c
fe2c45671c
to233c88a526
@ -0,0 +49,4 @@
var attr1, attr2 Attribute
attr1.SetKey("key1")
attr1.SetValue("val1")
attr1.SetKey("key2")
attr2
?Fixed!
@ -0,0 +362,4 @@
},
originalObjectPayload: []byte("0123456789qwertyuiopasdfghjklzxcvbnm"),
patchedPayload: []byte("inserted at the beginning0123456789qwertyuiopasdfghjklzxcvbnm"),
expectedPayloadPatchErr: ErrPayloadPatchIsNil,
Seems it's not expected in this test
That's true, my bad.
This unit-test case shouldn't expect for the error but it passes as testing loop checks
require
only ifApplyPayloadPatch
returns an error. As we don't use getting an error byClose
, the proper way is to check an error on a specific iteration (iterationExpErr int
), but this makes tests hard to read233c88a526
to66e1ef74df
66e1ef74df
to6504e83169