[#63] poc: Fast multipart upload #157
Please note:
The hash of the resulting object doesn't match the real payload hash the user may expect (though the ETag for a multipart upload is not the real hash of the payload anyway).
When we copy an object that was created by multipart upload and its overall size is less than 5 GB (if it is larger, we return an error, since in that case the user must do the copying via multipart upload, see the note in CopyObject), the new copied object gets the real payload rather than a list of parts.
Take a look at the test. If copying was performed with the REPLACE metadata directive, the resulting object loses the info about its parts, and we will not see this info in the ObjectAttributes response. Actually, I'm not sure if that's OK. I'm fine with always preserving this info (especially taking into account that for really big objects we have to do the copying via multipart upload, so the resulting object will have some parts info anyway).
There is no special locking procedure for a combined object. That's probably OK, because we don't fully support locks, and if we lock the combined object without its parts, a user of the S3 protocol will not be able to bypass this lock (they have access only to the combined object and not to its parts). But if something happens at the storage level and the parts are gone (e.g. because of expiration, manual removal via gRPC, etc.), it will be really sad.
Signed-off-by: Denis Kirillov d.kirillov@yadro.com
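For illustration, here is a minimal sketch of the "list of parts" idea these notes refer to, assuming a JSON encoding and invented names (the PR's actual format is not shown here):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// partInfo points at one uploaded part that remains a separate storage
// object; field names and the JSON encoding are assumptions for illustration.
type partInfo struct {
	OID  string `json:"oid"`  // storage ID of the part object
	Size uint64 `json:"size"` // payload size of that part
}

func main() {
	// A combined object completed from two parts would carry this small
	// description as its payload instead of the concatenated 8mb of data.
	parts := []partInfo{
		{OID: "part-1-oid", Size: 5 << 20},
		{OID: "part-2-oid", Size: 3 << 20},
	}
	payload, err := json.Marshal(parts)
	if err != nil {
		panic(err)
	}
	fmt.Printf("combined object payload (%d bytes): %s\n", len(payload), payload)
}
```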
Multipart upload comparison (result figures omitted):
- part size: 5mb, max object size: 64mb (local and bare metal)
- part size: 16mb, max object size: 64mb (local and bare metal)
- part size: 64mb, max object size: 64mb (local and bare metal)
@dkirillov thanks for the comparison! A couple more questions:
A write comparison is going to be done as well. Every part has size 5mb (except the last).
Yes, it's a read comparison between two multipart-uploaded objects.
There is no significant downgrade in read operations for regularly uploaded objects, because we check whether an object should be treated specially by its header (which we always handle anyway).
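A minimal sketch of that header check, with an assumed attribute name (the PR's real key may differ); since the gateway already fetches object attributes on every read, regular objects take the usual path with no extra request:

```go
package layer

// combinedPartsAttr is an assumed name for the attribute that marks a
// combined (multipart-completed) object.
const combinedPartsAttr = "S3-Combined-Parts"

// isCombinedObject inspects attributes the gateway already has, so the
// special read path is entered only for combined objects.
func isCombinedObject(attrs map[string]string) bool {
	_, ok := attrs[combinedPartsAttr]
	return ok
}
```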
@@ -0,0 +20,4 @@
	layer *layer
	off, ln uint64
Some comments would be appreciated. I guess it's the offset within the complete object and the total size of the complete object.
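If that guess is right, the fields could be documented like this (a suggestion only; `multiObjectReader` is an assumed name for the type in this hunk):

```go
type multiObjectReader struct {
	layer *layer

	// off is the offset within the complete (combined) object at which
	// reading starts; ln is the total size of the complete object.
	off, ln uint64
}
```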
@@ -0,0 +44,4 @@
for x.off != 0 {
	if x.parts[0].Size < x.off {
		x.parts = x.parts[1:]
		x.off -= x.parts[0].Size
What if `len(x.parts) == 1`? Then after `x.parts[1:]` the slice is empty, and a panic happens on `x.parts[0].Size`. Is that valid? Also, is it possible for `x.off` to go negative here?
In the current implementation we check the range in the handler and when initializing the read from FrostFS, so we cannot get such an invalid value here. But yes, theoretically a panic can happen. I'll add a test for that.
No, we have the check `if x.parts[0].Size < x.off` above.
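A sketch of the promised test, with `multiObjectReader` reduced to a hypothetical stand-in (only the skip loop mirrors the quoted hunk):

```go
package layer

import "testing"

type partInfo struct{ Size uint64 }

// multiObjectReader is a stand-in carrying just enough state for the loop.
type multiObjectReader struct {
	parts []partInfo
	off   uint64
}

// skip repeats the loop from the hunk above.
func (x *multiObjectReader) skip() {
	for x.off != 0 {
		if x.parts[0].Size < x.off {
			x.parts = x.parts[1:]
			// after the re-slicing above, parts may already be empty and
			// this indexing panics
			x.off -= x.parts[0].Size
		}
	}
}

// The offset exceeds the total size of all parts, so the loop drains the
// slice and hits the out-of-range access.
func TestSkipPanicsPastLastPart(t *testing.T) {
	defer func() {
		if recover() == nil {
			t.Fatal("expected a panic when the offset exceeds the total parts size")
		}
	}()
	x := &multiObjectReader{parts: []partInfo{{Size: 5}}, off: 10}
	x.skip()
}
```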
@@ -0,0 +74,4 @@
	x.parts = x.parts[1:]
	next, err := x.Read(p[n:])
Should it be recursive, and not `x.curReader.Read`? What is the max recursion depth here? As far as I can see, the max depth is one, because `x.curReader` is set above and we expect an early return from the recursive function.
It seems it should be recursive. We have to handle the case when we have finished reading one part and must start another.
The max depth is the number of parts (if `p` is large enough to contain all the parts' payload at once).
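To make that depth bound concrete, here is a self-contained sketch with the same recursive shape, using `bytes.Reader` as a stand-in for reading a part from storage (all names are illustrative, not the PR's):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// partsReader streams the concatenation of several parts.
type partsReader struct {
	parts     [][]byte
	curReader io.Reader
}

// Read drains the current part; on EOF it opens the next part and recurses
// on the remaining buffer, so each frame consumes one part and the recursion
// depth is at most the number of parts (reached when p spans all of them).
func (x *partsReader) Read(p []byte) (n int, err error) {
	if x.curReader != nil {
		n, err = x.curReader.Read(p)
		if err != io.EOF {
			return n, err
		}
	}
	if len(x.parts) == 0 {
		return n, io.EOF
	}
	x.curReader = bytes.NewReader(x.parts[0])
	x.parts = x.parts[1:]
	next, err := x.Read(p[n:])
	return n + next, err
}

func main() {
	r := &partsReader{parts: [][]byte{[]byte("hello "), []byte("world")}}
	out, _ := io.ReadAll(r)
	fmt.Println(string(out)) // prints "hello world"
}
```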
The multipart Ceph test results are the same with this PR.
The test cases also passed.