[#170] tar.gz extraction during upload #176

Open
nzinkevich wants to merge 1 commit from nzinkevich/frostfs-http-gw:zip_explode into master
Member
No description provided.
nzinkevich self-assigned this 2024-12-06 12:04:23 +00:00
nzinkevich added 3 commits 2024-12-06 12:04:24 +00:00
During upload, if the "X-Attribute-Explode-Archive" header is set, the gateway tries to read a tar.gz archive and creates an object for each file. Each object gets a FilePath attribute calculated relative to the archive root.

Signed-off-by: Nikita Zinkevich <n.zinkevich@yadro.com>
Split the DownloadZipped handler into methods. Add a DownloadTar handler for downloading tar.gz archives. Make the methods generic enough to be used by both implementations.

Signed-off-by: Nikita Zinkevich <n.zinkevich@yadro.com>
[#170] Updated docs and configuration of archive section
Some checks failed
/ DCO (pull_request) Failing after 1m53s
/ Vulncheck (pull_request) Successful in 2m54s
/ Builds (pull_request) Successful in 3m1s
/ Lint (pull_request) Successful in 4m25s
/ Tests (pull_request) Successful in 3m3s
0b16911a06
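For readers following the review, the extraction behavior described in the first commit message can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `extracted`, `explodeTarGz`, and `sampleTarGz` are hypothetical names, and deriving `FilePath` with `path.Clean` on the tar header name is an assumption about how "relative to the archive root" might be computed. The sketch collects entries in memory instead of uploading them.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
	"path"
)

// extracted is a hypothetical stand-in for an object created from one archive entry.
type extracted struct {
	FilePath string // assumed to be derived from tar.Header.Name
	Payload  []byte
}

// explodeTarGz reads a gzip-compressed tar stream and returns one entry per
// regular file. Directories and other special entries are skipped.
func explodeTarGz(r io.Reader) ([]extracted, error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return nil, fmt.Errorf("create gzip reader: %w", err)
	}
	defer gz.Close()

	var out []extracted
	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return out, nil // end of archive
		}
		if err != nil {
			return nil, fmt.Errorf("read tar header: %w", err)
		}
		if hdr.Typeflag != tar.TypeReg {
			continue // not a regular file
		}
		payload, err := io.ReadAll(tr)
		if err != nil {
			return nil, fmt.Errorf("read file from tar: %w", err)
		}
		out = append(out, extracted{FilePath: path.Clean(hdr.Name), Payload: payload})
	}
}

// sampleTarGz builds a small in-memory tar.gz with one directory and two files.
func sampleTarGz() *bytes.Buffer {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	_ = tw.WriteHeader(&tar.Header{Name: "dir/", Typeflag: tar.TypeDir, Mode: 0o755})
	for _, f := range []struct{ name, body string }{
		{"dir/a.txt", "alpha"},
		{"dir/sub/b.txt", "beta"},
	} {
		_ = tw.WriteHeader(&tar.Header{Name: f.name, Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(f.body))})
		_, _ = tw.Write([]byte(f.body))
	}
	_ = tw.Close()
	_ = gz.Close()
	return &buf
}

func main() {
	files, err := explodeTarGz(sampleTarGz())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range files {
		fmt.Printf("%s (%d bytes)\n", f.FilePath, len(f.Payload))
	}
}
```

A real handler would presumably stream each entry to storage rather than buffer it, and would need to decide how to treat names containing `..`; neither concern is covered here.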
nzinkevich requested review from dkirillov 2024-12-06 12:04:24 +00:00
nzinkevich requested review from alexvanin 2024-12-06 12:04:24 +00:00
nzinkevich requested review from storage-services-committers 2024-12-06 12:05:19 +00:00
nzinkevich requested review from storage-services-developers 2024-12-06 12:05:23 +00:00
nzinkevich force-pushed zip_explode from 0b16911a06 to b00a4b4847 2024-12-06 12:06:15 +00:00 Compare
nzinkevich force-pushed zip_explode from b00a4b4847 to 240bf7bc72 2024-12-06 12:09:23 +00:00 Compare
nzinkevich changed title from [#170] GZIP tar downloading and uploading to [#170] tar.gz downloading and uploading 2024-12-06 12:09:55 +00:00
nzinkevich force-pushed zip_explode from 240bf7bc72 to 973373685e 2024-12-06 12:14:48 +00:00 Compare
nzinkevich changed title from [#170] tar.gz downloading and uploading to [#170] Archive extraction during upload 2024-12-06 12:17:25 +00:00
nzinkevich changed title from [#170] Archive extraction during upload to [#170] tar.gz extraction during upload 2024-12-06 12:17:35 +00:00
nzinkevich changed title from [#170] tar.gz extraction during upload to WIP: [#170] tar.gz extraction during upload 2024-12-06 12:29:36 +00:00
nzinkevich force-pushed zip_explode from 973373685e to 69ad977577 2024-12-06 12:38:54 +00:00 Compare
nzinkevich changed title from WIP: [#170] tar.gz extraction during upload to [#170] tar.gz extraction during upload 2024-12-06 12:39:35 +00:00
nzinkevich force-pushed zip_explode from 69ad977577 to 82271223ae 2024-12-10 06:34:43 +00:00 Compare
dkirillov reviewed 2024-12-11 09:52:21 +00:00
@@ -24,1 +28,3 @@
drainBufSize = 4096
jsonHeader = "application/json; charset=UTF-8"
drainBufSize = 4096
explodeArchiveHeader = "Explode-Archive"
Member

Header should be `X-Explode-Archive`
@@ -73,3 +75,3 @@
log.Debug(
logs.CloseTemporaryMultipartFormFile,
zap.Stringer("address", addr),
zap.Stringer("container", bktInfo.CID),
Member

As far as I remember, this log was meant to print the address of uploaded objects. It's better to keep that behavior, perhaps with a different approach.
Member

Also, I don't think we should add the `Explode-Archive` attribute to the resulting objects:

```golang
c.Request.Header.Peek(utils.UserAttributeHeaderPrefix + explodeArchiveHeader)
```
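One way to read the two comments above together: if the control header lives outside the attribute prefix, it never ends up among the object attributes. The sketch below illustrates that idea with plain maps instead of fasthttp. `splitHeaders` is a hypothetical helper, `X-Attribute-` is the assumed value of `utils.UserAttributeHeaderPrefix` (inferred from the commit message), `X-Explode-Archive` is the header name suggested in this review, and treating mere presence of the header as "explode" is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// Assumed value of utils.UserAttributeHeaderPrefix.
	attributePrefix = "X-Attribute-"
	// Control header name as suggested in the review; not an object attribute.
	explodeHeader = "X-Explode-Archive"
)

// splitHeaders separates user attributes from the explode control header.
// Headers that are neither attributes nor the control header are ignored.
func splitHeaders(headers map[string]string) (attrs map[string]string, explode bool) {
	attrs = make(map[string]string)
	for name, value := range headers {
		switch {
		case strings.EqualFold(name, explodeHeader):
			explode = true // presence alone enables extraction (assumption)
		case len(name) > len(attributePrefix) &&
			strings.EqualFold(name[:len(attributePrefix)], attributePrefix):
			attrs[name[len(attributePrefix):]] = value
		}
	}
	return attrs, explode
}

func main() {
	attrs, explode := splitHeaders(map[string]string{
		"X-Attribute-FileName": "site.tar.gz",
		"X-Explode-Archive":    "true",
		"Content-Type":         "application/gzip",
	})
	fmt.Println("explode:", explode)
	fmt.Println("attributes:", attrs)
}
```

With this split, the extraction flag controls gateway behavior only and the resulting objects carry just the user-supplied attributes.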
@@ -88,0 +121,4 @@
return
}
c.Response.Header.SetContentType(jsonHeader)
Member

Although this won't affect behavior, can we set the header before setting the body?
@@ -151,0 +198,4 @@
// explodeGzip reads files from a tar.gz archive and creates an object for each of them.
// Sets FilePath attribute with name from tar.Header.
func (h *Handler) explodeGzip(c *fasthttp.RequestCtx, log *zap.Logger, bktInfo *data.BucketInfo, file io.Reader) {
gzipReader, err := gzip.NewReader(file)
Member

Why do we require the archive to be gzipped? I assumed we could upload an archive created with `tar -cf dir.tar.gz dir`, but it doesn't work:

```
create gzip reader: gzip: invalid header
```
Member

Oh, it seems I used the wrong command. I should have used the `-z` flag.

But should we support tar archives that are not gzipped?
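If plain tar support is wanted, one possible approach is to peek at the first two bytes of the stream and only wrap it in a gzip reader when they match the gzip magic number (`0x1f 0x8b`). The sketch below is an illustration under that assumption, not part of this PR: `newTarReader`, `firstName`, and `sampleTar` are hypothetical names. It relies on a plain tar stream not starting with those two bytes, which holds unless the first entry name begins with them.

```go
package main

import (
	"archive/tar"
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// newTarReader accepts either a gzip-compressed or a plain tar stream.
// It peeks at the first two bytes without consuming them.
func newTarReader(r io.Reader) (*tar.Reader, error) {
	br := bufio.NewReader(r)
	magic, err := br.Peek(2)
	if err != nil {
		return nil, fmt.Errorf("peek archive header: %w", err)
	}
	if magic[0] == 0x1f && magic[1] == 0x8b { // gzip magic number
		gz, err := gzip.NewReader(br)
		if err != nil {
			return nil, fmt.Errorf("create gzip reader: %w", err)
		}
		return tar.NewReader(gz), nil
	}
	return tar.NewReader(br), nil
}

// firstName returns the name of the first entry in the archive.
func firstName(r io.Reader) (string, error) {
	tr, err := newTarReader(r)
	if err != nil {
		return "", err
	}
	hdr, err := tr.Next()
	if err != nil {
		return "", fmt.Errorf("read tar header: %w", err)
	}
	return hdr.Name, nil
}

// sampleTar builds a one-file archive in memory, optionally gzip-compressed.
func sampleTar(compress bool) *bytes.Buffer {
	var buf bytes.Buffer
	var w io.Writer = &buf
	var gz *gzip.Writer
	if compress {
		gz = gzip.NewWriter(&buf)
		w = gz
	}
	tw := tar.NewWriter(w)
	body := "alpha"
	_ = tw.WriteHeader(&tar.Header{Name: "dir/a.txt", Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(body))})
	_, _ = tw.Write([]byte(body))
	_ = tw.Close()
	if gz != nil {
		_ = gz.Close()
	}
	return &buf
}

func main() {
	for _, compressed := range []bool{true, false} {
		name, err := firstName(sampleTar(compressed))
		fmt.Println("compressed:", compressed, "first entry:", name, "err:", err)
	}
}
```

The trade-off is an extra buffered reader in front of the upload stream, in exchange for not rejecting archives created with `tar -cf`.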
@@ -175,0 +229,4 @@
if err != nil {
return
}
log.Debug(logs.ObjectUploaded, zap.String("object ID", idObj.EncodeToString()))
Member

Let's also log the object name.
@@ -30,0 +32,4 @@
FailedToReadFileFromTar = "failed to read file from tar" // Error in ../../uploader/upload.go
FailedToFilterHeaders = "failed to filter headers" // Error in ../../uploader/upload.go
FailedToUploadObject = "failed to upload object" // Error in ../../uploader/upload.go
ObjectUploaded = "object uploaded" // Debug in ../../uploader/upload.go
Member

It's better not to add comments at the end of lines.
alexvanin added this to the v0.33.0 milestone 2024-12-12 11:57:20 +00:00
All checks were successful
/ DCO (pull_request) Successful in 1m19s (required)
/ Vulncheck (pull_request) Successful in 1m55s (required)
/ Builds (pull_request) Successful in 2m10s (required)
/ Lint (pull_request) Successful in 3m31s (required)
/ Tests (pull_request) Successful in 1m56s (required)
This pull request doesn't have enough approvals yet. 0 of 2 approvals granted.
This branch is out-of-date with the base branch

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u zip_explode:nzinkevich-zip_explode
git checkout nzinkevich-zip_explode
Reference: TrueCloudLab/frostfs-http-gw#176