Commit graph

90 commits

Author SHA1 Message Date
Milos Gajdos
7ba91015f5
fix: remove disabling of multipart combine small parts
This reverts https://github.com/distribution/distribution/pull/3556

This feature is currently broken and requires more fundamental changes
in the S3 driver. Until then it's better to remove it.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-12-18 09:52:19 +00:00
Milos Gajdos
3f3e61e299
fix: update incorrect godoc comment for (writer).Writer()
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-12-13 14:56:06 +00:00
Milos Gajdos
4baddbc608
fix: update S3 storage driver writer
This commit updates (writer).Writer() method in S3 storage driver to
handle the case where an append is attempted to a zer-size content.

S3 does not allow appending to already committed content, so we are
optiing to provide the following case as a narrowed down behaviour:
Writer can only append to zero byte content - in that case, a new S3
MultipartUpload is created that will be used for overriding the already
committed zero size content.

Appending to non-zero size content fails with error.

Co-authored-by: Cory Snider <corhere@gmail.com>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-12-13 09:22:48 +00:00
Milos Gajdos
bd0e476910
Hide our misuses of contexts from the public interface (#4128) 2023-11-03 05:05:19 +00:00
Milos Gajdos
1d7526dea0
cleanup: make chunk sizes easier to understand and change writer append (#4139) 2023-10-31 19:47:06 +00:00
Cory Snider
b45b6d18b8 storage/driver: plumb contexts into factories
...and driver constructors when applicable.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-10-27 17:48:57 -04:00
Cory Snider
f089932de0 storage/driver: replace URLFor method
Several storage drivers and storage middlewares need to introspect the
client HTTP request in order to construct content-redirect URLs. The
request is indirectly passed into the driver interface method URLFor()
through the context argument, which is bad practice. The request should
be passed in as an explicit argument as the method is only called from
request handlers.

Replace the URLFor() method with a RedirectURL() method which takes an
HTTP request as a parameter instead of a context. Drop the options
argument from URLFor() as in practice it only ever encoded the request
method, which can now be fetched directly from the request. No URLFor()
callers ever passed in an "expiry" option, either.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-10-27 10:58:37 -04:00
Cory Snider
d0f5aa670b Move context package internal
Our context package predates the establishment of current best practices
regarding context usage and it shows. It encourages bad practices such
as using contexts to propagate non-request-scoped values like the
application version and using string-typed keys for context values. Move
the package internal to remove it from the API surface of
distribution/v3@v3.0.0 so we are free to iterate on it without being
constrained by compatibility.

Signed-off-by: Cory Snider <csnider@mirantis.com>
2023-10-27 10:58:37 -04:00
Milos Gajdos
852de2c2bb
cleanup: make chunk sizes easier to understand and change writer append
This commit make the S3 driver chunk size constants more straightforward
to understand -- instead of remembering the bit shifts we make this more
explicit.

We are also updating append parameter to the `(writer).Write` to follow
the new convention we are trying to establish.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-10-27 10:57:54 +01:00
Milos Gajdos
cb0d083d8d
feat: Add context to storagedriver.(Filewriter).Commit()
This commit changes storagedriver.Filewriter interface
by adding context.Context as an argument to its Commit
func.

We pass the context appropriately where need be throughout
the distribution codebase to all the writers and tests.

S3 driver writer unfortunately must maintain the context
passed down to it from upstream so it contnues to
implement io.Writer and io.Closer interfaces which do not
allow accepting the context in any of their funcs.

Co-authored-by: Cory Snider <corhere@gmail.com>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-10-19 11:27:27 +01:00
Milos Gajdos
a70964c2fc
Merge pull request #4076 from flavianmissi/s3-loglevel
registry: add loglevel support for aws s3 storage driver
2023-10-04 14:13:15 +01:00
Flavian Missi
3df7e28f44 registry: add loglevel support for aws s3 storage driver
based on the work from
https://github.com/distribution/distribution/pull/3057.

Co-authored-by: Simon Compston <compston@gmail.com>
Signed-off-by: Flavian Missi <fmissi@redhat.com>
2023-10-02 15:47:02 +02:00
Milos Gajdos
4fce3c0028
Move completedParts type back to the original position
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-09-28 15:58:02 +01:00
Milos Gajdos
b888b14b39
Optimise push in S3 driver
This commit cleans up and attempts to optimise the performance of image push in S3 driver.
There are 2 main changes:
* we refactor the S3 driver Writer where instead of using separate bytes
  slices for ready and pending parts which get constantly appended data
  into them causing unnecessary allocations we use optimised bytes
  buffers; we make sure these are used efficiently when written to.
* we introduce a memory pool that is used for allocating the byte
  buffers introduced above

These changes should alleviate high memory pressure on the push path to S3.

Co-authored-by: Cory Snider <corhere@gmail.com>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-09-27 21:33:22 +01:00
Milos Gajdos
9790bc806c
Merge pull request #4037 from milosgajdos/enable-prealloc
Enable prealloc linter
2023-09-04 16:57:29 +01:00
Milos Gajdos
59fd8656ac
Enable prealloc linter
This will give us nice little performance gains in some code paths.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-09-03 22:41:51 +01:00
Milos Gajdos
dcdd8bb740
Propagate storage driver context to S3 API calls
Only some of the S3 storage driver calls were propagating context to the
S3 API calls. This commit updates the S3 storage drivers so the context
is propagated to all the S3 API calls.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-09-03 21:54:54 +01:00
James Hewitt
e22f7cbc73
Pass the last paging flag to storage drivers
Storage drivers may be able to take advantage of the hint to start
their walk more efficiently.

For S3: The API takes a start-after parameter. Registries with many
repositories can drastically reduce calls to s3 by telling s3 to only
list results lexographically after the last parameter.

For the fallback: We can start deeper in the tree and avoid statting
the files and directories before the hint in a walk. For a filesystem
this improves performance a little, but many of the API based drivers
are currently treated like a filesystem, so this drastically improves
the performance of GCP and Azure blob.

Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
2023-08-29 11:27:42 +01:00
Milos Gajdos
59dd684cc8
Merge pull request #3713 from Jamstah/s3-tests 2023-08-21 13:48:43 +01:00
Sebastiaan van Stijn
5b3be39870
s3: add interface assertion
This was added for the other drivers in 6b388b1ba6,
but it missed the s3 storage driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2023-08-21 13:54:13 +02:00
James Hewitt
7622d0a453
Don't return the from of a walk
Other storage drivers will only return children and below, s3 should do
the same. The only reason it was returning was because of the addition
of a / to ensure we treat the from as a directory.

Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
2023-08-15 16:26:37 +01:00
Lucas França de Oliveira
035a8ec52a
Fix panic in the s3 backend walk logic
Signed-off-by: Lucas França de Oliveira <lucasfdo@palantir.com>
2023-05-25 14:56:05 -07:00
Milos Gajdos
2fb8dbdeca
Merge pull request #3839 from kirat-singh/feature.azure-sdk-update
Update Azure SDK and support additional authentication schemes
2023-04-25 19:35:34 +01:00
Kirat Singh
ba4a6bbe02 Update Azure SDK and support additional authentication schemes
Microsoft has updated the golang Azure SDK significantly.  Update the
azure storage driver to use the new SDK.  Add support for client
secret and MSI authentication schemes in addition to shared key
authentication.

Implement rootDirectory support for the azure storage driver to mirror
the S3 driver.

Signed-off-by: Kirat Singh <kirat.singh@beacon.io>

Co-authored-by: Cory Snider <corhere@gmail.com>
2023-04-25 17:23:20 +00:00
Milos Gajdos
0c958010ac
Merge pull request #3763 from distribution/multipart-upload-empty-files
Enable pushing empty blobs
2023-03-27 10:18:44 +01:00
Milos Gajdos
5fa926a609
Enable pushing empty blobs
This is an edge case when we are trying to upload an empty chunk of data using
a MultiPart upload. As a result we are trying to complete the MultipartUpload
with an empty slice of `completedUploadedParts` which will always lead to 400
being returned from S3 See: https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#CompletedMultipartUpload
Solution: we upload an empty i.e. 0 byte part as a single part and then append it
to the completedUploadedParts slice used to complete the Multipart upload.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2023-03-27 10:11:07 +01:00
Aaron Lehmann
2074688be9 Fix S3 multipart upload pagination loop condition
The loop that iterates over paginated lists of S3 multipart upload parts
appears to be using the wrong variable in its loop condition. Nothing
inside the loop affects the value of `resp.IsTruncated`, so this loop
will either be wrongly skipped or loop forever.

It looks like this is a regression caused by commit
7736319f2e. The return value of
`ListMultipartUploads` used to be assigned to a variable named `resp`,
but it was renamed to `partsList` without updating the for loop
condition.

I believe this is causing an error we're seeing with large layer uploads
at commit time:

    upload resumed at wrong offset: 5242880000 != 5815706782

Missing parts of the multipart S3 upload would cause an incorrect size
calculation in `newWriter`.

Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
2023-02-21 20:57:50 -08:00
Kirat Singh
3117e2eb2f
Use default http.Transport for AWS S3 session
Previously we used a custom Transport in order to modify the user agent header.
This prevented the AWS SDK from being able to customize SSL and other client TLS
parameters since it could not understand the Transport type.

Instead we can simply use the SDK function MakeAddToUserAgentFreeFormHandler to
customize the UserAgent if necessary and leave all the TLS configuration to the
AWS SDK.

The only exception being SkipVerify which we have to handle, but we can set it
onto the standard http.Transport which does not interfere with the SDKs ability
to set other options.

Signed-off-by: Kirat Singh <kirat.singh@gmail.com>
2023-02-15 13:37:01 -05:00
Sebastiaan van Stijn
f8b3af78fc
replace deprecated io/ioutil
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-11-04 23:47:52 +01:00
Hayley Swimelar
e3509fc1de
Merge pull request #3635 from milosgajdos/make-s3-driver-delete-faster
Delete S3 keys incrementally in batches
2022-11-04 16:56:41 +01:00
Hayley Swimelar
52d948a9f5
Merge pull request #3766 from thaJeztah/gofumpt
format code with gofumpt
2022-11-04 12:19:53 +01:00
Sebastiaan van Stijn
e0281dc609
format code with gofumpt
gofumpt (https://github.com/mvdan/gofumpt) provides a supserset of `gofmt` / `go fmt`,
and addresses various formatting issues that linters may be checking for.

We can consider enabling the `gofumpt` linter to verify the formatting in CI, although
not every developer may have it installed, so for now this runs it once to get formatting
in shape.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-11-03 22:48:20 +01:00
Sebastiaan van Stijn
f9ccd2c6ea
use http consts for request methods
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2022-11-02 23:31:47 +01:00
Milos Gajdos
ebc4234fd5
Delete S3 keys incrementally in batches
Instead of first collecting all keys and then batch deleting them,
we will do the incremental delete _online_ per max allowed batch.
Doing this prevents frequent allocations for large S3 keyspaces
and OOM-kills that might happen as a result of those.

This commit introduces storagedriver.Errors type that allows to return
multierrors as a single error from any storage driver implementation.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2022-10-30 19:10:24 +00:00
João Pereira
a7fc49b067
Merge pull request #3622 from ddelange/patch-1
Support all S3 instant retrieval storage classes
2022-04-26 10:23:14 +01:00
Milos Gajdos
27b5563245
Merge pull request #3624 from milosgajdos/aws-s3-listv2
Update s3 ListObjects to V2 API
2022-04-22 13:34:13 +01:00
duanhongyi
15de9e21ba Add forcepathstyle parameter for s3
Signed-off-by: duanhongyi <duanhongyi@doopai.com>
2022-04-20 08:44:12 +08:00
Milos Gajdos
8eab5d1bd6
Update s3 ListObjects to V2 API
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2022-04-09 12:16:46 +01:00
Simone Locci
80952c9e2b
Rename s3accelerate parameter to accelerate
Signed-off-by: Simone Locci <simonelocci88@gmail.com>
2022-04-04 19:35:21 +02:00
Simone Locci
ea27621d4a
Fix review
Signed-off-by: Simone Locci <simonelocci88@gmail.com>
2022-04-04 19:35:09 +02:00
Kirat Singh
51c0c8148a
Add new parameter s3accelerate to S3 storage driver.
If s3accelerate is set to true then we turn on S3 Transfer
Acceleration via the AWS SDK.  It defaults to false since this is an
opt-in feature on the S3 bucket.

Signed-off-by: Kirat Singh <kirat.singh@wsq.io>
Signed-off-by: Simone Locci <simonelocci88@gmail.com>
2022-04-04 19:34:57 +02:00
ddelange
966fae5463
Add tests for all supported storage classes
Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>
2022-04-04 10:54:18 +02:00
ddelange
fb937deabf
Support all S3 instant retrieval storage classes
Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>
2022-04-01 11:55:35 +02:00
João Pereira
514cbd71be
Merge pull request #3519 from jtherin/mpu-paginate
fix: paginate through s3 multipart uploads
2022-03-11 16:06:46 +00:00
libo.huang
117757a5cb feat: add option to disable combining the pending part
Signed-off-by: Libo Huang <huanglibo2010@gmail.com>
2022-01-07 18:20:31 +08:00
Adam Kaplan
e2caaf9cba Add dualstack option to S3 storage driver
Allow the storage driver to optionally use AWS SDK's dualstack mode.
This allows the registry to communicate with S3 in IPv6 environments.

Signed-off-by: Adam Kaplan <adam.kaplan@redhat.com>
2022-01-04 17:19:05 -05:00
Jeremy THERIN
7736319f2e fix: paginate through s3 multipart uploads
Signed-off-by: Jeremy THERIN <jtherin@scaleway.com>
2021-10-26 17:55:03 +02:00
Collin Shoop
cf81f67a16 storagedriver/s3: Optimized Walk implementation + bugfix
Optimized S3 Walk impl by no longer listing files recursively. Overall gives a huge performance increase both in terms of runtime and S3 calls (up to ~500x).

Fixed a bug in WalkFallback where ErrSkipDir for was not handled as documented for non-directory.

Signed-off-by: Collin Shoop <cshoop@digitalocean.com>
2021-08-16 16:07:25 -04:00
Collin Shoop
03f9eb3a18 storagedriver/s3: Fixed a Delete noop edgecase
Delete was not working when the subpath immediately followed the given path started with an ascii lower than "/" such as dash "-" and underscore "_" and requests no files to be deleted.

(cherry picked from commit 5d8fa0ce94b68cce70237805db92cdd8d40de282)
Signed-off-by: Collin Shoop <cshoop@digitalocean.com>
2021-08-12 11:56:13 -04:00
Derek McGowan
a01c71e247
Merge pull request #2815 from bainsy88/issue_2814
Add code to handle pagination of parts. Fixes max layer size of 10GB bug
2021-03-16 09:12:03 -07:00