TrueCloudLab/distribution

Author	SHA1	Message	Date
Milos Gajdos	a18cc8a656	S3 driver: Attempt HeadObject on Stat first, fail over to List Stat always calls ListObjects when stat-ing S3 key. Unfortunately ListObjects is not a free call - both in terms of egress and actual AWS costs (likely because of the egress). This changes the behaviour of Stat such that we always attempt the HeadObject call first and only ever fall through to ListObjects if the HeadObject returns an AWS API error. Note that the official docs mention that the only error returned by HEAD is NoSuchKey; experiments show that this is demonstrably wrong and the AWS docs are simply outdated at the time of this commit. HeadObject actually returns the following errors: * NotFound: if the queried key does not exist * NotFound: if the queried key contains subkeys i.e. it's a prefix * BucketRegionError: if the bucket does not exist * Forbidden: if Head operation is not allowed via IAM/ACLs Co-authored-by: Cory Snider <corhere@gmail.com> Co-authored-by: Sebastiaan van Stijn <github@gone.nl> Signed-off-by: Sebastiaan van Stijn <github@gone.nl> Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2024-07-17 10:16:54 +01:00
Benjamin Schanzel	8654a0ee45	Allow setting s3 forcepathstyle without regionendpoint Currently, the `forcepathstyle` parameter for the s3 storage driver is considered only if the `regionendpoint` parameter is set. Since setting a region endpoint explicitly is discouraged with AWS s3, it is not clear how to enforce path style URLs with AWS s3. This also means, that the default value (true) only applies if a region endpoint is configured. This change makes sure we always forward the `forcepathstyle` parameter to the aws-sdk if present in the config. This is a breaking change where a `regionendpoint` is configured but no explicit `forcepathstyle` value is set. Signed-off-by: Benjamin Schanzel <benjamin.schanzel@bmw.de>	2024-04-08 12:45:26 +02:00
Eng Zer Jun	41161a6e12	refactor(storage/s3): remove redundant len check Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2024-01-17 18:27:05 +08:00
Wang Yan	4a360f9da2	fix: remove disabling of multipart combine small parts (#4193)	2023-12-19 16:10:19 +08:00
Milos Gajdos	7ba91015f5	fix: remove disabling of multipart combine small parts This reverts https://github.com/distribution/distribution/pull/3556 This feature is currently broken and requires more fundamental changes in the S3 driver. Until then it's better to remove it. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-12-18 09:52:19 +00:00
Milos Gajdos	8fa7a81cb2	fix: use http.DefaultTransport in S3 client Unfortunately one of the changes we merged in broke the support for http.ProxyFromEnvironment https://pkg.go.dev/net/http#ProxyFromEnvironment This commit attempts to fix that by cloning the http.DefaultTransport and updating it accordingly. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-12-15 09:34:06 +00:00
Milos Gajdos	3f3e61e299	fix: update incorrect godoc comment for (writer).Writer() Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-12-13 14:56:06 +00:00
Milos Gajdos	4baddbc608	fix: update S3 storage driver writer This commit updates (writer).Writer() method in S3 storage driver to handle the case where an append is attempted to a zero-size content. S3 does not allow appending to already committed content, so we are opting to provide the following case as a narrowed down behaviour: Writer can only append to zero byte content - in that case, a new S3 MultipartUpload is created that will be used for overriding the already committed zero size content. Appending to non-zero size content fails with error. Co-authored-by: Cory Snider <corhere@gmail.com> Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-12-13 09:22:48 +00:00
Milos Gajdos	bd0e476910	Hide our misuses of contexts from the public interface (#4128)	2023-11-03 05:05:19 +00:00
Milos Gajdos	1d7526dea0	cleanup: make chunk sizes easier to understand and change writer append (#4139)	2023-10-31 19:47:06 +00:00
Cory Snider	b45b6d18b8	storage/driver: plumb contexts into factories ...and driver constructors when applicable. Signed-off-by: Cory Snider <csnider@mirantis.com>	2023-10-27 17:48:57 -04:00
Cory Snider	f089932de0	storage/driver: replace URLFor method Several storage drivers and storage middlewares need to introspect the client HTTP request in order to construct content-redirect URLs. The request is indirectly passed into the driver interface method URLFor() through the context argument, which is bad practice. The request should be passed in as an explicit argument as the method is only called from request handlers. Replace the URLFor() method with a RedirectURL() method which takes an HTTP request as a parameter instead of a context. Drop the options argument from URLFor() as in practice it only ever encoded the request method, which can now be fetched directly from the request. No URLFor() callers ever passed in an "expiry" option, either. Signed-off-by: Cory Snider <csnider@mirantis.com>	2023-10-27 10:58:37 -04:00
Cory Snider	d0f5aa670b	Move context package internal Our context package predates the establishment of current best practices regarding context usage and it shows. It encourages bad practices such as using contexts to propagate non-request-scoped values like the application version and using string-typed keys for context values. Move the package internal to remove it from the API surface of distribution/v3@v3.0.0 so we are free to iterate on it without being constrained by compatibility. Signed-off-by: Cory Snider <csnider@mirantis.com>	2023-10-27 10:58:37 -04:00
Milos Gajdos	852de2c2bb	cleanup: make chunk sizes easier to understand and change writer append This commit makes the S3 driver chunk size constants more straightforward to understand -- instead of remembering the bit shifts we make this more explicit. We are also updating append parameter to the `(writer).Write` to follow the new convention we are trying to establish. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-10-27 10:57:54 +01:00
Milos Gajdos	cb0d083d8d	feat: Add context to storagedriver.(Filewriter).Commit() This commit changes storagedriver.Filewriter interface by adding context.Context as an argument to its Commit func. We pass the context appropriately where need be throughout the distribution codebase to all the writers and tests. S3 driver writer unfortunately must maintain the context passed down to it from upstream so it continues to implement io.Writer and io.Closer interfaces which do not allow accepting the context in any of their funcs. Co-authored-by: Cory Snider <corhere@gmail.com> Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-10-19 11:27:27 +01:00
Milos Gajdos	a70964c2fc	Merge pull request #4076 from flavianmissi/s3-loglevel registry: add loglevel support for aws s3 storage driver	2023-10-04 14:13:15 +01:00
Flavian Missi	3df7e28f44	registry: add loglevel support for aws s3 storage driver based on the work from https://github.com/distribution/distribution/pull/3057. Co-authored-by: Simon Compston <compston@gmail.com> Signed-off-by: Flavian Missi <fmissi@redhat.com>	2023-10-02 15:47:02 +02:00
Milos Gajdos	4fce3c0028	Move completedParts type back to the original position Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-09-28 15:58:02 +01:00
Milos Gajdos	b888b14b39	Optimise push in S3 driver This commit cleans up and attempts to optimise the performance of image push in S3 driver. There are 2 main changes: * we refactor the S3 driver Writer where instead of using separate bytes slices for ready and pending parts which get constantly appended data into them causing unnecessary allocations we use optimised bytes buffers; we make sure these are used efficiently when written to. * we introduce a memory pool that is used for allocating the byte buffers introduced above These changes should alleviate high memory pressure on the push path to S3. Co-authored-by: Cory Snider <corhere@gmail.com> Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-09-27 21:33:22 +01:00
Milos Gajdos	9790bc806c	Merge pull request #4037 from milosgajdos/enable-prealloc Enable prealloc linter	2023-09-04 16:57:29 +01:00
Milos Gajdos	59fd8656ac	Enable prealloc linter This will give us nice little performance gains in some code paths. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-09-03 22:41:51 +01:00
Milos Gajdos	dcdd8bb740	Propagate storage driver context to S3 API calls Only some of the S3 storage driver calls were propagating context to the S3 API calls. This commit updates the S3 storage drivers so the context is propagated to all the S3 API calls. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-09-03 21:54:54 +01:00
James Hewitt	e22f7cbc73	Pass the last paging flag to storage drivers Storage drivers may be able to take advantage of the hint to start their walk more efficiently. For S3: The API takes a start-after parameter. Registries with many repositories can drastically reduce calls to s3 by telling s3 to only list results lexicographically after the last parameter. For the fallback: We can start deeper in the tree and avoid statting the files and directories before the hint in a walk. For a filesystem this improves performance a little, but many of the API based drivers are currently treated like a filesystem, so this drastically improves the performance of GCP and Azure blob. Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>	2023-08-29 11:27:42 +01:00
Milos Gajdos	59dd684cc8	Merge pull request #3713 from Jamstah/s3-tests	2023-08-21 13:48:43 +01:00
Sebastiaan van Stijn	5b3be39870	s3: add interface assertion This was added for the other drivers in `6b388b1ba6`, but it missed the s3 storage driver. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2023-08-21 13:54:13 +02:00
James Hewitt	7622d0a453	Don't return the from of a walk Other storage drivers will only return children and below, s3 should do the same. The only reason it was returning was because of the addition of a / to ensure we treat the from as a directory. Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>	2023-08-15 16:26:37 +01:00
Lucas França de Oliveira	035a8ec52a	Fix panic in the s3 backend walk logic Signed-off-by: Lucas França de Oliveira <lucasfdo@palantir.com>	2023-05-25 14:56:05 -07:00
Milos Gajdos	2fb8dbdeca	Merge pull request #3839 from kirat-singh/feature.azure-sdk-update Update Azure SDK and support additional authentication schemes	2023-04-25 19:35:34 +01:00
Kirat Singh	ba4a6bbe02	Update Azure SDK and support additional authentication schemes Microsoft has updated the golang Azure SDK significantly. Update the azure storage driver to use the new SDK. Add support for client secret and MSI authentication schemes in addition to shared key authentication. Implement rootDirectory support for the azure storage driver to mirror the S3 driver. Signed-off-by: Kirat Singh <kirat.singh@beacon.io> Co-authored-by: Cory Snider <corhere@gmail.com>	2023-04-25 17:23:20 +00:00
Milos Gajdos	0c958010ac	Merge pull request #3763 from distribution/multipart-upload-empty-files Enable pushing empty blobs	2023-03-27 10:18:44 +01:00
Milos Gajdos	5fa926a609	Enable pushing empty blobs This is an edge case when we are trying to upload an empty chunk of data using a MultiPart upload. As a result we are trying to complete the MultipartUpload with an empty slice of `completedUploadedParts` which will always lead to 400 being returned from S3 See: https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#CompletedMultipartUpload Solution: we upload an empty i.e. 0 byte part as a single part and then append it to the completedUploadedParts slice used to complete the Multipart upload. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2023-03-27 10:11:07 +01:00
Aaron Lehmann	2074688be9	Fix S3 multipart upload pagination loop condition The loop that iterates over paginated lists of S3 multipart upload parts appears to be using the wrong variable in its loop condition. Nothing inside the loop affects the value of `resp.IsTruncated`, so this loop will either be wrongly skipped or loop forever. It looks like this is a regression caused by commit `7736319f2e`. The return value of `ListMultipartUploads` used to be assigned to a variable named `resp`, but it was renamed to `partsList` without updating the for loop condition. I believe this is causing an error we're seeing with large layer uploads at commit time: upload resumed at wrong offset: 5242880000 != 5815706782 Missing parts of the multipart S3 upload would cause an incorrect size calculation in `newWriter`. Signed-off-by: Aaron Lehmann <alehmann@netflix.com>	2023-02-21 20:57:50 -08:00
Kirat Singh	3117e2eb2f	Use default http.Transport for AWS S3 session Previously we used a custom Transport in order to modify the user agent header. This prevented the AWS SDK from being able to customize SSL and other client TLS parameters since it could not understand the Transport type. Instead we can simply use the SDK function MakeAddToUserAgentFreeFormHandler to customize the UserAgent if necessary and leave all the TLS configuration to the AWS SDK. The only exception being SkipVerify which we have to handle, but we can set it onto the standard http.Transport which does not interfere with the SDKs ability to set other options. Signed-off-by: Kirat Singh <kirat.singh@gmail.com>	2023-02-15 13:37:01 -05:00
Sebastiaan van Stijn	f8b3af78fc	replace deprecated io/ioutil Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-11-04 23:47:52 +01:00
Hayley Swimelar	e3509fc1de	Merge pull request #3635 from milosgajdos/make-s3-driver-delete-faster Delete S3 keys incrementally in batches	2022-11-04 16:56:41 +01:00
Hayley Swimelar	52d948a9f5	Merge pull request #3766 from thaJeztah/gofumpt format code with gofumpt	2022-11-04 12:19:53 +01:00
Sebastiaan van Stijn	e0281dc609	format code with gofumpt gofumpt (https://github.com/mvdan/gofumpt) provides a superset of `gofmt` / `go fmt`, and addresses various formatting issues that linters may be checking for. We can consider enabling the `gofumpt` linter to verify the formatting in CI, although not every developer may have it installed, so for now this runs it once to get formatting in shape. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-11-03 22:48:20 +01:00
Sebastiaan van Stijn	f9ccd2c6ea	use http consts for request methods Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2022-11-02 23:31:47 +01:00
Milos Gajdos	ebc4234fd5	Delete S3 keys incrementally in batches Instead of first collecting all keys and then batch deleting them, we will do the incremental delete _online_ per max allowed batch. Doing this prevents frequent allocations for large S3 keyspaces and OOM-kills that might happen as a result of those. This commit introduces storagedriver.Errors type that allows returning multierrors as a single error from any storage driver implementation. Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2022-10-30 19:10:24 +00:00
João Pereira	a7fc49b067	Merge pull request #3622 from ddelange/patch-1 Support all S3 instant retrieval storage classes	2022-04-26 10:23:14 +01:00
Milos Gajdos	27b5563245	Merge pull request #3624 from milosgajdos/aws-s3-listv2 Update s3 ListObjects to V2 API	2022-04-22 13:34:13 +01:00
duanhongyi	15de9e21ba	Add forcepathstyle parameter for s3 Signed-off-by: duanhongyi <duanhongyi@doopai.com>	2022-04-20 08:44:12 +08:00
Milos Gajdos	8eab5d1bd6	Update s3 ListObjects to V2 API Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>	2022-04-09 12:16:46 +01:00
Simone Locci	80952c9e2b	Rename s3accelerate parameter to accelerate Signed-off-by: Simone Locci <simonelocci88@gmail.com>	2022-04-04 19:35:21 +02:00
Simone Locci	ea27621d4a	Fix review Signed-off-by: Simone Locci <simonelocci88@gmail.com>	2022-04-04 19:35:09 +02:00
Kirat Singh	51c0c8148a	Add new parameter s3accelerate to S3 storage driver. If s3accelerate is set to true then we turn on S3 Transfer Acceleration via the AWS SDK. It defaults to false since this is an opt-in feature on the S3 bucket. Signed-off-by: Kirat Singh <kirat.singh@wsq.io> Signed-off-by: Simone Locci <simonelocci88@gmail.com>	2022-04-04 19:34:57 +02:00
ddelange	966fae5463	Add tests for all supported storage classes Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>	2022-04-04 10:54:18 +02:00
ddelange	fb937deabf	Support all S3 instant retrieval storage classes Signed-off-by: ddelange <14880945+ddelange@users.noreply.github.com>	2022-04-01 11:55:35 +02:00
João Pereira	514cbd71be	Merge pull request #3519 from jtherin/mpu-paginate fix: paginate through s3 multipart uploads	2022-03-11 16:06:46 +00:00
libo.huang	117757a5cb	feat: add option to disable combining the pending part Signed-off-by: Libo Huang <huanglibo2010@gmail.com>	2022-01-07 18:20:31 +08:00
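
The pagination bug described in commit 2074688be9 (a loop condition that tests a variable the loop never reassigns) can be illustrated with a small sketch. This is not the driver's code: `page`, `listParts`, and `totalSize` are hypothetical names, and the in-memory slice stands in for a paginated S3 response. It assumes `pageSize` is greater than zero.

```go
package main

import "fmt"

// page is a hypothetical stand-in for one page of a paginated list response.
type page struct {
	Sizes       []int64 // sizes of the parts on this page
	IsTruncated bool    // true when more pages follow
	NextMarker  int     // marker to pass to the next call
}

// listParts is a hypothetical paginated API backed by an in-memory slice.
// It returns up to pageSize parts starting at marker (pageSize must be > 0).
func listParts(all []int64, marker, pageSize int) page {
	end := marker + pageSize
	if end > len(all) {
		end = len(all)
	}
	return page{Sizes: all[marker:end], IsTruncated: end < len(all), NextMarker: end}
}

// totalSize sums part sizes across every page. The termination check reads
// IsTruncated from p, the same variable that is reassigned on each iteration;
// checking a stale response would either skip the loop or never end.
func totalSize(all []int64, pageSize int) int64 {
	var total int64
	p := listParts(all, 0, pageSize)
	for {
		for _, s := range p.Sizes {
			total += s
		}
		if !p.IsTruncated {
			break
		}
		p = listParts(all, p.NextMarker, pageSize)
	}
	return total
}

func main() {
	parts := []int64{5, 5, 5, 3}
	fmt.Println(totalSize(parts, 2)) // prints 18
}
```

If any page were skipped, the computed total would fall short of the real upload size, which matches the "upload resumed at wrong offset" symptom quoted in that commit message.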

95 commits