This commit cleans up the S3 driver and optimises the performance of image pushes.
There are two main changes:
* we refactor the S3 driver Writer: instead of using separate byte
slices for the ready and pending parts, which had data constantly
appended to them and caused unnecessary allocations, we now use
optimised byte buffers and make sure they are written to efficiently.
* we introduce a memory pool that is used to allocate the byte
buffers introduced above
These changes should alleviate the high memory pressure on the S3 push path.
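A minimal sketch of the pooled-buffer pattern described above (the names here are illustrative, not the driver's actual identifiers):

package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so that each upload part does not
// allocate a fresh slice for the garbage collector to reclaim later.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func writePart(data []byte) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // clear any leftovers from a previous use
	defer bufPool.Put(buf)
	buf.Write(data)
	fmt.Println("buffered part of", buf.Len(), "bytes")
}

func main() {
	writePart([]byte("layer bytes"))
}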
Co-authored-by: Cory Snider <corhere@gmail.com>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Add two new checks to the testsuite, verifying that the driver
handles zero-byte files, and appends to zero-byte files, correctly.
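A hedged sketch of the kind of check being added, against a pared-down stand-in for the driver interface (the real testsuite uses the storagedriver API directly):

package driver_test

import "testing"

// putGetter is an illustrative stand-in for the storage driver interface.
type putGetter interface {
	PutContent(path string, content []byte) error
	GetContent(path string) ([]byte, error)
}

// checkZeroByteFile stores an empty file and verifies it reads back as
// zero bytes rather than failing.
func checkZeroByteFile(t *testing.T, d putGetter) {
	t.Helper()
	if err := d.PutContent("/zero", nil); err != nil {
		t.Fatalf("put zero-byte file: %v", err)
	}
	got, err := d.GetContent("/zero")
	if err != nil {
		t.Fatalf("get zero-byte file: %v", err)
	}
	if len(got) != 0 {
		t.Fatalf("expected empty content, got %d bytes", len(got))
	}
}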
Signed-off-by: Neil Wilson <neil@aldur.co.uk>
SuccessStatus acted on the response's status code, and was used to return
early, before checking the same status code in HandleErrorResponse.
This patch combines both functions into HandleHTTPResponseError, which
returns an error for "non-success" status codes. This simplifies the
handling of responses and makes some of the logic slightly more idiomatic.
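A simplified sketch of the combined helper's shape (the real function also decodes the registry's error payload; the 2xx-3xx success range mirrors the old SuccessStatus check):

package client

import (
	"fmt"
	"net/http"
)

// HandleHTTPResponseError returns nil for "success" status codes and an
// error otherwise, folding the old SuccessStatus check into the error path.
func HandleHTTPResponseError(resp *http.Response) error {
	if resp.StatusCode >= 200 && resp.StatusCode <= 399 {
		return nil
	}
	// Sketch only: the real helper parses the body into structured errors.
	return fmt.Errorf("unexpected status: %s", resp.Status)
}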
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It was used for signing schema v1 manifests in tests that have since been
removed, so there is no point in keeping it around anymore.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Use the non-exported function for all errors; there are currently no external
consumers of this function (perhaps it should be deprecated).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
If drvr.PutContent fails and returns an error, we'd be left with some
extra memory allocated, though in this case (a test with a known size
for the slice being iterated over) that's fine.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Only some of the S3 storage driver calls were propagating context to the
S3 API calls. This commit updates the S3 storage driver so that context
is propagated to all of the S3 API calls.
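The pattern looks roughly like this, using the AWS SDK's context-aware call variants (aws-sdk-go v1; the bucket and key names are placeholders):

package main

import (
	"context"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

// getObject passes the caller's ctx into the S3 API call, so
// cancellation and deadlines now reach the network request.
func getObject(ctx context.Context, api *s3.S3, bucket, key string) (*s3.GetObjectOutput, error) {
	return api.GetObjectWithContext(ctx, &s3.GetObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
}

func main() {
	sess := session.Must(session.NewSession())
	_, _ = getObject(context.Background(), s3.New(sess), "example-bucket", "example-key")
}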
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
This integrates the new module, which was extracted from this repository
at commit b9b19409cf458dcb9e1253ff44ba75bd0620faa6:
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
# create a temporary clone of docker
cd ~/Projects
git clone https://github.com/distribution/distribution.git reference
cd reference
# record the commit this was taken from
git rev-parse --verify HEAD
b9b19409cf
# remove all code, except for general files, 'reference/', and rename to /
git filter-repo \
--path .github/workflows/codeql-analysis.yml \
--path .github/workflows/fossa.yml \
--path .golangci.yml \
--path distribution-logo.svg \
--path CODE-OF-CONDUCT.md \
--path CONTRIBUTING.md \
--path GOVERNANCE.md \
--path README.md \
--path LICENSE \
--path MAINTAINERS \
--path-glob 'reference/*.*' \
--path-rename reference/:
# initialize go.mod
go mod init github.com/distribution/reference
go mod tidy -go=1.20
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Storage drivers may be able to take advantage of the hint to start
their walk more efficiently.
For S3: the API takes a start-after parameter. Registries with many
repositories can drastically reduce calls to S3 by telling S3 to only
list results lexicographically after the hint.
For the fallback: we can start deeper in the tree and avoid statting
the files and directories before the hint in a walk. For a filesystem
this improves performance a little, but many of the API-based drivers
are currently treated like a filesystem, so this drastically improves
the performance of GCS and Azure blob storage.
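For illustration, here is roughly how the hint maps onto the S3 API via the StartAfter parameter (aws-sdk-go v1; the bucket, prefix, and hint values are placeholders):

package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

// listFrom asks S3 to return only keys sorting lexicographically after
// the hint, skipping everything the walk has already covered.
func listFrom(ctx context.Context, api *s3.S3, bucket, prefix, hint string) error {
	return api.ListObjectsV2PagesWithContext(ctx, &s3.ListObjectsV2Input{
		Bucket:     aws.String(bucket),
		Prefix:     aws.String(prefix),
		StartAfter: aws.String(hint),
	}, func(page *s3.ListObjectsV2Output, lastPage bool) bool {
		for _, obj := range page.Contents {
			fmt.Println(aws.StringValue(obj.Key))
		}
		return true // keep paging
	})
}

func main() {
	sess := session.Must(session.NewSession())
	_ = listFrom(context.Background(), s3.New(sess), "example-bucket", "docker/registry/v2/", "docker/registry/v2/repositories/last-walked")
}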
Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
The client attempts to parse the body of every error it receives as JSON,
regardless of the Content-Type. This commit rectifies that by only parsing
the error body as JSON if the Content-Type header is set to
either "application/json" or "application/vnd.api+json".
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
We are replacing the very outdated redigo Go module with the official
Redis Go module, go-redis.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
We've replaced all the schema1 manifest references with the OCI image manifest.
Note, there are some TODO items that must be addressed at some point in
the future once the schema1 package is removed completely from the
codebase.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
This puts back the original flow for old clients fetching manifest
lists as schema1 images, where we want to try returning some image for
the default architecture. This was incorrectly removed by one of the
previous commits.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
The schema1 package was deprecated a while ago, so we are removing
any references to it from the handlers in preparation for
removing it from the codebase altogether.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
This commit removes the `oss` storage driver from distribution, as well as
the `alicdn` storage middleware which only works with the `oss` driver.
There are several reasons for this:
* no real-life expertise among the maintainers
* oss is compatible with the S3 API operations required by the S3 storage driver
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
The Azure tests fail if there is no Azure configuration available;
instead, they should be skipped.
Also, one of the Azure tests is wrong and doesn't match the code.
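The skip looks roughly like this (the environment variable name is illustrative):

package azure_test

import (
	"os"
	"testing"
)

// skipIfNoAzureConfig bails out of the test instead of failing it when
// no Azure credentials are configured in the environment.
func skipIfNoAzureConfig(t *testing.T) {
	t.Helper()
	if os.Getenv("AZURE_STORAGE_ACCOUNT_NAME") == "" {
		t.Skip("no Azure configuration provided, skipping")
	}
}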
Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
Other storage drivers only return children and below; S3 should do
the same. The only reason it was returning more was the addition
of a / to ensure we treat `from` as a directory.
Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
This test will only work on an S3 bucket on an S3 Outpost. Most
developers won't have access to one of these.
Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
If we haven't set a storage class, there's no point in checking the
storage class applied to the object; S3 will choose one.
Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
This commit removes the Swift storage driver from distribution.
There are several reasons for this:
* no real-life expertise among the maintainers
* Swift is compatible with the S3 API operations required by the S3 storage driver
This also removes dependencies that are hard to keep up with.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
The client's ReadFrom doesn't set the Content-Type header, leaving
server-side implementers to assume it's application/octet-stream. This
commit makes it explicit on the client side.
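Sketched with a plain net/http request (the URL and body are placeholders; the actual client builds its upload requests elsewhere):

package main

import (
	"context"
	"net/http"
	"strings"
)

// newChunkRequest sets the Content-Type explicitly instead of leaving
// the server to guess the body type of the blob chunk.
func newChunkRequest(ctx context.Context, uploadURL string, body *strings.Reader) (*http.Request, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodPatch, uploadURL, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/octet-stream")
	return req, nil
}

func main() {
	_, _ = newChunkRequest(context.Background(), "https://registry.example/v2/foo/blobs/uploads/some-uuid", strings.NewReader("chunk"))
}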
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
The schema1 package was deprecated a while ago, so we are removing
any references to it from the proxy package in preparation for
removing it from the codebase altogether.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Redis introduced an Access Control List (ACL) mechanism in version 6.0.
This commit implements the changes necessary to support configuring a
username for Redis. Users can now define a specific username to
authenticate with Redis and enhance security through the ACL feature.
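With go-redis this boils down to populating the Username field of the client options (the values here are placeholders):

package main

import (
	"context"

	"github.com/redis/go-redis/v9"
)

func main() {
	// With Redis 6+ ACLs, the client can authenticate as a named user
	// rather than with the default user's password alone.
	rdb := redis.NewClient(&redis.Options{
		Addr:     "localhost:6379",
		Username: "registry", // the ACL user configured on the server
		Password: "secret",
	})
	_ = rdb.Ping(context.Background()).Err()
}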
Signed-off-by: chlins <chenyuzh@vmware.com>
Split the request and hit metrics into separate metrics, rather than using
labels. This avoids duplication of data and makes metric math easier.
* Count cache errors separately to avoid weird math.
* Hit ratio: `registry_storage_cache_hits_total / registry_storage_cache_requests_total`
* Miss ratio: `1 - (registry_storage_cache_hits_total / registry_storage_cache_requests_total)`
* Misses: `registry_storage_cache_requests_total - registry_storage_cache_hits_total`
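A sketch of the scheme with the Prometheus Go client; the requests and hits metric names are the ones given above, while the errors counter name is illustrative:

package main

import "github.com/prometheus/client_golang/prometheus"

var (
	cacheRequests = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "registry_storage_cache_requests_total",
		Help: "Total cache lookups.",
	})
	cacheHits = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "registry_storage_cache_hits_total",
		Help: "Cache lookups served from the cache.",
	})
	cacheErrors = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "registry_storage_cache_errors_total", // illustrative name
		Help: "Cache lookups that failed, counted apart from misses.",
	})
)

func main() {
	prometheus.MustRegister(cacheRequests, cacheHits, cacheErrors)
	cacheRequests.Inc()
	cacheHits.Inc() // misses need no counter of their own: requests - hits
}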
Signed-off-by: Ben Kochie <superq@gmail.com>
This enables Go build tags so that the GCS and OSS driver support is
available in the binary distributed via the image built by the Dockerfile.
This led to quite a few fixes in the GCS and OSS packages, raised as
warnings by the golangci-lint linter.
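For context, each gated driver carries a build constraint along these lines (the include_gcs tag name is an assumption based on the repo's include_* convention; check the driver sources for the exact tags):

//go:build include_gcs
// +build include_gcs

// The constraint above keeps this driver out of default builds; the
// Dockerfile now passes the tags so the shipped binary includes it:
//
//	go build -tags "include_gcs include_oss" ./...
package gcs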
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Something seems broken on the Azure/Azure SDK side: it is currently not
possible to copy a blob of type AppendBlob using `CopyFromURL`.
Using the AppendBlob client via NewAppendBlobClient does not work
either.
According to Azure, the correct way to do this is by using
StartCopyFromURL. Because this is an async operation, we need to do the
polling ourselves. A simple backoff mechanism is used where, during each
iteration, the configured delay is multiplied by the retry number.
This also introduces two new config options for the Azure driver:
copy_status_poll_max_retry and copy_status_poll_delay.
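A sketch of the polling loop, with the Azure SDK status probe abstracted behind a callback (the names and the probe are illustrative stand-ins):

package main

import (
	"context"
	"fmt"
	"time"
)

// waitForCopy re-checks the async copy's status with a linear backoff:
// poll n sleeps for n times the configured delay before the next check.
func waitForCopy(ctx context.Context, copyDone func() (bool, error), maxRetry int, delay time.Duration) error {
	for retry := 1; retry <= maxRetry; retry++ {
		done, err := copyDone()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Duration(retry) * delay):
		}
	}
	return fmt.Errorf("copy still pending after %d polls", maxRetry)
}

func main() {
	polls := 0
	err := waitForCopy(context.Background(), func() (bool, error) {
		polls++
		return polls >= 3, nil // pretend the copy finishes on the third poll
	}, 5, 10*time.Millisecond)
	fmt.Println(err)
}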
Signed-off-by: Flavian Missi <fmissi@redhat.com>