Stat always calls ListObjects when stat-ing S3 key.
Unfortauntely ListObjects is not a free call - both in terms of egress
and actual AWS costs (likely because of the egress).
This changes the behaviour of Stat such that we always attempt the
HeadObject call first and only ever fall through to ListObjects if the
HeadObject returns an AWS API error.
Note, that the official docs mention that the only error returned by
HEAD is NoSuchKey; experiments show that this is demonstrably wrong and
the AWS docs are simply outdated at the time of this commit.
HeadObject actually returns the following errors:
* NotFound: if the queried key does not exist
* NotFound: if the queried key contains subkeys i.e. it's a prefix
* BucketRegionError: if the bucket does not exist
* Forbidden: if Head operation is not allows via IAM/ACLs
Co-authored-by: Cory Snider <corhere@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Defining an interface on the implementer side is generally not best
practice in Go code. There is no code in the distribution module which
consumes a ManifestBuilder value so there is no need to define the
interface in the distribution module. Export the concrete
ManifestBuilder types and modify the constructors to return concrete
values.
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Cory Snider <csnider@mirantis.com>
This allows to rewrite 'URLFor' of the storage driver to use a specific
host/trim the base path.
It is different from the 'redirect' middleware, as it still calls the
storage driver URLFor.
For example, with Azure storage provider, this allows to transform the
SAS Azure Blob Storage URL into the URL compatible with Azure Front
Door.
Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
The garbage-collect should remove unsed layer link file
P.S. This was originally contributed by @m-masataka, now I would like to take over it.
Thanks @m-masataka efforts with PR https://github.com/distribution/distribution/pull/2288
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
Huge help from @milosgajdos who figured out how to do the entire
marshalling/unmarshalling for the configs
Signed-off-by: Anders Ingemann <aim@orbit.online>
This bumps go-jose to the latest available version: v4.0.3.
This slightly breaks the backwards compatibility with the existing
registry deployments but brings more security with it.
We now require the users to specify the list of token signing algorithms in
the configuration. We do strive to maintain the b/w compat by providing
a list of supported algorithms, though, this isn't something we
recommend due to security issues, see:
* https://github.com/go-jose/go-jose/issues/64
* https://github.com/go-jose/go-jose/pull/69
As part of this change we now return to the original flow of the token
signature validation:
1. X2C (tls) headers
2. JWKS
3. KeyID
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Enable configuration options that can selectively disable validation
that dependencies exist within the registry before the image index
is uploaded.
This enables sparse indexes, where a registry holds a manifest index that
could be signed (so the digest must not change) but does not hold every
referenced image in the index. The use case for this is when a registry
mirror does not need to mirror all platforms, but does need to maintain
the digests of all manifests either because they are signed or because
they are pulled by digest.
The registry administrator can also select specific image architectures
that must exist in the registry, enabling a registry operator to select
only the platforms they care about and ensure all image indexes uploaded
to the registry are valid for those platforms.
Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
With the current logic we only verifies the region and return if it's
empty; we were not validating the regionEndpoint parameter.
Signed-off-by: Ankur Kothiwal <ankur.kothiwal@cern.com>
Harbor is using the distribution for it's (harbor-registry) registry component.
The harbor GC will call into the registry to delete the manifest, which in turn
then does a lookup for all tags that reference the deleted manifest.
To find the tag references, the registry will iterate every tag in the repository
and read it's link file to check if it matches the deleted manifest (i.e. to see
if uses the same sha256 digest). So, the more tags in repository, the worse the
performance will be (as there will be more s3 API calls occurring for the tag
directory lookups and tag file reads).
Therefore, we can use concurrent lookup and untag to optimize performance as described in https://github.com/goharbor/harbor/issues/12948.
P.S. This optimization was originally contributed by @Antiarchitect, now I would like to take it over.
Thanks @Antiarchitect's efforts with PR https://github.com/distribution/distribution/pull/3890.
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
it is reasonable to ignore the error that the manifest tag path does not exist when querying
all tags of the specified repository when executing gc.
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
Currently, the `forcepathstyle` parameter for the s3 storage driver is
considered only if the `regionendpoint` parameter is set. Since setting
a region endpoint explicitly is discouraged with AWS s3, it is not clear
how to enforce path style URLs with AWS s3.
This also means, that the default value (true) only applies if a region
endpoint is configured.
This change makes sure we always forward the `forcepathstyle` parameter
to the aws-sdk if present in the config. This is a breaking change where
a `regionendpoint` is configured but no explicit `forcepathstyle` value
is set.
Signed-off-by: Benjamin Schanzel <benjamin.schanzel@bmw.de>
In commit 17952924f3 we updated ServeBlob() to use an io.MultiWriter to
write simultaneously to the local store and the HTTP response.
However, copyContent was using a type assertion to only add headers if
the io.Writer was a http.ResponseWriter. Therefore, this change caused
us to stop sending the expected headers (i.e. Content-Length, Etag,
etc.) on the first request for a blob.
Resolve the issue by explicitly passing in http.Header and setting it
unconditionally.
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>