Harbor is using the distribution for it's (harbor-registry) registry component.
The harbor GC will call into the registry to delete the manifest, which in turn
then does a lookup for all tags that reference the deleted manifest.
To find the tag references, the registry will iterate every tag in the repository
and read it's link file to check if it matches the deleted manifest (i.e. to see
if uses the same sha256 digest). So, the more tags in repository, the worse the
performance will be (as there will be more s3 API calls occurring for the tag
directory lookups and tag file reads).
Therefore, we can use concurrent lookup and untag to optimize performance as described in https://github.com/goharbor/harbor/issues/12948.
P.S. This optimization was originally contributed by @Antiarchitect, now I would like to take it over.
Thanks @Antiarchitect's efforts with PR https://github.com/distribution/distribution/pull/3890.
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
it is reasonable to ignore the error that the manifest tag path does not exist when querying
all tags of the specified repository when executing gc.
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
Currently, the `forcepathstyle` parameter for the s3 storage driver is
considered only if the `regionendpoint` parameter is set. Since setting
a region endpoint explicitly is discouraged with AWS s3, it is not clear
how to enforce path style URLs with AWS s3.
This also means, that the default value (true) only applies if a region
endpoint is configured.
This change makes sure we always forward the `forcepathstyle` parameter
to the aws-sdk if present in the config. This is a breaking change where
a `regionendpoint` is configured but no explicit `forcepathstyle` value
is set.
Signed-off-by: Benjamin Schanzel <benjamin.schanzel@bmw.de>
In commit 17952924f3 we updated ServeBlob() to use an io.MultiWriter to
write simultaneously to the local store and the HTTP response.
However, copyContent was using a type assertion to only add headers if
the io.Writer was a http.ResponseWriter. Therefore, this change caused
us to stop sending the expected headers (i.e. Content-Length, Etag,
etc.) on the first request for a blob.
Resolve the issue by explicitly passing in http.Header and setting it
unconditionally.
Signed-off-by: Mikel Rychliski <mikel@mikelr.com>
Future-proof the version package's exported interface by only making the
data available through getter functions. This affords us the flexibility
to e.g. implement them in terms of "runtime/debug".ReadBuildInfo() in
the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
This reverts https://github.com/distribution/distribution/pull/3556
This feature is currently broken and requires more fundamental changes
in the S3 driver. Until then it's better to remove it.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Unfortunately one of the changes we merged in broken the support for
http.ProxyFromEnvironment https://pkg.go.dev/net/http#ProxyFromEnvironment
This commit attempts to fix that by cloning the http.DefaultTransport
and updating it accordingly.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
This commit updates (writer).Writer() method in S3 storage driver to
handle the case where an append is attempted to a zer-size content.
S3 does not allow appending to already committed content, so we are
optiing to provide the following case as a narrowed down behaviour:
Writer can only append to zero byte content - in that case, a new S3
MultipartUpload is created that will be used for overriding the already
committed zero size content.
Appending to non-zero size content fails with error.
Co-authored-by: Cory Snider <corhere@gmail.com>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
GCS storage driver used to be conditionally built due to its being
outdated and basically unmaintained. Recently the driver has gone
through a rework and updates. Let's remove the build tag so we have less
headaches dealing with it and try keeping it up to date.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>