Harbor is using the distribution for it's (harbor-registry) registry component.
The harbor GC will call into the registry to delete the manifest, which in turn
then does a lookup for all tags that reference the deleted manifest.
To find the tag references, the registry will iterate every tag in the repository
and read it's link file to check if it matches the deleted manifest (i.e. to see
if uses the same sha256 digest). So, the more tags in repository, the worse the
performance will be (as there will be more s3 API calls occurring for the tag
directory lookups and tag file reads).
Therefore, we can use concurrent lookup and untag to optimize performance as described in https://github.com/goharbor/harbor/issues/12948.
P.S. This optimization was originally contributed by @Antiarchitect, now I would like to take it over.
Thanks @Antiarchitect's efforts with PR https://github.com/distribution/distribution/pull/3890.
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
Future-proof the version package's exported interface by only making the
data available through getter functions. This affords us the flexibility
to e.g. implement them in terms of "runtime/debug".ReadBuildInfo() in
the future.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The API for periodic health checks is repetitive, with a distinct
function for polling a checker to each kind of updater. It also gives
the user no control over the lifetime of the polling goroutines nor
which context is passed into the checker.
Replace the existing PeriodicXYZChecker functions with a single Poll
function which composes an Updater with a Checker. Its context parameter
is passed into the checker and also controls when the polling loop
terminates. To guard against health checks failing closed (ostensibly
healthy) when the polling loop is terminated, the updater is forcefully
updated to an error status, overriding any configured threshold.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Allow health checkers to abort if the request context is canceled.
Modify the checkers to respect context cancelation and return wrapped
errors so the caller of CheckStatus() would be able to discriminate true
failed checks from checks which were aborted because the context became
done.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The specifics of how the authorization for a request is propagated
through the registry app are private implementation details. Hide those
details from outsiders so they can be changed as needed without fear of
breaking third-party code. Move the utilities for attaching a request's
authorization status to its context and retrieving it from the context
into the registry/handlers package as unexported symbols.
Signed-off-by: Cory Snider <csnider@mirantis.com>
The details of how request-scoped information is propagated through the
registry server app should be left as private implementation details so
they can be changed without fear of breaking compatibility with
third-party code which imports the distribution module. The
AccessController interface unnecessarily bakes into the public API
details of how authorization grants are propagated through request
contexts. In practice the only values the in-tree authorizers attach to
the request contexts are the UserInfo and Resources for the request.
Change the AccessController interface to return the UserInfo and
Resources directly to allow us to change how request contexts are used
within the app without altering the AccessController interface contract.
Signed-off-by: Cory Snider <csnider@mirantis.com>
Our context package predates the establishment of current best practices
regarding context usage and it shows. It encourages bad practices such
as using contexts to propagate non-request-scoped values like the
application version and using string-typed keys for context values. Move
the package internal to remove it from the API surface of
distribution/v3@v3.0.0 so we are free to iterate on it without being
constrained by compatibility.
Signed-off-by: Cory Snider <csnider@mirantis.com>
It was used for signing schema v1 manifests in tests which have now been
removed so there is no point in keeping these there anymore.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
This integrates the new module, which was extracted from this repository
at commit b9b19409cf458dcb9e1253ff44ba75bd0620faa6;
# install filter-repo (https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
brew install git-filter-repo
# create a temporary clone of docker
cd ~/Projects
git clone https://github.com/distribution/distribution.git reference
cd reference
# commit taken from
git rev-parse --verify HEAD
b9b19409cf
# remove all code, except for general files, 'reference/', and rename to /
git filter-repo \
--path .github/workflows/codeql-analysis.yml \
--path .github/workflows/fossa.yml \
--path .golangci.yml \
--path distribution-logo.svg \
--path CODE-OF-CONDUCT.md \
--path CONTRIBUTING.md \
--path GOVERNANCE.md \
--path README.md \
--path LICENSE \
--path MAINTAINERS \
--path-glob 'reference/*.*' \
--path-rename reference/:
# initialize go.mod
go mod init github.com/distribution/reference
go mod tidy -go=1.20
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We are replacing the very outdated redigo Go module with the official
redis Go module, go-redis.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
This puts back the original flow where old clients are fetching manifest
lists schema1 images where we want to try returning some image for the
default architecture. This was incorrectly removed by one of the
previous commits.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
schema1 package was deprecated a while ago so we are removing
any references to it from handlers. in preparation to
removing it from the codebase altogether.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
Redis introduced an Access Control List (ACL) mechanism since version 6.0. This commit implements the necessary changes to support configuring the username for Redis. Users can now define a specific username to authenticate with Redis and enhance security through the ACL feature.
Signed-off-by: chlins <chenyuzh@vmware.com>
Introduced a Catalog entry in the configuration struct. With it,
it's possible to control the maximum amount of entries returned
by /v2/catalog (`GetCatalog` in registry/handlers/catalog.go).
It's set to a default value of 1000.
`GetCatalog` returns 100 entries by default if no `n` is
provided. When provided it will be validated to be between `0`
and `MaxEntries` defined in Configuration. When `n` is outside
the aforementioned boundary, an error response is returned.
`GetCatalog` now handles `n=0` gracefully with an empty response
as well.
Signed-off-by: José D. Gómez R. <1josegomezr@gmail.com>
Currently, "response completed with error" log lines include an
`auth.user.name` key, but successful "response completed" lines do not
include this, because they are logged a few stack frames up where
`auth.user.name` is not present on the `Context`. Move the successful
request logging inside the `dispatcher` closure, where the logger on the
context automatically includes this key.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Go 1.18 and up now provides a strings.Cut() which is better suited for
splitting key/value pairs (and similar constructs), and performs better:
```go
func BenchmarkSplit(b *testing.B) {
b.ReportAllocs()
data := []string{"12hello=world", "12hello=", "12=hello", "12hello"}
for i := 0; i < b.N; i++ {
for _, s := range data {
_ = strings.SplitN(s, "=", 2)[0]
}
}
}
func BenchmarkCut(b *testing.B) {
b.ReportAllocs()
data := []string{"12hello=world", "12hello=", "12=hello", "12hello"}
for i := 0; i < b.N; i++ {
for _, s := range data {
_, _, _ = strings.Cut(s, "=")
}
}
}
```
BenchmarkSplit
BenchmarkSplit-10 8244206 128.0 ns/op 128 B/op 4 allocs/op
BenchmarkCut
BenchmarkCut-10 54411998 21.80 ns/op 0 B/op 0 allocs/op
While looking at occurrences of `strings.Split()`, I also updated some for alternatives,
or added some constraints;
- for cases where an specific number of items is expected, I used `strings.SplitN()`
with a suitable limit. This prevents (theoretical) unlimited splits.
- in some cases it we were using `strings.Split()`, but _actually_ were trying to match
a prefix; for those I replaced the code to just match (and/or strip) the prefix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
gofumpt (https://github.com/mvdan/gofumpt) provides a supserset of `gofmt` / `go fmt`,
and addresses various formatting issues that linters may be checking for.
We can consider enabling the `gofumpt` linter to verify the formatting in CI, although
not every developer may have it installed, so for now this runs it once to get formatting
in shape.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
1, return the right upload offset for client when asks.
2, do not call ResumeBlobUpload on getting status.
3, return 416 rather than 404 on failed to patch chunk blob.
4, add the missing upload close
Signed-off-by: Wang Yan <wangyan@vmware.com>
Instead of letting the cache grow without bound, use a LRU to impose a
size limit.
The limit is configurable through a new `blobdescriptorsize` config key.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
1, Fix GoSec G404: Use of weak random number generator (math/rand instead of crypto/rand)
2, Fix Static check: ST1019: package "github.com/sirupsen/logrus" is being imported more than once
Signed-off-by: Wang Yan <wangyan@vmware.com>