`registry/storage/driver/inmemory/driver_test.go` times out after ~10min. The slow test is `testsuites.go:TestWriteReadLargeStreams()` which writes a 5GB file.
Root cause is inefficient slice reallocation algorithm. The slice holding file bytes grows only 32K on each allocation. To fix it, this PR grows slice exponentially.
Signed-off-by: Wei Meng <wemeng@microsoft.com>
Go 1.18 and up now provides a strings.Cut() which is better suited for
splitting key/value pairs (and similar constructs), and performs better:
```go
func BenchmarkSplit(b *testing.B) {
b.ReportAllocs()
data := []string{"12hello=world", "12hello=", "12=hello", "12hello"}
for i := 0; i < b.N; i++ {
for _, s := range data {
_ = strings.SplitN(s, "=", 2)[0]
}
}
}
func BenchmarkCut(b *testing.B) {
b.ReportAllocs()
data := []string{"12hello=world", "12hello=", "12=hello", "12hello"}
for i := 0; i < b.N; i++ {
for _, s := range data {
_, _, _ = strings.Cut(s, "=")
}
}
}
```
BenchmarkSplit
BenchmarkSplit-10 8244206 128.0 ns/op 128 B/op 4 allocs/op
BenchmarkCut
BenchmarkCut-10 54411998 21.80 ns/op 0 B/op 0 allocs/op
While looking at occurrences of `strings.Split()`, I also updated some for alternatives,
or added some constraints;
- for cases where an specific number of items is expected, I used `strings.SplitN()`
with a suitable limit. This prevents (theoretical) unlimited splits.
- in some cases it we were using `strings.Split()`, but _actually_ were trying to match
a prefix; for those I replaced the code to just match (and/or strip) the prefix.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Embed the interface that we're mocking; calling any of it's methods
that are not implemented will panic, so should give the same result
as before.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Both of these were deprecated in 55f675811a,
but the format of the GoDoc comments didn't follow the correct format, which
caused them not being picked up by tools as "deprecated".
This patch updates uses in the codebase to use the alternatives.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
gofumpt (https://github.com/mvdan/gofumpt) provides a supserset of `gofmt` / `go fmt`,
and addresses various formatting issues that linters may be checking for.
We can consider enabling the `gofumpt` linter to verify the formatting in CI, although
not every developer may have it installed, so for now this runs it once to get formatting
in shape.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Instead of first collecting all keys and then batch deleting them,
we will do the incremental delete _online_ per max allowed batch.
Doing this prevents frequent allocations for large S3 keyspaces
and OOM-kills that might happen as a result of those.
This commit introduces storagedriver.Errors type that allows to return
multierrors as a single error from any storage driver implementation.
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
1, return the right upload offset for client when asks.
2, do not call ResumeBlobUpload on getting status.
3, return 416 rather than 404 on failed to patch chunk blob.
4, add the missing upload close
Signed-off-by: Wang Yan <wangyan@vmware.com>
Instead of letting the cache grow without bound, use a LRU to impose a
size limit.
The limit is configurable through a new `blobdescriptorsize` config key.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
This simple change mainly affects the distribution client. By respecting
the context the caller passes in, timeouts and cancellations will work
as expected. Also, transports which rely on the context (such as tracing
transports that retrieve a span from the context) will work properly.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
(*App).context, called in the HTTP handler on each request, creates a
URLBuilder, which involves calling Router(). This shows up in profiles a
hot spot because it involves compiling the regexps which define all the
routes. For efficiency, cache the router and return the same object each
time.
It appears to be safe to reuse the router because GetRoute is the only
method ever called on the returned router object.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
This reverts commit 06a098c632
This changes the function of linkedBlobStatter.Clear(). It was either removing the first of two possible manifest links or returning nil if none were found. Now it once again it removes only the valid manifest link or returns an error if none are found.
Signed-off-by: Bracken Dawson <abdawson@gmail.com>
If s3accelerate is set to true then we turn on S3 Transfer
Acceleration via the AWS SDK. It defaults to false since this is an
opt-in feature on the S3 bucket.
Signed-off-by: Kirat Singh <kirat.singh@wsq.io>
Signed-off-by: Simone Locci <simonelocci88@gmail.com>
This commit removes the following cipher suites that are known to be insecure:
TLS_RSA_WITH_RC4_128_SHA
TLS_RSA_WITH_AES_128_CBC_SHA256
TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
TLS_ECDHE_RSA_WITH_RC4_128_SHA
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
And this commit deletes the tlsVersions of tls1.0 and tls1.1. The tls1.2 is the minimal supported tls version for creating a safer tls configuration.
Signed-off-by: david.bao <baojn1998@163.com>
Allow the storage driver to optionally use AWS SDK's dualstack mode.
This allows the registry to communicate with S3 in IPv6 environments.
Signed-off-by: Adam Kaplan <adam.kaplan@redhat.com>
1, Fix GoSec G404: Use of weak random number generator (math/rand instead of crypto/rand)
2, Fix Static check: ST1019: package "github.com/sirupsen/logrus" is being imported more than once
Signed-off-by: Wang Yan <wangyan@vmware.com>
The wording of the error message had a typo (missing the word "not") that gave it the opposite meaning from the intended meaning.
Signed-off-by: Chad Faragher <wyckster@hotmail.com>
When updatefrequency is set and is a string, its value should be saved
into updateFrequency, and it shouldn't override duration.
Signed-off-by: Oleg Bulatov <oleg@bulatov.me>
Optimized S3 Walk impl by no longer listing files recursively. Overall gives a huge performance increase both in terms of runtime and S3 calls (up to ~500x).
Fixed a bug in WalkFallback where ErrSkipDir for was not handled as documented for non-directory.
Signed-off-by: Collin Shoop <cshoop@digitalocean.com>
Delete was not working when the subpath immediately followed the given path started with an ascii lower than "/" such as dash "-" and underscore "_" and requests no files to be deleted.
(cherry picked from commit 5d8fa0ce94b68cce70237805db92cdd8d40de282)
Signed-off-by: Collin Shoop <cshoop@digitalocean.com>
Fixes#3141
1, return 416 for Out-of-order blob upload
2, return 400 for content length and content size mismatch
Signed-off-by: wang yan <wangyan@vmware.com>
According to OCI image spec, the descriptor's digest field is required.
For the normal config/layer blobs, the valivation should check the
presence of the blob when put manifest.
REF: https://github.com/opencontainers/image-spec/blob/v1.0.1/descriptor.md
Signed-off-by: Arko Dasgupta <arko.dasgupta@docker.com>
Signed-off-by: Wei Fu <fuweid89@gmail.com>
It's possible to configure log fields in the configuration file, and we would
like these fields to be included in all logs. Previously these fields were
included only in logs produced using the main routine's context, meaning that
any logs from a request handler were missing the fields since those use a
context based on the HTTP request's context.
Add a configurable default logger to the `context` package, and set it when
configuring logging at startup time.
Signed-off-by: Adam Wolfe Gordon <awg@digitalocean.com>
Configuration of list of cipher suites allows a user to disable use
of weak ciphers or continue to support them for legacy usage if they
so choose.
List of available cipher suites at:
https://golang.org/pkg/crypto/tls/#pkg-constants
Default cipher suites have been updated to:
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_AES_128_GCM_SHA256
- TLS_CHACHA20_POLY1305_SHA256
- TLS_AES_256_GCM_SHA384
MinimumTLS has also been updated to include TLS 1.3 as an option
and now defaults to TLS 1.2 since 1.0 and 1.1 have been deprecated.
Signed-off-by: David Luu <david@davidluu.info>
Fixes#3363
Without this, we emit illegal json logs, the user-agent
ends up as:
```
"http.request.useragent": "docker/19.03.4 go/go1.12.10 git-commit/9013bf583a kernel/5.10.10-051010-generic os/linux arch/amd64 UpstreamClient(Docker-Client/19.03.4 \(linux\))"
```
which is not valid according to [spec](https://www.json.org/json-en.html)
specifically, string: "<any codepoint except " or \ or control>*"
Signed-off-by: Don Bowman <don@agilicus.com>
Go 1.13 and up enforce import paths to be versioned if a project
contains a go.mod and has released v2 or up.
The current v2.x branches (and releases) do not yet have a go.mod,
and therefore are still allowed to be imported with a non-versioned
import path (go modules add a `+incompatible` annotation in that case).
However, now that this project has a `go.mod` file, incompatible
import paths will not be accepted by go modules, and attempting
to use code from this repository will fail.
This patch uses `v3` for the import-paths (not `v2`), because changing
import paths itself is a breaking change, which means that the
next release should increment the "major" version to comply with
SemVer (as go modules dictate).
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
When a given prefix is empty and we attempt to list its content AWS
returns that the prefix contains one object with key defined as the
prefix with an extra "/" at the end.
e.g.
If we call ListObjects() passing to it an existing but empty prefix,
say "my/empty/prefix", AWS will return that "my/empty/prefix/" is an
object inside "my/empty/prefix" (ListObjectsOutput.Contents).
This extra "/" causes the upload purging process to panic. On normal
circunstances we never find empty prefixes on S3 but users may touch
it.
Signed-off-by: Ricardo Maraschini <rmarasch@redhat.com>
The OCI distribution spec allows implementations to support deleting manifests
by tag, but also permits returning the `UNSUPPORTED` error code for such
requests. docker/distribution has never supported deleting manifests by tag, but
previously returned `DIGEST_INVALID`.
The `Tag` and `Digest` fields of the `manifestHandler` are already correctly
populated based on which kind of reference was given in the request URL. Return
`UNSUPPORTED` if the `Tag` field is populated.
Signed-off-by: Adam Wolfe Gordon <awg@digitalocean.com>
A repository need not contain any unique layers, if its images use only layers
mounted from other repositories. But, the catalog endpoint was looking for the
_layers directory to indicate that a directory was a repository.
Use the _manifests directory as the marker instead, since any repository with
revisions will contain a _manifests directory.
Signed-off-by: Adam Wolfe Gordon <awg@digitalocean.com>
Instead of constructing the list of credential providers manually, if we
use the default list we can take advantage of the AWS SDK checking the
environment and returning either the EC2RoleProvider or the generic HTTP
credentials provider, configured to use the ECS credentials endpoint.
Also, use the `defaults.Config()` function instead of `aws.NewConfig()`,
as this results in an initialised HTTP client which prevents a fatal
error when retrieving credentials from the ECS credentials endpoint.
Fixes#2960
Signed-off-by: Andrew Bulford <andrew.bulford@redmatter.com>
The metrics tracker in cached blob statter was replaced with prometheus
metrics and no longer needed.
Remove unused log wrapping.
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
* Update redis.go
If the dgst key does not exist in the cache when calling HGET, `redis.String` will return an `ErrNil` which we need to translate into `distribution.ErrBlobUnknown` so that the error being returned can be properly handled. This will ensure that `SetDescriptor` is properly called from `cachedBlobStatter::Stat` for `repositoryScopedRedisBlobDescriptorService` which will update the redis cache and be considered as a Miss rather than an Error.
cc @manishtomar
* Update suite.go
Add unit test to ensure missing blobs for scoped repo properly return ErrBlobUnknown when HGET returns redis.ErrNil.
(cherry picked from commit dca6b9526a1d30dd218a9f321c4f84ecc4b5e62e)
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
* redis metrics
it is working but metrics are not very useful since default buckets
start from 5ms and almost all of them are in that range.
* remove extra comment
(cherry picked from commit ba1a1d74e7eb047dd1056548ccf0695e8846782c)
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
* fix redis caching issue
earlier redis cache was updated when there was any error including any
temporary connectivity issue. This would trigger set calls which would
further increase load and possibly connectivity errors from redis
leaving the system with continuous errors and high latency. Now the
cache is updated only when it is genuine cache miss. Other errors do not
trigger a cache update.
* add back tracker Hit() and Miss() calls
*squashed commits*
(cherry picked from commit 6f3e1c10260ef59ba4e9c42e939329fad9fdd8c3)
(cherry picked from commit 6738ff3320cf82cc2df919a95a1bde2f7789a501)
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Unit test coverge was increased to cover the usages of crypto. This helps to ensure that everything is working fine with fips mode enabled.
Also updated sha1 to sha256 in registry/storage/driver/testsuites/testsuites.go because sha1 is not supported in fips mode.
Signed-off-by: Naveed Jamil <naveed.jamil@tenpearl.com>
Use a synthetic upstream registry when creating the testing mirror configuration
to avoid the test fail when trying to reach http://example.com
Signed-off-by: Fernando Mayo Fernandez <fernando@undefinedlabs.com>
Some registries (ECR) don't provide a `Docker-Upload-UUID` when creating
a blob. So we can't rely on that header. Fallback to reading it from the
URL.
Signed-off-by: Damien Mathieu <dmathieu@salesforce.com>
I've found this logic being in a single method to be quite hard to get.
I believe extracting it makes it easier to read, as we can then more
easily see what the main method does and possibly ignore the intricacies
of `ResumeBlobUpload`.
Signed-off-by: Damien Mathieu <dmathieu@salesforce.com>
When uploading segments to Swift, the registry generates a random file,
by taking the hash of the container path and 32-bytes of random data.
The registry attempts to shard across multiple directory paths, by
taking the first three hex characters as leader.
The implementation in registry, unfortunately, takes the hash of
nothing, and appends it to the path and random data. This results in all
segments being created in one directory.
Fixes: #2407Fixes: #2311
Signed-off-by: Terin Stock <terinjokes@gmail.com>
fetchTokenWithBasicAuth checks if a token is in the token response
but fetchTokenWithOAuth does not
these changes implements the same behaviour for the latter
returning a `ErrNoToken` if a token is not found in the resposne
Signed-off-by: Sevki Hasirci <sevki@cloudflare.com>
Radosgw does not support S3 `GET Bucket` API v2 API but v1.
This API has backward compatibility, so most of this API is working
correctly but we can not get `KeyCount` in v1 API and which is only
for v2 API.
Signed-off-by: Eohyung Lee <liquidnuker@gmail.com>
Use mime.ParseMediaType to parse the media types in Accept header in manifest request. Ignore the failed ones.
Signed-off-by: Yu Wang <yuwa@microsoft.com>
This fixes registry endpoints to return the proper `application/json`
content-type for JSON content, also updating spec examples for that.
As per IETF specification and IANA registry [0], the `application/json`
type is a binary media, so the content-type label does not need any
text-charset selector. Additionally, the media type definition
explicitly states that it has no required nor optional parameters,
which makes the current registry headers non-compliant.
[0]: https://www.iana.org/assignments/media-types/application/json
Signed-off-by: Luca Bruno <lucab@debian.org>
Add an interface alongside TagStore that provides API to retreive
digests of all manifests that a tag historically pointed to. It also
includes currently linked tag.
Signed-off-by: Manish Tomar <manish.tomar@docker.com>
It's possible to run into a race condition in which the enumerator lists
lots of repositories and then starts the long process of enumerating through
them. In that time if someone deletes a repo, the enumerator may error out.
Signed-off-by: Ryan Abrams <rdabrams@gmail.com>
This is done by draining the connections for configured time after registry receives a SIGTERM signal.
This adds a `draintimeout` setting under `HTTP`. Registry doesn't drain
if draintimeout is not provided.
Signed-off-by: Manish Tomar <manish.tomar@docker.com>
context.App.repoRemover is single registry instance stored throughout
app run. It was wrapped in another remover when processing each request.
This remover happened to be remover got from previous request. This way
every remover created was stored in infinite linked list causing memory
leak. Fixing it by storing the wrapped remover inside the request context
which will get gced when request context is gced. This was introduced in
PR #2648.
Signed-off-by: Manish Tomar <manish.tomar@docker.com>
by having another interface RepositoryRemover that is implemented by
registry instance and is injected in app context for event tracking
Signed-off-by: Manish Tomar <manish.tomar@docker.com>
OCI Image manifests and indexes are supported both with and without
an embeded MediaType (the field is reserved according to the spec).
Test storing and retrieving both types from the manifest store.
Signed-off-by: Owen W. Taylor <otaylor@fishsoup.net>
In the OCI image specification, the MediaType field is reserved
and otherwise undefined; assume that manifests without a media
in storage are OCI images or image indexes, and determine which
by looking at what fields are in the JSON. We do keep a check
that when unmarshalling an OCI image or image index, if it has
a MediaType field, it must match that media type of the upload.
Signed-off-by: Owen W. Taylor <otaylor@fishsoup.net>
According golang documentation [1]: no goroutine should expect to be
able to acquire a read lock until the initial read lock is released.
[1] https://golang.org/pkg/sync/#RWMutex
Signed-off-by: Gladkov Alexey <agladkov@redhat.com>
at the first iteration, only the following metrics are collected:
- HTTP metrics of each API endpoint
- cache counter for request/hit/miss
- histogram of storage actions, including:
GetContent, PutContent, Stat, List, Move, and Delete
Signed-off-by: tifayuki <tifayuki@gmail.com>
This adds a configuration setting `HTTP.TLS.LetsEncrypt.Hosts` which can
be set to a list of hosts that the registry will whitelist for retrieving
certificates from Let's Encrypt. HTTPS connections with SNI hostnames
that are not whitelisted will be closed with an "unknown host" error.
It is required to avoid lots of unsuccessful registrations attempts that
are triggered by malicious clients connecting with bogus SNI hostnames.
NOTE: Due to a bug in the deprecated vendored rsc.io/letsencrypt library
clearing the host list requires deleting or editing of the cachefile to
reset the hosts list to null.
Signed-off-by: Felix Buenemann <felix.buenemann@gmail.com>
This removes the old global walk function, and changes all
the code to use the per-driver walk functions.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This changes the Walk Method used for catalog enumeration. Just to show
how much an effect this has on our s3 storage:
Original:
List calls: 6839
real 3m16.636s
user 0m0.000s
sys 0m0.016s
New:
ListObjectsV2 Calls: 1805
real 0m49.970s
user 0m0.008s
sys 0m0.000s
This is because it no longer performs a list and stat per item, and instead
is able to use the metadata gained from the list as a replacement to stat.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Move the Walk types into registry/storage/driver, and add a Walk method to each
storage driver. Although this is yet another API to implement, there is a fall
back implementation that relies on List and Stat. For some filesystems this is
very slow.
Also, this WalkDir Method conforms better do a traditional WalkDir (a la filepath).
This change is in preparation for refactoring.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
If tenant or tenantid are passed as env variables, we systematically use Sprint to make sure they are string and not integer as it would make mapstructure fail.
Signed-off-by: Raphaël Enrici <raphael@root-42.com>
The previous code assumed that the link returned when listing tags was
always absolute. However, some registries, such as quay.io, return the
link as a relative link (e.g. the second page for the quay.io/coreos/etcd
image is /v2/coreos/etcd/tags/list?next_page=<truncated>&n=50). Because
the relative link was retrieved directly, the fetch failed (with the
error `unsupported protocol scheme ""`).
Signed-off-by: Kevin Lin <kevin@kelda.io>
If htpasswd authentication option is configured but the htpasswd file is
missing, populate it with a default user and automatically generated
password.
The password will be printed to stdout.
Signed-off-by: Liron Levin <liron@twistlock.com>
The current registry/client sends the registered manifest types in
random order. Allow clients to request a single specific manifest type
or a preferred order as per the HTTP spec.
Signed-off-by: Clayton Coleman <ccoleman@redhat.com>
A statically hosted registry that responds correctly to GET with a
manifest will load the right digest (by looking at the manifest body and
calculating the digest). If the registry returns a HEAD without
`Docker-Content-Digest`, then the client Tags().Get() call will return
an empty digest.
This commit changes the client to fallback to loading the tag via GET if
the `Docker-Content-Digest` header is not set.
Signed-off-by: Clayton Coleman <ccoleman@redhat.com>
AuthorizeRequest() injects the 'pull' scope if `from` is set
unconditionally. If the current token already has that scope, it will
be inserted into the scope list twice and `addedScopes` will be set to
true, resulting in a new token being fetched that has no net new scopes.
Instead, check whether `additionalScopes` are actually new.
Signed-off-by: Clayton Coleman <ccoleman@redhat.com>
To simplify the vendoring story for the client, we have now removed the
requirement for `logrus` and the forked `context` package (usually
imported as `dcontext`). We inject the logger via the metrics tracker
for the blob cache and via options on the token handler. We preserve
logs on the proxy cache for that case. Clients expecting these log
messages may need to be updated accordingly.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Back in the before time, the best practices surrounding usage of Context
weren't quite worked out. We defined our own type to make usage easier.
As this packaged was used elsewhere, it make it more and more
challenging to integrate with the forked `Context` type. Now that it is
available in the standard library, we can just use that one directly.
To make usage more consistent, we now use `dcontext` when referring to
the distribution context package.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Under certain circumstances, the use of `StorageDriver.GetContent` can
result in unbounded memory allocations. In particualr, this happens when
accessing a layer through the manifests endpoint.
This problem is mitigated by setting a 4MB limit when using to access
content that may have been accepted from a user. In practice, this means
setting the limit with the use of `BlobProvider.Get` by wrapping
`StorageDriver.GetContent` in a helper that uses `StorageDriver.Reader`
with a `limitReader` that returns an error.
When mitigating this security issue, we also noticed that the size of
manifests uploaded to the registry is also unlimited. We apply similar
logic to the request body of payloads that are full buffered.
Signed-off-by: Stephen J Day <stephen.day@docker.com>