This is done by draining the connections for configured time after registry receives a SIGTERM signal.
This adds a `draintimeout` setting under `HTTP`. Registry doesn't drain
if draintimeout is not provided.
Signed-off-by: Manish Tomar <manish.tomar@docker.com>
context.App.repoRemover is single registry instance stored throughout
app run. It was wrapped in another remover when processing each request.
This remover happened to be remover got from previous request. This way
every remover created was stored in infinite linked list causing memory
leak. Fixing it by storing the wrapped remover inside the request context
which will get gced when request context is gced. This was introduced in
PR #2648.
Signed-off-by: Manish Tomar <manish.tomar@docker.com>
by having another interface RepositoryRemover that is implemented by
registry instance and is injected in app context for event tracking
Signed-off-by: Manish Tomar <manish.tomar@docker.com>
OCI Image manifests and indexes are supported both with and without
an embeded MediaType (the field is reserved according to the spec).
Test storing and retrieving both types from the manifest store.
Signed-off-by: Owen W. Taylor <otaylor@fishsoup.net>
In the OCI image specification, the MediaType field is reserved
and otherwise undefined; assume that manifests without a media
in storage are OCI images or image indexes, and determine which
by looking at what fields are in the JSON. We do keep a check
that when unmarshalling an OCI image or image index, if it has
a MediaType field, it must match that media type of the upload.
Signed-off-by: Owen W. Taylor <otaylor@fishsoup.net>
According golang documentation [1]: no goroutine should expect to be
able to acquire a read lock until the initial read lock is released.
[1] https://golang.org/pkg/sync/#RWMutex
Signed-off-by: Gladkov Alexey <agladkov@redhat.com>
at the first iteration, only the following metrics are collected:
- HTTP metrics of each API endpoint
- cache counter for request/hit/miss
- histogram of storage actions, including:
GetContent, PutContent, Stat, List, Move, and Delete
Signed-off-by: tifayuki <tifayuki@gmail.com>
This adds a configuration setting `HTTP.TLS.LetsEncrypt.Hosts` which can
be set to a list of hosts that the registry will whitelist for retrieving
certificates from Let's Encrypt. HTTPS connections with SNI hostnames
that are not whitelisted will be closed with an "unknown host" error.
It is required to avoid lots of unsuccessful registrations attempts that
are triggered by malicious clients connecting with bogus SNI hostnames.
NOTE: Due to a bug in the deprecated vendored rsc.io/letsencrypt library
clearing the host list requires deleting or editing of the cachefile to
reset the hosts list to null.
Signed-off-by: Felix Buenemann <felix.buenemann@gmail.com>
This removes the old global walk function, and changes all
the code to use the per-driver walk functions.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This changes the Walk Method used for catalog enumeration. Just to show
how much an effect this has on our s3 storage:
Original:
List calls: 6839
real 3m16.636s
user 0m0.000s
sys 0m0.016s
New:
ListObjectsV2 Calls: 1805
real 0m49.970s
user 0m0.008s
sys 0m0.000s
This is because it no longer performs a list and stat per item, and instead
is able to use the metadata gained from the list as a replacement to stat.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Move the Walk types into registry/storage/driver, and add a Walk method to each
storage driver. Although this is yet another API to implement, there is a fall
back implementation that relies on List and Stat. For some filesystems this is
very slow.
Also, this WalkDir Method conforms better do a traditional WalkDir (a la filepath).
This change is in preparation for refactoring.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
If tenant or tenantid are passed as env variables, we systematically use Sprint to make sure they are string and not integer as it would make mapstructure fail.
Signed-off-by: Raphaël Enrici <raphael@root-42.com>
The previous code assumed that the link returned when listing tags was
always absolute. However, some registries, such as quay.io, return the
link as a relative link (e.g. the second page for the quay.io/coreos/etcd
image is /v2/coreos/etcd/tags/list?next_page=<truncated>&n=50). Because
the relative link was retrieved directly, the fetch failed (with the
error `unsupported protocol scheme ""`).
Signed-off-by: Kevin Lin <kevin@kelda.io>
If htpasswd authentication option is configured but the htpasswd file is
missing, populate it with a default user and automatically generated
password.
The password will be printed to stdout.
Signed-off-by: Liron Levin <liron@twistlock.com>
The current registry/client sends the registered manifest types in
random order. Allow clients to request a single specific manifest type
or a preferred order as per the HTTP spec.
Signed-off-by: Clayton Coleman <ccoleman@redhat.com>
A statically hosted registry that responds correctly to GET with a
manifest will load the right digest (by looking at the manifest body and
calculating the digest). If the registry returns a HEAD without
`Docker-Content-Digest`, then the client Tags().Get() call will return
an empty digest.
This commit changes the client to fallback to loading the tag via GET if
the `Docker-Content-Digest` header is not set.
Signed-off-by: Clayton Coleman <ccoleman@redhat.com>
AuthorizeRequest() injects the 'pull' scope if `from` is set
unconditionally. If the current token already has that scope, it will
be inserted into the scope list twice and `addedScopes` will be set to
true, resulting in a new token being fetched that has no net new scopes.
Instead, check whether `additionalScopes` are actually new.
Signed-off-by: Clayton Coleman <ccoleman@redhat.com>
To simplify the vendoring story for the client, we have now removed the
requirement for `logrus` and the forked `context` package (usually
imported as `dcontext`). We inject the logger via the metrics tracker
for the blob cache and via options on the token handler. We preserve
logs on the proxy cache for that case. Clients expecting these log
messages may need to be updated accordingly.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Back in the before time, the best practices surrounding usage of Context
weren't quite worked out. We defined our own type to make usage easier.
As this packaged was used elsewhere, it make it more and more
challenging to integrate with the forked `Context` type. Now that it is
available in the standard library, we can just use that one directly.
To make usage more consistent, we now use `dcontext` when referring to
the distribution context package.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Under certain circumstances, the use of `StorageDriver.GetContent` can
result in unbounded memory allocations. In particualr, this happens when
accessing a layer through the manifests endpoint.
This problem is mitigated by setting a 4MB limit when using to access
content that may have been accepted from a user. In practice, this means
setting the limit with the use of `BlobProvider.Get` by wrapping
`StorageDriver.GetContent` in a helper that uses `StorageDriver.Reader`
with a `limitReader` that returns an error.
When mitigating this security issue, we also noticed that the size of
manifests uploaded to the registry is also unlimited. We apply similar
logic to the request body of payloads that are full buffered.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
If the client doesn't support manifest lists, the registry will
rewrite a manifest list into the old format. The Docker-Content-Digest
header should be updated in this case.
Signed-off-by: Oleg Bulatov <oleg@bulatov.me>
In some conditions, regulator.exit may not send a signal to blocked
regulator.enter.
Let's assume we are in the critical section of regulator.exit and r.available
is equal to 0. And there are three more gorotines. One goroutine also executes
regulator.exit and waits for the lock. Rest run regulator.enter and wait for
the signal.
We send the signal, and after releasing the lock, there will be lock
contention:
1. Wait from regulator.enter
2. Lock from regulator.exit
If the winner is Lock from regulator.exit, we will not send another signal to
unlock the second Wait.
Signed-off-by: Oleg Bulatov <obulatov@redhat.com>
Partially reverts change adding support for X-Forwarded-Port.
Changes the logic to prefer the standard Forwarded header over
X-Forwarded headers. Prefer forwarded "host" over "for" since
"for" represents the client and not the client's request.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Since there's no default case, if there's not a tag or digest you get
back a confusing error from the router about it not matching the
expected pattern.
Also redoing the tests for URLs a bit so that they can handle checking
for failures.
Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
A previous inspection of the code surrounding zero-length blobs led to
some interesting question. After inspection, it was found that the hash
was indeed for the empty string (""), and not an empty tar, so the code
was correct. The variable naming and comments have been updated
accordingly.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The registry uses partial Named values which the named parsers
no longer support. To allow the registry service to continue
to operate without canonicalization, switch to use WithName.
In the future, the registry should start using fully canonical
values on the backend and WithName should no longer support
creating partial values.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
When get manifest, the handler will try to retrieve it from storage driver. When storage driver is cloud storage, it can fail due to various reasons even if the manifest exists
(like 500, 503, etc. from storage server). Currently manifest handler blindly return 404 which can be confusing to user.
This change will return 404 if the manifest blob doesn't exist, and return 500 UnknownError for all other errors (consistent with the behavior of other handlers).
Signed-off-by: Yu Wang (UC) <yuwa@microsoft.com>
Once upon a time, we referred to manifests and images interchangably.
That simple past is no more. As we grow, we update our nomenclature and
so follows our code.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
With token authentication, requiring the "*" action for DELETE requests
makes it impossible to administratively lock a repository against pushes
and pulls but still allow deletion. This change adds a new "delete"
action for DELETE requests to make that possible.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
`app.driver.List` on `"/"` is very expensive if registry contains significant amount of images. And the result isn't used anyways.
In most (if not all) storage drivers, `Stat` has a cheaper implementation, so use it instead to achieve the same goal.
Signed-off-by: yixi zhang <yixi@memsql.com>
Modify manifest builder so it can be used to build
manifests with different configuration media types.
Rename config media type const to image config.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Derek McGowan <derek@mcgstyle.net>
Enforces backwards compatibility with older authorization servers
without requiring the client to know about the compatibility
requirements.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
See #2077 for background.
The PR #1438 which was not reviewed by azure folks basically introduced
a race condition around uploads to the same blob by multiple clients
concurrently as it used the "writer" type for PutContent(), introduced in #1438.
This does chunked upload of blobs using "AppendBlob" type, which was not atomic.
Usage of "writer" type and thus AppendBlobs on metadata files is currently not
concurrency-safe and generally, they are not the right type of blob for the job.
This patch fixes PutContent() to use the atomic upload operation that works
for uploads smaller than 64 MB and creates blobs with "BlockBlob" type. To be
backwards compatible, we query the type of the blob first and if it is not
a "BlockBlob" we delete the blob first before doing an atomic PUT. This
creates a small inconsistency/race window "only once". Once the blob is made
"BlockBlob", it is overwritten with a single PUT atomicallly next time.
Therefore, going forward, PutContent() will be producing BlockBlobs and it
will silently migrate the AppendBlobs introduced in #1438 to BlockBlobs with
this patch.
Tested with existing code side by side, both registries with and without this
patch work fine without breaking each other. So this should be good from a
backwards/forward compatiblity perspective, with a cost of doing an extra
HEAD checking the blob type.
Fixes#2077.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
Use whitelist of allowed repository classes to enforce.
By default all repository classes are allowed.
Add authorized resources to context after authorization.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Golint now checks for new lines at the end of go error strings,
remove these unneeded new lines.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Allow clients to handle errors being set in the WWW-Authenticate
rather than in the body. The WWW-Authenticate errors give a
more precise error describing what is needed to authorize
with the server.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Split challenges into its own package. Avoids possible
import cycle with challenges from client.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Updating to a recent version of Azure Storage SDK to be
able to patch some memory leaks through configurable HTTP client
changes which were made possible by recent patches to it.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
The current code determines the header order for the
"string-to-sign" payload by sorting on the concatenation
of headers and values, whereas it should only happen on the
key.
During multipart uploads, since `x-amz-copy-source-range` and
`x-amz-copy-source` headers are present, V2 signatures fail to
validate since header order is swapped.
This patch reverts to the expected behavior.
Signed-off-by: Pierre-Yves Ritschard <pyr@spootnik.org>
Prefer non-standard headers like X-Forwarded-Proto, X-Forwarded-Host and
X-Forwarded-Port over the standard Forwarded header to maintain
backwards compatibility.
If a port is not specified neither in Host nor in forwarded headers but
it is specified just with X-Forwarded-Port, use its value in base urls
for redirects.
Forwarded header is defined in rfc7239.
X-Forwarded-Port is a non-standard header. Here's a description copied
from "HTTP Headers and Elastic Load Balancing" of AWS ELB docs:
> The X-Forwarded-Port request header helps you identify the port that
> an HTTP or HTTPS load balancer uses to connect to the client.
Signed-off-by: Michal Minář <miminar@redhat.com>
Driver was passing connections by copying. Storing
`swift.Connection` as pointer to fix the warnings.
Ref: #2030.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
In GetContent() we read the bytes from a blob but do not close
the underlying response body.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
To allow generic manifest walking, we define an interface method of
`References` that returns the referenced items in the manifest. The
current implementation does not return the config target from schema2,
making this useless for most applications.
The garbage collector has been modified to show the utility of this
correctly formed `References` method. We may be able to make more
generic traversal methods with this, as well.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Context should use type values instead of strings.
Updated direct calls to WithValue, but still other uses of string keys.
Update Acl to ACL in s3 driver.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
The Redis tests were failing with a "connection pool exhausted" error
from Redigo. Closing the connection used for FLUSHDB fixes the problem.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
The Hub registry generates a large volume of notifications, many of
which are uninteresting based on target media type. Discarding them
within the notification endpoint consumes considerable resources that
could be saved by discarding them within the registry. To that end,
this change adds registry configuration options to restrict the
notifications sent to an endpoint based on target media type.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
Access logging is great. Access logging you can turn off is even
better. This change adds a configuration option for that.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
The token auth package logs JWT validation and verification failures at
the `error` level. But from the server's perspective, these aren't
errors. They're the expected response to bad input. Logging them at
the `info` level better reflects that distinction.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
If a user specifies `mydomain.com:443` in the `Host` configuration, the
PATCH request for the layer upload will fail because the challenge does not
appear to be in the map. To fix this, we normalize the map keys to always
use the Host:Port combination.
Closes https://github.com/docker/docker/issues/18469
Signed-off-by: Stan Hu <stanhu@gmail.com>
Running with the race detector may cause some parts
of the code to run slower causing a race in the scheduler
ordering.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
* tag service: properly handle error responses on HEAD requests by
re-issuing requests as GET for proper error details.
Fixes#1911.
Signed-off-by: dmitri <deemok@gmail.com>
* Simplify handling of failing HEAD requests in TagService and
make a GET request for cases:
- if the server does not handle HEAD
- if the response was an error to get error details
Signed-off-by: dmitri <deemok@gmail.com>
* Add a missing http.Response.Body.Close call for the GET request.
Signed-off-by: dmitri <deemok@gmail.com>
This change to the S3 Move method uses S3's multipart upload API to copy
objects whose size exceeds a threshold. Parts are copied concurrently.
The level of concurrency, part size, and threshold are all configurable
with reasonable defaults.
Using the multipart upload API has two benefits.
* The S3 Move method can now handle objects over 5 GB, fixing #886.
* Moving most objects, and espectially large ones, is faster. For
example, moving a 1 GB object averaged 30 seconds but now averages 10.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
This is already supported by ncw/swift, so we just need to pass the
parameters from the storage driver.
Signed-off-by: Stefan Majewsky <stefan.majewsky@sap.com>
Use the much faster math/rand.Read function where cryptographic
guarantees are not required. The unit test suite should speed up a
little bit but we've already optimized around this, so it may not
matter.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Previous component-wise path comparison is recursive and generates a
large amount of garbage. This more efficient version simply replaces the
path comparison with the zero-value to sort before everything. We do
this by replacing the byte-wise comparison that swaps a single character
inline for the separator comparison, such that separators sort first.
The resulting implementation provides component-wise path comparison
with no cost incurred for allocation or stack frame.
Direction of the comparison is also reversed to match Go style.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
* Allow precomputed stats on cross-mounted blobs
Signed-off-by: Michal Minář <miminar@redhat.com>
* Extended cross-repo mount tests
Signed-off-by: Michal Minář <miminar@redhat.com>
* Add Object ACL Support to the S3 Storage Backend
Signed-off-by: Frank Chen <frankchn@gmail.com>
* Made changes per @RichardScothern's comments
Signed-off-by: Frank Chen <frankchn@gmail.com>
* Fix Typos
Signed-off-by: Frank Chen <frankchn@gmail.com>
Pass the manifestURL directly into the schema2 manifest handler instead of
accessing through the repository as it has since the reference is now an
interface.
Signed-off-by: Richard Scothern <richard.scothern@docker.com>
Until we have some experience hosting foreign layer manifests, the Hub
operators wish to limit foreign layers on Hub. To that end, this change
adds registry configuration options to restrict the URLs that may appear
in pushed manifests.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
Adds a constant leeway (60 seconds) to the nbf and exp claim check to
account for clock skew between the registry servers and the
authentication server that generated the JWT.
The leeway of 60 seconds is a bit arbitrary but based on the RFC
recommendation and hub.docker.com logs/metrics where we don't see
drifts of more than a second on our servers running ntpd.
I didn't attempt to make the leeway configurable as it would add extra
complexity to the PR and I am not sure how Distribution prefer to
handle runtime flags like that.
Also, I am simplifying the exp and nbf check for readability as the
previous `NOT (A AND B)` with cmp operators was not very friendly.
Ref:
https://tools.ietf.org/html/rfc7519#section-4.1.5
Signed-off-by: Marcus Martins <marcus@docker.com>
This fixes errors other than io.EOF from being dropped when a storage driver
lists repositories. For example, filesystem driver may point to a missing
directory and errors, which then gets subsequently dropped.
Signed-off-by: Edgar Lee <edgar.lee@docker.com>
Allows using v2 for v1 endpoints.
The primary use case being for search which does not have a v2 specification.
Added a user scope for allowing v2 search
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
This is similar to waitForSegmentsToShowUp which is called during
Close/Commit. Intuitively, you wouldn't expect missing segments to be a
problem during read operations, since the previous Close/Commit
confirmed that all segments are there.
But due to the distributed nature of Swift, the read request could be
hitting a different storage node of the Swift cluster, where the
segments are still missing.
Load tests on my team's staging Swift cluster have shown this to occur
about once every 100-200 layer uploads when the Swift proxies are under
high load. The retry logic, borrowed from waitForSegmentsToShowUp, fixes
this temporary inconsistency.
Signed-off-by: Stefan Majewsky <stefan.majewsky@sap.com>
In Go's header parsing, the same header multiple times results in multiple entries in the `r.Header[...]` slice, but Go does no further parsing beyond that (and in https://golang.org/cl/4528086 it was determined that until/unless the stdlib itself needs it, Go will not do so).
The consequence here for parsing of `Accept:` headers is that we support the way Go outputs headers, but not all language HTTP libraries have a facility to output multiple headers instead of a single list header.
This change ensures that the following (valid) header blocks all parse to the same result for the purposes of what is being tested here:
```
Accept: a/b
Accept: b/c
Accept: d/e
```
```
Accept: a/b; q=0.5, b/c
Accept: d/e
```
```
Accept: a/b; q=0.1, b/c; q=0.2, d/e; q=0.8
```
Signed-off-by: Andrew "Tianon" Page <admwiggin@gmail.com>
The client may need the content digest to delete a manifest using the digest used by the registry.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
This lets us access registry config within middleware for additional
configuration of whatever it is that you're overriding.
Signed-off-by: Tony Holdstock-Brown <tony@docker.com>
go1.5 doesn't export http.StatusTooManyRequests while
go1.6 does. Fix this by hardcoding the status code for now.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
This commit refactors base.regulator into the 2.4 interfaces and adds a
filesystem configuration option `maxthreads` to configure the regulator.
By default `maxthreads` is set to 100. This means the FS driver is
limited to 100 concurrent blocking file operations. Any subsequent
operations will block in Go until previous filesystem operations
complete.
This ensures that the registry can never open thousands of simultaneous
threads from os filesystem operations.
Note that `maxthreads` can never be less than 25.
Add test case covering parsable string maxthreads
Signed-off-by: Tony Holdstock-Brown <tony@docker.com>
subsequent close.
When a blob upload is cancelled close the blobwriter before removing
upload state to ensure old hashstates don't persist.
Signed-off-by: Richard Scothern <richard.scothern@docker.com>
It's easily possible for a flood of requests to trigger thousands of
concurrent file accesses on the storage driver. Each file I/O call creates
a new OS thread that is not reaped by the Golang runtime. By limiting it
to only 100 at a time we can effectively bound the number of OS threads
in use by the storage driver.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Signed-off-by: Tony Holdstock-Brown <tony@docker.com>