This PR refactors the blob service API to be oriented around blob descriptors. Identified by digests, blobs become an abstract entity that can be read and written using a descriptor as a handle. This allows blobs to take many forms, such as a ReadSeekCloser or a simple byte buffer, allowing blob-oriented operations to better integrate with blob-agnostic APIs (such as the `io` package). The error definitions are now better organized to reflect conditions that can only be seen when interacting with the blob API.

The main benefit of this is to separate the much smaller metadata from large file storage. Many benefits also follow from this. Reading and writing have been separated into discrete services. Backend implementation is also simplified, by reducing the amount of metadata that needs to be picked up to simply serve a read. This also improves cacheability.

"Opening" a blob simply consists of an access check (Stat) and a path calculation. Caching is greatly simplified and we've made the mapping of provisional to canonical hashes a first-class concept. BlobDescriptorService and BlobProvider can be combined in different ways to achieve varying effects.

Recommended Review Approach
---------------------------

This is a very large patch. While apologies are in order, we are getting a considerable amount of refactoring. Most changes follow from the changes to the root package (distribution), so start there. From there, the main changes are in storage. Looking at (*repository).Blobs will help to understand how the linkedBlobStore is wired. One can explore the internals within and also branch out into understanding the changes to the caching layer. Following the descriptions below will also help to guide you.

To reduce the chances for regressions, it was critical that major changes to unit tests were avoided. Where possible, they are left untouched and where not, the spirit is hopefully captured. Pay particular attention to where behavior may have changed.

Storage
-------

The primary changes to the `storage` package, other than the interface updates, were to merge the layerstore and blobstore. Blob access is now layered even further. The first layer, blobStore, exposes a global `BlobStatter` and `BlobProvider`. Operations here provide a fast path for most read operations that don't take access control into account. The `linkedBlobStore` layers on top of the `blobStore`, providing repository-scoped blob link management in the backend. The `linkedBlobStore` implements the full `BlobStore` suite, providing access-controlled, repository-local blob writers. The abstraction between the two is slightly broken in that `linkedBlobStore` is the only channel under which one can write into the global blob store. The `linkedBlobStore` also provides flexibility in that it can act over different link sets depending on configuration. This allows us to use the same code for signature links, manifest links and blob links. Eventually, we will fully consolidate this storage.

The improved cache flow comes from the `linkedBlobStatter` component of `linkedBlobStore`. Using a `cachedBlobStatter`, these combine together to provide a simple cache hierarchy that should streamline access checks on read and write operations, or at least provide a single path to optimize. The metrics have been changed in a slightly incompatible way since the former operations, Fetch and Exists, are no longer relevant.

The fileWriter and fileReader have been slightly modified to support the rest of the changes. The most interesting is the removal of the `Stat` call from `newFileReader`. This was the source of unnecessary round trips that were only present to look up the size of the resulting reader. Now, one must simply pass in the size, requiring the caller to decide whether or not the `Stat` call is appropriate. In several cases, it turned out the caller already had the size.

The `WriterAt` implementation has been removed from `fileWriter`, since it is no longer required for `BlobWriter`, reducing the number of paths which writes may take.

Cache
-----

Unfortunately, the `cache` package required a near full rewrite. It was pretty mechanical in that the cache is oriented around the `BlobDescriptorService`, slightly modified to include the ability to set the values for individual digests. While the implementation is oriented towards caching, it can act as a primary store. Provisions are in place to have repository-local metadata, in addition to global metadata. Fallback is implemented as a part of the storage package to maintain this flexibility.

One unfortunate side-effect is that caching is now repository-scoped, rather than global. This should have little effect on performance but may increase memory usage.

Handlers
--------

The `handlers` package has been updated to leverage the new API. For the most part, the changes are superficial or mechanical based on the API changes. This did expose a bug in the handling of provisional vs canonical digests that was fixed in the unit tests.

Configuration
-------------

One user-facing change has been made to the configuration and is updated in the associated documentation. The `layerinfo` cache parameter has been deprecated in favor of the `blobdescriptor` cache parameter. Both are equivalent and configuration files should be backward compatible.

Notifications
-------------

Changes to the `notification` package are simply to support the interface changes.

Context
-------

A small change has been made to the tracing log-level. Traces have been moved from "info" to "debug" level to reduce output when not needed.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
Registry Configuration Reference
You configure a registry server using a YAML file. This page explains the configuration options and the values they can take. You'll also find examples of middleware and development environment configurations.
List of configuration options
This section lists all the registry configuration options. Some options in the list are mutually exclusive, so make sure to read the detailed reference information about each option that appears later in this page.
version: 0.1
log:
  level: debug
  formatter: text
  fields:
    service: registry
    environment: staging
loglevel: debug # deprecated: use "log"
storage:
  filesystem:
    rootdirectory: /tmp/registry
  azure:
    accountname: accountname
    accountkey: base64encodedaccountkey
    container: containername
  s3:
    accesskey: awsaccesskey
    secretkey: awssecretkey
    region: us-west-1
    bucket: bucketname
    encrypt: true
    secure: true
    v4auth: true
    chunksize: 5242880
    rootdirectory: /s3/object/name/prefix
  cache:
    blobdescriptor: redis
  maintenance:
    uploadpurging:
      enabled: true
      age: 168h
      interval: 24h
      dryrun: false
auth:
  silly:
    realm: silly-realm
    service: silly-service
  token:
    realm: token-realm
    service: token-service
    issuer: registry-token-issuer
    rootcertbundle: /root/certs/bundle
middleware:
  registry:
    - name: ARegistryMiddleware
      options:
        foo: bar
  repository:
    - name: ARepositoryMiddleware
      options:
        foo: bar
  storage:
    - name: cloudfront
      options:
        baseurl: https://my.cloudfronted.domain.com/
        privatekey: /path/to/pem
        keypairid: cloudfrontkeypairid
        duration: 3000
reporting:
  bugsnag:
    apikey: bugsnagapikey
    releasestage: bugsnagreleasestage
    endpoint: bugsnagendpoint
  newrelic:
    licensekey: newreliclicensekey
    name: newrelicname
    verbose: true
http:
  addr: localhost:5000
  prefix: /my/nested/registry/
  secret: asecretforlocaldevelopment
  tls:
    certificate: /path/to/x509/public
    key: /path/to/x509/private
    clientcas:
      - /path/to/ca.pem
      - /path/to/another/ca.pem
  debug:
    addr: localhost:5001
notifications:
  endpoints:
    - name: alistener
      disabled: false
      url: https://my.listener.com/event
      headers: <http.Header>
      timeout: 500
      threshold: 5
      backoff: 1000
redis:
  addr: localhost:6379
  password: asecret
  db: 0
  dialtimeout: 10ms
  readtimeout: 10ms
  writetimeout: 10ms
  pool:
    maxidle: 16
    maxactive: 64
    idletimeout: 300s
In some instances a configuration option is optional but it contains child options marked as required. This indicates that you can omit the parent with all its children. However, if the parent is included, you must also include all the children marked required.
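The rule above can be sketched as a small check. This is an illustrative model only, not the registry's own validation: the function name and the `REQUIRED_CHILDREN` table are hypothetical, and the table covers just the two `auth` providers, using the "Required" columns listed on this page.

```python
# Hypothetical table of required children, keyed by the parent's path.
# The entries mirror the "Required" columns for the auth providers only.
REQUIRED_CHILDREN = {
    ("auth", "silly"): ["realm", "service"],
    ("auth", "token"): ["realm", "service", "issuer", "rootcertbundle"],
}


def missing_required(config):
    """Return dotted paths of required children missing from present parents."""
    missing = []
    for path, children in REQUIRED_CHILDREN.items():
        node = config
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if not isinstance(node, dict):
            continue  # Parent omitted: none of its children are required.
        for child in children:
            if child not in node:
                missing.append(".".join(path + (child,)))
    return missing
```

A configuration with no `auth` section yields an empty list, while one that includes `auth.token` with only `realm` and `service` reports the two missing children.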
Override configuration options
You can use environment variables to override most configuration parameters. The exception is the `version` variable, which cannot be overridden. You can set environment variables on the command line using the `-e` flag on `docker run` or from within a Dockerfile using the `ENV` instruction.

To override a configuration option, create an environment variable named `REGISTRY_variable`, where `variable` is the name of the configuration option and the `_` (underscore) represents indentation levels. For example, you can configure the `rootdirectory` of the `filesystem` storage backend:
storage:
  filesystem:
    rootdirectory: /tmp/registry
To override this value, set an environment variable like this:
REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY=/tmp/registry/test
This variable overrides the `/tmp/registry` value with the `/tmp/registry/test` directory.
Note: If an environment variable changes a map value into a string, such as replacing the storage driver type with `REGISTRY_STORAGE=filesystem`, then all sub-fields will be erased. As such, specifying the storage type in the environment will remove all parameters related to the old storage configuration.
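The naming rule and the note about erased sub-fields can be illustrated with a short sketch. It models the configuration as nested dictionaries and is not the registry's actual parser; `env_var_name` and `apply_override` are hypothetical helper names.

```python
import copy


def env_var_name(path):
    """Build the override variable name for a list of nested option names."""
    return "REGISTRY_" + "_".join(part.upper() for part in path)


def apply_override(config, path, value):
    """Return a copy of config with the option at path replaced by value."""
    result = copy.deepcopy(config)
    node = result
    for key in path[:-1]:
        # Intermediate levels that are missing or not maps are (re)created.
        if not isinstance(node.get(key), dict):
            node[key] = {}
        node = node[key]
    node[path[-1]] = value
    return result


config = {"storage": {"filesystem": {"rootdirectory": "/tmp/registry"}}}

name = env_var_name(["storage", "filesystem", "rootdirectory"])
overridden = apply_override(
    config, ["storage", "filesystem", "rootdirectory"], "/tmp/registry/test"
)
# Replacing the whole `storage` map with a string drops every sub-field.
erased = apply_override(config, ["storage"], "filesystem")
```

Here `name` is `REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY`, and `erased` no longer contains the `filesystem` map or its `rootdirectory`.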
version
version: 0.1
The `version` option is required. It specifies the configuration's version. It is expected to remain a top-level field, to allow for a consistent version check before parsing the remainder of the configuration file.
log
The `log` subsection configures the behavior of the logging system. The logging system outputs everything to stdout. You can adjust the granularity and format with this configuration section.
log:
  level: debug
  formatter: text
  fields:
    service: registry
    environment: staging
Parameter | Required | Description |
---|---|---|
`level` | no | Sets the sensitivity of logging output. Permitted values are `error`, `warn`, `info` and `debug`. The default is `info`. |
`formatter` | no | Selects the format of logging output. The format primarily affects how keyed attributes for a log line are encoded. Options are `text`, `json` or `logstash`. The default is `text`. |
`fields` | no | A map of field names to values. These are added to every log line for context. This is useful for identifying the source of log messages after they have been mixed in other systems. |
loglevel
DEPRECATED: Please use log instead.
loglevel: debug
Permitted values are `error`, `warn`, `info` and `debug`. The default is `info`.
storage
storage:
  filesystem:
    rootdirectory: /tmp/registry
  azure:
    accountname: accountname
    accountkey: base64encodedaccountkey
    container: containername
  s3:
    accesskey: awsaccesskey
    secretkey: awssecretkey
    region: us-west-1
    bucket: bucketname
    encrypt: true
    secure: true
    v4auth: true
    chunksize: 5242880
    rootdirectory: /s3/object/name/prefix
  cache:
    blobdescriptor: inmemory
  maintenance:
    uploadpurging:
      enabled: true
      age: 168h
      interval: 24h
      dryrun: false
The storage option is required and defines which storage backend is in use. You must configure one backend; if you configure more, the registry returns an error.
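As a rough illustration of the one-backend rule, the sketch below treats every key under `storage` other than `cache` and `maintenance` as a backend, which matches the subsections documented on this page. The function name is hypothetical and the error text is not the registry's.

```python
# Keys under `storage` that this page documents as non-backend subsections.
NON_BACKEND_KEYS = {"cache", "maintenance"}


def storage_backend(storage):
    """Return the single configured backend name, or raise ValueError."""
    backends = sorted(key for key in storage if key not in NON_BACKEND_KEYS)
    if len(backends) != 1:
        raise ValueError(
            "expected exactly one storage backend, found: %s" % (backends or "none")
        )
    return backends[0]
```

Passing a `storage` map that contains both `filesystem` and `s3` raises an error, while a map with `filesystem` plus `cache` returns `filesystem`.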
cache
Use the `cache` subsection to enable caching of data accessed in the storage backend. Currently, the only available cache provides fast access to layer metadata. This, if configured, uses the `blobdescriptor` field.

You can set the `blobdescriptor` field to `redis` or `inmemory`. The `redis` value uses a Redis pool to cache layer metadata. The `inmemory` value uses an in-memory map.
NOTE: Formerly, `blobdescriptor` was known as `layerinfo`. While these are equivalent, `layerinfo` has been deprecated in favor of `blobdescriptor`.
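Because the two names are equivalent, a tool that reads older configuration files could normalize them before use. The following is a minimal sketch under that assumption; `normalize_cache` is a hypothetical helper, and letting an explicit `blobdescriptor` win when both keys are present is a choice made here, not documented behavior.

```python
def normalize_cache(cache):
    """Return a copy of the cache section using only the current key name."""
    normalized = dict(cache)
    if "layerinfo" in normalized:
        legacy = normalized.pop("layerinfo")
        # Assumption: an explicit `blobdescriptor` wins if both are present.
        normalized.setdefault("blobdescriptor", legacy)
    return normalized
```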
filesystem
The `filesystem` storage backend uses the local disk to store registry files. It is ideal for development and may be appropriate for some small-scale production applications.

This backend has a single, required `rootdirectory` parameter. The parameter specifies the absolute path to a directory. The registry stores all its data here, so make sure there is adequate space available.
azure
This storage backend uses Microsoft's Azure Storage platform.
Parameter | Required | Description |
---|---|---|
`accountname` | yes | Azure account name. |
`accountkey` | yes | Azure account key. |
`container` | yes | Name of the Azure container into which to store data. |
S3
This storage backend uses Amazon's Simple Storage Service (S3).
Parameter | Required | Description |
---|---|---|
`accesskey` | yes | Your AWS Access Key. |
`secretkey` | yes | Your AWS Secret Key. |
`region` | yes | The AWS region in which your bucket exists. For the moment, the Go AWS library in use does not use the newer DNS based bucket routing. |
`bucket` | yes | The bucket name in which you want to store the registry's data. |
`encrypt` | no | Specifies whether the registry stores the image in encrypted format or not. A boolean value. The default is `false`. |
`secure` | no | Indicates whether to use HTTPS instead of HTTP. A boolean value. The default is `false`. |
`v4auth` | no | Indicates whether the registry uses Version 4 of AWS's authentication. Generally, you should set this to `true`. By default, this is `false`. |
`chunksize` | no | The S3 API requires multipart upload chunks to be at least 5MB. This value should be a number that is at least 5*1024*1024 (5242880). |
`rootdirectory` | no | This is a prefix that will be applied to all S3 keys to allow you to segment data in your bucket if necessary. |
Maintenance
Currently the registry can perform one maintenance function: upload purging. This and future maintenance functions which are related to storage can be configured under the maintenance section.
Upload Purging
Upload purging is a background process that periodically removes orphaned files from the upload directories of the registry. Upload purging is enabled by default. To configure upload directory purging, the following parameters must be set.
Parameter | Required | Description |
---|---|---|
`enabled` | yes | Set to `true` to enable upload purging. Default=true. |
`age` | yes | Upload directories which are older than this age will be deleted. Default=168h (1 week). |
`interval` | yes | The interval between upload directory purging. Default=24h. |
`dryrun` | yes | `dryrun` can be set to `true` to obtain a summary of what directories will be deleted. Default=false. |
Note: `age` and `interval` are strings containing a number with optional fraction and a unit suffix: e.g. 45m, 2h10m, 168h (1 week).
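To make the string format concrete, here is a minimal sketch of a parser for values such as `45m`, `2h10m` and `168h`. It is an illustration, not the registry's parser: the helper name is hypothetical, and the set of accepted suffixes (`h`, `m`, `s`, `ms`) is an assumption limited to units that appear in examples on this page.

```python
import re
from datetime import timedelta

# Assumed unit table: only suffixes that appear in this page's examples.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
# "ms" is listed before "m" so the longer suffix is matched first.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_duration(text):
    """Parse strings like '2h10m' into a timedelta; raise ValueError otherwise."""
    if not text:
        raise ValueError("empty duration")
    position = 0
    total_seconds = 0.0
    while position < len(text):
        match = _TOKEN.match(text, position)
        if match is None:
            raise ValueError("invalid duration: %r" % text)
        total_seconds += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        position = match.end()
    return timedelta(seconds=total_seconds)
```

With this sketch, `parse_duration("168h")` equals `timedelta(days=7)`, which matches the one-week default for `age`.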
auth
auth:
  silly:
    realm: silly-realm
    service: silly-service
  token:
    realm: token-realm
    service: token-service
    issuer: registry-token-issuer
    rootcertbundle: /root/certs/bundle
The `auth` option is optional as there are use cases (e.g. a mirror that only permits pulls) for which authentication may not be desired. There are currently two possible auth providers, `silly` and `token`. You can configure only one `auth` provider.
silly
The `silly` auth is only for development purposes. It simply checks for the existence of the `Authorization` header in the HTTP request. It has no regard for the header's value. If the header does not exist, the `silly` auth responds with a challenge response, echoing back the realm, service, and scope that access was denied for.
The following values are used to configure the response:
Parameter | Required | Description |
---|---|---|
`realm` | yes | The realm in which the registry server authenticates. |
`service` | yes | The service being authenticated. |
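The check described above for `silly` can be modeled in a few lines. This is a sketch of the described behavior only: the function name and the dictionary shape of the result are hypothetical, and the real challenge is an HTTP response whose exact format is not shown on this page.

```python
def silly_authorize(headers, realm, service, scope):
    """Allow any request carrying an Authorization header, whatever its value."""
    # HTTP header names are case-insensitive, so compare in lower case.
    has_header = any(name.lower() == "authorization" for name in headers)
    if has_header:
        return {"allowed": True}
    # No header: deny and echo back what access was denied for.
    return {
        "allowed": False,
        "challenge": {"realm": realm, "service": service, "scope": scope},
    }
```

Because only the header's presence matters, a request with `Authorization: anything` is allowed, which is why this provider is unsuitable outside development.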
token
Token based authentication allows the authentication system to be decoupled from the registry. It is a well established authentication paradigm with a high degree of security.
Parameter | Required | Description |
---|---|---|
`realm` | yes | The realm in which the registry server authenticates. |
`service` | yes | The service being authenticated. |
`issuer` | yes | The name of the token issuer. The issuer inserts this into the token so it must match the value configured for the issuer. |
`rootcertbundle` | yes | The absolute path to the root certificate bundle. This bundle contains the public part of the certificates that are used to sign authentication tokens. |
For more information about token-based authentication configuration, see the specification.
middleware
The `middleware` option is optional. Use this option to inject middleware at named hook points. All middlewares must implement the same interface as the object they're wrapping. This means a registry middleware must implement the `distribution.Namespace` interface, repository middleware must implement `distribution.Repository`, and storage middleware must implement `driver.StorageDriver`.

Currently only one middleware, `cloudfront`, a storage middleware, is supported in the registry implementation.
middleware:
  registry:
    - name: ARegistryMiddleware
      options:
        foo: bar
  repository:
    - name: ARepositoryMiddleware
      options:
        foo: bar
  storage:
    - name: cloudfront
      options:
        baseurl: https://my.cloudfronted.domain.com/
        privatekey: /path/to/pem
        keypairid: cloudfrontkeypairid
        duration: 3000
Each middleware entry has `name` and `options` entries. The `name` must correspond to the name under which the middleware registers itself. The `options` field is a map that details custom configuration required to initialize the middleware. It is treated as a `map[string]interface{}`. As such, it supports arbitrarily nested structures, leaving it up to the middleware initialization function to determine how to interpret the options.
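The name-based lookup and pass-through of `options` can be pictured with a small model. The registry itself is written in Go, so this Python sketch only mirrors the idea: `register`, `build`, and `init_example` are hypothetical names, and the returned dictionary stands in for a wrapped object.

```python
# Hypothetical registry mapping middleware names to init functions.
_MIDDLEWARE = {}


def register(name, init):
    """Record an init function under the name used in the configuration."""
    _MIDDLEWARE[name] = init


def build(entry, wrapped):
    """Initialize one `middleware` entry around the object it wraps."""
    init = _MIDDLEWARE.get(entry["name"])
    if init is None:
        raise KeyError("no middleware registered as %r" % entry["name"])
    # Options are passed through untouched; the init function interprets them.
    return init(wrapped, entry.get("options", {}))


def init_example(wrapped, options):
    """Example init function that reads a single option named `foo`."""
    return {"wraps": wrapped, "foo": options.get("foo")}


register("ARepositoryMiddleware", init_example)
built = build({"name": "ARepositoryMiddleware", "options": {"foo": "bar"}}, "repo")
```

An entry whose `name` was never registered fails at build time, which is why the configured name must match the name the middleware registers under.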
cloudfront
Parameter | Required | Description |
---|---|---|
`baseurl` | yes | `SCHEME://HOST[/PATH]` at which Cloudfront is served. |
`privatekey` | yes | Private Key for Cloudfront provided by AWS. |
`keypairid` | yes | Key pair ID provided by AWS. |
`duration` | no | Duration for which a signed URL should be valid. |
reporting
reporting:
  bugsnag:
    apikey: bugsnagapikey
    releasestage: bugsnagreleasestage
    endpoint: bugsnagendpoint
  newrelic:
    licensekey: newreliclicensekey
    name: newrelicname
    verbose: true
The `reporting` option is optional and configures error and metrics reporting tools. At the moment only two services are supported, New Relic and Bugsnag. A valid configuration may contain both.
bugsnag
Parameter | Required | Description |
---|---|---|
`apikey` | yes | API Key provided by Bugsnag. |
`releasestage` | no | Tracks where the registry is deployed, for example, production, staging, or development. |
`endpoint` | no | Specify the enterprise Bugsnag endpoint. |
newrelic
Parameter | Required | Description |
---|---|---|
`licensekey` | yes | License key provided by New Relic. |
`name` | no | New Relic application name. |
`verbose` | no | Enable New Relic debugging output on stdout. |
http
http:
  addr: localhost:5000
  net: tcp
  prefix: /my/nested/registry/
  secret: asecretforlocaldevelopment
  tls:
    certificate: /path/to/x509/public
    key: /path/to/x509/private
    clientcas:
      - /path/to/ca.pem
      - /path/to/another/ca.pem
  debug:
    addr: localhost:5001
The `http` option details the configuration for the HTTP server that hosts the registry.
Parameter | Required | Description |
---|---|---|
`addr` | yes | The address on which the server should accept connections. The form depends on the network type (see the `net` option): `HOST:PORT` for tcp and `FILE` for a unix socket. |
`net` | no | The network which is used to create a listening socket. Known networks are `unix` and `tcp`. The default empty value means tcp. |
`prefix` | no | If the server does not run at the root path, use this value to specify the prefix. The root path is the section before `v2`. It should have both preceding and trailing slashes, for example `/path/`. |
`secret` | yes | A random piece of data. This is used to sign state that may be stored with the client to protect against tampering. For production environments you should generate a random piece of data using a cryptographically secure random generator. |
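One way to produce a value for `secret` with a cryptographically secure generator is Python's standard `secrets` module. This is a sketch rather than a recommendation from the registry project: the helper name is hypothetical, and the 32-byte length and hex encoding are assumptions, since this page does not specify either.

```python
import secrets


def generate_http_secret(num_bytes=32):
    """Return a random hex string suitable as a value for `http.secret`."""
    # token_hex draws from the operating system's secure random source.
    return secrets.token_hex(num_bytes)
```

The result can be pasted into the configuration file or supplied through the `REGISTRY_HTTP_SECRET` environment variable, following the override naming rule described earlier on this page.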
tls
The `tls` struct within `http` is optional. Use this to configure TLS for the server. If you already have a server such as Nginx or Apache running on the same host as the registry, you may prefer to configure TLS termination there and proxy connections to the registry server.
Parameter | Required | Description |
---|---|---|
`certificate` | yes | Absolute path to x509 cert file. |
`key` | yes | Absolute path to x509 private key file. |
`clientcas` | no | An array of absolute paths to x509 CA files. |
debug
The `debug` option is optional. Use it to configure a debug server that can be helpful in diagnosing problems. Contributors to the distribution repository should find the debug server useful. Docker recommends disabling it in production environments.

The `debug` section takes a single, required `addr` parameter. This parameter specifies the `HOST:PORT` on which the debug server should accept connections.
notifications
notifications:
  endpoints:
    - name: alistener
      disabled: false
      url: https://my.listener.com/event
      headers: <http.Header>
      timeout: 500
      threshold: 5
      backoff: 1000
The `notifications` option is optional and currently may contain a single option, `endpoints`.
endpoints
Endpoints is a list of named services (URLs) that can accept event notifications.
Parameter | Required | Description |
---|---|---|
`name` | yes | A human readable name for the service. |
`disabled` | no | A boolean to enable/disable notifications for a service. |
`url` | yes | The URL to which events should be published. |
`headers` | yes | Static headers to add to each request. |
`timeout` | yes | An HTTP timeout value. This field takes a positive integer and an optional suffix indicating the unit of time. |
`threshold` | yes | An integer specifying how long to wait before backing off a failure. |
`backoff` | yes | How long the system backs off before retrying. This field takes a positive integer and an optional suffix indicating the unit of time. |
redis
redis:
  addr: localhost:6379
  password: asecret
  db: 0
  dialtimeout: 10ms
  readtimeout: 10ms
  writetimeout: 10ms
  pool:
    maxidle: 16
    maxactive: 64
    idletimeout: 300s
Declare parameters for constructing the Redis connections. Registry instances may use the Redis instance for several applications. The current purpose is caching information about immutable blobs. Most of the options below control how the registry connects to Redis. You can control the pool's behavior with the `pool` subsection.
Parameter | Required | Description |
---|---|---|
`addr` | yes | Address (host and port) of the Redis instance. |
`password` | no | A password used to authenticate to the Redis instance. |
`db` | no | Selects the db for each connection. |
`dialtimeout` | no | Timeout for connecting to a Redis instance. |
`readtimeout` | no | Timeout for reading from Redis connections. |
`writetimeout` | no | Timeout for writing to Redis connections. |
pool
pool:
  maxidle: 16
  maxactive: 64
  idletimeout: 300s
Configure the behavior of the Redis connection pool.
Parameter | Required | Description |
---|---|---|
`maxidle` | no | Sets the maximum number of idle connections. |
`maxactive` | no | Sets the maximum number of connections that should be opened before blocking a connection request. |
`idletimeout` | no | Sets the amount of time to wait before closing inactive connections. |
Example: Development configuration
The following is a simple example you can use for local development:
version: 0.1
log:
  level: debug
storage:
  filesystem:
    rootdirectory: /tmp/registry-dev
http:
  addr: localhost:5000
  secret: asecretforlocaldevelopment
  debug:
    addr: localhost:5001
The above configures the registry instance to run on port `5000`, binding to `localhost`, with the `debug` server enabled. Registry data storage is in the `/tmp/registry-dev` directory. Logging is in `debug` mode, which is the most verbose.
A similar simple configuration is available at config.yml. Both are generally useful for local development.
Example: Middleware configuration
This example illustrates how to configure storage middleware in a registry. Middleware allows the registry to serve layers via a content delivery network (CDN). This is useful for reducing requests to the storage layer.
Currently, the registry supports Amazon Cloudfront. You can only use Cloudfront in conjunction with the S3 storage driver.
Parameter | Description |
---|---|
`name` | The storage middleware name. Currently `cloudfront` is the only accepted value. |
`disabled` | Set to `true` to easily disable the middleware. |
`options` | A set of key/value options to configure the middleware. |
The following example illustrates these values:
middleware:
  storage:
    - name: cloudfront
      disabled: false
      options:
        baseurl: http://d111111abcdef8.cloudfront.net
        privatekey: /path/to/asecret.pem
        keypairid: asecret
        duration: 60
Note: Cloudfront keys exist separately from other AWS keys. See the documentation on AWS credentials for more information.