distribution/docs/configuration.md
Stephen J Day 593bbccdb5 Refactor Blob Service API
This PR refactors the blob service API to be oriented around blob descriptors.
Identified by digests, blobs become an abstract entity that can be read and
written using a descriptor as a handle. This allows blobs to take many forms,
such as a ReadSeekCloser or a simple byte buffer, allowing blob oriented
operations to better integrate with blob agnostic APIs (such as the `io`
package). The error definitions are now better organized to reflect conditions
that can only be seen when interacting with the blob API.

The main benefit of this is to separate the much smaller metadata from large
file storage, and several other benefits follow from it. Reading and writing
have been separated into discrete services. Backend implementations are also
simplified, since less metadata needs to be picked up to simply serve a read.
This also improves cacheability.

"Opening" a blob simply consists of an access check (Stat) and a path
calculation. Caching is greatly simplified and we've made the mapping of
provisional to canonical hashes a first-class concept. BlobDescriptorService
and BlobProvider can be combined in different ways to achieve varying effects.

Recommended Review Approach
---------------------------

This is a very large patch. While apologies are in order, we get a
considerable amount of refactoring in exchange. Most changes follow from the
changes to the root package (distribution), so start there. From there, the
main changes are in storage. Looking at (*repository).Blobs will help you
understand how the linkedBlobStore is wired. One can explore the internals
within and also branch out into understanding the changes to the caching
layer. Following the descriptions below will also help to guide you.

To reduce the chances for regressions, it was critical that major changes to
unit tests were avoided. Where possible, they are left untouched and where
not, the spirit is hopefully captured. Pay particular attention to where
behavior may have changed.

Storage
-------

The primary changes to the `storage` package, other than the interface
updates, were to merge the layerstore and blobstore. Blob access is now
layered even further. The first layer, blobStore, exposes a global
`BlobStatter` and `BlobProvider`. Operations here provide a fast path for most
read operations that don't take access control into account. The
`linkedBlobStore` layers on top of the `blobStore`, providing repository-
scoped blob link management in the backend. The `linkedBlobStore` implements
the full `BlobStore` suite, providing access-controlled, repository-local blob
writers. The abstraction between the two is slightly broken in that
`linkedBlobStore` is the only channel through which one can write into the global
blob store. The `linkedBlobStore` also provides flexibility in that it can act
over different link sets depending on configuration. This allows us to use the
same code for signature links, manifest links and blob links.  Eventually, we
will fully consolidate this storage.

The improved cache flow comes from the `linkedBlobStatter` component
of `linkedBlobStore`. Using a `cachedBlobStatter`, these combine to
provide a simple cache hierarchy that should streamline access checks on read
and write operations, or at least provide a single path to optimize. The
metrics have been changed in a slightly incompatible way since the former
operations, Fetch and Exists, are no longer relevant.

The fileWriter and fileReader have been slightly modified to support the rest
of the changes. The most interesting is the removal of the `Stat` call from
`newFileReader`. This was the source of unnecessary round trips that were only
present to look up the size of the resulting reader. Now, one must simply pass
in the size, requiring the caller to decide whether or not the `Stat` call is
appropriate. In several cases, it turned out the caller already had the size
at hand. The `WriterAt` implementation has been removed from `fileWriter`,
since it is no longer required for `BlobWriter`, reducing the number of paths
which writes may take.

Cache
-----

Unfortunately, the `cache` package required a near-full rewrite. The rewrite
was fairly mechanical: the cache is now oriented around the
`BlobDescriptorService`, slightly modified to include the ability to set the
values for individual digests. While the implementation is oriented towards
caching, it can act as a primary store. Provisions are in place to have
repository-local metadata, in addition to global metadata. Fallback is
implemented as a part of the storage package to maintain this flexibility.

One unfortunate side-effect is that caching is now repository-scoped, rather
than global. This should have little effect on performance but may increase
memory usage.

Handlers
--------

The `handlers` package has been updated to leverage the new API. For the most
part, the changes are superficial or mechanical based on the API changes. This
did expose a bug in the handling of provisional vs canonical digests that was
fixed in the unit tests.

Configuration
-------------

One user-facing change has been made to the configuration and is reflected in
the associated documentation. The `layerinfo` cache parameter has been
deprecated in favor of the `blobdescriptor` cache parameter. Both are
equivalent, so existing configuration files remain backward compatible.
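
For illustration, the two spellings are interchangeable; a storage cache
section written with the deprecated name and its equivalent replacement might
look like this (inmemory is just one of the documented cache backends):

storage:
    cache:
        layerinfo: inmemory       # deprecated spelling

storage:
    cache:
        blobdescriptor: inmemory  # preferred spelling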

Notifications
-------------

Changes to the `notifications` package simply support the interface changes.

Context
-------

A small change has been made to the tracing log-level. Traces have been moved
from "info" to "debug" level to reduce output when not needed.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-05-15 17:05:18 -07:00

Registry Configuration Reference

You configure a registry server using a YAML file. This page explains the configuration options and the values they can take. You'll also find examples of middleware and development environment configurations.

List of configuration options

This section lists all the registry configuration options. Some options in the list are mutually exclusive. So, make sure to read the detailed reference information about each option that appears later in this page.

version: 0.1
log:
	level: debug
	formatter: text
	fields:
		service: registry
		environment: staging
loglevel: debug # deprecated: use "log"
storage:
	filesystem:
		rootdirectory: /tmp/registry
	azure:
		accountname: accountname
		accountkey: base64encodedaccountkey
		container: containername
	s3:
		accesskey: awsaccesskey
		secretkey: awssecretkey
		region: us-west-1
		bucket: bucketname
		encrypt: true
		secure: true
		v4auth: true
		chunksize: 5242880
		rootdirectory: /s3/object/name/prefix
	cache:
		blobdescriptor: redis
	maintenance:
		uploadpurging:
			enabled: true
			age: 168h
			interval: 24h
			dryrun: false
auth:
	silly:
		realm: silly-realm
		service: silly-service
	token:
		realm: token-realm
		service: token-service
		issuer: registry-token-issuer
		rootcertbundle: /root/certs/bundle
middleware:
	registry:
		- name: ARegistryMiddleware
		  options:
			foo: bar
	repository:
		- name: ARepositoryMiddleware
		  options:
			foo: bar
	storage:
		- name: cloudfront
		  options:
			baseurl: https://my.cloudfronted.domain.com/
			privatekey: /path/to/pem
			keypairid: cloudfrontkeypairid
			duration: 3000
reporting:
	bugsnag:
		apikey: bugsnagapikey
		releasestage: bugsnagreleasestage
		endpoint: bugsnagendpoint
	newrelic:
		licensekey: newreliclicensekey
		name: newrelicname
		verbose: true
http:
	addr: localhost:5000
	prefix: /my/nested/registry/
	secret: asecretforlocaldevelopment
	tls:
		certificate: /path/to/x509/public
		key: /path/to/x509/private
		clientcas:
			- /path/to/ca.pem
			- /path/to/another/ca.pem
	debug:
		addr: localhost:5001
notifications:
	endpoints:
		- name: alistener
		  disabled: false
		  url: https://my.listener.com/event
		  headers: <http.Header>
		  timeout: 500
		  threshold: 5
		  backoff: 1000
redis:
	addr: localhost:6379
	password: asecret
	db: 0
	dialtimeout: 10ms
	readtimeout: 10ms
	writetimeout: 10ms
	pool:
		maxidle: 16
		maxactive: 64
		idletimeout: 300s

In some instances a configuration option is optional but it contains child options marked as required. This indicates that you can omit the parent with all its children. However, if the parent is included, you must also include all the children marked required.

Override configuration options

You can use environment variables to override most configuration parameters. The exception is the version variable which cannot be overridden. You can set environment variables on the command line using the -e flag on docker run or from within a Dockerfile using the ENV instruction.

To override a configuration option, create an environment variable named REGISTRY_variable where variable is the name of the configuration option and the _ (underscore) represents indentation levels. For example, you can configure the rootdirectory of the filesystem storage backend:

storage:
	filesystem:
		rootdirectory: /tmp/registry

To override this value, set an environment variable like this:

REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY=/tmp/registry/test

This variable overrides the /tmp/registry value, pointing the registry to the /tmp/registry/test directory instead.

Note: If an environment variable changes a map value into a string, such as replacing the storage driver type with REGISTRY_STORAGE=filesystem, then all sub-fields will be erased. As such, specifying the storage type in the environment will remove all parameters related to the old storage configuration.

version

version: 0.1

The version option is required. It specifies the configuration's version. It is expected to remain a top-level field, to allow for a consistent version check before parsing the remainder of the configuration file.

log

The log subsection configures the behavior of the logging system. The logging system outputs everything to stdout. You can adjust the granularity and format with this configuration section.

log:
	level: debug
	formatter: text
	fields:
		service: registry
		environment: staging

| Parameter | Required | Description |
|-----------|----------|-------------|
| level | no | Sets the sensitivity of logging output. Permitted values are error, warn, info and debug. The default is info. |
| formatter | no | Selects the format of logging output. The format primarily affects how keyed attributes for a log line are encoded. Options are text, json or logstash. The default is text. |
| fields | no | A map of field names to values. These are added to every log line for the context. This is useful for identifying the source of log messages after they have been mixed in with output from other systems. |

loglevel

DEPRECATED: Please use log instead.

loglevel: debug

Permitted values are error, warn, info and debug. The default is info.

storage

storage:
	filesystem:
		rootdirectory: /tmp/registry
	azure:
		accountname: accountname
		accountkey: base64encodedaccountkey
		container: containername
	s3:
		accesskey: awsaccesskey
		secretkey: awssecretkey
		region: us-west-1
		bucket: bucketname
		encrypt: true
		secure: true
		v4auth: true
		chunksize: 5242880
		rootdirectory: /s3/object/name/prefix
	cache:
		blobdescriptor: inmemory
	maintenance:
		uploadpurging:
			enabled: true
			age: 168h
			interval: 24h
			dryrun: false

The storage option is required and defines which storage backend is in use. You must configure one backend; if you configure more, the registry returns an error.

cache

Use the cache subsection to enable caching of data accessed in the storage backend. Currently, the only available cache provides fast access to layer metadata. This, if configured, uses the blobdescriptor field.

You can set blobdescriptor field to redis or inmemory. The redis value uses a Redis pool to cache layer metadata. The inmemory value uses an in memory map.

NOTE: Formerly, blobdescriptor was known as layerinfo. While these are equivalent, layerinfo has been deprecated in favor of blobdescriptor.
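
As a minimal sketch combining the options documented above, a storage section that enables the in-memory descriptor cache alongside the filesystem driver could look like this:

storage:
    filesystem:
        rootdirectory: /tmp/registry
    cache:
        blobdescriptor: inmemory  # or redis, as documented above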

filesystem

The filesystem storage backend uses the local disk to store registry files. It is ideal for development and may be appropriate for some small-scale production applications.

This backend has a single, required rootdirectory parameter. The parameter specifies the absolute path to a directory. The registry stores all its data here so make sure there is adequate space available.

azure

This storage backend uses Microsoft's Azure Storage platform.

| Parameter | Required | Description |
|-----------|----------|-------------|
| accountname | yes | Azure account name. |
| accountkey | yes | Azure account key. |
| container | yes | Name of the Azure container into which to store data. |

S3

This storage backend uses Amazon's Simple Storage Service (S3).

| Parameter | Required | Description |
|-----------|----------|-------------|
| accesskey | yes | Your AWS Access Key. |
| secretkey | yes | Your AWS Secret Key. |
| region | yes | The AWS region in which your bucket exists. For the moment, the Go AWS library in use does not use the newer DNS based bucket routing. |
| bucket | yes | The bucket name in which you want to store the registry's data. |
| encrypt | no | Specifies whether the registry stores the image in encrypted format or not. A boolean value. The default is false. |
| secure | no | Indicates whether to use HTTPS instead of HTTP. A boolean value. The default is false. |
| v4auth | no | Indicates whether the registry uses Version 4 of AWS's authentication. Generally, you should set this to true. By default, this is false. |
| chunksize | no | The S3 API requires multipart upload chunks to be at least 5MB. This value should be a number that is larger than 5*1024*1024. |
| rootdirectory | no | This is a prefix that will be applied to all S3 keys to allow you to segment data in your bucket if necessary. |

Maintenance

Currently the registry can perform one maintenance function: upload purging. This and future storage-related maintenance functions can be configured under the maintenance section.

Upload Purging

Upload purging is a background process that periodically removes orphaned files from the upload directories of the registry. Upload purging is enabled by default. To configure upload directory purging, the following parameters must be set.

| Parameter | Required | Description |
|-----------|----------|-------------|
| enabled | yes | Set to true to enable upload purging. Default=true. |
| age | yes | Upload directories which are older than this age will be deleted. Default=168h (1 week). |
| interval | yes | The interval between upload directory purging. Default=24h. |
| dryrun | yes | dryrun can be set to true to obtain a summary of what directories will be deleted. Default=false. |

Note: age and interval are strings containing a number with optional fraction and a unit suffix: e.g. 45m, 2h10m, 168h (1 week).
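
Putting these parameters together, a maintenance stanza under storage mirrors the one in the full listing above (shown here with the filesystem driver, since storage requires exactly one backend):

storage:
    filesystem:
        rootdirectory: /tmp/registry
    maintenance:
        uploadpurging:
            enabled: true
            age: 168h
            interval: 24h
            dryrun: false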

auth

auth:
	silly:
		realm: silly-realm
		service: silly-service
	token:
		realm: token-realm
		service: token-service
		issuer: registry-token-issuer
		rootcertbundle: /root/certs/bundle

The auth option is optional as there are use cases (for example, a mirror that only permits pulls) for which authentication may not be desired. There are currently two possible auth providers, silly and token. You can configure only one auth provider.

silly

The silly auth is only for development purposes. It simply checks for the existence of the Authorization header in the HTTP request. It has no regard for the header's value. If the header does not exist, the silly auth responds with a challenge response, echoing back the realm, service, and scope that access was denied for.

The following values are used to configure the response:

| Parameter | Required | Description |
|-----------|----------|-------------|
| realm | yes | The realm in which the registry server authenticates. |
| service | yes | The service being authenticated. |

token

Token based authentication allows the authentication system to be decoupled from the registry. It is a well established authentication paradigm with a high degree of security.

| Parameter | Required | Description |
|-----------|----------|-------------|
| realm | yes | The realm in which the registry server authenticates. |
| service | yes | The service being authenticated. |
| issuer | yes | The name of the token issuer. The issuer inserts this into the token so it must match the value configured for the issuer. |
| rootcertbundle | yes | The absolute path to the root certificate bundle. This bundle contains the public part of the certificates that are used to sign authentication tokens. |

For more information about Token based authentication configuration, see the specification.

middleware

The middleware option is optional. Use this option to inject middleware at named hook points. All middleware must implement the same interface as the object it wraps. This means a registry middleware must implement the distribution.Namespace interface, repository middleware must implement distribution.Repository, and storage middleware must implement driver.StorageDriver.

Currently only one middleware, cloudfront, a storage middleware, is supported in the registry implementation.

middleware:
	registry:
		- name: ARegistryMiddleware
		  options:
			foo: bar
	repository:
		- name: ARepositoryMiddleware
		  options:
			foo: bar
	storage:
		- name: cloudfront
		  options:
			baseurl: https://my.cloudfronted.domain.com/
			privatekey: /path/to/pem
			keypairid: cloudfrontkeypairid
			duration: 3000

Each middleware entry has name and options entries. The name must correspond to the name under which the middleware registers itself. The options field is a map that details custom configuration required to initialize the middleware. It is treated as a map[string]interface{}. As such, it supports any interesting structures desired, leaving it up to the middleware initialization function to best determine how to handle the specific interpretation of the options.
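
Because options is an arbitrary map, nested structures are allowed. The sketch below is purely hypothetical (ARepositoryMiddleware and its option names are placeholders) and only illustrates that the middleware initialization function receives whatever structure you provide:

middleware:
    repository:
        - name: ARepositoryMiddleware
          options:
            foo: bar
            limits:                # hypothetical nested options
                maxconcurrency: 4
                paths:
                    - /v2/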

cloudfront

| Parameter | Required | Description |
|-----------|----------|-------------|
| baseurl | yes | SCHEME://HOST[/PATH] at which Cloudfront is served. |
| privatekey | yes | Private Key for Cloudfront provided by AWS. |
| keypairid | yes | Key pair ID provided by AWS. |
| duration | no | Duration for which a signed URL should be valid. |

reporting

reporting:
	bugsnag:
		apikey: bugsnagapikey
		releasestage: bugsnagreleasestage
		endpoint: bugsnagendpoint
	newrelic:
		licensekey: newreliclicensekey
		name: newrelicname
		verbose: true

The reporting option is optional and configures error and metrics reporting tools. At the moment only two services are supported, New Relic and Bugsnag; a valid configuration may contain both.

bugsnag

| Parameter | Required | Description |
|-----------|----------|-------------|
| apikey | yes | API Key provided by Bugsnag. |
| releasestage | no | Tracks where the registry is deployed, for example, production, staging, or development. |
| endpoint | no | Specify the enterprise Bugsnag endpoint. |

newrelic

| Parameter | Required | Description |
|-----------|----------|-------------|
| licensekey | yes | License key provided by New Relic. |
| name | no | New Relic application name. |
| verbose | no | Enable New Relic debugging output on stdout. |

http

http:
	addr: localhost:5000
	net: tcp
	prefix: /my/nested/registry/
	secret: asecretforlocaldevelopment
	tls:
		certificate: /path/to/x509/public
		key: /path/to/x509/private
		clientcas:
			- /path/to/ca.pem
			- /path/to/another/ca.pem
	debug:
		addr: localhost:5001

The http option details the configuration for the HTTP server that hosts the registry.

| Parameter | Required | Description |
|-----------|----------|-------------|
| addr | yes | The address on which the server should accept connections. The form depends on the network type (see the net option): HOST:PORT for tcp and FILE for a unix socket. |
| net | no | The network used to create the listening socket. Known networks are unix and tcp. The default empty value means tcp. |
| prefix | no | If the server does not run at the root path, use this value to specify the prefix. The root path is the section before v2. It should have both preceding and trailing slashes, for example /path/. |
| secret | yes | A random piece of data used to sign state that may be stored with the client to protect against tampering. For production environments you should generate a random piece of data using a cryptographically secure random generator. |
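
As a sketch of the unix form described above (the socket path is only a placeholder), the same section could be written as:

http:
    net: unix
    addr: /var/run/registry.sock  # placeholder socket path
    secret: asecretforlocaldevelopment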

tls

The tls struct within http is optional. Use this to configure TLS for the server. If you already have a server such as Nginx or Apache running on the same host as the registry, you may prefer to configure TLS termination there and proxy connections to the registry server.

| Parameter | Required | Description |
|-----------|----------|-------------|
| certificate | yes | Absolute path to the x509 certificate file. |
| key | yes | Absolute path to the x509 private key file. |
| clientcas | no | An array of absolute paths to x509 CA files. |

debug

The debug option is optional. Use it to configure a debug server that can be helpful in diagnosing problems. Contributors to the distribution repository should find the debug server useful. Docker recommends disabling it in production environments.

The debug section takes a single, required addr parameter. This parameter specifies the HOST:PORT on which the debug server should accept connections.

notifications

notifications:
	endpoints:
		- name: alistener
		  disabled: false
		  url: https://my.listener.com/event
		  headers: <http.Header>
		  timeout: 500
		  threshold: 5
		  backoff: 1000

The notifications option is optional and currently may contain a single option, endpoints.

endpoints

Endpoints is a list of named services (URLs) that can accept event notifications.

| Parameter | Required | Description |
|-----------|----------|-------------|
| name | yes | A human-readable name for the service. |
| disabled | no | A boolean to enable/disable notifications for a service. |
| url | yes | The URL to which events should be published. |
| headers | yes | Static headers to add to each request. |
| timeout | yes | An HTTP timeout value. This field takes a positive integer and an optional suffix indicating the unit of time: ns (nanoseconds), us (microseconds), ms (milliseconds), s (seconds), m (minutes), or h (hours). If you omit the suffix, the system interprets the value as nanoseconds. |
| threshold | yes | An integer specifying how long to wait before backing off a failure. |
| backoff | yes | How long the system backs off before retrying. This field takes a positive integer and an optional suffix indicating the unit of time: ns (nanoseconds), us (microseconds), ms (milliseconds), s (seconds), m (minutes), or h (hours). If you omit the suffix, the system interprets the value as nanoseconds. |
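
The listing above leaves headers as the placeholder <http.Header>; in practice it is a map from header name to a list of values. A sketch of a filled-in endpoint, with a purely illustrative Authorization value and explicit unit suffixes on the durations, might look like this:

notifications:
    endpoints:
        - name: alistener
          url: https://my.listener.com/event
          headers:
            Authorization: [Bearer alistenertoken]  # illustrative header value
          timeout: 500ms
          threshold: 5
          backoff: 1s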

redis

redis:
	addr: localhost:6379
	password: asecret
	db: 0
	dialtimeout: 10ms
	readtimeout: 10ms
	writetimeout: 10ms
	pool:
		maxidle: 16
		maxactive: 64
		idletimeout: 300s

Declare parameters for constructing the redis connections. Registry instances may use the Redis instance for several applications. The current purpose is caching information about immutable blobs. Most of the options below control how the registry connects to redis. You can control the pool's behavior with the pool subsection.

| Parameter | Required | Description |
|-----------|----------|-------------|
| addr | yes | Address (host and port) of redis instance. |
| password | no | A password used to authenticate to the redis instance. |
| db | no | Selects the db for each connection. |
| dialtimeout | no | Timeout for connecting to a redis instance. |
| readtimeout | no | Timeout for reading from redis connections. |
| writetimeout | no | Timeout for writing to redis connections. |

pool

pool:
	maxidle: 16
	maxactive: 64
	idletimeout: 300s

Configure the behavior of the Redis connection pool.

| Parameter | Required | Description |
|-----------|----------|-------------|
| maxidle | no | Sets the maximum number of idle connections. |
| maxactive | no | Sets the maximum number of connections that should be opened before blocking a connection request. |
| idletimeout | no | Sets the amount of time to wait before closing inactive connections. |

Example: Development configuration

The following is a simple example you can use for local development:

version: 0.1
log:
    level: debug
storage:
    filesystem:
        rootdirectory: /tmp/registry-dev
http:
    addr: localhost:5000
    secret: asecretforlocaldevelopment
    debug:
        addr: localhost:5001

The above configures the registry instance to run on port 5000, binding to localhost, with the debug server enabled. Registry data storage is in the /tmp/registry-dev directory. Logging is in debug mode, which is the most verbose.

A similar simple configuration is available at config.yml. Both are generally useful for local development.

Example: Middleware configuration

This example illustrates how to configure storage middleware in a registry. Middleware allows the registry to serve layers via a content delivery network (CDN). This is useful for reducing requests to the storage layer.

Currently, the registry supports Amazon Cloudfront. You can only use Cloudfront in conjunction with the S3 storage driver.

| Parameter | Description |
|-----------|-------------|
| name | The storage middleware name. Currently cloudfront is an accepted value. |
| disabled | Set to true to easily disable the middleware. |
| options | A set of key/value options to configure the middleware: baseurl (the Cloudfront base URL), privatekey (the location of your AWS private key on the filesystem), keypairid (the ID of your Cloudfront keypair), and duration (the duration in minutes for which the URL is valid; the default is 20). |

The following example illustrates these values:

middleware:
    storage:
        - name: cloudfront
          disabled: false
          options:
             baseurl: http://d111111abcdef8.cloudfront.net
             privatekey: /path/to/asecret.pem
             keypairid: asecret
             duration: 60

Note: Cloudfront keys exist separately from other AWS keys. See the documentation on AWS credentials for more information.