distribution/registry.go

173 lines
5.6 KiB
Go
Raw Permalink Normal View History

package distribution
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
import (
"io"
"net/http"
"time"
"github.com/docker/distribution/digest"
"github.com/docker/distribution/manifest"
"golang.org/x/net/context"
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
)
// Registry represents a collection of repositories, addressable by name.
type Registry interface {
// Repository should return a reference to the named repository. The
// registry may or may not have the repository but should always return a
// reference.
Repository(ctx context.Context, name string) (Repository, error)
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
// Repository is a named collection of manifests and layers.
type Repository interface {
// Name returns the name of the repository.
Name() string
// Manifests returns a reference to this repository's manifest service.
Manifests() ManifestService
// Layers returns a reference to this repository's layers service.
Layers() LayerService
// Signatures returns a reference to this repository's signatures service.
Signatures() SignatureService
}
// TODO(stevvooe): Must add close methods to all these. May want to change the
// way instances are created to better reflect internal dependency
// relationships.
// ManifestService provides operations on image manifests.
type ManifestService interface {
// Exists returns true if the manifest exists.
Exists(dgst digest.Digest) (bool, error)
// Get retrieves the identified by the digest, if it exists.
Get(dgst digest.Digest) (*manifest.SignedManifest, error)
// Delete removes the manifest, if it exists.
Delete(dgst digest.Digest) error
// Put creates or updates the manifest.
Put(manifest *manifest.SignedManifest) error
// TODO(stevvooe): The methods after this message should be moved to a
// discrete TagService, per active proposals.
// Tags lists the tags under the named repository.
Tags() ([]string, error)
// ExistsByTag returns true if the manifest exists.
ExistsByTag(tag string) (bool, error)
// GetByTag retrieves the named manifest, if it exists.
GetByTag(tag string) (*manifest.SignedManifest, error)
// TODO(stevvooe): There are several changes that need to be done to this
// interface:
//
// 1. Allow explicit tagging with Tag(digest digest.Digest, tag string)
// 2. Support reading tags with a re-entrant reader to avoid large
// allocations in the registry.
// 3. Long-term: Provide All() method that lets one scroll through all of
// the manifest entries.
// 4. Long-term: break out concept of signing from manifests. This is
// really a part of the distribution sprint.
// 5. Long-term: Manifest should be an interface. This code shouldn't
// really be concerned with the storage format.
}
// LayerService provides operations on layer files in a backend storage.
type LayerService interface {
// Exists returns true if the layer exists.
Exists(digest digest.Digest) (bool, error)
// Fetch the layer identifed by TarSum.
Fetch(digest digest.Digest) (Layer, error)
// Upload begins a layer upload to repository identified by name,
// returning a handle.
Upload() (LayerUpload, error)
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// Resume continues an in progress layer upload, returning a handle to the
// upload. The caller should seek to the latest desired upload location
// before proceeding.
Resume(uuid string) (LayerUpload, error)
}
// Layer provides a readable and seekable layer object. Typically,
// implementations are *not* goroutine safe.
type Layer interface {
// http.ServeContent requires an efficient implementation of
// ReadSeeker.Seek(0, os.SEEK_END).
io.ReadSeeker
io.Closer
// Digest returns the unique digest of the blob.
Digest() digest.Digest
// Length returns the length in bytes of the blob.
Length() int64
// CreatedAt returns the time this layer was created.
CreatedAt() time.Time
// Handler returns an HTTP handler which serves the layer content, whether
// by providing a redirect directly to the content, or by serving the
// content itself.
Handler(r *http.Request) (http.Handler, error)
}
// LayerUpload provides a handle for working with in-progress uploads.
// Instances can be obtained from the LayerService.Upload and
// LayerService.Resume.
type LayerUpload interface {
io.WriteSeeker
io.ReaderFrom
io.Closer
// UUID returns the identifier for this upload.
UUID() string
// StartedAt returns the time this layer upload was started.
StartedAt() time.Time
// Finish marks the upload as completed, returning a valid handle to the
// uploaded layer. The digest is validated against the contents of the
// uploaded layer.
Finish(digest digest.Digest) (Layer, error)
// Cancel the layer upload process.
Cancel() error
}
// SignatureService provides operations on signatures.
type SignatureService interface {
// Get retrieves all of the signature blobs for the specified digest.
Get(dgst digest.Digest) ([][]byte, error)
// Put stores the signature for the provided digest.
Put(dgst digest.Digest, signatures ...[]byte) error
}
// Descriptor describes targeted content. Used in conjunction with a blob
// store, a descriptor can be used to fetch, store and target any kind of
// blob. The struct also describes the wire protocol format. Fields should
// only be added but never changed.
type Descriptor struct {
// MediaType describe the type of the content. All text based formats are
// encoded as utf-8.
MediaType string `json:"mediaType,omitempty"`
// Length in bytes of content.
Length int64 `json:"length,omitempty"`
// Digest uniquely identifies the content. A byte stream can be verified
// against against this digest.
Digest digest.Digest `json:"digest,omitempty"`
// NOTE: Before adding a field here, please ensure that all
// other options have been exhausted. Much of the type relationships
// depend on the simplicity of this type.
}