From 593bbccdb5820a6afb1e8e17931a5210ebf0cf6a Mon Sep 17 00:00:00 2001 From: Stephen J Day Date: Tue, 12 May 2015 00:10:29 -0700 Subject: [PATCH] Refactor Blob Service API This PR refactors the blob service API to be oriented around blob descriptors. Identified by digests, blobs become an abstract entity that can be read and written using a descriptor as a handle. This allows blobs to take many forms, such as a ReadSeekCloser or a simple byte buffer, allowing blob oriented operations to better integrate with blob agnostic APIs (such as the `io` package). The error definitions are now better organized to reflect conditions that can only be seen when interacting with the blob API. The main benefit of this is to separate the much smaller metadata from large file storage. Many benefits also follow from this. Reading and writing has been separated into discrete services. Backend implementation is also simplified, by reducing the amount of metadata that needs to be picked up to simply serve a read. This also improves cacheability. "Opening" a blob simply consists of an access check (Stat) and a path calculation. Caching is greatly simplified and we've made the mapping of provisional to canonical hashes a first-class concept. BlobDescriptorService and BlobProvider can be combined in different ways to achieve varying effects. Recommend Review Approach ------------------------- This is a very large patch. While apologies are in order, we are getting a considerable amount of refactoring. Most changes follow from the changes to the root package (distribution), so start there. From there, the main changes are in storage. Looking at (*repository).Blobs will help to understand the how the linkedBlobStore is wired. One can explore the internals within and also branch out into understanding the changes to the caching layer. Following the descriptions below will also help to guide you. To reduce the chances for regressions, it was critical that major changes to unit tests were avoided. Where possible, they are left untouched and where not, the spirit is hopefully captured. Pay particular attention to where behavior may have changed. Storage ------- The primary changes to the `storage` package, other than the interface updates, were to merge the layerstore and blobstore. Blob access is now layered even further. The first layer, blobStore, exposes a global `BlobStatter` and `BlobProvider`. Operations here provide a fast path for most read operations that don't take access control into account. The `linkedBlobStore` layers on top of the `blobStore`, providing repository- scoped blob link management in the backend. The `linkedBlobStore` implements the full `BlobStore` suite, providing access-controlled, repository-local blob writers. The abstraction between the two is slightly broken in that `linkedBlobStore` is the only channel under which one can write into the global blob store. The `linkedBlobStore` also provides flexibility in that it can act over different link sets depending on configuration. This allows us to use the same code for signature links, manifest links and blob links. Eventually, we will fully consolidate this storage. The improved cache flow comes from the `linkedBlobStatter` component of `linkedBlobStore`. Using a `cachedBlobStatter`, these combine together to provide a simple cache hierarchy that should streamline access checks on read and write operations, or at least provide a single path to optimize. The metrics have been changed in a slightly incompatible way since the former operations, Fetch and Exists, are no longer relevant. The fileWriter and fileReader have been slightly modified to support the rest of the changes. The most interesting is the removal of the `Stat` call from `newFileReader`. This was the source of unnecessary round trips that were only present to look up the size of the resulting reader. Now, one must simply pass in the size, requiring the caller to decide whether or not the `Stat` call is appropriate. In several cases, it turned out the caller already had the size already. The `WriterAt` implementation has been removed from `fileWriter`, since it is no longer required for `BlobWriter`, reducing the number of paths which writes may take. Cache ----- Unfortunately, the `cache` package required a near full rewrite. It was pretty mechanical in that the cache is oriented around the `BlobDescriptorService` slightly modified to include the ability to set the values for individual digests. While the implementation is oriented towards caching, it can act as a primary store. Provisions are in place to have repository local metadata, in addition to global metadata. Fallback is implemented as a part of the storage package to maintain this flexibility. One unfortunate side-effect is that caching is now repository-scoped, rather than global. This should have little effect on performance but may increase memory usage. Handlers -------- The `handlers` package has been updated to leverage the new API. For the most part, the changes are superficial or mechanical based on the API changes. This did expose a bug in the handling of provisional vs canonical digests that was fixed in the unit tests. Configuration ------------- One user-facing change has been made to the configuration and is updated in the associated documentation. The `layerinfo` cache parameter has been deprecated by the `blobdescriptor` cache parameter. Both are equivalent and configuration files should be backward compatible. Notifications ------------- Changes the `notification` package are simply to support the interface changes. Context ------- A small change has been made to the tracing log-level. Traces have been moved from "info" to "debug" level to reduce output when not needed. Signed-off-by: Stephen J Day --- blobs.go | 190 +++++++ cmd/registry/config.yml | 2 +- context/trace.go | 11 +- docs/configuration.md | 12 +- docs/deploying.md | 4 +- errors.go | 41 +- notifications/bridge.go | 27 +- notifications/listener.go | 138 +++-- notifications/listener_test.go | 29 +- registry.go | 99 +--- registry/handlers/api_test.go | 21 +- registry/handlers/app.go | 24 +- registry/handlers/app_test.go | 2 +- registry/handlers/blob.go | 69 +++ registry/handlers/blobupload.go | 355 +++++++++++++ registry/handlers/hmac.go | 14 +- registry/handlers/hmac_test.go | 12 +- registry/handlers/images.go | 6 +- registry/handlers/layer.go | 74 --- registry/handlers/layerupload.go | 344 ------------- .../storage/{layer_test.go => blob_test.go} | 225 ++++----- registry/storage/blobserver.go | 72 +++ registry/storage/blobstore.go | 231 +++++---- registry/storage/blobwriter.go | 469 +++++++++++++++++ registry/storage/blobwriter_nonresumable.go | 6 + registry/storage/blobwriter_resumable.go | 9 + registry/storage/cache/cache.go | 106 +--- registry/storage/cache/cache_test.go | 179 ++++--- registry/storage/cache/memory.go | 174 +++++-- registry/storage/cache/memory_test.go | 6 +- registry/storage/cache/redis.go | 238 +++++++-- registry/storage/cache/redis_test.go | 4 +- registry/storage/cachedblobdescriptorstore.go | 84 +++ registry/storage/filereader.go | 53 +- registry/storage/filereader_test.go | 6 +- registry/storage/filewriter.go | 60 +-- registry/storage/filewriter_test.go | 24 +- registry/storage/layercache.go | 202 -------- registry/storage/layerreader.go | 104 ---- registry/storage/layerstore.go | 178 ------- registry/storage/layerwriter.go | 478 ------------------ registry/storage/layerwriter_nonresumable.go | 6 - registry/storage/layerwriter_resumable.go | 9 - registry/storage/linkedblobstore.go | 258 ++++++++++ registry/storage/manifeststore.go | 62 +-- registry/storage/manifeststore_test.go | 28 +- registry/storage/paths.go | 44 +- registry/storage/paths_test.go | 4 +- registry/storage/purgeuploads_test.go | 6 +- registry/storage/registry.go | 121 +++-- registry/storage/revisionstore.go | 118 ++--- registry/storage/signaturestore.go | 80 ++- registry/storage/tagstore.go | 110 ++-- registry/storage/util.go | 21 + 54 files changed, 2770 insertions(+), 2479 deletions(-) create mode 100644 blobs.go create mode 100644 registry/handlers/blob.go create mode 100644 registry/handlers/blobupload.go delete mode 100644 registry/handlers/layer.go delete mode 100644 registry/handlers/layerupload.go rename registry/storage/{layer_test.go => blob_test.go} (56%) create mode 100644 registry/storage/blobserver.go create mode 100644 registry/storage/blobwriter.go create mode 100644 registry/storage/blobwriter_nonresumable.go create mode 100644 registry/storage/blobwriter_resumable.go create mode 100644 registry/storage/cachedblobdescriptorstore.go delete mode 100644 registry/storage/layercache.go delete mode 100644 registry/storage/layerreader.go delete mode 100644 registry/storage/layerstore.go delete mode 100644 registry/storage/layerwriter.go delete mode 100644 registry/storage/layerwriter_nonresumable.go delete mode 100644 registry/storage/layerwriter_resumable.go create mode 100644 registry/storage/linkedblobstore.go create mode 100644 registry/storage/util.go diff --git a/blobs.go b/blobs.go new file mode 100644 index 000000000..c606d9149 --- /dev/null +++ b/blobs.go @@ -0,0 +1,190 @@ +package distribution + +import ( + "errors" + "fmt" + "io" + "net/http" + "time" + + "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" +) + +var ( + // ErrBlobExists returned when blob already exists + ErrBlobExists = errors.New("blob exists") + + // ErrBlobDigestUnsupported when blob digest is an unsupported version. + ErrBlobDigestUnsupported = errors.New("unsupported blob digest") + + // ErrBlobUnknown when blob is not found. + ErrBlobUnknown = errors.New("unknown blob") + + // ErrBlobUploadUnknown returned when upload is not found. + ErrBlobUploadUnknown = errors.New("blob upload unknown") + + // ErrBlobInvalidLength returned when the blob has an expected length on + // commit, meaning mismatched with the descriptor or an invalid value. + ErrBlobInvalidLength = errors.New("blob invalid length") +) + +// ErrBlobInvalidDigest returned when digest check fails. +type ErrBlobInvalidDigest struct { + Digest digest.Digest + Reason error +} + +func (err ErrBlobInvalidDigest) Error() string { + return fmt.Sprintf("invalid digest for referenced layer: %v, %v", + err.Digest, err.Reason) +} + +// Descriptor describes targeted content. Used in conjunction with a blob +// store, a descriptor can be used to fetch, store and target any kind of +// blob. The struct also describes the wire protocol format. Fields should +// only be added but never changed. +type Descriptor struct { + // MediaType describe the type of the content. All text based formats are + // encoded as utf-8. + MediaType string `json:"mediaType,omitempty"` + + // Length in bytes of content. + Length int64 `json:"length,omitempty"` + + // Digest uniquely identifies the content. A byte stream can be verified + // against against this digest. + Digest digest.Digest `json:"digest,omitempty"` + + // NOTE: Before adding a field here, please ensure that all + // other options have been exhausted. Much of the type relationships + // depend on the simplicity of this type. +} + +// BlobStatter makes blob descriptors available by digest. The service may +// provide a descriptor of a different digest if the provided digest is not +// canonical. +type BlobStatter interface { + // Stat provides metadata about a blob identified by the digest. If the + // blob is unknown to the describer, ErrBlobUnknown will be returned. + Stat(ctx context.Context, dgst digest.Digest) (Descriptor, error) +} + +// BlobDescriptorService manages metadata about a blob by digest. Most +// implementations will not expose such an interface explicitly. Such mappings +// should be maintained by interacting with the BlobIngester. Hence, this is +// left off of BlobService and BlobStore. +type BlobDescriptorService interface { + BlobStatter + + // SetDescriptor assigns the descriptor to the digest. The provided digest and + // the digest in the descriptor must map to identical content but they may + // differ on their algorithm. The descriptor must have the canonical + // digest of the content and the digest algorithm must match the + // annotators canonical algorithm. + // + // Such a facility can be used to map blobs between digest domains, with + // the restriction that the algorithm of the descriptor must match the + // canonical algorithm (ie sha256) of the annotator. + SetDescriptor(ctx context.Context, dgst digest.Digest, desc Descriptor) error +} + +// ReadSeekCloser is the primary reader type for blob data, combining +// io.ReadSeeker with io.Closer. +type ReadSeekCloser interface { + io.ReadSeeker + io.Closer +} + +// BlobProvider describes operations for getting blob data. +type BlobProvider interface { + // Get returns the entire blob identified by digest along with the descriptor. + Get(ctx context.Context, dgst digest.Digest) ([]byte, error) + + // Open provides a ReadSeekCloser to the blob identified by the provided + // descriptor. If the blob is not known to the service, an error will be + // returned. + Open(ctx context.Context, dgst digest.Digest) (ReadSeekCloser, error) +} + +// BlobServer can serve blobs via http. +type BlobServer interface { + // ServeBlob attempts to serve the blob, identifed by dgst, via http. The + // service may decide to redirect the client elsewhere or serve the data + // directly. + // + // This handler only issues successful responses, such as 2xx or 3xx, + // meaning it serves data or issues a redirect. If the blob is not + // available, an error will be returned and the caller may still issue a + // response. + // + // The implementation may serve the same blob from a different digest + // domain. The appropriate headers will be set for the blob, unless they + // have already been set by the caller. + ServeBlob(ctx context.Context, w http.ResponseWriter, r *http.Request, dgst digest.Digest) error +} + +// BlobIngester ingests blob data. +type BlobIngester interface { + // Put inserts the content p into the blob service, returning a descriptor + // or an error. + Put(ctx context.Context, mediaType string, p []byte) (Descriptor, error) + + // Create allocates a new blob writer to add a blob to this service. The + // returned handle can be written to and later resumed using an opaque + // identifier. With this approach, one can Close and Resume a BlobWriter + // multiple times until the BlobWriter is committed or cancelled. + Create(ctx context.Context) (BlobWriter, error) + + // Resume attempts to resume a write to a blob, identified by an id. + Resume(ctx context.Context, id string) (BlobWriter, error) +} + +// BlobWriter provides a handle for inserting data into a blob store. +// Instances should be obtained from BlobWriteService.Writer and +// BlobWriteService.Resume. If supported by the store, a writer can be +// recovered with the id. +type BlobWriter interface { + io.WriteSeeker + io.ReaderFrom + io.Closer + + // ID returns the identifier for this writer. The ID can be used with the + // Blob service to later resume the write. + ID() string + + // StartedAt returns the time this blob write was started. + StartedAt() time.Time + + // Commit completes the blob writer process. The content is verified + // against the provided provisional descriptor, which may result in an + // error. Depending on the implementation, written data may be validated + // against the provisional descriptor fields. If MediaType is not present, + // the implementation may reject the commit or assign "application/octet- + // stream" to the blob. The returned descriptor may have a different + // digest depending on the blob store, referred to as the canonical + // descriptor. + Commit(ctx context.Context, provisional Descriptor) (canonical Descriptor, err error) + + // Cancel ends the blob write without storing any data and frees any + // associated resources. Any data written thus far will be lost. Cancel + // implementations should allow multiple calls even after a commit that + // result in a no-op. This allows use of Cancel in a defer statement, + // increasing the assurance that it is correctly called. + Cancel(ctx context.Context) error +} + +// BlobService combines the operations to access, read and write blobs. This +// can be used to describe remote blob services. +type BlobService interface { + BlobStatter + BlobProvider + BlobIngester +} + +// BlobStore represent the entire suite of blob related operations. Such an +// implementation can access, read, write and serve blobs. +type BlobStore interface { + BlobService + BlobServer +} diff --git a/cmd/registry/config.yml b/cmd/registry/config.yml index b1a8f48dc..2e41c64d8 100644 --- a/cmd/registry/config.yml +++ b/cmd/registry/config.yml @@ -6,7 +6,7 @@ log: environment: development storage: cache: - layerinfo: inmemory + blobdescriptor: redis filesystem: rootdirectory: /tmp/registry-dev maintenance: diff --git a/context/trace.go b/context/trace.go index 1115fc1f6..b385277b9 100644 --- a/context/trace.go +++ b/context/trace.go @@ -54,9 +54,14 @@ func WithTrace(ctx Context) (Context, func(format string, a ...interface{})) { } return ctx, func(format string, a ...interface{}) { - GetLogger(ctx, "trace.duration", "trace.id", "trace.parent.id", - "trace.func", "trace.file", "trace.line"). - Infof(format, a...) // info may be too chatty. + GetLogger(ctx, + "trace.duration", + "trace.id", + "trace.parent.id", + "trace.func", + "trace.file", + "trace.line"). + Debugf(format, a...) } } diff --git a/docs/configuration.md b/docs/configuration.md index d7bd15d63..b8e8c89ed 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -44,7 +44,7 @@ storage: chunksize: 5242880 rootdirectory: /s3/object/name/prefix cache: - layerinfo: inmemory + blobdescriptor: redis maintenance: uploadpurging: enabled: true @@ -262,7 +262,7 @@ storage: chunksize: 5242880 rootdirectory: /s3/object/name/prefix cache: - layerinfo: inmemory + blobdescriptor: inmemory maintenance: uploadpurging: enabled: true @@ -278,12 +278,16 @@ You must configure one backend; if you configure more, the registry returns an e Use the `cache` subsection to enable caching of data accessed in the storage backend. Currently, the only available cache provides fast access to layer -metadata. This, if configured, uses the `layerinfo` field. +metadata. This, if configured, uses the `blobdescriptor` field. -You can set `layerinfo` field to `redis` or `inmemory`. The `redis` value uses +You can set `blobdescriptor` field to `redis` or `inmemory`. The `redis` value uses a Redis pool to cache layer metadata. The `inmemory` value uses an in memory map. +>**NOTE**: Formerly, `blobdescriptor` was known as `layerinfo`. While these +>are equivalent, `layerinfo` has been deprecated, in favor or +>`blobdescriptor`. + ### filesystem The `filesystem` storage backend uses the local disk to store registry files. It diff --git a/docs/deploying.md b/docs/deploying.md index 9e715e65a..39e491ec6 100644 --- a/docs/deploying.md +++ b/docs/deploying.md @@ -205,7 +205,7 @@ log: environment: development storage: cache: - layerinfo: inmemory + blobdescriptor: inmemory filesystem: rootdirectory: /tmp/registry-dev maintenance: @@ -337,7 +337,7 @@ support. $ docker run -p 5000:5000 secure_registry:latest time="2015-04-12T03:06:18.616502588Z" level=info msg="endpoint local-8082 disabled, skipping" environment=development instance.id=bf33c9dc-2564-406b-97c3-6ee69dc20ec6 service=registry time="2015-04-12T03:06:18.617012948Z" level=info msg="endpoint local-8083 disabled, skipping" environment=development instance.id=bf33c9dc-2564-406b-97c3-6ee69dc20ec6 service=registry - time="2015-04-12T03:06:18.617190113Z" level=info msg="using inmemory layerinfo cache" environment=development instance.id=bf33c9dc-2564-406b-97c3-6ee69dc20ec6 service=registry + time="2015-04-12T03:06:18.617190113Z" level=info msg="using inmemory blob descriptor cache" environment=development instance.id=bf33c9dc-2564-406b-97c3-6ee69dc20ec6 service=registry time="2015-04-12T03:06:18.617349067Z" level=info msg="listening on :5000, tls" environment=development instance.id=bf33c9dc-2564-406b-97c3-6ee69dc20ec6 service=registry time="2015-04-12T03:06:18.628589577Z" level=info msg="debug server listening localhost:5001" 2015/04/12 03:06:28 http: TLS handshake error from 172.17.42.1:44261: remote error: unknown certificate authority diff --git a/errors.go b/errors.go index 7883b9f73..a4af23e57 100644 --- a/errors.go +++ b/errors.go @@ -5,22 +5,6 @@ import ( "strings" "github.com/docker/distribution/digest" - "github.com/docker/distribution/manifest" -) - -var ( - // ErrLayerExists returned when layer already exists - ErrLayerExists = fmt.Errorf("layer exists") - - // ErrLayerTarSumVersionUnsupported when tarsum is unsupported version. - ErrLayerTarSumVersionUnsupported = fmt.Errorf("unsupported tarsum version") - - // ErrLayerUploadUnknown returned when upload is not found. - ErrLayerUploadUnknown = fmt.Errorf("layer upload unknown") - - // ErrLayerClosed returned when an operation is attempted on a closed - // Layer or LayerUpload. - ErrLayerClosed = fmt.Errorf("layer closed") ) // ErrRepositoryUnknown is returned if the named repository is not known by @@ -55,14 +39,14 @@ func (err ErrManifestUnknown) Error() string { return fmt.Sprintf("unknown manifest name=%s tag=%s", err.Name, err.Tag) } -// ErrUnknownManifestRevision is returned when a manifest cannot be found by +// ErrManifestUnknownRevision is returned when a manifest cannot be found by // revision within a repository. -type ErrUnknownManifestRevision struct { +type ErrManifestUnknownRevision struct { Name string Revision digest.Digest } -func (err ErrUnknownManifestRevision) Error() string { +func (err ErrManifestUnknownRevision) Error() string { return fmt.Sprintf("unknown manifest name=%s revision=%s", err.Name, err.Revision) } @@ -88,22 +72,11 @@ func (errs ErrManifestVerification) Error() string { return fmt.Sprintf("errors verifying manifest: %v", strings.Join(parts, ",")) } -// ErrUnknownLayer returned when layer cannot be found. -type ErrUnknownLayer struct { - FSLayer manifest.FSLayer -} - -func (err ErrUnknownLayer) Error() string { - return fmt.Sprintf("unknown layer %v", err.FSLayer.BlobSum) -} - -// ErrLayerInvalidDigest returned when tarsum check fails. -type ErrLayerInvalidDigest struct { +// ErrManifestBlobUnknown returned when a referenced blob cannot be found. +type ErrManifestBlobUnknown struct { Digest digest.Digest - Reason error } -func (err ErrLayerInvalidDigest) Error() string { - return fmt.Sprintf("invalid digest for referenced layer: %v, %v", - err.Digest, err.Reason) +func (err ErrManifestBlobUnknown) Error() string { + return fmt.Sprintf("unknown blob %v on manifest", err.Digest) } diff --git a/notifications/bridge.go b/notifications/bridge.go index baa90a5bf..9a9a803e5 100644 --- a/notifications/bridge.go +++ b/notifications/bridge.go @@ -65,16 +65,16 @@ func (b *bridge) ManifestDeleted(repo distribution.Repository, sm *manifest.Sign return b.createManifestEventAndWrite(EventActionDelete, repo, sm) } -func (b *bridge) LayerPushed(repo distribution.Repository, layer distribution.Layer) error { - return b.createLayerEventAndWrite(EventActionPush, repo, layer) +func (b *bridge) BlobPushed(repo distribution.Repository, desc distribution.Descriptor) error { + return b.createBlobEventAndWrite(EventActionPush, repo, desc) } -func (b *bridge) LayerPulled(repo distribution.Repository, layer distribution.Layer) error { - return b.createLayerEventAndWrite(EventActionPull, repo, layer) +func (b *bridge) BlobPulled(repo distribution.Repository, desc distribution.Descriptor) error { + return b.createBlobEventAndWrite(EventActionPull, repo, desc) } -func (b *bridge) LayerDeleted(repo distribution.Repository, layer distribution.Layer) error { - return b.createLayerEventAndWrite(EventActionDelete, repo, layer) +func (b *bridge) BlobDeleted(repo distribution.Repository, desc distribution.Descriptor) error { + return b.createBlobEventAndWrite(EventActionDelete, repo, desc) } func (b *bridge) createManifestEventAndWrite(action string, repo distribution.Repository, sm *manifest.SignedManifest) error { @@ -113,8 +113,8 @@ func (b *bridge) createManifestEvent(action string, repo distribution.Repository return event, nil } -func (b *bridge) createLayerEventAndWrite(action string, repo distribution.Repository, layer distribution.Layer) error { - event, err := b.createLayerEvent(action, repo, layer) +func (b *bridge) createBlobEventAndWrite(action string, repo distribution.Repository, desc distribution.Descriptor) error { + event, err := b.createBlobEvent(action, repo, desc) if err != nil { return err } @@ -122,18 +122,13 @@ func (b *bridge) createLayerEventAndWrite(action string, repo distribution.Repos return b.sink.Write(*event) } -func (b *bridge) createLayerEvent(action string, repo distribution.Repository, layer distribution.Layer) (*Event, error) { +func (b *bridge) createBlobEvent(action string, repo distribution.Repository, desc distribution.Descriptor) (*Event, error) { event := b.createEvent(action) - event.Target.MediaType = layerMediaType + event.Target.Descriptor = desc event.Target.Repository = repo.Name() - event.Target.Length = layer.Length() - - dgst := layer.Digest() - event.Target.Digest = dgst - var err error - event.Target.URL, err = b.ub.BuildBlobURL(repo.Name(), dgst) + event.Target.URL, err = b.ub.BuildBlobURL(repo.Name(), desc.Digest) if err != nil { return nil, err } diff --git a/notifications/listener.go b/notifications/listener.go index 48c42af2f..5d83af5b5 100644 --- a/notifications/listener.go +++ b/notifications/listener.go @@ -1,8 +1,11 @@ package notifications import ( + "net/http" + "github.com/Sirupsen/logrus" "github.com/docker/distribution" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/docker/distribution/manifest" ) @@ -18,21 +21,21 @@ type ManifestListener interface { ManifestDeleted(repo distribution.Repository, sm *manifest.SignedManifest) error } -// LayerListener describes a listener that can respond to layer related events. -type LayerListener interface { - LayerPushed(repo distribution.Repository, layer distribution.Layer) error - LayerPulled(repo distribution.Repository, layer distribution.Layer) error +// BlobListener describes a listener that can respond to layer related events. +type BlobListener interface { + BlobPushed(repo distribution.Repository, desc distribution.Descriptor) error + BlobPulled(repo distribution.Repository, desc distribution.Descriptor) error // TODO(stevvooe): Please note that delete support is still a little shaky // and we'll need to propagate these in the future. - LayerDeleted(repo distribution.Repository, layer distribution.Layer) error + BlobDeleted(repo distribution.Repository, desc distribution.Descriptor) error } // Listener combines all repository events into a single interface. type Listener interface { ManifestListener - LayerListener + BlobListener } type repositoryListener struct { @@ -55,10 +58,10 @@ func (rl *repositoryListener) Manifests() distribution.ManifestService { } } -func (rl *repositoryListener) Layers() distribution.LayerService { - return &layerServiceListener{ - LayerService: rl.Repository.Layers(), - parent: rl, +func (rl *repositoryListener) Blobs(ctx context.Context) distribution.BlobStore { + return &blobServiceListener{ + BlobStore: rl.Repository.Blobs(ctx), + parent: rl, } } @@ -101,51 +104,98 @@ func (msl *manifestServiceListener) GetByTag(tag string) (*manifest.SignedManife return sm, err } -type layerServiceListener struct { - distribution.LayerService +type blobServiceListener struct { + distribution.BlobStore parent *repositoryListener } -func (lsl *layerServiceListener) Fetch(dgst digest.Digest) (distribution.Layer, error) { - layer, err := lsl.LayerService.Fetch(dgst) +var _ distribution.BlobStore = &blobServiceListener{} + +func (bsl *blobServiceListener) Get(ctx context.Context, dgst digest.Digest) ([]byte, error) { + p, err := bsl.BlobStore.Get(ctx, dgst) if err == nil { - if err := lsl.parent.listener.LayerPulled(lsl.parent.Repository, layer); err != nil { - logrus.Errorf("error dispatching layer pull to listener: %v", err) + if desc, err := bsl.Stat(ctx, dgst); err != nil { + context.GetLogger(ctx).Errorf("error resolving descriptor in ServeBlob listener: %v", err) + } else { + if err := bsl.parent.listener.BlobPulled(bsl.parent.Repository, desc); err != nil { + context.GetLogger(ctx).Errorf("error dispatching layer pull to listener: %v", err) + } } } - return layer, err + return p, err } -func (lsl *layerServiceListener) Upload() (distribution.LayerUpload, error) { - lu, err := lsl.LayerService.Upload() - return lsl.decorateUpload(lu), err -} - -func (lsl *layerServiceListener) Resume(uuid string) (distribution.LayerUpload, error) { - lu, err := lsl.LayerService.Resume(uuid) - return lsl.decorateUpload(lu), err -} - -func (lsl *layerServiceListener) decorateUpload(lu distribution.LayerUpload) distribution.LayerUpload { - return &layerUploadListener{ - LayerUpload: lu, - parent: lsl, - } -} - -type layerUploadListener struct { - distribution.LayerUpload - parent *layerServiceListener -} - -func (lul *layerUploadListener) Finish(dgst digest.Digest) (distribution.Layer, error) { - layer, err := lul.LayerUpload.Finish(dgst) +func (bsl *blobServiceListener) Open(ctx context.Context, dgst digest.Digest) (distribution.ReadSeekCloser, error) { + rc, err := bsl.BlobStore.Open(ctx, dgst) if err == nil { - if err := lul.parent.parent.listener.LayerPushed(lul.parent.parent.Repository, layer); err != nil { - logrus.Errorf("error dispatching layer push to listener: %v", err) + if desc, err := bsl.Stat(ctx, dgst); err != nil { + context.GetLogger(ctx).Errorf("error resolving descriptor in ServeBlob listener: %v", err) + } else { + if err := bsl.parent.listener.BlobPulled(bsl.parent.Repository, desc); err != nil { + context.GetLogger(ctx).Errorf("error dispatching layer pull to listener: %v", err) + } } } - return layer, err + return rc, err +} + +func (bsl *blobServiceListener) ServeBlob(ctx context.Context, w http.ResponseWriter, r *http.Request, dgst digest.Digest) error { + err := bsl.BlobStore.ServeBlob(ctx, w, r, dgst) + if err == nil { + if desc, err := bsl.Stat(ctx, dgst); err != nil { + context.GetLogger(ctx).Errorf("error resolving descriptor in ServeBlob listener: %v", err) + } else { + if err := bsl.parent.listener.BlobPulled(bsl.parent.Repository, desc); err != nil { + context.GetLogger(ctx).Errorf("error dispatching layer pull to listener: %v", err) + } + } + } + + return err +} + +func (bsl *blobServiceListener) Put(ctx context.Context, mediaType string, p []byte) (distribution.Descriptor, error) { + desc, err := bsl.BlobStore.Put(ctx, mediaType, p) + if err == nil { + if err := bsl.parent.listener.BlobPushed(bsl.parent.Repository, desc); err != nil { + context.GetLogger(ctx).Errorf("error dispatching layer pull to listener: %v", err) + } + } + + return desc, err +} + +func (bsl *blobServiceListener) Create(ctx context.Context) (distribution.BlobWriter, error) { + wr, err := bsl.BlobStore.Create(ctx) + return bsl.decorateWriter(wr), err +} + +func (bsl *blobServiceListener) Resume(ctx context.Context, id string) (distribution.BlobWriter, error) { + wr, err := bsl.BlobStore.Resume(ctx, id) + return bsl.decorateWriter(wr), err +} + +func (bsl *blobServiceListener) decorateWriter(wr distribution.BlobWriter) distribution.BlobWriter { + return &blobWriterListener{ + BlobWriter: wr, + parent: bsl, + } +} + +type blobWriterListener struct { + distribution.BlobWriter + parent *blobServiceListener +} + +func (bwl *blobWriterListener) Commit(ctx context.Context, desc distribution.Descriptor) (distribution.Descriptor, error) { + committed, err := bwl.BlobWriter.Commit(ctx, desc) + if err == nil { + if err := bwl.parent.parent.listener.BlobPushed(bwl.parent.parent.Repository, committed); err != nil { + context.GetLogger(ctx).Errorf("error dispatching blob push to listener: %v", err) + } + } + + return committed, err } diff --git a/notifications/listener_test.go b/notifications/listener_test.go index e046c3972..0b20140d2 100644 --- a/notifications/listener_test.go +++ b/notifications/listener_test.go @@ -6,6 +6,7 @@ import ( "testing" "github.com/docker/distribution" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/docker/distribution/manifest" "github.com/docker/distribution/registry/storage" @@ -13,12 +14,11 @@ import ( "github.com/docker/distribution/registry/storage/driver/inmemory" "github.com/docker/distribution/testutil" "github.com/docker/libtrust" - "golang.org/x/net/context" ) func TestListener(t *testing.T) { ctx := context.Background() - registry := storage.NewRegistryWithDriver(ctx, inmemory.New(), cache.NewInMemoryLayerInfoCache()) + registry := storage.NewRegistryWithDriver(ctx, inmemory.New(), cache.NewInMemoryBlobDescriptorCacheProvider()) tl := &testListener{ ops: make(map[string]int), } @@ -67,17 +67,17 @@ func (tl *testListener) ManifestDeleted(repo distribution.Repository, sm *manife return nil } -func (tl *testListener) LayerPushed(repo distribution.Repository, layer distribution.Layer) error { +func (tl *testListener) BlobPushed(repo distribution.Repository, desc distribution.Descriptor) error { tl.ops["layer:push"]++ return nil } -func (tl *testListener) LayerPulled(repo distribution.Repository, layer distribution.Layer) error { +func (tl *testListener) BlobPulled(repo distribution.Repository, desc distribution.Descriptor) error { tl.ops["layer:pull"]++ return nil } -func (tl *testListener) LayerDeleted(repo distribution.Repository, layer distribution.Layer) error { +func (tl *testListener) BlobDeleted(repo distribution.Repository, desc distribution.Descriptor) error { tl.ops["layer:delete"]++ return nil } @@ -89,7 +89,7 @@ func checkExerciseRepository(t *testing.T, repository distribution.Repository) { // takes the registry through a common set of operations. This could be // used to make cross-cutting updates by changing internals that affect // update counts. Basically, it would make writing tests a lot easier. - + ctx := context.Background() tag := "thetag" m := manifest.Manifest{ Versioned: manifest.Versioned{ @@ -99,27 +99,28 @@ func checkExerciseRepository(t *testing.T, repository distribution.Repository) { Tag: tag, } - layers := repository.Layers() + blobs := repository.Blobs(ctx) for i := 0; i < 2; i++ { rs, ds, err := testutil.CreateRandomTarFile() if err != nil { t.Fatalf("error creating test layer: %v", err) } dgst := digest.Digest(ds) - upload, err := layers.Upload() + + wr, err := blobs.Create(ctx) if err != nil { t.Fatalf("error creating layer upload: %v", err) } // Use the resumes, as well! - upload, err = layers.Resume(upload.UUID()) + wr, err = blobs.Resume(ctx, wr.ID()) if err != nil { t.Fatalf("error resuming layer upload: %v", err) } - io.Copy(upload, rs) + io.Copy(wr, rs) - if _, err := upload.Finish(dgst); err != nil { + if _, err := wr.Commit(ctx, distribution.Descriptor{Digest: dgst}); err != nil { t.Fatalf("unexpected error finishing upload: %v", err) } @@ -127,9 +128,11 @@ func checkExerciseRepository(t *testing.T, repository distribution.Repository) { BlobSum: dgst, }) - // Then fetch the layers - if _, err := layers.Fetch(dgst); err != nil { + // Then fetch the blobs + if rc, err := blobs.Open(ctx, dgst); err != nil { t.Fatalf("error fetching layer: %v", err) + } else { + defer rc.Close() } } diff --git a/registry.go b/registry.go index 374d8ca59..bdca8bc49 100644 --- a/registry.go +++ b/registry.go @@ -1,13 +1,9 @@ package distribution import ( - "io" - "net/http" - "time" - + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/docker/distribution/manifest" - "golang.org/x/net/context" ) // Scope defines the set of items that match a namespace. @@ -49,8 +45,12 @@ type Repository interface { // Manifests returns a reference to this repository's manifest service. Manifests() ManifestService - // Layers returns a reference to this repository's layers service. - Layers() LayerService + // Blobs returns a reference to this repository's blob service. + Blobs(ctx context.Context) BlobStore + + // TODO(stevvooe): The above BlobStore return can probably be relaxed to + // be a BlobService for use with clients. This will allow such + // implementations to avoid implementing ServeBlob. // Signatures returns a reference to this repository's signatures service. Signatures() SignatureService @@ -100,70 +100,6 @@ type ManifestService interface { // really be concerned with the storage format. } -// LayerService provides operations on layer files in a backend storage. -type LayerService interface { - // Exists returns true if the layer exists. - Exists(digest digest.Digest) (bool, error) - - // Fetch the layer identifed by TarSum. - Fetch(digest digest.Digest) (Layer, error) - - // Upload begins a layer upload to repository identified by name, - // returning a handle. - Upload() (LayerUpload, error) - - // Resume continues an in progress layer upload, returning a handle to the - // upload. The caller should seek to the latest desired upload location - // before proceeding. - Resume(uuid string) (LayerUpload, error) -} - -// Layer provides a readable and seekable layer object. Typically, -// implementations are *not* goroutine safe. -type Layer interface { - // http.ServeContent requires an efficient implementation of - // ReadSeeker.Seek(0, os.SEEK_END). - io.ReadSeeker - io.Closer - - // Digest returns the unique digest of the blob. - Digest() digest.Digest - - // Length returns the length in bytes of the blob. - Length() int64 - - // CreatedAt returns the time this layer was created. - CreatedAt() time.Time - - // Handler returns an HTTP handler which serves the layer content, whether - // by providing a redirect directly to the content, or by serving the - // content itself. - Handler(r *http.Request) (http.Handler, error) -} - -// LayerUpload provides a handle for working with in-progress uploads. -// Instances can be obtained from the LayerService.Upload and -// LayerService.Resume. -type LayerUpload interface { - io.WriteSeeker - io.ReaderFrom - io.Closer - - // UUID returns the identifier for this upload. - UUID() string - - // StartedAt returns the time this layer upload was started. - StartedAt() time.Time - - // Finish marks the upload as completed, returning a valid handle to the - // uploaded layer. The digest is validated against the contents of the - // uploaded layer. - Finish(digest digest.Digest) (Layer, error) - - // Cancel the layer upload process. - Cancel() error -} - // SignatureService provides operations on signatures. type SignatureService interface { // Get retrieves all of the signature blobs for the specified digest. @@ -172,24 +108,3 @@ type SignatureService interface { // Put stores the signature for the provided digest. Put(dgst digest.Digest, signatures ...[]byte) error } - -// Descriptor describes targeted content. Used in conjunction with a blob -// store, a descriptor can be used to fetch, store and target any kind of -// blob. The struct also describes the wire protocol format. Fields should -// only be added but never changed. -type Descriptor struct { - // MediaType describe the type of the content. All text based formats are - // encoded as utf-8. - MediaType string `json:"mediaType,omitempty"` - - // Length in bytes of content. - Length int64 `json:"length,omitempty"` - - // Digest uniquely identifies the content. A byte stream can be verified - // against against this digest. - Digest digest.Digest `json:"digest,omitempty"` - - // NOTE: Before adding a field here, please ensure that all - // other options have been exhausted. Much of the type relationships - // depend on the simplicity of this type. -} diff --git a/registry/handlers/api_test.go b/registry/handlers/api_test.go index 6dc7a4228..9b5027ba1 100644 --- a/registry/handlers/api_test.go +++ b/registry/handlers/api_test.go @@ -93,8 +93,8 @@ func TestURLPrefix(t *testing.T) { } -// TestLayerAPI conducts a full test of the of the layer api. -func TestLayerAPI(t *testing.T) { +// TestBlobAPI conducts a full test of the of the blob api. +func TestBlobAPI(t *testing.T) { // TODO(stevvooe): This test code is complete junk but it should cover the // complete flow. This must be broken down and checked against the // specification *before* we submit the final to docker core. @@ -213,6 +213,13 @@ func TestLayerAPI(t *testing.T) { // Now, push just a chunk layerFile.Seek(0, 0) + canonicalDigester := digest.NewCanonicalDigester() + if _, err := io.Copy(canonicalDigester, layerFile); err != nil { + t.Fatalf("error copying to digest: %v", err) + } + canonicalDigest := canonicalDigester.Digest() + + layerFile.Seek(0, 0) uploadURLBase, uploadUUID = startPushLayer(t, env.builder, imageName) uploadURLBase, dgst := pushChunk(t, env.builder, imageName, uploadURLBase, layerFile, layerLength) finishUpload(t, env.builder, imageName, uploadURLBase, dgst) @@ -226,7 +233,7 @@ func TestLayerAPI(t *testing.T) { checkResponse(t, "checking head on existing layer", resp, http.StatusOK) checkHeaders(t, resp, http.Header{ "Content-Length": []string{fmt.Sprint(layerLength)}, - "Docker-Content-Digest": []string{layerDigest.String()}, + "Docker-Content-Digest": []string{canonicalDigest.String()}, }) // ---------------- @@ -239,7 +246,7 @@ func TestLayerAPI(t *testing.T) { checkResponse(t, "fetching layer", resp, http.StatusOK) checkHeaders(t, resp, http.Header{ "Content-Length": []string{fmt.Sprint(layerLength)}, - "Docker-Content-Digest": []string{layerDigest.String()}, + "Docker-Content-Digest": []string{canonicalDigest.String()}, }) // Verify the body @@ -272,9 +279,9 @@ func TestLayerAPI(t *testing.T) { checkResponse(t, "fetching layer", resp, http.StatusOK) checkHeaders(t, resp, http.Header{ "Content-Length": []string{fmt.Sprint(layerLength)}, - "Docker-Content-Digest": []string{layerDigest.String()}, - "ETag": []string{layerDigest.String()}, - "Cache-Control": []string{"max-age=86400"}, + "Docker-Content-Digest": []string{canonicalDigest.String()}, + "ETag": []string{canonicalDigest.String()}, + "Cache-Control": []string{"max-age=31536000"}, }) // Matching etag, gives 304 diff --git a/registry/handlers/app.go b/registry/handlers/app.go index 40181afa3..22c0b6def 100644 --- a/registry/handlers/app.go +++ b/registry/handlers/app.go @@ -67,9 +67,9 @@ func NewApp(ctx context.Context, configuration configuration.Configuration) *App }) app.register(v2.RouteNameManifest, imageManifestDispatcher) app.register(v2.RouteNameTags, tagsDispatcher) - app.register(v2.RouteNameBlob, layerDispatcher) - app.register(v2.RouteNameBlobUpload, layerUploadDispatcher) - app.register(v2.RouteNameBlobUploadChunk, layerUploadDispatcher) + app.register(v2.RouteNameBlob, blobDispatcher) + app.register(v2.RouteNameBlobUpload, blobUploadDispatcher) + app.register(v2.RouteNameBlobUploadChunk, blobUploadDispatcher) var err error app.driver, err = factory.Create(configuration.Storage.Type(), configuration.Storage.Parameters()) @@ -103,18 +103,24 @@ func NewApp(ctx context.Context, configuration configuration.Configuration) *App // configure storage caches if cc, ok := configuration.Storage["cache"]; ok { - switch cc["layerinfo"] { + v, ok := cc["blobdescriptor"] + if !ok { + // Backwards compatible: "layerinfo" == "blobdescriptor" + v = cc["layerinfo"] + } + + switch v { case "redis": if app.redis == nil { panic("redis configuration required to use for layerinfo cache") } - app.registry = storage.NewRegistryWithDriver(app, app.driver, cache.NewRedisLayerInfoCache(app.redis)) - ctxu.GetLogger(app).Infof("using redis layerinfo cache") + app.registry = storage.NewRegistryWithDriver(app, app.driver, cache.NewRedisBlobDescriptorCacheProvider(app.redis)) + ctxu.GetLogger(app).Infof("using redis blob descriptor cache") case "inmemory": - app.registry = storage.NewRegistryWithDriver(app, app.driver, cache.NewInMemoryLayerInfoCache()) - ctxu.GetLogger(app).Infof("using inmemory layerinfo cache") + app.registry = storage.NewRegistryWithDriver(app, app.driver, cache.NewInMemoryBlobDescriptorCacheProvider()) + ctxu.GetLogger(app).Infof("using inmemory blob descriptor cache") default: - if cc["layerinfo"] != "" { + if v != "" { ctxu.GetLogger(app).Warnf("unkown cache type %q, caching disabled", configuration.Storage["cache"]) } } diff --git a/registry/handlers/app_test.go b/registry/handlers/app_test.go index 8ea5b1e55..03ea0c9ce 100644 --- a/registry/handlers/app_test.go +++ b/registry/handlers/app_test.go @@ -30,7 +30,7 @@ func TestAppDispatcher(t *testing.T) { Context: ctx, router: v2.Router(), driver: driver, - registry: storage.NewRegistryWithDriver(ctx, driver, cache.NewInMemoryLayerInfoCache()), + registry: storage.NewRegistryWithDriver(ctx, driver, cache.NewInMemoryBlobDescriptorCacheProvider()), } server := httptest.NewServer(app) router := v2.Router() diff --git a/registry/handlers/blob.go b/registry/handlers/blob.go new file mode 100644 index 000000000..3237b1951 --- /dev/null +++ b/registry/handlers/blob.go @@ -0,0 +1,69 @@ +package handlers + +import ( + "net/http" + + "github.com/docker/distribution" + "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" + "github.com/docker/distribution/registry/api/v2" + "github.com/gorilla/handlers" +) + +// blobDispatcher uses the request context to build a blobHandler. +func blobDispatcher(ctx *Context, r *http.Request) http.Handler { + dgst, err := getDigest(ctx) + if err != nil { + + if err == errDigestNotAvailable { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusNotFound) + ctx.Errors.Push(v2.ErrorCodeDigestInvalid, err) + }) + } + + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + ctx.Errors.Push(v2.ErrorCodeDigestInvalid, err) + }) + } + + blobHandler := &blobHandler{ + Context: ctx, + Digest: dgst, + } + + return handlers.MethodHandler{ + "GET": http.HandlerFunc(blobHandler.GetBlob), + "HEAD": http.HandlerFunc(blobHandler.GetBlob), + } +} + +// blobHandler serves http blob requests. +type blobHandler struct { + *Context + + Digest digest.Digest +} + +// GetBlob fetches the binary data from backend storage returns it in the +// response. +func (bh *blobHandler) GetBlob(w http.ResponseWriter, r *http.Request) { + context.GetLogger(bh).Debug("GetBlob") + blobs := bh.Repository.Blobs(bh) + desc, err := blobs.Stat(bh, bh.Digest) + if err != nil { + if err == distribution.ErrBlobUnknown { + w.WriteHeader(http.StatusNotFound) + bh.Errors.Push(v2.ErrorCodeBlobUnknown, bh.Digest) + } else { + bh.Errors.Push(v2.ErrorCodeUnknown, err) + } + return + } + + if err := blobs.ServeBlob(bh, w, r, desc.Digest); err != nil { + context.GetLogger(bh).Debugf("unexpected error getting blob HTTP handler: %v", err) + bh.Errors.Push(v2.ErrorCodeUnknown, err) + return + } +} diff --git a/registry/handlers/blobupload.go b/registry/handlers/blobupload.go new file mode 100644 index 000000000..99a75698d --- /dev/null +++ b/registry/handlers/blobupload.go @@ -0,0 +1,355 @@ +package handlers + +import ( + "fmt" + "io" + "net/http" + "net/url" + "os" + + "github.com/docker/distribution" + ctxu "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" + "github.com/docker/distribution/registry/api/v2" + "github.com/gorilla/handlers" +) + +// blobUploadDispatcher constructs and returns the blob upload handler for the +// given request context. +func blobUploadDispatcher(ctx *Context, r *http.Request) http.Handler { + buh := &blobUploadHandler{ + Context: ctx, + UUID: getUploadUUID(ctx), + } + + handler := http.Handler(handlers.MethodHandler{ + "POST": http.HandlerFunc(buh.StartBlobUpload), + "GET": http.HandlerFunc(buh.GetUploadStatus), + "HEAD": http.HandlerFunc(buh.GetUploadStatus), + "PATCH": http.HandlerFunc(buh.PatchBlobData), + "PUT": http.HandlerFunc(buh.PutBlobUploadComplete), + "DELETE": http.HandlerFunc(buh.CancelBlobUpload), + }) + + if buh.UUID != "" { + state, err := hmacKey(ctx.Config.HTTP.Secret).unpackUploadState(r.FormValue("_state")) + if err != nil { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + ctxu.GetLogger(ctx).Infof("error resolving upload: %v", err) + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) + }) + } + buh.State = state + + if state.Name != ctx.Repository.Name() { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + ctxu.GetLogger(ctx).Infof("mismatched repository name in upload state: %q != %q", state.Name, buh.Repository.Name()) + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) + }) + } + + if state.UUID != buh.UUID { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + ctxu.GetLogger(ctx).Infof("mismatched uuid in upload state: %q != %q", state.UUID, buh.UUID) + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) + }) + } + + blobs := ctx.Repository.Blobs(buh) + upload, err := blobs.Resume(buh, buh.UUID) + if err != nil { + ctxu.GetLogger(ctx).Errorf("error resolving upload: %v", err) + if err == distribution.ErrBlobUploadUnknown { + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusNotFound) + buh.Errors.Push(v2.ErrorCodeBlobUploadUnknown, err) + }) + } + + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusInternalServerError) + buh.Errors.Push(v2.ErrorCodeUnknown, err) + }) + } + buh.Upload = upload + + if state.Offset > 0 { + // Seek the blob upload to the correct spot if it's non-zero. + // These error conditions should be rare and demonstrate really + // problems. We basically cancel the upload and tell the client to + // start over. + if nn, err := upload.Seek(buh.State.Offset, os.SEEK_SET); err != nil { + defer upload.Close() + ctxu.GetLogger(ctx).Infof("error seeking blob upload: %v", err) + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) + upload.Cancel(buh) + }) + } else if nn != buh.State.Offset { + defer upload.Close() + ctxu.GetLogger(ctx).Infof("seek to wrong offest: %d != %d", nn, buh.State.Offset) + return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) + upload.Cancel(buh) + }) + } + } + + handler = closeResources(handler, buh.Upload) + } + + return handler +} + +// blobUploadHandler handles the http blob upload process. +type blobUploadHandler struct { + *Context + + // UUID identifies the upload instance for the current request. Using UUID + // to key blob writers since this implementation uses UUIDs. + UUID string + + Upload distribution.BlobWriter + + State blobUploadState +} + +// StartBlobUpload begins the blob upload process and allocates a server-side +// blob writer session. +func (buh *blobUploadHandler) StartBlobUpload(w http.ResponseWriter, r *http.Request) { + blobs := buh.Repository.Blobs(buh) + upload, err := blobs.Create(buh) + if err != nil { + w.WriteHeader(http.StatusInternalServerError) // Error conditions here? + buh.Errors.Push(v2.ErrorCodeUnknown, err) + return + } + + buh.Upload = upload + defer buh.Upload.Close() + + if err := buh.blobUploadResponse(w, r, true); err != nil { + w.WriteHeader(http.StatusInternalServerError) // Error conditions here? + buh.Errors.Push(v2.ErrorCodeUnknown, err) + return + } + + w.Header().Set("Docker-Upload-UUID", buh.Upload.ID()) + w.WriteHeader(http.StatusAccepted) +} + +// GetUploadStatus returns the status of a given upload, identified by id. +func (buh *blobUploadHandler) GetUploadStatus(w http.ResponseWriter, r *http.Request) { + if buh.Upload == nil { + w.WriteHeader(http.StatusNotFound) + buh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) + return + } + + // TODO(dmcgowan): Set last argument to false in blobUploadResponse when + // resumable upload is supported. This will enable returning a non-zero + // range for clients to begin uploading at an offset. + if err := buh.blobUploadResponse(w, r, true); err != nil { + w.WriteHeader(http.StatusInternalServerError) // Error conditions here? + buh.Errors.Push(v2.ErrorCodeUnknown, err) + return + } + + w.Header().Set("Docker-Upload-UUID", buh.UUID) + w.WriteHeader(http.StatusNoContent) +} + +// PatchBlobData writes data to an upload. +func (buh *blobUploadHandler) PatchBlobData(w http.ResponseWriter, r *http.Request) { + if buh.Upload == nil { + w.WriteHeader(http.StatusNotFound) + buh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) + return + } + + ct := r.Header.Get("Content-Type") + if ct != "" && ct != "application/octet-stream" { + w.WriteHeader(http.StatusBadRequest) + // TODO(dmcgowan): encode error + return + } + + // TODO(dmcgowan): support Content-Range header to seek and write range + + // Copy the data + if _, err := io.Copy(buh.Upload, r.Body); err != nil { + ctxu.GetLogger(buh).Errorf("unknown error copying into upload: %v", err) + w.WriteHeader(http.StatusInternalServerError) + buh.Errors.Push(v2.ErrorCodeUnknown, err) + return + } + + if err := buh.blobUploadResponse(w, r, false); err != nil { + w.WriteHeader(http.StatusInternalServerError) // Error conditions here? + buh.Errors.Push(v2.ErrorCodeUnknown, err) + return + } + + w.WriteHeader(http.StatusAccepted) +} + +// PutBlobUploadComplete takes the final request of a blob upload. The +// request may include all the blob data or no blob data. Any data +// provided is received and verified. If successful, the blob is linked +// into the blob store and 201 Created is returned with the canonical +// url of the blob. +func (buh *blobUploadHandler) PutBlobUploadComplete(w http.ResponseWriter, r *http.Request) { + if buh.Upload == nil { + w.WriteHeader(http.StatusNotFound) + buh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) + return + } + + dgstStr := r.FormValue("digest") // TODO(stevvooe): Support multiple digest parameters! + + if dgstStr == "" { + // no digest? return error, but allow retry. + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeDigestInvalid, "digest missing") + return + } + + dgst, err := digest.ParseDigest(dgstStr) + if err != nil { + // no digest? return error, but allow retry. + w.WriteHeader(http.StatusNotFound) + buh.Errors.Push(v2.ErrorCodeDigestInvalid, "digest parsing failed") + return + } + + // Read in the data, if any. + if _, err := io.Copy(buh.Upload, r.Body); err != nil { + ctxu.GetLogger(buh).Errorf("unknown error copying into upload: %v", err) + w.WriteHeader(http.StatusInternalServerError) + buh.Errors.Push(v2.ErrorCodeUnknown, err) + return + } + + desc, err := buh.Upload.Commit(buh, distribution.Descriptor{ + Digest: dgst, + + // TODO(stevvooe): This isn't wildly important yet, but we should + // really set the length and mediatype. For now, we can let the + // backend take care of this. + }) + + if err != nil { + switch err := err.(type) { + case distribution.ErrBlobInvalidDigest: + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeDigestInvalid, err) + default: + switch err { + case distribution.ErrBlobInvalidLength, distribution.ErrBlobDigestUnsupported: + w.WriteHeader(http.StatusBadRequest) + buh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) + default: + ctxu.GetLogger(buh).Errorf("unknown error completing upload: %#v", err) + w.WriteHeader(http.StatusInternalServerError) + buh.Errors.Push(v2.ErrorCodeUnknown, err) + } + + } + + // Clean up the backend blob data if there was an error. + if err := buh.Upload.Cancel(buh); err != nil { + // If the cleanup fails, all we can do is observe and report. + ctxu.GetLogger(buh).Errorf("error canceling upload after error: %v", err) + } + + return + } + + // Build our canonical blob url + blobURL, err := buh.urlBuilder.BuildBlobURL(buh.Repository.Name(), desc.Digest) + if err != nil { + buh.Errors.Push(v2.ErrorCodeUnknown, err) + w.WriteHeader(http.StatusInternalServerError) + return + } + + w.Header().Set("Location", blobURL) + w.Header().Set("Content-Length", "0") + w.Header().Set("Docker-Content-Digest", desc.Digest.String()) + w.WriteHeader(http.StatusCreated) +} + +// CancelBlobUpload cancels an in-progress upload of a blob. +func (buh *blobUploadHandler) CancelBlobUpload(w http.ResponseWriter, r *http.Request) { + if buh.Upload == nil { + w.WriteHeader(http.StatusNotFound) + buh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) + return + } + + w.Header().Set("Docker-Upload-UUID", buh.UUID) + if err := buh.Upload.Cancel(buh); err != nil { + ctxu.GetLogger(buh).Errorf("error encountered canceling upload: %v", err) + w.WriteHeader(http.StatusInternalServerError) + buh.Errors.PushErr(err) + } + + w.WriteHeader(http.StatusNoContent) +} + +// blobUploadResponse provides a standard request for uploading blobs and +// chunk responses. This sets the correct headers but the response status is +// left to the caller. The fresh argument is used to ensure that new blob +// uploads always start at a 0 offset. This allows disabling resumable push by +// always returning a 0 offset on check status. +func (buh *blobUploadHandler) blobUploadResponse(w http.ResponseWriter, r *http.Request, fresh bool) error { + + var offset int64 + if !fresh { + var err error + offset, err = buh.Upload.Seek(0, os.SEEK_CUR) + if err != nil { + ctxu.GetLogger(buh).Errorf("unable get current offset of blob upload: %v", err) + return err + } + } + + // TODO(stevvooe): Need a better way to manage the upload state automatically. + buh.State.Name = buh.Repository.Name() + buh.State.UUID = buh.Upload.ID() + buh.State.Offset = offset + buh.State.StartedAt = buh.Upload.StartedAt() + + token, err := hmacKey(buh.Config.HTTP.Secret).packUploadState(buh.State) + if err != nil { + ctxu.GetLogger(buh).Infof("error building upload state token: %s", err) + return err + } + + uploadURL, err := buh.urlBuilder.BuildBlobUploadChunkURL( + buh.Repository.Name(), buh.Upload.ID(), + url.Values{ + "_state": []string{token}, + }) + if err != nil { + ctxu.GetLogger(buh).Infof("error building upload url: %s", err) + return err + } + + endRange := offset + if endRange > 0 { + endRange = endRange - 1 + } + + w.Header().Set("Docker-Upload-UUID", buh.UUID) + w.Header().Set("Location", uploadURL) + w.Header().Set("Content-Length", "0") + w.Header().Set("Range", fmt.Sprintf("0-%d", endRange)) + + return nil +} diff --git a/registry/handlers/hmac.go b/registry/handlers/hmac.go index e17ececa2..1725d240b 100644 --- a/registry/handlers/hmac.go +++ b/registry/handlers/hmac.go @@ -9,9 +9,9 @@ import ( "time" ) -// layerUploadState captures the state serializable state of the layer upload. -type layerUploadState struct { - // name is the primary repository under which the layer will be linked. +// blobUploadState captures the state serializable state of the blob upload. +type blobUploadState struct { + // name is the primary repository under which the blob will be linked. Name string // UUID identifies the upload. @@ -26,10 +26,10 @@ type layerUploadState struct { type hmacKey string -// unpackUploadState unpacks and validates the layer upload state from the +// unpackUploadState unpacks and validates the blob upload state from the // token, using the hmacKey secret. -func (secret hmacKey) unpackUploadState(token string) (layerUploadState, error) { - var state layerUploadState +func (secret hmacKey) unpackUploadState(token string) (blobUploadState, error) { + var state blobUploadState tokenBytes, err := base64.URLEncoding.DecodeString(token) if err != nil { @@ -59,7 +59,7 @@ func (secret hmacKey) unpackUploadState(token string) (layerUploadState, error) // packUploadState packs the upload state signed with and hmac digest using // the hmacKey secret, encoding to url safe base64. The resulting token can be // used to share data with minimized risk of external tampering. -func (secret hmacKey) packUploadState(lus layerUploadState) (string, error) { +func (secret hmacKey) packUploadState(lus blobUploadState) (string, error) { mac := hmac.New(sha256.New, []byte(secret)) p, err := json.Marshal(lus) if err != nil { diff --git a/registry/handlers/hmac_test.go b/registry/handlers/hmac_test.go index cce2cd492..366c7279e 100644 --- a/registry/handlers/hmac_test.go +++ b/registry/handlers/hmac_test.go @@ -2,7 +2,7 @@ package handlers import "testing" -var layerUploadStates = []layerUploadState{ +var blobUploadStates = []blobUploadState{ { Name: "hello", UUID: "abcd-1234-qwer-0987", @@ -45,7 +45,7 @@ var secrets = []string{ func TestLayerUploadTokens(t *testing.T) { secret := hmacKey("supersecret") - for _, testcase := range layerUploadStates { + for _, testcase := range blobUploadStates { token, err := secret.packUploadState(testcase) if err != nil { t.Fatal(err) @@ -56,7 +56,7 @@ func TestLayerUploadTokens(t *testing.T) { t.Fatal(err) } - assertLayerUploadStateEquals(t, testcase, lus) + assertBlobUploadStateEquals(t, testcase, lus) } } @@ -68,7 +68,7 @@ func TestHMACValidation(t *testing.T) { secret2 := hmacKey(secret) badSecret := hmacKey("DifferentSecret") - for _, testcase := range layerUploadStates { + for _, testcase := range blobUploadStates { token, err := secret1.packUploadState(testcase) if err != nil { t.Fatal(err) @@ -79,7 +79,7 @@ func TestHMACValidation(t *testing.T) { t.Fatal(err) } - assertLayerUploadStateEquals(t, testcase, lus) + assertBlobUploadStateEquals(t, testcase, lus) _, err = badSecret.unpackUploadState(token) if err == nil { @@ -104,7 +104,7 @@ func TestHMACValidation(t *testing.T) { } } -func assertLayerUploadStateEquals(t *testing.T, expected layerUploadState, received layerUploadState) { +func assertBlobUploadStateEquals(t *testing.T, expected blobUploadState, received blobUploadState) { if expected.Name != received.Name { t.Fatalf("Expected Name=%q, Received Name=%q", expected.Name, received.Name) } diff --git a/registry/handlers/images.go b/registry/handlers/images.go index 174bd3d94..45029da51 100644 --- a/registry/handlers/images.go +++ b/registry/handlers/images.go @@ -136,14 +136,12 @@ func (imh *imageManifestHandler) PutImageManifest(w http.ResponseWriter, r *http case distribution.ErrManifestVerification: for _, verificationError := range err { switch verificationError := verificationError.(type) { - case distribution.ErrUnknownLayer: - imh.Errors.Push(v2.ErrorCodeBlobUnknown, verificationError.FSLayer) + case distribution.ErrManifestBlobUnknown: + imh.Errors.Push(v2.ErrorCodeBlobUnknown, verificationError.Digest) case distribution.ErrManifestUnverified: imh.Errors.Push(v2.ErrorCodeManifestUnverified) default: if verificationError == digest.ErrDigestInvalidFormat { - // TODO(stevvooe): We need to really need to move all - // errors to types. Its much more straightforward. imh.Errors.Push(v2.ErrorCodeDigestInvalid) } else { imh.Errors.PushErr(verificationError) diff --git a/registry/handlers/layer.go b/registry/handlers/layer.go deleted file mode 100644 index 13ee8560c..000000000 --- a/registry/handlers/layer.go +++ /dev/null @@ -1,74 +0,0 @@ -package handlers - -import ( - "net/http" - - "github.com/docker/distribution" - "github.com/docker/distribution/context" - "github.com/docker/distribution/digest" - "github.com/docker/distribution/registry/api/v2" - "github.com/gorilla/handlers" -) - -// layerDispatcher uses the request context to build a layerHandler. -func layerDispatcher(ctx *Context, r *http.Request) http.Handler { - dgst, err := getDigest(ctx) - if err != nil { - - if err == errDigestNotAvailable { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(http.StatusNotFound) - ctx.Errors.Push(v2.ErrorCodeDigestInvalid, err) - }) - } - - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - ctx.Errors.Push(v2.ErrorCodeDigestInvalid, err) - }) - } - - layerHandler := &layerHandler{ - Context: ctx, - Digest: dgst, - } - - return handlers.MethodHandler{ - "GET": http.HandlerFunc(layerHandler.GetLayer), - "HEAD": http.HandlerFunc(layerHandler.GetLayer), - } -} - -// layerHandler serves http layer requests. -type layerHandler struct { - *Context - - Digest digest.Digest -} - -// GetLayer fetches the binary data from backend storage returns it in the -// response. -func (lh *layerHandler) GetLayer(w http.ResponseWriter, r *http.Request) { - context.GetLogger(lh).Debug("GetImageLayer") - layers := lh.Repository.Layers() - layer, err := layers.Fetch(lh.Digest) - - if err != nil { - switch err := err.(type) { - case distribution.ErrUnknownLayer: - w.WriteHeader(http.StatusNotFound) - lh.Errors.Push(v2.ErrorCodeBlobUnknown, err.FSLayer) - default: - lh.Errors.Push(v2.ErrorCodeUnknown, err) - } - return - } - - handler, err := layer.Handler(r) - if err != nil { - context.GetLogger(lh).Debugf("unexpected error getting layer HTTP handler: %s", err) - lh.Errors.Push(v2.ErrorCodeUnknown, err) - return - } - - handler.ServeHTTP(w, r) -} diff --git a/registry/handlers/layerupload.go b/registry/handlers/layerupload.go deleted file mode 100644 index 1591d98dc..000000000 --- a/registry/handlers/layerupload.go +++ /dev/null @@ -1,344 +0,0 @@ -package handlers - -import ( - "fmt" - "io" - "net/http" - "net/url" - "os" - - "github.com/docker/distribution" - ctxu "github.com/docker/distribution/context" - "github.com/docker/distribution/digest" - "github.com/docker/distribution/registry/api/v2" - "github.com/gorilla/handlers" -) - -// layerUploadDispatcher constructs and returns the layer upload handler for -// the given request context. -func layerUploadDispatcher(ctx *Context, r *http.Request) http.Handler { - luh := &layerUploadHandler{ - Context: ctx, - UUID: getUploadUUID(ctx), - } - - handler := http.Handler(handlers.MethodHandler{ - "POST": http.HandlerFunc(luh.StartLayerUpload), - "GET": http.HandlerFunc(luh.GetUploadStatus), - "HEAD": http.HandlerFunc(luh.GetUploadStatus), - "PATCH": http.HandlerFunc(luh.PatchLayerData), - "PUT": http.HandlerFunc(luh.PutLayerUploadComplete), - "DELETE": http.HandlerFunc(luh.CancelLayerUpload), - }) - - if luh.UUID != "" { - state, err := hmacKey(ctx.Config.HTTP.Secret).unpackUploadState(r.FormValue("_state")) - if err != nil { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - ctxu.GetLogger(ctx).Infof("error resolving upload: %v", err) - w.WriteHeader(http.StatusBadRequest) - luh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) - }) - } - luh.State = state - - if state.Name != ctx.Repository.Name() { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - ctxu.GetLogger(ctx).Infof("mismatched repository name in upload state: %q != %q", state.Name, luh.Repository.Name()) - w.WriteHeader(http.StatusBadRequest) - luh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) - }) - } - - if state.UUID != luh.UUID { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - ctxu.GetLogger(ctx).Infof("mismatched uuid in upload state: %q != %q", state.UUID, luh.UUID) - w.WriteHeader(http.StatusBadRequest) - luh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) - }) - } - - layers := ctx.Repository.Layers() - upload, err := layers.Resume(luh.UUID) - if err != nil { - ctxu.GetLogger(ctx).Errorf("error resolving upload: %v", err) - if err == distribution.ErrLayerUploadUnknown { - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(http.StatusNotFound) - luh.Errors.Push(v2.ErrorCodeBlobUploadUnknown, err) - }) - } - - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(http.StatusInternalServerError) - luh.Errors.Push(v2.ErrorCodeUnknown, err) - }) - } - luh.Upload = upload - - if state.Offset > 0 { - // Seek the layer upload to the correct spot if it's non-zero. - // These error conditions should be rare and demonstrate really - // problems. We basically cancel the upload and tell the client to - // start over. - if nn, err := upload.Seek(luh.State.Offset, os.SEEK_SET); err != nil { - defer upload.Close() - ctxu.GetLogger(ctx).Infof("error seeking layer upload: %v", err) - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(http.StatusBadRequest) - luh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) - upload.Cancel() - }) - } else if nn != luh.State.Offset { - defer upload.Close() - ctxu.GetLogger(ctx).Infof("seek to wrong offest: %d != %d", nn, luh.State.Offset) - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - w.WriteHeader(http.StatusBadRequest) - luh.Errors.Push(v2.ErrorCodeBlobUploadInvalid, err) - upload.Cancel() - }) - } - } - - handler = closeResources(handler, luh.Upload) - } - - return handler -} - -// layerUploadHandler handles the http layer upload process. -type layerUploadHandler struct { - *Context - - // UUID identifies the upload instance for the current request. - UUID string - - Upload distribution.LayerUpload - - State layerUploadState -} - -// StartLayerUpload begins the layer upload process and allocates a server- -// side upload session. -func (luh *layerUploadHandler) StartLayerUpload(w http.ResponseWriter, r *http.Request) { - layers := luh.Repository.Layers() - upload, err := layers.Upload() - if err != nil { - w.WriteHeader(http.StatusInternalServerError) // Error conditions here? - luh.Errors.Push(v2.ErrorCodeUnknown, err) - return - } - - luh.Upload = upload - defer luh.Upload.Close() - - if err := luh.layerUploadResponse(w, r, true); err != nil { - w.WriteHeader(http.StatusInternalServerError) // Error conditions here? - luh.Errors.Push(v2.ErrorCodeUnknown, err) - return - } - - w.Header().Set("Docker-Upload-UUID", luh.Upload.UUID()) - w.WriteHeader(http.StatusAccepted) -} - -// GetUploadStatus returns the status of a given upload, identified by uuid. -func (luh *layerUploadHandler) GetUploadStatus(w http.ResponseWriter, r *http.Request) { - if luh.Upload == nil { - w.WriteHeader(http.StatusNotFound) - luh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) - return - } - - // TODO(dmcgowan): Set last argument to false in layerUploadResponse when - // resumable upload is supported. This will enable returning a non-zero - // range for clients to begin uploading at an offset. - if err := luh.layerUploadResponse(w, r, true); err != nil { - w.WriteHeader(http.StatusInternalServerError) // Error conditions here? - luh.Errors.Push(v2.ErrorCodeUnknown, err) - return - } - - w.Header().Set("Docker-Upload-UUID", luh.UUID) - w.WriteHeader(http.StatusNoContent) -} - -// PatchLayerData writes data to an upload. -func (luh *layerUploadHandler) PatchLayerData(w http.ResponseWriter, r *http.Request) { - if luh.Upload == nil { - w.WriteHeader(http.StatusNotFound) - luh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) - return - } - - ct := r.Header.Get("Content-Type") - if ct != "" && ct != "application/octet-stream" { - w.WriteHeader(http.StatusBadRequest) - // TODO(dmcgowan): encode error - return - } - - // TODO(dmcgowan): support Content-Range header to seek and write range - - // Copy the data - if _, err := io.Copy(luh.Upload, r.Body); err != nil { - ctxu.GetLogger(luh).Errorf("unknown error copying into upload: %v", err) - w.WriteHeader(http.StatusInternalServerError) - luh.Errors.Push(v2.ErrorCodeUnknown, err) - return - } - - if err := luh.layerUploadResponse(w, r, false); err != nil { - w.WriteHeader(http.StatusInternalServerError) // Error conditions here? - luh.Errors.Push(v2.ErrorCodeUnknown, err) - return - } - - w.WriteHeader(http.StatusAccepted) -} - -// PutLayerUploadComplete takes the final request of a layer upload. The -// request may include all the layer data or no layer data. Any data -// provided is received and verified. If successful, the layer is linked -// into the blob store and 201 Created is returned with the canonical -// url of the layer. -func (luh *layerUploadHandler) PutLayerUploadComplete(w http.ResponseWriter, r *http.Request) { - if luh.Upload == nil { - w.WriteHeader(http.StatusNotFound) - luh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) - return - } - - dgstStr := r.FormValue("digest") // TODO(stevvooe): Support multiple digest parameters! - - if dgstStr == "" { - // no digest? return error, but allow retry. - w.WriteHeader(http.StatusBadRequest) - luh.Errors.Push(v2.ErrorCodeDigestInvalid, "digest missing") - return - } - - dgst, err := digest.ParseDigest(dgstStr) - if err != nil { - // no digest? return error, but allow retry. - w.WriteHeader(http.StatusNotFound) - luh.Errors.Push(v2.ErrorCodeDigestInvalid, "digest parsing failed") - return - } - - // TODO(stevvooe): Consider checking the error on this copy. - // Theoretically, problems should be detected during verification but we - // may miss a root cause. - - // Read in the data, if any. - if _, err := io.Copy(luh.Upload, r.Body); err != nil { - ctxu.GetLogger(luh).Errorf("unknown error copying into upload: %v", err) - w.WriteHeader(http.StatusInternalServerError) - luh.Errors.Push(v2.ErrorCodeUnknown, err) - return - } - - layer, err := luh.Upload.Finish(dgst) - if err != nil { - switch err := err.(type) { - case distribution.ErrLayerInvalidDigest: - w.WriteHeader(http.StatusBadRequest) - luh.Errors.Push(v2.ErrorCodeDigestInvalid, err) - default: - ctxu.GetLogger(luh).Errorf("unknown error completing upload: %#v", err) - w.WriteHeader(http.StatusInternalServerError) - luh.Errors.Push(v2.ErrorCodeUnknown, err) - } - - // Clean up the backend layer data if there was an error. - if err := luh.Upload.Cancel(); err != nil { - // If the cleanup fails, all we can do is observe and report. - ctxu.GetLogger(luh).Errorf("error canceling upload after error: %v", err) - } - - return - } - - // Build our canonical layer url - layerURL, err := luh.urlBuilder.BuildBlobURL(luh.Repository.Name(), layer.Digest()) - if err != nil { - luh.Errors.Push(v2.ErrorCodeUnknown, err) - w.WriteHeader(http.StatusInternalServerError) - return - } - - w.Header().Set("Location", layerURL) - w.Header().Set("Content-Length", "0") - w.Header().Set("Docker-Content-Digest", layer.Digest().String()) - w.WriteHeader(http.StatusCreated) -} - -// CancelLayerUpload cancels an in-progress upload of a layer. -func (luh *layerUploadHandler) CancelLayerUpload(w http.ResponseWriter, r *http.Request) { - if luh.Upload == nil { - w.WriteHeader(http.StatusNotFound) - luh.Errors.Push(v2.ErrorCodeBlobUploadUnknown) - return - } - - w.Header().Set("Docker-Upload-UUID", luh.UUID) - if err := luh.Upload.Cancel(); err != nil { - ctxu.GetLogger(luh).Errorf("error encountered canceling upload: %v", err) - w.WriteHeader(http.StatusInternalServerError) - luh.Errors.PushErr(err) - } - - w.WriteHeader(http.StatusNoContent) -} - -// layerUploadResponse provides a standard request for uploading layers and -// chunk responses. This sets the correct headers but the response status is -// left to the caller. The fresh argument is used to ensure that new layer -// uploads always start at a 0 offset. This allows disabling resumable push -// by always returning a 0 offset on check status. -func (luh *layerUploadHandler) layerUploadResponse(w http.ResponseWriter, r *http.Request, fresh bool) error { - - var offset int64 - if !fresh { - var err error - offset, err = luh.Upload.Seek(0, os.SEEK_CUR) - if err != nil { - ctxu.GetLogger(luh).Errorf("unable get current offset of layer upload: %v", err) - return err - } - } - - // TODO(stevvooe): Need a better way to manage the upload state automatically. - luh.State.Name = luh.Repository.Name() - luh.State.UUID = luh.Upload.UUID() - luh.State.Offset = offset - luh.State.StartedAt = luh.Upload.StartedAt() - - token, err := hmacKey(luh.Config.HTTP.Secret).packUploadState(luh.State) - if err != nil { - ctxu.GetLogger(luh).Infof("error building upload state token: %s", err) - return err - } - - uploadURL, err := luh.urlBuilder.BuildBlobUploadChunkURL( - luh.Repository.Name(), luh.Upload.UUID(), - url.Values{ - "_state": []string{token}, - }) - if err != nil { - ctxu.GetLogger(luh).Infof("error building upload url: %s", err) - return err - } - - endRange := offset - if endRange > 0 { - endRange = endRange - 1 - } - - w.Header().Set("Docker-Upload-UUID", luh.UUID) - w.Header().Set("Location", uploadURL) - w.Header().Set("Content-Length", "0") - w.Header().Set("Range", fmt.Sprintf("0-%d", endRange)) - - return nil -} diff --git a/registry/storage/layer_test.go b/registry/storage/blob_test.go similarity index 56% rename from registry/storage/layer_test.go rename to registry/storage/blob_test.go index 2ea998131..6843922ac 100644 --- a/registry/storage/layer_test.go +++ b/registry/storage/blob_test.go @@ -13,14 +13,13 @@ import ( "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/docker/distribution/registry/storage/cache" - storagedriver "github.com/docker/distribution/registry/storage/driver" "github.com/docker/distribution/registry/storage/driver/inmemory" "github.com/docker/distribution/testutil" ) -// TestSimpleLayerUpload covers the layer upload process, exercising common +// TestSimpleBlobUpload covers the blob upload process, exercising common // error paths that might be seen during an upload. -func TestSimpleLayerUpload(t *testing.T) { +func TestSimpleBlobUpload(t *testing.T) { randomDataReader, tarSumStr, err := testutil.CreateRandomTarFile() if err != nil { @@ -36,35 +35,35 @@ func TestSimpleLayerUpload(t *testing.T) { ctx := context.Background() imageName := "foo/bar" driver := inmemory.New() - registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryLayerInfoCache()) + registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryBlobDescriptorCacheProvider()) repository, err := registry.Repository(ctx, imageName) if err != nil { t.Fatalf("unexpected error getting repo: %v", err) } - ls := repository.Layers() + bs := repository.Blobs(ctx) h := sha256.New() rd := io.TeeReader(randomDataReader, h) - layerUpload, err := ls.Upload() + blobUpload, err := bs.Create(ctx) if err != nil { t.Fatalf("unexpected error starting layer upload: %s", err) } // Cancel the upload then restart it - if err := layerUpload.Cancel(); err != nil { + if err := blobUpload.Cancel(ctx); err != nil { t.Fatalf("unexpected error during upload cancellation: %v", err) } // Do a resume, get unknown upload - layerUpload, err = ls.Resume(layerUpload.UUID()) - if err != distribution.ErrLayerUploadUnknown { + blobUpload, err = bs.Resume(ctx, blobUpload.ID()) + if err != distribution.ErrBlobUploadUnknown { t.Fatalf("unexpected error resuming upload, should be unkown: %v", err) } // Restart! - layerUpload, err = ls.Upload() + blobUpload, err = bs.Create(ctx) if err != nil { t.Fatalf("unexpected error starting layer upload: %s", err) } @@ -75,7 +74,7 @@ func TestSimpleLayerUpload(t *testing.T) { t.Fatalf("error getting seeker size of random data: %v", err) } - nn, err := io.Copy(layerUpload, rd) + nn, err := io.Copy(blobUpload, rd) if err != nil { t.Fatalf("unexpected error uploading layer data: %v", err) } @@ -84,46 +83,51 @@ func TestSimpleLayerUpload(t *testing.T) { t.Fatalf("layer data write incomplete") } - offset, err := layerUpload.Seek(0, os.SEEK_CUR) + offset, err := blobUpload.Seek(0, os.SEEK_CUR) if err != nil { t.Fatalf("unexpected error seeking layer upload: %v", err) } if offset != nn { - t.Fatalf("layerUpload not updated with correct offset: %v != %v", offset, nn) + t.Fatalf("blobUpload not updated with correct offset: %v != %v", offset, nn) } - layerUpload.Close() + blobUpload.Close() // Do a resume, for good fun - layerUpload, err = ls.Resume(layerUpload.UUID()) + blobUpload, err = bs.Resume(ctx, blobUpload.ID()) if err != nil { t.Fatalf("unexpected error resuming upload: %v", err) } sha256Digest := digest.NewDigest("sha256", h) - layer, err := layerUpload.Finish(dgst) - + desc, err := blobUpload.Commit(ctx, distribution.Descriptor{Digest: dgst}) if err != nil { t.Fatalf("unexpected error finishing layer upload: %v", err) } // After finishing an upload, it should no longer exist. - if _, err := ls.Resume(layerUpload.UUID()); err != distribution.ErrLayerUploadUnknown { + if _, err := bs.Resume(ctx, blobUpload.ID()); err != distribution.ErrBlobUploadUnknown { t.Fatalf("expected layer upload to be unknown, got %v", err) } // Test for existence. - exists, err := ls.Exists(layer.Digest()) + statDesc, err := bs.Stat(ctx, desc.Digest) if err != nil { - t.Fatalf("unexpected error checking for existence: %v", err) + t.Fatalf("unexpected error checking for existence: %v, %#v", err, bs) } - if !exists { - t.Fatalf("layer should now exist") + if statDesc != desc { + t.Fatalf("descriptors not equal: %v != %v", statDesc, desc) } + rc, err := bs.Open(ctx, desc.Digest) + if err != nil { + t.Fatalf("unexpected error opening blob for read: %v", err) + } + defer rc.Close() + h.Reset() - nn, err = io.Copy(h, layer) + nn, err = io.Copy(h, rc) if err != nil { t.Fatalf("error reading layer: %v", err) } @@ -137,21 +141,21 @@ func TestSimpleLayerUpload(t *testing.T) { } } -// TestSimpleLayerRead just creates a simple layer file and ensures that basic +// TestSimpleBlobRead just creates a simple blob file and ensures that basic // open, read, seek, read works. More specific edge cases should be covered in // other tests. -func TestSimpleLayerRead(t *testing.T) { +func TestSimpleBlobRead(t *testing.T) { ctx := context.Background() imageName := "foo/bar" driver := inmemory.New() - registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryLayerInfoCache()) + registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryBlobDescriptorCacheProvider()) repository, err := registry.Repository(ctx, imageName) if err != nil { t.Fatalf("unexpected error getting repo: %v", err) } - ls := repository.Layers() + bs := repository.Blobs(ctx) - randomLayerReader, tarSumStr, err := testutil.CreateRandomTarFile() + randomLayerReader, tarSumStr, err := testutil.CreateRandomTarFile() // TODO(stevvooe): Consider using just a random string. if err != nil { t.Fatalf("error creating random data: %v", err) } @@ -159,31 +163,14 @@ func TestSimpleLayerRead(t *testing.T) { dgst := digest.Digest(tarSumStr) // Test for existence. - exists, err := ls.Exists(dgst) - if err != nil { - t.Fatalf("unexpected error checking for existence: %v", err) + desc, err := bs.Stat(ctx, dgst) + if err != distribution.ErrBlobUnknown { + t.Fatalf("expected not found error when testing for existence: %v", err) } - if exists { - t.Fatalf("layer should not exist") - } - - // Try to get the layer and make sure we get a not found error - layer, err := ls.Fetch(dgst) - if err == nil { - t.Fatalf("error expected fetching unknown layer") - } - - switch err.(type) { - case distribution.ErrUnknownLayer: - err = nil - default: - t.Fatalf("unexpected error fetching non-existent layer: %v", err) - } - - randomLayerDigest, err := writeTestLayer(driver, defaultPathMapper, imageName, dgst, randomLayerReader) - if err != nil { - t.Fatalf("unexpected error writing test layer: %v", err) + rc, err := bs.Open(ctx, dgst) + if err != distribution.ErrBlobUnknown { + t.Fatalf("expected not found error when opening non-existent blob: %v", err) } randomLayerSize, err := seekerSize(randomLayerReader) @@ -191,45 +178,57 @@ func TestSimpleLayerRead(t *testing.T) { t.Fatalf("error getting seeker size for random layer: %v", err) } - layer, err = ls.Fetch(dgst) + descBefore := distribution.Descriptor{Digest: dgst, MediaType: "application/octet-stream", Length: randomLayerSize} + t.Logf("desc: %v", descBefore) + + desc, err = addBlob(ctx, bs, descBefore, randomLayerReader) if err != nil { - t.Fatal(err) + t.Fatalf("error adding blob to blobservice: %v", err) } - defer layer.Close() + + if desc.Length != randomLayerSize { + t.Fatalf("committed blob has incorrect length: %v != %v", desc.Length, randomLayerSize) + } + + rc, err = bs.Open(ctx, desc.Digest) // note that we are opening with original digest. + if err != nil { + t.Fatalf("error opening blob with %v: %v", dgst, err) + } + defer rc.Close() // Now check the sha digest and ensure its the same h := sha256.New() - nn, err := io.Copy(h, layer) - if err != nil && err != io.EOF { + nn, err := io.Copy(h, rc) + if err != nil { t.Fatalf("unexpected error copying to hash: %v", err) } if nn != randomLayerSize { - t.Fatalf("stored incorrect number of bytes in layer: %d != %d", nn, randomLayerSize) + t.Fatalf("stored incorrect number of bytes in blob: %d != %d", nn, randomLayerSize) } sha256Digest := digest.NewDigest("sha256", h) - if sha256Digest != randomLayerDigest { - t.Fatalf("fetched digest does not match: %q != %q", sha256Digest, randomLayerDigest) + if sha256Digest != desc.Digest { + t.Fatalf("fetched digest does not match: %q != %q", sha256Digest, desc.Digest) } - // Now seek back the layer, read the whole thing and check against randomLayerData - offset, err := layer.Seek(0, os.SEEK_SET) + // Now seek back the blob, read the whole thing and check against randomLayerData + offset, err := rc.Seek(0, os.SEEK_SET) if err != nil { - t.Fatalf("error seeking layer: %v", err) + t.Fatalf("error seeking blob: %v", err) } if offset != 0 { t.Fatalf("seek failed: expected 0 offset, got %d", offset) } - p, err := ioutil.ReadAll(layer) + p, err := ioutil.ReadAll(rc) if err != nil { - t.Fatalf("error reading all of layer: %v", err) + t.Fatalf("error reading all of blob: %v", err) } if len(p) != int(randomLayerSize) { - t.Fatalf("layer data read has different length: %v != %v", len(p), randomLayerSize) + t.Fatalf("blob data read has different length: %v != %v", len(p), randomLayerSize) } // Reset the randomLayerReader and read back the buffer @@ -253,19 +252,26 @@ func TestLayerUploadZeroLength(t *testing.T) { ctx := context.Background() imageName := "foo/bar" driver := inmemory.New() - registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryLayerInfoCache()) + registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryBlobDescriptorCacheProvider()) repository, err := registry.Repository(ctx, imageName) if err != nil { t.Fatalf("unexpected error getting repo: %v", err) } - ls := repository.Layers() + bs := repository.Blobs(ctx) - upload, err := ls.Upload() + wr, err := bs.Create(ctx) if err != nil { t.Fatalf("unexpected error starting upload: %v", err) } - io.Copy(upload, bytes.NewReader([]byte{})) + nn, err := io.Copy(wr, bytes.NewReader([]byte{})) + if err != nil { + t.Fatalf("error copying into blob writer: %v", err) + } + + if nn != 0 { + t.Fatalf("unexpected number of bytes copied: %v > 0", nn) + } dgst, err := digest.FromReader(bytes.NewReader([]byte{})) if err != nil { @@ -277,37 +283,16 @@ func TestLayerUploadZeroLength(t *testing.T) { t.Fatalf("digest not as expected: %v != %v", dgst, digest.DigestTarSumV1EmptyTar) } - layer, err := upload.Finish(dgst) + desc, err := wr.Commit(ctx, distribution.Descriptor{Digest: dgst}) if err != nil { - t.Fatalf("unexpected error finishing upload: %v", err) + t.Fatalf("unexpected error committing write: %v", err) } - if layer.Digest() != dgst { - t.Fatalf("unexpected digest: %v != %v", layer.Digest(), dgst) + if desc.Digest != dgst { + t.Fatalf("unexpected digest: %v != %v", desc.Digest, dgst) } } -// writeRandomLayer creates a random layer under name and tarSum using driver -// and pathMapper. An io.ReadSeeker with the data is returned, along with the -// sha256 hex digest. -func writeRandomLayer(driver storagedriver.StorageDriver, pathMapper *pathMapper, name string) (rs io.ReadSeeker, tarSum digest.Digest, sha256digest digest.Digest, err error) { - reader, tarSumStr, err := testutil.CreateRandomTarFile() - if err != nil { - return nil, "", "", err - } - - tarSum = digest.Digest(tarSumStr) - - // Now, actually create the layer. - randomLayerDigest, err := writeTestLayer(driver, pathMapper, name, tarSum, ioutil.NopCloser(reader)) - - if _, err := reader.Seek(0, os.SEEK_SET); err != nil { - return nil, "", "", err - } - - return reader, tarSum, randomLayerDigest, err -} - // seekerSize seeks to the end of seeker, checks the size and returns it to // the original state, returning the size. The state of the seeker should be // treated as unknown if an error is returned. @@ -334,46 +319,20 @@ func seekerSize(seeker io.ReadSeeker) (int64, error) { return end, nil } -// createTestLayer creates a simple test layer in the provided driver under -// tarsum dgst, returning the sha256 digest location. This is implemented -// piecemeal and should probably be replaced by the uploader when it's ready. -func writeTestLayer(driver storagedriver.StorageDriver, pathMapper *pathMapper, name string, dgst digest.Digest, content io.Reader) (digest.Digest, error) { - h := sha256.New() - rd := io.TeeReader(content, h) - - p, err := ioutil.ReadAll(rd) - +// addBlob simply consumes the reader and inserts into the blob service, +// returning a descriptor on success. +func addBlob(ctx context.Context, bs distribution.BlobIngester, desc distribution.Descriptor, rd io.Reader) (distribution.Descriptor, error) { + wr, err := bs.Create(ctx) if err != nil { - return "", nil + return distribution.Descriptor{}, err + } + defer wr.Cancel(ctx) + + if nn, err := io.Copy(wr, rd); err != nil { + return distribution.Descriptor{}, err + } else if nn != desc.Length { + return distribution.Descriptor{}, fmt.Errorf("incorrect number of bytes copied: %v != %v", nn, desc.Length) } - blobDigestSHA := digest.NewDigest("sha256", h) - - blobPath, err := pathMapper.path(blobDataPathSpec{ - digest: dgst, - }) - - ctx := context.Background() - if err := driver.PutContent(ctx, blobPath, p); err != nil { - return "", err - } - - if err != nil { - return "", err - } - - layerLinkPath, err := pathMapper.path(layerLinkPathSpec{ - name: name, - digest: dgst, - }) - - if err != nil { - return "", err - } - - if err := driver.PutContent(ctx, layerLinkPath, []byte(dgst)); err != nil { - return "", nil - } - - return blobDigestSHA, err + return wr.Commit(ctx, desc) } diff --git a/registry/storage/blobserver.go b/registry/storage/blobserver.go new file mode 100644 index 000000000..065453e60 --- /dev/null +++ b/registry/storage/blobserver.go @@ -0,0 +1,72 @@ +package storage + +import ( + "fmt" + "net/http" + "time" + + "github.com/docker/distribution" + "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" + "github.com/docker/distribution/registry/storage/driver" +) + +// TODO(stevvooe): This should configurable in the future. +const blobCacheControlMaxAge = 365 * 24 * time.Hour + +// blobServer simply serves blobs from a driver instance using a path function +// to identify paths and a descriptor service to fill in metadata. +type blobServer struct { + driver driver.StorageDriver + statter distribution.BlobStatter + pathFn func(dgst digest.Digest) (string, error) +} + +func (bs *blobServer) ServeBlob(ctx context.Context, w http.ResponseWriter, r *http.Request, dgst digest.Digest) error { + desc, err := bs.statter.Stat(ctx, dgst) + if err != nil { + return err + } + + path, err := bs.pathFn(desc.Digest) + if err != nil { + return err + } + + redirectURL, err := bs.driver.URLFor(ctx, path, map[string]interface{}{"method": r.Method}) + + switch err { + case nil: + // Redirect to storage URL. + http.Redirect(w, r, redirectURL, http.StatusTemporaryRedirect) + case driver.ErrUnsupportedMethod: + // Fallback to serving the content directly. + br, err := newFileReader(ctx, bs.driver, path, desc.Length) + if err != nil { + return err + } + defer br.Close() + + w.Header().Set("ETag", desc.Digest.String()) // If-None-Match handled by ServeContent + w.Header().Set("Cache-Control", fmt.Sprintf("max-age=%.f", blobCacheControlMaxAge.Seconds())) + + if w.Header().Get("Docker-Content-Digest") == "" { + w.Header().Set("Docker-Content-Digest", desc.Digest.String()) + } + + if w.Header().Get("Content-Type") == "" { + // Set the content type if not already set. + w.Header().Set("Content-Type", desc.MediaType) + } + + if w.Header().Get("Content-Length") == "" { + // Set the content length if not already set. + w.Header().Set("Content-Length", fmt.Sprint(desc.Length)) + } + + http.ServeContent(w, r, desc.Digest.String(), time.Time{}, br) + } + + // Some unexpected error. + return err +} diff --git a/registry/storage/blobstore.go b/registry/storage/blobstore.go index c0c869290..afe428479 100644 --- a/registry/storage/blobstore.go +++ b/registry/storage/blobstore.go @@ -1,133 +1,94 @@ package storage import ( - "fmt" - + "github.com/docker/distribution" "github.com/docker/distribution/context" "github.com/docker/distribution/digest" - storagedriver "github.com/docker/distribution/registry/storage/driver" + "github.com/docker/distribution/registry/storage/driver" ) -// TODO(stevvooe): Currently, the blobStore implementation used by the -// manifest store. The layer store should be refactored to better leverage the -// blobStore, reducing duplicated code. - -// blobStore implements a generalized blob store over a driver, supporting the -// read side and link management. This object is intentionally a leaky -// abstraction, providing utility methods that support creating and traversing -// backend links. +// blobStore implements a the read side of the blob store interface over a +// driver without enforcing per-repository membership. This object is +// intentionally a leaky abstraction, providing utility methods that support +// creating and traversing backend links. type blobStore struct { - driver storagedriver.StorageDriver - pm *pathMapper - ctx context.Context + driver driver.StorageDriver + pm *pathMapper + statter distribution.BlobStatter } -// exists reports whether or not the path exists. If the driver returns error -// other than storagedriver.PathNotFound, an error may be returned. -func (bs *blobStore) exists(dgst digest.Digest) (bool, error) { - path, err := bs.path(dgst) +var _ distribution.BlobProvider = &blobStore{} - if err != nil { - return false, err - } - - ok, err := exists(bs.ctx, bs.driver, path) - if err != nil { - return false, err - } - - return ok, nil -} - -// get retrieves the blob by digest, returning it a byte slice. This should -// only be used for small objects. -func (bs *blobStore) get(dgst digest.Digest) ([]byte, error) { +// Get implements the BlobReadService.Get call. +func (bs *blobStore) Get(ctx context.Context, dgst digest.Digest) ([]byte, error) { bp, err := bs.path(dgst) if err != nil { return nil, err } - return bs.driver.GetContent(bs.ctx, bp) -} + p, err := bs.driver.GetContent(ctx, bp) + if err != nil { + switch err.(type) { + case driver.PathNotFoundError: + return nil, distribution.ErrBlobUnknown + } -// link links the path to the provided digest by writing the digest into the -// target file. -func (bs *blobStore) link(path string, dgst digest.Digest) error { - if exists, err := bs.exists(dgst); err != nil { - return err - } else if !exists { - return fmt.Errorf("cannot link non-existent blob") + return nil, err } - // The contents of the "link" file are the exact string contents of the - // digest, which is specified in that package. - return bs.driver.PutContent(bs.ctx, path, []byte(dgst)) + return p, err } -// linked reads the link at path and returns the content. -func (bs *blobStore) linked(path string) ([]byte, error) { - linked, err := bs.readlink(path) +func (bs *blobStore) Open(ctx context.Context, dgst digest.Digest) (distribution.ReadSeekCloser, error) { + desc, err := bs.statter.Stat(ctx, dgst) if err != nil { return nil, err } - return bs.get(linked) + path, err := bs.path(desc.Digest) + if err != nil { + return nil, err + } + + return newFileReader(ctx, bs.driver, path, desc.Length) } -// readlink returns the linked digest at path. -func (bs *blobStore) readlink(path string) (digest.Digest, error) { - content, err := bs.driver.GetContent(bs.ctx, path) - if err != nil { - return "", err - } - - linked, err := digest.ParseDigest(string(content)) - if err != nil { - return "", err - } - - if exists, err := bs.exists(linked); err != nil { - return "", err - } else if !exists { - return "", fmt.Errorf("link %q invalid: blob %s does not exist", path, linked) - } - - return linked, nil -} - -// resolve reads the digest link at path and returns the blob store link. -func (bs *blobStore) resolve(path string) (string, error) { - dgst, err := bs.readlink(path) - if err != nil { - return "", err - } - - return bs.path(dgst) -} - -// put stores the content p in the blob store, calculating the digest. If the +// Put stores the content p in the blob store, calculating the digest. If the // content is already present, only the digest will be returned. This should -// only be used for small objects, such as manifests. -func (bs *blobStore) put(p []byte) (digest.Digest, error) { +// only be used for small objects, such as manifests. This implemented as a convenience for other Put implementations +func (bs *blobStore) Put(ctx context.Context, mediaType string, p []byte) (distribution.Descriptor, error) { dgst, err := digest.FromBytes(p) if err != nil { - context.GetLogger(bs.ctx).Errorf("error digesting content: %v, %s", err, string(p)) - return "", err + context.GetLogger(ctx).Errorf("blobStore: error digesting content: %v, %s", err, string(p)) + return distribution.Descriptor{}, err + } + + desc, err := bs.statter.Stat(ctx, dgst) + if err == nil { + // content already present + return desc, nil + } else if err != distribution.ErrBlobUnknown { + context.GetLogger(ctx).Errorf("blobStore: error stating content (%v): %#v", dgst, err) + // real error, return it + return distribution.Descriptor{}, err } bp, err := bs.path(dgst) if err != nil { - return "", err + return distribution.Descriptor{}, err } - // If the content already exists, just return the digest. - if exists, err := bs.exists(dgst); err != nil { - return "", err - } else if exists { - return dgst, nil - } + // TODO(stevvooe): Write out mediatype here, as well. - return dgst, bs.driver.PutContent(bs.ctx, bp, p) + return distribution.Descriptor{ + Length: int64(len(p)), + + // NOTE(stevvooe): The central blob store firewalls media types from + // other users. The caller should look this up and override the value + // for the specific repository. + MediaType: "application/octet-stream", + Digest: dgst, + }, bs.driver.PutContent(ctx, bp, p) } // path returns the canonical path for the blob identified by digest. The blob @@ -144,16 +105,86 @@ func (bs *blobStore) path(dgst digest.Digest) (string, error) { return bp, nil } -// exists provides a utility method to test whether or not a path exists -func exists(ctx context.Context, driver storagedriver.StorageDriver, path string) (bool, error) { - if _, err := driver.Stat(ctx, path); err != nil { +// link links the path to the provided digest by writing the digest into the +// target file. Caller must ensure that the blob actually exists. +func (bs *blobStore) link(ctx context.Context, path string, dgst digest.Digest) error { + // The contents of the "link" file are the exact string contents of the + // digest, which is specified in that package. + return bs.driver.PutContent(ctx, path, []byte(dgst)) +} + +// readlink returns the linked digest at path. +func (bs *blobStore) readlink(ctx context.Context, path string) (digest.Digest, error) { + content, err := bs.driver.GetContent(ctx, path) + if err != nil { + return "", err + } + + linked, err := digest.ParseDigest(string(content)) + if err != nil { + return "", err + } + + return linked, nil +} + +// resolve reads the digest link at path and returns the blob store path. +func (bs *blobStore) resolve(ctx context.Context, path string) (string, error) { + dgst, err := bs.readlink(ctx, path) + if err != nil { + return "", err + } + + return bs.path(dgst) +} + +type blobStatter struct { + driver driver.StorageDriver + pm *pathMapper +} + +var _ distribution.BlobStatter = &blobStatter{} + +// Stat implements BlobStatter.Stat by returning the descriptor for the blob +// in the main blob store. If this method returns successfully, there is +// strong guarantee that the blob exists and is available. +func (bs *blobStatter) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + path, err := bs.pm.path(blobDataPathSpec{ + digest: dgst, + }) + if err != nil { + return distribution.Descriptor{}, err + } + + fi, err := bs.driver.Stat(ctx, path) + if err != nil { switch err := err.(type) { - case storagedriver.PathNotFoundError: - return false, nil + case driver.PathNotFoundError: + return distribution.Descriptor{}, distribution.ErrBlobUnknown default: - return false, err + return distribution.Descriptor{}, err } } - return true, nil + if fi.IsDir() { + // NOTE(stevvooe): This represents a corruption situation. Somehow, we + // calculated a blob path and then detected a directory. We log the + // error and then error on the side of not knowing about the blob. + context.GetLogger(ctx).Warnf("blob path should not be a directory: %q", path) + return distribution.Descriptor{}, distribution.ErrBlobUnknown + } + + // TODO(stevvooe): Add method to resolve the mediatype. We can store and + // cache a "global" media type for the blob, even if a specific repo has a + // mediatype that overrides the main one. + + return distribution.Descriptor{ + Length: fi.Size(), + + // NOTE(stevvooe): The central blob store firewalls media types from + // other users. The caller should look this up and override the value + // for the specific repository. + MediaType: "application/octet-stream", + Digest: dgst, + }, nil } diff --git a/registry/storage/blobwriter.go b/registry/storage/blobwriter.go new file mode 100644 index 000000000..a9a625b69 --- /dev/null +++ b/registry/storage/blobwriter.go @@ -0,0 +1,469 @@ +package storage + +import ( + "fmt" + "io" + "os" + "path" + "strconv" + "time" + + "github.com/Sirupsen/logrus" + "github.com/docker/distribution" + "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" + storagedriver "github.com/docker/distribution/registry/storage/driver" +) + +// layerWriter is used to control the various aspects of resumable +// layer upload. It implements the LayerUpload interface. +type blobWriter struct { + blobStore *linkedBlobStore + + id string + startedAt time.Time + resumableDigester digest.ResumableDigester + + // implementes io.WriteSeeker, io.ReaderFrom and io.Closer to satisfy + // LayerUpload Interface + bufferedFileWriter +} + +var _ distribution.BlobWriter = &blobWriter{} + +// ID returns the identifier for this upload. +func (bw *blobWriter) ID() string { + return bw.id +} + +func (bw *blobWriter) StartedAt() time.Time { + return bw.startedAt +} + +// Commit marks the upload as completed, returning a valid descriptor. The +// final size and digest are checked against the first descriptor provided. +func (bw *blobWriter) Commit(ctx context.Context, desc distribution.Descriptor) (distribution.Descriptor, error) { + context.GetLogger(ctx).Debug("(*blobWriter).Commit") + + if err := bw.bufferedFileWriter.Close(); err != nil { + return distribution.Descriptor{}, err + } + + canonical, err := bw.validateBlob(ctx, desc) + if err != nil { + return distribution.Descriptor{}, err + } + + if err := bw.moveBlob(ctx, canonical); err != nil { + return distribution.Descriptor{}, err + } + + if err := bw.blobStore.linkBlob(ctx, canonical, desc.Digest); err != nil { + return distribution.Descriptor{}, err + } + + if err := bw.removeResources(ctx); err != nil { + return distribution.Descriptor{}, err + } + + return canonical, nil +} + +// Rollback the blob upload process, releasing any resources associated with +// the writer and canceling the operation. +func (bw *blobWriter) Cancel(ctx context.Context) error { + context.GetLogger(ctx).Debug("(*blobWriter).Rollback") + if err := bw.removeResources(ctx); err != nil { + return err + } + + bw.Close() + return nil +} + +func (bw *blobWriter) Write(p []byte) (int, error) { + if bw.resumableDigester == nil { + return bw.bufferedFileWriter.Write(p) + } + + // Ensure that the current write offset matches how many bytes have been + // written to the digester. If not, we need to update the digest state to + // match the current write position. + if err := bw.resumeHashAt(bw.blobStore.ctx, bw.offset); err != nil { + return 0, err + } + + return io.MultiWriter(&bw.bufferedFileWriter, bw.resumableDigester).Write(p) +} + +func (bw *blobWriter) ReadFrom(r io.Reader) (n int64, err error) { + if bw.resumableDigester == nil { + return bw.bufferedFileWriter.ReadFrom(r) + } + + // Ensure that the current write offset matches how many bytes have been + // written to the digester. If not, we need to update the digest state to + // match the current write position. + if err := bw.resumeHashAt(bw.blobStore.ctx, bw.offset); err != nil { + return 0, err + } + + return bw.bufferedFileWriter.ReadFrom(io.TeeReader(r, bw.resumableDigester)) +} + +func (bw *blobWriter) Close() error { + if bw.err != nil { + return bw.err + } + + if bw.resumableDigester != nil { + if err := bw.storeHashState(bw.blobStore.ctx); err != nil { + return err + } + } + + return bw.bufferedFileWriter.Close() +} + +// validateBlob checks the data against the digest, returning an error if it +// does not match. The canonical descriptor is returned. +func (bw *blobWriter) validateBlob(ctx context.Context, desc distribution.Descriptor) (distribution.Descriptor, error) { + var ( + verified, fullHash bool + canonical digest.Digest + ) + + if desc.Digest == "" { + // if no descriptors are provided, we have nothing to validate + // against. We don't really want to support this for the registry. + return distribution.Descriptor{}, distribution.ErrBlobInvalidDigest{ + Reason: fmt.Errorf("cannot validate against empty digest"), + } + } + + // Stat the on disk file + if fi, err := bw.bufferedFileWriter.driver.Stat(ctx, bw.path); err != nil { + switch err := err.(type) { + case storagedriver.PathNotFoundError: + // NOTE(stevvooe): We really don't care if the file is + // not actually present for the reader. We now assume + // that the desc length is zero. + desc.Length = 0 + default: + // Any other error we want propagated up the stack. + return distribution.Descriptor{}, err + } + } else { + if fi.IsDir() { + return distribution.Descriptor{}, fmt.Errorf("unexpected directory at upload location %q", bw.path) + } + + bw.size = fi.Size() + } + + if desc.Length > 0 { + if desc.Length != bw.size { + return distribution.Descriptor{}, distribution.ErrBlobInvalidLength + } + } else { + // if provided 0 or negative length, we can assume caller doesn't know or + // care about length. + desc.Length = bw.size + } + + if bw.resumableDigester != nil { + // Restore the hasher state to the end of the upload. + if err := bw.resumeHashAt(ctx, bw.size); err != nil { + return distribution.Descriptor{}, err + } + + canonical = bw.resumableDigester.Digest() + + if canonical.Algorithm() == desc.Digest.Algorithm() { + // Common case: client and server prefer the same canonical digest + // algorithm - currently SHA256. + verified = desc.Digest == canonical + } else { + // The client wants to use a different digest algorithm. They'll just + // have to be patient and wait for us to download and re-hash the + // uploaded content using that digest algorithm. + fullHash = true + } + } else { + // Not using resumable digests, so we need to hash the entire layer. + fullHash = true + } + + if fullHash { + digester := digest.NewCanonicalDigester() + + digestVerifier, err := digest.NewDigestVerifier(desc.Digest) + if err != nil { + return distribution.Descriptor{}, err + } + + // Read the file from the backend driver and validate it. + fr, err := newFileReader(ctx, bw.bufferedFileWriter.driver, bw.path, desc.Length) + if err != nil { + return distribution.Descriptor{}, err + } + + tr := io.TeeReader(fr, digester) + + if _, err := io.Copy(digestVerifier, tr); err != nil { + return distribution.Descriptor{}, err + } + + canonical = digester.Digest() + verified = digestVerifier.Verified() + } + + if !verified { + context.GetLoggerWithFields(ctx, + map[string]interface{}{ + "canonical": canonical, + "provided": desc.Digest, + }, "canonical", "provided"). + Errorf("canonical digest does match provided digest") + return distribution.Descriptor{}, distribution.ErrBlobInvalidDigest{ + Digest: desc.Digest, + Reason: fmt.Errorf("content does not match digest"), + } + } + + // update desc with canonical hash + desc.Digest = canonical + + if desc.MediaType == "" { + desc.MediaType = "application/octet-stream" + } + + return desc, nil +} + +// moveBlob moves the data into its final, hash-qualified destination, +// identified by dgst. The layer should be validated before commencing the +// move. +func (bw *blobWriter) moveBlob(ctx context.Context, desc distribution.Descriptor) error { + blobPath, err := bw.blobStore.pm.path(blobDataPathSpec{ + digest: desc.Digest, + }) + + if err != nil { + return err + } + + // Check for existence + if _, err := bw.blobStore.driver.Stat(ctx, blobPath); err != nil { + switch err := err.(type) { + case storagedriver.PathNotFoundError: + break // ensure that it doesn't exist. + default: + return err + } + } else { + // If the path exists, we can assume that the content has already + // been uploaded, since the blob storage is content-addressable. + // While it may be corrupted, detection of such corruption belongs + // elsewhere. + return nil + } + + // If no data was received, we may not actually have a file on disk. Check + // the size here and write a zero-length file to blobPath if this is the + // case. For the most part, this should only ever happen with zero-length + // tars. + if _, err := bw.blobStore.driver.Stat(ctx, bw.path); err != nil { + switch err := err.(type) { + case storagedriver.PathNotFoundError: + // HACK(stevvooe): This is slightly dangerous: if we verify above, + // get a hash, then the underlying file is deleted, we risk moving + // a zero-length blob into a nonzero-length blob location. To + // prevent this horrid thing, we employ the hack of only allowing + // to this happen for the zero tarsum. + if desc.Digest == digest.DigestSha256EmptyTar { + return bw.blobStore.driver.PutContent(ctx, blobPath, []byte{}) + } + + // We let this fail during the move below. + logrus. + WithField("upload.id", bw.ID()). + WithField("digest", desc.Digest).Warnf("attempted to move zero-length content with non-zero digest") + default: + return err // unrelated error + } + } + + // TODO(stevvooe): We should also write the mediatype when executing this move. + + return bw.blobStore.driver.Move(ctx, bw.path, blobPath) +} + +type hashStateEntry struct { + offset int64 + path string +} + +// getStoredHashStates returns a slice of hashStateEntries for this upload. +func (bw *blobWriter) getStoredHashStates(ctx context.Context) ([]hashStateEntry, error) { + uploadHashStatePathPrefix, err := bw.blobStore.pm.path(uploadHashStatePathSpec{ + name: bw.blobStore.repository.Name(), + id: bw.id, + alg: bw.resumableDigester.Digest().Algorithm(), + list: true, + }) + if err != nil { + return nil, err + } + + paths, err := bw.blobStore.driver.List(ctx, uploadHashStatePathPrefix) + if err != nil { + if _, ok := err.(storagedriver.PathNotFoundError); !ok { + return nil, err + } + // Treat PathNotFoundError as no entries. + paths = nil + } + + hashStateEntries := make([]hashStateEntry, 0, len(paths)) + + for _, p := range paths { + pathSuffix := path.Base(p) + // The suffix should be the offset. + offset, err := strconv.ParseInt(pathSuffix, 0, 64) + if err != nil { + logrus.Errorf("unable to parse offset from upload state path %q: %s", p, err) + } + + hashStateEntries = append(hashStateEntries, hashStateEntry{offset: offset, path: p}) + } + + return hashStateEntries, nil +} + +// resumeHashAt attempts to restore the state of the internal hash function +// by loading the most recent saved hash state less than or equal to the given +// offset. Any unhashed bytes remaining less than the given offset are hashed +// from the content uploaded so far. +func (bw *blobWriter) resumeHashAt(ctx context.Context, offset int64) error { + if offset < 0 { + return fmt.Errorf("cannot resume hash at negative offset: %d", offset) + } + + if offset == int64(bw.resumableDigester.Len()) { + // State of digester is already at the requested offset. + return nil + } + + // List hash states from storage backend. + var hashStateMatch hashStateEntry + hashStates, err := bw.getStoredHashStates(ctx) + if err != nil { + return fmt.Errorf("unable to get stored hash states with offset %d: %s", offset, err) + } + + // Find the highest stored hashState with offset less than or equal to + // the requested offset. + for _, hashState := range hashStates { + if hashState.offset == offset { + hashStateMatch = hashState + break // Found an exact offset match. + } else if hashState.offset < offset && hashState.offset > hashStateMatch.offset { + // This offset is closer to the requested offset. + hashStateMatch = hashState + } else if hashState.offset > offset { + // Remove any stored hash state with offsets higher than this one + // as writes to this resumed hasher will make those invalid. This + // is probably okay to skip for now since we don't expect anyone to + // use the API in this way. For that reason, we don't treat an + // an error here as a fatal error, but only log it. + if err := bw.driver.Delete(ctx, hashState.path); err != nil { + logrus.Errorf("unable to delete stale hash state %q: %s", hashState.path, err) + } + } + } + + if hashStateMatch.offset == 0 { + // No need to load any state, just reset the hasher. + bw.resumableDigester.Reset() + } else { + storedState, err := bw.driver.GetContent(ctx, hashStateMatch.path) + if err != nil { + return err + } + + if err = bw.resumableDigester.Restore(storedState); err != nil { + return err + } + } + + // Mind the gap. + if gapLen := offset - int64(bw.resumableDigester.Len()); gapLen > 0 { + // Need to read content from the upload to catch up to the desired offset. + fr, err := newFileReader(ctx, bw.driver, bw.path, bw.size) + if err != nil { + return err + } + + if _, err = fr.Seek(int64(bw.resumableDigester.Len()), os.SEEK_SET); err != nil { + return fmt.Errorf("unable to seek to layer reader offset %d: %s", bw.resumableDigester.Len(), err) + } + + if _, err := io.CopyN(bw.resumableDigester, fr, gapLen); err != nil { + return err + } + } + + return nil +} + +func (bw *blobWriter) storeHashState(ctx context.Context) error { + uploadHashStatePath, err := bw.blobStore.pm.path(uploadHashStatePathSpec{ + name: bw.blobStore.repository.Name(), + id: bw.id, + alg: bw.resumableDigester.Digest().Algorithm(), + offset: int64(bw.resumableDigester.Len()), + }) + if err != nil { + return err + } + + hashState, err := bw.resumableDigester.State() + if err != nil { + return err + } + + return bw.driver.PutContent(ctx, uploadHashStatePath, hashState) +} + +// removeResources should clean up all resources associated with the upload +// instance. An error will be returned if the clean up cannot proceed. If the +// resources are already not present, no error will be returned. +func (bw *blobWriter) removeResources(ctx context.Context) error { + dataPath, err := bw.blobStore.pm.path(uploadDataPathSpec{ + name: bw.blobStore.repository.Name(), + id: bw.id, + }) + + if err != nil { + return err + } + + // Resolve and delete the containing directory, which should include any + // upload related files. + dirPath := path.Dir(dataPath) + if err := bw.blobStore.driver.Delete(ctx, dirPath); err != nil { + switch err := err.(type) { + case storagedriver.PathNotFoundError: + break // already gone! + default: + // This should be uncommon enough such that returning an error + // should be okay. At this point, the upload should be mostly + // complete, but perhaps the backend became unaccessible. + context.GetLogger(ctx).Errorf("unable to delete layer upload resources %q: %v", dirPath, err) + return err + } + } + + return nil +} diff --git a/registry/storage/blobwriter_nonresumable.go b/registry/storage/blobwriter_nonresumable.go new file mode 100644 index 000000000..ac2d78778 --- /dev/null +++ b/registry/storage/blobwriter_nonresumable.go @@ -0,0 +1,6 @@ +// +build noresumabledigest + +package storage + +func (bw *blobWriter) setupResumableDigester() { +} diff --git a/registry/storage/blobwriter_resumable.go b/registry/storage/blobwriter_resumable.go new file mode 100644 index 000000000..f20a6c36b --- /dev/null +++ b/registry/storage/blobwriter_resumable.go @@ -0,0 +1,9 @@ +// +build !noresumabledigest + +package storage + +import "github.com/docker/distribution/digest" + +func (bw *blobWriter) setupResumableDigester() { + bw.resumableDigester = digest.NewCanonicalResumableDigester() +} diff --git a/registry/storage/cache/cache.go b/registry/storage/cache/cache.go index a21cefd57..e7471c270 100644 --- a/registry/storage/cache/cache.go +++ b/registry/storage/cache/cache.go @@ -1,98 +1,38 @@ // Package cache provides facilities to speed up access to the storage -// backend. Typically cache implementations deal with internal implementation -// details at the backend level, rather than generalized caches for -// distribution related interfaces. In other words, unless the cache is -// specific to the storage package, it belongs in another package. +// backend. package cache import ( "fmt" + "github.com/docker/distribution" "github.com/docker/distribution/digest" - "golang.org/x/net/context" ) -// ErrNotFound is returned when a meta item is not found. -var ErrNotFound = fmt.Errorf("not found") +// BlobDescriptorCacheProvider provides repository scoped +// BlobDescriptorService cache instances and a global descriptor cache. +type BlobDescriptorCacheProvider interface { + distribution.BlobDescriptorService -// LayerMeta describes the backend location and length of layer data. -type LayerMeta struct { - Path string - Length int64 + RepositoryScoped(repo string) (distribution.BlobDescriptorService, error) } -// LayerInfoCache is a driver-aware cache of layer metadata. Basically, it -// provides a fast cache for checks against repository metadata, avoiding -// round trips to backend storage. Note that this is different from a pure -// layer cache, which would also provide access to backing data, as well. Such -// a cache should be implemented as a middleware, rather than integrated with -// the storage backend. -// -// Note that most implementations rely on the caller to do strict checks on on -// repo and dgst arguments, since these are mostly used behind existing -// implementations. -type LayerInfoCache interface { - // Contains returns true if the repository with name contains the layer. - Contains(ctx context.Context, repo string, dgst digest.Digest) (bool, error) - - // Add includes the layer in the given repository cache. - Add(ctx context.Context, repo string, dgst digest.Digest) error - - // Meta provides the location of the layer on the backend and its size. Membership of a - // repository should be tested before using the result, if required. - Meta(ctx context.Context, dgst digest.Digest) (LayerMeta, error) - - // SetMeta sets the meta data for the given layer. - SetMeta(ctx context.Context, dgst digest.Digest, meta LayerMeta) error +func validateDigest(dgst digest.Digest) error { + return dgst.Validate() } -// base implements common checks between cache implementations. Note that -// these are not full checks of input, since that should be done by the -// caller. -type base struct { - LayerInfoCache -} - -func (b *base) Contains(ctx context.Context, repo string, dgst digest.Digest) (bool, error) { - if repo == "" { - return false, fmt.Errorf("cache: cannot check for empty repository name") - } - - if dgst == "" { - return false, fmt.Errorf("cache: cannot check for empty digests") - } - - return b.LayerInfoCache.Contains(ctx, repo, dgst) -} - -func (b *base) Add(ctx context.Context, repo string, dgst digest.Digest) error { - if repo == "" { - return fmt.Errorf("cache: cannot add empty repository name") - } - - if dgst == "" { - return fmt.Errorf("cache: cannot add empty digest") - } - - return b.LayerInfoCache.Add(ctx, repo, dgst) -} - -func (b *base) Meta(ctx context.Context, dgst digest.Digest) (LayerMeta, error) { - if dgst == "" { - return LayerMeta{}, fmt.Errorf("cache: cannot get meta for empty digest") - } - - return b.LayerInfoCache.Meta(ctx, dgst) -} - -func (b *base) SetMeta(ctx context.Context, dgst digest.Digest, meta LayerMeta) error { - if dgst == "" { - return fmt.Errorf("cache: cannot set meta for empty digest") - } - - if meta.Path == "" { - return fmt.Errorf("cache: cannot set empty path for meta") - } - - return b.LayerInfoCache.SetMeta(ctx, dgst, meta) +func validateDescriptor(desc distribution.Descriptor) error { + if err := validateDigest(desc.Digest); err != nil { + return err + } + + if desc.Length < 0 { + return fmt.Errorf("cache: invalid length in descriptor: %v < 0", desc.Length) + } + + if desc.MediaType == "" { + return fmt.Errorf("cache: empty mediatype on descriptor: %v", desc) + } + + return nil } diff --git a/registry/storage/cache/cache_test.go b/registry/storage/cache/cache_test.go index 48cef955e..e923367a1 100644 --- a/registry/storage/cache/cache_test.go +++ b/registry/storage/cache/cache_test.go @@ -3,84 +3,139 @@ package cache import ( "testing" - "golang.org/x/net/context" + "github.com/docker/distribution" + "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" ) -// checkLayerInfoCache takes a cache implementation through a common set of -// operations. If adding new tests, please add them here so new +// checkBlobDescriptorCache takes a cache implementation through a common set +// of operations. If adding new tests, please add them here so new // implementations get the benefit. -func checkLayerInfoCache(t *testing.T, lic LayerInfoCache) { +func checkBlobDescriptorCache(t *testing.T, provider BlobDescriptorCacheProvider) { ctx := context.Background() - exists, err := lic.Contains(ctx, "", "fake:abc") + checkBlobDescriptorCacheEmptyRepository(t, ctx, provider) + checkBlobDescriptorCacheSetAndRead(t, ctx, provider) +} + +func checkBlobDescriptorCacheEmptyRepository(t *testing.T, ctx context.Context, provider BlobDescriptorCacheProvider) { + if _, err := provider.Stat(ctx, "sha384:abc"); err != distribution.ErrBlobUnknown { + t.Fatalf("expected unknown blob error with empty store: %v", err) + } + + cache, err := provider.RepositoryScoped("") if err == nil { - t.Fatalf("expected error checking for cache item with empty repo") + t.Fatalf("expected an error when asking for invalid repo") } - exists, err = lic.Contains(ctx, "foo/bar", "") - if err == nil { - t.Fatalf("expected error checking for cache item with empty digest") - } - - exists, err = lic.Contains(ctx, "foo/bar", "fake:abc") + cache, err = provider.RepositoryScoped("foo/bar") if err != nil { - t.Fatalf("unexpected error checking for cache item: %v", err) + t.Fatalf("unexpected error getting repository: %v", err) } - if exists { - t.Fatalf("item should not exist") + if err := cache.SetDescriptor(ctx, "", distribution.Descriptor{ + Digest: "sha384:abc", + Length: 10, + MediaType: "application/octet-stream"}); err != digest.ErrDigestInvalidFormat { + t.Fatalf("expected error with invalid digest: %v", err) } - if err := lic.Add(ctx, "", "fake:abc"); err == nil { - t.Fatalf("expected error adding cache item with empty name") + if err := cache.SetDescriptor(ctx, "sha384:abc", distribution.Descriptor{ + Digest: "", + Length: 10, + MediaType: "application/octet-stream"}); err == nil { + t.Fatalf("expected error setting value on invalid descriptor") } - if err := lic.Add(ctx, "foo/bar", ""); err == nil { - t.Fatalf("expected error adding cache item with empty digest") + if _, err := cache.Stat(ctx, ""); err != digest.ErrDigestInvalidFormat { + t.Fatalf("expected error checking for cache item with empty digest: %v", err) } - if err := lic.Add(ctx, "foo/bar", "fake:abc"); err != nil { - t.Fatalf("unexpected error adding item: %v", err) - } - - exists, err = lic.Contains(ctx, "foo/bar", "fake:abc") - if err != nil { - t.Fatalf("unexpected error checking for cache item: %v", err) - } - - if !exists { - t.Fatalf("item should exist") - } - - _, err = lic.Meta(ctx, "") - if err == nil || err == ErrNotFound { - t.Fatalf("expected error getting meta for cache item with empty digest") - } - - _, err = lic.Meta(ctx, "fake:abc") - if err != ErrNotFound { - t.Fatalf("expected unknown layer error getting meta for cache item with empty digest") - } - - if err = lic.SetMeta(ctx, "", LayerMeta{}); err == nil { - t.Fatalf("expected error setting meta for cache item with empty digest") - } - - if err = lic.SetMeta(ctx, "foo/bar", LayerMeta{}); err == nil { - t.Fatalf("expected error setting meta for cache item with empty meta") - } - - expected := LayerMeta{Path: "/foo/bar", Length: 20} - if err := lic.SetMeta(ctx, "foo/bar", expected); err != nil { - t.Fatalf("unexpected error setting meta: %v", err) - } - - meta, err := lic.Meta(ctx, "foo/bar") - if err != nil { - t.Fatalf("unexpected error getting meta: %v", err) - } - - if meta != expected { - t.Fatalf("retrieved meta data did not match: %v", err) + if _, err := cache.Stat(ctx, "sha384:abc"); err != distribution.ErrBlobUnknown { + t.Fatalf("expected unknown blob error with empty repo: %v", err) + } +} + +func checkBlobDescriptorCacheSetAndRead(t *testing.T, ctx context.Context, provider BlobDescriptorCacheProvider) { + localDigest := digest.Digest("sha384:abc") + expected := distribution.Descriptor{ + Digest: "sha256:abc", + Length: 10, + MediaType: "application/octet-stream"} + + cache, err := provider.RepositoryScoped("foo/bar") + if err != nil { + t.Fatalf("unexpected error getting scoped cache: %v", err) + } + + if err := cache.SetDescriptor(ctx, localDigest, expected); err != nil { + t.Fatalf("error setting descriptor: %v", err) + } + + desc, err := cache.Stat(ctx, localDigest) + if err != nil { + t.Fatalf("unexpected error statting fake2:abc: %v", err) + } + + if expected != desc { + t.Fatalf("unexpected descriptor: %#v != %#v", expected, desc) + } + + // also check that we set the canonical key ("fake:abc") + desc, err = cache.Stat(ctx, localDigest) + if err != nil { + t.Fatalf("descriptor not returned for canonical key: %v", err) + } + + if expected != desc { + t.Fatalf("unexpected descriptor: %#v != %#v", expected, desc) + } + + // ensure that global gets extra descriptor mapping + desc, err = provider.Stat(ctx, localDigest) + if err != nil { + t.Fatalf("expected blob unknown in global cache: %v, %v", err, desc) + } + + if desc != expected { + t.Fatalf("unexpected descriptor: %#v != %#v", expected, desc) + } + + // get at it through canonical descriptor + desc, err = provider.Stat(ctx, expected.Digest) + if err != nil { + t.Fatalf("unexpected error checking glboal descriptor: %v", err) + } + + if desc != expected { + t.Fatalf("unexpected descriptor: %#v != %#v", expected, desc) + } + + // now, we set the repo local mediatype to something else and ensure it + // doesn't get changed in the provider cache. + expected.MediaType = "application/json" + + if err := cache.SetDescriptor(ctx, localDigest, expected); err != nil { + t.Fatalf("unexpected error setting descriptor: %v", err) + } + + desc, err = cache.Stat(ctx, localDigest) + if err != nil { + t.Fatalf("unexpected error getting descriptor: %v", err) + } + + if desc != expected { + t.Fatalf("unexpected descriptor: %#v != %#v", desc, expected) + } + + desc, err = provider.Stat(ctx, localDigest) + if err != nil { + t.Fatalf("unexpected error getting global descriptor: %v", err) + } + + expected.MediaType = "application/octet-stream" // expect original mediatype in global + + if desc != expected { + t.Fatalf("unexpected descriptor: %#v != %#v", desc, expected) } } diff --git a/registry/storage/cache/memory.go b/registry/storage/cache/memory.go index 6d9497925..40ab0d941 100644 --- a/registry/storage/cache/memory.go +++ b/registry/storage/cache/memory.go @@ -1,63 +1,149 @@ package cache import ( + "sync" + + "github.com/docker/distribution" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" - "golang.org/x/net/context" + "github.com/docker/distribution/registry/api/v2" ) -// inmemoryLayerInfoCache is a map-based implementation of LayerInfoCache. -type inmemoryLayerInfoCache struct { - membership map[string]map[digest.Digest]struct{} - meta map[digest.Digest]LayerMeta +type inMemoryBlobDescriptorCacheProvider struct { + global *mapBlobDescriptorCache + repositories map[string]*mapBlobDescriptorCache + mu sync.RWMutex } -// NewInMemoryLayerInfoCache provides an implementation of LayerInfoCache that -// stores results in memory. -func NewInMemoryLayerInfoCache() LayerInfoCache { - return &base{&inmemoryLayerInfoCache{ - membership: make(map[string]map[digest.Digest]struct{}), - meta: make(map[digest.Digest]LayerMeta), - }} +// NewInMemoryBlobDescriptorCacheProvider returns a new mapped-based cache for +// storing blob descriptor data. +func NewInMemoryBlobDescriptorCacheProvider() BlobDescriptorCacheProvider { + return &inMemoryBlobDescriptorCacheProvider{ + global: newMapBlobDescriptorCache(), + repositories: make(map[string]*mapBlobDescriptorCache), + } } -func (ilic *inmemoryLayerInfoCache) Contains(ctx context.Context, repo string, dgst digest.Digest) (bool, error) { - members, ok := ilic.membership[repo] - if !ok { - return false, nil +func (imbdcp *inMemoryBlobDescriptorCacheProvider) RepositoryScoped(repo string) (distribution.BlobDescriptorService, error) { + if err := v2.ValidateRespositoryName(repo); err != nil { + return nil, err } - _, ok = members[dgst] - return ok, nil + imbdcp.mu.RLock() + defer imbdcp.mu.RUnlock() + + return &repositoryScopedInMemoryBlobDescriptorCache{ + repo: repo, + parent: imbdcp, + repository: imbdcp.repositories[repo], + }, nil } -// Add adds the layer to the redis repository blob set. -func (ilic *inmemoryLayerInfoCache) Add(ctx context.Context, repo string, dgst digest.Digest) error { - members, ok := ilic.membership[repo] - if !ok { - members = make(map[digest.Digest]struct{}) - ilic.membership[repo] = members +func (imbdcp *inMemoryBlobDescriptorCacheProvider) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + return imbdcp.global.Stat(ctx, dgst) +} + +func (imbdcp *inMemoryBlobDescriptorCacheProvider) SetDescriptor(ctx context.Context, dgst digest.Digest, desc distribution.Descriptor) error { + _, err := imbdcp.Stat(ctx, dgst) + if err == distribution.ErrBlobUnknown { + + if dgst.Algorithm() != desc.Digest.Algorithm() && dgst != desc.Digest { + // if the digests differ, set the other canonical mapping + if err := imbdcp.global.SetDescriptor(ctx, desc.Digest, desc); err != nil { + return err + } + } + + // unknown, just set it + return imbdcp.global.SetDescriptor(ctx, dgst, desc) } - members[dgst] = struct{}{} + // we already know it, do nothing + return err +} - return nil -} - -// Meta retrieves the layer meta data from the redis hash, returning -// ErrUnknownLayer if not found. -func (ilic *inmemoryLayerInfoCache) Meta(ctx context.Context, dgst digest.Digest) (LayerMeta, error) { - meta, ok := ilic.meta[dgst] - if !ok { - return LayerMeta{}, ErrNotFound - } - - return meta, nil -} - -// SetMeta sets the meta data for the given digest using a redis hash. A hash -// is used here since we may store unrelated fields about a layer in the -// future. -func (ilic *inmemoryLayerInfoCache) SetMeta(ctx context.Context, dgst digest.Digest, meta LayerMeta) error { - ilic.meta[dgst] = meta +// repositoryScopedInMemoryBlobDescriptorCache provides the request scoped +// repository cache. Instances are not thread-safe but the delegated +// operations are. +type repositoryScopedInMemoryBlobDescriptorCache struct { + repo string + parent *inMemoryBlobDescriptorCacheProvider // allows lazy allocation of repo's map + repository *mapBlobDescriptorCache +} + +func (rsimbdcp *repositoryScopedInMemoryBlobDescriptorCache) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + if rsimbdcp.repository == nil { + return distribution.Descriptor{}, distribution.ErrBlobUnknown + } + + return rsimbdcp.repository.Stat(ctx, dgst) +} + +func (rsimbdcp *repositoryScopedInMemoryBlobDescriptorCache) SetDescriptor(ctx context.Context, dgst digest.Digest, desc distribution.Descriptor) error { + if rsimbdcp.repository == nil { + // allocate map since we are setting it now. + rsimbdcp.parent.mu.Lock() + var ok bool + // have to read back value since we may have allocated elsewhere. + rsimbdcp.repository, ok = rsimbdcp.parent.repositories[rsimbdcp.repo] + if !ok { + rsimbdcp.repository = newMapBlobDescriptorCache() + rsimbdcp.parent.repositories[rsimbdcp.repo] = rsimbdcp.repository + } + + rsimbdcp.parent.mu.Unlock() + } + + if err := rsimbdcp.repository.SetDescriptor(ctx, dgst, desc); err != nil { + return err + } + + return rsimbdcp.parent.SetDescriptor(ctx, dgst, desc) +} + +// mapBlobDescriptorCache provides a simple map-based implementation of the +// descriptor cache. +type mapBlobDescriptorCache struct { + descriptors map[digest.Digest]distribution.Descriptor + mu sync.RWMutex +} + +var _ distribution.BlobDescriptorService = &mapBlobDescriptorCache{} + +func newMapBlobDescriptorCache() *mapBlobDescriptorCache { + return &mapBlobDescriptorCache{ + descriptors: make(map[digest.Digest]distribution.Descriptor), + } +} + +func (mbdc *mapBlobDescriptorCache) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + if err := validateDigest(dgst); err != nil { + return distribution.Descriptor{}, err + } + + mbdc.mu.RLock() + defer mbdc.mu.RUnlock() + + desc, ok := mbdc.descriptors[dgst] + if !ok { + return distribution.Descriptor{}, distribution.ErrBlobUnknown + } + + return desc, nil +} + +func (mbdc *mapBlobDescriptorCache) SetDescriptor(ctx context.Context, dgst digest.Digest, desc distribution.Descriptor) error { + if err := validateDigest(dgst); err != nil { + return err + } + + if err := validateDescriptor(desc); err != nil { + return err + } + + mbdc.mu.Lock() + defer mbdc.mu.Unlock() + + mbdc.descriptors[dgst] = desc return nil } diff --git a/registry/storage/cache/memory_test.go b/registry/storage/cache/memory_test.go index 417e982e2..9f2ce460e 100644 --- a/registry/storage/cache/memory_test.go +++ b/registry/storage/cache/memory_test.go @@ -2,8 +2,8 @@ package cache import "testing" -// TestInMemoryLayerInfoCache checks the in memory implementation is working +// TestInMemoryBlobInfoCache checks the in memory implementation is working // correctly. -func TestInMemoryLayerInfoCache(t *testing.T) { - checkLayerInfoCache(t, NewInMemoryLayerInfoCache()) +func TestInMemoryBlobInfoCache(t *testing.T) { + checkBlobDescriptorCache(t, NewInMemoryBlobDescriptorCacheProvider()) } diff --git a/registry/storage/cache/redis.go b/registry/storage/cache/redis.go index 6b8f7679a..c0e542bc5 100644 --- a/registry/storage/cache/redis.go +++ b/registry/storage/cache/redis.go @@ -1,20 +1,28 @@ package cache import ( - ctxu "github.com/docker/distribution/context" + "fmt" + + "github.com/docker/distribution/registry/api/v2" + + "github.com/docker/distribution" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/garyburd/redigo/redis" - "golang.org/x/net/context" ) -// redisLayerInfoCache provides an implementation of storage.LayerInfoCache -// based on redis. Layer info is stored in two parts. The first provide fast -// access to repository membership through a redis set for each repo. The -// second is a redis hash keyed by the digest of the layer, providing path and -// length information. Note that there is no implied relationship between -// these two caches. The layer may exist in one, both or none and the code -// must be written this way. -type redisLayerInfoCache struct { +// redisBlobStatService provides an implementation of +// BlobDescriptorCacheProvider based on redis. Blob descritors are stored in +// two parts. The first provide fast access to repository membership through a +// redis set for each repo. The second is a redis hash keyed by the digest of +// the layer, providing path, length and mediatype information. There is also +// a per-repository redis hash of the blob descriptor, allowing override of +// data. This is currently used to override the mediatype on a per-repository +// basis. +// +// Note that there is no implied relationship between these two caches. The +// layer may exist in one, both or none and the code must be written this way. +type redisBlobDescriptorService struct { pool *redis.Pool // TODO(stevvooe): We use a pool because we don't have great control over @@ -23,76 +31,194 @@ type redisLayerInfoCache struct { // request objects, we can change this to a connection. } -// NewRedisLayerInfoCache returns a new redis-based LayerInfoCache using the -// provided redis connection pool. -func NewRedisLayerInfoCache(pool *redis.Pool) LayerInfoCache { - return &base{&redisLayerInfoCache{ +var _ BlobDescriptorCacheProvider = &redisBlobDescriptorService{} + +// NewRedisBlobDescriptorCacheProvider returns a new redis-based +// BlobDescriptorCacheProvider using the provided redis connection pool. +func NewRedisBlobDescriptorCacheProvider(pool *redis.Pool) BlobDescriptorCacheProvider { + return &redisBlobDescriptorService{ pool: pool, - }} + } } -// Contains does a membership check on the repository blob set in redis. This -// is used as an access check before looking up global path information. If -// false is returned, the caller should still check the backend to if it -// exists elsewhere. -func (rlic *redisLayerInfoCache) Contains(ctx context.Context, repo string, dgst digest.Digest) (bool, error) { - conn := rlic.pool.Get() - defer conn.Close() +// RepositoryScoped returns the scoped cache. +func (rbds *redisBlobDescriptorService) RepositoryScoped(repo string) (distribution.BlobDescriptorService, error) { + if err := v2.ValidateRespositoryName(repo); err != nil { + return nil, err + } - ctxu.GetLogger(ctx).Debugf("(*redisLayerInfoCache).Contains(%q, %q)", repo, dgst) - return redis.Bool(conn.Do("SISMEMBER", rlic.repositoryBlobSetKey(repo), dgst)) + return &repositoryScopedRedisBlobDescriptorService{ + repo: repo, + upstream: rbds, + }, nil } -// Add adds the layer to the redis repository blob set. -func (rlic *redisLayerInfoCache) Add(ctx context.Context, repo string, dgst digest.Digest) error { - conn := rlic.pool.Get() +// Stat retrieves the descriptor data from the redis hash entry. +func (rbds *redisBlobDescriptorService) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + if err := validateDigest(dgst); err != nil { + return distribution.Descriptor{}, err + } + + conn := rbds.pool.Get() defer conn.Close() - ctxu.GetLogger(ctx).Debugf("(*redisLayerInfoCache).Add(%q, %q)", repo, dgst) - _, err := conn.Do("SADD", rlic.repositoryBlobSetKey(repo), dgst) - return err + return rbds.stat(ctx, conn, dgst) } -// Meta retrieves the layer meta data from the redis hash, returning -// ErrUnknownLayer if not found. -func (rlic *redisLayerInfoCache) Meta(ctx context.Context, dgst digest.Digest) (LayerMeta, error) { - conn := rlic.pool.Get() - defer conn.Close() - - reply, err := redis.Values(conn.Do("HMGET", rlic.blobMetaHashKey(dgst), "path", "length")) +// stat provides an internal stat call that takes a connection parameter. This +// allows some internal management of the connection scope. +func (rbds *redisBlobDescriptorService) stat(ctx context.Context, conn redis.Conn, dgst digest.Digest) (distribution.Descriptor, error) { + reply, err := redis.Values(conn.Do("HMGET", rbds.blobDescriptorHashKey(dgst), "digest", "length", "mediatype")) if err != nil { - return LayerMeta{}, err + return distribution.Descriptor{}, err } - if len(reply) < 2 || reply[0] == nil || reply[1] == nil { - return LayerMeta{}, ErrNotFound + if len(reply) < 2 || reply[0] == nil || reply[1] == nil { // don't care if mediatype is nil + return distribution.Descriptor{}, distribution.ErrBlobUnknown } - var meta LayerMeta - if _, err := redis.Scan(reply, &meta.Path, &meta.Length); err != nil { - return LayerMeta{}, err + var desc distribution.Descriptor + if _, err := redis.Scan(reply, &desc.Digest, &desc.Length, &desc.MediaType); err != nil { + return distribution.Descriptor{}, err } - return meta, nil + return desc, nil } -// SetMeta sets the meta data for the given digest using a redis hash. A hash -// is used here since we may store unrelated fields about a layer in the -// future. -func (rlic *redisLayerInfoCache) SetMeta(ctx context.Context, dgst digest.Digest, meta LayerMeta) error { - conn := rlic.pool.Get() +// SetDescriptor sets the descriptor data for the given digest using a redis +// hash. A hash is used here since we may store unrelated fields about a layer +// in the future. +func (rbds *redisBlobDescriptorService) SetDescriptor(ctx context.Context, dgst digest.Digest, desc distribution.Descriptor) error { + if err := validateDigest(dgst); err != nil { + return err + } + + if err := validateDescriptor(desc); err != nil { + return err + } + + conn := rbds.pool.Get() defer conn.Close() - _, err := conn.Do("HMSET", rlic.blobMetaHashKey(dgst), "path", meta.Path, "length", meta.Length) - return err + return rbds.setDescriptor(ctx, conn, dgst, desc) } -// repositoryBlobSetKey returns the key for the blob set in the cache. -func (rlic *redisLayerInfoCache) repositoryBlobSetKey(repo string) string { - return "repository::" + repo + "::blobs" +func (rbds *redisBlobDescriptorService) setDescriptor(ctx context.Context, conn redis.Conn, dgst digest.Digest, desc distribution.Descriptor) error { + if _, err := conn.Do("HMSET", rbds.blobDescriptorHashKey(dgst), + "digest", desc.Digest, + "length", desc.Length); err != nil { + return err + } + + // Only set mediatype if not already set. + if _, err := conn.Do("HSETNX", rbds.blobDescriptorHashKey(dgst), + "mediatype", desc.MediaType); err != nil { + return err + } + + return nil } -// blobMetaHashKey returns the cache key for immutable blob meta data. -func (rlic *redisLayerInfoCache) blobMetaHashKey(dgst digest.Digest) string { +func (rbds *redisBlobDescriptorService) blobDescriptorHashKey(dgst digest.Digest) string { return "blobs::" + dgst.String() } + +type repositoryScopedRedisBlobDescriptorService struct { + repo string + upstream *redisBlobDescriptorService +} + +var _ distribution.BlobDescriptorService = &repositoryScopedRedisBlobDescriptorService{} + +// Stat ensures that the digest is a member of the specified repository and +// forwards the descriptor request to the global blob store. If the media type +// differs for the repository, we override it. +func (rsrbds *repositoryScopedRedisBlobDescriptorService) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + if err := validateDigest(dgst); err != nil { + return distribution.Descriptor{}, err + } + + conn := rsrbds.upstream.pool.Get() + defer conn.Close() + + // Check membership to repository first + member, err := redis.Bool(conn.Do("SISMEMBER", rsrbds.repositoryBlobSetKey(rsrbds.repo), dgst)) + if err != nil { + return distribution.Descriptor{}, err + } + + if !member { + return distribution.Descriptor{}, distribution.ErrBlobUnknown + } + + upstream, err := rsrbds.upstream.stat(ctx, conn, dgst) + if err != nil { + return distribution.Descriptor{}, err + } + + // We allow a per repository mediatype, let's look it up here. + mediatype, err := redis.String(conn.Do("HGET", rsrbds.blobDescriptorHashKey(dgst), "mediatype")) + if err != nil { + return distribution.Descriptor{}, err + } + + if mediatype != "" { + upstream.MediaType = mediatype + } + + return upstream, nil +} + +func (rsrbds *repositoryScopedRedisBlobDescriptorService) SetDescriptor(ctx context.Context, dgst digest.Digest, desc distribution.Descriptor) error { + if err := validateDigest(dgst); err != nil { + return err + } + + if err := validateDescriptor(desc); err != nil { + return err + } + + if dgst != desc.Digest { + if dgst.Algorithm() == desc.Digest.Algorithm() { + return fmt.Errorf("redis cache: digest for descriptors differ but algorthim does not: %q != %q", dgst, desc.Digest) + } + } + + conn := rsrbds.upstream.pool.Get() + defer conn.Close() + + return rsrbds.setDescriptor(ctx, conn, dgst, desc) +} + +func (rsrbds *repositoryScopedRedisBlobDescriptorService) setDescriptor(ctx context.Context, conn redis.Conn, dgst digest.Digest, desc distribution.Descriptor) error { + if _, err := conn.Do("SADD", rsrbds.repositoryBlobSetKey(rsrbds.repo), dgst); err != nil { + return err + } + + if err := rsrbds.upstream.setDescriptor(ctx, conn, dgst, desc); err != nil { + return err + } + + // Override repository mediatype. + if _, err := conn.Do("HSET", rsrbds.blobDescriptorHashKey(dgst), "mediatype", desc.MediaType); err != nil { + return err + } + + // Also set the values for the primary descriptor, if they differ by + // algorithm (ie sha256 vs tarsum). + if desc.Digest != "" && dgst != desc.Digest && dgst.Algorithm() != desc.Digest.Algorithm() { + if err := rsrbds.setDescriptor(ctx, conn, desc.Digest, desc); err != nil { + return err + } + } + + return nil +} + +func (rsrbds *repositoryScopedRedisBlobDescriptorService) blobDescriptorHashKey(dgst digest.Digest) string { + return "repository::" + rsrbds.repo + "::blobs::" + dgst.String() +} + +func (rsrbds *repositoryScopedRedisBlobDescriptorService) repositoryBlobSetKey(repo string) string { + return "repository::" + rsrbds.repo + "::blobs" +} diff --git a/registry/storage/cache/redis_test.go b/registry/storage/cache/redis_test.go index 7422a7ebb..65c2fd3ae 100644 --- a/registry/storage/cache/redis_test.go +++ b/registry/storage/cache/redis_test.go @@ -17,7 +17,7 @@ func init() { // TestRedisLayerInfoCache exercises a live redis instance using the cache // implementation. -func TestRedisLayerInfoCache(t *testing.T) { +func TestRedisBlobDescriptorCacheProvider(t *testing.T) { if redisAddr == "" { // fallback to an environement variable redisAddr = os.Getenv("TEST_REGISTRY_STORAGE_CACHE_REDIS_ADDR") @@ -46,5 +46,5 @@ func TestRedisLayerInfoCache(t *testing.T) { t.Fatalf("unexpected error flushing redis db: %v", err) } - checkLayerInfoCache(t, NewRedisLayerInfoCache(pool)) + checkBlobDescriptorCache(t, NewRedisBlobDescriptorCacheProvider(pool)) } diff --git a/registry/storage/cachedblobdescriptorstore.go b/registry/storage/cachedblobdescriptorstore.go new file mode 100644 index 000000000..a0ccd067d --- /dev/null +++ b/registry/storage/cachedblobdescriptorstore.go @@ -0,0 +1,84 @@ +package storage + +import ( + "expvar" + "sync/atomic" + + "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" + + "github.com/docker/distribution" +) + +type cachedBlobStatter struct { + cache distribution.BlobDescriptorService + backend distribution.BlobStatter +} + +func (cbds *cachedBlobStatter) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + atomic.AddUint64(&blobStatterCacheMetrics.Stat.Requests, 1) + desc, err := cbds.cache.Stat(ctx, dgst) + if err != nil { + if err != distribution.ErrBlobUnknown { + context.GetLogger(ctx).Errorf("error retrieving descriptor from cache: %v", err) + } + + goto fallback + } + + atomic.AddUint64(&blobStatterCacheMetrics.Stat.Hits, 1) + return desc, nil +fallback: + atomic.AddUint64(&blobStatterCacheMetrics.Stat.Misses, 1) + desc, err = cbds.backend.Stat(ctx, dgst) + if err != nil { + return desc, err + } + + if err := cbds.cache.SetDescriptor(ctx, dgst, desc); err != nil { + context.GetLogger(ctx).Errorf("error adding descriptor %v to cache: %v", desc.Digest, err) + } + + return desc, err +} + +// blobStatterCacheMetrics keeps track of cache metrics for blob descriptor +// cache requests. Note this is kept globally and made available via expvar. +// For more detailed metrics, its recommend to instrument a particular cache +// implementation. +var blobStatterCacheMetrics struct { + // Stat tracks calls to the caches. + Stat struct { + Requests uint64 + Hits uint64 + Misses uint64 + } +} + +func init() { + registry := expvar.Get("registry") + if registry == nil { + registry = expvar.NewMap("registry") + } + + cache := registry.(*expvar.Map).Get("cache") + if cache == nil { + cache = &expvar.Map{} + cache.(*expvar.Map).Init() + registry.(*expvar.Map).Set("cache", cache) + } + + storage := cache.(*expvar.Map).Get("storage") + if storage == nil { + storage = &expvar.Map{} + storage.(*expvar.Map).Init() + cache.(*expvar.Map).Set("storage", storage) + } + + storage.(*expvar.Map).Set("blobdescriptor", expvar.Func(func() interface{} { + // no need for synchronous access: the increments are atomic and + // during reading, we don't care if the data is up to date. The + // numbers will always *eventually* be reported correctly. + return blobStatterCacheMetrics + })) +} diff --git a/registry/storage/filereader.go b/registry/storage/filereader.go index 72d58f8a2..b3a5f5203 100644 --- a/registry/storage/filereader.go +++ b/registry/storage/filereader.go @@ -7,7 +7,6 @@ import ( "io" "io/ioutil" "os" - "time" "github.com/docker/distribution/context" storagedriver "github.com/docker/distribution/registry/storage/driver" @@ -29,9 +28,8 @@ type fileReader struct { ctx context.Context // identifying fields - path string - size int64 // size is the total size, must be set. - modtime time.Time // TODO(stevvooe): This is not needed anymore. + path string + size int64 // size is the total size, must be set. // mutable fields rc io.ReadCloser // remote read closer @@ -40,41 +38,17 @@ type fileReader struct { err error // terminal error, if set, reader is closed } -// newFileReader initializes a file reader for the remote file. The read takes -// on the offset and size at the time the reader is created. If the underlying -// file changes, one must create a new fileReader. -func newFileReader(ctx context.Context, driver storagedriver.StorageDriver, path string) (*fileReader, error) { - rd := &fileReader{ +// newFileReader initializes a file reader for the remote file. The reader +// takes on the size and path that must be determined externally with a stat +// call. The reader operates optimistically, assuming that the file is already +// there. +func newFileReader(ctx context.Context, driver storagedriver.StorageDriver, path string, size int64) (*fileReader, error) { + return &fileReader{ + ctx: ctx, driver: driver, path: path, - ctx: ctx, - } - - // Grab the size of the layer file, ensuring existence. - if fi, err := driver.Stat(ctx, path); err != nil { - switch err := err.(type) { - case storagedriver.PathNotFoundError: - // NOTE(stevvooe): We really don't care if the file is not - // actually present for the reader. If the caller needs to know - // whether or not the file exists, they should issue a stat call - // on the path. There is still no guarantee, since the file may be - // gone by the time the reader is created. The only correct - // behavior is to return a reader that immediately returns EOF. - default: - // Any other error we want propagated up the stack. - return nil, err - } - } else { - if fi.IsDir() { - return nil, fmt.Errorf("cannot read a directory") - } - - // Fill in file information - rd.size = fi.Size() - rd.modtime = fi.ModTime() - } - - return rd, nil + size: size, + }, nil } func (fr *fileReader) Read(p []byte) (n int, err error) { @@ -162,11 +136,6 @@ func (fr *fileReader) reader() (io.Reader, error) { fr.rc = rc if fr.brd == nil { - // TODO(stevvooe): Set an optimal buffer size here. We'll have to - // understand the latency characteristics of the underlying network to - // set this correctly, so we may want to leave it to the driver. For - // out of process drivers, we'll have to optimize this buffer size for - // local communication. fr.brd = bufio.NewReaderSize(fr.rc, fileReaderBufferSize) } else { fr.brd.Reset(fr.rc) diff --git a/registry/storage/filereader_test.go b/registry/storage/filereader_test.go index c48bf16dd..774a864b7 100644 --- a/registry/storage/filereader_test.go +++ b/registry/storage/filereader_test.go @@ -37,7 +37,7 @@ func TestSimpleRead(t *testing.T) { t.Fatalf("error putting patterned content: %v", err) } - fr, err := newFileReader(ctx, driver, path) + fr, err := newFileReader(ctx, driver, path, int64(len(content))) if err != nil { t.Fatalf("error allocating file reader: %v", err) } @@ -66,7 +66,7 @@ func TestFileReaderSeek(t *testing.T) { t.Fatalf("error putting patterned content: %v", err) } - fr, err := newFileReader(ctx, driver, path) + fr, err := newFileReader(ctx, driver, path, int64(len(content))) if err != nil { t.Fatalf("unexpected error creating file reader: %v", err) @@ -162,7 +162,7 @@ func TestFileReaderSeek(t *testing.T) { // read method, with an io.EOF error. func TestFileReaderNonExistentFile(t *testing.T) { driver := inmemory.New() - fr, err := newFileReader(context.Background(), driver, "/doesnotexist") + fr, err := newFileReader(context.Background(), driver, "/doesnotexist", 10) if err != nil { t.Fatalf("unexpected error initializing reader: %v", err) } diff --git a/registry/storage/filewriter.go b/registry/storage/filewriter.go index 95930f1d7..529fa6736 100644 --- a/registry/storage/filewriter.go +++ b/registry/storage/filewriter.go @@ -39,7 +39,6 @@ type bufferedFileWriter struct { // filewriter should implement. type fileWriterInterface interface { io.WriteSeeker - io.WriterAt io.ReaderFrom io.Closer } @@ -110,21 +109,31 @@ func (bfw *bufferedFileWriter) Flush() error { // Write writes the buffer p at the current write offset. func (fw *fileWriter) Write(p []byte) (n int, err error) { - nn, err := fw.readFromAt(bytes.NewReader(p), -1) - return int(nn), err -} - -// WriteAt writes p at the specified offset. The underlying offset does not -// change. -func (fw *fileWriter) WriteAt(p []byte, offset int64) (n int, err error) { - nn, err := fw.readFromAt(bytes.NewReader(p), offset) + nn, err := fw.ReadFrom(bytes.NewReader(p)) return int(nn), err } // ReadFrom reads reader r until io.EOF writing the contents at the current // offset. func (fw *fileWriter) ReadFrom(r io.Reader) (n int64, err error) { - return fw.readFromAt(r, -1) + if fw.err != nil { + return 0, fw.err + } + + nn, err := fw.driver.WriteStream(fw.ctx, fw.path, fw.offset, r) + + // We should forward the offset, whether or not there was an error. + // Basically, we keep the filewriter in sync with the reader's head. If an + // error is encountered, the whole thing should be retried but we proceed + // from an expected offset, even if the data didn't make it to the + // backend. + fw.offset += nn + + if fw.offset > fw.size { + fw.size = fw.offset + } + + return nn, err } // Seek moves the write position do the requested offest based on the whence @@ -169,34 +178,3 @@ func (fw *fileWriter) Close() error { return nil } - -// readFromAt writes to fw from r at the specified offset. If offset is less -// than zero, the value of fw.offset is used and updated after the operation. -func (fw *fileWriter) readFromAt(r io.Reader, offset int64) (n int64, err error) { - if fw.err != nil { - return 0, fw.err - } - - var updateOffset bool - if offset < 0 { - offset = fw.offset - updateOffset = true - } - - nn, err := fw.driver.WriteStream(fw.ctx, fw.path, offset, r) - - if updateOffset { - // We should forward the offset, whether or not there was an error. - // Basically, we keep the filewriter in sync with the reader's head. If an - // error is encountered, the whole thing should be retried but we proceed - // from an expected offset, even if the data didn't make it to the - // backend. - fw.offset += nn - - if fw.offset > fw.size { - fw.size = fw.offset - } - } - - return nn, err -} diff --git a/registry/storage/filewriter_test.go b/registry/storage/filewriter_test.go index 720e93850..858b03272 100644 --- a/registry/storage/filewriter_test.go +++ b/registry/storage/filewriter_test.go @@ -51,7 +51,7 @@ func TestSimpleWrite(t *testing.T) { t.Fatalf("unexpected write length: %d != %d", n, len(content)) } - fr, err := newFileReader(ctx, driver, path) + fr, err := newFileReader(ctx, driver, path, int64(len(content))) if err != nil { t.Fatalf("unexpected error creating fileReader: %v", err) } @@ -78,23 +78,23 @@ func TestSimpleWrite(t *testing.T) { t.Fatalf("write did not advance offset: %d != %d", end, len(content)) } - // Double the content, but use the WriteAt method + // Double the content doubled := append(content, content...) doubledgst, err := digest.FromReader(bytes.NewReader(doubled)) if err != nil { t.Fatalf("unexpected error digesting doubled content: %v", err) } - n, err = fw.WriteAt(content, end) + nn, err := fw.ReadFrom(bytes.NewReader(content)) if err != nil { - t.Fatalf("unexpected error writing content at %d: %v", end, err) + t.Fatalf("unexpected error doubling content: %v", err) } - if n != len(content) { + if nn != int64(len(content)) { t.Fatalf("writeat was short: %d != %d", n, len(content)) } - fr, err = newFileReader(ctx, driver, path) + fr, err = newFileReader(ctx, driver, path, int64(len(doubled))) if err != nil { t.Fatalf("unexpected error creating fileReader: %v", err) } @@ -111,20 +111,20 @@ func TestSimpleWrite(t *testing.T) { t.Fatalf("unable to verify write data") } - // Check that WriteAt didn't update the offset. + // Check that Write updated the offset. end, err = fw.Seek(0, os.SEEK_END) if err != nil { t.Fatalf("unexpected error seeking: %v", err) } - if end != int64(len(content)) { - t.Fatalf("write did not advance offset: %d != %d", end, len(content)) + if end != int64(len(doubled)) { + t.Fatalf("write did not advance offset: %d != %d", end, len(doubled)) } // Now, we copy from one path to another, running the data through the // fileReader to fileWriter, rather than the driver.Move command to ensure // everything is working correctly. - fr, err = newFileReader(ctx, driver, path) + fr, err = newFileReader(ctx, driver, path, int64(len(doubled))) if err != nil { t.Fatalf("unexpected error creating fileReader: %v", err) } @@ -136,7 +136,7 @@ func TestSimpleWrite(t *testing.T) { } defer fw.Close() - nn, err := io.Copy(fw, fr) + nn, err = io.Copy(fw, fr) if err != nil { t.Fatalf("unexpected error copying data: %v", err) } @@ -145,7 +145,7 @@ func TestSimpleWrite(t *testing.T) { t.Fatalf("unexpected copy length: %d != %d", nn, len(doubled)) } - fr, err = newFileReader(ctx, driver, "/copied") + fr, err = newFileReader(ctx, driver, "/copied", int64(len(doubled))) if err != nil { t.Fatalf("unexpected error creating fileReader: %v", err) } diff --git a/registry/storage/layercache.go b/registry/storage/layercache.go deleted file mode 100644 index b9732f203..000000000 --- a/registry/storage/layercache.go +++ /dev/null @@ -1,202 +0,0 @@ -package storage - -import ( - "expvar" - "sync/atomic" - "time" - - "github.com/docker/distribution" - ctxu "github.com/docker/distribution/context" - "github.com/docker/distribution/digest" - "github.com/docker/distribution/registry/storage/cache" - "github.com/docker/distribution/registry/storage/driver" - "golang.org/x/net/context" -) - -// cachedLayerService implements the layer service with path-aware caching, -// using a LayerInfoCache interface. -type cachedLayerService struct { - distribution.LayerService // upstream layer service - repository distribution.Repository - ctx context.Context - driver driver.StorageDriver - *blobStore // global blob store - cache cache.LayerInfoCache -} - -// Exists checks for existence of the digest in the cache, immediately -// returning if it exists for the repository. If not, the upstream is checked. -// When a positive result is found, it is written into the cache. -func (lc *cachedLayerService) Exists(dgst digest.Digest) (bool, error) { - ctxu.GetLogger(lc.ctx).Debugf("(*cachedLayerService).Exists(%q)", dgst) - now := time.Now() - defer func() { - // TODO(stevvooe): Replace this with a decent context-based metrics solution - ctxu.GetLoggerWithField(lc.ctx, "blob.exists.duration", time.Since(now)). - Infof("(*cachedLayerService).Exists(%q)", dgst) - }() - - atomic.AddUint64(&layerInfoCacheMetrics.Exists.Requests, 1) - available, err := lc.cache.Contains(lc.ctx, lc.repository.Name(), dgst) - if err != nil { - ctxu.GetLogger(lc.ctx).Errorf("error checking availability of %v@%v: %v", lc.repository.Name(), dgst, err) - goto fallback - } - - if available { - atomic.AddUint64(&layerInfoCacheMetrics.Exists.Hits, 1) - return true, nil - } - -fallback: - atomic.AddUint64(&layerInfoCacheMetrics.Exists.Misses, 1) - exists, err := lc.LayerService.Exists(dgst) - if err != nil { - return exists, err - } - - if exists { - // we can only cache this if the existence is positive. - if err := lc.cache.Add(lc.ctx, lc.repository.Name(), dgst); err != nil { - ctxu.GetLogger(lc.ctx).Errorf("error adding %v@%v to cache: %v", lc.repository.Name(), dgst, err) - } - } - - return exists, err -} - -// Fetch checks for the availability of the layer in the repository via the -// cache. If present, the metadata is resolved and the layer is returned. If -// any operation fails, the layer is read directly from the upstream. The -// results are cached, if possible. -func (lc *cachedLayerService) Fetch(dgst digest.Digest) (distribution.Layer, error) { - ctxu.GetLogger(lc.ctx).Debugf("(*layerInfoCache).Fetch(%q)", dgst) - now := time.Now() - defer func() { - ctxu.GetLoggerWithField(lc.ctx, "blob.fetch.duration", time.Since(now)). - Infof("(*layerInfoCache).Fetch(%q)", dgst) - }() - - atomic.AddUint64(&layerInfoCacheMetrics.Fetch.Requests, 1) - available, err := lc.cache.Contains(lc.ctx, lc.repository.Name(), dgst) - if err != nil { - ctxu.GetLogger(lc.ctx).Errorf("error checking availability of %v@%v: %v", lc.repository.Name(), dgst, err) - goto fallback - } - - if available { - // fast path: get the layer info and return - meta, err := lc.cache.Meta(lc.ctx, dgst) - if err != nil { - ctxu.GetLogger(lc.ctx).Errorf("error fetching %v@%v from cache: %v", lc.repository.Name(), dgst, err) - goto fallback - } - - atomic.AddUint64(&layerInfoCacheMetrics.Fetch.Hits, 1) - return newLayerReader(lc.driver, dgst, meta.Path, meta.Length) - } - - // NOTE(stevvooe): Unfortunately, the cache here only makes checks for - // existing layers faster. We'd have to provide more careful - // synchronization with the backend to make the missing case as fast. - -fallback: - atomic.AddUint64(&layerInfoCacheMetrics.Fetch.Misses, 1) - layer, err := lc.LayerService.Fetch(dgst) - if err != nil { - return nil, err - } - - // add the layer to the repository - if err := lc.cache.Add(lc.ctx, lc.repository.Name(), dgst); err != nil { - ctxu.GetLogger(lc.ctx). - Errorf("error caching repository relationship for %v@%v: %v", lc.repository.Name(), dgst, err) - } - - // lookup layer path and add it to the cache, if it succeds. Note that we - // still return the layer even if we have trouble caching it. - if path, err := lc.resolveLayerPath(layer); err != nil { - ctxu.GetLogger(lc.ctx). - Errorf("error resolving path while caching %v@%v: %v", lc.repository.Name(), dgst, err) - } else { - // add the layer to the cache once we've resolved the path. - if err := lc.cache.SetMeta(lc.ctx, dgst, cache.LayerMeta{Path: path, Length: layer.Length()}); err != nil { - ctxu.GetLogger(lc.ctx).Errorf("error adding meta for %v@%v to cache: %v", lc.repository.Name(), dgst, err) - } - } - - return layer, err -} - -// extractLayerInfo pulls the layerInfo from the layer, attempting to get the -// path information from either the concrete object or by resolving the -// primary blob store path. -func (lc *cachedLayerService) resolveLayerPath(layer distribution.Layer) (path string, err error) { - // try and resolve the type and driver, so we don't have to traverse links - switch v := layer.(type) { - case *layerReader: - // only set path if we have same driver instance. - if v.driver == lc.driver { - return v.path, nil - } - } - - ctxu.GetLogger(lc.ctx).Warnf("resolving layer path during cache lookup (%v@%v)", lc.repository.Name(), layer.Digest()) - // we have to do an expensive stat to resolve the layer location but no - // need to check the link, since we already have layer instance for this - // repository. - bp, err := lc.blobStore.path(layer.Digest()) - if err != nil { - return "", err - } - - return bp, nil -} - -// layerInfoCacheMetrics keeps track of cache metrics for layer info cache -// requests. Note this is kept globally and made available via expvar. For -// more detailed metrics, its recommend to instrument a particular cache -// implementation. -var layerInfoCacheMetrics struct { - // Exists tracks calls to the Exists caches. - Exists struct { - Requests uint64 - Hits uint64 - Misses uint64 - } - - // Fetch tracks calls to the fetch caches. - Fetch struct { - Requests uint64 - Hits uint64 - Misses uint64 - } -} - -func init() { - registry := expvar.Get("registry") - if registry == nil { - registry = expvar.NewMap("registry") - } - - cache := registry.(*expvar.Map).Get("cache") - if cache == nil { - cache = &expvar.Map{} - cache.(*expvar.Map).Init() - registry.(*expvar.Map).Set("cache", cache) - } - - storage := cache.(*expvar.Map).Get("storage") - if storage == nil { - storage = &expvar.Map{} - storage.(*expvar.Map).Init() - cache.(*expvar.Map).Set("storage", storage) - } - - storage.(*expvar.Map).Set("layerinfo", expvar.Func(func() interface{} { - // no need for synchronous access: the increments are atomic and - // during reading, we don't care if the data is up to date. The - // numbers will always *eventually* be reported correctly. - return layerInfoCacheMetrics - })) -} diff --git a/registry/storage/layerreader.go b/registry/storage/layerreader.go deleted file mode 100644 index 044dab09e..000000000 --- a/registry/storage/layerreader.go +++ /dev/null @@ -1,104 +0,0 @@ -package storage - -import ( - "fmt" - "net/http" - "time" - - "github.com/docker/distribution" - "github.com/docker/distribution/digest" - "github.com/docker/distribution/registry/storage/driver" -) - -// layerReader implements Layer and provides facilities for reading and -// seeking. -type layerReader struct { - fileReader - - digest digest.Digest -} - -// newLayerReader returns a new layerReader with the digest, path and length, -// eliding round trips to the storage backend. -func newLayerReader(driver driver.StorageDriver, dgst digest.Digest, path string, length int64) (*layerReader, error) { - fr := &fileReader{ - driver: driver, - path: path, - size: length, - } - - return &layerReader{ - fileReader: *fr, - digest: dgst, - }, nil -} - -var _ distribution.Layer = &layerReader{} - -func (lr *layerReader) Digest() digest.Digest { - return lr.digest -} - -func (lr *layerReader) Length() int64 { - return lr.size -} - -func (lr *layerReader) CreatedAt() time.Time { - return lr.modtime -} - -// Close the layer. Should be called when the resource is no longer needed. -func (lr *layerReader) Close() error { - return lr.closeWithErr(distribution.ErrLayerClosed) -} - -func (lr *layerReader) Handler(r *http.Request) (h http.Handler, err error) { - var handlerFunc http.HandlerFunc - - redirectURL, err := lr.fileReader.driver.URLFor(lr.ctx, lr.path, map[string]interface{}{"method": r.Method}) - - switch err { - case nil: - handlerFunc = func(w http.ResponseWriter, r *http.Request) { - // Redirect to storage URL. - http.Redirect(w, r, redirectURL, http.StatusTemporaryRedirect) - } - case driver.ErrUnsupportedMethod: - handlerFunc = func(w http.ResponseWriter, r *http.Request) { - // Fallback to serving the content directly. - http.ServeContent(w, r, lr.digest.String(), lr.CreatedAt(), lr) - } - default: - // Some unexpected error. - return nil, err - } - - return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - // If the registry is serving this content itself, check - // the If-None-Match header and return 304 on match. Redirected - // storage implementations do the same. - - if etagMatch(r, lr.digest.String()) { - w.WriteHeader(http.StatusNotModified) - return - } - setCacheHeaders(w, 86400, lr.digest.String()) - w.Header().Set("Docker-Content-Digest", lr.digest.String()) - handlerFunc.ServeHTTP(w, r) - }), nil -} - -func etagMatch(r *http.Request, etag string) bool { - for _, headerVal := range r.Header["If-None-Match"] { - if headerVal == etag { - return true - } - } - return false -} - -func setCacheHeaders(w http.ResponseWriter, cacheAge int, etag string) { - w.Header().Set("ETag", etag) - w.Header().Set("Cache-Control", fmt.Sprintf("max-age=%d", cacheAge)) - -} diff --git a/registry/storage/layerstore.go b/registry/storage/layerstore.go deleted file mode 100644 index 8da14ac74..000000000 --- a/registry/storage/layerstore.go +++ /dev/null @@ -1,178 +0,0 @@ -package storage - -import ( - "time" - - "code.google.com/p/go-uuid/uuid" - "github.com/docker/distribution" - "github.com/docker/distribution/context" - "github.com/docker/distribution/digest" - "github.com/docker/distribution/manifest" - storagedriver "github.com/docker/distribution/registry/storage/driver" -) - -type layerStore struct { - repository *repository -} - -func (ls *layerStore) Exists(digest digest.Digest) (bool, error) { - context.GetLogger(ls.repository.ctx).Debug("(*layerStore).Exists") - - // Because this implementation just follows blob links, an existence check - // is pretty cheap by starting and closing a fetch. - _, err := ls.Fetch(digest) - - if err != nil { - switch err.(type) { - case distribution.ErrUnknownLayer: - return false, nil - } - - return false, err - } - - return true, nil -} - -func (ls *layerStore) Fetch(dgst digest.Digest) (distribution.Layer, error) { - ctx := ls.repository.ctx - context.GetLogger(ctx).Debug("(*layerStore).Fetch") - bp, err := ls.path(dgst) - if err != nil { - return nil, err - } - - fr, err := newFileReader(ctx, ls.repository.driver, bp) - if err != nil { - return nil, err - } - - return &layerReader{ - fileReader: *fr, - digest: dgst, - }, nil -} - -// Upload begins a layer upload, returning a handle. If the layer upload -// is already in progress or the layer has already been uploaded, this -// will return an error. -func (ls *layerStore) Upload() (distribution.LayerUpload, error) { - ctx := ls.repository.ctx - context.GetLogger(ctx).Debug("(*layerStore).Upload") - - // NOTE(stevvooe): Consider the issues with allowing concurrent upload of - // the same two layers. Should it be disallowed? For now, we allow both - // parties to proceed and the the first one uploads the layer. - - uuid := uuid.New() - startedAt := time.Now().UTC() - - path, err := ls.repository.pm.path(uploadDataPathSpec{ - name: ls.repository.Name(), - uuid: uuid, - }) - - if err != nil { - return nil, err - } - - startedAtPath, err := ls.repository.pm.path(uploadStartedAtPathSpec{ - name: ls.repository.Name(), - uuid: uuid, - }) - - if err != nil { - return nil, err - } - - // Write a startedat file for this upload - if err := ls.repository.driver.PutContent(ctx, startedAtPath, []byte(startedAt.Format(time.RFC3339))); err != nil { - return nil, err - } - - return ls.newLayerUpload(uuid, path, startedAt) -} - -// Resume continues an in progress layer upload, returning the current -// state of the upload. -func (ls *layerStore) Resume(uuid string) (distribution.LayerUpload, error) { - ctx := ls.repository.ctx - context.GetLogger(ctx).Debug("(*layerStore).Resume") - - startedAtPath, err := ls.repository.pm.path(uploadStartedAtPathSpec{ - name: ls.repository.Name(), - uuid: uuid, - }) - - if err != nil { - return nil, err - } - - startedAtBytes, err := ls.repository.driver.GetContent(ctx, startedAtPath) - if err != nil { - switch err := err.(type) { - case storagedriver.PathNotFoundError: - return nil, distribution.ErrLayerUploadUnknown - default: - return nil, err - } - } - - startedAt, err := time.Parse(time.RFC3339, string(startedAtBytes)) - if err != nil { - return nil, err - } - - path, err := ls.repository.pm.path(uploadDataPathSpec{ - name: ls.repository.Name(), - uuid: uuid, - }) - - if err != nil { - return nil, err - } - - return ls.newLayerUpload(uuid, path, startedAt) -} - -// newLayerUpload allocates a new upload controller with the given state. -func (ls *layerStore) newLayerUpload(uuid, path string, startedAt time.Time) (distribution.LayerUpload, error) { - fw, err := newFileWriter(ls.repository.ctx, ls.repository.driver, path) - if err != nil { - return nil, err - } - - lw := &layerWriter{ - layerStore: ls, - uuid: uuid, - startedAt: startedAt, - bufferedFileWriter: *fw, - } - - lw.setupResumableDigester() - - return lw, nil -} - -func (ls *layerStore) path(dgst digest.Digest) (string, error) { - // We must traverse this path through the link to enforce ownership. - layerLinkPath, err := ls.repository.pm.path(layerLinkPathSpec{name: ls.repository.Name(), digest: dgst}) - if err != nil { - return "", err - } - - blobPath, err := ls.repository.blobStore.resolve(layerLinkPath) - - if err != nil { - switch err := err.(type) { - case storagedriver.PathNotFoundError: - return "", distribution.ErrUnknownLayer{ - FSLayer: manifest.FSLayer{BlobSum: dgst}, - } - default: - return "", err - } - } - - return blobPath, nil -} diff --git a/registry/storage/layerwriter.go b/registry/storage/layerwriter.go deleted file mode 100644 index a2672fe69..000000000 --- a/registry/storage/layerwriter.go +++ /dev/null @@ -1,478 +0,0 @@ -package storage - -import ( - "fmt" - "io" - "os" - "path" - "strconv" - "time" - - "github.com/Sirupsen/logrus" - "github.com/docker/distribution" - "github.com/docker/distribution/context" - "github.com/docker/distribution/digest" - storagedriver "github.com/docker/distribution/registry/storage/driver" -) - -var _ distribution.LayerUpload = &layerWriter{} - -// layerWriter is used to control the various aspects of resumable -// layer upload. It implements the LayerUpload interface. -type layerWriter struct { - layerStore *layerStore - - uuid string - startedAt time.Time - resumableDigester digest.ResumableDigester - - // implementes io.WriteSeeker, io.ReaderFrom and io.Closer to satisfy - // LayerUpload Interface - bufferedFileWriter -} - -var _ distribution.LayerUpload = &layerWriter{} - -// UUID returns the identifier for this upload. -func (lw *layerWriter) UUID() string { - return lw.uuid -} - -func (lw *layerWriter) StartedAt() time.Time { - return lw.startedAt -} - -// Finish marks the upload as completed, returning a valid handle to the -// uploaded layer. The final size and checksum are validated against the -// contents of the uploaded layer. The checksum should be provided in the -// format :. -func (lw *layerWriter) Finish(dgst digest.Digest) (distribution.Layer, error) { - context.GetLogger(lw.layerStore.repository.ctx).Debug("(*layerWriter).Finish") - - if err := lw.bufferedFileWriter.Close(); err != nil { - return nil, err - } - - var ( - canonical digest.Digest - err error - ) - - // HACK(stevvooe): To deal with s3's lack of consistency, attempt to retry - // validation on failure. Three attempts are made, backing off - // retries*100ms each time. - for retries := 0; ; retries++ { - canonical, err = lw.validateLayer(dgst) - if err == nil { - break - } - - context.GetLoggerWithField(lw.layerStore.repository.ctx, "retries", retries). - Errorf("error validating layer: %v", err) - - if retries < 3 { - time.Sleep(100 * time.Millisecond * time.Duration(retries+1)) - continue - } - - return nil, err - - } - - if err := lw.moveLayer(canonical); err != nil { - // TODO(stevvooe): Cleanup? - return nil, err - } - - // Link the layer blob into the repository. - if err := lw.linkLayer(canonical, dgst); err != nil { - return nil, err - } - - if err := lw.removeResources(); err != nil { - return nil, err - } - - return lw.layerStore.Fetch(canonical) -} - -// Cancel the layer upload process. -func (lw *layerWriter) Cancel() error { - context.GetLogger(lw.layerStore.repository.ctx).Debug("(*layerWriter).Cancel") - if err := lw.removeResources(); err != nil { - return err - } - - lw.Close() - return nil -} - -func (lw *layerWriter) Write(p []byte) (int, error) { - if lw.resumableDigester == nil { - return lw.bufferedFileWriter.Write(p) - } - - // Ensure that the current write offset matches how many bytes have been - // written to the digester. If not, we need to update the digest state to - // match the current write position. - if err := lw.resumeHashAt(lw.offset); err != nil { - return 0, err - } - - return io.MultiWriter(&lw.bufferedFileWriter, lw.resumableDigester).Write(p) -} - -func (lw *layerWriter) ReadFrom(r io.Reader) (n int64, err error) { - if lw.resumableDigester == nil { - return lw.bufferedFileWriter.ReadFrom(r) - } - - // Ensure that the current write offset matches how many bytes have been - // written to the digester. If not, we need to update the digest state to - // match the current write position. - if err := lw.resumeHashAt(lw.offset); err != nil { - return 0, err - } - - return lw.bufferedFileWriter.ReadFrom(io.TeeReader(r, lw.resumableDigester)) -} - -func (lw *layerWriter) Close() error { - if lw.err != nil { - return lw.err - } - - if lw.resumableDigester != nil { - if err := lw.storeHashState(); err != nil { - return err - } - } - - return lw.bufferedFileWriter.Close() -} - -type hashStateEntry struct { - offset int64 - path string -} - -// getStoredHashStates returns a slice of hashStateEntries for this upload. -func (lw *layerWriter) getStoredHashStates() ([]hashStateEntry, error) { - uploadHashStatePathPrefix, err := lw.layerStore.repository.pm.path(uploadHashStatePathSpec{ - name: lw.layerStore.repository.Name(), - uuid: lw.uuid, - alg: lw.resumableDigester.Digest().Algorithm(), - list: true, - }) - if err != nil { - return nil, err - } - - paths, err := lw.driver.List(lw.layerStore.repository.ctx, uploadHashStatePathPrefix) - if err != nil { - if _, ok := err.(storagedriver.PathNotFoundError); !ok { - return nil, err - } - // Treat PathNotFoundError as no entries. - paths = nil - } - - hashStateEntries := make([]hashStateEntry, 0, len(paths)) - - for _, p := range paths { - pathSuffix := path.Base(p) - // The suffix should be the offset. - offset, err := strconv.ParseInt(pathSuffix, 0, 64) - if err != nil { - logrus.Errorf("unable to parse offset from upload state path %q: %s", p, err) - } - - hashStateEntries = append(hashStateEntries, hashStateEntry{offset: offset, path: p}) - } - - return hashStateEntries, nil -} - -// resumeHashAt attempts to restore the state of the internal hash function -// by loading the most recent saved hash state less than or equal to the given -// offset. Any unhashed bytes remaining less than the given offset are hashed -// from the content uploaded so far. -func (lw *layerWriter) resumeHashAt(offset int64) error { - if offset < 0 { - return fmt.Errorf("cannot resume hash at negative offset: %d", offset) - } - - if offset == int64(lw.resumableDigester.Len()) { - // State of digester is already at the requested offset. - return nil - } - - // List hash states from storage backend. - var hashStateMatch hashStateEntry - hashStates, err := lw.getStoredHashStates() - if err != nil { - return fmt.Errorf("unable to get stored hash states with offset %d: %s", offset, err) - } - - ctx := lw.layerStore.repository.ctx - // Find the highest stored hashState with offset less than or equal to - // the requested offset. - for _, hashState := range hashStates { - if hashState.offset == offset { - hashStateMatch = hashState - break // Found an exact offset match. - } else if hashState.offset < offset && hashState.offset > hashStateMatch.offset { - // This offset is closer to the requested offset. - hashStateMatch = hashState - } else if hashState.offset > offset { - // Remove any stored hash state with offsets higher than this one - // as writes to this resumed hasher will make those invalid. This - // is probably okay to skip for now since we don't expect anyone to - // use the API in this way. For that reason, we don't treat an - // an error here as a fatal error, but only log it. - if err := lw.driver.Delete(ctx, hashState.path); err != nil { - logrus.Errorf("unable to delete stale hash state %q: %s", hashState.path, err) - } - } - } - - if hashStateMatch.offset == 0 { - // No need to load any state, just reset the hasher. - lw.resumableDigester.Reset() - } else { - storedState, err := lw.driver.GetContent(ctx, hashStateMatch.path) - if err != nil { - return err - } - - if err = lw.resumableDigester.Restore(storedState); err != nil { - return err - } - } - - // Mind the gap. - if gapLen := offset - int64(lw.resumableDigester.Len()); gapLen > 0 { - // Need to read content from the upload to catch up to the desired offset. - fr, err := newFileReader(ctx, lw.driver, lw.path) - if err != nil { - return err - } - - if _, err = fr.Seek(int64(lw.resumableDigester.Len()), os.SEEK_SET); err != nil { - return fmt.Errorf("unable to seek to layer reader offset %d: %s", lw.resumableDigester.Len(), err) - } - - if _, err := io.CopyN(lw.resumableDigester, fr, gapLen); err != nil { - return err - } - } - - return nil -} - -func (lw *layerWriter) storeHashState() error { - uploadHashStatePath, err := lw.layerStore.repository.pm.path(uploadHashStatePathSpec{ - name: lw.layerStore.repository.Name(), - uuid: lw.uuid, - alg: lw.resumableDigester.Digest().Algorithm(), - offset: int64(lw.resumableDigester.Len()), - }) - if err != nil { - return err - } - - hashState, err := lw.resumableDigester.State() - if err != nil { - return err - } - - return lw.driver.PutContent(lw.layerStore.repository.ctx, uploadHashStatePath, hashState) -} - -// validateLayer checks the layer data against the digest, returning an error -// if it does not match. The canonical digest is returned. -func (lw *layerWriter) validateLayer(dgst digest.Digest) (digest.Digest, error) { - var ( - verified, fullHash bool - canonical digest.Digest - ) - - if lw.resumableDigester != nil { - // Restore the hasher state to the end of the upload. - if err := lw.resumeHashAt(lw.size); err != nil { - return "", err - } - - canonical = lw.resumableDigester.Digest() - - if canonical.Algorithm() == dgst.Algorithm() { - // Common case: client and server prefer the same canonical digest - // algorithm - currently SHA256. - verified = dgst == canonical - } else { - // The client wants to use a different digest algorithm. They'll just - // have to be patient and wait for us to download and re-hash the - // uploaded content using that digest algorithm. - fullHash = true - } - } else { - // Not using resumable digests, so we need to hash the entire layer. - fullHash = true - } - - if fullHash { - digester := digest.NewCanonicalDigester() - - digestVerifier, err := digest.NewDigestVerifier(dgst) - if err != nil { - return "", err - } - - // Read the file from the backend driver and validate it. - fr, err := newFileReader(lw.layerStore.repository.ctx, lw.bufferedFileWriter.driver, lw.path) - if err != nil { - return "", err - } - - tr := io.TeeReader(fr, digester) - - if _, err = io.Copy(digestVerifier, tr); err != nil { - return "", err - } - - canonical = digester.Digest() - verified = digestVerifier.Verified() - } - - if !verified { - context.GetLoggerWithField(lw.layerStore.repository.ctx, "canonical", dgst). - Errorf("canonical digest does match provided digest") - return "", distribution.ErrLayerInvalidDigest{ - Digest: dgst, - Reason: fmt.Errorf("content does not match digest"), - } - } - - return canonical, nil -} - -// moveLayer moves the data into its final, hash-qualified destination, -// identified by dgst. The layer should be validated before commencing the -// move. -func (lw *layerWriter) moveLayer(dgst digest.Digest) error { - blobPath, err := lw.layerStore.repository.pm.path(blobDataPathSpec{ - digest: dgst, - }) - - if err != nil { - return err - } - - ctx := lw.layerStore.repository.ctx - // Check for existence - if _, err := lw.driver.Stat(ctx, blobPath); err != nil { - switch err := err.(type) { - case storagedriver.PathNotFoundError: - break // ensure that it doesn't exist. - default: - return err - } - } else { - // If the path exists, we can assume that the content has already - // been uploaded, since the blob storage is content-addressable. - // While it may be corrupted, detection of such corruption belongs - // elsewhere. - return nil - } - - // If no data was received, we may not actually have a file on disk. Check - // the size here and write a zero-length file to blobPath if this is the - // case. For the most part, this should only ever happen with zero-length - // tars. - if _, err := lw.driver.Stat(ctx, lw.path); err != nil { - switch err := err.(type) { - case storagedriver.PathNotFoundError: - // HACK(stevvooe): This is slightly dangerous: if we verify above, - // get a hash, then the underlying file is deleted, we risk moving - // a zero-length blob into a nonzero-length blob location. To - // prevent this horrid thing, we employ the hack of only allowing - // to this happen for the zero tarsum. - if dgst == digest.DigestSha256EmptyTar { - return lw.driver.PutContent(ctx, blobPath, []byte{}) - } - - // We let this fail during the move below. - logrus. - WithField("upload.uuid", lw.UUID()). - WithField("digest", dgst).Warnf("attempted to move zero-length content with non-zero digest") - default: - return err // unrelated error - } - } - - return lw.driver.Move(ctx, lw.path, blobPath) -} - -// linkLayer links a valid, written layer blob into the registry under the -// named repository for the upload controller. -func (lw *layerWriter) linkLayer(canonical digest.Digest, aliases ...digest.Digest) error { - dgsts := append([]digest.Digest{canonical}, aliases...) - - // Don't make duplicate links. - seenDigests := make(map[digest.Digest]struct{}, len(dgsts)) - - for _, dgst := range dgsts { - if _, seen := seenDigests[dgst]; seen { - continue - } - seenDigests[dgst] = struct{}{} - - layerLinkPath, err := lw.layerStore.repository.pm.path(layerLinkPathSpec{ - name: lw.layerStore.repository.Name(), - digest: dgst, - }) - - if err != nil { - return err - } - - ctx := lw.layerStore.repository.ctx - if err := lw.layerStore.repository.driver.PutContent(ctx, layerLinkPath, []byte(canonical)); err != nil { - return err - } - } - - return nil -} - -// removeResources should clean up all resources associated with the upload -// instance. An error will be returned if the clean up cannot proceed. If the -// resources are already not present, no error will be returned. -func (lw *layerWriter) removeResources() error { - dataPath, err := lw.layerStore.repository.pm.path(uploadDataPathSpec{ - name: lw.layerStore.repository.Name(), - uuid: lw.uuid, - }) - - if err != nil { - return err - } - - // Resolve and delete the containing directory, which should include any - // upload related files. - dirPath := path.Dir(dataPath) - if err := lw.driver.Delete(lw.layerStore.repository.ctx, dirPath); err != nil { - switch err := err.(type) { - case storagedriver.PathNotFoundError: - break // already gone! - default: - // This should be uncommon enough such that returning an error - // should be okay. At this point, the upload should be mostly - // complete, but perhaps the backend became unaccessible. - logrus.Errorf("unable to delete layer upload resources %q: %v", dirPath, err) - return err - } - } - - return nil -} diff --git a/registry/storage/layerwriter_nonresumable.go b/registry/storage/layerwriter_nonresumable.go deleted file mode 100644 index d4350c6b8..000000000 --- a/registry/storage/layerwriter_nonresumable.go +++ /dev/null @@ -1,6 +0,0 @@ -// +build noresumabledigest - -package storage - -func (lw *layerWriter) setupResumableDigester() { -} diff --git a/registry/storage/layerwriter_resumable.go b/registry/storage/layerwriter_resumable.go deleted file mode 100644 index 7d8c63354..000000000 --- a/registry/storage/layerwriter_resumable.go +++ /dev/null @@ -1,9 +0,0 @@ -// +build !noresumabledigest - -package storage - -import "github.com/docker/distribution/digest" - -func (lw *layerWriter) setupResumableDigester() { - lw.resumableDigester = digest.NewCanonicalResumableDigester() -} diff --git a/registry/storage/linkedblobstore.go b/registry/storage/linkedblobstore.go new file mode 100644 index 000000000..91dd0616a --- /dev/null +++ b/registry/storage/linkedblobstore.go @@ -0,0 +1,258 @@ +package storage + +import ( + "net/http" + "time" + + "code.google.com/p/go-uuid/uuid" + "github.com/docker/distribution" + "github.com/docker/distribution/context" + "github.com/docker/distribution/digest" + "github.com/docker/distribution/registry/storage/driver" +) + +// linkedBlobStore provides a full BlobService that namespaces the blobs to a +// given repository. Effectively, it manages the links in a given repository +// that grant access to the global blob store. +type linkedBlobStore struct { + *blobStore + blobServer distribution.BlobServer + statter distribution.BlobStatter + repository distribution.Repository + ctx context.Context // only to be used where context can't come through method args + + // linkPath allows one to control the repository blob link set to which + // the blob store dispatches. This is required because manifest and layer + // blobs have not yet been fully merged. At some point, this functionality + // should be removed an the blob links folder should be merged. + linkPath func(pm *pathMapper, name string, dgst digest.Digest) (string, error) +} + +var _ distribution.BlobStore = &linkedBlobStore{} + +func (lbs *linkedBlobStore) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + return lbs.statter.Stat(ctx, dgst) +} + +func (lbs *linkedBlobStore) Get(ctx context.Context, dgst digest.Digest) ([]byte, error) { + canonical, err := lbs.Stat(ctx, dgst) // access check + if err != nil { + return nil, err + } + + return lbs.blobStore.Get(ctx, canonical.Digest) +} + +func (lbs *linkedBlobStore) Open(ctx context.Context, dgst digest.Digest) (distribution.ReadSeekCloser, error) { + canonical, err := lbs.Stat(ctx, dgst) // access check + if err != nil { + return nil, err + } + + return lbs.blobStore.Open(ctx, canonical.Digest) +} + +func (lbs *linkedBlobStore) ServeBlob(ctx context.Context, w http.ResponseWriter, r *http.Request, dgst digest.Digest) error { + canonical, err := lbs.Stat(ctx, dgst) // access check + if err != nil { + return err + } + + if canonical.MediaType != "" { + // Set the repository local content type. + w.Header().Set("Content-Type", canonical.MediaType) + } + + return lbs.blobServer.ServeBlob(ctx, w, r, canonical.Digest) +} + +func (lbs *linkedBlobStore) Put(ctx context.Context, mediaType string, p []byte) (distribution.Descriptor, error) { + // Place the data in the blob store first. + desc, err := lbs.blobStore.Put(ctx, mediaType, p) + if err != nil { + context.GetLogger(ctx).Errorf("error putting into main store: %v", err) + return distribution.Descriptor{}, err + } + + // TODO(stevvooe): Write out mediatype if incoming differs from what is + // returned by Put above. Note that we should allow updates for a given + // repository. + + return desc, lbs.linkBlob(ctx, desc) +} + +// Writer begins a blob write session, returning a handle. +func (lbs *linkedBlobStore) Create(ctx context.Context) (distribution.BlobWriter, error) { + context.GetLogger(ctx).Debug("(*linkedBlobStore).Writer") + + uuid := uuid.New() + startedAt := time.Now().UTC() + + path, err := lbs.blobStore.pm.path(uploadDataPathSpec{ + name: lbs.repository.Name(), + id: uuid, + }) + + if err != nil { + return nil, err + } + + startedAtPath, err := lbs.blobStore.pm.path(uploadStartedAtPathSpec{ + name: lbs.repository.Name(), + id: uuid, + }) + + if err != nil { + return nil, err + } + + // Write a startedat file for this upload + if err := lbs.blobStore.driver.PutContent(ctx, startedAtPath, []byte(startedAt.Format(time.RFC3339))); err != nil { + return nil, err + } + + return lbs.newBlobUpload(ctx, uuid, path, startedAt) +} + +func (lbs *linkedBlobStore) Resume(ctx context.Context, id string) (distribution.BlobWriter, error) { + context.GetLogger(ctx).Debug("(*linkedBlobStore).Resume") + + startedAtPath, err := lbs.blobStore.pm.path(uploadStartedAtPathSpec{ + name: lbs.repository.Name(), + id: id, + }) + + if err != nil { + return nil, err + } + + startedAtBytes, err := lbs.blobStore.driver.GetContent(ctx, startedAtPath) + if err != nil { + switch err := err.(type) { + case driver.PathNotFoundError: + return nil, distribution.ErrBlobUploadUnknown + default: + return nil, err + } + } + + startedAt, err := time.Parse(time.RFC3339, string(startedAtBytes)) + if err != nil { + return nil, err + } + + path, err := lbs.pm.path(uploadDataPathSpec{ + name: lbs.repository.Name(), + id: id, + }) + + if err != nil { + return nil, err + } + + return lbs.newBlobUpload(ctx, id, path, startedAt) +} + +// newLayerUpload allocates a new upload controller with the given state. +func (lbs *linkedBlobStore) newBlobUpload(ctx context.Context, uuid, path string, startedAt time.Time) (distribution.BlobWriter, error) { + fw, err := newFileWriter(ctx, lbs.driver, path) + if err != nil { + return nil, err + } + + bw := &blobWriter{ + blobStore: lbs, + id: uuid, + startedAt: startedAt, + bufferedFileWriter: *fw, + } + + bw.setupResumableDigester() + + return bw, nil +} + +// linkBlob links a valid, written blob into the registry under the named +// repository for the upload controller. +func (lbs *linkedBlobStore) linkBlob(ctx context.Context, canonical distribution.Descriptor, aliases ...digest.Digest) error { + dgsts := append([]digest.Digest{canonical.Digest}, aliases...) + + // TODO(stevvooe): Need to write out mediatype for only canonical hash + // since we don't care about the aliases. They are generally unused except + // for tarsum but those versions don't care about mediatype. + + // Don't make duplicate links. + seenDigests := make(map[digest.Digest]struct{}, len(dgsts)) + + for _, dgst := range dgsts { + if _, seen := seenDigests[dgst]; seen { + continue + } + seenDigests[dgst] = struct{}{} + + blobLinkPath, err := lbs.linkPath(lbs.pm, lbs.repository.Name(), dgst) + if err != nil { + return err + } + + if err := lbs.blobStore.link(ctx, blobLinkPath, canonical.Digest); err != nil { + return err + } + } + + return nil +} + +type linkedBlobStatter struct { + *blobStore + repository distribution.Repository + + // linkPath allows one to control the repository blob link set to which + // the blob store dispatches. This is required because manifest and layer + // blobs have not yet been fully merged. At some point, this functionality + // should be removed an the blob links folder should be merged. + linkPath func(pm *pathMapper, name string, dgst digest.Digest) (string, error) +} + +var _ distribution.BlobStatter = &linkedBlobStatter{} + +func (lbs *linkedBlobStatter) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) { + blobLinkPath, err := lbs.linkPath(lbs.pm, lbs.repository.Name(), dgst) + if err != nil { + return distribution.Descriptor{}, err + } + + target, err := lbs.blobStore.readlink(ctx, blobLinkPath) + if err != nil { + switch err := err.(type) { + case driver.PathNotFoundError: + return distribution.Descriptor{}, distribution.ErrBlobUnknown + default: + return distribution.Descriptor{}, err + } + + // TODO(stevvooe): For backwards compatibility with data in "_layers", we + // need to hit layerLinkPath, as well. Or, somehow migrate to the new path + // layout. + } + + if target != dgst { + // Track when we are doing cross-digest domain lookups. ie, tarsum to sha256. + context.GetLogger(ctx).Warnf("looking up blob with canonical target: %v -> %v", dgst, target) + } + + // TODO(stevvooe): Look up repository local mediatype and replace that on + // the returned descriptor. + + return lbs.blobStore.statter.Stat(ctx, target) +} + +// blobLinkPath provides the path to the blob link, also known as layers. +func blobLinkPath(pm *pathMapper, name string, dgst digest.Digest) (string, error) { + return pm.path(layerLinkPathSpec{name: name, digest: dgst}) +} + +// manifestRevisionLinkPath provides the path to the manifest revision link. +func manifestRevisionLinkPath(pm *pathMapper, name string, dgst digest.Digest) (string, error) { + return pm.path(layerLinkPathSpec{name: name, digest: dgst}) +} diff --git a/registry/storage/manifeststore.go b/registry/storage/manifeststore.go index 4946785d3..07f8de3c8 100644 --- a/registry/storage/manifeststore.go +++ b/registry/storage/manifeststore.go @@ -4,88 +4,92 @@ import ( "fmt" "github.com/docker/distribution" - ctxu "github.com/docker/distribution/context" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/docker/distribution/manifest" "github.com/docker/libtrust" ) type manifestStore struct { - repository *repository - + repository *repository revisionStore *revisionStore tagStore *tagStore + ctx context.Context } var _ distribution.ManifestService = &manifestStore{} func (ms *manifestStore) Exists(dgst digest.Digest) (bool, error) { - ctxu.GetLogger(ms.repository.ctx).Debug("(*manifestStore).Exists") - return ms.revisionStore.exists(dgst) + context.GetLogger(ms.ctx).Debug("(*manifestStore).Exists") + + _, err := ms.revisionStore.blobStore.Stat(ms.ctx, dgst) + if err != nil { + if err == distribution.ErrBlobUnknown { + return false, nil + } + + return false, err + } + + return true, nil } func (ms *manifestStore) Get(dgst digest.Digest) (*manifest.SignedManifest, error) { - ctxu.GetLogger(ms.repository.ctx).Debug("(*manifestStore).Get") - return ms.revisionStore.get(dgst) + context.GetLogger(ms.ctx).Debug("(*manifestStore).Get") + return ms.revisionStore.get(ms.ctx, dgst) } func (ms *manifestStore) Put(manifest *manifest.SignedManifest) error { - ctxu.GetLogger(ms.repository.ctx).Debug("(*manifestStore).Put") - - // TODO(stevvooe): Add check here to see if the revision is already - // present in the repository. If it is, we should merge the signatures, do - // a shallow verify (or a full one, doesn't matter) and return an error - // indicating what happened. + context.GetLogger(ms.ctx).Debug("(*manifestStore).Put") // Verify the manifest. - if err := ms.verifyManifest(manifest); err != nil { + if err := ms.verifyManifest(ms.ctx, manifest); err != nil { return err } // Store the revision of the manifest - revision, err := ms.revisionStore.put(manifest) + revision, err := ms.revisionStore.put(ms.ctx, manifest) if err != nil { return err } // Now, tag the manifest - return ms.tagStore.tag(manifest.Tag, revision) + return ms.tagStore.tag(manifest.Tag, revision.Digest) } // Delete removes the revision of the specified manfiest. func (ms *manifestStore) Delete(dgst digest.Digest) error { - ctxu.GetLogger(ms.repository.ctx).Debug("(*manifestStore).Delete - unsupported") + context.GetLogger(ms.ctx).Debug("(*manifestStore).Delete - unsupported") return fmt.Errorf("deletion of manifests not supported") } func (ms *manifestStore) Tags() ([]string, error) { - ctxu.GetLogger(ms.repository.ctx).Debug("(*manifestStore).Tags") + context.GetLogger(ms.ctx).Debug("(*manifestStore).Tags") return ms.tagStore.tags() } func (ms *manifestStore) ExistsByTag(tag string) (bool, error) { - ctxu.GetLogger(ms.repository.ctx).Debug("(*manifestStore).ExistsByTag") + context.GetLogger(ms.ctx).Debug("(*manifestStore).ExistsByTag") return ms.tagStore.exists(tag) } func (ms *manifestStore) GetByTag(tag string) (*manifest.SignedManifest, error) { - ctxu.GetLogger(ms.repository.ctx).Debug("(*manifestStore).GetByTag") + context.GetLogger(ms.ctx).Debug("(*manifestStore).GetByTag") dgst, err := ms.tagStore.resolve(tag) if err != nil { return nil, err } - return ms.revisionStore.get(dgst) + return ms.revisionStore.get(ms.ctx, dgst) } // verifyManifest ensures that the manifest content is valid from the // perspective of the registry. It ensures that the signature is valid for the // enclosed payload. As a policy, the registry only tries to store valid // content, leaving trust policies of that content up to consumers. -func (ms *manifestStore) verifyManifest(mnfst *manifest.SignedManifest) error { +func (ms *manifestStore) verifyManifest(ctx context.Context, mnfst *manifest.SignedManifest) error { var errs distribution.ErrManifestVerification if mnfst.Name != ms.repository.Name() { - // TODO(stevvooe): This needs to be an exported error errs = append(errs, fmt.Errorf("repository name does not match manifest name")) } @@ -103,18 +107,18 @@ func (ms *manifestStore) verifyManifest(mnfst *manifest.SignedManifest) error { } for _, fsLayer := range mnfst.FSLayers { - exists, err := ms.repository.Layers().Exists(fsLayer.BlobSum) + _, err := ms.repository.Blobs(ctx).Stat(ctx, fsLayer.BlobSum) if err != nil { - errs = append(errs, err) - } + if err != distribution.ErrBlobUnknown { + errs = append(errs, err) + } - if !exists { - errs = append(errs, distribution.ErrUnknownLayer{FSLayer: fsLayer}) + // On error here, we always append unknown blob errors. + errs = append(errs, distribution.ErrManifestBlobUnknown{Digest: fsLayer.BlobSum}) } } if len(errs) != 0 { - // TODO(stevvooe): These need to be recoverable by a caller. return errs } diff --git a/registry/storage/manifeststore_test.go b/registry/storage/manifeststore_test.go index 3bafb9976..59f174b3b 100644 --- a/registry/storage/manifeststore_test.go +++ b/registry/storage/manifeststore_test.go @@ -6,16 +6,15 @@ import ( "reflect" "testing" - "github.com/docker/distribution/registry/storage/cache" - "github.com/docker/distribution" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/docker/distribution/manifest" + "github.com/docker/distribution/registry/storage/cache" "github.com/docker/distribution/registry/storage/driver" "github.com/docker/distribution/registry/storage/driver/inmemory" "github.com/docker/distribution/testutil" "github.com/docker/libtrust" - "golang.org/x/net/context" ) type manifestStoreTestEnv struct { @@ -30,7 +29,7 @@ type manifestStoreTestEnv struct { func newManifestStoreTestEnv(t *testing.T, name, tag string) *manifestStoreTestEnv { ctx := context.Background() driver := inmemory.New() - registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryLayerInfoCache()) + registry := NewRegistryWithDriver(ctx, driver, cache.NewInMemoryBlobDescriptorCacheProvider()) repo, err := registry.Repository(ctx, name) if err != nil { @@ -108,20 +107,33 @@ func TestManifestStorage(t *testing.T) { t.Fatalf("expected errors putting manifest") } - // TODO(stevvooe): We expect errors describing all of the missing layers. + switch err := err.(type) { + case distribution.ErrManifestVerification: + if len(err) != 2 { + t.Fatalf("expected 2 verification errors: %#v", err) + } + + for _, err := range err { + if _, ok := err.(distribution.ErrManifestBlobUnknown); !ok { + t.Fatalf("unexpected error type: %v", err) + } + } + default: + t.Fatalf("unexpected error verifying manifest: %v", err) + } // Now, upload the layers that were missing! for dgst, rs := range testLayers { - upload, err := env.repository.Layers().Upload() + wr, err := env.repository.Blobs(env.ctx).Create(env.ctx) if err != nil { t.Fatalf("unexpected error creating test upload: %v", err) } - if _, err := io.Copy(upload, rs); err != nil { + if _, err := io.Copy(wr, rs); err != nil { t.Fatalf("unexpected error copying to upload: %v", err) } - if _, err := upload.Finish(dgst); err != nil { + if _, err := wr.Commit(env.ctx, distribution.Descriptor{Digest: dgst}); err != nil { t.Fatalf("unexpected error finishing upload: %v", err) } } diff --git a/registry/storage/paths.go b/registry/storage/paths.go index fe648f519..9e150d3ba 100644 --- a/registry/storage/paths.go +++ b/registry/storage/paths.go @@ -30,7 +30,7 @@ const storagePathVersion = "v2" // -> //link // -> _layers/ // -// -> _uploads/ +// -> _uploads/ // data // startedat // hashstates// @@ -47,7 +47,7 @@ const storagePathVersion = "v2" // is just a directory of layers which are "linked" into a repository. A layer // can only be accessed through a qualified repository name if it is linked in // the repository. Uploads of layers are managed in the uploads directory, -// which is key by upload uuid. When all data for an upload is received, the +// which is key by upload id. When all data for an upload is received, the // data is moved into the blob store and the upload directory is deleted. // Abandoned uploads can be garbage collected by reading the startedat file // and removing uploads that have been active for longer than a certain time. @@ -80,20 +80,21 @@ const storagePathVersion = "v2" // manifestTagIndexEntryPathSpec: /v2/repositories//_manifests/tags//index/// // manifestTagIndexEntryLinkPathSpec: /v2/repositories//_manifests/tags//index///link // -// Layers: +// Blobs: // -// layerLinkPathSpec: /v2/repositories//_layers/tarsum////link +// layerLinkPathSpec: /v2/repositories//_layers///link // // Uploads: // -// uploadDataPathSpec: /v2/repositories//_uploads//data -// uploadStartedAtPathSpec: /v2/repositories//_uploads//startedat -// uploadHashStatePathSpec: /v2/repositories//_uploads//hashstates// +// uploadDataPathSpec: /v2/repositories//_uploads//data +// uploadStartedAtPathSpec: /v2/repositories//_uploads//startedat +// uploadHashStatePathSpec: /v2/repositories//_uploads//hashstates// // // Blob Store: // // blobPathSpec: /v2/blobs/// // blobDataPathSpec: /v2/blobs////data +// blobMediaTypePathSpec: /v2/blobs////data // // For more information on the semantic meaning of each path and their // contents, please see the path spec documentation. @@ -234,9 +235,14 @@ func (pm *pathMapper) path(spec pathSpec) (string, error) { return "", err } - layerLinkPathComponents := append(repoPrefix, v.name, "_layers") + // TODO(stevvooe): Right now, all blobs are linked under "_layers". If + // we have future migrations, we may want to rename this to "_blobs". + // A migration strategy would simply leave existing items in place and + // write the new paths, commit a file then delete the old files. - return path.Join(path.Join(append(layerLinkPathComponents, components...)...), "link"), nil + blobLinkPathComponents := append(repoPrefix, v.name, "_layers") + + return path.Join(path.Join(append(blobLinkPathComponents, components...)...), "link"), nil case blobDataPathSpec: components, err := digestPathComponents(v.digest, true) if err != nil { @@ -248,15 +254,15 @@ func (pm *pathMapper) path(spec pathSpec) (string, error) { return path.Join(append(blobPathPrefix, components...)...), nil case uploadDataPathSpec: - return path.Join(append(repoPrefix, v.name, "_uploads", v.uuid, "data")...), nil + return path.Join(append(repoPrefix, v.name, "_uploads", v.id, "data")...), nil case uploadStartedAtPathSpec: - return path.Join(append(repoPrefix, v.name, "_uploads", v.uuid, "startedat")...), nil + return path.Join(append(repoPrefix, v.name, "_uploads", v.id, "startedat")...), nil case uploadHashStatePathSpec: offset := fmt.Sprintf("%d", v.offset) if v.list { offset = "" // Limit to the prefix for listing offsets. } - return path.Join(append(repoPrefix, v.name, "_uploads", v.uuid, "hashstates", v.alg, offset)...), nil + return path.Join(append(repoPrefix, v.name, "_uploads", v.id, "hashstates", v.alg, offset)...), nil case repositoriesRootPathSpec: return path.Join(repoPrefix...), nil default: @@ -367,8 +373,8 @@ type manifestTagIndexEntryLinkPathSpec struct { func (manifestTagIndexEntryLinkPathSpec) pathSpec() {} -// layerLink specifies a path for a layer link, which is a file with a blob -// id. The layer link will contain a content addressable blob id reference +// blobLinkPathSpec specifies a path for a blob link, which is a file with a +// blob id. The blob link will contain a content addressable blob id reference // into the blob store. The format of the contents is as follows: // // : @@ -377,7 +383,7 @@ func (manifestTagIndexEntryLinkPathSpec) pathSpec() {} // // sha256:96443a84ce518ac22acb2e985eda402b58ac19ce6f91980bde63726a79d80b36 // -// This says indicates that there is a blob with the id/digest, calculated via +// This indicates that there is a blob with the id/digest, calculated via // sha256 that can be fetched from the blob store. type layerLinkPathSpec struct { name string @@ -415,7 +421,7 @@ func (blobDataPathSpec) pathSpec() {} // uploads. type uploadDataPathSpec struct { name string - uuid string + id string } func (uploadDataPathSpec) pathSpec() {} @@ -429,7 +435,7 @@ func (uploadDataPathSpec) pathSpec() {} // the client to enforce time out policies. type uploadStartedAtPathSpec struct { name string - uuid string + id string } func (uploadStartedAtPathSpec) pathSpec() {} @@ -437,10 +443,10 @@ func (uploadStartedAtPathSpec) pathSpec() {} // uploadHashStatePathSpec defines the path parameters for the file that stores // the hash function state of an upload at a specific byte offset. If `list` is // set, then the path mapper will generate a list prefix for all hash state -// offsets for the upload identified by the name, uuid, and alg. +// offsets for the upload identified by the name, id, and alg. type uploadHashStatePathSpec struct { name string - uuid string + id string alg string offset int64 list bool diff --git a/registry/storage/paths_test.go b/registry/storage/paths_test.go index 7dff6e093..3d17b3779 100644 --- a/registry/storage/paths_test.go +++ b/registry/storage/paths_test.go @@ -111,14 +111,14 @@ func TestPathMapper(t *testing.T) { { spec: uploadDataPathSpec{ name: "foo/bar", - uuid: "asdf-asdf-asdf-adsf", + id: "asdf-asdf-asdf-adsf", }, expected: "/pathmapper-test/repositories/foo/bar/_uploads/asdf-asdf-asdf-adsf/data", }, { spec: uploadStartedAtPathSpec{ name: "foo/bar", - uuid: "asdf-asdf-asdf-adsf", + id: "asdf-asdf-asdf-adsf", }, expected: "/pathmapper-test/repositories/foo/bar/_uploads/asdf-asdf-asdf-adsf/startedat", }, diff --git a/registry/storage/purgeuploads_test.go b/registry/storage/purgeuploads_test.go index 7c0f88134..d44084791 100644 --- a/registry/storage/purgeuploads_test.go +++ b/registry/storage/purgeuploads_test.go @@ -24,7 +24,7 @@ func testUploadFS(t *testing.T, numUploads int, repoName string, startedAt time. } func addUploads(ctx context.Context, t *testing.T, d driver.StorageDriver, uploadID, repo string, startedAt time.Time) { - dataPath, err := pm.path(uploadDataPathSpec{name: repo, uuid: uploadID}) + dataPath, err := pm.path(uploadDataPathSpec{name: repo, id: uploadID}) if err != nil { t.Fatalf("Unable to resolve path") } @@ -32,7 +32,7 @@ func addUploads(ctx context.Context, t *testing.T, d driver.StorageDriver, uploa t.Fatalf("Unable to write data file") } - startedAtPath, err := pm.path(uploadStartedAtPathSpec{name: repo, uuid: uploadID}) + startedAtPath, err := pm.path(uploadStartedAtPathSpec{name: repo, id: uploadID}) if err != nil { t.Fatalf("Unable to resolve path") } @@ -115,7 +115,7 @@ func TestPurgeOnlyUploads(t *testing.T) { // Create a directory tree outside _uploads and ensure // these files aren't deleted. - dataPath, err := pm.path(uploadDataPathSpec{name: "test-repo", uuid: uuid.New()}) + dataPath, err := pm.path(uploadDataPathSpec{name: "test-repo", id: uuid.New()}) if err != nil { t.Fatalf(err.Error()) } diff --git a/registry/storage/registry.go b/registry/storage/registry.go index 2834e5eb1..659c789e7 100644 --- a/registry/storage/registry.go +++ b/registry/storage/registry.go @@ -2,38 +2,53 @@ package storage import ( "github.com/docker/distribution" + "github.com/docker/distribution/context" "github.com/docker/distribution/registry/api/v2" "github.com/docker/distribution/registry/storage/cache" storagedriver "github.com/docker/distribution/registry/storage/driver" - "golang.org/x/net/context" ) // registry is the top-level implementation of Registry for use in the storage // package. All instances should descend from this object. type registry struct { - driver storagedriver.StorageDriver - pm *pathMapper - blobStore *blobStore - layerInfoCache cache.LayerInfoCache + blobStore *blobStore + blobServer distribution.BlobServer + statter distribution.BlobStatter // global statter service. + blobDescriptorCacheProvider cache.BlobDescriptorCacheProvider } // NewRegistryWithDriver creates a new registry instance from the provided // driver. The resulting registry may be shared by multiple goroutines but is // cheap to allocate. -func NewRegistryWithDriver(ctx context.Context, driver storagedriver.StorageDriver, layerInfoCache cache.LayerInfoCache) distribution.Namespace { - bs := &blobStore{ +func NewRegistryWithDriver(ctx context.Context, driver storagedriver.StorageDriver, blobDescriptorCacheProvider cache.BlobDescriptorCacheProvider) distribution.Namespace { + + // create global statter, with cache. + var statter distribution.BlobStatter = &blobStatter{ driver: driver, pm: defaultPathMapper, - ctx: ctx, + } + + if blobDescriptorCacheProvider != nil { + statter = &cachedBlobStatter{ + cache: blobDescriptorCacheProvider, + backend: statter, + } + } + + bs := &blobStore{ + driver: driver, + pm: defaultPathMapper, + statter: statter, } return ®istry{ - driver: driver, blobStore: bs, - - // TODO(sday): This should be configurable. - pm: defaultPathMapper, - layerInfoCache: layerInfoCache, + blobServer: &blobServer{ + driver: driver, + statter: statter, + pathFn: bs.path, + }, + blobDescriptorCacheProvider: blobDescriptorCacheProvider, } } @@ -54,18 +69,29 @@ func (reg *registry) Repository(ctx context.Context, name string) (distribution. } } + var descriptorCache distribution.BlobDescriptorService + if reg.blobDescriptorCacheProvider != nil { + var err error + descriptorCache, err = reg.blobDescriptorCacheProvider.RepositoryScoped(name) + if err != nil { + return nil, err + } + } + return &repository{ - ctx: ctx, - registry: reg, - name: name, + ctx: ctx, + registry: reg, + name: name, + descriptorCache: descriptorCache, }, nil } // repository provides name-scoped access to various services. type repository struct { *registry - ctx context.Context - name string + ctx context.Context + name string + descriptorCache distribution.BlobDescriptorService } // Name returns the name of the repository. @@ -78,47 +104,68 @@ func (repo *repository) Name() string { // to a request local. func (repo *repository) Manifests() distribution.ManifestService { return &manifestStore{ + ctx: repo.ctx, repository: repo, revisionStore: &revisionStore{ + ctx: repo.ctx, repository: repo, + blobStore: &linkedBlobStore{ + ctx: repo.ctx, + blobStore: repo.blobStore, + repository: repo, + statter: &linkedBlobStatter{ + blobStore: repo.blobStore, + repository: repo, + linkPath: manifestRevisionLinkPath, + }, + + // TODO(stevvooe): linkPath limits this blob store to only + // manifests. This instance cannot be used for blob checks. + linkPath: manifestRevisionLinkPath, + }, }, tagStore: &tagStore{ + ctx: repo.ctx, repository: repo, + blobStore: repo.registry.blobStore, }, } } -// Layers returns an instance of the LayerService. Instantiation is cheap and +// Blobs returns an instance of the BlobStore. Instantiation is cheap and // may be context sensitive in the future. The instance should be used similar // to a request local. -func (repo *repository) Layers() distribution.LayerService { - ls := &layerStore{ +func (repo *repository) Blobs(ctx context.Context) distribution.BlobStore { + var statter distribution.BlobStatter = &linkedBlobStatter{ + blobStore: repo.blobStore, repository: repo, + linkPath: blobLinkPath, } - if repo.registry.layerInfoCache != nil { - // TODO(stevvooe): This is not the best place to setup a cache. We would - // really like to decouple the cache from the backend but also have the - // manifeset service use the layer service cache. For now, we can simply - // integrate the cache directly. The main issue is that we have layer - // access and layer data coupled in a single object. Work is already under - // way to decouple this. - - return &cachedLayerService{ - LayerService: ls, - repository: repo, - ctx: repo.ctx, - driver: repo.driver, - blobStore: repo.blobStore, - cache: repo.registry.layerInfoCache, + if repo.descriptorCache != nil { + statter = &cachedBlobStatter{ + cache: repo.descriptorCache, + backend: statter, } } - return ls + return &linkedBlobStore{ + blobStore: repo.blobStore, + blobServer: repo.blobServer, + statter: statter, + repository: repo, + ctx: ctx, + + // TODO(stevvooe): linkPath limits this blob store to only layers. + // This instance cannot be used for manifest checks. + linkPath: blobLinkPath, + } } func (repo *repository) Signatures() distribution.SignatureService { return &signatureStore{ repository: repo, + blobStore: repo.blobStore, + ctx: repo.ctx, } } diff --git a/registry/storage/revisionstore.go b/registry/storage/revisionstore.go index 066ce972b..9838bff20 100644 --- a/registry/storage/revisionstore.go +++ b/registry/storage/revisionstore.go @@ -3,8 +3,8 @@ package storage import ( "encoding/json" - "github.com/Sirupsen/logrus" "github.com/docker/distribution" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" "github.com/docker/distribution/manifest" "github.com/docker/libtrust" @@ -12,47 +12,56 @@ import ( // revisionStore supports storing and managing manifest revisions. type revisionStore struct { - *repository + repository *repository + blobStore *linkedBlobStore + ctx context.Context } -// exists returns true if the revision is available in the named repository. -func (rs *revisionStore) exists(revision digest.Digest) (bool, error) { - revpath, err := rs.pm.path(manifestRevisionPathSpec{ - name: rs.Name(), - revision: revision, - }) - - if err != nil { - return false, err +func newRevisionStore(ctx context.Context, repo *repository, blobStore *blobStore) *revisionStore { + return &revisionStore{ + ctx: ctx, + repository: repo, + blobStore: &linkedBlobStore{ + blobStore: blobStore, + repository: repo, + ctx: ctx, + linkPath: manifestRevisionLinkPath, + }, } - - exists, err := exists(rs.repository.ctx, rs.driver, revpath) - if err != nil { - return false, err - } - - return exists, nil } // get retrieves the manifest, keyed by revision digest. -func (rs *revisionStore) get(revision digest.Digest) (*manifest.SignedManifest, error) { +func (rs *revisionStore) get(ctx context.Context, revision digest.Digest) (*manifest.SignedManifest, error) { // Ensure that this revision is available in this repository. - if exists, err := rs.exists(revision); err != nil { - return nil, err - } else if !exists { - return nil, distribution.ErrUnknownManifestRevision{ - Name: rs.Name(), - Revision: revision, + _, err := rs.blobStore.Stat(ctx, revision) + if err != nil { + if err == distribution.ErrBlobUnknown { + return nil, distribution.ErrManifestUnknownRevision{ + Name: rs.repository.Name(), + Revision: revision, + } } + + return nil, err } - content, err := rs.blobStore.get(revision) + // TODO(stevvooe): Need to check descriptor from above to ensure that the + // mediatype is as we expect for the manifest store. + + content, err := rs.blobStore.Get(ctx, revision) if err != nil { + if err == distribution.ErrBlobUnknown { + return nil, distribution.ErrManifestUnknownRevision{ + Name: rs.repository.Name(), + Revision: revision, + } + } + return nil, err } // Fetch the signatures for the manifest - signatures, err := rs.Signatures().Get(revision) + signatures, err := rs.repository.Signatures().Get(revision) if err != nil { return nil, err } @@ -78,69 +87,34 @@ func (rs *revisionStore) get(revision digest.Digest) (*manifest.SignedManifest, // put stores the manifest in the repository, if not already present. Any // updated signatures will be stored, as well. -func (rs *revisionStore) put(sm *manifest.SignedManifest) (digest.Digest, error) { +func (rs *revisionStore) put(ctx context.Context, sm *manifest.SignedManifest) (distribution.Descriptor, error) { // Resolve the payload in the manifest. payload, err := sm.Payload() if err != nil { - return "", err + return distribution.Descriptor{}, err } // Digest and store the manifest payload in the blob store. - revision, err := rs.blobStore.put(payload) + revision, err := rs.blobStore.Put(ctx, manifest.ManifestMediaType, payload) if err != nil { - logrus.Errorf("error putting payload into blobstore: %v", err) - return "", err + context.GetLogger(ctx).Errorf("error putting payload into blobstore: %v", err) + return distribution.Descriptor{}, err } // Link the revision into the repository. - if err := rs.link(revision); err != nil { - return "", err + if err := rs.blobStore.linkBlob(ctx, revision); err != nil { + return distribution.Descriptor{}, err } // Grab each json signature and store them. signatures, err := sm.Signatures() if err != nil { - return "", err + return distribution.Descriptor{}, err } - if err := rs.Signatures().Put(revision, signatures...); err != nil { - return "", err + if err := rs.repository.Signatures().Put(revision.Digest, signatures...); err != nil { + return distribution.Descriptor{}, err } return revision, nil } - -// link links the revision into the repository. -func (rs *revisionStore) link(revision digest.Digest) error { - revisionPath, err := rs.pm.path(manifestRevisionLinkPathSpec{ - name: rs.Name(), - revision: revision, - }) - - if err != nil { - return err - } - - if exists, err := exists(rs.repository.ctx, rs.driver, revisionPath); err != nil { - return err - } else if exists { - // Revision has already been linked! - return nil - } - - return rs.blobStore.link(revisionPath, revision) -} - -// delete removes the specified manifest revision from storage. -func (rs *revisionStore) delete(revision digest.Digest) error { - revisionPath, err := rs.pm.path(manifestRevisionPathSpec{ - name: rs.Name(), - revision: revision, - }) - - if err != nil { - return err - } - - return rs.driver.Delete(rs.repository.ctx, revisionPath) -} diff --git a/registry/storage/signaturestore.go b/registry/storage/signaturestore.go index fcf6224f2..f6c23e27b 100644 --- a/registry/storage/signaturestore.go +++ b/registry/storage/signaturestore.go @@ -10,14 +10,24 @@ import ( ) type signatureStore struct { - *repository + repository *repository + blobStore *blobStore + ctx context.Context +} + +func newSignatureStore(ctx context.Context, repo *repository, blobStore *blobStore) *signatureStore { + return &signatureStore{ + ctx: ctx, + repository: repo, + blobStore: blobStore, + } } var _ distribution.SignatureService = &signatureStore{} func (s *signatureStore) Get(dgst digest.Digest) ([][]byte, error) { - signaturesPath, err := s.pm.path(manifestSignaturesPathSpec{ - name: s.Name(), + signaturesPath, err := s.blobStore.pm.path(manifestSignaturesPathSpec{ + name: s.repository.Name(), revision: dgst, }) @@ -30,7 +40,7 @@ func (s *signatureStore) Get(dgst digest.Digest) ([][]byte, error) { // can be eliminated by implementing listAll on drivers. signaturesPath = path.Join(signaturesPath, "sha256") - signaturePaths, err := s.driver.List(s.repository.ctx, signaturesPath) + signaturePaths, err := s.blobStore.driver.List(s.ctx, signaturesPath) if err != nil { return nil, err } @@ -43,27 +53,32 @@ func (s *signatureStore) Get(dgst digest.Digest) ([][]byte, error) { } ch := make(chan result) + bs := s.linkedBlobStore(s.ctx, dgst) for i, sigPath := range signaturePaths { - // Append the link portion - sigPath = path.Join(sigPath, "link") + sigdgst, err := digest.ParseDigest("sha256:" + path.Base(sigPath)) + if err != nil { + context.GetLogger(s.ctx).Errorf("could not get digest from path: %q, skipping", sigPath) + continue + } wg.Add(1) - go func(idx int, sigPath string) { + go func(idx int, sigdgst digest.Digest) { defer wg.Done() context.GetLogger(s.ctx). - Debugf("fetching signature from %q", sigPath) + Debugf("fetching signature %q", sigdgst) r := result{index: idx} - if p, err := s.blobStore.linked(sigPath); err != nil { + + if p, err := bs.Get(s.ctx, sigdgst); err != nil { context.GetLogger(s.ctx). - Errorf("error fetching signature from %q: %v", sigPath, err) + Errorf("error fetching signature %q: %v", sigdgst, err) r.err = err } else { r.signature = p } ch <- r - }(i, sigPath) + }(i, sigdgst) } done := make(chan struct{}) go func() { @@ -91,25 +106,36 @@ loop: } func (s *signatureStore) Put(dgst digest.Digest, signatures ...[]byte) error { + bs := s.linkedBlobStore(s.ctx, dgst) for _, signature := range signatures { - signatureDigest, err := s.blobStore.put(signature) - if err != nil { - return err - } - - signaturePath, err := s.pm.path(manifestSignatureLinkPathSpec{ - name: s.Name(), - revision: dgst, - signature: signatureDigest, - }) - - if err != nil { - return err - } - - if err := s.blobStore.link(signaturePath, signatureDigest); err != nil { + if _, err := bs.Put(s.ctx, "application/json", signature); err != nil { return err } } return nil } + +// namedBlobStore returns the namedBlobStore of the signatures for the +// manifest with the given digest. Effectively, each singature link path +// layout is a unique linked blob store. +func (s *signatureStore) linkedBlobStore(ctx context.Context, revision digest.Digest) *linkedBlobStore { + linkpath := func(pm *pathMapper, name string, dgst digest.Digest) (string, error) { + return pm.path(manifestSignatureLinkPathSpec{ + name: name, + revision: revision, + signature: dgst, + }) + } + + return &linkedBlobStore{ + ctx: ctx, + repository: s.repository, + blobStore: s.blobStore, + statter: &linkedBlobStatter{ + blobStore: s.blobStore, + repository: s.repository, + linkPath: linkpath, + }, + linkPath: linkpath, + } +} diff --git a/registry/storage/tagstore.go b/registry/storage/tagstore.go index 882e6c351..a74d9b094 100644 --- a/registry/storage/tagstore.go +++ b/registry/storage/tagstore.go @@ -4,31 +4,33 @@ import ( "path" "github.com/docker/distribution" - // "github.com/docker/distribution/context" + "github.com/docker/distribution/context" "github.com/docker/distribution/digest" storagedriver "github.com/docker/distribution/registry/storage/driver" ) // tagStore provides methods to manage manifest tags in a backend storage driver. type tagStore struct { - *repository + repository *repository + blobStore *blobStore + ctx context.Context } // tags lists the manifest tags for the specified repository. func (ts *tagStore) tags() ([]string, error) { - p, err := ts.pm.path(manifestTagPathSpec{ - name: ts.name, + p, err := ts.blobStore.pm.path(manifestTagPathSpec{ + name: ts.repository.Name(), }) if err != nil { return nil, err } var tags []string - entries, err := ts.driver.List(ts.repository.ctx, p) + entries, err := ts.blobStore.driver.List(ts.ctx, p) if err != nil { switch err := err.(type) { case storagedriver.PathNotFoundError: - return nil, distribution.ErrRepositoryUnknown{Name: ts.name} + return nil, distribution.ErrRepositoryUnknown{Name: ts.repository.Name()} default: return nil, err } @@ -45,15 +47,15 @@ func (ts *tagStore) tags() ([]string, error) { // exists returns true if the specified manifest tag exists in the repository. func (ts *tagStore) exists(tag string) (bool, error) { - tagPath, err := ts.pm.path(manifestTagCurrentPathSpec{ - name: ts.Name(), + tagPath, err := ts.blobStore.pm.path(manifestTagCurrentPathSpec{ + name: ts.repository.Name(), tag: tag, }) if err != nil { return false, err } - exists, err := exists(ts.repository.ctx, ts.driver, tagPath) + exists, err := exists(ts.ctx, ts.blobStore.driver, tagPath) if err != nil { return false, err } @@ -64,18 +66,8 @@ func (ts *tagStore) exists(tag string) (bool, error) { // tag tags the digest with the given tag, updating the the store to point at // the current tag. The digest must point to a manifest. func (ts *tagStore) tag(tag string, revision digest.Digest) error { - indexEntryPath, err := ts.pm.path(manifestTagIndexEntryLinkPathSpec{ - name: ts.Name(), - tag: tag, - revision: revision, - }) - - if err != nil { - return err - } - - currentPath, err := ts.pm.path(manifestTagCurrentPathSpec{ - name: ts.Name(), + currentPath, err := ts.blobStore.pm.path(manifestTagCurrentPathSpec{ + name: ts.repository.Name(), tag: tag, }) @@ -83,77 +75,69 @@ func (ts *tagStore) tag(tag string, revision digest.Digest) error { return err } + nbs := ts.linkedBlobStore(ts.ctx, tag) // Link into the index - if err := ts.blobStore.link(indexEntryPath, revision); err != nil { + if err := nbs.linkBlob(ts.ctx, distribution.Descriptor{Digest: revision}); err != nil { return err } // Overwrite the current link - return ts.blobStore.link(currentPath, revision) + return ts.blobStore.link(ts.ctx, currentPath, revision) } // resolve the current revision for name and tag. func (ts *tagStore) resolve(tag string) (digest.Digest, error) { - currentPath, err := ts.pm.path(manifestTagCurrentPathSpec{ - name: ts.Name(), + currentPath, err := ts.blobStore.pm.path(manifestTagCurrentPathSpec{ + name: ts.repository.Name(), tag: tag, }) - if err != nil { return "", err } - if exists, err := exists(ts.repository.ctx, ts.driver, currentPath); err != nil { - return "", err - } else if !exists { - return "", distribution.ErrManifestUnknown{Name: ts.Name(), Tag: tag} - } - - revision, err := ts.blobStore.readlink(currentPath) + revision, err := ts.blobStore.readlink(ts.ctx, currentPath) if err != nil { + switch err.(type) { + case storagedriver.PathNotFoundError: + return "", distribution.ErrManifestUnknown{Name: ts.repository.Name(), Tag: tag} + } + return "", err } return revision, nil } -// revisions returns all revisions with the specified name and tag. -func (ts *tagStore) revisions(tag string) ([]digest.Digest, error) { - manifestTagIndexPath, err := ts.pm.path(manifestTagIndexPathSpec{ - name: ts.Name(), - tag: tag, - }) - - if err != nil { - return nil, err - } - - // TODO(stevvooe): Need to append digest alg to get listing of revisions. - manifestTagIndexPath = path.Join(manifestTagIndexPath, "sha256") - - entries, err := ts.driver.List(ts.repository.ctx, manifestTagIndexPath) - if err != nil { - return nil, err - } - - var revisions []digest.Digest - for _, entry := range entries { - revisions = append(revisions, digest.NewDigestFromHex("sha256", path.Base(entry))) - } - - return revisions, nil -} - // delete removes the tag from repository, including the history of all // revisions that have the specified tag. func (ts *tagStore) delete(tag string) error { - tagPath, err := ts.pm.path(manifestTagPathSpec{ - name: ts.Name(), + tagPath, err := ts.blobStore.pm.path(manifestTagPathSpec{ + name: ts.repository.Name(), tag: tag, }) if err != nil { return err } - return ts.driver.Delete(ts.repository.ctx, tagPath) + return ts.blobStore.driver.Delete(ts.ctx, tagPath) +} + +// namedBlobStore returns the namedBlobStore for the named tag, allowing one +// to index manifest blobs by tag name. While the tag store doesn't map +// precisely to the linked blob store, using this ensures the links are +// managed via the same code path. +func (ts *tagStore) linkedBlobStore(ctx context.Context, tag string) *linkedBlobStore { + return &linkedBlobStore{ + blobStore: ts.blobStore, + repository: ts.repository, + ctx: ctx, + linkPath: func(pm *pathMapper, name string, dgst digest.Digest) (string, error) { + return pm.path(manifestTagIndexEntryLinkPathSpec{ + name: name, + tag: tag, + revision: dgst, + }) + }, + } + } diff --git a/registry/storage/util.go b/registry/storage/util.go new file mode 100644 index 000000000..773d7ba0b --- /dev/null +++ b/registry/storage/util.go @@ -0,0 +1,21 @@ +package storage + +import ( + "github.com/docker/distribution/context" + "github.com/docker/distribution/registry/storage/driver" +) + +// Exists provides a utility method to test whether or not a path exists in +// the given driver. +func exists(ctx context.Context, drv driver.StorageDriver, path string) (bool, error) { + if _, err := drv.Stat(ctx, path); err != nil { + switch err := err.(type) { + case driver.PathNotFoundError: + return false, nil + default: + return false, err + } + } + + return true, nil +}