Commit graph

13 commits

Author SHA1 Message Date
Stephen J Day
ba6b774aea Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.

Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.

The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.

Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.

Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-09 14:50:39 -08:00
Brian Bland
ea6c082e85 Minor cleanup/testing for HMAC upload tokens
Changes configuration variable, lowercases private interface methods,
adds token sanity tests.
2015-01-05 14:37:56 -08:00
Brian Bland
07ba5db168 Serializes upload state to an HMAC token for subsequent requests
To support clustered registry, upload UUIDs must be recognizable by
registries that did not issue the UUID. By creating an HMAC verifiable
upload state token, registries can validate upload requests that other
instances authorized. The tokenProvider interface could also use a redis
store or other system for token handling in the future.
2015-01-05 14:27:05 -08:00
Stephen J Day
a4024b2f90 Move manifest to discrete package
Because manifests and their signatures are a discrete component of the
registry, we are moving the definitions into a separate package. This causes us
to lose some test coverage, but we can fill this in shortly. No changes have
been made to the external interfaces, but they are likely to come.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-02 13:23:11 -08:00
Olivier Gambier
67ca9d10cf Move from docker-registry to distribution 2014-12-23 17:13:02 -08:00
Stephen J Day
a4f42b8eea Relax requirement for size argument during blob upload
During client implementation, it was found that requiring the size argument
made client implementation more complex. The original benefit of the size
argument was to provide an additional check alongside of tarsum to validate
incoming data. For the purposes of the registry, it has been determined that
tarsum should be enough to validate incoming content.

At this time, the size check is optional but we may consider removing it
completely.
2014-12-12 19:08:50 -08:00
Stephen J Day
70ab06b864 Update storage package to use StorageDriver.Stat
This change updates the backend storage package that consumes StorageDriver to
use the new Stat call, over CurrentSize. It also makes minor updates for using
WriteStream and ReadStream.
2014-12-04 20:55:59 -08:00
Stephen J Day
6fead90736 Rich error reporting for manifest push
To provide rich error reporting during manifest pushes, the storage layers
verifyManifest stage has been modified to provide the necessary granularity.
Along with this comes with a partial shift to explicit error types, which
represents a small move in larger refactoring of error handling. Signature
methods from libtrust have been added to the various Manifest types to clean up
the verification code.

A primitive deletion implementation for manifests has been added. It only
deletes the manifest file and doesn't attempt to add some of the richer
features request, such as layer cleanup.
2014-11-26 12:57:14 -08:00
Stephen J Day
68944ea9cf Clean up layer storage layout
Previously, discussions were still ongoing about different storage layouts that
could support various access models. This changeset removes a layer of
indirection that was in place due to earlier designs. Effectively, this both
associates a layer with a named repository and ensures that content cannot be
accessed across repositories. It also moves to rely on tarsum as a true
content-addressable identifier, removing a layer of indirection during blob
resolution.
2014-11-25 09:57:43 -08:00
Stephen J Day
3f479b62b4 Refactor layerReader into fileReader
This change separates out the remote file reader functionality from layer
reprsentation data. More importantly, issues with seeking have been fixed and
thoroughly tested.
2014-11-21 15:24:14 -08:00
Stephen J Day
c0fe9d72d1 Various adjustments to digest package for govet/golint 2014-11-19 14:59:05 -08:00
Stephen J Day
1a508d67d9 Move storage package to use Digest type
Mostly, we've made superficial changes to the storage package to start using
the Digest type. Many of the exported interface methods have been changed to
reflect this in addition to changes in the way layer uploads will be initiated.

Further work here is necessary but will come with a separate PR.
2014-11-19 14:39:32 -08:00
Stephen J Day
2637e29e18 Initial implementation of registry LayerService
This change contains the initial implementation of the LayerService to power
layer push and pulls on the storagedriver. The interfaces presented in this
package will be used by the http application to drive most features around
efficient pulls and resumable pushes.

The file storage/layer.go defines the interface interactions. LayerService is
the root type and supports methods to access Layer and LayerUpload objects.
Pull operations are supported with LayerService.Fetch and push operations are
supported with LayerService.Upload and LayerService.Resume. Reads and writes of
layers are split between Layer and LayerUpload, respectively.

LayerService is implemented internally with the layerStore object, which takes
a storagedriver.StorageDriver and a pathMapper instance.

LayerUploadState is currently exported and will likely continue to be as the
interaction between it and layerUploadStore are better understood. Likely, the
layerUploadStore lifecycle and implementation will be deferred to the
application.

Image pushes pulls will be implemented in a similar manner without the
discrete, persistent upload.

Much of this change is in place to get something running and working. Caveats
of this change include the following:

1. Layer upload state storage is implemented on the local filesystem, separate
   from the storage driver. This must be replaced with using the proper backend
   and other state storage. This can be removed when we implement resumable
   hashing and tarsum calculations to avoid backend roundtrips.
2. Error handling is rather bespoke at this time. The http API implementation
   should really dictate the error return structure for the future, so we
   intend to refactor this heavily to support these errors. We'd also like to
   collect production data to understand how failures happen in the system as
   a while before moving to a particular edict around error handling.
3. The layerUploadStore, which manages layer upload storage and state is not
   currently exported. This will likely end up being split, with the file
   management portion being pointed at the storagedriver and the state storage
   elsewhere.
4. Access Control provisions are nearly completely missing from this change.
   There are details around how layerindex lookup works that are related with
   access controls. As the auth portions of the new API take shape, these
   provisions will become more clear.

Please see TODOs for details and individual recommendations.
2014-11-17 17:54:07 -08:00