distribution/registry/storage/layerupload.go

241 lines
6.9 KiB
Go
Raw Normal View History

Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
package storage
import (
"fmt"
"io"
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
"path"
"time"
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
"github.com/Sirupsen/logrus"
"github.com/docker/distribution"
ctxu "github.com/docker/distribution/context"
"github.com/docker/distribution/digest"
storagedriver "github.com/docker/distribution/registry/storage/driver"
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
)
// layerUploadController is used to control the various aspects of resumable
// layer upload. It implements the LayerUpload interface.
type layerUploadController struct {
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
layerStore *layerStore
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
uuid string
startedAt time.Time
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
// implementes io.WriteSeeker, io.ReaderFrom and io.Closer to satisy
// LayerUpload Interface
bufferedFileWriter
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
var _ distribution.LayerUpload = &layerUploadController{}
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
// UUID returns the identifier for this upload.
func (luc *layerUploadController) UUID() string {
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
return luc.uuid
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
func (luc *layerUploadController) StartedAt() time.Time {
return luc.startedAt
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
// Finish marks the upload as completed, returning a valid handle to the
// uploaded layer. The final size and checksum are validated against the
// contents of the uploaded layer. The checksum should be provided in the
// format <algorithm>:<hex digest>.
func (luc *layerUploadController) Finish(digest digest.Digest) (distribution.Layer, error) {
ctxu.GetLogger(luc.layerStore.repository.ctx).Debug("(*layerUploadController).Finish")
err := luc.bufferedFileWriter.Close()
if err != nil {
return nil, err
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
canonical, err := luc.validateLayer(digest)
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
if err != nil {
return nil, err
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
if err := luc.moveLayer(canonical); err != nil {
// TODO(stevvooe): Cleanup?
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
return nil, err
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// Link the layer blob into the repository.
if err := luc.linkLayer(canonical, digest); err != nil {
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
return nil, err
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
if err := luc.removeResources(); err != nil {
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
return nil, err
}
return luc.layerStore.Fetch(canonical)
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
// Cancel the layer upload process.
func (luc *layerUploadController) Cancel() error {
ctxu.GetLogger(luc.layerStore.repository.ctx).Debug("(*layerUploadController).Cancel")
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
if err := luc.removeResources(); err != nil {
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
return err
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
luc.Close()
return nil
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// validateLayer checks the layer data against the digest, returning an error
// if it does not match. The canonical digest is returned.
func (luc *layerUploadController) validateLayer(dgst digest.Digest) (digest.Digest, error) {
digestVerifier := digest.NewDigestVerifier(dgst)
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// TODO(stevvooe): Store resumable hash calculations in upload directory
// in driver. Something like a file at path <uuid>/resumablehash/<offest>
// with the hash state up to that point would be perfect. The hasher would
// then only have to fetch the difference.
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// Read the file from the backend driver and validate it.
fr, err := newFileReader(luc.bufferedFileWriter.driver, luc.path)
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
if err != nil {
return "", err
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
tr := io.TeeReader(fr, digestVerifier)
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
// TODO(stevvooe): This is one of the places we need a Digester write
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// sink. Instead, its read driven. This might be okay.
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
// Calculate an updated digest with the latest version.
canonical, err := digest.FromReader(tr)
if err != nil {
return "", err
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
if !digestVerifier.Verified() {
return "", distribution.ErrLayerInvalidDigest{
Digest: dgst,
Reason: fmt.Errorf("content does not match digest"),
}
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
return canonical, nil
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// moveLayer moves the data into its final, hash-qualified destination,
// identified by dgst. The layer should be validated before commencing the
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// move.
func (luc *layerUploadController) moveLayer(dgst digest.Digest) error {
blobPath, err := luc.layerStore.repository.registry.pm.path(blobDataPathSpec{
digest: dgst,
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
})
if err != nil {
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
return err
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
// Check for existence
if _, err := luc.driver.Stat(blobPath); err != nil {
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
switch err := err.(type) {
case storagedriver.PathNotFoundError:
break // ensure that it doesn't exist.
default:
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
return err
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
} else {
// If the path exists, we can assume that the content has already
// been uploaded, since the blob storage is content-addressable.
// While it may be corrupted, detection of such corruption belongs
// elsewhere.
return nil
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
// If no data was received, we may not actually have a file on disk. Check
// the size here and write a zero-length file to blobPath if this is the
// case. For the most part, this should only ever happen with zero-length
// tars.
if _, err := luc.driver.Stat(luc.path); err != nil {
switch err := err.(type) {
case storagedriver.PathNotFoundError:
// HACK(stevvooe): This is slightly dangerous: if we verify above,
// get a hash, then the underlying file is deleted, we risk moving
// a zero-length blob into a nonzero-length blob location. To
// prevent this horrid thing, we employ the hack of only allowing
// to this happen for the zero tarsum.
if dgst == digest.DigestSha256EmptyTar {
return luc.driver.PutContent(blobPath, []byte{})
}
// We let this fail during the move below.
logrus.
WithField("upload.uuid", luc.UUID()).
WithField("digest", dgst).Warnf("attempted to move zero-length content with non-zero digest")
default:
return err // unrelated error
}
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
return luc.driver.Move(luc.path, blobPath)
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
// linkLayer links a valid, written layer blob into the registry under the
// named repository for the upload controller.
func (luc *layerUploadController) linkLayer(canonical digest.Digest, aliases ...digest.Digest) error {
dgsts := append([]digest.Digest{canonical}, aliases...)
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
// Don't make duplicate links.
seenDigests := make(map[digest.Digest]struct{}, len(dgsts))
for _, dgst := range dgsts {
if _, seen := seenDigests[dgst]; seen {
continue
}
seenDigests[dgst] = struct{}{}
layerLinkPath, err := luc.layerStore.repository.registry.pm.path(layerLinkPathSpec{
name: luc.layerStore.repository.Name(),
digest: dgst,
})
if err != nil {
return err
}
if err := luc.layerStore.repository.registry.driver.PutContent(layerLinkPath, []byte(canonical)); err != nil {
return err
}
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
return nil
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// removeResources should clean up all resources associated with the upload
// instance. An error will be returned if the clean up cannot proceed. If the
// resources are already not present, no error will be returned.
func (luc *layerUploadController) removeResources() error {
dataPath, err := luc.layerStore.repository.registry.pm.path(uploadDataPathSpec{
name: luc.layerStore.repository.Name(),
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
uuid: luc.uuid,
})
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
if err != nil {
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
return err
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
// Resolve and delete the containing directory, which should include any
// upload related files.
dirPath := path.Dir(dataPath)
Spool layer uploads to remote storage To smooth initial implementation, uploads were spooled to local file storage, validated, then pushed to remote storage. That approach was flawed in that it present easy clustering of registry services that share a remote storage backend. The original plan was to implement resumable hashes then implement remote upload storage. After some thought, it was found to be better to get remote spooling working, then optimize with resumable hashes. Moving to this approach has tradeoffs: after storing the complete upload remotely, the node must fetch the content and validate it before moving it to the final location. This can double bandwidth usage to the remote backend. Modifying the verification and upload code to store intermediate hashes should be trivial once the layer digest format has settled. The largest changes for users of the storage package (mostly the registry app) are the LayerService interface and the LayerUpload interface. The LayerService now takes qualified repository names to start and resume uploads. In corallry, the concept of LayerUploadState has been complete removed, exposing all aspects of that state as part of the LayerUpload object. The LayerUpload object has been modified to work as an io.WriteSeeker and includes a StartedAt time, to allow for upload timeout policies. Finish now only requires a digest, eliding the requirement for a size parameter. Resource cleanup has taken a turn for the better. Resources are cleaned up after successful uploads and during a cancel call. Admittedly, this is probably not completely where we want to be. It's recommend that we bolster this with a periodic driver utility script that scans for partial uploads and deletes the underlying data. As a small benefit, we can leave these around to better understand how and why these uploads are failing, at the cost of some extra disk space. Many other changes follow from the changes above. The webapp needs to be updated to meet the new interface requirements. Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
if err := luc.driver.Delete(dirPath); err != nil {
switch err := err.(type) {
case storagedriver.PathNotFoundError:
break // already gone!
default:
// This should be uncommon enough such that returning an error
// should be okay. At this point, the upload should be mostly
// complete, but perhaps the backend became unaccessible.
logrus.Errorf("unable to delete layer upload resources %q: %v", dirPath, err)
return err
Initial implementation of registry LayerService This change contains the initial implementation of the LayerService to power layer push and pulls on the storagedriver. The interfaces presented in this package will be used by the http application to drive most features around efficient pulls and resumable pushes. The file storage/layer.go defines the interface interactions. LayerService is the root type and supports methods to access Layer and LayerUpload objects. Pull operations are supported with LayerService.Fetch and push operations are supported with LayerService.Upload and LayerService.Resume. Reads and writes of layers are split between Layer and LayerUpload, respectively. LayerService is implemented internally with the layerStore object, which takes a storagedriver.StorageDriver and a pathMapper instance. LayerUploadState is currently exported and will likely continue to be as the interaction between it and layerUploadStore are better understood. Likely, the layerUploadStore lifecycle and implementation will be deferred to the application. Image pushes pulls will be implemented in a similar manner without the discrete, persistent upload. Much of this change is in place to get something running and working. Caveats of this change include the following: 1. Layer upload state storage is implemented on the local filesystem, separate from the storage driver. This must be replaced with using the proper backend and other state storage. This can be removed when we implement resumable hashing and tarsum calculations to avoid backend roundtrips. 2. Error handling is rather bespoke at this time. The http API implementation should really dictate the error return structure for the future, so we intend to refactor this heavily to support these errors. We'd also like to collect production data to understand how failures happen in the system as a while before moving to a particular edict around error handling. 3. The layerUploadStore, which manages layer upload storage and state is not currently exported. This will likely end up being split, with the file management portion being pointed at the storagedriver and the state storage elsewhere. 4. Access Control provisions are nearly completely missing from this change. There are details around how layerindex lookup works that are related with access controls. As the auth portions of the new API take shape, these provisions will become more clear. Please see TODOs for details and individual recommendations.
2014-11-18 00:29:42 +00:00
}
}
return nil
}