2014-11-18 00:29:42 +00:00
|
|
|
package storage
|
|
|
|
|
|
|
|
import (
|
2015-02-02 21:01:49 +00:00
|
|
|
"fmt"
|
2015-01-05 07:59:29 +00:00
|
|
|
"io"
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
"path"
|
|
|
|
"time"
|
2014-11-18 00:29:42 +00:00
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
"github.com/Sirupsen/logrus"
|
2015-02-12 00:49:49 +00:00
|
|
|
"github.com/docker/distribution"
|
2015-02-09 22:44:58 +00:00
|
|
|
ctxu "github.com/docker/distribution/context"
|
2014-12-24 00:01:38 +00:00
|
|
|
"github.com/docker/distribution/digest"
|
2015-02-11 02:14:23 +00:00
|
|
|
storagedriver "github.com/docker/distribution/registry/storage/driver"
|
2014-11-18 00:29:42 +00:00
|
|
|
)
|
|
|
|
|
|
|
|
// layerUploadController is used to control the various aspects of resumable
|
|
|
|
// layer upload. It implements the LayerUpload interface.
|
|
|
|
type layerUploadController struct {
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
layerStore *layerStore
|
2014-11-18 00:29:42 +00:00
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
uuid string
|
|
|
|
startedAt time.Time
|
2014-11-18 00:29:42 +00:00
|
|
|
|
2015-03-03 22:47:07 +00:00
|
|
|
// implementes io.WriteSeeker, io.ReaderFrom and io.Closer to satisy
|
|
|
|
// LayerUpload Interface
|
|
|
|
bufferedFileWriter
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
2015-02-12 00:49:49 +00:00
|
|
|
var _ distribution.LayerUpload = &layerUploadController{}
|
2014-11-18 00:29:42 +00:00
|
|
|
|
|
|
|
// UUID returns the identifier for this upload.
|
|
|
|
func (luc *layerUploadController) UUID() string {
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
return luc.uuid
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
func (luc *layerUploadController) StartedAt() time.Time {
|
|
|
|
return luc.startedAt
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
// Finish marks the upload as completed, returning a valid handle to the
|
|
|
|
// uploaded layer. The final size and checksum are validated against the
|
|
|
|
// contents of the uploaded layer. The checksum should be provided in the
|
|
|
|
// format <algorithm>:<hex digest>.
|
2015-02-12 00:49:49 +00:00
|
|
|
func (luc *layerUploadController) Finish(digest digest.Digest) (distribution.Layer, error) {
|
2015-02-09 22:44:58 +00:00
|
|
|
ctxu.GetLogger(luc.layerStore.repository.ctx).Debug("(*layerUploadController).Finish")
|
2015-03-03 22:47:07 +00:00
|
|
|
|
|
|
|
err := luc.bufferedFileWriter.Close()
|
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
canonical, err := luc.validateLayer(digest)
|
2014-11-18 00:29:42 +00:00
|
|
|
if err != nil {
|
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
if err := luc.moveLayer(canonical); err != nil {
|
|
|
|
// TODO(stevvooe): Cleanup?
|
2014-11-18 00:29:42 +00:00
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// Link the layer blob into the repository.
|
2015-03-05 00:31:31 +00:00
|
|
|
if err := luc.linkLayer(canonical, digest); err != nil {
|
2014-11-18 00:29:42 +00:00
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
if err := luc.removeResources(); err != nil {
|
2014-11-18 00:29:42 +00:00
|
|
|
return nil, err
|
|
|
|
}
|
|
|
|
|
2015-01-17 02:24:07 +00:00
|
|
|
return luc.layerStore.Fetch(canonical)
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
// Cancel the layer upload process.
|
|
|
|
func (luc *layerUploadController) Cancel() error {
|
2015-02-09 22:44:58 +00:00
|
|
|
ctxu.GetLogger(luc.layerStore.repository.ctx).Debug("(*layerUploadController).Cancel")
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
if err := luc.removeResources(); err != nil {
|
2014-11-18 00:29:42 +00:00
|
|
|
return err
|
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
luc.Close()
|
|
|
|
return nil
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// validateLayer checks the layer data against the digest, returning an error
|
|
|
|
// if it does not match. The canonical digest is returned.
|
|
|
|
func (luc *layerUploadController) validateLayer(dgst digest.Digest) (digest.Digest, error) {
|
2014-11-19 22:59:05 +00:00
|
|
|
digestVerifier := digest.NewDigestVerifier(dgst)
|
2014-11-18 00:29:42 +00:00
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// TODO(stevvooe): Store resumable hash calculations in upload directory
|
|
|
|
// in driver. Something like a file at path <uuid>/resumablehash/<offest>
|
|
|
|
// with the hash state up to that point would be perfect. The hasher would
|
|
|
|
// then only have to fetch the difference.
|
2014-11-18 00:29:42 +00:00
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// Read the file from the backend driver and validate it.
|
2015-03-03 22:47:07 +00:00
|
|
|
fr, err := newFileReader(luc.bufferedFileWriter.driver, luc.path)
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
if err != nil {
|
2015-02-02 21:01:49 +00:00
|
|
|
return "", err
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
tr := io.TeeReader(fr, digestVerifier)
|
2014-11-18 00:29:42 +00:00
|
|
|
|
2014-11-19 22:39:32 +00:00
|
|
|
// TODO(stevvooe): This is one of the places we need a Digester write
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// sink. Instead, its read driven. This might be okay.
|
2014-11-18 00:29:42 +00:00
|
|
|
|
2014-11-19 22:39:32 +00:00
|
|
|
// Calculate an updated digest with the latest version.
|
2015-03-05 00:31:31 +00:00
|
|
|
canonical, err := digest.FromReader(tr)
|
2014-11-19 22:39:32 +00:00
|
|
|
if err != nil {
|
|
|
|
return "", err
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
2014-11-19 22:39:32 +00:00
|
|
|
if !digestVerifier.Verified() {
|
2015-02-12 00:49:49 +00:00
|
|
|
return "", distribution.ErrLayerInvalidDigest{
|
2015-02-02 21:01:49 +00:00
|
|
|
Digest: dgst,
|
|
|
|
Reason: fmt.Errorf("content does not match digest"),
|
|
|
|
}
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
return canonical, nil
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// moveLayer moves the data into its final, hash-qualified destination,
|
2014-12-05 04:55:59 +00:00
|
|
|
// identified by dgst. The layer should be validated before commencing the
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// move.
|
|
|
|
func (luc *layerUploadController) moveLayer(dgst digest.Digest) error {
|
2015-01-17 02:24:07 +00:00
|
|
|
blobPath, err := luc.layerStore.repository.registry.pm.path(blobDataPathSpec{
|
2014-11-25 00:21:02 +00:00
|
|
|
digest: dgst,
|
2014-11-18 00:29:42 +00:00
|
|
|
})
|
|
|
|
|
|
|
|
if err != nil {
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
return err
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
// Check for existence
|
2015-02-02 21:01:49 +00:00
|
|
|
if _, err := luc.driver.Stat(blobPath); err != nil {
|
2014-11-18 00:29:42 +00:00
|
|
|
switch err := err.(type) {
|
|
|
|
case storagedriver.PathNotFoundError:
|
|
|
|
break // ensure that it doesn't exist.
|
|
|
|
default:
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
return err
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
} else {
|
|
|
|
// If the path exists, we can assume that the content has already
|
|
|
|
// been uploaded, since the blob storage is content-addressable.
|
|
|
|
// While it may be corrupted, detection of such corruption belongs
|
|
|
|
// elsewhere.
|
|
|
|
return nil
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
2015-02-02 21:01:49 +00:00
|
|
|
// If no data was received, we may not actually have a file on disk. Check
|
|
|
|
// the size here and write a zero-length file to blobPath if this is the
|
|
|
|
// case. For the most part, this should only ever happen with zero-length
|
|
|
|
// tars.
|
|
|
|
if _, err := luc.driver.Stat(luc.path); err != nil {
|
|
|
|
switch err := err.(type) {
|
|
|
|
case storagedriver.PathNotFoundError:
|
|
|
|
// HACK(stevvooe): This is slightly dangerous: if we verify above,
|
|
|
|
// get a hash, then the underlying file is deleted, we risk moving
|
|
|
|
// a zero-length blob into a nonzero-length blob location. To
|
|
|
|
// prevent this horrid thing, we employ the hack of only allowing
|
|
|
|
// to this happen for the zero tarsum.
|
2015-03-05 04:26:56 +00:00
|
|
|
if dgst == digest.DigestSha256EmptyTar {
|
2015-02-02 21:01:49 +00:00
|
|
|
return luc.driver.PutContent(blobPath, []byte{})
|
|
|
|
}
|
|
|
|
|
|
|
|
// We let this fail during the move below.
|
|
|
|
logrus.
|
|
|
|
WithField("upload.uuid", luc.UUID()).
|
|
|
|
WithField("digest", dgst).Warnf("attempted to move zero-length content with non-zero digest")
|
|
|
|
default:
|
|
|
|
return err // unrelated error
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
return luc.driver.Move(luc.path, blobPath)
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
2014-11-25 00:21:02 +00:00
|
|
|
// linkLayer links a valid, written layer blob into the registry under the
|
|
|
|
// named repository for the upload controller.
|
2015-03-05 00:31:31 +00:00
|
|
|
func (luc *layerUploadController) linkLayer(canonical digest.Digest, aliases ...digest.Digest) error {
|
|
|
|
dgsts := append([]digest.Digest{canonical}, aliases...)
|
2014-11-18 00:29:42 +00:00
|
|
|
|
2015-03-05 00:31:31 +00:00
|
|
|
// Don't make duplicate links.
|
|
|
|
seenDigests := make(map[digest.Digest]struct{}, len(dgsts))
|
|
|
|
|
|
|
|
for _, dgst := range dgsts {
|
|
|
|
if _, seen := seenDigests[dgst]; seen {
|
|
|
|
continue
|
|
|
|
}
|
|
|
|
seenDigests[dgst] = struct{}{}
|
|
|
|
|
|
|
|
layerLinkPath, err := luc.layerStore.repository.registry.pm.path(layerLinkPathSpec{
|
|
|
|
name: luc.layerStore.repository.Name(),
|
|
|
|
digest: dgst,
|
|
|
|
})
|
|
|
|
|
|
|
|
if err != nil {
|
|
|
|
return err
|
|
|
|
}
|
|
|
|
|
|
|
|
if err := luc.layerStore.repository.registry.driver.PutContent(layerLinkPath, []byte(canonical)); err != nil {
|
|
|
|
return err
|
|
|
|
}
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
2015-03-05 00:31:31 +00:00
|
|
|
return nil
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// removeResources should clean up all resources associated with the upload
|
|
|
|
// instance. An error will be returned if the clean up cannot proceed. If the
|
|
|
|
// resources are already not present, no error will be returned.
|
|
|
|
func (luc *layerUploadController) removeResources() error {
|
2015-01-17 02:24:07 +00:00
|
|
|
dataPath, err := luc.layerStore.repository.registry.pm.path(uploadDataPathSpec{
|
2015-02-12 01:00:42 +00:00
|
|
|
name: luc.layerStore.repository.Name(),
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
uuid: luc.uuid,
|
|
|
|
})
|
2014-11-18 00:29:42 +00:00
|
|
|
|
|
|
|
if err != nil {
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
return err
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
// Resolve and delete the containing directory, which should include any
|
|
|
|
// upload related files.
|
|
|
|
dirPath := path.Dir(dataPath)
|
2015-01-05 07:59:29 +00:00
|
|
|
|
Spool layer uploads to remote storage
To smooth initial implementation, uploads were spooled to local file storage,
validated, then pushed to remote storage. That approach was flawed in that it
present easy clustering of registry services that share a remote storage
backend. The original plan was to implement resumable hashes then implement
remote upload storage. After some thought, it was found to be better to get
remote spooling working, then optimize with resumable hashes.
Moving to this approach has tradeoffs: after storing the complete upload
remotely, the node must fetch the content and validate it before moving it to
the final location. This can double bandwidth usage to the remote backend.
Modifying the verification and upload code to store intermediate hashes should
be trivial once the layer digest format has settled.
The largest changes for users of the storage package (mostly the registry app)
are the LayerService interface and the LayerUpload interface. The LayerService
now takes qualified repository names to start and resume uploads. In corallry,
the concept of LayerUploadState has been complete removed, exposing all aspects
of that state as part of the LayerUpload object. The LayerUpload object has
been modified to work as an io.WriteSeeker and includes a StartedAt time, to
allow for upload timeout policies. Finish now only requires a digest, eliding
the requirement for a size parameter.
Resource cleanup has taken a turn for the better. Resources are cleaned up
after successful uploads and during a cancel call. Admittedly, this is probably
not completely where we want to be. It's recommend that we bolster this with a
periodic driver utility script that scans for partial uploads and deletes the
underlying data. As a small benefit, we can leave these around to better
understand how and why these uploads are failing, at the cost of some extra
disk space.
Many other changes follow from the changes above. The webapp needs to be
updated to meet the new interface requirements.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
2015-01-08 22:24:02 +00:00
|
|
|
if err := luc.driver.Delete(dirPath); err != nil {
|
|
|
|
switch err := err.(type) {
|
|
|
|
case storagedriver.PathNotFoundError:
|
|
|
|
break // already gone!
|
|
|
|
default:
|
|
|
|
// This should be uncommon enough such that returning an error
|
|
|
|
// should be okay. At this point, the upload should be mostly
|
|
|
|
// complete, but perhaps the backend became unaccessible.
|
|
|
|
logrus.Errorf("unable to delete layer upload resources %q: %v", dirPath, err)
|
|
|
|
return err
|
2014-11-18 00:29:42 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return nil
|
|
|
|
}
|