It's possible to run into a race condition in which the enumerator lists
lots of repositories and then starts the long process of enumerating through
them. In that time if someone deletes a repo, the enumerator may error out.
Signed-off-by: Ryan Abrams <rdabrams@gmail.com>
According golang documentation [1]: no goroutine should expect to be
able to acquire a read lock until the initial read lock is released.
[1] https://golang.org/pkg/sync/#RWMutex
Signed-off-by: Gladkov Alexey <agladkov@redhat.com>
at the first iteration, only the following metrics are collected:
- HTTP metrics of each API endpoint
- cache counter for request/hit/miss
- histogram of storage actions, including:
GetContent, PutContent, Stat, List, Move, and Delete
Signed-off-by: tifayuki <tifayuki@gmail.com>
This removes the old global walk function, and changes all
the code to use the per-driver walk functions.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This changes the Walk Method used for catalog enumeration. Just to show
how much an effect this has on our s3 storage:
Original:
List calls: 6839
real 3m16.636s
user 0m0.000s
sys 0m0.016s
New:
ListObjectsV2 Calls: 1805
real 0m49.970s
user 0m0.008s
sys 0m0.000s
This is because it no longer performs a list and stat per item, and instead
is able to use the metadata gained from the list as a replacement to stat.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Move the Walk types into registry/storage/driver, and add a Walk method to each
storage driver. Although this is yet another API to implement, there is a fall
back implementation that relies on List and Stat. For some filesystems this is
very slow.
Also, this WalkDir Method conforms better do a traditional WalkDir (a la filepath).
This change is in preparation for refactoring.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
If tenant or tenantid are passed as env variables, we systematically use Sprint to make sure they are string and not integer as it would make mapstructure fail.
Signed-off-by: Raphaël Enrici <raphael@root-42.com>
Back in the before time, the best practices surrounding usage of Context
weren't quite worked out. We defined our own type to make usage easier.
As this packaged was used elsewhere, it make it more and more
challenging to integrate with the forked `Context` type. Now that it is
available in the standard library, we can just use that one directly.
To make usage more consistent, we now use `dcontext` when referring to
the distribution context package.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
In some conditions, regulator.exit may not send a signal to blocked
regulator.enter.
Let's assume we are in the critical section of regulator.exit and r.available
is equal to 0. And there are three more gorotines. One goroutine also executes
regulator.exit and waits for the lock. Rest run regulator.enter and wait for
the signal.
We send the signal, and after releasing the lock, there will be lock
contention:
1. Wait from regulator.enter
2. Lock from regulator.exit
If the winner is Lock from regulator.exit, we will not send another signal to
unlock the second Wait.
Signed-off-by: Oleg Bulatov <obulatov@redhat.com>
`app.driver.List` on `"/"` is very expensive if registry contains significant amount of images. And the result isn't used anyways.
In most (if not all) storage drivers, `Stat` has a cheaper implementation, so use it instead to achieve the same goal.
Signed-off-by: yixi zhang <yixi@memsql.com>
See #2077 for background.
The PR #1438 which was not reviewed by azure folks basically introduced
a race condition around uploads to the same blob by multiple clients
concurrently as it used the "writer" type for PutContent(), introduced in #1438.
This does chunked upload of blobs using "AppendBlob" type, which was not atomic.
Usage of "writer" type and thus AppendBlobs on metadata files is currently not
concurrency-safe and generally, they are not the right type of blob for the job.
This patch fixes PutContent() to use the atomic upload operation that works
for uploads smaller than 64 MB and creates blobs with "BlockBlob" type. To be
backwards compatible, we query the type of the blob first and if it is not
a "BlockBlob" we delete the blob first before doing an atomic PUT. This
creates a small inconsistency/race window "only once". Once the blob is made
"BlockBlob", it is overwritten with a single PUT atomicallly next time.
Therefore, going forward, PutContent() will be producing BlockBlobs and it
will silently migrate the AppendBlobs introduced in #1438 to BlockBlobs with
this patch.
Tested with existing code side by side, both registries with and without this
patch work fine without breaking each other. So this should be good from a
backwards/forward compatiblity perspective, with a cost of doing an extra
HEAD checking the blob type.
Fixes#2077.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
Updating to a recent version of Azure Storage SDK to be
able to patch some memory leaks through configurable HTTP client
changes which were made possible by recent patches to it.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
The current code determines the header order for the
"string-to-sign" payload by sorting on the concatenation
of headers and values, whereas it should only happen on the
key.
During multipart uploads, since `x-amz-copy-source-range` and
`x-amz-copy-source` headers are present, V2 signatures fail to
validate since header order is swapped.
This patch reverts to the expected behavior.
Signed-off-by: Pierre-Yves Ritschard <pyr@spootnik.org>
Driver was passing connections by copying. Storing
`swift.Connection` as pointer to fix the warnings.
Ref: #2030.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
In GetContent() we read the bytes from a blob but do not close
the underlying response body.
Signed-off-by: Ahmet Alp Balkan <ahmetalpbalkan@gmail.com>
Context should use type values instead of strings.
Updated direct calls to WithValue, but still other uses of string keys.
Update Acl to ACL in s3 driver.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
This change to the S3 Move method uses S3's multipart upload API to copy
objects whose size exceeds a threshold. Parts are copied concurrently.
The level of concurrency, part size, and threshold are all configurable
with reasonable defaults.
Using the multipart upload API has two benefits.
* The S3 Move method can now handle objects over 5 GB, fixing #886.
* Moving most objects, and espectially large ones, is faster. For
example, moving a 1 GB object averaged 30 seconds but now averages 10.
Signed-off-by: Noah Treuhaft <noah.treuhaft@docker.com>
This is already supported by ncw/swift, so we just need to pass the
parameters from the storage driver.
Signed-off-by: Stefan Majewsky <stefan.majewsky@sap.com>
Use the much faster math/rand.Read function where cryptographic
guarantees are not required. The unit test suite should speed up a
little bit but we've already optimized around this, so it may not
matter.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
* Add Object ACL Support to the S3 Storage Backend
Signed-off-by: Frank Chen <frankchn@gmail.com>
* Made changes per @RichardScothern's comments
Signed-off-by: Frank Chen <frankchn@gmail.com>
* Fix Typos
Signed-off-by: Frank Chen <frankchn@gmail.com>
This is similar to waitForSegmentsToShowUp which is called during
Close/Commit. Intuitively, you wouldn't expect missing segments to be a
problem during read operations, since the previous Close/Commit
confirmed that all segments are there.
But due to the distributed nature of Swift, the read request could be
hitting a different storage node of the Swift cluster, where the
segments are still missing.
Load tests on my team's staging Swift cluster have shown this to occur
about once every 100-200 layer uploads when the Swift proxies are under
high load. The retry logic, borrowed from waitForSegmentsToShowUp, fixes
this temporary inconsistency.
Signed-off-by: Stefan Majewsky <stefan.majewsky@sap.com>
This commit refactors base.regulator into the 2.4 interfaces and adds a
filesystem configuration option `maxthreads` to configure the regulator.
By default `maxthreads` is set to 100. This means the FS driver is
limited to 100 concurrent blocking file operations. Any subsequent
operations will block in Go until previous filesystem operations
complete.
This ensures that the registry can never open thousands of simultaneous
threads from os filesystem operations.
Note that `maxthreads` can never be less than 25.
Add test case covering parsable string maxthreads
Signed-off-by: Tony Holdstock-Brown <tony@docker.com>
It's easily possible for a flood of requests to trigger thousands of
concurrent file accesses on the storage driver. Each file I/O call creates
a new OS thread that is not reaped by the Golang runtime. By limiting it
to only 100 at a time we can effectively bound the number of OS threads
in use by the storage driver.
Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Signed-off-by: Tony Holdstock-Brown <tony@docker.com>
Not just when Commit()ing the result. This fixes some errors I observed
when the layer (i.e. the DLO) is Stat()ed immediately after closing,
and reports the wrong file size because the container listing is not
yet up-to-date.
Signed-off-by: Stefan Majewsky <stefan.majewsky@sap.com>
In 326c3a9c49, which was only intended to
be a refactoring commit, the behavior of this block subtly changed so
that unknown types of errors would be swallowed instead of propagated.
I noticed this while investigating an error similar to #1539 aka
docker/docker#21290. It appears that during GetContent() for a
hashstate, the Swift proxy produces an error. Since this error was
silently swallowed, an empty []byte is used to restart the hash, then
producing the digest of the empty string instead of the layer's digest.
This PR will not fix the issue, but it should make the actual error more
visible by propagating it into `blobWriter#resumeDigest' and
'blobWriter#validateBlob', respectively.
Signed-off-by: Stefan Majewsky <stefan.majewsky@sap.com>
This commit adds context-specific documentation on StorageDriver,
StorageDriverFactory, and the factory’s Register func, explaining how
the internal registration mechanism should be used.
This documentation follows from the thread starting at
https://github.com/deis/builder/pull/262/files#r56720200.
cc/ @stevvooe
Signed-off-by: Aaron Schlesinger <aschlesinger@deis.com>
Updates registry storage code to use this for better resumable writes.
Implements this interface for the following drivers:
+ Inmemory
+ Filesystem
+ S3
+ Azure
Signed-off-by: Brian Bland <brian.bland@docker.com>
The Move operation is only used to move uploaded blobs
to their final destination. There is no point in implementing
Move on "folders". Apart from simplifying the code, this also
saves an HTTP request.
Signed-off-by: Arthur Baars <arthur@semmle.com>
Offset can be more than CurrentSize as long as this case is checked
by DriverSuite.testContinueStreamAppend.
Signed-off-by: Anton Tiurin <noxiouz@yandex.ru>
Fixes bug in TestStorageClass for s3aws driver where the "standard" file
was checked for reduced-redundnancy storage.
Signed-off-by: Brian Bland <brian.bland@docker.com>
Keeps old s3 driver, renames to s3goamz, registers new s3 driver as both
"s3" and "s3aws"
Changes cloudfront middleware to use aws-sdk-go
Signed-off-by: Brian Bland <brian.bland@docker.com>
Treats nil parameters the same as unprovided parameters (fixes issues
where certain parameters are printed to "<nil>").
Accepts "true" and "false" string values for boolean parameters.
Signed-off-by: Brian Bland <brian.bland@docker.com>
Uses docker/goamz instead of AdRoll/goamz
Adds a registry UA string param to the storage parameters when
constructing the storage driver for the registry App.
This could be used by other storage drivers as well
Signed-off-by: Brian Bland <brian.bland@docker.com>
Verify that the file(s) have been deleted after calling Delete,
and retry if this is not the case. Furthermore, report the error
if a Delete operation fails.
Signed-off-by: Arthur Baars <arthur@semmle.com>
Remove the requirement of file system access to run GCS unit tests. Deconstruct
the input parameters to take the private key and email which can be specified on
the build system via environment variables.
Signed-off-by: Richard Scothern <richard.scothern@gmail.com>
This fixes both the s3 driver and the oss driver to return the unmunged path
when returning errors.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
Swift returns an empty object list when trying to read a non-existing object path, treat it as a
PathNotFoundError when trying to list a non existing virtual directory.
Signed-off-by: David li <wenquan.li@hpe.com>
RADOS returns a -EIO when trying to read a non-existing OMAP, treat it as a
PathNotFoundError when trying to list a non existing virtual directory.
Signed-off-by: Vincent Giersch <vincent@giersch.fr>
Issue #1186 describes a condition where a null tags response is returned when
using the s3 driver. The issue seems to be related to a missing
PathNotFoundError in s3. This change adds a test for that to get an idea of the
lack of compliance across storage drivers. If the failures are manageable,
we'll add this test condition and fix the s3 driver.
Signed-off-by: Stephen J Day <stephen.day@docker.com>
The Delete method lists objects under the given path and tries to delete
all of them with a bulk delete request. If the path has no objects
underneath it, the body of this request will be empty, which causes
HTTP-level issues. Specifically, Go's HTTP client senses the empty
request buffer and doesn't include a Content-Length, which causes the
Swift server to fail the request.
This commit fixes the problem by avoiding sending empty bulk delete
requests. This is the correct thing to do anyway, since there's no
reason to request deletion of zero objects.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Errors thrown by storage drivers don't have the name of the driver, causing user
confusion about whether the error is coming from Docker or from a storage driver.
This change adds the storage driver name to each error message.
This required changing ErrUnsupportedDriver to a type, leading to code changes
whenever ErrUnsupportedDriver is used. The tests check whether the driver name
appears in the error message.
Signed-off-by: Amit Shukla <amit.shukla@docker.com>
In the S3 storage driver there is currently an initial access permission check by listing the bucket. If this check fails, registry will panic and exit.
However, this check is broken in two ways. First of all it strips the final slash from the root directory path, meaning that any access permissions which limit access to a single directory will fail, because S3 treats the path as strict prefix match. Secondly it fails to strip any leading slash that might be present, unlike the other access places, which means that the path used is different as a leading slash is allowed and significant in a filename in S3.
Since there is also a periodic health check which correctly checks access permissions and shows the error more cleanly, the best solution seems to be to just remove this initial access check.
Signed-off-by: Nuutti Kotivuori <nuutti.kotivuori@poplatek.fi>
When using the RADOS driver, the hierarchy of the files is stored
in OMAPs, but the root OMAP was not created and a call to List("/")
was returning an error instead of returned the first level files
stored. This patches creates an OMAP for "/" and excludes the listed
directory from the list of files returned.
Signed-off-by: Vincent Giersch <vincent@giersch.fr>