Commit graph

297 commits

Author SHA1 Message Date
Nick Craig-Wood
f7e3115955 s3: fix Wasabi HEAD requests returning stale data by using only 1 transport
In this commit

fc5b14b620 s3: Added `--s3-disable-http2` to disable http/2

We created our own transport so we could disable http/2. However the
added function is called twice meaning that we create two HTTP
transports. This didn't happen with the original code because the
default transport is cached by fshttp.

Rclone normally does a PUT followed by a HEAD request to check an
upload has been successful.

With the two transports, the PUT and the HEAD were being done on
different HTTP transports. This means that it wasn't re-using the same
HTTP connection, so the HEAD request showed the previous object value.
This caused rclone to declare the upload was corrupted, delete the
object and try again.

This patch makes sure we only create one transport and use it for both
PUT and HEAD requests which fixes the problem with Wasabi.

See: https://forum.rclone.org/t/each-time-rclone-is-run-1-3-fails-2-3-succeeds/22545
2021-03-05 15:34:56 +00:00
Nick Craig-Wood
b029fb591f s3: fix failed to create file system with folder level permissions policy
Before this change, if folder level access permissions policy was in
use, with trailing `/` marking the folders then rclone would HEAD the
path without a trailing `/` to work out if it was a file or a folder.
This returned a permission denied error, which rclone returned to the
user.

    Failed to create file system for "s3:bucket/path/": Forbidden: Forbidden
        status code: 403, request id: XXXX, host id:

Previous to this change

53aa03cc44 s3: complete sse-c implementation

rclone would assume any errors when HEAD-ing the object implied it
didn't exist and this test would not fail.

This change reverts the functionality of the test to work as it did
before, meaning any errors on HEAD will make rclone assume the object
does not exist and the path is referring to a directory.

Fixes #4990
2021-02-24 20:35:44 +00:00
Dmitry Chepurovskiy
1605f9e14d
s3: Fix shared_credentials_file auth
S3 backend shared_credentials_file option wasn't working neither from
config option nor from command line option. This was caused cause
shared_credentials_file_provider works as part of chain provider, but in
case user haven't specified access_token and access_key we had removed
(set nil) to credentials field, that may contain actual credentials got
from ChainProvider.

AWS_SHARED_CREDENTIALS_FILE env varible as far as i understood worked,
cause aws_sdk code handles it as one of default auth options, when
there's not configured credentials.
2021-02-17 12:04:26 +00:00
Nick Craig-Wood
bbe791a886 swift: update github.com/ncw/swift to v2.0.0
The update to v2 of the swift library introduces a context parameter
to each function. This required a lot of mostly mechanical changes
adding context parameters.

See: https://github.com/ncw/swift/issues/159
See: https://github.com/ncw/swift/issues/161
2021-02-03 20:23:37 +00:00
Nick Craig-Wood
bcac8fdc83 Use http.NewRequestWithContext where possible after go1.13 minimum version 2021-02-03 17:41:27 +00:00
Nick Craig-Wood
8b41dfa50a s3: add --s3-no-head parameter to minimise transactions on upload
See: https://forum.rclone.org/t/prevent-head-on-amazon-s3-family/21935
2021-02-02 10:07:48 +00:00
Nick Craig-Wood
3877df4e62 s3: update help for --s3-no-check-bucket #4913 2021-01-10 17:54:19 +00:00
kelv
9e87f5090f s3: add requester pays option - fixes #301 2020-12-27 15:43:44 +00:00
Anagh Kumar Baranwal
8a429d12cf s3: Added error handling for error code 429 indicating too many requests
Signed-off-by: Anagh Kumar Baranwal <6824881+darthShadow@users.noreply.github.com>
2020-12-01 18:13:31 +00:00
Nick Craig-Wood
9d574c0d63 fshttp: read config from ctx not passed in ConfigInfo #4685 2020-11-26 16:40:12 +00:00
Nick Craig-Wood
2e21c58e6a fs: deglobalise the config #4685
This is done by making fs.Config private and attaching it to the
context instead.

The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
2020-11-26 16:40:12 +00:00
Nick Craig-Wood
76ee3060d1 s3: Add MD5 metadata to objects uploaded with SSE-AWS/SSE-C
Before this change, small objects uploaded with SSE-AWS/SSE-C would
not have MD5 sums.

This change adds metadata for these objects in the same way that the
metadata is stored for multipart uploaded objects.

See: #1824 #2827
2020-11-25 12:28:02 +00:00
Nick Craig-Wood
4bb241c435 s3: store md5 in the Object rather than the ETag
This enables us to set the md5 to cache it.

See: #1824 #2827
2020-11-25 12:28:02 +00:00
Nick Craig-Wood
a06f4c2514 s3: fix hashes on small files with aws:kms and sse-c
If rclone is configured for server side encryption - either aws:kms or
sse-c (but not sse-s3) then don't treat the ETags returned on objects
as MD5 hashes.

This fixes being able to upload small files.

Fixes #1824
2020-11-25 12:28:02 +00:00
Nick Craig-Wood
53aa03cc44 s3: complete sse-c implementation
This now can complete all operations with SSE-C enabled.

Fixes #2827
See: https://forum.rclone.org/t/issues-with-aws-s3-sse-c-getting-strange-log-entries-and-errors/20553
2020-11-25 12:28:02 +00:00
Nick Craig-Wood
8b96933e58 fs: Add context to fs.Features.Fill & fs.Features.Mask #3257 #4685 2020-11-09 18:05:54 +00:00
Nick Craig-Wood
d846210978 fs: Add context to NewFs #3257 #4685
This adds a context.Context parameter to NewFs and related calls.

This is necessary as part of reading config from the context -
backends need to be able to read the global config.
2020-11-09 18:05:54 +00:00
Josh Soref
0a6196716c docs: style: avoid double-nesting parens
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
a15f50254a docs: grammar: if, then
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
5d4f77a022 docs: grammar: Oxford comma
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
a089de0964 docs: grammar: uncountable: links
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
3068ae8447 docs: grammar: count agreement: files
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
67ff153b0c docs: grammar: article: a-file
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
e4a87f772f docs: spelling: e.g.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
d4f38d45a5 docs: spelling: high-speed
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
bbe7eb35f1 docs: spelling: server-side
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
d0888edc0a Spelling fixes
Fix spelling of: above, already, anonymous, associated,
authentication, bandwidth, because, between, blocks, calculate,
candidates, cautious, changelog, cleaner, clipboard, command,
completely, concurrently, considered, constructs, corrupt, current,
daemon, dependencies, deprecated, directory, dispatcher, download,
eligible, ellipsis, encrypter, endpoint, entrieslist, essentially,
existing writers, existing, expires, filesystem, flushing, frequently,
hierarchy, however, implementation, implements, inaccurate,
individually, insensitive, longer, maximum, metadata, modified,
multipart, namedirfirst, nextcloud, obscured, opened, optional,
owncloud, pacific, passphrase, password, permanently, persimmon,
positive, potato, protocol, quota, receiving, recommends, referring,
requires, revisited, satisfied, satisfies, satisfy, semver,
serialized, session, storage, strategies, stringlist, successful,
supported, surprise, temporarily, temporary, transactions, unneeded,
update, uploads, wrapped

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-14 15:21:31 +01:00
Anagh Kumar Baranwal
fc5b14b620
s3: Added --s3-disable-http2 to disable http/2
Fixes #4673

Signed-off-by: Anagh Kumar Baranwal <6824881+darthShadow@users.noreply.github.com>
2020-10-13 17:11:22 +01:00
Anagh Kumar Baranwal
e3a5bb9b48 s3: Add missing regions for AWS
Signed-off-by: Anagh Kumar Baranwal <6824881+darthShadow@users.noreply.github.com>
2020-10-06 16:54:42 +01:00
Christopher Stewart
f3cf6fcdd7
s3: fix spelling mistake
Fix spelling mistake "patific" => "pacific"
2020-09-18 12:03:13 +01:00
wjielai
22937e8982
docs: add Tencent COS to s3 provider list - fixes #4468
* add Tencent COS to s3 provider list.

Co-authored-by: wjielai <wjielai@tencent.com>
2020-09-08 16:34:25 +01:00
Nick Craig-Wood
725ae91387 s3: reduce the default --s3-copy-cutoff to < 5GB
The maximum value for the --s3--copy-cutoff should be 5GiB as tested
with AWS S3.

However b2 have implemented this as 5GB rather than 5GiB so having the
default at 5 GiB makes the b2s3 server side copy of a large file by
default.

This patch sets the default to 4768 MiB which is slightly less than
5GB.

This should have very little effect on anything.

If in future rclone can lower this limit more if Copy can multithread.

See: https://forum.rclone.org/t/copying-files-within-a-b2-bucket/16680/76
2020-09-01 18:53:29 +01:00
Nick Craig-Wood
b7dd3ce608 s3: preserve metadata when doing multipart copy
Before this change the s3 multipart server side copy was not
preserving the metadata of the object. This was most noticeable
because the modtime was not preserved.

This change fetches the metadata from the object before starting the
copy and overwrites it if requires.

It will also mean any other metadata is preserved.

See: https://forum.rclone.org/t/copying-files-within-a-b2-bucket/16680/70
2020-09-01 18:39:30 +01:00
Egor Margineanu
921e384c4d
s3: update IBM COS endpoints - fixes #4522 2020-08-30 17:21:11 +01:00
Nick Craig-Wood
801a820c54 s3: fix detection of bucket existing
This reverts part of

151f03378f s3: fix upload of single files into buckets without create permission

This erroneously assumed that a HEAD request on a non existent object
would return "NotFound" if the bucket was found. In fact it returns
"NotFound" when the bucket isn't found also.

This will break the fix for #4297 - however that can be made to work
using the new --s3-assume-bucket-exists flag
2020-08-21 13:28:08 +01:00
Nick Craig-Wood
d5f4c74697 s3: implement cleanup and backend command to list & remove multipart uploads
This implements `rclone cleanup` to remove multipart uploads over 24
hours old. It also implements the backend command
`list-multipart-uploads` to see which ones are available and `cleanup`
to delete them with a configurable expiry interval.

See #4302
2020-07-28 11:37:46 +01:00
Nick Craig-Wood
2288a5c617 s3: implement profile and shared_credentials_file options
It is impossible to use two different profiles at the same time -
these config vars enable that.

See: https://forum.rclone.org/t/s3-source-destination-named-profile/17417
2020-07-28 11:32:32 +01:00
Nick Craig-Wood
f406dbbb4d s3: add --s3-no-check-bucket for minimising rclone transactions and perms
Fixes #4449
2020-07-27 17:49:40 +01:00
Nick Craig-Wood
80d2f38192 s3: fix bucket Region auto detection when Region unset in config #2915
Previous to this fix if Region was not set and Endpoint was not set
then we set the endpoint to "https://s3.amazonaws.com/".

This is unecessary because if the Region alone isn't set then we set
it to "us-east-1" which has the same endpoint.

Having the endpoint set breaks the bucket region auto detection with
the error "Failed to update region for bucket: can't set region to
"xxx" as endpoint is set".

This fix removes that check.
2020-07-10 17:16:59 +01:00
Nick Craig-Wood
c820576329 fs: define SlowModTime and SlowHash features in the relevant backends 2020-06-30 12:01:36 +01:00
David
9058ec32e1 s3: Use regional s3 us-east-1 endpoint 2020-06-26 16:25:52 +01:00
Nick Craig-Wood
fd7c63bc78 s3: add backend restore command to restore objects from GLACIER
See: https://forum.rclone.org/t/rclone-settier-fails-with-scaleway-entitytoolarge/17384
2020-06-25 21:33:23 +01:00
Nick Craig-Wood
5f75444ef6 s3: cancel in progress multipart uploads and copies on rclone exit #4300 2020-06-25 12:55:56 +01:00
Nick Craig-Wood
85bcacac90 s3: Cap expiry duration to 1 Week and return error when sharing dir 2020-06-18 17:50:50 +01:00
Vincent Feltz
f4d7e41f24 s3: add Scaleway provider - fixes #4338 2020-06-13 11:55:37 +01:00
Nick Craig-Wood
2ea15a72bc s3: fix --header-upload - Fixes #4303
Before this change we were setting the headers on the PUT
request for normal and multipart uploads. For normal uploads this caused the error

    403 Forbidden: There were headers present in the request which were not signed

After this fix we set the headers in the object upload request itself
as the s3 SDK expects.

This means that we only support a limited range of headers

- Cache-Control
- Content-Disposition
- Content-Encoding
- Content-Language
- Content-Type
- X-Amz-Tagging
- X-Amz-Meta-

Note for the last of those are for setting custom metadata in the form
"X-Amz-Meta-Key: value".

This now works for multipart uploads and single part uploads

See also #59
2020-06-10 12:28:48 +01:00
Kamil Trzciński
7458d37d2a
s3: add max_upload_parts support - fixes #4159
* s3: add `max_upload_parts` support

This allows to configure a maximum amount of chunks used to upload file:

- Support Scaleway which has a limit of 1k chunks currently
- Reduce a cost on S3 when each request costs some money at the expense of memory used

Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2020-06-08 18:22:34 +01:00
Roman Kredentser
c0521791db s3: implement link sharing with PublicLink 2020-06-05 14:51:05 +01:00
Nick Craig-Wood
151f03378f s3: fix upload of single files into buckets without create permission
Before this change, attempting to upload a single file into an s3
bucket which did not have create permission gave AccessDenied: Access
Denied error when it tried to create the bucket.

This was masked until e2bf91452a was
fixed.

This fix marks the bucket as OK if a fetch on an object indicates it
is OK. This stops rclone thinking it has to create the bucket in the
first place.

Fixes #4297
2020-06-02 14:33:21 +01:00
Martin Michlmayr
4aee962233 doc: fix typos throughout docs and code 2020-05-20 15:54:51 +01:00
Nick Craig-Wood
8a58e0235d s3: don't leak memory or tokens in edge cases for multipart upload 2020-05-14 07:48:18 +01:00
Nick Craig-Wood
4e869e03f7 s3: improve docs for --s3-disable-checksum 2020-04-28 17:47:10 +01:00
Tim Gallant
5cb7229a16 s3: add support for HTTPOption 2020-04-23 11:07:21 +01:00
Nick Craig-Wood
f8039deb7c s3: fix detection of BucketAlreadyOwnedByYou and BucketAlreadyExists error
This was being silently ignored until this commit

e2bf91452a s3: report errors on bucket creation (mkdir) correctly
2020-04-22 18:14:03 +01:00
Nick Craig-Wood
e2bf91452a s3: report errors on bucket creation (mkdir) correctly
Before this fix errors on bucket creation were being silently
swallowed.

See: https://forum.rclone.org/t/rclone-with-brand-new-aws-account-for-s3/15590
2020-04-15 13:13:13 +01:00
Michał Matczuk
6893ce0bbf s3: do not resize buf on put to memBuf
This is handled by Pool implementation.
2020-04-11 16:35:48 +01:00
Michał Matczuk
399cf18013 s3: use single memory pool
Previously we had a map of pools for different chunk sizes.
In practice the mapping is not very useful and requires a lock.
Pools of size other that ChunkSize can only happen when we have a huge file (over 10k * ChunkSize).
We need to have a bunch of identically sized huge files.
In such case most likely ChunkSize should be increased.

The mapping and its lock is replaced with a single initialised pool for ChunkSize, in other cases pool is allocated and freed on per file basis.
2020-04-11 16:34:05 +01:00
Jack Anderson
815ae7df45 backend/s3: add SSE-C support for AWS, Ceph, and MinIO 2020-03-31 18:16:45 +01:00
Nick Craig-Wood
a5c2f2c138 s3: ignore directory markers at the root also
See: https://forum.rclone.org/t/issue-with-lsf-r-files-only-first-line-is-blank/15229/
2020-03-31 11:45:52 +01:00
Nick Craig-Wood
dc06973796 s3: use rclone's low level retries instead of AWS SDK to fix listing retries
In 5470d34740 "backend/s3: use low-level-retries as the number
of SDK retries" we switched over to using the AWS SDK low level
retries instead of rclone's low level retry logic.

This had the unfortunate attempt that retrying listings to correct XML
Syntax errors failed on non S3 backends such as CEPH. The AWS SDK was
also retrying the XML Syntax error request which doesn't make sense.

This change turns off the AWS SDK retries in favour of just using
rclone's retry logic.
2020-03-14 18:04:24 +00:00
Joachim Brandon LeBlanc
132ce94139
backend/s3: use the provided size parameter when allocating a new memory pool - fixes #4047 (#4049) 2020-03-09 16:56:21 +00:00
Lars Lehtonen
fef2c6bf7a backend/s3: replace deprecated session.New() with session.NewSession() 2020-03-05 11:34:10 +00:00
Aleksandar Jankovic
708b967f15 backend/s3: fix multipart abort context
S3 couldn't abort multi-part upload when context is canceled
because canceled context prevents abort request from being sent.
2020-02-25 12:11:32 +01:00
Aleksandar Janković
5470d34740
backend/s3: use low-level-retries as the number of SDK retries
Amazon S3 is built to handle different kinds of workloads.
In rare cases where S3 is not able to scale for whatever reason users
will face status 500 errors.
Main mechanism for handling these errors are retries.
Amount of needed retries varies for each different use case.

This change is making retries for s3 backend configurable by using
--low-level-retries option.
2020-02-24 16:43:44 +01:00
Maciej Zimnoch
ac9cb50fdb backend/s3: use memory pool for buffer allocations
Currently each multipart upload allocated his own buffers, which after
file upload was garbaged. Next files couldn't leverage already allocated
memory which resulted in inefficent memory management. This change
introduces backend memory pool keeping memory chunks which can be
used during object operations.

Fixes #3967
2020-02-24 13:32:32 +01:00
Michał Matczuk
e75c1f70bb backend/s3: Added 500 as retryErrorCode
The error code 500 Internal Error indicates that Amazon S3 is unable to handle the request at that time. The error code 503 Slow Down typically indicates that the requests to the S3 bucket are very high, exceeding the request rates described in Request Rate and Performance Guidelines.

Because Amazon S3 is a distributed service, a very small percentage of 5xx errors are expected during normal use of the service. All requests that return 5xx errors from Amazon S3 can and should be retried, so we recommend that applications making requests to Amazon S3 have a fault-tolerance mechanism to recover from these errors.

https://aws.amazon.com/premiumsupport/knowledge-center/http-5xx-errors-s3/
2020-02-12 11:43:18 +00:00
Michał Matczuk
19a4d74ee7 backend/s3: Fail fast multipart upload
When a part upload request fails error is returned and gCtx is cancelled.
This does not prevent from other parts being tried.
They immediately fail due to a canceled context, but are retried by rclone anyway...

Example AWS debug output

```
-----------------------------------------------------
2020/02/11 14:12:17 DEBUG: Retrying Request s3/UploadPart, attempt 4
2020/02/11 14:12:17 DEBUG: Request s3/UploadPart Details:
---[ REQUEST POST-SIGN ]-----------------------------
PUT /backuptest-rclone/huge/file.db?partNumber=11&uploadId=190939b4-3c43-4b98-ac11-92303e3f11b0 HTTP/1.1
Host: 192.168.100.99:9000
User-Agent: aws-sdk-go/1.23.8 (go1.13.1; linux; amd64)
Content-Length: 5242880
Authorization: AWS4-HMAC-SHA256 Credential=miniouser/20200211/us-east-1/s3/aws4_request, SignedHeaders=content-length;content-md5;expect;host;x-amz-content-sha256;x-amz-date, Signature=3fc03a01f651cec09b05290459e9ceb26db9a8aa00c4e1b16e8cf5617eb81da8
Content-Md5: XzY+DlipXwbL6bvGYsXftg==
Expect: 100-Continue
X-Amz-Content-Sha256: c036cbb7553a909f8b8877d4461924307f27ecb66cff928eeeafd569c3887e29
X-Amz-Date: 20200211T131217Z
Accept-Encoding: gzip

-----------------------------------------------------
http://192.168.100.99:9000/backuptest-rclone/huge/file.db?partNumber=11&uploadId=190939b4-3c43-4b98-ac11-92303e3f11b0
2020/02/11 14:12:17 DEBUG: Response s3/UploadPart Details:
---[ RESPONSE ]--------------------------------------
HTTP/1.1 500 InternalServerError
Content-Length: 0
-----------------------------------------------------
UploadPartWithContext() error InternalError: We encountered an internal error. Please try again
	status code: 500, request id: , host id:

2020/02/11 14:12:18 DEBUG ERROR: Request s3/UploadPart:
---[ REQUEST DUMP ERROR ]-----------------------------
context canceled
------------------------------------------------------
UploadPartWithContext() error RequestCanceled: request context canceled
caused by: context canceled
2020/02/11 14:12:20 DEBUG ERROR: Request s3/UploadPart:
---[ REQUEST DUMP ERROR ]-----------------------------
context canceled
------------------------------------------------------
UploadPartWithContext() error RequestCanceled: request context canceled
caused by: context canceled
2020/02/11 14:12:22 DEBUG ERROR: Request s3/UploadPart:
---[ REQUEST DUMP ERROR ]-----------------------------
context canceled
------------------------------------------------------
UploadPartWithContext() error RequestCanceled: request context canceled
caused by: context canceled
```

This adds a fail fast behaviour in case the context was cancelled.
2020-02-12 11:40:34 +00:00
Nick Craig-Wood
90377f5e65 s3: Specify that Minio supports URL encoding in listings
Thanks to @harshavardhana for pointing this out

See #3934 for background
2020-02-09 12:03:20 +00:00
Dave Koston
9f99c20232 s3: Add StackPath Object Storage Support 2020-01-31 16:05:44 +00:00
Nick Craig-Wood
bafe7d5a73 backends: move encoding definitions from fs/encodings 2020-01-16 14:40:36 +00:00
Nick Craig-Wood
3c620d521d backend: adjust backends to have encoding parameter
Fixes #3761
Fixes #3836
Fixes #3841
2020-01-16 14:40:36 +00:00
Nick Craig-Wood
b6e86b2c7f s3: fix missing x-amz-meta-md5chksum headers for multipart uploads
This reverts "s3: fix DisableChecksum condition" which introduced the
problem.

This reverts commit c05bb63f96.

The code was correct as it stands - the comment was incorrect and this
commit updates it.

See: https://forum.rclone.org/t/s3-upload-md5-check-sum/13706
2020-01-07 19:39:39 +00:00
Tennix
15d19131bd s3: use aws web identity role provider 2020-01-05 19:49:31 +00:00
Nick Craig-Wood
9d993e584b s3: force path style bucket access to off for AWS deprecation
AWS are deprecating path style bucket access so rclone should stop
using it by default for this provider.

This change shouldn't break any workflows as all AWS endpoints support
virtual hosted style lookups of buckets.  It may even improve
performance.

See: https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/
2020-01-05 17:53:45 +00:00
Nick Craig-Wood
7242c7ce95 s3: fix multipart upload uploading 0 length files
This regression was introduced by the recent re-write of the s3
multipart upload code.
2020-01-05 12:32:55 +00:00
Nick Craig-Wood
7e6fac8b1e s3: re-implement multipart upload to fix memory issues
There have been quite a few reports of problems with the multipart
uploader using too much memory and not retrying possible errors.

Before this change the multipart uploader used the s3manager
abstraction in the AWS SDK.  There are numerous bug reports of this
using up too much memory.

This change re-implements a much simplified version of the s3manager
code specialized for rclone's purposes.

This should use much less memory and retry chunks properly.

See: https://forum.rclone.org/t/memory-usage-s3-alike-to-glacier-without-big-directories/13563
See: https://forum.rclone.org/t/copy-from-local-to-s3-has-high-memory-usage/13405
See: https://forum.rclone.org/t/big-file-upload-to-s3-fails/13575
2020-01-03 22:19:28 +00:00
Thomas Kriechbaumer
584e705c0c s3: introduce list_chunk option for bucket listing
The S3 ListObject API returns paginated bucket listings, with
"MaxKeys" items for each GET call.

The default value is 1000 entries, but for buckets with millions of
objects it might make sense to request more elements per request, if
the backend supports it. This commit adds a "list_chunk" option for
the user to specify a lower or higher value.

This commit does not add safe guards around this value - if a user
decides to request a too large list, it might result in connection
timeouts (on the server or client).

In AWS S3, there is a fixed limit of 1000, some other services might
have one too.  In Ceph, this can be configured in RadosGW.
2020-01-02 12:15:01 +00:00
Outvi V
db1c7f9ca8 s3: Add new region Asia Patific (Hong Kong) 2020-01-02 11:10:48 +00:00
Nick Craig-Wood
0ecb8bc2f9 s3: fix url decoding of NextMarker - fixes #3799
Before this patch we were failing to URL decode the NextMarker when
url encoding was used for the listing.

The result of this was duplicated listings entries for directories
with >1000 entries where the NextMarker was a file containing a space.
2019-12-12 13:33:30 +00:00
Nick Craig-Wood
0d10640aaa s3: add --s3-copy-cutoff for size to switch to multipart copy
Before this change we used the same (relatively low limits) for server
side copy as we did for multipart uploads.  It doesn't make sense to
use the same limits since no data is being downloaded or uploaded for
a server side copy.

This change introduces a new parameter --s3-copy-cutoff to control
when the switch from single to multipart server size copy happens and
defaults it to the maximum 5GB.

This makes server side copies much more efficient.

It also fixes the erroneous error when trying to set the modification
time of a file bigger than 5GB.

See #3778
2019-12-03 10:37:55 +00:00
Nick Craig-Wood
f4746f5064 s3: fix multipart copy - fixes #3778
Before this change multipart copies were giving the error

    Range specified is not valid for source object of size

This was due to an off by one error in the range source introduced in
7b1274e29a "s3: support for multipart copy"
2019-12-03 10:37:55 +00:00
Aleksandar Janković
c05bb63f96 s3: fix DisableChecksum condition 2019-12-02 15:15:59 +00:00
Nick Craig-Wood
9b5308144f s3: Reduce memory usage streaming files by reducing max stream upload size
Before this change rclone would allow the user to stream (eg with
rclone mount, rclone rcat or uploading google photos or docs) 5TB
files.  This meant that rclone allocated 4 * 525 MB buffers per
transfer which is way too much memory by default.

This change makes rclone use the configured chunk size for streamed
uploads.  This is 5MB by default which means that rclone can stream
upload files up to 48GB by default staying below the 10,000 chunks
limit.

This can be increased with --s3-chunk-size if necessary.

If rclone detects that a file is being streamed to s3 it will make a
single NOTICE level log stating the limitation.

This fixes the enormous memory usage.

Fixes #3568
See: https://forum.rclone.org/t/how-much-memory-does-rclone-need/12743
2019-11-09 15:55:19 +00:00
Aleksandar Jankovic
4b20afa94a backend/s3: fix ExpiryWindow value
ExpiryWindow accepts duration but it was set to value 3.
This changes it to 3 * time.Minute since default is 5 min.
2019-11-05 13:55:55 +00:00
Nick Craig-Wood
ab895390f4 s3: fix nil pointer reference if no metadata returned for object
Fixes #3651 Fixes #3652
2019-10-25 13:45:47 +01:00
庄天翼
7b1274e29a s3: support for multipart copy
Fixes #2375 Fixes #3579
2019-10-04 16:49:06 +01:00
Aleksandar Jankovic
6b55b8b133 s3: add option for multipart failiure behaviour
This is needed for resuming uploads across different sessions.
2019-10-02 16:49:16 +01:00
Nick Craig-Wood
6e053ecbd0 s3: only ask for URL encoded directory listings if we need them on Ceph
This works around a bug in Ceph which doesn't encode CommonPrefixes
when using URL encoded directory listings.

See: https://tracker.ceph.com/issues/41870
2019-09-30 22:00:24 +01:00
Fabian Möller
33f129fbbc s3: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:24 +01:00
Nick Craig-Wood
a8adce9c59 s3: fix encoding for control characters - Fixes #3345 2019-09-30 22:00:24 +01:00
Anthony Rusdi
899f285319 s3: fix signature v2_auth headers
When used with v2_auth = true, PresignRequest doesn't return
signed headers, so remote dest authentication would be fail.
This commit copying back HTTPRequest.Header to headers.

Tested with RiakCS v2.1.0.

Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
2019-09-21 14:38:51 +01:00
Nick Craig-Wood
25786cafd3 s3: fix SetModTime on GLACIER/ARCHIVE objects and implement set/get tier
- Read the storage class for each object
- Implement SetTier/GetTier
- Check the storage class on the **object** before using SetModTime

This updates the fix in 1a2fb52 so that SetModTime works when you are
using objects which have been migrated to GLACIER but you aren't using
GLACIER as a storage class.

Fixes #3522
2019-09-14 09:18:55 +01:00
Nick Craig-Wood
66c23723e3 Add context to all http.NewRequest #3257
When we drop support for go1.12 we can use http.NewRequestWithContext
2019-09-09 23:27:07 +01:00
Nick Craig-Wood
6f16588123 s3,b2,googlecloudstorage,swift,qingstor,azureblob: fixes after code review #3421
- change the interface of listBuckets() removing dir parameter and adding context
- add makeBucket() and use in place of Mkdir("")
    - this fixes some corner cases in Copy/Update
- mark all the listed buckets OK in ListR

Thanks to @yparitcher for the review.
2019-08-22 23:06:59 +01:00
Nick Craig-Wood
eaaf2ded94 s3: make all operations work from the root #3421 2019-08-17 10:30:41 +01:00
Nick Craig-Wood
e502be475a azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files
In 0386d22cc9 we introduced a test for 0 length files read the
way mount does.

This test failed on these backends which we fix up here.
2019-08-06 15:18:08 +01:00
Nick Craig-Wood
57d5de6fba build: fix up package paths after repo move
git grep -l github.com/ncw/rclone | xargs -d'\n' perl -i~ -lpe 's|github.com/ncw/rclone|github.com/rclone/rclone|g'
goimports -w `find . -name \*.go`
2019-07-28 18:47:38 +01:00
Matti Niemenmaa
a6dca4c13f s3: Add INTELLIGENT_TIERING storage class
For Intelligent-Tiering:
https://aws.amazon.com/s3/storage-classes/#Unknown_or_changing_access
2019-07-01 18:17:48 +01:00
Aleksandar Jankovic
f78cd1e043 Add context propagation to rclone
- Change rclone/fs interfaces to accept context.Context
- Update interface implementations to use context.Context
- Change top level usage to propagate context to lover level functions

Context propagation is needed for stopping transfers and passing other
request-scoped values.
2019-06-19 11:59:46 +01:00
Philip Harvey
1a2fb52266 s3: make SetModTime work for GLACIER while syncing - Fixes #3224
Before this change rclone would fail with

    Failed to set modification time: InvalidObjectState: Operation is not valid for the source object's storage class

when attempting to set the modification time of an object in GLACIER.

After this change rclone will re-upload the object as part of a sync if it needs to change the modification time.

See: https://forum.rclone.org/t/suspected-bug-in-s3-or-compatible-sync-logic-to-glacier/10187
2019-06-03 15:28:19 +01:00
Robert Marko
5ccc2dcb8f s3: add config info for Wasabi's EU Central endpoint
Wasabi has a EU Central endpoint for a couple months now, so add it to the list.

Signed-off-by: Robert Marko <robimarko@gmail.com>
2019-05-15 13:35:55 +01:00
Nick Craig-Wood
b68c3ce74d s3: suppport S3 Accelerated endpoints with --s3-use-accelerate-endpoint
Fixes #3123
2019-05-02 14:00:00 +01:00
Manu
6e86526c9d s3: add support for "Glacier Deep Archive" storage class - fixes #3088 2019-04-11 10:21:41 +01:00
Fabian Möller
61616ba864 pacer: make pacer more flexible
Make the pacer package more flexible by extracting the pace calculation
functions into a separate interface. This also allows to move features
that require the fs package like logging and custom errors into the fs
package.

Also add a RetryAfterError sentinel error that can be used to signal a
desired retry time to the Calculator.
2019-02-16 14:38:07 +00:00
Nick Craig-Wood
73f0a67d98 s3: Update Dreamhost endpoint - fixes #2974 2019-02-13 21:10:43 +00:00
Fabian Möller
a0d4c04687
backend: fix misspellings 2019-02-07 19:51:03 +01:00
weetmuts
96f6708461 s3: add aws endpoint eu-north-1 2019-02-03 12:17:15 +00:00
Nick Craig-Wood
e31578e03c s3: Auto detect region for buckets on operation failure - fixes #2915
If an incorrect region error is returned while using a bucket then the
region is updated, the session is remade and the operation is retried.
2019-01-27 21:22:49 +00:00
Nick Craig-Wood
39f5059d48 s3: add --s3-bucket-acl to control bucket ACL - fixes #2918
Before this change buckets were created with the same ACL as objects.

After this change, the user can set just --s3-acl to set the ACL of
buckets and objects, or use --s3-bucket-acl as well to have a
different ACL used for bucket creation.

This also logs at INFO level the creation and deletion of buckets.
2019-01-18 15:12:11 +00:00
Nick Craig-Wood
1318c6aec8 s3: Add Alibaba OSS to integration tests and fix storage classes 2019-01-12 20:41:47 +00:00
Nick Craig-Wood
ff0b8e10af s3: Support Alibaba Cloud (Aliyun) OSS
The existing s3 backend passed all integration tests with OSS provided
`force_path_style = false`.

This makes sure that is so and adds documentation and configuration
for OSS.

Thanks to @luolibin for their work on the OSS backend which we ended
up not needing.

Fixes #1641
Fixes #1237
2019-01-12 17:28:04 +00:00
William Cocker
8575abf599 s3: add GLACIER storage class
Fixes #923
2018-12-06 21:53:05 +00:00
Nick Craig-Wood
d99ffde7c0 s3: change --s3-upload-concurrency default to 4 to increase perfomance #2772
Increasing the --s3-upload-concurrency to 4 (from 2) gives an
additional 45% throughput at the cost of 10MB extra memory per transfer.

After testing the upload perfoc
2018-12-02 17:58:34 +00:00
Nick Craig-Wood
198c34ce21 s3: implement --s3-upload-cutoff for single part uploads below this - fixes #2772
Before this change rclone would use multipart uploads for any size of
file.  However multipart uploads are less efficient for smaller files
and don't have MD5 checksums so it is advantageous to use single part
uploads if possible.

This implements single part uploads for all files smaller than the
upload_cutoff size.  Streamed files must be uploaded as multipart
files though.
2018-12-02 17:58:34 +00:00
Henry Ptasinski
f95c1c61dd s3: add config info for Wasabi's US-West endpoint
Wasabi has two location, US East and US West, with different endpoint URLs.
When configuring S3 to use Wasabi, provide the endpoint information for both
locations.
2018-11-19 13:33:42 +00:00
Erik Swanson
fa0a1e7261 s3: fix role_arn, credential_source, ...
When the env_auth option is enabled, the AWS SDK's session constructor
now loads configuration from ~/.aws/config and environment variables,
and credentials per the selected (or default) AWS_PROFILE's settings.

This is accomplished by **NOT** including any Credential provider in the
aws.Config passed to the session constructor: If the Config.Credentials
is non-nil, that will always be used and the user's configuration re
role_arn, credential_source, source_profile, etc... from the shared
config will be completely ignored.

(The conditional creation and configuration of the stscreds Credential
provider is complicated enough that it is not worth re-creating that
logic.)
2018-11-08 12:58:23 +00:00
Nick Craig-Wood
baba6d67e6 s3: set ACL for server side copies to that provided by the user - fixes #2691
Before this change the ACL for objects which were server side copied
was left at the default "private" settings. S3 doesn't copy the ACL
from the source when you copy an object, you have to set it afresh
which is what this does.
2018-11-02 16:22:31 +00:00
Nick Craig-Wood
0f02c9540c s3: make --s3-v2-auth flag
This is an alternative to setting the region to "other-v2-signature"
which is inconvenient for multi-region providers.
2018-10-14 00:10:29 +01:00
Nick Craig-Wood
06922674c8 drive, s3: review hidden config items 2018-10-13 23:30:13 +01:00
Fabian Möller
98e2746e31 backend: add fstests.ChunkedUploadConfig
- azureblob
- b2
- drive
- dropbox
- onedrive
- s3
- swift
2018-10-11 14:47:58 +01:00
Nick Craig-Wood
a9273c5da5 docs: move documentation for options from docs/content into backends
In the following commit, the documentation will be autogenerated.
2018-10-06 11:47:46 +01:00
Paul Kohout
7826e39fcf s3: use configured server-side-encryption and storace class options when calling CopyObject() - fixes #2610 2018-10-04 08:25:20 +01:00
Craig Miskell
2543278c3f S3: Use (custom) pacer, to retry operations when reasonable - fixes #2503 2018-09-11 07:57:03 +01:00
bsteiss
aaa3d7e63b s3: add support for KMS Key ID - fixes #2217
This code supports aws:kms and the kms key id for the s3 backend.
2018-08-30 17:08:27 +01:00
Nick Craig-Wood
7194c358ad azureblob,b2,qingstor,s3,swift: remove leading / from paths - fixes #2484 2018-08-26 23:19:28 +01:00
Nick Craig-Wood
f06ba393b8 s3: Add --s3-force-path-style - fixes #2401 2018-07-20 15:41:40 +01:00
Nick Craig-Wood
f3f48d7d49 Implement new backend config system
This unifies the 3 methods of reading config

  * command line
  * environment variable
  * config file

And allows them all to be configured in all places.  This is done by
making the []fs.Option in the backend registration be the master
source of what the backend options are.

The backend changes are:

  * Use the new configmap.Mapper parameter
  * Use configstruct to parse it into an Options struct
  * Add all config to []fs.Option including defaults and help
  * Remove all uses of pflag
  * Remove all uses of config.FileGet
2018-07-16 21:20:47 +01:00
Nick Craig-Wood
4fe6614ae1 s3: fix index out of range error with --fast-list fixes #2388 2018-07-09 17:00:52 +01:00
themylogin
7d2861ead6 Adjust S3 upload concurrency with --s3-upload-concurrency 2018-06-14 16:15:17 +01:00
Nick Craig-Wood
017297af70 s3: Fix --s3-chunk-size which was always using the minimum - fixes #2345 2018-06-10 12:22:30 +01:00
Nick Craig-Wood
311a962011 s3: Look in S3 named profile files for credentials - fixes #2243 2018-04-21 09:00:20 +01:00
Chris Redekop
a35e62e15c s3: Add an option to disable checksum uploading - fixes #2213 2018-04-20 20:51:12 +01:00
Nick Craig-Wood
dc247d21ff s3: add in config for all the supported S3 providers #2140
These are AWS, Ceph, Dreamhost, IBM COS S3, Minio, Wasabi and Other.

This configures endpoints where known and makes sure config doesn't
appear where it isn't valid where possible.
2018-04-13 16:33:26 +01:00
Giri Badanahatti
acd5d4377e config,s3: hierarchical configuration support #2140
This introduces a method of making provider specific configuration
within a remote.  This is useful particularly in s3.

This commit does the basic configuration in S3 for IBM COS.
2018-04-13 16:05:35 +01:00
Craig Rachel
0e6faa2313 s3: add One Zone Infrequent Access storage class - fixes #2240 2018-04-13 13:36:25 +01:00
Peter Baumgartner
1db68571fd s3,swift: Add --use-server-modtime
`--use-server-modtime` stops s3 and swift retrieving the modtime from metadata which enables a fast sync mode with the `--update` flag.
2018-04-13 13:32:17 +01:00
Nick Craig-Wood
92c5aa3786 s3: add --s3-chunk-size option - fixes #2203 2018-04-05 15:40:08 +01:00
Nick Craig-Wood
e9a2cbec37 s3: ignore zero length directory markers at the root too 2018-03-21 20:09:37 +00:00
Nick Craig-Wood
a46f2a9eb7 s3: ignore zero length directory markers - fixes #1621 2018-03-19 17:41:46 +00:00
Giri Badanahatti
aba43cd3a4 Documention for IBM COS (S3) configuration. 2018-03-15 20:20:43 +00:00
Nick Craig-Wood
89748feaa5 s3: update docs to discourage use of v2 auth - fixes #2120
From testing it appears that CEPH no longer works properly with v2
auth and neither does Dreamhost, so update the docs anc configuration
to recommend v4 auth.
2018-03-13 20:47:29 +00:00
Nick Craig-Wood
f3e982d3bf azureblob,b2,gcs,qingstor,s3,swift: Don't check for bucket/container presense if listing was OK
In a typical rclone copy to a bucket/container based remote, before
this change we were doing a list, followed by a HEAD of the bucket to
check it existed before doing the copy.  The fact the list succeeded
means the bucket exists so mark it OK at that point.

Issue #1421
2018-03-01 12:11:34 +00:00
Nick Craig-Wood
38f829842a s3: fix server side copy and set modtime on files with + in - fixes #2001
This was broken in 64ea94c1a4 when
putting a work-around for Digital Ocean.  PathEscape has now been
adjusted so it works with both providers.
2018-01-23 10:50:50 +00:00
Nick Craig-Wood
97c414f025 config/hash: rename more symbols after factoring into own package 2018-01-18 20:27:52 +00:00
Nick Craig-Wood
11da2a6c9b Break the fs package up into smaller parts.
The purpose of this is to make it easier to maintain and eventually to
allow the rclone backends to be re-used in other projects without
having to use the rclone configuration system.

The new code layout is documented in CONTRIBUTING.
2018-01-15 17:51:14 +00:00
Nick Craig-Wood
60afda007b Move dircache, oauthutil, rest and pacer modules into lib 2018-01-12 17:07:38 +00:00
Nick Craig-Wood
b8b620f5c2 Move all backends into backend directory 2018-01-12 17:07:38 +00:00
Renamed from s3/s3.go (Browse further)