This implements `rclone cleanup` to remove multipart uploads over 24
hours old. It also implements the backend command
`list-multipart-uploads` to see which ones are available and `cleanup`
to delete them with a configurable expiry interval.
See #4302
The S3 ListObject API returns paginated bucket listings, with
"MaxKeys" items for each GET call.
The default value is 1000 entries, but for buckets with millions of
objects it might make sense to request more elements per request, if
the backend supports it. This commit adds a "list_chunk" option for
the user to specify a lower or higher value.
This commit does not add safe guards around this value - if a user
decides to request a too large list, it might result in connection
timeouts (on the server or client).
In AWS S3, there is a fixed limit of 1000, some other services might
have one too. In Ceph, this can be configured in RadosGW.
Before this change rclone would fail with
Failed to set modification time: InvalidObjectState: Operation is not valid for the source object's storage class
when attempting to set the modification time of an object in GLACIER.
After this change rclone will re-upload the object as part of a sync if it needs to change the modification time.
See: https://forum.rclone.org/t/suspected-bug-in-s3-or-compatible-sync-logic-to-glacier/10187
The existing s3 backend passed all integration tests with OSS provided
`force_path_style = false`.
This makes sure that is so and adds documentation and configuration
for OSS.
Thanks to @luolibin for their work on the OSS backend which we ended
up not needing.
Fixes#1641Fixes#1237
Increasing the --s3-upload-concurrency to 4 (from 2) gives an
additional 45% throughput at the cost of 10MB extra memory per transfer.
After testing the upload perfoc
Before this change rclone would use multipart uploads for any size of
file. However multipart uploads are less efficient for smaller files
and don't have MD5 checksums so it is advantageous to use single part
uploads if possible.
This implements single part uploads for all files smaller than the
upload_cutoff size. Streamed files must be uploaded as multipart
files though.
These are AWS, Ceph, Dreamhost, IBM COS S3, Minio, Wasabi and Other.
This configures endpoints where known and makes sure config doesn't
appear where it isn't valid where possible.
This introduces a method of making provider specific configuration
within a remote. This is useful particularly in s3.
This commit does the basic configuration in S3 for IBM COS.
From testing it appears that CEPH no longer works properly with v2
auth and neither does Dreamhost, so update the docs anc configuration
to recommend v4 auth.
ECS container IAM metadata is in a different place than EC2 IAM metadata.
Use defaults' RemoteCredProvider function to query the standard locations
for the credentials.
Give the ECS role precedence over the role available from the underlying
EC2 instance.
Current doc mentioned lack of ETag and metadata
support which since has been long fixed in many
upstream Minio releases.
Also cleanup the doc to show new startup banner etc.
This will make the s3 provider authentaction logic
- Configured credentials if both key and secret available
- Anonymous if key and secret missing and env_auth not set
- if env_auth is set to truthy (https://golang.org/pkg/strconv/#ParseBool)
- AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables
- IAM role credentials as fallback