Before this change if an --s3-profile was set which used AWS STS (eg
to assume a role) and --s3-endpoint was set then rclone would use the
value from --s3-endpoint to contact the STS server which did not work.
This fix implements an endpoint resolver which only overrides the "s3"
service if --s3-endpoint is set. It sends the "sts" service (and any
other service) to the default resolver.
Fixes#6443
See: https://forum.rclone.org/t/s3-profile-failing-when-explicit-s3-endpoint-is-present/36063/
Before this change, all types of checkers showed "checking" after the
file name despite the fact that not all of them were checking.
After this change, they can show
- checking
- deleting
- hashing
- importing
- listing
- merging
- moving
- renaming
See: https://forum.rclone.org/t/what-is-rclone-checking-during-a-purge/35931/
Before this change if --s3-no-head was in use rclone didn't check the
multipart upload ETag at all. However the ETag is returned in the
final POST request when completing the object.
This change uses that ETag from the final POST if --s3-no-head is in
use, otherwise it uses the ETag from a fresh HEAD request.
See: https://forum.rclone.org/t/in-some-cases-rclone-does-not-use-etag-to-verify-files/36095/
This commits ports a fast C-implementation from https://github.com/namazso/QuickXorHash
It uses new crypto/subtle code from go1.20 to avoid the use of unsafe.
Typical speedups are about 25x when using go1.20
goos: linux
goarch: amd64
cpu: Intel(R) Celeron(R) N5105 @ 2.00GHz
QuickXorHash-Before 2.49ms 422MB/s ±11% 100.00%
QuickXorHash-Subtle 87.9µs 11932MB/s ± 5% +2730.83% + 42.17%
Co-Author: @namazso
Uploading 100 files of each 1 MB took 20 seconds before. With above fix it takes around 2 seconds now.
10x time improvement in line with pacer's sleep reduction from 100ms to 10ms
Before this change when uploading files bigger than 1TiB, the chunk
calculator would work out that the chunk size needed to be bigger than
the default 100 MiB to fit within the 10,000 parts limit.
However the uploader was still using the memory pool for the old chunk
size and this caused errors like
panic: runtime error: slice bounds out of range [:122683392] with capacity 100663296
The fix for this is to make a temporary pool with the larger chunk
size and use it during the upload of the large file.
See: https://forum.rclone.org/t/rclone-cannot-complete-upload-to-b2-restarts-upload-frequently/35617/
Before this change we were sending webdav requests to the go http
FileServer. In go1.20 these (rightly) started returning errors which
caused the tests to fail.
The test has been changed to properly mock up an About query and
response so an end to end test of adding headers is possible.
Passwords for encrypted libraries are kept in memory in the server
and flushed after an hour.
This MR fixes an issue when the library password expires after 1 hour.
rclone sync erroneously deleted folders renamed to a different case on
crypts where directory name encryption was disabled and the underlying
remote was case insensitive.
Example: Renaming the folder Test to tEST before a sync to a crypt having
remote=OneDrive:crypt and directory_name_encryption=false could result in
the folder and all its content being deleted. The following sync would
correctly create the tEST folder and upload all of the content.
Additional tests have revealed other potential issues when using
filename_encryption=off or directory_name_encryption=false on case
insensitive remotes. The documentation has been updated to warn about
potential problems when using these combinations.
This error was caused by rclone supplying an empty
`x-ms-blob-public-access:` header when creating a container for
private access, rather than omitting it completely.
This is a valid way of specifying containers should be private, but if
the storage account has the flag "Blob public access" unset then it
gives "409 Public access is not permitted on this storage account".
This patch fixes the problem by only supplying the header if the
access is set.
Fixes#6645
Storj switched to a single global s3 endpoint backed by a BGP routing.
We want to stop advertizing the former regional endpoints and have the
global one as the only option.
Before this change, when a new object was created s3 returns its
versionID (on a versioned bucket) and rclone recorded it in the
object.
This means that when rclone came to delete the object it would delete
it with the versionID.
However it is common to forbid actions with versionIDs on buckets so
as to preserve the historical record and these operations would fail
whereas they succeeded in pre-v1.60.0 versions.
This patch fixes the problem by not recording versions of objects
supplied by the S3 API on upload unless `--s3-versions` or
`--s3-version-at` is used. This makes rclone behave as it did before
v1.60.0 when version support was introduced.
See: https://forum.rclone.org/t/s3-and-intermittent-403-errors-with-file-renames-and-drag-and-drop-operations-in-windows-explorer/34773
This patch implements --use-server-modtime for the Azureblob backend.
It does this by not reading the time from the metadata if the global
flag is set.
When the SDK was upgraded it started delivering metadata where the
keys were not in lower case as per the old SDK.
Rclone normalises the case of the keys for storage in the Object, but
the directory marker check was being done with the unnormalised keys
as it needs to be done before the Object is created.
This fixes the directory marker check to do a case insensitive compare
of the metadata keys.
Before this change, we were taking the version ID straight from the
XML blob returned by the SDK and thus pinning the XML into memory
which bulked up the average memory per object from about 400 bytes to
4k.
Copying the string fixes the excess memory usage.
This reverts commit 4f386a1ccd.
It turns out that Alibaba OSS does support list v2 and the detection
code was wrong.
This means that users of the gov version of Alibaba will have to add
`list_version 1` to their config files.
See #6600
In this commit
ab849b3613 s3: fix listing loop when using v2 listing on v1 server
The ContinuationToken was tested for existence, but it is the
NextContinuationToken that we are interested in.
See: #6600
Previously it was limited to plain ASCII (0-9, A-Z, a-z).
Implemented by adding \p{L}\p{N} alongside the \w in the regex,
even though these overlap it means we can be sure it is 100%
backwards compatible.
Fixes#6618
This was caused by
a9bd0c8de6 s3: reduce memory consumption for s3 objects
Which assumed that the StorageClass would always be set, but it isn't
set for Versions.
The updates the authentication to include
- Auth from the environment
1. Environment Variables
2. Managed Service Identity Credentials
3. Azure CLI credentials (as used by the az tool)
- Account and Shared Key
- SAS URL
- Service principal with client secret
- Service principal with certificate
- User with username and password
- Managed Service Identity Credentials
And rationalises the auth order.
Normally rclone will check the container exists before uploading if it
hasn't listed the container yet.
Often rclone will be running with a limited set of permissions which
means rclone can't create the container anyway, so this stops the
check.
This will save a transaction.
This commit switches from using the old Azure go modules
github.com/Azure/azure-pipeline-go/pipeline
github.com/Azure/azure-storage-blob-go/azblob
github.com/Azure/go-autorest/autorest/adal
To the new SDK
github.com/Azure/azure-sdk-for-go/
This stops rclone using deprecated code and enables the full range of
authentication with Azure.
See #6132 and #5284
Before this change, rclone would enter a listing loop if it used v2
listing on a v1 server and the list exceeded 1000 items.
This change detects the problem and gives the user a helpful message.
Fixes#6600
Copying the storageClass string instead of using a pointer to the original string.
This prevents the Go garbage collector from keeping large amounts of
XMLNode structs and references in memory, created by xmlutil.XMLToStruct()
from the aws-sdk-go.
This commit uses the MLST command (where available) to get the status
for single files rather than listing the parent directory and looking
for the file. This makes actions such as using `--files-from` much quicker.
* use getEntry to lookup remote files when supported
* findItem now expects the full path directly
It makes the expected argument similar to the getInfo method, the
difference now is that one is returning a FileInfo whereas
the other is returning an ftp Entry.
Fixes#6225
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
Before this fix a chain compress -> crypt -> s3 was giving errors
BadDigest: The Content-MD5 you specified did not match what we received.
This was because the crypt backend was encrypting the underlying local
object to calculate the hash rather than the contents of the metadata
stream.
It did this because the crypt backend incorrectly identified the
object as a local object.
This fixes the problem by making sure the crypt backend does not
unwrap anything but fs.OverrideRemote objects.
See: https://forum.rclone.org/t/not-encrypting-or-compressing-before-upload/32261/10
Before this change we were putting connections into the connection
pool which had a local context in.
This meant that when the operation had finished the context was
cancelled and the connection became unusable.
See: https://forum.rclone.org/t/failed-to-sync-context-canceled/34017/
Before this change, when using -server-side-across-configs rclone
would direct Move/Copy/DirMove to the destination server.
However this should be directed to the source server. This is a little
unclear in the RFC, but the name of the parameter "Destination:" seems
clear and this is how dCache and Rucio have implemented it.
See: https://forum.rclone.org/t/webdav-copy-request-implemented-incorrectly/34072/
In this commit
8d1fff9a82 local: obey file filters in listing to fix errors on excluded files
We introduced the concept of local backend filters.
Unfortunately the filters were being applied before we had resolved
the symlink to point to a directory. This meant that symlinks pointing
to directories were filtered out when they shouldn't have been.
This was fixed by moving the filter check until after the symlink had
been resolved.
See: https://forum.rclone.org/t/copy-links-not-following-symlinks-on-1-60-0/34073/7
Fish is different from POSIX-based Unix shells such as bash,
and a bracketed variable references like we use for the
auto-detection echo command is not supported. The command
will return with zero exit code but produce no output on
stdout. There is a message on stderr, but we don't log it
due to the zero exit code:
fish: Variables cannot be bracketed. In fish, please use {$ShellId}.
Fixes#6552
Before this change if we copied files of unknown size, then they lost
their metadata.
This was particularly noticeable using --s3-decompress.
This change adds metadata to Rcat and RcatSized and changes Copy to
pass the metadata in when it calls Rcat for an unknown sized input.
Fixes#6546
Before this change, some files were giving this error when downloaded
from Cloudflare and other providers.
ERROR corrupted on transfer: sizes differ NNN vs MMM
This is because these providers auto gzips the object when rclone
wasn't expecting it to. (AWS does not gzip objects without their being
uploaded gzipped).
This patch adds a quirk to for fix the problem and a flag to control
it. The quirk `might_gzip` is set to `true` for all providers except
AWS.
See: https://forum.rclone.org/t/s3-error-corrupted-on-transfer-sizes-differ-nnn-vs-mmm/33694/Fixes: #6533
Before this fix it was impossible to stop rclone generating an
X-Amx-Acl: header which is incompatible with GCS with uniform access
control and is generally deprecated at AWS.
The API endpoint GetBucketLocation requires
top level permission.
If we do an authenticated head request to a bucket, the bucket location will be returned in the HTTP headers.
Fixes#5066
Before this change if --user-server-modtime was in use the ModTime
could change for an object as we receive it accurate to the nearest ms
in listings, but only accurate to the nearest second in HEAD and GET
requests.
Normally AWS returns the milliseconds as .000 in listings, but if
versions are in use it may not. Storj S3 also seems to return
milliseconds.
This patch tries to keep the maximum precision in the last modified
time, so it doesn't update a last modified time with a truncated
version if the times were the same to the nearest second.
See: https://forum.rclone.org/t/cache-fingerprint-miss-behavior-leading-to-false-positive-stalen-cache/33404/
Before this change rclone used statx() to read the metadata for files
from the local filesystem when `-M` was in use.
Unfortunately statx() was only introduced in kernel 4.11 which was
released in April 2017 so there are current systems (eg Centos 7)
still on kernel versions which don't support statx().
This patch checks to see if statx() is available and if it isn't, it
falls back to using fstatat() which was introduced in Linux 2.6.16
which is guaranteed for all Go versions.
See: https://forum.rclone.org/t/metadata-from-linux-local-s3-failed-to-copy-failed-to-read-metadata-from-source-object-function-not-implemented/33233/
In https://github.com/jlaffaye/ftp/commit/212daf295f the upstream FTP
library changed the way adding your own dialer works which meant that
connections when using explicit FTP were failing.
This patch reworks our connection code to bring it into the
expectations of the library.
Before this fix, if an error ocurred reading the metadata, it could be
set as nil and then used, causing a crash.
This fix changes the readMetadata function so it returns an error, and
the error is always set if the metadata returned is nil.
If mkdir fails then before this change it would have thrown an
error.
After this change, if the error indicated that the directory
already exists then the error is not returned to the user.
This fixes a race condition when two rclone threads are trying to
create the same directory.
In this commit
8d1fff9a82 local: obey file filters in listing to fix errors on excluded files
We started using filters in the local backend so the user could short
circuit troublesome files/directories at a low level.
However this caused a number of integration tests to fail. This turned
out to be in backends wrapping the local backend. For example the
combine backend test failed because it changes the paths passed to the
local backend so they no longer match the paths in the current filter.
To fix this, a new feature flag `FilterAware` was added and the
UseFilter context flag is only passed to backends which support it. As
the wrapping backends don't support the flag, this fixes the problems
in the integration tests.
In future the wrapping backends could modify the active filters to
match the path modifications and then they could set the FilterAware
flag.
See #6376
Before this fix, the chunksize calculator was using the previous size
of the object, not the new size of the object to calculate the chunk
sizes.
This meant that uploading a replacement object which needed a new
chunk size would fail, using too many parts.
This fix fixes the calculator to take the size explicitly.
Before this change, if rclone was run with `-M` on a filesystem
without xattr support, it would error out.
This patch makes rclone detect the not supported errors and disable
xattrs from then on. It prints one ERROR level message about this.
See: https://forum.rclone.org/t/metadata-update-local-s3/32277/7
Before this change, if an object compressed with "Content-Encoding:
gzip" was downloaded, a length and hash mismatch would occur since the
go runtime automatically decompressed the object on download.
If --s3-decompress is set, this change erases the length and hash on
compressed objects so they can be downloaded successfully, at the cost
of not being able to check the length or the hash of the downloaded
object.
If --s3-decompress is not set the compressed files will be downloaded
as-is providing compressed objects with intact size and hash
information.
See #2658
Before this fix, the dropbox backend wasn't decoding the file names
received in changenotify events into rclone standard format.
This meant that changenotify events for filenames which had encoded
characters were failing to be decrypted properly if wrapped in crypt.
See: https://forum.rclone.org/t/rclone-vfs-cache-says-file-name-too-long/31535
Before this patch backends could be shutdown when they fell out of the
cache when they were in use with combine. This was particularly
noticeable with the dropbox backend which gave this error when
uploading files after the backend was Shutdown.
Failed to copy: upload failed: batcher is shutting down
This patch gets the combine remote to pin them until it is finished.
See: https://forum.rclone.org/t/rclone-combine-upload-failed-batcher-is-shutting-down/32168
Previously, with standard auth, the username would be stored in config - but only after
entering the non-standard device/mountpoint sequence during config (a feature introduced
with #5926). Regardless of that, rclone always requests the username from the api at
startup (NewFS).
In #6270 (commit 9dbed02329) this was changed to always
store username in config (consistency), and then also use it to avoid the repeated
customer info request in NewFs (performance). But, as reported in #6309, it did not work
with legacy auth, where user enters username manually, if user entered an email address
instead of the internal username required for api requests. This change was therefore
recently reverted.
The current commit takes another step back to not store the username in config during
the non-standard device/mountpoint config sequence (consistentcy). The username will
now only be stored in config when using legacy auth, where it is an input parameter.
Extend the shouldRetry function by also checking for the quotaExceeded
reason, and since this function appeared to be untested, add a test case
for the existing errors and this new one.
Fixes#615
In
22abd785eb s3: implement reading and writing of metadata #111
The reading information of objects was refactored to use the
s3.HeadObjectOutput structure.
Unfortunately the code branch with `--s3-no-head` was not tested
otherwise this panic would have been discovered.
This shows that this is path is not integration tested, so this adds a
new integration test.
Fixes#6322
`FS.cacheExpiry` is accessed through sync/atomic.
According to the documentation, "On ARM, 386, and 32-bit MIPS, it is
the caller's responsibility to arrange for 64-bit alignment of 64-bit
words accessed atomically. The first word in a variable or in an
allocated struct, array, or slice can be relied upon to be 64-bit
aligned."
Before commit 1d2fe0d856 this field was
aligned, but then a new field was added to the structure, causing the
test suite to panic on linux/386.
No other field is used with sync/atomic, so `cacheExpiry` can just be
placed at the beginning of the stuct to ensure it is always aligned.
By default these will be downloaded compressed.
This changes the default of the previous commit
2781f8e2f1 gcs: Fix download of "Content-Encoding: gzip" compressed objects
But will fit in better with the metadata framework when copying
gzip-encoded objects from backend to backend.
Existing version did save username in config, but only when entering the custom
device/mountpoint sequence in config. Regardless of that, it did always look up the
username at startup with an api request.
This commit improves it so that the username will always be stored in config,
and when using standard authentication it picks it from the login token instead of
requesting it from the remote api, and also in fs constructor it picks it from config
instead of requesting it from remote api (again).
The SDK doesn't wrap errors in a Go standard way so they can't be
unwrapped and tested for - eg fatal error.
The code looks for a Serialization or RequestError and returns the
unwrapped underlying error if possible.
This fixes the fs/operations integration tests checking for fatal
errors being returned.
In this commit
e5974ac4b0 s3: use PutObject from the aws SDK to upload single part objects
rclone was made to upload objects to s3 using PUT requests rather than
using signed uploads.
However this change missed the fact that there is a supported way to
do this in the SDK using the SetStreamingBody method on the Request.
This therefore reverts a lot of the previous commit to do with making
an unsigned connection and other complication and uses the SDK
facility.
Before this fix GetFreeSpace returned math.MaxInt64 for remotes which
don't support reading free space, however this is used in various
comparison routines as a too large value, meaning that remotes of size
math.MaxInt64 were never being selected.
This fixes GetFreeSpace to return math.MaxInt64 - 1 so then can be selected.
It also fixes GetUsedSpace the same way however as the default for not
supported was 0 this was very unlikely to have ever caused a problem.
By default, rclone always requests read and write permissions. No matter what settings you configure in the AAD application. This option allows to explicitly request readonly permissions
Migrated read only option to access scope option and set disable_site_permission option to hidden.
Windows shells like cmd and powershell needs to use different quoting/escaping
of strings and paths than the unix shell, and also absolute paths must be fixed
by removing leading slash that the POSIX formatted paths have
(e.g. /C:/Users does not work in shell, it must be converted to C:/Users).
Tries to autodetect shell type (cmd, powershell, unix) on first use.
Implemented default builtin powershell functions for hashsum and about when remote
shell is powershell.
See #5763Fixes#5758
This adjusts
rclone backend drives -o config drive:
So that it also emits a config section called `AllDrives` which uses
the combine backend to make a backend which combines all the shared
drives into one.
It also makes sure that all the shared drive names are valid rclone
config names, deduplicating if necessary.
Fixes#4506
Before this change, if an object compressed with "Content-Encoding:
gzip" was downloaded, a length and hash mismatch would occur since the
as the go runtime automatically decompressed the object on download.
This change erases the length and hash on compressed objects so they
can be downloaded successfully, at the cost of not being able to check
the length or the hash of the downloaded object.
This also adds the --gcs-download-compressed flag to allow the
compressed files to be downloaded as-is providing compressed objects
with intact size and hash information.
Fixes#2658
Before this fix, if uploading to a union consisting of all bucket
based remotes (eg s3), uploads failed with:
Failed to copy: object not found
This was because the union backend was relying on parent directories
being created to work out which files to upload. If all the upstreams
were bucket based backends which can't hold empty directories, no
directories were created and the upload failed.
This fixes the problem by returning the upstreams used when creating
the directory for the upload, rather than searching for them again
after they've been created.
This will also make the union backend a little more efficient.
Fixes#6170
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
The "relative" argument was missing when Put'ing a file. This
sets an incorrect object entry in the cache, leading to the file being
unreadable when using mount functionality.
Fixes#6151
Before this change rclone used presigned requests to upload single
part objects. This was because of a limitation in the SDK which didn't
allow non seekable io.Readers to be passed in.
This is incompatible with some S3 backends, and rclone wasn't adding
the `X-Amz-Content-Sha256: UNSIGNED-PAYLOAD` header which was
incompatible with other S3 backends.
The SDK now allows for this so rclone can use PutObject directly.
This sets the `X-Amz-Content-Sha256: UNSIGNED-PAYLOAD` flag on the PUT
request. However rclone will add a `Content-Md5` header if at all
possible so the body data is still protected.
Note that the old behaviour can still be configured if required with
the `use_presigned_request` config parameter.
Fixes#5422
Uses b2_list_file_versions to retrieve all file versions, and returns
the one that was active at the specified time
This is especially useful in combination with other backup tools, such
as restic, which may use rclone as a backend.
Jottacloud have several different apis and endpoints using a mix of different
timestamp formats. In existing code the list operation (after the recent liststream
implementation) uses format from golang's time.RFC3339 constant. Uploads (using the
allocate api) uses format from a hard coded constant with value identical to golang's
time.RFC3339. And then we have the classic JFS time format, which is similar to RFC3339
but not identical, using a different constant. Also the naming is a bit confusing,
since the term api is used both as a generic term and also as a reference to the
newer format used in the api subdomain where the allocate endpoint is located.
This commit refactors these things a bit.
Now using the utility function for deduplication that was newly implemented to
fix an issue with server-side copy. This function uses the original, and generic,
"jfs" api (and its "cphash" feature), instead of the newer "allocate" api dedicated
for uploads. Both apis support similar deduplication functionaly that we rely on for
the SetModTime operation. One advantage of using the jfs variant is that the allocate
api is specialized for uploads, an initial request performs modtime-only changes and
deduplication if possible but if not possible it creates an incomplete file revision
and returns a special url to be used with a following request to upload missing content.
In the SetModTime function we only sent the first request, using metadata from existing
remote file but different timestamps, which lead to a modtime-only change. If, for some
reason, this should fail it would leave the incomplete revision behind. Probably not
a problem, but the jfs implementation used with this commit is simpler and
a more "standalone" request which either succeeds or fails without expecting additional
requests.
A strange feature (probably bug) in the api used by the server-side copy implementation
in Jottacloud backend is that if the destination file is in trash, the copy request
succeeds but the destination will still be in trash! When this situation occurs in
rclone, the copy command will fail with "Failed to copy: object not found" because
rclone verifies that the file info in the response from the copy request is valid,
and since it is marked as deleted it is treated as invalid.
This commit works around this problem by looking for this situation in the response
from the copy operation, and send an additional request to a built-in deduplication
endpoint that will restore the file from trash.
Fixes#6112
The existing code in rclone set the value "offline_access+openid",
when encoded in body it will become "offline_access%2Bopenid". I think
this is wrong. Probably an artifact of "double urlencoding" mixup -
either in rclone or in the jottacloud cli tool version it was sniffed
from? It does work, though. The token received will have scopes "email
offline_access" in it, and the same is true if I change to only
sending "offline_access" as scope.
If a proper space delimited list of "offline_access openid"
is used in the request, the response also includes openid scope:
"openid email offline_access". I think this is more correct and this
patch implements this.
See: #6107
Adds a configuration option to the GCS backend to allow skipping the
check if a bucket exists before copying an object to it, much like
f406dbb added for S3.
Before this change the cache backend was passing -1 into
rate.NewLimiter to mean unlimited transactions per second.
In a recent update this immediately returns a rate limit error as
might be expected.
This patch uses rate.Inf as indicated by the docs to signal no limits
are required.
Before this change the 206 responses from putio Range requests were being
returned as errors.
This change checks for 200 and 206 in the GET response now.
Before this fix, rclone retries chunks of multipart uploads. However
if they had been partially received dropbox would reply with an
incorrect_offset error which rclone was ignoring.
This patch parses the new offset from the error response and uses it
to adjust the data that rclone sends so it is the same as what dropbox
is expecting.
See: https://forum.rclone.org/t/dropbox-rate-limiting-for-upload/29779
This commit switches Google Cloud Storage from the drive pacer to the
s3 pacer. The main difference between them is that the s3 pacer does
not limit transactions in the non-error case. This is appropriate for
a cloud storage backend where you pay for each transaction.
Before this fix `NewObject` could return a wrapped `fs.Object(nil)`
which caused a crash. This was caused by `wrapObject` returning a
`nil` `*Object` which was cast into an `fs.Object`.
This changes the interface of `wrapObject` so it returns an
`fs.Object` instead of a `*Object` and an error which must be checked.
This forces the callers to return a `nil` object rather than an
`fs.Object(nil)`.
See: https://forum.rclone.org/t/panic-in-hasher-when-mounting-with-vfs-cache-and-not-synced-data-in-the-cache/29697/11
Having a replace directive in go.mod causes "go get
github.com/rclone/rclone" to fail as it discussed in this Go issue:
https://github.com/golang/go/issues/44840
This is apparently how the Go team want go.mod to work, so this commit
hard forks github.com/jlaffaye/ftp into github.com/rclone/ftp so we
can remove the `replace` directive from the go.mod file.
Fixes#5810
Before this change rclone send pre-1970 timestamps as negative
numbers. pCloud ignores these and sets them as todays date.
This change sends the timestamps as unsigned 64 bit integers (which is
how the binary protocol sends them) and pCloud accepts the (actually
negative) timestamp like this.
Before this change the new multipart upload ETag checking code was
failing in the integration tests with Alibaba OSS.
Apparently Alibaba calculate the ETag in a different way to AWS.
This introduces a new provider quirk with a flag to disable the
checking of the ETag for multipart uploads.
Mulpart Etag checking has been enabled for all providers that we can
test for and work, and left disabled for the others.
Before this rclone ignored the ETag on multipart uploads which missed
an opportunity for a whole file integrity check.
This adds that check which means that we now check even harder that
multipart uploads have arrived properly.
See #5993
Before this change `rclone about swift:container` would show aggregate
info about all the containers, not just the one in use.
This causes a problem if container listing is disabled (for example in
the Blomp service).
This fix makes `rclone about swift:container` show only the info about
the given `container`. If aggregate info about all the containers is
required then use `rclone about swift:`.
See: https://forum.rclone.org/t/rclone-mount-blomp-problem/29151/18
Before this change, rclone supported authorizing for remote systems by
going to a URL and cutting and pasting a token from Google. This is
known as the OAuth out-of-band (oob) flow.
This, while very convenient for users, has been shown to be insecure
and has been deprecated by Google.
https://developers.googleblog.com/2022/02/making-oauth-flows-safer.html#disallowed-oob
> OAuth out-of-band (OOB) is a legacy flow developed to support native
> clients which do not have a redirect URI like web apps to accept the
> credentials after a user approves an OAuth consent request. The OOB
> flow poses a remote phishing risk and clients must migrate to an
> alternative method to protect against this vulnerability. New
> clients will be unable to use this flow starting on Feb 28, 2022.
This change disables that flow, and forces the user to use the
redirect URL flow. (This is the flow used already for local configs.)
In practice this will mean that instead of cutting and pasting a token
for remote config, it will be necessary to run "rclone authorize"
instead. This is how all the other OAuth backends work so it is a well
tested code path.
Fixes#6000
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Before this change a multipart upload with the --no-head flag returned
the MD5SUM as a base64 string rather than a Hex string as the rest of
rclone was expecting.
Before this change attempting NewObject on a SAS URL's root would
crash the Azure SDK.
This change detects that using the code from this previous fix
f7404f52e7 azureblob: fix crash when listing outside a SAS URL's root - fixes#4851
And returns not object not found instead.
It also prevents things being uploaded to the root of the SAS URL
which also crashes the Azure SDK.
Before this fix if a file was updated, but to the same length and
timestamp then the local backend would return the wrong (cached)
hashes for the object.
This happens regularly on a crypted local disk mount when the VFS
thinks files have been changed but actually their contents are
identical to that written previously. This is because when files are
uploaded their nonce changes so the contents of the file changes but
the timestamp and size remain the same because the file didn't
actually change.
This causes errors like this:
ERROR: file: Failed to copy: corrupted on transfer: md5 crypted
hash differ "X" vs "Y"
This turned out to be because the local backend wasn't clearing its
cache of hashes when the file was updated.
This fix clears the hash cache for Update and Remove.
It also puts a src and destination in the crypt message to make future
debugging easier.
Fixes#4031
Currently the B2 docs don't specify which format the download_url
setting should have, and if you input it wrong, there is nothing
in the verbose logs or anywhere else that can let you know that.
* Wasabi starts to provide AP Northeast 2 (Osaka) endpoint, so add it to the list
* Rename ap-northeast-1 as "AP Northeast 1 (Tokyo)" from "AP Northeast"
Signed-off-by: lindwurm <lindwurm.q@gmail.com>
After speed testing it was discovered that upload speed goes up pretty
much linearly with upload concurrency. This patch changes the default
from 4 to 16 which means that rclone will use 16 * 4M = 64M per
transfer which is OK even for low memory devices.
This adds a note that performance may be increased by increasing
upload concurrency.
See: https://forum.rclone.org/t/performance-of-rclone-vs-azcopy/27437/9
Previously only the fs being checked on gets passed to
GetModifyWindow(). However, in most tests, the test files are
generated in the local fs and transferred to the remote fs. So the
local fs time precision has to be taken into account.
This meant that on Windows the time tests failed because the
local fs has a time precision of 100ns. Checking remote items uploaded
from local fs on Windows also requires a modify window of 100ns.
This is possible now that we no longer support go1.12 and brings
rclone into line with standard practices in the Go world.
This also removes errors.New and errors.Errorf from lib/errors and
prefers the stdlib errors package over lib/errors.
This removes the checks against the provider throughout the code and
puts them into a single setQuirks function for easy maintenance when
adding a new provider.
It also updates the quirks with the results of testing against
backends we have access to.
This also adds a list_url_encode parameter so that quirk can be
manually set.
This implements a quirks system for providers and notes which
providers we have tested to support ListObjectsV2.
For those providers which don't support ListObjectsV2 we use the
original ListObjects call.
The API doesn't seem to accept a value of "0" any more for the root
directory ID, giving the error "Could not decode folder id".
However omitting it seems to work fine.
In this commit, released in 1.56.0 we started reading the size of the
object from the Content-Length header as returned by the GET request
to read the object.
4401d180aa s3: add --s3-no-head-object
However some object storage systems, notably Ceph, don't return a
Content-Length header.
The new code correctly calls the setMetaData function with a nil
pointer to the ContentLength.
However due to this commit from 2014, released in v1.18, the
setMetaData function was not ignoring the size as it should have done.
0da6f24221 s3: use official github.com/aws/aws-sdk-go including multipart upload #101
This commit correctly ignores the content length if not set.
Fixes#5732
Before this change the `shared_credentials_file` config option was
being ignored.
The correct value is passed into the SDK but it only sets the
credentials in the default provider. Unfortunately we wipe the default
provider in order to install our own chain if env_auth is true.
This patch restores the shared credentials file in the session
options, exactly the same as how we restore the profile.
Original fix:
1605f9e14d s3: Fix shared_credentials_file auth
This patch reverts this commit
1605f9e14d s3: Fix shared_credentials_file auth
It unfortunately had the side effect of making the s3 SDK ignore the
config in our custom chain and use the default provider. This means
that advanced auth was being ignored such as --s3-profile with
role_arn.
Fixes#5468Fixes#5762
The API has changed in the directory move call JSON response from
returning a TaskID as a string to returning it as an integer. In other
places it is still returned as a string though.
This patch allows the TaskID to be an integer or a string in the JSON
response and keeps it internally as a string like before.
Before this change the backoff for the error_background error was 6
seconds. This means that if it wasn't resolved in 60 seconds (with the
default 10 low level retries) then an error was reported.
This error was being reported frequently in the integration tests, so
is likely affecting real users too.
This patch changes the backoff into an exponential backoff
1,2,4,8...1024 seconds to make sure we wait long enough for the
background operation to complete.
See #5734
- setup correct path encoding (fixes backend test FsEncoding)
- ignore range option if file is empty (fixes VFS test TestFileReadAtZeroLength)
- cleanup stray files left after failed upload (fixes test FsPutError)
- rebase code on master, adapt backend for rclone context passing
- translate Siad errors to rclone native FS errors in sia errorHandler
- TestSia: return proper backend options from the script
- TestSia: use uptodate AntFarm image, nebulouslabs/siaantfarm is stale
In
05f128868f azureblob: add --azureblob-no-head-object
we incorrectly parsed the size of the object as the Content-Length of
the returned header. This is incorrect in the presense of Range
requests.
This fixes the problem by parsing the Content-Range header if
avaialble to read the correct length from if a Range request was
issued.
See: #5734
This reverts commit
dc06973796 Revert "s3: use rclone's low level retries instead of AWS SDK to fix listing retries"
Which in turn reverted
5470d34740 "backend/s3: use low-level-retries as the number of SDK retries"
So we are back where we started.
It then modifies it to set the AWS SDK to `--low-level-retries`
retries, but set the rclone retries to 2 so that directory listings
can be retried.
Before this change the cleanup routine exited on the first deletion
error.
This change counts any errors on deletion and exits when the iteration
is complete with an error showing the number of deletion failures.
Deletion failures will be logged.
Before this change we uses limit/offset paging for directories in the
main directory listing routine and in the trash cleanup listing.
This switches to the new scheme of limit/marker which is more reliable
on a directory which is continuously changing. It has the disadvantage
that it doesn't tell us the total number of items available, however
that wasn't information rclone uses.
This changes the interface to NewObject so that if NewObject is called
on a directory then it should return fs.ErrorIsDir if possible without
doing any extra work, otherwise fs.ErrorObjectNotFound.
Tested on integration test server with:
go run integration-test.go -tests backend -run TestIntegration/FsMkdir/FsPutFiles/FsNewObjectDir -branch fix-stat -maxtries 1
The egress charges while using a CloudFront CDN url is cheaper when
compared to accessing the file directly from S3. So added a download
URL advanced option, which when set downloads the file using it.
Before this patch the md5all option would skip creating metadata with
hashsum if base filesystem provided md5, in hope to pass it through.
However, if base hash is slow (for example on local fs), chunker passed
slow md5 but never reported this fact in features.
This patch makes chunker snapshot base hashsum in metadata when md5all is
set and base hashsum is slow since chunker was intended to provide only
instant hashsums from the start.
Fixes#5508
Before this change, when uploading to a crypt, the ObjectInfo
accidentally used the encrypted size, not the unencrypted size when
--crypt-no-data-encryption was set.
Fixes#5498
In presence of no_data_encryption the Crypt's Put method used to over-optimize
and returned base object. This patch makes it return Crypt-wrapped object now.
Fixes#5498
I discovered that `rclone` always upload in chunks of 16MiB whenever
uploading a file smaller than `--drive-upload-cutoff`. This is
undesirable since the purpose of the flag `--drive-upload-cutoff` is
to *prevent* chunking below a certain file size.
I realized that it wasn't `rclone` forcing the 16MiB chunks. The
`google-api-go-client` forces a chunk size default of
[`googleapi.DefaultUploadChunkSize`](32bf29c2e1/googleapi/googleapi.go (L55-L57))
bytes for resumable type uploads. This means that all requests that
use `*drive.Service` directly for upload without specifying a
`googleapi.ChunkSize` will be forced to use a *`resumable`*
`uploadType` (rather than `multipart`) for files less than
`googleapi.DefaultUploadChunkSize`. This is also noted directly in the
Drive API client documentation [here](https://pkg.go.dev/google.golang.org/api/drive/v3@v0.44.0#FilesUpdateCall.Media).
This fixes the problem by passing `googleapi.ChunkSize(0)` to
`Media()` method calls, which is the only way to disable chunking
completely. This is mentioned in the API docs
[here](https://pkg.go.dev/google.golang.org/api/googleapi@v0.44.0#ChunkSize).
The other alternative would be to pass
`googleapi.ChunkSize(f.opt.ChunkSize)` -- however, I'm *strongly* in
favor of *not* doing this for performance reasons. By not explicitly
passing a `googleapi.ChunkSize(0)`, we effectively allow
[`PrepareUpload()`](https://pkg.go.dev/google.golang.org/api/internal/gensupport@v0.44.0#PrepareUpload)
to create a
[`NewMediaBuffer`](https://pkg.go.dev/google.golang.org/api/internal/gensupport@v0.44.0#NewMediaBuffer)
that copies the original `io.Reader` passed to `Media()` in order to
check that its size is less than `ChunkSize`, which will unnecessarily
consume time and memory.
`minChunkSize` is also changed to be `googleapi.MinUploadChunkSize`,
as it is something specified we have no control over.
Google Drive API allows for clauses like "modifiedTime > '2012-06-04T12:00:00'"
in the query param, so the filter flags --max-age and --min-age can be applied
directly at the directory listing phase rather than in a filter.
This is extremely helpful when we want to do an incremental backup of a remote
drive with many files but the number of recently changed file is small.
Co-authored-by: fotile96 <fotile96@users.noreply.github.com>
This replaces built-in os.MkdirAll with a patched version that stops the recursion
when reaching the volume part of the path. The original version would continue recursion,
and for extended length paths end up with \\? as the top-level directory, and the error
message would then be something like:
mkdir \\?: The filename, directory name, or volume label syntax is incorrect.
Before this change the union's feature flags were a strict AND of the
underlying remotes. This means that a union of a local disk (which can
Move but not Copy) and a bucket based remote (which can Copy but not
Move) could neither Move nor Copy.
This fix advertises Move in the union if all the remotes can Move or
Copy. It also implements Move as Copy+Delete (like rclone does
normally) if the underlying union does not support Move.
This enables renames to work with unions of local disk and bucket
based remotes expected.
Fixes#5632
This was started in
3626f10f26 pcloud: add sha256 support - fixes#5496
But this support turned out to be incomplete and caused the
integration tests to fail.
After updating rclone's dependencies these tests started failing on
windows/386
- TestInternalDoubleWrittenContentMatches
- TestInternalMaxChunkSizeRespected
The failures look like this. The root cause is unknown. The `Wait(n=1)
would exceed context deadline` errors come from golang.org/x/time/rate
but it isn't clear what is calling them.
2021/08/20 21:57:16 ERROR : worker-0 <one>: object open failed 0: rate: Wait(n=1) would exceed context deadline
[snip ~10 duplicates]
2021/08/20 21:57:56 ERROR : tidwcm1629496636/one: (0/26) error (chunk not found 0) response
2021/08/20 21:58:02 ERROR : worker-0 <one>: object open failed 0: rate: Wait(n=1) would exceed context deadline
--- FAIL: TestInternalDoubleWrittenContentMatches (45.77s)
cache_internal_test.go:310:
Error Trace: cache_internal_test.go:310
Error: Not equal:
expected: "one content updated double"
actual : ""
Diff:
--- Expected
+++ Actual
@@ -1 +1 @@
-one content updated double
+
Test: TestInternalDoubleWrittenContentMatches
2021/08/20 21:58:03 original size: 23592960
2021/08/20 21:58:03 updated size: 12