Before this change if we copied files of unknown size, then they lost
their metadata.
This was particularly noticeable using --s3-decompress.
This change adds metadata to Rcat and RcatSized and changes Copy to
pass the metadata in when it calls Rcat for an unknown sized input.
Fixes#6546
Before this change, some files were giving this error when downloaded
from Cloudflare and other providers.
ERROR corrupted on transfer: sizes differ NNN vs MMM
This is because these providers auto gzips the object when rclone
wasn't expecting it to. (AWS does not gzip objects without their being
uploaded gzipped).
This patch adds a quirk to for fix the problem and a flag to control
it. The quirk `might_gzip` is set to `true` for all providers except
AWS.
See: https://forum.rclone.org/t/s3-error-corrupted-on-transfer-sizes-differ-nnn-vs-mmm/33694/Fixes: #6533
Before this fix it was impossible to stop rclone generating an
X-Amx-Acl: header which is incompatible with GCS with uniform access
control and is generally deprecated at AWS.
The API endpoint GetBucketLocation requires
top level permission.
If we do an authenticated head request to a bucket, the bucket location will be returned in the HTTP headers.
Fixes#5066
Before this change if --user-server-modtime was in use the ModTime
could change for an object as we receive it accurate to the nearest ms
in listings, but only accurate to the nearest second in HEAD and GET
requests.
Normally AWS returns the milliseconds as .000 in listings, but if
versions are in use it may not. Storj S3 also seems to return
milliseconds.
This patch tries to keep the maximum precision in the last modified
time, so it doesn't update a last modified time with a truncated
version if the times were the same to the nearest second.
See: https://forum.rclone.org/t/cache-fingerprint-miss-behavior-leading-to-false-positive-stalen-cache/33404/
Before this change rclone used statx() to read the metadata for files
from the local filesystem when `-M` was in use.
Unfortunately statx() was only introduced in kernel 4.11 which was
released in April 2017 so there are current systems (eg Centos 7)
still on kernel versions which don't support statx().
This patch checks to see if statx() is available and if it isn't, it
falls back to using fstatat() which was introduced in Linux 2.6.16
which is guaranteed for all Go versions.
See: https://forum.rclone.org/t/metadata-from-linux-local-s3-failed-to-copy-failed-to-read-metadata-from-source-object-function-not-implemented/33233/
In https://github.com/jlaffaye/ftp/commit/212daf295f the upstream FTP
library changed the way adding your own dialer works which meant that
connections when using explicit FTP were failing.
This patch reworks our connection code to bring it into the
expectations of the library.
Before this fix, if an error ocurred reading the metadata, it could be
set as nil and then used, causing a crash.
This fix changes the readMetadata function so it returns an error, and
the error is always set if the metadata returned is nil.
If mkdir fails then before this change it would have thrown an
error.
After this change, if the error indicated that the directory
already exists then the error is not returned to the user.
This fixes a race condition when two rclone threads are trying to
create the same directory.
In this commit
8d1fff9a82 local: obey file filters in listing to fix errors on excluded files
We started using filters in the local backend so the user could short
circuit troublesome files/directories at a low level.
However this caused a number of integration tests to fail. This turned
out to be in backends wrapping the local backend. For example the
combine backend test failed because it changes the paths passed to the
local backend so they no longer match the paths in the current filter.
To fix this, a new feature flag `FilterAware` was added and the
UseFilter context flag is only passed to backends which support it. As
the wrapping backends don't support the flag, this fixes the problems
in the integration tests.
In future the wrapping backends could modify the active filters to
match the path modifications and then they could set the FilterAware
flag.
See #6376
Before this fix, the chunksize calculator was using the previous size
of the object, not the new size of the object to calculate the chunk
sizes.
This meant that uploading a replacement object which needed a new
chunk size would fail, using too many parts.
This fix fixes the calculator to take the size explicitly.
Before this change, if rclone was run with `-M` on a filesystem
without xattr support, it would error out.
This patch makes rclone detect the not supported errors and disable
xattrs from then on. It prints one ERROR level message about this.
See: https://forum.rclone.org/t/metadata-update-local-s3/32277/7
Before this change, if an object compressed with "Content-Encoding:
gzip" was downloaded, a length and hash mismatch would occur since the
go runtime automatically decompressed the object on download.
If --s3-decompress is set, this change erases the length and hash on
compressed objects so they can be downloaded successfully, at the cost
of not being able to check the length or the hash of the downloaded
object.
If --s3-decompress is not set the compressed files will be downloaded
as-is providing compressed objects with intact size and hash
information.
See #2658
Before this fix, the dropbox backend wasn't decoding the file names
received in changenotify events into rclone standard format.
This meant that changenotify events for filenames which had encoded
characters were failing to be decrypted properly if wrapped in crypt.
See: https://forum.rclone.org/t/rclone-vfs-cache-says-file-name-too-long/31535
Before this patch backends could be shutdown when they fell out of the
cache when they were in use with combine. This was particularly
noticeable with the dropbox backend which gave this error when
uploading files after the backend was Shutdown.
Failed to copy: upload failed: batcher is shutting down
This patch gets the combine remote to pin them until it is finished.
See: https://forum.rclone.org/t/rclone-combine-upload-failed-batcher-is-shutting-down/32168
Previously, with standard auth, the username would be stored in config - but only after
entering the non-standard device/mountpoint sequence during config (a feature introduced
with #5926). Regardless of that, rclone always requests the username from the api at
startup (NewFS).
In #6270 (commit 9dbed02329) this was changed to always
store username in config (consistency), and then also use it to avoid the repeated
customer info request in NewFs (performance). But, as reported in #6309, it did not work
with legacy auth, where user enters username manually, if user entered an email address
instead of the internal username required for api requests. This change was therefore
recently reverted.
The current commit takes another step back to not store the username in config during
the non-standard device/mountpoint config sequence (consistentcy). The username will
now only be stored in config when using legacy auth, where it is an input parameter.
Extend the shouldRetry function by also checking for the quotaExceeded
reason, and since this function appeared to be untested, add a test case
for the existing errors and this new one.
Fixes#615
In
22abd785eb s3: implement reading and writing of metadata #111
The reading information of objects was refactored to use the
s3.HeadObjectOutput structure.
Unfortunately the code branch with `--s3-no-head` was not tested
otherwise this panic would have been discovered.
This shows that this is path is not integration tested, so this adds a
new integration test.
Fixes#6322
`FS.cacheExpiry` is accessed through sync/atomic.
According to the documentation, "On ARM, 386, and 32-bit MIPS, it is
the caller's responsibility to arrange for 64-bit alignment of 64-bit
words accessed atomically. The first word in a variable or in an
allocated struct, array, or slice can be relied upon to be 64-bit
aligned."
Before commit 1d2fe0d856 this field was
aligned, but then a new field was added to the structure, causing the
test suite to panic on linux/386.
No other field is used with sync/atomic, so `cacheExpiry` can just be
placed at the beginning of the stuct to ensure it is always aligned.
By default these will be downloaded compressed.
This changes the default of the previous commit
2781f8e2f1 gcs: Fix download of "Content-Encoding: gzip" compressed objects
But will fit in better with the metadata framework when copying
gzip-encoded objects from backend to backend.
Existing version did save username in config, but only when entering the custom
device/mountpoint sequence in config. Regardless of that, it did always look up the
username at startup with an api request.
This commit improves it so that the username will always be stored in config,
and when using standard authentication it picks it from the login token instead of
requesting it from the remote api, and also in fs constructor it picks it from config
instead of requesting it from remote api (again).
The SDK doesn't wrap errors in a Go standard way so they can't be
unwrapped and tested for - eg fatal error.
The code looks for a Serialization or RequestError and returns the
unwrapped underlying error if possible.
This fixes the fs/operations integration tests checking for fatal
errors being returned.
In this commit
e5974ac4b0 s3: use PutObject from the aws SDK to upload single part objects
rclone was made to upload objects to s3 using PUT requests rather than
using signed uploads.
However this change missed the fact that there is a supported way to
do this in the SDK using the SetStreamingBody method on the Request.
This therefore reverts a lot of the previous commit to do with making
an unsigned connection and other complication and uses the SDK
facility.
Before this fix GetFreeSpace returned math.MaxInt64 for remotes which
don't support reading free space, however this is used in various
comparison routines as a too large value, meaning that remotes of size
math.MaxInt64 were never being selected.
This fixes GetFreeSpace to return math.MaxInt64 - 1 so then can be selected.
It also fixes GetUsedSpace the same way however as the default for not
supported was 0 this was very unlikely to have ever caused a problem.
By default, rclone always requests read and write permissions. No matter what settings you configure in the AAD application. This option allows to explicitly request readonly permissions
Migrated read only option to access scope option and set disable_site_permission option to hidden.
Windows shells like cmd and powershell needs to use different quoting/escaping
of strings and paths than the unix shell, and also absolute paths must be fixed
by removing leading slash that the POSIX formatted paths have
(e.g. /C:/Users does not work in shell, it must be converted to C:/Users).
Tries to autodetect shell type (cmd, powershell, unix) on first use.
Implemented default builtin powershell functions for hashsum and about when remote
shell is powershell.
See #5763Fixes#5758
This adjusts
rclone backend drives -o config drive:
So that it also emits a config section called `AllDrives` which uses
the combine backend to make a backend which combines all the shared
drives into one.
It also makes sure that all the shared drive names are valid rclone
config names, deduplicating if necessary.
Fixes#4506
Before this change, if an object compressed with "Content-Encoding:
gzip" was downloaded, a length and hash mismatch would occur since the
as the go runtime automatically decompressed the object on download.
This change erases the length and hash on compressed objects so they
can be downloaded successfully, at the cost of not being able to check
the length or the hash of the downloaded object.
This also adds the --gcs-download-compressed flag to allow the
compressed files to be downloaded as-is providing compressed objects
with intact size and hash information.
Fixes#2658
Before this fix, if uploading to a union consisting of all bucket
based remotes (eg s3), uploads failed with:
Failed to copy: object not found
This was because the union backend was relying on parent directories
being created to work out which files to upload. If all the upstreams
were bucket based backends which can't hold empty directories, no
directories were created and the upload failed.
This fixes the problem by returning the upstreams used when creating
the directory for the upload, rather than searching for them again
after they've been created.
This will also make the union backend a little more efficient.
Fixes#6170
strings.ReplaceAll(s, old, new) is a wrapper function for
strings.Replace(s, old, new, -1). But strings.ReplaceAll is more
readable and removes the hardcoded -1.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
The "relative" argument was missing when Put'ing a file. This
sets an incorrect object entry in the cache, leading to the file being
unreadable when using mount functionality.
Fixes#6151
Before this change rclone used presigned requests to upload single
part objects. This was because of a limitation in the SDK which didn't
allow non seekable io.Readers to be passed in.
This is incompatible with some S3 backends, and rclone wasn't adding
the `X-Amz-Content-Sha256: UNSIGNED-PAYLOAD` header which was
incompatible with other S3 backends.
The SDK now allows for this so rclone can use PutObject directly.
This sets the `X-Amz-Content-Sha256: UNSIGNED-PAYLOAD` flag on the PUT
request. However rclone will add a `Content-Md5` header if at all
possible so the body data is still protected.
Note that the old behaviour can still be configured if required with
the `use_presigned_request` config parameter.
Fixes#5422
Uses b2_list_file_versions to retrieve all file versions, and returns
the one that was active at the specified time
This is especially useful in combination with other backup tools, such
as restic, which may use rclone as a backend.
Jottacloud have several different apis and endpoints using a mix of different
timestamp formats. In existing code the list operation (after the recent liststream
implementation) uses format from golang's time.RFC3339 constant. Uploads (using the
allocate api) uses format from a hard coded constant with value identical to golang's
time.RFC3339. And then we have the classic JFS time format, which is similar to RFC3339
but not identical, using a different constant. Also the naming is a bit confusing,
since the term api is used both as a generic term and also as a reference to the
newer format used in the api subdomain where the allocate endpoint is located.
This commit refactors these things a bit.
Now using the utility function for deduplication that was newly implemented to
fix an issue with server-side copy. This function uses the original, and generic,
"jfs" api (and its "cphash" feature), instead of the newer "allocate" api dedicated
for uploads. Both apis support similar deduplication functionaly that we rely on for
the SetModTime operation. One advantage of using the jfs variant is that the allocate
api is specialized for uploads, an initial request performs modtime-only changes and
deduplication if possible but if not possible it creates an incomplete file revision
and returns a special url to be used with a following request to upload missing content.
In the SetModTime function we only sent the first request, using metadata from existing
remote file but different timestamps, which lead to a modtime-only change. If, for some
reason, this should fail it would leave the incomplete revision behind. Probably not
a problem, but the jfs implementation used with this commit is simpler and
a more "standalone" request which either succeeds or fails without expecting additional
requests.
A strange feature (probably bug) in the api used by the server-side copy implementation
in Jottacloud backend is that if the destination file is in trash, the copy request
succeeds but the destination will still be in trash! When this situation occurs in
rclone, the copy command will fail with "Failed to copy: object not found" because
rclone verifies that the file info in the response from the copy request is valid,
and since it is marked as deleted it is treated as invalid.
This commit works around this problem by looking for this situation in the response
from the copy operation, and send an additional request to a built-in deduplication
endpoint that will restore the file from trash.
Fixes#6112
The existing code in rclone set the value "offline_access+openid",
when encoded in body it will become "offline_access%2Bopenid". I think
this is wrong. Probably an artifact of "double urlencoding" mixup -
either in rclone or in the jottacloud cli tool version it was sniffed
from? It does work, though. The token received will have scopes "email
offline_access" in it, and the same is true if I change to only
sending "offline_access" as scope.
If a proper space delimited list of "offline_access openid"
is used in the request, the response also includes openid scope:
"openid email offline_access". I think this is more correct and this
patch implements this.
See: #6107
Adds a configuration option to the GCS backend to allow skipping the
check if a bucket exists before copying an object to it, much like
f406dbb added for S3.
Before this change the cache backend was passing -1 into
rate.NewLimiter to mean unlimited transactions per second.
In a recent update this immediately returns a rate limit error as
might be expected.
This patch uses rate.Inf as indicated by the docs to signal no limits
are required.
Before this change the 206 responses from putio Range requests were being
returned as errors.
This change checks for 200 and 206 in the GET response now.
Before this fix, rclone retries chunks of multipart uploads. However
if they had been partially received dropbox would reply with an
incorrect_offset error which rclone was ignoring.
This patch parses the new offset from the error response and uses it
to adjust the data that rclone sends so it is the same as what dropbox
is expecting.
See: https://forum.rclone.org/t/dropbox-rate-limiting-for-upload/29779
This commit switches Google Cloud Storage from the drive pacer to the
s3 pacer. The main difference between them is that the s3 pacer does
not limit transactions in the non-error case. This is appropriate for
a cloud storage backend where you pay for each transaction.
Before this fix `NewObject` could return a wrapped `fs.Object(nil)`
which caused a crash. This was caused by `wrapObject` returning a
`nil` `*Object` which was cast into an `fs.Object`.
This changes the interface of `wrapObject` so it returns an
`fs.Object` instead of a `*Object` and an error which must be checked.
This forces the callers to return a `nil` object rather than an
`fs.Object(nil)`.
See: https://forum.rclone.org/t/panic-in-hasher-when-mounting-with-vfs-cache-and-not-synced-data-in-the-cache/29697/11
Having a replace directive in go.mod causes "go get
github.com/rclone/rclone" to fail as it discussed in this Go issue:
https://github.com/golang/go/issues/44840
This is apparently how the Go team want go.mod to work, so this commit
hard forks github.com/jlaffaye/ftp into github.com/rclone/ftp so we
can remove the `replace` directive from the go.mod file.
Fixes#5810
Before this change rclone send pre-1970 timestamps as negative
numbers. pCloud ignores these and sets them as todays date.
This change sends the timestamps as unsigned 64 bit integers (which is
how the binary protocol sends them) and pCloud accepts the (actually
negative) timestamp like this.
Before this change the new multipart upload ETag checking code was
failing in the integration tests with Alibaba OSS.
Apparently Alibaba calculate the ETag in a different way to AWS.
This introduces a new provider quirk with a flag to disable the
checking of the ETag for multipart uploads.
Mulpart Etag checking has been enabled for all providers that we can
test for and work, and left disabled for the others.
Before this rclone ignored the ETag on multipart uploads which missed
an opportunity for a whole file integrity check.
This adds that check which means that we now check even harder that
multipart uploads have arrived properly.
See #5993
Before this change `rclone about swift:container` would show aggregate
info about all the containers, not just the one in use.
This causes a problem if container listing is disabled (for example in
the Blomp service).
This fix makes `rclone about swift:container` show only the info about
the given `container`. If aggregate info about all the containers is
required then use `rclone about swift:`.
See: https://forum.rclone.org/t/rclone-mount-blomp-problem/29151/18
Before this change, rclone supported authorizing for remote systems by
going to a URL and cutting and pasting a token from Google. This is
known as the OAuth out-of-band (oob) flow.
This, while very convenient for users, has been shown to be insecure
and has been deprecated by Google.
https://developers.googleblog.com/2022/02/making-oauth-flows-safer.html#disallowed-oob
> OAuth out-of-band (OOB) is a legacy flow developed to support native
> clients which do not have a redirect URI like web apps to accept the
> credentials after a user approves an OAuth consent request. The OOB
> flow poses a remote phishing risk and clients must migrate to an
> alternative method to protect against this vulnerability. New
> clients will be unable to use this flow starting on Feb 28, 2022.
This change disables that flow, and forces the user to use the
redirect URL flow. (This is the flow used already for local configs.)
In practice this will mean that instead of cutting and pasting a token
for remote config, it will be necessary to run "rclone authorize"
instead. This is how all the other OAuth backends work so it is a well
tested code path.
Fixes#6000
The directory created by `T.TempDir` is automatically removed when the
test and all its subtests complete.
Reference: https://pkg.go.dev/testing#T.TempDir
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Before this change a multipart upload with the --no-head flag returned
the MD5SUM as a base64 string rather than a Hex string as the rest of
rclone was expecting.
Before this change attempting NewObject on a SAS URL's root would
crash the Azure SDK.
This change detects that using the code from this previous fix
f7404f52e7 azureblob: fix crash when listing outside a SAS URL's root - fixes#4851
And returns not object not found instead.
It also prevents things being uploaded to the root of the SAS URL
which also crashes the Azure SDK.
Before this fix if a file was updated, but to the same length and
timestamp then the local backend would return the wrong (cached)
hashes for the object.
This happens regularly on a crypted local disk mount when the VFS
thinks files have been changed but actually their contents are
identical to that written previously. This is because when files are
uploaded their nonce changes so the contents of the file changes but
the timestamp and size remain the same because the file didn't
actually change.
This causes errors like this:
ERROR: file: Failed to copy: corrupted on transfer: md5 crypted
hash differ "X" vs "Y"
This turned out to be because the local backend wasn't clearing its
cache of hashes when the file was updated.
This fix clears the hash cache for Update and Remove.
It also puts a src and destination in the crypt message to make future
debugging easier.
Fixes#4031
Currently the B2 docs don't specify which format the download_url
setting should have, and if you input it wrong, there is nothing
in the verbose logs or anywhere else that can let you know that.
* Wasabi starts to provide AP Northeast 2 (Osaka) endpoint, so add it to the list
* Rename ap-northeast-1 as "AP Northeast 1 (Tokyo)" from "AP Northeast"
Signed-off-by: lindwurm <lindwurm.q@gmail.com>
After speed testing it was discovered that upload speed goes up pretty
much linearly with upload concurrency. This patch changes the default
from 4 to 16 which means that rclone will use 16 * 4M = 64M per
transfer which is OK even for low memory devices.
This adds a note that performance may be increased by increasing
upload concurrency.
See: https://forum.rclone.org/t/performance-of-rclone-vs-azcopy/27437/9
Previously only the fs being checked on gets passed to
GetModifyWindow(). However, in most tests, the test files are
generated in the local fs and transferred to the remote fs. So the
local fs time precision has to be taken into account.
This meant that on Windows the time tests failed because the
local fs has a time precision of 100ns. Checking remote items uploaded
from local fs on Windows also requires a modify window of 100ns.
This is possible now that we no longer support go1.12 and brings
rclone into line with standard practices in the Go world.
This also removes errors.New and errors.Errorf from lib/errors and
prefers the stdlib errors package over lib/errors.
This removes the checks against the provider throughout the code and
puts them into a single setQuirks function for easy maintenance when
adding a new provider.
It also updates the quirks with the results of testing against
backends we have access to.
This also adds a list_url_encode parameter so that quirk can be
manually set.
This implements a quirks system for providers and notes which
providers we have tested to support ListObjectsV2.
For those providers which don't support ListObjectsV2 we use the
original ListObjects call.
The API doesn't seem to accept a value of "0" any more for the root
directory ID, giving the error "Could not decode folder id".
However omitting it seems to work fine.
In this commit, released in 1.56.0 we started reading the size of the
object from the Content-Length header as returned by the GET request
to read the object.
4401d180aa s3: add --s3-no-head-object
However some object storage systems, notably Ceph, don't return a
Content-Length header.
The new code correctly calls the setMetaData function with a nil
pointer to the ContentLength.
However due to this commit from 2014, released in v1.18, the
setMetaData function was not ignoring the size as it should have done.
0da6f24221 s3: use official github.com/aws/aws-sdk-go including multipart upload #101
This commit correctly ignores the content length if not set.
Fixes#5732
Before this change the `shared_credentials_file` config option was
being ignored.
The correct value is passed into the SDK but it only sets the
credentials in the default provider. Unfortunately we wipe the default
provider in order to install our own chain if env_auth is true.
This patch restores the shared credentials file in the session
options, exactly the same as how we restore the profile.
Original fix:
1605f9e14d s3: Fix shared_credentials_file auth
This patch reverts this commit
1605f9e14d s3: Fix shared_credentials_file auth
It unfortunately had the side effect of making the s3 SDK ignore the
config in our custom chain and use the default provider. This means
that advanced auth was being ignored such as --s3-profile with
role_arn.
Fixes#5468Fixes#5762
The API has changed in the directory move call JSON response from
returning a TaskID as a string to returning it as an integer. In other
places it is still returned as a string though.
This patch allows the TaskID to be an integer or a string in the JSON
response and keeps it internally as a string like before.
Before this change the backoff for the error_background error was 6
seconds. This means that if it wasn't resolved in 60 seconds (with the
default 10 low level retries) then an error was reported.
This error was being reported frequently in the integration tests, so
is likely affecting real users too.
This patch changes the backoff into an exponential backoff
1,2,4,8...1024 seconds to make sure we wait long enough for the
background operation to complete.
See #5734
- setup correct path encoding (fixes backend test FsEncoding)
- ignore range option if file is empty (fixes VFS test TestFileReadAtZeroLength)
- cleanup stray files left after failed upload (fixes test FsPutError)
- rebase code on master, adapt backend for rclone context passing
- translate Siad errors to rclone native FS errors in sia errorHandler
- TestSia: return proper backend options from the script
- TestSia: use uptodate AntFarm image, nebulouslabs/siaantfarm is stale
In
05f128868f azureblob: add --azureblob-no-head-object
we incorrectly parsed the size of the object as the Content-Length of
the returned header. This is incorrect in the presense of Range
requests.
This fixes the problem by parsing the Content-Range header if
avaialble to read the correct length from if a Range request was
issued.
See: #5734
This reverts commit
dc06973796 Revert "s3: use rclone's low level retries instead of AWS SDK to fix listing retries"
Which in turn reverted
5470d34740 "backend/s3: use low-level-retries as the number of SDK retries"
So we are back where we started.
It then modifies it to set the AWS SDK to `--low-level-retries`
retries, but set the rclone retries to 2 so that directory listings
can be retried.