Before this change crypt would not calculate hashes for files it was
uploading. This is because, in the general case, they have to be
downloaded, encrypted and hashed which is too resource intensive.
However this causes backends which need the hash first before
uploading (eg s3/b2 when uploading chunked files) not to have a hash
of the file. This causes cryptcheck to complain about missing hashes
on large files uploaded via s3/b2.
This change calculates hashes for the upload if the upload is coming
from a local filesystem. It does this by encrypting and hashing the
local file re-using the code used by cryptcheck. For a local disk this
is not a lot more intensive than calculating the hash.
See: https://forum.rclone.org/t/strange-output-for-cryptcheck/15437Fixes: #2809
Previously we had a map of pools for different chunk sizes.
In practice the mapping is not very useful and requires a lock.
Pools of size other that ChunkSize can only happen when we have a huge file (over 10k * ChunkSize).
We need to have a bunch of identically sized huge files.
In such case most likely ChunkSize should be increased.
The mapping and its lock is replaced with a single initialised pool for ChunkSize, in other cases pool is allocated and freed on per file basis.
Rclone can't safely delete files with multiple parents without
PATCHing the parents list. This can be done, but since multiple
parents are going away to be replaced by drive shortcuts we return an
error for now.
See #4013
Before this change we queries /me/drives for a list of the users
drives and asked the user to choose. Sometimes this does not return
the users main drive for reasons unknown.
After this change we query /me/drives first then /me/drive and add
that to the list of drives if it wasn't already there.
In 5470d34740 "backend/s3: use low-level-retries as the number
of SDK retries" we switched over to using the AWS SDK low level
retries instead of rclone's low level retry logic.
This had the unfortunate attempt that retrying listings to correct XML
Syntax errors failed on non S3 backends such as CEPH. The AWS SDK was
also retrying the XML Syntax error request which doesn't make sense.
This change turns off the AWS SDK retries in favour of just using
rclone's retry logic.
If chunk size is more than 250M (262,144,000 bytes) then API throws the following error:
Microsoft.SharePoint.Client.InvalidClientQueryException: The request message is too big. The server does not allow messages larger than 262144000 bytes.
Before this change rclone didn't use sparse files on Windows. This
means that when you downloaded a file with multithread download it
wrote the entire file with zeros first on the first write not at the
start of the file.
This change makes the file be sparse on Windows. Linux/macOS files
were already sparse.
Before this change shared with me items with multiple parents (ie most
of them that aren't in the root) would appear twice in the directory
listings.
This fixes the problem by doing an early exit for shared with me
items.
This bug was introduced here by removing some necessary code detecting
shared with me items at the root with no parents.
4453fa4ba6 "drive: fix --fast-list when using appDataFolder"
This fix reverts that part of the patch.
Fixes#4018
This adds a bit of missed locking around the uploaded info to fix the
concurrent map write.
All the other accesses have locking - this one must have got missed.
pureftpd has a bug where it sends messages like this
```
150-Accepted data connection\r\n
Response code: File status okay; about to open data connection (150)
Response arg: Accepted data connection
150 32768.0 kbytes to download\r\n
150 0.014 seconds (measured here), 1665.27 Mbytes per second\r\n
```
The last `150` is treated as a new response - the previous `150` should have been `150-`.
This means that rclone sees the `150 0.014 seconds (measured here),
1665.27 Mbytes per second` as a reply to the next message and reports
it as an error.
This fix ignores that specific message when it is received in the
`Close` method. It dumps the FTP connection after as it is out of
sync.
See: #3984Fixes#3445
Before this change if rclone failed to close a file download for some
reason it would leak a concurrency token. When all the tokens were
leaked then rclone would lock up.
This fix returns the concurrency token regardless of the error status.
Before this change if rclone failed to upload a file for some reason
it would leak a concurrency token. When all the tokens were leaked
then rclone would lock up.
The fix returns the concurrency token regardless of the error state.
Before this change if rclone failed to make an FTP connection for some
reason it would leak a concurrency token. When all the tokens were
leaked then rclone would lock up.
The fix returns the concurrency token if creating the FTP connection
returns an error.
Amazon S3 is built to handle different kinds of workloads.
In rare cases where S3 is not able to scale for whatever reason users
will face status 500 errors.
Main mechanism for handling these errors are retries.
Amount of needed retries varies for each different use case.
This change is making retries for s3 backend configurable by using
--low-level-retries option.
Currently each multipart upload allocated his own buffers, which after
file upload was garbaged. Next files couldn't leverage already allocated
memory which resulted in inefficent memory management. This change
introduces backend memory pool keeping memory chunks which can be
used during object operations.
Fixes#3967
The error code 500 Internal Error indicates that Amazon S3 is unable to handle the request at that time. The error code 503 Slow Down typically indicates that the requests to the S3 bucket are very high, exceeding the request rates described in Request Rate and Performance Guidelines.
Because Amazon S3 is a distributed service, a very small percentage of 5xx errors are expected during normal use of the service. All requests that return 5xx errors from Amazon S3 can and should be retried, so we recommend that applications making requests to Amazon S3 have a fault-tolerance mechanism to recover from these errors.
https://aws.amazon.com/premiumsupport/knowledge-center/http-5xx-errors-s3/
This removes the unused functions run.writeRemoteRandomBytes() run.writeObjectRandomBytes() run.listPath() Directory.parentRemote() and Persistent.dumpRoot().
Before this change, when uploading multipart files, onedrive would
sometimes return an unexpected 416 error and rclone would abort the
transfer.
This is usually after a 500 error which caused rclone to do a retry.
This change checks the upload position on a 416 error and works how
much of the current chunk to skip, then retries (or skips) the current
chunk as appropriate.
If the position is before the current chunk or after the current chunk
then rclone will abort the transfer.
See: https://forum.rclone.org/t/fragment-overlap-error-with-encrypted-onedrive/14001Fixes#3131
This hides:
- "use_created_date"
- "use_shared_date"
- "size_as_quota"
from the configurator (rclone config) as they interfere with normal
operations and shouldn't be set in a backend config.
They can still be put in the config file by hand and will still work
as variables, etc.
This adds some more docs to "size_as_quota" also.
Fixes#3912
Before this change we used non multipart uploads for files of unknown
size (streaming and uploads in mount). This is slower and less
reliable and is not recommended by Google for files smaller than 5MB.
After this change we use multipart resumable uploads for all files of
unknown length. This will use an extra transaction so is less
efficient for files under the chunk size, however the natural
buffering in the operations.Rcat call specified by
`--streaming-upload-cutoff` will overcome this.
See: https://forum.rclone.org/t/upload-behaviour-and-speed-when-using-vfs-cache/9920/
This error started happening after updating golang/x/crypto which was
done as a side effect of:
3801b8109 vendor: update termbox-go to fix ncdu command on FreeBSD
This turned out to be a deliberate policy of making
ssh.ParsePrivateKeyWithPassphrase fail if the passphrase was empty.
See: https://go-review.googlesource.com/c/crypto/+/207599
This fix calls ssh.ParsePrivateKey if the passphrase is empty and
ssh.ParsePrivateKeyWithPassphrase otherwise which fixes the problem.
If the --drive-stop-on-upload-limit flag is in effect this checks the
error string from Google Drive to see if it is the error you get when
you've breached your 750GB a day limit.
If so then it turns this error into a Fatal error which should stop
the sync.
Fixes#3857
In listings if the ID `appDataFolder` is used to list a directory the
parents of the items returned have the actual ID instead the alias
`appDataFolder`. This confused the ListR routine into ignoring all
these items.
This change makes the listing routine accept all parent IDs returned
if there was only one ID in the query. This fixes the `appDataFolder`
problem. This means we are relying on Google Drive to only return the
items we asked for which is probably OK.
Fixes#3851
The S3 ListObject API returns paginated bucket listings, with
"MaxKeys" items for each GET call.
The default value is 1000 entries, but for buckets with millions of
objects it might make sense to request more elements per request, if
the backend supports it. This commit adds a "list_chunk" option for
the user to specify a lower or higher value.
This commit does not add safe guards around this value - if a user
decides to request a too large list, it might result in connection
timeouts (on the server or client).
In AWS S3, there is a fixed limit of 1000, some other services might
have one too. In Ceph, this can be configured in RadosGW.
Before this patch we were failing to URL decode the NextMarker when
url encoding was used for the listing.
The result of this was duplicated listings entries for directories
with >1000 entries where the NextMarker was a file containing a space.
Before this change we used the same (relatively low limits) for server
side copy as we did for multipart uploads. It doesn't make sense to
use the same limits since no data is being downloaded or uploaded for
a server side copy.
This change introduces a new parameter --s3-copy-cutoff to control
when the switch from single to multipart server size copy happens and
defaults it to the maximum 5GB.
This makes server side copies much more efficient.
It also fixes the erroneous error when trying to set the modification
time of a file bigger than 5GB.
See #3778
Before this change multipart copies were giving the error
Range specified is not valid for source object of size
This was due to an off by one error in the range source introduced in
7b1274e29a "s3: support for multipart copy"
Before this change rclone used "Authorization: BEARER token". However
according the the RFC this should be "Bearer"
https://tools.ietf.org/html/rfc6750#section-2.1
This changes it to "Authorization: Bearer token"
Fixes#3751 and interop with Salesforce Webdav server
When using nextcloud, before this change we only uploaded one of SHA1
or MD5 checksum in the OC-Checksum header with preference to SHA1 if
both were set.
This makes the MD5 checksums read as empty string which makes syncing
with checksums less useful than they should be as all the MD5
checksums are blank.
This change makes it so that we only upload the SHA1 to nextcloud.
The behaviour of owncloud is unchanged as owncloud uses the checksum
as an upload integrity check only and calculates its own checksums.
See: https://forum.rclone.org/t/how-to-specify-hash-method-to-checksum/13055
This also corrects the symlink detection logic to only check symlink
files. Previous to this it was checking all directories too which was
making it do more stat calls than was necessary.
Before this change we forgot to URL decode the X-Object-Manifest in a dynamic large object.
This problem was introduced by 2fe8285f89 "swift: reserve
segments of dynamic large object when delete objects in container what
was enabled versioning."
For few commands, RClone counts a error multiple times. This was fixed by
creating a new error type which keeps a flag to remember if the error has
already been counted or not. The CountError function now wraps the original
error eith the above new error type and returns it.
Before this change rclone used the team_drive ID as the root if set
even if the root_folder_id was set too.
This change uses the root_folder_id in preference over the team_drive
which restores the functionality.
This problem was introduced by ba7c2ac443Fixes#3742
We attempt to find the ID of the root folder by doing a GET on the
folder ID "root". With scope "drive.files" this fails with a 404
message.
After this change if we get the 404 message, we just carry on using
"root" as the root folder ID and we cache that for future lookups.
This means that changenotify messages will not work correctly in the
root folder but otherwise has minor consequences.
See: https://forum.rclone.org/t/fresh-raspberry-pi-build-google-drive-404-error-failed-to-ls-googleapi-error-404-file-not-found/12791
Before this change rclone would allow the user to stream (eg with
rclone mount, rclone rcat or uploading google photos or docs) 5TB
files. This meant that rclone allocated 4 * 525 MB buffers per
transfer which is way too much memory by default.
This change makes rclone use the configured chunk size for streamed
uploads. This is 5MB by default which means that rclone can stream
upload files up to 48GB by default staying below the 10,000 chunks
limit.
This can be increased with --s3-chunk-size if necessary.
If rclone detects that a file is being streamed to s3 it will make a
single NOTICE level log stating the limitation.
This fixes the enormous memory usage.
Fixes#3568
See: https://forum.rclone.org/t/how-much-memory-does-rclone-need/12743
Before this fix we neglected to add the shared drive ID to the request
when asking for an initial change notify token and this caused a lot
more results to be returned than was necessary.
When we changed recursive lists to use --fast-list by default this
broke listing with --drive-shared-with-me from the root.
This turned out to be an unwarranted assumption in the ListR code that
all items would have a parent folder that we had searched for - this
isn't true for shared with me items.
This was fixed when using --drive-shared-with-me to give items that
didn't have any parents a synthetic parent.
Fixes#3639
Before this change we used the id "root" as an alias for the root drive ID.
However this causes problems when we receive IDs back from drive which
are not in this format and have been expanded to their canonical ID.
This change looks up the ID "root" and stores it in the
"drive_folder_id" parameter in the config file.
This helps with
- Notifying changes at the root
- Files shared with me at the root
See #3639
Before this change when rclone was compiled with go1.13 it used HTTP/2
to contact drive by default.
This causes lockups and INTERNAL_ERRORs from the HTTP/2 code.
This is a workaround disabling the HTTP/2 code on an option.
It can be re-enabled with `--drive-disable-http2=false`
See #3631
Before this change we silently skipped uploads to dropbox of
disallowed file names. However this then caused "corrupted on
transfer" errors because the sizes were wrong.
After this change we return an no retry error which will mean that the
sync fails (as it should - not all files were uploaded) but no
unecessary retries happened.
This works around a bug in Ceph which doesn't encode CommonPrefixes
when using URL encoded directory listings.
See: https://tracker.ceph.com/issues/41870
changes:
- chunker: remove GetTier and SetTier
- remove wdmrcompat metaformat
- remove fastopen strategy
- make hash_type option non-advanced
- adverise hash support when possible
- add metadata field "ver", run strict checks
- describe internal behavior in comments
- improve documentation
note:
wdmrcompat used to write file name in the metadata, so maximum metadata
size was 1K; removing it allows to cap size by 200 bytes now.
Note: chunker implements many irrelevant methods (UserInfo, Disconnect etc),
but they are required by TestIntegration/FsCheckWrap and cannot be removed.
Dropped API methods: MergeDirs DirCacheFlush PublicLink UserInfo Disconnect OpenWriterAt
Meta formats:
- renamed old simplejson format to wdmrcompat.
- new simplejson format supports hash sums and verification of chunk size/count.
Change list:
- split-chunking overlay for mailru
- add to all
- fix linter errors
- fix integration tests
- support chunks without meta object
- fix package paths
- propagate context
- fix formatting
- implement new required wrapper interfaces
- also test large file uploads
- simplify options
- user friendly name pattern
- set default chunk size 2G
- fix building with golang 1.9
- fix ci/cd on a separate branch
- fix updated object name (SyncUTFNorm failed)
- fix panic in Box overlay
- workaround: Box rename failed if name taken
- enhance comments in unit test
- fix formatting
- embed wrapped remote rather than inherit
- require wrapped remote to support move (or copy)
- implement 3 (keep fstest)
- drop irrelevant file system interfaces
- factor out Object.mainChunk
- refactor TestLargeUpload as InternalTest
- add unit test for chunk name formats
- new improved simplejson meta format
- tricky case in test FsIsFile (fix+ignore)
- remove debugging print
- hide temporary objects from listings
- fix bugs in chunking reader:
- return EOF immediately when all data is sent
- handle case when wrapped remote puts by hash (bug detected by TestRcat)
- chunked file hashing (feature)
- server-side copy across configs (feature)
- robust cleanup of temporary chunks in Put
- linear download strategy (no read-ahead, feature)
- fix unexpected EOF in the box multipart uploader
- throw error if destination ignores data
When used with v2_auth = true, PresignRequest doesn't return
signed headers, so remote dest authentication would be fail.
This commit copying back HTTPRequest.Header to headers.
Tested with RiakCS v2.1.0.
Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
- Read the storage class for each object
- Implement SetTier/GetTier
- Check the storage class on the **object** before using SetModTime
This updates the fix in 1a2fb52 so that SetModTime works when you are
using objects which have been migrated to GLACIER but you aren't using
GLACIER as a storage class.
Fixes#3522
Before this change we used PATCH on the object to update the metadata.
Apparently this requires the "full_control" scope which Google were
unhappy with in their oauth review.
This changes it to update the metadata by copying the object ontop of
itself (which is the way s3 works). This can be done with normal
permissions.
This fixes a crash on the google photos backend when an error is
returned from the rest.Call function.
This turned out to be a mis-understanding of the rest docs so
- improved rest.Call docs
- fixed mis-understanding in google photos backend
- fixed similar mis-understading in onedrive backend
- change the interface of listBuckets() removing dir parameter and adding context
- add makeBucket() and use in place of Mkdir("")
- this fixes some corner cases in Copy/Update
- mark all the listed buckets OK in ListR
Thanks to @yparitcher for the review.
Before this change, if the caller didn't provide a hint, we would
calculate all hashes for reads and writes.
The new whirlpool hash is particularly expensive and that has become noticeable.
Now we don't calculate any hashes on upload or download unless hints are provided.
This means that some operations may run slower and these will need to be discovered!
It does not affect anything calling operations.Copy which already puts
the corrects hints in.
When using the VFS with swift and --swift-no-chunk, PutStream was
returning objects with size -1 which was causing corrupted transfer
messages.
This was fixed by counting the bytes transferred in a streamed file
and updating the metadata with that.
This was factored from fstest as we were including the testing
enviroment into the main binary because of it.
This was causing opening the browser to fail because of 8243ff8bc8.
In 53a1a0e3ef we started returning non nil from NewObject when
an object isn't found. This breaks the integration tests and the API
expected of a backend.
This fixes it.
Introduce stats groups that will isolate accounting for logically
different transferring operations. That way multiple accounting
operations can be done in parallel without interfering with each other
stats.
Using groups is optional. There is dedicated global stats that will be
used by default if no group is specified. This is operating mode for CLI
usage which is just fire and forget operation.
For running rclone as rc http server each request will create it's own
group. Also there is an option to specify your own group.
Configuration time option to disable the above for if using Dropbox (does not
allow setting mtime on copy) or Amazon Drive (neither on upload nor on copy).
Before this change rclone was sending a MimeType in the requests for
server side Move and Copy.
The conjecture is that if you attempt to set the MimeType to something
different in a Copy then Google Drive has to do an actual copy of the
file data. This takes a very long time (since it is large) and fails
after a 90s timeout.
After the change we no longer set the MimeType in Move or Copy and the
copies happen instantly and correctly.
Many thanks to @darthShadow for discovering that this was causing the
problem.
Fixes#3070Fixes#3033Fixes#3300Fixes#3155
This was started by Fionera, finished off by Laura with fixes and more
docs from Nick.
Co-authored-by: Fionera <fionera@fionera.de>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
* azureblob - Add support for Azure Storage Emulator to test things locally.
Testing - Verified changes by testing manually.
* docs: update azureblob docs to reflect support of storage emulator
This bug was introduced as part of adding context to the backends and
slipped through the net because the About call did not have an
interface assertion in the sftp backend.
I checked there were no other missing interface assertions on all the
optional methods on all the backends.
- Change rclone/fs interfaces to accept context.Context
- Update interface implementations to use context.Context
- Change top level usage to propagate context to lover level functions
Context propagation is needed for stopping transfers and passing other
request-scoped values.
Before this change rclone attempted to set the "updated" field in
uploaded objects to the modification time.
However when this modification time was before 1970, google drive
would return the rather cryptic error:
googleapi: Error 400: Invalid value for UnsignedLong: -42000, invalid
However API docs: https://cloud.google.com/storage/docs/json_api/v1/objects#resource
state the "updated" field is read only and tests confirm that. Even
though the field is read only, it looks like Google parses it.
This change therefore removes the attempt to set the "updated" field
(which was doing nothing anyway) and fixes the problem uploading pre
1970 files.
See #3196 and https://forum.rclone.org/t/invalid-value-for-unsignedlong-file-missing-date-modified/3466
In #2728 and 55b9a4e we decided to allow server side operations
between google drives with different configurations.
This works in some cases (eg between teamdrives) but does not work in
the general case, and this caused breakage in quite a number of
people's workflows.
This change makes the feature conditional on the
--drive-server-side-across-configs flag which defaults to off.
See: https://forum.rclone.org/t/gdrive-to-gdrive-error-404-file-not-found/9621/10Fixes#3119
Under Linux, rclone attempts to preallocate files for efficiency.
Before this change, pre-allocation would fail on ZFS with the error
Failed to pre-allocate: operation not supported
After this change rclone tries a different flag combination for ZFS
then disables pre-allocate if that doesn't work.
Fixes#3066
Before this change rclone would fail with
Failed to set modification time: InvalidObjectState: Operation is not valid for the source object's storage class
when attempting to set the modification time of an object in GLACIER.
After this change rclone will re-upload the object as part of a sync if it needs to change the modification time.
See: https://forum.rclone.org/t/suspected-bug-in-s3-or-compatible-sync-logic-to-glacier/10187
Before this change, rclone would return an error from the listing if
there was an unreadable directory, or if there was a problem stat-ing
a directory entry. This was frustrating because the command
completely aborts at that point when there is work it could do.
After this change rclone lists the directories and reports ERRORs for
unreadable directories or problems stat-ing files, but does return an
error from the listing. It does set the error flag which means the
command will fail (and objects won't be deleted with `rclone sync`).
This brings rclone's behaviour exactly in to line with rsync's
behaviour. It does as much as possible, but doesn't let the errors
pass silently.
Fixes#3179
Before this change we calculated all possible hashes for the file when
the `Hashes` method was called.
After we only calculate the Hash requested.
Almost all uses of `Hash` just need one checksum. This will slow down
`rclone lsjson` with the `--hash` flag. Perhaps lsjson should have a
`--hash-type` flag.
However it will speed up sync/copy/move/check/md5sum/sha1sum etc.
Before it took 12.4 seconds to md5sum a 1GB file, after it takes 3.1
seconds which is the same time the md5sum utility takes.
This fixes rclone returning `listing failed: strconv.ParseInt` errors
when listing files which have a malformed `src_last_modified_millis`.
This is uploaded by the client so care is needed in interpreting it as
it can be malformed.
Fixes#3065
In as many methods as possible we attempt to obey the Retry-After
header where it is provided.
This means that when objects are being requested from OVH cold storage
rclone will sleep the correct amount of time before retrying.
If the sleeps are short it does them immediately, if long then it
returns an ErrorRetryAfter which will cause the outer retry to sleep
before retrying.
Fixes#3041
This implements the Expiry interface so token expiry works properly
This change makes sure that this change from the swift library works
correctly with rclone's custom authenticator.
> Renew the token 60s before the expiry time
>
> The v2 and v3 auth schemes both return the expiry time of the token,
> so instead of waiting for a 401 error, renew the token 60s before this
> time.
>
> This makes transfers more efficient and also works around a bug in
> CEPH which returns 403 instead of 401 when the token expires.
>
> http://tracker.ceph.com/issues/22223
Some WebDAV servers return an empty Available and Used which parses as 0.
This caused About to return the Total as 0 which can confused mounted
file systems.
After this change we ignore the result if Available and Used are both 0.
See: https://forum.rclone.org/t/windows-mounted-webdav-drive-has-no-free-space/8938
Before this change a race condition existed in mkdir
- the directory was attempted to be created
- the parent didn't exist so it failed
- the parent was created
- the directory was created again
The last step failed as the directory was created in a different thread.
This was fixed by checking the error messages of MKCOL for both
directory creations, rather than only the first.
Before this change a range request on a 0 length file would fail
$ rclone cat --head 128 drive:test/emptyfile
ERROR : open file failed: googleapi: Error 416: Request range not satisfiable, requestedRangeNotSatisfiable
To fix this we remove Range: headers on requests for zero length files.
This introduces a new config variable bucket_policy_only. If this is
set then rclone:
- ignores ACLs set on buckets
- ignores ACLs set on objects
- creates buckets with Bucket Policy Only set
Fall back to default application credentials when all other credentials sources fail
This change allows users with default application credentials
configured (notably when running on google compute instances) to
dispense with explicitly configuring google cloud storage credentials
in rclone's own configuration.
This enables MD5 checksum calculation and publication when uploading file above the "Cutoff" limit.
It was explictely ignored in case of multi-block (a.k.a. multipart) uploads to Azure Blob Storage.
Make the pacer package more flexible by extracting the pace calculation
functions into a separate interface. This also allows to move features
that require the fs package like logging and custom errors into the fs
package.
Also add a RetryAfterError sentinel error that can be used to signal a
desired retry time to the Calculator.
Bitrix Site Manager emits `<D:resourcetype><collection/></D:resourcetype>`
missing the namespace on the `collection` tag. This causes the item
to be identified as a file instead of a directory.
To work around this look at the Microsoft extension prop
`iscollection` which seems to be emitted as well.
Before this change any attempt to access a google doc in an rclone
mount would give the error "partial downloads are not supported while
exporting Google Documents" as the mount uses ranged requests to read
data.
This implements ranged requests for a limited number of scenarios,
just enough so that Google docs can be cat-ed from an rclone mount.
When they are cat-ed then they receive their correct size also.
Before this change the union remote was using whether the writable
union could poll for changes to decide whether the union mount could
poll for changes.
The fix causes the union backend to signal it can poll for changes if
**any** of the remotes can poll for changes.
Before this change it was setting the modification times of the things
that the symlinks pointed to.
Note that this is only implemented for unix style OSes. Other OSes
will not attempt to set the modification time of a symlink.
If the upload concurrency is set > 1 then the hash becomes corrupted.
The upload is fine, and can be downloaded fine, however the hash is no
longer the md5sum of the object. It is not known whether this is
rclone's fault or a bug at QingStor.
Before this change if ContentLength was set in the options but 0 then
we would upload using chunked encoding. Fix this to always upload
with a "Content-Length" header even if the size is 0.
Remove workarounds for this from b2 and onedrive backends.
This fixes the issue for the webdav backend described here:
https://forum.rclone.org/t/code-500-errors-with-webdav-nextcloud/8440/
Before this change azureblob would attempt to create already existing
containers. This causes problems with limited permissions keys.
This change checks the container exists before trying to create it in
the same way the s3 backend does. This uses no more requests in the
usual case of the container existing.
See: https://forum.rclone.org/t/copying-individual-files-to-azure-blob-storage/8397
Before this change buckets were created with the same ACL as objects.
After this change, the user can set just --s3-acl to set the ACL of
buckets and objects, or use --s3-bucket-acl as well to have a
different ACL used for bucket creation.
This also logs at INFO level the creation and deletion of buckets.
* drive: don't run teamdrive config if auto confirm set
* onedrive: don't run extra config if auto confirm set
* make Confirm results customisable by config
Fixes#1010
The existing s3 backend passed all integration tests with OSS provided
`force_path_style = false`.
This makes sure that is so and adds documentation and configuration
for OSS.
Thanks to @luolibin for their work on the OSS backend which we ended
up not needing.
Fixes#1641Fixes#1237
The time format provided by webdav servers seems to vary wildly from
that specified in the RFC - rclone already parses times in 5 different
formats!
If an unparseable time is found, then fail softly logging an ERROR
(just once) but returning the epoch.
This will mean that webdav servers with bad time formats will still be
usable by rclone.
Before this fix rclone would just use the authorised bucket regardless
of what bucket you put on the command line.
This uses the new `bucketName` response in the API and checks that the
user is using the correct bucket name to avoid accidents.
Fixes#2839
Before this fix the http backend was returning the wrong error code
when files were not found. This was causing --files-from to error on
missing files instead of skipping them like it should.
The `cleanup` command will delete unfinished large file uploads that
were started more than a day ago (to avoid deleting uploads that are
potentially still in progress).
Fixes#2617
Increasing the --s3-upload-concurrency to 4 (from 2) gives an
additional 45% throughput at the cost of 10MB extra memory per transfer.
After testing the upload perfoc
Before this change rclone would use multipart uploads for any size of
file. However multipart uploads are less efficient for smaller files
and don't have MD5 checksums so it is advantageous to use single part
uploads if possible.
This implements single part uploads for all files smaller than the
upload_cutoff size. Streamed files must be uploaded as multipart
files though.
Before this change we used Remove to remove directories. This works
fine on Unix based systems but not so well on Windows based ones.
Swap to using RemoveDirectory instead.
When a container is deleted, a container with the same name cannot be
created for at least 30 seconds; the container may not be available
for more than 30 seconds if the service is still processing the
request.
We sleep so that we wait at most 60 seconds. This is mostly useful in
the integration tests where containers get deleted and remade
immediately.
Get rid of the api client and use rest/pacer for all API calls
Add Copy, Move, DirMove, PublicLink, About optional interfaces
Improve general error handling
Remove ListR for now due to inconsitent behaviour
fixes#2586, progress on #2740 and #2178
Before this change backend integration tests depended on each other,
so tests could not be retried.
After this change we nest tests to ensure that tests are provided with
the starting state they expect.
Tell the integration test runner that it can retry backend tests also.
This also includes bin/test_independence.go which runs each test
individually for a backend to prove that they are independent.
Wasabi has two location, US East and US West, with different endpoint URLs.
When configuring S3 to use Wasabi, provide the endpoint information for both
locations.
Before this change Rmdir would check the root rather than the
directory specified for being empty and return "directory not empty"
when it shouldn't have done.
When the env_auth option is enabled, the AWS SDK's session constructor
now loads configuration from ~/.aws/config and environment variables,
and credentials per the selected (or default) AWS_PROFILE's settings.
This is accomplished by **NOT** including any Credential provider in the
aws.Config passed to the session constructor: If the Config.Credentials
is non-nil, that will always be used and the user's configuration re
role_arn, credential_source, source_profile, etc... from the shared
config will be completely ignored.
(The conditional creation and configuration of the stscreds Credential
provider is complicated enough that it is not worth re-creating that
logic.)
Before this change the ACL for objects which were server side copied
was left at the default "private" settings. S3 doesn't copy the ACL
from the source when you copy an object, you have to set it afresh
which is what this does.
Until https://github.com/Azure/azure-storage-blob-go/pull/75 is merged
the SDK can't upload a single blob of exactly the chunk size, so
upload files of this size with a multpart upload as a work around.
The previous fix for this 6a773289e7 turned out to cause problems
uploading files with maximum chunk size so needed to be redone.
Fixes#2653
Before this change the Features() method would return a different Fs
to that the Features() method was called on if the remote was
instantiated on a file.
The practical effect of this is that optional features, eg `rclone
about` wouldn't work properly when called on a file, and likely this
has been causing low level problems for users of these backends for
ages.
Ideally there would be a test for this, but it turns out that this is
really hard, so instead of that all the backends have been converted
to not copy the Fs and a big warning comment inserted for future
readers.
Fixes#2182
Use the same function to join the root paths for the wrapping remotes
alias, cache and crypt.
The new function fspath.JoinRootPath is equivalent to path.Join, but if
the first non empty element starts with "//", this is preserved to allow
Windows network path to be used in these remotes.
Implement optional interfaces
- Purge
- PutStream
- Copy
- Move
- DirMove
- DirCacheFlush
- ChangeNotify
- About
Make Hashes() return the intersection of all the hashes supported by the remotes
When moving a directory in drive, most of the time only a notification
for the directory itself is created, not the old or new parents.
This tires to find the old path in the dirCache and the new path with
the dirCache of the new parent, which can result in two notifications
for a moved directory.
Add a new flag to the drive backend to allow document conversions oni upload.
The existing --drive-formats flag has been renamed to --drive-export-formats.
The old flag is still working to be backward compatible.
Make use of the mime package to find matching extensions and mime types.
For simplicity, all extensions are now prefixed with "." to match the
mime package requirements.
Parsed extensions get converted if needed.
Before this change on Windows, files copied locally could become
heavily fragmented (300+ fragments for maybe 100 MB), no matter how
much contiguous free space there was (even if it's over 1TiB). This
can needlessly yet severely adversely affect performance on hard
disks.
This changes uses NtSetInformationFile to pre-allocate the space to
avoid this.
It does nothing on other OSes other than Windows.
Add --drive-v2-download-min-size flag to allow downloading files via the
drive v2 API. If files are greater than this flag, a download link is
generated when needed. The flag is disabled by default.
When combining the remote value and the root path, preserve the absence
or presence of the / at the beginning of the wrapped remote path.
e.g. a remote "cloud:" and root path "dir" becomes "cloud:dir" instead
of "cloud:/dir".
Fixes#2553
When combining the remote value and the root path, preserve the absence
or presence of the / at the beginning of the wrapped remote path.
e.g. a remote "cloud:" and root path "dir" becomes "cloud:dir" instead
of "cloud:/dir".
* Fix error handling in List and NewObject
* Fix Precision in case we have precision > time.Second
* Fix Features - all binary features are possible
* Fix integration tests using new test facilities
Uploads were broken because chunk size was set to zero. This was a
consequence of the backend config re-organization which meant that
chunk size had lost its default.
Sharing some backend config between swift and hubic fixes the problem
and means hubic gains its own --hubic-chunk-size flag.
The initial work on this was done by Oliver Heyme with updates from
Cnly.
Oliver Heyme:
* Changed to Microsoft graph
* Enable writing
* Added more options for adding a OneDrive Remote
* Better error handling
* Send modDate at create upload session and fix list children
Cnly:
* Simple upload API only supports max 4MB files
* Fix supported hash types for different drive types
* Fix unchecked err
Co-authored-by: Oliver Heyme <olihey@googlemail.com>
Co-authored-by: Cnly <minecnly@gmail.com>
Sometimes pcloud will leave a half uploaded file when the transfer
actually failed. This patch deletes the file if it exists.
This problem was spotted by the integration tests.
Before this change we were using the ChildCount in the Folder facet to
determine if a directory was empty or not. However this seems to be
unreliable, or updated asynchronously which meant that `rclone rmdir`
sometimes deleted directories that had files in.
This problem was spotted by the integration tests.
Listing the directory instead of relying on the ChildCount fixes the
problem and the integration tests, without changing the cost (one http
transaction).
This was causing errors which looked like this when copying a file to
the root of a drive:
mkdir \\?: The filename, directory name, or volume label syntax is incorrect.
This was caused by an incorrect path splitting routine which was
removing \ of the end of UNC paths when it shouldn't have been. Fixed
by using the standard library `filepath.Dir` instead.
Before this change if only one of storage_url or auth_token were
supplied then rclone would overwrite both of them when authenticating.
This effectively meant you could supply both of them or none of them
only.
Now rclone still does the authentication to read the missing
storage_url or auth_token then afterwards re-writes the auth_token or
storage_url back to what the user desired.
Fixes#2464
* Add docs for --jottacloud-md5-memory-limit
* Factor out readMD5 function and add tests
* Fix accounting
* Make sure temp file is deleted at the start (not Windows)
In e52ecba295 we forgot to unwrap and
re-wrap the accounting which mean the the accounting was no longer
first in the chain of readers. This lead to accounting inaccuracies
in remotes which wrap and unwrap the reader again.
Some webdav backends (eg rclone serve webdav) leave behind half
written files on error. This causes the integration tests to
fail. Here we remove the file if it exists.
Sometimes it takes many more commit retries than expected to commit a
multipart file, so split this number into its own config variable and
default it to 100 which should always be enough.
this change adds the depth parameter to listAll and readMetaDataForPath.
this allows recursive calls of these methods with a different depth
header.
Sharepoint won't list files if the depth header is != 0. If that is the
case, it will just return a error 404 although the file exists.
Since it is not possible to determine if a path should be a file or a
directory, rclone has to make a request with depth = 1 first. On success
we are sure that the path is a directory and the listing will work.
If this request returns error 404, the path either doesn't exist or it
is a file.
To be sure, we can try again with depth set to 0. If it still fails, the
path really doesn't exist, else we found our file.
Before this change the Part structure had an int for the Offset and
uploading large files would produce this error
json: cannot unmarshal number 2147483648 into Go struct field Part.offset of type int
Changing the field to an int64 fixes the problem.
This unifies the 3 methods of reading config
* command line
* environment variable
* config file
And allows them all to be configured in all places. This is done by
making the []fs.Option in the backend registration be the master
source of what the backend options are.
The backend changes are:
* Use the new configmap.Mapper parameter
* Use configstruct to parse it into an Options struct
* Add all config to []fs.Option including defaults and help
* Remove all uses of pflag
* Remove all uses of config.FileGet
This change includes removing older azureblob storage SDK, and getting
parity to existing code with latest blob storage SDK.
This change is also pre-req for addressing #2091
Go can't redirect PROPFIND requests properly, it changes the method to
GET, so we disable redirects when reading the metadata and assume the
object does not exist if we receive a redirect.
This is to work-around the qnap redirecting requests for directories
without /.
Previously this was reading a stale hash from the object leading to
broken integration tests.
This fixes these integration tests TestSyncDoesntUpdateModtime,
TestSyncAfterChangingFilesSizeOnly, TestSyncAfterChangingContentsOnly,
TestSyncWithUpdateOlder, TestSyncUTFNorm.
Now that https://issuetracker.google.com/issues/64468406 has been
fixed, we can remove part of the workaround which fixed#1675 -
019adc35609c2136
This will make queries marginally more efficient. We still need the
other part of the workaround since the `=` operator is case
insensitive.
Before this commit rclone's handling of symlinks and junction points
under Windows was broken. rclone treated them as files and attempted
to transfer them which gave the error "The handle is invalid".
Ultimately the cause of this was 3e43ff7414 which was a
workaround so files with reparse points (which are a kind of symlink)
would transfer correctly.
The solution implemented is to revert the above commit which will mean
that #614 will break again. However there is now a work-around (which
will be signaled by rclone) to use the -L flag which wasn't available
when the original commit was made.
Fixes#2336
In the drive v3 conversion we forgot the IncludeTeamDriveItems
parameter when calling the changes API. Adding it fixes the changes
polling with team drives.
This implementation hopefully can handle all error requests from the
onedrive for business authentication.
I have only tested it with the "domain in unmanaged state" error.
This was caused by using the sftp.File.Read method which resets the
streaming window after each call. Replacing it with sftp.File.WriteTo
and an io.Pipe fixes the problem bringing the speed to the same as the
sftp binary.
This checks the checksum of the streamed encrypted data against the
checksum of the encrypted object returned from the remote and returns
an error if it is different.
* Fix errcheck and golint warnings
* Remove unused constants and fix comments
* Parse error responses properly
* Fix Open with RangeOption
* Fix Move, Copy and DirMove
* Implement DirCacheFlush
* Check interfaces are correct
* Remove debugs and update overview
* Correct feature flags
* Pare replacement characters down to the minimum set
* Add to the integration tests
* Add Mkdir, Rmdir, Purge, Delete, SetModTime, Copy, Move, DirMove
* Update file size after upload
* Add Open seek
* Set private permission for new folder and uploaded file
* Add docs
* Update List function
* Fix UserSessionInfo struct
* Fix socket leaks
* Don’t close resp.Body in Open method
* Get hash when listing files
The official drive APIs seem to have trouble downloading large
documents sometimes.
This commit adds a --drive-alternate-export flag to use an different,
unofficial set of export URLS which seem to download large files OK.
Google cloud storage doesn't normally need retries, however certain
things (eg bucket creation and removal) are rate limited and do
generate 429 errors.
Before this change the integration tests would regularly blow up with
errors from GCS rate limiting bucket creation and removal.
After this change we low level retry all operations using the same
exponential backoff strategy as used in the google drive backend.