Commit graph

705 commits

Author SHA1 Message Date
Nick Craig-Wood
7e6fac8b1e s3: re-implement multipart upload to fix memory issues
There have been quite a few reports of problems with the multipart
uploader using too much memory and not retrying possible errors.

Before this change the multipart uploader used the s3manager
abstraction in the AWS SDK.  There are numerous bug reports of this
using up too much memory.

This change re-implements a much simplified version of the s3manager
code specialized for rclone's purposes.

This should use much less memory and retry chunks properly.

See: https://forum.rclone.org/t/memory-usage-s3-alike-to-glacier-without-big-directories/13563
See: https://forum.rclone.org/t/copy-from-local-to-s3-has-high-memory-usage/13405
See: https://forum.rclone.org/t/big-file-upload-to-s3-fails/13575
2020-01-03 22:19:28 +00:00
buengese
8a2d1dbe24 jottacloud: add support whitelabel versions 2020-01-02 15:37:33 +01:00
Thomas Kriechbaumer
584e705c0c s3: introduce list_chunk option for bucket listing
The S3 ListObject API returns paginated bucket listings, with
"MaxKeys" items for each GET call.

The default value is 1000 entries, but for buckets with millions of
objects it might make sense to request more elements per request, if
the backend supports it. This commit adds a "list_chunk" option for
the user to specify a lower or higher value.

This commit does not add safe guards around this value - if a user
decides to request a too large list, it might result in connection
timeouts (on the server or client).

In AWS S3, there is a fixed limit of 1000, some other services might
have one too.  In Ceph, this can be configured in RadosGW.
2020-01-02 12:15:01 +00:00
Outvi V
db1c7f9ca8 s3: Add new region Asia Patific (Hong Kong) 2020-01-02 11:10:48 +00:00
Nick Craig-Wood
0ecb8bc2f9 s3: fix url decoding of NextMarker - fixes #3799
Before this patch we were failing to URL decode the NextMarker when
url encoding was used for the listing.

The result of this was duplicated listings entries for directories
with >1000 entries where the NextMarker was a file containing a space.
2019-12-12 13:33:30 +00:00
Ivan Andreev
41ba1bba2b chunker: reduce length of temporary suffix 2019-12-09 16:56:32 +00:00
Nick Craig-Wood
684dbe0e9d local: make source file being updated errors be NoLowLevelRetry errors #3777 2019-12-06 10:54:03 +00:00
Nick Craig-Wood
0d10640aaa s3: add --s3-copy-cutoff for size to switch to multipart copy
Before this change we used the same (relatively low limits) for server
side copy as we did for multipart uploads.  It doesn't make sense to
use the same limits since no data is being downloaded or uploaded for
a server side copy.

This change introduces a new parameter --s3-copy-cutoff to control
when the switch from single to multipart server size copy happens and
defaults it to the maximum 5GB.

This makes server side copies much more efficient.

It also fixes the erroneous error when trying to set the modification
time of a file bigger than 5GB.

See #3778
2019-12-03 10:37:55 +00:00
Nick Craig-Wood
f4746f5064 s3: fix multipart copy - fixes #3778
Before this change multipart copies were giving the error

    Range specified is not valid for source object of size

This was due to an off by one error in the range source introduced in
7b1274e29a "s3: support for multipart copy"
2019-12-03 10:37:55 +00:00
Aleksandar Janković
c05bb63f96 s3: fix DisableChecksum condition 2019-12-02 15:15:59 +00:00
Nick Craig-Wood
d3b0bed091 drive: make sure invalid auth for teamdrives always reports an error
For some reason Google doesn't return an error if you use a service
account with the wrong permissions to list a team drive.  This gives
the user the false impression that the drive is empty.

This change:
- calls teamdrives get on rclone about
- calls teamdrives get on a listing of the root which returned no entries

These will both detect a team drive which has the incorrect auth and
workaround the issue.

Fixes: #3763
See: https://forum.rclone.org/t/rclone-missing-error-code-when-sas-have-no-permission/13086
See: https://forum.rclone.org/t/need-need-bug-verification-rclone-about-doesnt-work-on-teamdrives-empty-output/13105
2019-11-28 10:51:17 +00:00
Nick Craig-Wood
33c80bbb96 jottacloud: add URL to generate Login Token to config wizard 2019-11-28 10:03:48 +00:00
Nick Craig-Wood
705e4694ed webdav: fix case of "Bearer" in Authorization: header to agree with RFC
Before this change rclone used "Authorization: BEARER token".  However
according the the RFC this should be "Bearer"

https://tools.ietf.org/html/rfc6750#section-2.1

This changes it to "Authorization: Bearer token"

Fixes #3751 and interop with Salesforce Webdav server
2019-11-27 12:04:31 +00:00
Nick Craig-Wood
4fbc90d115 webdav: make nextcloud only upload SHA1 checksums
When using nextcloud, before this change we only uploaded one of SHA1
or MD5 checksum in the OC-Checksum header with preference to SHA1 if
both were set.

This makes the MD5 checksums read as empty string which makes syncing
with checksums less useful than they should be as all the MD5
checksums are blank.

This change makes it so that we only upload the SHA1 to nextcloud.

The behaviour of owncloud is unchanged as owncloud uses the checksum
as an upload integrity check only and calculates its own checksums.

See: https://forum.rclone.org/t/how-to-specify-hash-method-to-checksum/13055
2019-11-27 11:58:55 +00:00
buengese
4195bd7880 jottacloud: use new auth method used by official client 2019-11-26 13:49:49 +00:00
Garry McNulty
11f44cff50 drive: add --drive-use-shared-date to use date file was shared instead of modified date - fixes #3624 2019-11-26 12:19:44 +00:00
Nick Craig-Wood
a7d65bd519 sftp: add --sftp-skip-links to skip symlinks and non regular files - fixes #3716
This also corrects the symlink detection logic to only check symlink
files.  Previous to this it was checking all directories too which was
making it do more stat calls than was necessary.
2019-11-24 16:10:53 +00:00
Nick Craig-Wood
1db31d7149 swift: fix parsing of X-Object-Manifest
Before this change we forgot to URL decode the X-Object-Manifest in a dynamic large object.

This problem was introduced by 2fe8285f89 "swift: reserve
segments of dynamic large object when delete objects in container what
was enabled versioning."
2019-11-21 13:25:02 +00:00
Nguyễn Hữu Luân
2fe8285f89 swift: reserve segments of dynamic large object when delete objects in container what was enabled versioning.
add code handle move object when moving the object is contained by the container what was enabled versioning with "X-History-Location".
2019-11-18 16:26:10 +00:00
Ankur Gupta
75a6c49f87 Fix error counter - fixes #3650
For few commands, RClone counts a error multiple times. This was fixed by
creating a new error type which keeps a flag to remember if the error has
already been counted or not. The CountError function now wraps the original
error eith the above new error type and returns it.
2019-11-18 14:13:02 +00:00
Nick Craig-Wood
19229b1215 drive: fix --drive-root-folder-id with team/shared drives
Before this change rclone used the team_drive ID as the root if set
even if the root_folder_id was set too.

This change uses the root_folder_id in preference over the team_drive
which restores the functionality.

This problem was introduced by ba7c2ac443

Fixes #3742
2019-11-16 18:38:21 +00:00
Nick Craig-Wood
479c803fd9 vendor: update all dependencies 2019-11-14 21:51:34 +00:00
Nick Craig-Wood
3dcf1e61cf cache: follow move of upstream library github.com/coreos/bbolt github.com/etcd-io/bbolt 2019-11-14 21:51:34 +00:00
Sebastian Brandt
f158a398f3 sftp: Retry Creation of Connection - fixes #3656
Removes the existing rate limiter because it is implicitly included in
the pacer.
2019-11-14 12:50:01 +00:00
jaKa
acefa5c40d koofr: use rclone HTTP client. 2019-11-14 11:36:44 +00:00
Nick Craig-Wood
1e423d21e1 drive: fix listing of the root directory with drive.files scope
We attempt to find the ID of the root folder by doing a GET on the
folder ID "root". With scope "drive.files" this fails with a 404
message.

After this change if we get the 404 message, we just carry on using
"root" as the root folder ID and we cache that for future lookups.

This means that changenotify messages will not work correctly in the
root folder but otherwise has minor consequences.

See: https://forum.rclone.org/t/fresh-raspberry-pi-build-google-drive-404-error-failed-to-ls-googleapi-error-404-file-not-found/12791
2019-11-11 09:07:34 +00:00
Nick Craig-Wood
9b5308144f s3: Reduce memory usage streaming files by reducing max stream upload size
Before this change rclone would allow the user to stream (eg with
rclone mount, rclone rcat or uploading google photos or docs) 5TB
files.  This meant that rclone allocated 4 * 525 MB buffers per
transfer which is way too much memory by default.

This change makes rclone use the configured chunk size for streamed
uploads.  This is 5MB by default which means that rclone can stream
upload files up to 48GB by default staying below the 10,000 chunks
limit.

This can be increased with --s3-chunk-size if necessary.

If rclone detects that a file is being streamed to s3 it will make a
single NOTICE level log stating the limitation.

This fixes the enormous memory usage.

Fixes #3568
See: https://forum.rclone.org/t/how-much-memory-does-rclone-need/12743
2019-11-09 15:55:19 +00:00
Aleksandar Jankovic
4b20afa94a backend/s3: fix ExpiryWindow value
ExpiryWindow accepts duration but it was set to value 3.
This changes it to 3 * time.Minute since default is 5 min.
2019-11-05 13:55:55 +00:00
Nick Craig-Wood
3f7af64316 config: give config questions default values - fixes #3672 2019-11-05 11:53:44 +00:00
Nick Craig-Wood
7bf056316f local: fix listings of . on Windows - fixes #3676 2019-10-30 16:00:18 +00:00
Nick Craig-Wood
1ce1ea34aa hash: fix hash names for DropboxHash and CRC-32
These were unintentionally renamed as part of 1dc8bcd48c

Fixes #3679
2019-10-30 12:20:10 +00:00
Xiaoxing Ye
191cfb79d1 onedrive: no trailing slash reading metadata...
No trailing slash when reading metadata of an item given item ID.

This should fix #3664.
2019-10-29 13:33:11 +00:00
Nick Craig-Wood
ab895390f4 s3: fix nil pointer reference if no metadata returned for object
Fixes #3651 Fixes #3652
2019-10-25 13:45:47 +01:00
Nick Craig-Wood
a3a5857874 drive: fix change notify polling when using appDataFolder
See: https://forum.rclone.org/t/remote-changes-arent-picked-up/12520
2019-10-24 12:51:01 +01:00
Nick Craig-Wood
0f0079ff71 b2: remove unverified: prefix on sha1 - fixes #3654 2019-10-23 08:41:56 +01:00
dausruddin
7eee2f904a drive: fix typo 2019-10-21 22:28:28 +01:00
Nick Craig-Wood
3ef0c73826 drive: fix ChangeNotify polling for shared drives
Before this fix we neglected to add the shared drive ID to the request
when asking for an initial change notify token and this caused a lot
more results to be returned than was necessary.
2019-10-21 20:51:11 +01:00
Nick Craig-Wood
2bbfcc74e9 drive: fix --drive-shared-with-me from the root with ls and --fast-list
When we changed recursive lists to use --fast-list by default this
broke listing with --drive-shared-with-me from the root.

This turned out to be an unwarranted assumption in the ListR code that
all items would have a parent folder that we had searched for - this
isn't true for shared with me items.

This was fixed when using --drive-shared-with-me to give items that
didn't have any parents a synthetic parent.

Fixes #3639
2019-10-21 12:16:01 +01:00
Nick Craig-Wood
ba7c2ac443 drive: make sure that drive root ID is always canonical
Before this change we used the id "root" as an alias for the root drive ID.

However this causes problems when we receive IDs back from drive which
are not in this format and have been expanded to their canonical ID.

This change looks up the ID "root" and stores it in the
"drive_folder_id" parameter in the config file.

This helps with
- Notifying changes at the root
- Files shared with me at the root

See #3639
2019-10-21 12:16:01 +01:00
Nick Craig-Wood
2d9b8cb981 azureblob: disable logging to the Windows event log
See: https://forum.rclone.org/t/event-log-warning/12430
2019-10-21 11:50:31 +01:00
Carlos Ferreyra
9cb549a227 sftp: include more ciphers with use_insecure_cipher 2019-10-17 14:58:31 +01:00
Nick Craig-Wood
38652d046d drive: disable HTTP/2 by default to work around INTERNAL_ERROR problems
Before this change when rclone was compiled with go1.13 it used HTTP/2
to contact drive by default.

This causes lockups and INTERNAL_ERRORs from the HTTP/2 code.

This is a workaround disabling the HTTP/2 code on an option.

It can be re-enabled with `--drive-disable-http2=false`

See #3631
2019-10-16 11:26:08 +01:00
Cenk Alti
929f275ae5 putio: add ability to resume uploads 2019-10-14 20:01:16 +01:00
Ivan Andreev
77b42aa33a chunker: fix integration tests and hashsum issues 2019-10-13 10:43:46 +01:00
Ivan Andreev
910c80bd02 chunker: option to hash all files 2019-10-13 10:43:46 +01:00
Ivan Andreev
9049bb62ca chunker: prevent chunk corruption, survive meta-like input 2019-10-13 10:43:46 +01:00
Ivan Andreev
7aa2b4191c chunker: reservations for future extensions 2019-10-13 10:43:46 +01:00
Alex Chen
41ed33b08e docs: update onedrive/sharepoint docs on some known issues 2019-10-12 12:08:22 +01:00
Nick Craig-Wood
65a82fe77d dropbox: fix nil pointer exception on restricted files
See: https://forum.rclone.org/t/issues-syncing-dropbox/12233
2019-10-11 16:21:24 +01:00
Jon Fautley
5d33236050 ftp: allow disabling EPSV mode 2019-10-10 21:00:41 +01:00
Nick Craig-Wood
6abaa9e22c fstests: allow skipping of the broken UTF-8 test for the cache backend 2019-10-10 10:36:18 +01:00
Nick Craig-Wood
8c1edf410c dropbox: make disallowed filenames return no retry error - fixes #3569
Before this change we silently skipped uploads to dropbox of
disallowed file names.  However this then caused "corrupted on
transfer" errors because the sizes were wrong.

After this change we return an no retry error which will mean that the
sync fails (as it should - not all files were uploaded) but no
unecessary retries happened.
2019-10-08 19:59:47 +01:00
Henning Surmeier
eff11b44cf webdav: parse and return sharepoint error response
fixes #3176
2019-10-06 20:17:13 +01:00
Nick Craig-Wood
5271fe3b3f yandex: use lib/encoder 2019-10-05 10:22:43 +01:00
庄天翼
7b1274e29a s3: support for multipart copy
Fixes #2375 Fixes #3579
2019-10-04 16:49:06 +01:00
Nick Craig-Wood
d21ddf280c mailru: comment out some debugging statements 2019-10-02 20:10:01 +01:00
Nick Craig-Wood
135717e12b mailru: use lib/encoder 2019-10-02 20:10:01 +01:00
Aleksandar Jankovic
6b55b8b133 s3: add option for multipart failiure behaviour
This is needed for resuming uploads across different sessions.
2019-10-02 16:49:16 +01:00
Nick Craig-Wood
b94b2a3723 mega: fix after lib/encoder change 2019-10-02 12:41:52 +01:00
Nick Craig-Wood
fd51f24906 putio: use lib/encoder
And in the process
- fix a bug with + and & in file name
- fix NewObject returning directories as files
2019-10-02 11:34:08 +01:00
Nick Craig-Wood
4615343b73 premiumizeme: use lib/encoder 2019-10-02 11:34:08 +01:00
Fionera
1dc8bcd48c Remove backend dependency from fs/hash 2019-10-01 16:29:58 +01:00
Nick Craig-Wood
77a520c97c fichier: fix accessing files > 2GB on 32 bit systems - fixes #3581 2019-10-01 16:03:49 +01:00
Nick Craig-Wood
04eb96b50b fichier: fix NewObject after lib/encoder changes
This bug was introduced as part of the lib/encoder changes in
8d8fad724b.  It caused NewObject not to work for a file with
escaped characters in it.
2019-10-01 15:30:51 +01:00
Fabian Möller
b9bd15a8c9 koofr: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:25 +01:00
Nick Craig-Wood
b581f2de26 sharefile: use lib/encoder 2019-09-30 22:00:25 +01:00
Nick Craig-Wood
8d8fad724b ficher: use lib/encoder 2019-09-30 22:00:25 +01:00
Nick Craig-Wood
d122d1d191 qingstor: use lib/encoder 2019-09-30 22:00:25 +01:00
Nick Craig-Wood
35d6ff89bf azureblob: use lib/encoder 2019-09-30 22:00:24 +01:00
Nick Craig-Wood
53bec33027 swift: use lib/encoder 2019-09-30 22:00:24 +01:00
Fabian Möller
3304bb7a56 googlecloudstorage: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:24 +01:00
Nick Craig-Wood
6e053ecbd0 s3: only ask for URL encoded directory listings if we need them on Ceph
This works around a bug in Ceph which doesn't encode CommonPrefixes
when using URL encoded directory listings.

See: https://tracker.ceph.com/issues/41870
2019-09-30 22:00:24 +01:00
Fabian Möller
7689bd7e21 amazonclouddrive: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:24 +01:00
Fabian Möller
33f129fbbc s3: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:24 +01:00
Nick Craig-Wood
a8adce9c59 s3: fix encoding for control characters - Fixes #3345 2019-09-30 22:00:24 +01:00
Nick Craig-Wood
6ae7bd7914 local: encode invalid UTF-8 on macOS 2019-09-30 22:00:24 +01:00
Nick Craig-Wood
32af4cd6f3 ftp: use lib/encoder 2019-09-30 22:00:24 +01:00
Nick Craig-Wood
b90e4a8769 sftp: fix hashes of files with backslashes 2019-09-30 22:00:24 +01:00
Fabian Möller
00b2c02bf4 pcloud: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:24 +01:00
Fabian Möller
33aea5d43f mega: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:24 +01:00
Fabian Möller
13d8b7979d b2: use lib/encoder
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2019-09-30 22:00:24 +01:00
Nick Craig-Wood
57c1284df7 fstests: make integration tests to check all backends can store any file name
This tests the encoder is working properly
2019-09-30 22:00:24 +01:00
Fabian Möller
6c0a749a42 crypt: remove checkValidString
Remove the usage of checkValidString in decryptSegment to allow all
paths which can be created by encryptSegment to be decryptable.
2019-09-30 14:05:49 +01:00
Fabian Möller
4b9fdb8475 opendrive: use lib/encoder 2019-09-30 14:05:49 +01:00
Fabian Möller
dac20093c5 onedrive: use lib/encoder 2019-09-30 14:05:49 +01:00
Fabian Möller
d211347d46 dropbox: use lib/encoder 2019-09-30 14:05:49 +01:00
Fabian Möller
4837bc3546 jottacloud: use lib/encoder 2019-09-30 14:05:49 +01:00
Fabian Möller
69c51325bb drive: use lib/encoder 2019-09-30 14:05:49 +01:00
Fabian Möller
05e4f10436 box: use lib/encoder 2019-09-30 14:05:49 +01:00
Fabian Möller
a98a750fc9 local: use lib/encoder 2019-09-30 14:05:49 +01:00
Nick Craig-Wood
4627ac5709 New backend for Citrix Sharefile - Fixes #1543
Many thanks to Bob Droog for organizing a test account and extensive
testing.
2019-09-30 12:28:33 +01:00
Nick Craig-Wood
fef8b98be2 ftp: fix listing of an empty root returning: error dir not found
Before this change if rclone listed an empty root directory then it
would return an error dir not found.

After this change we assume the root directory exists and don't
attempt to check it which was failing before.

See: https://forum.rclone.org/t/ftp-empty-directory-yields-directory-not-found-error/12069/
2019-09-28 18:01:12 +01:00
Ivan Andreev
ccecfa9cb1 chunker: finish meta-format before release
changes:
- chunker: remove GetTier and SetTier
- remove wdmrcompat metaformat
- remove fastopen strategy
- make hash_type option non-advanced
- adverise hash support when possible
- add metadata field "ver", run strict checks
- describe internal behavior in comments
- improve documentation

note:
wdmrcompat used to write file name in the metadata, so maximum metadata
size was 1K; removing it allows to cap size by 200 bytes now.
2019-09-25 11:03:33 +01:00
Ivan Andreev
59dba1de88 chunker: implementation + required fstest patch
Note: chunker implements many irrelevant methods (UserInfo, Disconnect etc),
but they are required by TestIntegration/FsCheckWrap and cannot be removed.

Dropped API methods: MergeDirs DirCacheFlush PublicLink UserInfo Disconnect OpenWriterAt

Meta formats:
- renamed old simplejson format to wdmrcompat.
- new simplejson format supports hash sums and verification of chunk size/count.

Change list:
- split-chunking overlay for mailru
- add to all
- fix linter errors
- fix integration tests
- support chunks without meta object
- fix package paths
- propagate context
- fix formatting
- implement new required wrapper interfaces
- also test large file uploads
- simplify options
- user friendly name pattern
- set default chunk size 2G
- fix building with golang 1.9
- fix ci/cd on a separate branch
- fix updated object name (SyncUTFNorm failed)
- fix panic in Box overlay
- workaround: Box rename failed if name taken
- enhance comments in unit test
- fix formatting
- embed wrapped remote rather than inherit
- require wrapped remote to support move (or copy)
- implement 3 (keep fstest)
- drop irrelevant file system interfaces
- factor out Object.mainChunk
- refactor TestLargeUpload as InternalTest
- add unit test for chunk name formats
- new improved simplejson meta format
- tricky case in test FsIsFile (fix+ignore)
- remove debugging print
- hide temporary objects from listings
- fix bugs in chunking reader:
  - return EOF immediately when all data is sent
  - handle case when wrapped remote puts by hash (bug detected by TestRcat)
- chunked file hashing (feature)
- server-side copy across configs (feature)
- robust cleanup of temporary chunks in Put
- linear download strategy (no read-ahead, feature)
- fix unexpected EOF in the box multipart uploader
- throw error if destination ignores data
2019-09-24 12:45:12 +01:00
Anthony Rusdi
899f285319 s3: fix signature v2_auth headers
When used with v2_auth = true, PresignRequest doesn't return
signed headers, so remote dest authentication would be fail.
This commit copying back HTTPRequest.Header to headers.

Tested with RiakCS v2.1.0.

Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
2019-09-21 14:38:51 +01:00
David
4788545b05 box: add options to get access token via JWT auth 2019-09-20 17:15:16 +01:00
Ivan Andreev
8fe87c8157 mailru: skip extra http request if data fits in hash 2019-09-17 10:04:51 +01:00
Ivan Andreev
8fb44a822d mailru: fix rare nil pointer panic 2019-09-17 10:04:51 +01:00
Nick Craig-Wood
3cff258577 sftp: fix --sftp-ask-password trying to contact the ssh agent
See: https://forum.rclone.org/t/rclone-command-line/11766
2019-09-16 11:16:27 +01:00
Nick Craig-Wood
25786cafd3 s3: fix SetModTime on GLACIER/ARCHIVE objects and implement set/get tier
- Read the storage class for each object
- Implement SetTier/GetTier
- Check the storage class on the **object** before using SetModTime

This updates the fix in 1a2fb52 so that SetModTime works when you are
using objects which have been migrated to GLACIER but you aren't using
GLACIER as a storage class.

Fixes #3522
2019-09-14 09:18:55 +01:00