Tests have been randomly failing with messages like
listen tcp 127.0.0.1:51778: bind: address already in use
Rework all the test servers so they choose a random free port on
startup and use that for the tests to avoid.
In c5ac96e9e7 we made --files-from only read the objects specified and
don't scan directories.
This caused problems with Google drive (very very slow) and B2
(excessive API consumption) so it was decided to make the old
behaviour (traversing the directories) the default with --files-from
and use the existing --no-traverse flag (which has exactly the right
semantics) to enable the new non scanning behaviour.
See: https://forum.rclone.org/t/using-files-from-with-drive-hammers-the-api/8726Fixes#3102Fixes#3095
Before the fix we were only de-duping the ListR batches.
Afterwards we dedupe everything.
This will have the consequence that rclone uses more memory as it will
build a map of all the directory names, not just the names in a given
directory.
This dramatically increases the speed (7x in my tests) of the de-dupe
as google drive supports ListR directly and dedupe did not work with
`--fast-list`.
Fixes#2902
This will increase speed for backends which support ListR and will not
have the memory overhead of using --fast-list.
It also means that errors are queued until the end so as much of the
remote will be listed as possible before returning an error.
Commands affected are:
- lsf
- ls
- lsl
- lsjson
- lsd
- md5sum/sha1sum/hashsum
- size
- delete
- cat
- settier
It otherwise has the nearly the same interface as walk.Walk which it
will fall back to if it can't use ListR.
Using walk.ListR will speed up file system operations by default and
use much less memory and start immediately compared to if --fast-list
had been supplied.
Make the pacer package more flexible by extracting the pace calculation
functions into a separate interface. This also allows to move features
that require the fs package like logging and custom errors into the fs
package.
Also add a RetryAfterError sentinel error that can be used to signal a
desired retry time to the Calculator.
This brings it up to par with lsjson.
This commit also reworks the framework to use ListJSON internally
which removes duplicated code and makes testing easier.
This puts a shim on the reader opened by Copy so that if an error is
returned, the reader is re-opened at the correct seek point.
This should make downloading very large files more reliable.
This replaces the `sync.Pool` allocator with lib/pool. This
implements a pool of buffers of up to 64MB which can be re-used but is
flushed every 5 seconds.
If `--use-mmap` is set then rclone will use mmap for memory
allocations which is much better at returning memory to the OS.
* drive: don't run teamdrive config if auto confirm set
* onedrive: don't run extra config if auto confirm set
* make Confirm results customisable by config
Fixes#1010
This fixes several things wrong with the layout of the stats.
Transfers which haven't started are printed in the same format as
those which have so the stats with `--progress` don't show horrible
artifacts.
Checkers and transfers now get a ": checkers" and ": transfers" label
on the end of the stats line. Transfers will have the transfer stats
when the transfer has started instead of this.
There was a bug in the routine which shortened the file names (it
always produces strings 1 too long). This is now fixed with a test.
The formatting string was wrong with a fixed width of 45 - this is now
replaces with the value of `--stats-file-name-length`.
This also meant that there were unecessary leading spaces in the file
names. So the default `--stats-file-name-length` was raised to 45
from 40.
Cookies are handled by cookiejar in memory with fshttp module through
the entire session.
One useful scenario is, with HTTP storage system where index server
adds authentication cookie while redirecting to CDN for actual files.
Also, it can be helpful to reuse fshttp in other storage systems
requiring cookie.
This means that rclone will pick up tokens from concurrently running
rclones. This helps for Box which only allows each refresh token to
be used once.
Without this fix, rclone caches the refresh token at the start of the
run, then when the token expires the refresh token may have been used
already by a concurrently running rclone.
This also will retry the oauth up to 5 times at 1 second intervals.
See: https://forum.rclone.org/t/box-token-refresh-timing/8175
Before this change rclone would read the list of files from the
files-from parameter and check they existed one at a time. This could
take a very long time for lots of files.
After this change, rclone will check up to --checkers in parallel.
The --no-traverse flag was not implemented when the new sync routines
(using the march package) was implemented.
This re-implements --no-traverse in march by trying to find a match
for each object with NewObject rather than from a directory listing.
Before this change TestPurge would remove a container and subsequent
tests would fail because the container was still being deleted so
couldn't be created.
This was fixed by introducing an fstest.NewRunIndividual() test runner
for TestPurge which causes the test to be run on a new container.
If `--rc-user` or `--rc-pass` is set then the URL that is opened with
`--rc-files` will have the authorization in the URL in the
`http://user:pass@localhost/` style.
Before this change, Purge on the fallback path would try to delete
directories starting from the root rather than the dir passed in.
Rmdirs would also attempt to delete the root.
Before this change using --files-from would scan all the directories
that the files could possibly be in causing rclone to do more work
that was necessary.
After this change, rclone constructs an in memory tree using the
--fast-list mechanism but from all of the files in the --files-from
list and without scanning any directories.
Any objects that are not found in the --files-from list are ignored
silently.
This mechanism is used for sync/copy/move (march) and all of the
listing commands ls/lsf/md5sum/etc (walk).
Use the same function to join the root paths for the wrapping remotes
alias, cache and crypt.
The new function fspath.JoinRootPath is equivalent to path.Join, but if
the first non empty element starts with "//", this is preserved to allow
Windows network path to be used in these remotes.
Before this change remotes without server side Move (eg swift, s3,
gcs) would not be able to rename files.
After it means nearly all remotes will be able to rename files on
rclone mount with the notable exceptions of b2 and yandex.
This changes checks to see if the remote can do Move or Copy then
calls `operations.Move` to do the actual move. This will do a server
side Move or Copy but won't download and re-upload the file.
It also checks to see if the destination exists first which avoids
conflicts or duplicates.
Fixes#1965Fixes#2569
Before this change the moving average for the individual file stats
would start at 0 and only converge to the correct value over 15-30
seconds.
This change starts the weighting period as 1 and moves it up once per
sample which gets the average to a better value instantly.
Previous to this change package used for this
github.com/VividCortex/ewma took a 0 average to mean reset the
statistics. This happens quite often when transferring files though a
buffer.
Replace that implementation with a simple home grown one (with about
the same constant), without that feature.
This change allows remotes to be created on the fly without a config
file by using the remote type prefixed with a : as the remote name, Eg
:s3: to make an s3 remote.
This assumes the user is supplying the backend config via command line
flags or environment variables.
The race detector currently detects a race with len(chan) against
close(chan).
See: https://github.com/golang/go/issues/27070
Skip the tests which trip this bug under the race detector.
--max-backlog controls the queue length.
Add statistics for the check/upload/rename queues.
This means that checking can complete before the uploads which will
give rclone the ability to show exactly what is outstanding.
This unifies the 3 methods of reading config
* command line
* environment variable
* config file
And allows them all to be configured in all places. This is done by
making the []fs.Option in the backend registration be the master
source of what the backend options are.
The backend changes are:
* Use the new configmap.Mapper parameter
* Use configstruct to parse it into an Options struct
* Add all config to []fs.Option including defaults and help
* Remove all uses of pflag
* Remove all uses of config.FileGet
Before this change if the rclone was running in an environment which
couldn't find the HOME directory, it would print a warning about
supplying a --config flag even if the user had done so.
Before this copyto would parse windows paths incorrectly.
This change moves the parsing code into fspath and makes sure
fspath.Split calls fspath.Parse which does the parsing correctly for
This also renames fspath.RemoteParse to fspath.Parse for consistency
--one-way argument will check that all files on source matches the files on detination,
but not the other way. For example files present on destination but not on source will not
trigger an error.
Fixes: #1526
A deadlock could occur since we have now put a mutex on GetBytes from
StatsInfo.String (s.mu) - progress (acc.statmu) and read (acc.statmu)
- GetBytes (s.mu).
Fix this by giving stringSet its own locking and excluding the call
which caused the deadlock from the mutex in StatsInfo.String.
- make Close permanent and return errors afterwards
- use RangeSeek from the wrapped reader if present
- add a limit to chunk growth
- correct RangeSeek interface behavior
- add tests
Unfortunately this commit attempts to create every directory rather
than just the empty ones, so will need re-working.
Removing this feature for the 1.41 release
This reverts commit 0daced29db.
Somehow in the code reorganisation of
11da2a6c9b the check for --min-age and
--max-age got switched around. This commit fixes that and means you
can use --min-age and --max-age together.
* Implement about for:
* local, crypt, cache, drive, swift, hubic, onedrive, pcloud, dropbox
* Implement `--json` and `---full` flag for `rclone about`
* change About interface to return a Usage structure
* Remove operations.About as it is too thin an interface
* Implement Integration test
Relates to #1138 and #1564
This introduces a method of making provider specific configuration
within a remote. This is useful particularly in s3.
This commit does the basic configuration in S3 for IBM COS.
This problem was introduced with eca99b33c0. It seems Box is the only
remote which converts time zones, so if you give it a GMT time zone,
it returns a PST time zone which represents the same instant.
This implements a remote control protocol activated with the --rc flag
and a new command `rclone rc` to use that interface.
Still to do
* docs - need finishing
* tests
This is a problem when syncing a file which just needed its modtime
set with dropbox which can't set the mod time of a file without
re-uploading it.
Before this change we would delete the file, then the server side move
would fail moving the file to the backup-dir because it no longer
existed.
After this change the destination file is moved to the backup-dir
instead of being deleted and the new file is uploaded.
Fixes#2134
This removes the old system of part accounting and replaces it with a
system of popping off the accounting reader and wrapping up new ones
as necessary.
This makes it much easier to carry the context down the chain of
wrapped readers and get the limiting as near as possible to the
output. This makes the accounting more accurate and the bandwidth
limiting smoother.
Fixes#2029 and Fixes#1443
A Range request can never request 0 bytes however this change was made
to make a clearer signal that the limit means read to the end.
Add test and more documentation and fixup uses
The purpose of this is to make it easier to maintain and eventually to
allow the rclone backends to be re-used in other projects without
having to use the rclone configuration system.
The new code layout is documented in CONTRIBUTING.
Before this if the client_id/client_secret was edited it would
disappear when asking for the new token.
This means the post config is done after the user has confirmed the
config is OK which can't be helped.
RepeatableReaderSized has a pre-allocated buffer which should help
with memory usage - before it grew the buffer. Since we know the size
of the chunks, pre-allocating it should be much more efficient.
RepeatableReaderBuffer uses the buffer passed in.
RepeatableLimit* are convenience funcitions for wrapping a reader in
an io.LimitReader and then a RepeatableReader with the same buffer
size.
Now --dump-flag is written as --dump flag. This is a comma separated list which can contain
* headers - HTTP headers as before
* bodies - HTTP bodies as before
* requests - HTTP request bodies
* responses - HTTP response bodies
* auth - HTTP auth
* filters - Filter rexeps
Leave --dump-headers and --dump-bodies for the time being but remove
the other --dump-* flags as they aren't used very often.
This was leaking goroutines in the short file case beause it wasn't
calling Close() on the Account object. This became apparent when
testing with mount.
Previously config sub commands were manually parsed rather than using
cobra.
Make config command have the following sub commands:
* create Create a new remote with name, type and options.
* delete Delete an existing remote <name>.
* dump Dump the config file as JSON.
* edit Enter an interactive configuration session.
* file Show path of configuration file in use.
* providers List in JSON format all the providers and options.
* show Print (decrypted) config file, or the config for a single remote.
* update Update options in an existing remote.
The following changes were made to existing commands
* listproviders was renamed to providers
* listoptions was removed in favour of providing the output in providers
* jsonconfig was renamed to create
* an optional parameter was added to the show command
This works by using a transform function to transform file names when
doing a compare when matching file names in a directory. rclone now
UTF-8 normalizes the file names and does a case insensitive compare if
the destination remote is case insensitive.
This deprecates the --local-no-unicode-normalization flag.
Fixes#1477
During the sync we collect a list of directories which should be empty
and attempt to rmdir them at the end of the sync. If the directories
are not empty then the rmdir will fail, logging a message but not
erroring the sync.
* Fixup bitrot (rclone and Azure library)
* Implement Copy
* Add modtime to metadata under mtime key as RFC3339Nano
* Make multipart upload work
* Make it pass the integration tests
* Fix uploading of zero length blobs
* Rename to azureblob as it seems likely we will do azurefile
* Add docs
Add new package qingstor to support QingStor API.
Add new unit test for its and tested through; But I commented
on some tests case because of some of the features of QingStor.
Add new docs for it.
This is useful if there are duplicates. Assuming the remote delivers
the entries in a consistent order, this will give the best user
experience in syncing as it will consistently use the first entry for
the sync comparison.
This simplifies the implementation of remotes. The only required
interface is now `List` which is a simple one level directory list.
Optionally remotes may implement `ListR` if they have an efficient way
of doing a recursive list.
The ListR interface will be implemented by remotes that can do a
recursive directory listing more efficiently than just recursing
through the directories. These include the bucket based remotes.
This is a fix left over from the v2 conversion. Dropbox ignores the
client modification on an incoming file if it was identical to the
existing file. This change deletes the existing file first before
re-uploading the new one.
If set in the config file, these override the ones configured into the
remote. This enables alternative oauth servers to be used for all
oauth remotes. This can only be altered by editing the config file
for the moment.
* fully write new config file before moving to target location (fixes#1287)
* do not fail if there is no previous config; print temporary config path on failure
* Add options to Put, PutUnchecked and Update for all Fses
* Use these to create HashOption
* Implement this in local
* Pass the option in fs.Copy
This has the effect that we only calculate hashes we need to in the
local Fs which speeds up transfers significantly.
* add support to hashing module
* add dbhashsum to list the hashes
* add support to dropbox module
This means objects up and downloaded to/from Dropbox will have their
hashes checked.
Note after this change local objects are calculating MD5, SHA1 and
DBHASH which is excessive and needs to be fixed.
This makes rclone with encrypted config better suited for use in
pipelines. E.g.:
$ rclone lsl mydrive:Some/Dir | sort -k 4
If the password prompt ("Enter configuration password") is printed to
stdout, it will be swallowed by sort. By printing it to stderr, you
still see the prompt, without sacrificing compatibility with the unix
pipeline.
What was happening is that when Move was implemented as Copy + Delete,
MoveFile was seeing the files didn't need transferring (because they
were identical) then deleted the source.
The fix uses Move instead and patches onedrive to disallow a case
folded identical copy (which errors with 500 error)
* -vv or --log-level DEBUG
* -v or --log-level INFO
* --log-level NOTICE (default)
* -q --log-level ERROR
Replace Config.Verbose and Config.Quiet with Config.LogLevel
Fixes#739Fixes#1108Fixes#1000
Multiple directories (up to --checkers worth) are scanned at once.
This uses much less memory than the previous scheme - only the amount
of memory needed to hold an entire directory listing of objects.
For directory based remotes the speed is unchanged.
For bucket based remotes, instead of doing one API call to list the
whole bucket, it does multiple calls, one for each pseudo directory.
However these are done in parallel so in practice this seems to speed
up directory listings.
This replaces the existing sync method as it performs faster and uses
less memory.
The old sync method is available with the temporary --old-sync-method
flag.
Fixes#517Fixes#439Fixes#236Fixes#1067
Optional interfaces are becoming more important in rclone,
--track-renames and --backup-dir both rely on them.
Up to this point rclone has used interface upgrades to define optional
behaviour on Fs objects. However when one Fs object wraps another it
is very difficult for this scheme to work accurately. rclone has
relied on specific error messages being returned when the interface
isn't supported - this is unsatisfactory because it means you have to
call the interface to see whether it is supported.
This change enables accurate detection of optional interfaces by use
of a Features struct as returned by an obligatory Fs.Features()
method. The Features struct contains flags and function pointers
which can be tested against nil to see whether they can be used.
As a result crypt and hubic can accurately reflect the capabilities of
the underlying Fs they are wrapping.
This changes `ListFn`'s implementation so that if it encounters a not
found error, instead of sending a fatal error to log, it coordinates the
return of the error between checker goroutines and sends it back to the
caller.
The main impetus here is that it allows an external program compiling
against rclone as a package to handle a not found, where it currently it
cannot.
This does change error output on a not found a little bit, we go from
this:
2017/01/09 21:14:03 directory not found
To this:
2017/01/09 21:13:44 Failed to ls: directory not found
- Only start the token ticker when the timetable entry has more than one
entry.
- This fixes the "Scheduled bandwidth change" log message when no
bwlimit is specified.
- Fixes#987
These are set in the form RCLONE_CONFIG_remote_option where remote is
the uppercased remote name and option is the uppercased config file
option name. Note that RCLONE_CONFIG_remote_TYPE must be set if
defining a new remote.
Fixes#616
- Change the --bwlimit command line parameter to accept both a limit (as
before) or a full timetable (formatted as "hh:mm,limit
hh:mm,limit...")
- The timetable is checked once a minute by a ticker function. A new
tokenBucket is created every time a bandwidth change is necessary.
- This change is compatible with the SIGUSR2 change to toggle bandwidth
limits.
This resolves#221.
This commits adds support for tracking of file renames if `track-renames` flag is set,
and it then performs server-side renames for remotes that support it, i.e.
remotes that implement either the `Mover` or the `Copier` interface.
Fixes#888