Commit graph

281 commits

Author SHA1 Message Date
Nick Craig-Wood
16d1da2c1e vfs: remove item.metaDirty as it was confusing and not used
See discussion in #5277
2021-04-28 09:33:22 +01:00
Nick Craig-Wood
00a0ee1899 vfs: fix modtime changing when reading file into cache - fixes #5277
Before this change but after:

aea8776a43 vfs: fix modtimes not updating when writing via cache #4763

When a file was opened read-only the modtime was read from the cached
file. However this modtime wasn't correct leading to an incorrect
result.

This change fixes the definition of `item.IsDirty` to be true only
when the data is dirty. This fixes the problem as a read only file
isn't considered dirty.
2021-04-28 09:33:22 +01:00
albertony
2925e1384c Use binary prefixes for size and rate units
Includes adding support for additional size input suffix Mi and MiB, treated equivalent to M.
Extends binary suffix output with letter i, e.g. Ki and Mi.
Centralizes creation of bit/byte unit strings.
2021-04-27 02:25:52 +03:00
Leo Luan
8f23cae1c0 vfs: Add cache reset for --vfs-cache-max-size handling at cache poll interval
The vfs-cache-max-size parameter is probably confusing to many users.
The cache cleaner checks cache size periodically at the --vfs-cache-poll-interval
(default 60 seconds) interval and remove cache items in the following order.

(1) cache items that are not in use and with age > vfs-cache-max-age
(2) if the cache space used at this time still is larger than
vfs-cache-max-size, the cleaner continues to remove cache items that are
not in use.

The cache cleaning process does not remove cache items that are currently in use.
If the total space consumed by in-use cache items exceeds vfs-cache-max-size, the
periodical cache cleaner thread does not do anything further and leaves the in-use
cache items alone with a total space larger than vfs-cache-max-size.

A cache reset feature was introduced in 1.53 which resets in-use (but not dirty,
i.e., not being updated) cache items when additional cache data incurs an ENOSPC
error.  But this code was not activated in the periodical cache cleaning thread.

This patch adds the cache reset step in the cache cleaner thread during cache
poll to reset cache items until the total size of the remaining cache items is
below vfs-cache-max-size.
2021-04-26 17:55:52 +01:00
Nick Craig-Wood
2a40f00077 vfs: fix a code path which allows dirty data to be removed causing data loss
Before this change the VFS layer could remove a locally cached file
even if it had data which needed to be written back, thus causing data loss.

See: https://forum.rclone.org/t/rclone-1-55-doesnt-save-file-changes-if-the-file-has-been-reopened-during-upload-google-drive-mount/23646
2021-04-20 16:36:38 +01:00
Nick Craig-Wood
20e15e52a9 vfs: fix Create causing windows explorer to truncate files on CTRL-C CTRL-V
Before this fix, doing CTRL-C and CTRL-V on a file in Windows explorer
caused the **source** and the the destination to be truncated to 0.

This is because Windows opens the source file with Create with flags
`O_RDWR|O_CREATE|O_EXCL` but doesn't write to it - it only reads from
it. Rclone was taking the call to Create as a signal to always make a
new file, but this is incorrect.

This fix reads an existing file from the directory if it exists when
Create is called rather than always creating a new one. This fixes the
problem.

Fixes #5181
2021-03-31 14:48:02 +01:00
Nick Craig-Wood
b47d6001a9 vfs: fix directory renaming by renaming dirs cached in memory
Before this change, if a directory was renamed and it or any children
had virtual entries in it they weren't flushed.

The consequence of this was that the directory path got out sync with
the actual position of the directory in the tree, leading to listings
of the old directory rather than the new one.

The fix renames any directories remaining after the ForgetAll to have
the correct path which fixes the problem.

See: https://forum.rclone.org/t/after-a-directory-renmane-using-mv-files-are-not-visible-any-longer/22797
2021-03-22 09:07:01 +00:00
Nick Craig-Wood
a4c4ddf052 vfs: rename files in cache and cancel uploads on directory rename
Before this change rclone did not cancel an uploads or rename the
cached files in the directory cache when a directory was renamed.

This caused issues with uploads arriving in the wrong place on bucket
based file systems.

See: https://forum.rclone.org/t/after-a-directory-renmane-using-mv-files-are-not-visible-any-longer/22797
2021-03-22 09:07:01 +00:00
Nick Craig-Wood
c72d2c67ed vfs: add debug dump function to dump the state of the VFS cache 2021-03-22 09:06:44 +00:00
Nick Craig-Wood
5e95877840 vfs: fix modtime set if --vfs-cache-mode writes/full and no write
When using --vfs-cache-mode writes or full if a file was opened for
write intent, the modtime was set and the file was closed without
being modified the modtime would never be written back to storage.

The sequence of events

- app opens file with write intent
- app does set modtime
- rclone sets the modtime on the cache file, but not the remote file
  because it is open for write and can't be set yet
- app closes the file without changing it
- rclone doesn't upload the file because the file wasn't changed so
  the modtime doesn't get updated

This fixes the problem by making sure any unapplied modtime changes
are applied even if the file is not modified when being closed.

Fixes #4795
2021-03-16 13:36:48 +00:00
Nick Craig-Wood
8b491f7f3d vfs: fix modtimes changing by fractional seconds after upload #4763
Before this change, rclone would return the modification times of the
cache file or the pending modtime which would be more accurate than
the modtime that the backend was capable of.

This meant that the modtime would be change slightly when the item was
actually uploaded.

For example modification times on Google Drive would be rounded to the
nearest millisecond.

This fixes the VFS layer to always return modtimes directly from an
object stored on the remote, or rounded to the precision that the
remote is capable of.
2021-03-16 13:31:47 +00:00
Nick Craig-Wood
aea8776a43 vfs: fix modtimes not updating when writing via cache - fixes #4763
This reads modtime from a dirty cache item if it exists. This mirrors
the way reading the size works.

This fixes the mod time not updating when the file is written, only
when the writeback completes.

See: https://forum.rclone.org/t/rclone-mount-and-changing-timestamps-after-writes/22629
2021-03-16 13:31:47 +00:00
Nick Craig-Wood
c387eb8c09 vfs: don't set modification time if it was already correct
Before this change, rclone would set the modification time of an
object after it had been uploaded. However with --vfs-cache-mode
writes and above, the modification time of the object is already
correct as the cache backing file gets set with the correct
modification time before upload.

Setting the modification time causes another version to be created on
backends such as S3 so it should be avoided if possible.

This change checks to see if the modification time needs changing and
only sets it if necessary.

See: https://forum.rclone.org/t/produce-2-versions-when-overwrite-an-object-in-min-io/19634
2021-03-16 13:31:47 +00:00
Nick Craig-Wood
687a3b1832 vfs: fix data race discovered by the race detector
This fixes a place where we read from item.o without the item.mu held.
2021-03-15 19:22:07 +00:00
tYYGH
c0cf54067a
vfs: --vfs-used-is-size to report used space using recursive scan (#4043)
Some backends, most notably S3, do not report the amount of bytes used.
This patch introduces a new flag that allows instead of relying on the
backend, use recursive scan similar to `rclone size` to compute the total
used space. However, this is ineffective and should be used as a last resort.

Co-authored-by: Yves G <theYinYeti@yalis.fr>
2021-02-17 23:36:13 +03:00
Nick Craig-Wood
9d37c208b7 vfs: document simultaneous usage with the same cache shouldn't be used
Fixes #2227
2021-02-16 17:15:05 +00:00
Nick Craig-Wood
ae3963e4b4 fs: Add string alternatives for setting options over the rc
Before this change options were read and set in native format. This
means for example nanoseconds for durations or an integer for
enumerated types, which isn't very convenient for humans.

This change enables these types to be set with a string with the
syntax as used in the command line instead, so `"10s"` rather than
`10000000000` or `"DEBUG"` rather than `8` for log level.
2021-02-07 14:56:41 +00:00
albertony
459cc70a50 vfs: fix invalid cache path on windows when using :backend: as remote
The initial ':' is included in the ad-hoc remote name, but is illegal character
in Windows path. Replacing it with '^', which is legal in filesystems but illegal
in regular remote names, so name conflict is avoided.

Fixes #4544
2021-01-30 16:18:15 +00:00
Martin Michlmayr
cd075f1703 docs: fix markup of arguments #4276
Command line arguments have to be marked as code.
2021-01-25 22:40:46 +03:00
Nick Craig-Wood
b80d498304 vfs: fix file leaks with --vfs-cache-mode full and --buffer-size 0
Before this change using --vfs-cache-mode full and --buffer-size 0
together caused the vfs downloader to open more and more downloaders.

This is fixed by introducing a minimum size of 1M for the window to
look for an existing downloader.

Fixes #4892
2021-01-21 18:35:04 +00:00
albertony
9db51117dc vfs: docs: add note on os specific option 2020-12-28 13:59:34 +00:00
albertony
a9c9467210 vfs: docs: add note about use of --transfers in section about performance 2020-12-28 13:59:34 +00:00
albertony
e92cb9e8f8 mount: more user friendly mounting as network drive on windows
Add --network-mode option to activate mounting as network drive without having to set volume prefix.
Add support for automatic drive letter assignment (not specific to network drive mounting).
Allow full network share unc path in --volname, which will also implicitely activate network drive mounting.
Allow full network share unc path as mountpoint, which will also implicitely activate network drive mounting, and the specified path will be used as volume prefix and the remote will be mounted on an automatically assigned drive letter instead.
2020-12-28 13:59:34 +00:00
Nick Craig-Wood
4f8ee736b1 vfs: make cache dir absolute before using it to fix path too long errors
If --cache-dir is passed in as a relative path, then rclone will not
be able to turn it into a UNC path under Windows, which means that
file names longer than 260 chars will fail when stored in the cache.

This patch makes the --cache-dir path absolute before using it.

See: https://forum.rclone.org/t/handling-of-long-paths-on-windows-260-characters/20913
2020-12-11 10:00:51 +00:00
Nick Craig-Wood
2e21c58e6a fs: deglobalise the config #4685
This is done by making fs.Config private and attaching it to the
context instead.

The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
2020-11-26 16:40:12 +00:00
Nick Craig-Wood
2347762b0d vfs: fix "file already exists" error for stale cache files
Before this change if a file was uploaded through a mount, then
deleted externally, trying to upload that file again could give EEXIST
"file already exists".

This was because the file already existing in the cache was confusing
rclone into thinking it already had the file.

The fix is to check that if rclone has a stale cache file then to
ignore it in this situation.

See: https://forum.rclone.org/t/rclone-cant-reuse-filenames/20400
2020-11-13 10:32:21 +00:00
Nick Craig-Wood
f980f230c5 vfs: fix virtual entries causing deleted files to still appear
Before this change, if a file was created on a remote but deleted
externally from that remote then there was potential for the delete to
never be noticed.

The sequence of events was:

- Create file on VFS - creates virtual directory entry
- File deleted externally to remote before the directory refreshed
- Now the file has a virtual add but is not in the listings so will never disappear

This patch fixes it by removing all virtual directory entries except
the following when the directory is re-read.

- On remotes which can't have empty directories: virtual directory
  adds are not flushed. These will remain virtual as long as the
  directory is empty.

- For virtual file add: files that are in the process of being
  uploaded are not flushed

This patch also adds the distinction between virtually added files and
directories.

It also refactors the virtual directory logic to make it easier to follow.

Fixes #4446
2020-11-10 16:47:25 +00:00
Nick Craig-Wood
1fb6ad700f accounting: add context.Context #3257 #4685 2020-11-09 18:05:54 +00:00
Nick Craig-Wood
d69b96a94c test: Add context to mockfs.NewFs #3257 #4685 2020-11-09 18:05:54 +00:00
Nick Craig-Wood
d846210978 fs: Add context to NewFs #3257 #4685
This adds a context.Context parameter to NewFs and related calls.

This is necessary as part of reading config from the context -
backends need to be able to read the global config.
2020-11-09 18:05:54 +00:00
Nick Craig-Wood
43e0929339 vfs: fix vfs/refresh calls with fs= parameter
Before this change rclone gave an error when the fs parameter was
provided.

This change removes the fs parameter from the parameters once it has
been read which avoids the error.

See: https://forum.rclone.org/t/precaching-with-vfs-refresh-fails-with-an-error-when-having-multiple-cloud-drives/20267
2020-11-07 14:26:33 +00:00
Josh Soref
3e1cb8302a docs: spelling: etc.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
e4a87f772f docs: spelling: e.g.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
bbe7eb35f1 docs: spelling: server-side
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref
d0888edc0a Spelling fixes
Fix spelling of: above, already, anonymous, associated,
authentication, bandwidth, because, between, blocks, calculate,
candidates, cautious, changelog, cleaner, clipboard, command,
completely, concurrently, considered, constructs, corrupt, current,
daemon, dependencies, deprecated, directory, dispatcher, download,
eligible, ellipsis, encrypter, endpoint, entrieslist, essentially,
existing writers, existing, expires, filesystem, flushing, frequently,
hierarchy, however, implementation, implements, inaccurate,
individually, insensitive, longer, maximum, metadata, modified,
multipart, namedirfirst, nextcloud, obscured, opened, optional,
owncloud, pacific, passphrase, password, permanently, persimmon,
positive, potato, protocol, quota, receiving, recommends, referring,
requires, revisited, satisfied, satisfies, satisfy, semver,
serialized, session, storage, strategies, stringlist, successful,
supported, surprise, temporarily, temporary, transactions, unneeded,
update, uploads, wrapped

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-14 15:21:31 +01:00
Nick Craig-Wood
a86cedbc24 vfs: Fix --no-modtime to not attempt to set modtimes (as documented)
See: https://forum.rclone.org/t/rclone-mount-with-azure-blob-archive-tier/19414
2020-10-09 17:01:16 +01:00
Leo Luan
c5c56cda02 vfs: Add a missed update of used cache space
The missed update can cause incorrect before-cleaning cache stats
and a pre-mature condition broadcast in purgeOld before the cache
space use is reduced below the quota.
2020-10-06 16:35:23 +01:00
Leo Luan
2295123cad vfs: Add exponential backoff during ENOSPC retries
Add an exponentially increasing delay during retries up ENOSPC error
to avoid exhausting the 10 retries too soon when the cache space
recovery from item resets is not available from the file system yet
or consumed by other large cache writes.
2020-10-06 16:35:23 +01:00
Leo Luan
ff0280c0cb vfs: Fix missed concurrency control between some item operations and reset
Item reset is invoked by cache cleaner for synchronous recovery
from ENOSPC errors. The reset operation removes the cache file and
closes/reopens the downloaders.  Although most parts of reset and
other item operations are done with the item mutex held, the mutex
is released during fd.WriteAt and downloaders calls. We used preAccess
and postAccess calls to serialize Reset, ReadAt, and Open, but missed
some other item operations. The patch adds preAccess/postAccess
calls in Sync, Truncate, Close, WriteAt, and rename.
2020-10-06 16:35:23 +01:00
Leo Luan
64d736a57b vfs: Fix a race condition in retryFailedResets
A failed item reset is saved in the errItems for retryFailedResets
to process.  If the item gets closed before the retry, the item may
have been removed from the c.item array. Previous code did not
account for this condition. This patch adds the check for the
exitence of the retry items in retryFailedResets.
2020-10-06 16:35:23 +01:00
Leo Luan
5f1d5a1897 vfs: Fix a deadlock vulnerability in downloaders.Close
The downloaders.Close() call acquires the downloaders' mutex before
calling the wait group wait and the main downloaders thread has a
periodical (5 seconds interval) call to kick its waiters and the
waiter dispatch function tries to get the mutex. So a deadlock can
occur if the Close() call starts, gets the mutex, while the main
downloader thread already got the timer's tick and proceeded to
call kickWaiters. The deadlock happens when the Close call gets
the mutex between the timer's kick and the main downloader thread
gets the mutex first. So it's a pretty short period of time and
it probably explains why the problem has not surfaced, maybe
something like tens of nanoseconds out of 5 seconds (~10^^-8).
It took 5 days of continued stressing the Close calls for the
deadlock to appear.
2020-10-06 16:35:23 +01:00
Hekmon
66def93373 mount cmd: update systemd status with cache stats 2020-10-06 16:21:30 +01:00
Nick Craig-Wood
18ccf0f871 vfs: detect and recover from a file being removed externally from the cache
Before this change if a file was removed from the cache while rclone
is running then rclone would not notice and proceed to re-create it
full of zeros.

This change notices files that we expect to have data in going missing
and if they do logs an ERROR recovers.

It isn't recommended deleting files from the cache manually with
rclone running!

See: https://forum.rclone.org/t/corrupted-data-streaming-after-vfs-meta-files-removed/18997
Fixes #4602
2020-09-18 10:30:02 +01:00
Nick Craig-Wood
6a56ac1032 vfs,local: Log an ERROR if we fail to set the file to be sparse
See: https://forum.rclone.org/t/rclone-1-53-release/18880/73
2020-09-11 15:36:47 +01:00
Nick Craig-Wood
27b9ae4fc3 vfs: fix spurious error "vfs cache: failed to _ensure cache EOF"
Before this change the error message was produced for every file which
was confusing users.

After this change we check for EOF and return from ReadAt at that
point.

See: https://forum.rclone.org/t/rclone-1-53-release/18880/10
2020-09-03 10:25:00 +01:00
Nick Craig-Wood
3305079a03 vfs: fix typos in help 2020-09-02 14:24:44 +01:00
Sam Edwards
23b2c58018 vfs: Quiet removeNotInUse logging to debug when not removing 2020-09-02 11:55:20 +01:00
Nick Craig-Wood
70c8566cb8 fs: Pin created backends until parents are finalized
This attempts to solve the backend lifecycle problem by

- Pinning backends mentioned on the command line into the cache
  indefinitely

- Unpinning backends when the containing structure (VFS, wrapping
  backend) is destroyed

See: https://forum.rclone.org/t/rclone-rc-backend-command-not-working-as-expected/18834
2020-09-01 18:21:03 +01:00
Leo Luan
c665201b85 vfs: support synchronous cache space recovery upon ENOSPC
This patch provides the support of synchronous cache space recovery
to allow read threads to recover from ENOSPC errors when cache space
can be recovered from cache items that are not in use or safe to be
reset/emptied .

The patch complements the existing cache cleaning process in two ways.

Firstly, the existing cache cleaning process is time-driven that runs
periodically. The cache space can run out while the cache cleaner
thread is still waiting for its next scheduled run. The io threads
encountering ENOSPC return an internal error to the applications
in this case even when cache space can be recovered to avoid this
error. This patch addresses this problem by having the read threads
kick the cache cleaner thread in this condition to recover cache
space preventing unnecessary ENOSPC errors from being seen by the
applications.

Secondly, this patch enhances the cache cleaner to support cache
item reset. Currently the cache purge process removes cache
items that are not in use. This may not be sufficient when the
total size of the working set exceeds the cache directory's
capacity. Like in the current code, this patch starts the purge
process by removing cache files that are not in use. Cache items
whose access times are older than vfs-cache-max-age are removed first.
After that, other not-in-use items are removed in LRU order until
vfs-cache-max-size is reached. If the vfs-cache-max-size (the quota)
is still not reached at this time, this patch adds a cache reset
step to reset/empty cache files that are still in use but not
dirtied.  This enables application processes to continue without
seeing an error even when the working set depletes the cache space
as long as there is not a large write working set hoarding the
entire cache space.

By design this patch does not add ENOSPC error recovery for write
IOs. Rclone does not empty a write cache item until the file data
is written back to the backend upon close. Allowing more cache
space to be consumed by dirty cache items when the cache space is
already running low would increase the risk of exhausting the cache
space in a way that the vfs mount becomes unreadable.
2020-08-25 21:12:06 +01:00
Nick Craig-Wood
94a0991584 vfs: set the modtime of the cache file immediately
Before this change we set the modtime of the cache file when all
writers had finished.

This has the unfortunate effect that the file is uploaded with the
wrong modtime which means on backends which can't set modtimes except
when uploading files it is wrong.

This change sets the modtime of the cache file immediately in the
cache and in turn sets the modtime in the file info.
2020-08-20 16:24:04 +01:00