"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
Michał Matczuk f396550934 backend/local: Avoid polluting page cache when uploading local files to remote backends
This patch keeps Linux page cache usage under control when rclone
uploads local files to remote backends. When opening a file, rclone issues
FADV_SEQUENTIAL to configure the readahead strategy. While reading
the file, it issues FADV_DONTNEED every 128 KiB to evict already
consumed pages from the page cache.

```
fadvise64(5, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(5, "\324\375\251\376\213\361\240\224>\t5E\301\331X\274^\203oA\353\303.2'\206z\177N\27fB"..., 32768) = 32768
read(5, "\361\311\vW!\354_\317hf\276t\307\30L\351\272T\342C\243\370\240\213\355\210\v\221\201\177[\333"..., 32768) = 32768
read(5, ":\371\337Gn\355C\322\334 \253f\373\277\301;\215\n\240\347\305\6N\257\313\4\365\276ANq!"..., 32768) = 32768
read(5, "\312\243\360P\263\242\267H\304\240Y\310\367sT\321\256\6[b\310\224\361\344$Ms\234\5\314\306i"..., 32768) = 32768
fadvise64(5, 0, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "m\251\7a\306\226\366-\v~\"\216\353\342~0\fht\315DK0\236.\\\201!A#\177\320"..., 32768) = 32768
read(5, "\7\324\207,\205\360\376\307\276\254\250\232\21G\323n\255\354\234\257P\322y\3502\37\246\21\334^42"..., 32768) = 32768
read(5, "e{*\225\223R\320\212EG:^\302\377\242\337\10\222J\16A\305\0\353\354\326P\336\357A|-"..., 32768) = 32768
read(5, "n\23XA4*R\352\234\257\364\355Y\204t9T\363\33\357\333\3674\246\221T\360\226\326G\354\374"..., 32768) = 32768
fadvise64(5, 131072, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "SX\331\251}\24\353\37\310#\307|h%\372\34\310\3070YX\250s\2269\242\236\371\302z\357_"..., 32768) = 32768
read(5, "\177\3500\236Y\245\376NIY\177\360p!\337L]\2726\206@\240\246pG\213\254N\274\226\303\357"..., 32768) = 32768
read(5, "\242$*\364\217U\264]\221Y\245\342r\t\253\25Hr\363\263\364\336\322\t\325\325\f\37z\324\201\351"..., 32768) = 32768
read(5, "\2305\242\366\370\203tM\226<\230\25\316(9\25x\2\376\212\346Q\223 \353\225\323\264jf|\216"..., 32768) = 32768
fadvise64(5, 262144, 131072, POSIX_FADV_DONTNEED) = 0
```

Page cache consumption per file can be checked with tools like [pcstat](https://github.com/tobert/pcstat).
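
The hint pattern in the trace above can be sketched in Go as a reader wrapper. This is a minimal illustration, not the code from this patch: `fadviseReader`, `adviseFunc`, `adviseCall`, and `simulate` are hypothetical names, and the system call is replaced by a callback so the sketch builds on any platform. On Linux the callback could wrap `unix.Fadvise` from `golang.org/x/sys/unix`. The sketch also assumes that only complete 128 KiB windows are released.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Advice values mirroring the Linux POSIX_FADV_* constants seen in the
// strace output. They are redeclared here so the sketch builds everywhere.
const (
	fadvSequential = 2
	fadvDontNeed   = 4
)

// window is the amount of data consumed between two DONTNEED hints.
const window = 128 * 1024

// adviseFunc is a hypothetical stand-in for the fadvise system call.
type adviseFunc func(offset, length int64, advice int) error

// adviseCall records one hint so that the pattern can be inspected.
type adviseCall struct {
	Offset, Length int64
	Advice         int
}

// fadviseReader wraps a reader and issues hints as data is consumed.
type fadviseReader struct {
	r      io.Reader
	advise adviseFunc
	read   int64 // total bytes read so far
	freed  int64 // offset up to which pages have been released
}

func newFadviseReader(r io.Reader, advise adviseFunc) *fadviseReader {
	// Length 0 means "until end of file", as in the first traced call.
	// Hints are advisory, so errors are ignored in this sketch.
	_ = advise(0, 0, fadvSequential)
	return &fadviseReader{r: r, advise: advise}
}

func (f *fadviseReader) Read(p []byte) (int, error) {
	n, err := f.r.Read(p)
	f.read += int64(n)
	// Release every complete window that has been fully consumed.
	for f.read-f.freed >= window {
		_ = f.advise(f.freed, window, fadvDontNeed)
		f.freed += window
	}
	return n, err
}

// simulate reads size bytes in chunk-sized reads from an in-memory source
// and returns the hints that would have been issued.
func simulate(size, chunk int) []adviseCall {
	var calls []adviseCall
	rec := func(offset, length int64, advice int) error {
		calls = append(calls, adviseCall{offset, length, advice})
		return nil
	}
	r := newFadviseReader(strings.NewReader(strings.Repeat("x", size)), rec)
	buf := make([]byte, chunk)
	for {
		if _, err := r.Read(buf); err != nil {
			break
		}
	}
	return calls
}

func main() {
	// 384 KiB read in 32 KiB chunks, the same sizes as in the trace.
	for _, c := range simulate(384*1024, 32*1024) {
		fmt.Printf("advise(offset=%d, length=%d, advice=%d)\n", c.Offset, c.Length, c.Advice)
	}
}
```

With these sizes the sketch issues one sequential hint followed by a release hint after every fourth read, matching the offsets 0, 131072, and 262144 in the trace.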

This patch has no negative performance impact in the experiment below,
which compares a local copy of a 1 GiB file with and without the patch:
elapsed time was 14.81 s with the patch and 16.41 s without it.

With the patch:

```
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
(mmt/fadvise)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 13.19
        System time (seconds): 1.12
        Percent of CPU this job got: 96%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.81
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27660
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2212
        Voluntary context switches: 5755
        Involuntary context switches: 9782
        Swaps: 0
        File system inputs: 4155264
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
```

Without the patch:

```
(master)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 14.46
        System time (seconds): 0.81
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.41
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27600
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2228
        Voluntary context switches: 7190
        Involuntary context switches: 1980
        Swaps: 0
        File system inputs: 2097152
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(master)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 262144    | 100.000 |
+-----------+----------------+------------+-----------+---------+
```
2019-08-08 23:41:52 +01:00
.circleci build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
.github build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
backend backend/local: Avoid polluting page cache when uploading local files to remote backends 2019-08-08 23:41:52 +01:00
bin serve: add auth proxy infrastructure 2019-08-06 11:43:42 +01:00
cmd Revert "cmd: shorten the locking window when using --progress to avoid deadlock" 2019-08-08 15:19:41 +01:00
docs docs: update bugs and limitations document 2019-08-04 12:33:39 +01:00
fs accounting: fix locking in Transfer to avoid deadlock with --progress 2019-08-08 15:46:46 +01:00
fstest lib/random: unify random string generation into random.String 2019-08-06 12:44:08 +01:00
graphics New graphics used by forum.rclone.org 2016-10-04 11:31:42 +01:00
lib lib/random: unify random string generation into random.String 2019-08-06 12:44:08 +01:00
vendor build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
vfs vfs: make write without cache more efficient 2019-08-08 12:37:50 +01:00
.appveyor.yml build: fix appveyor secure variables after project move 2019-07-28 22:46:26 +01:00
.gitattributes build: add azure pipelines build 2019-08-06 10:31:32 +01:00
.gitignore chore: update .gitignore 2019-06-19 11:59:46 +01:00
.golangci.yml build: move linter build tags into Makefile to fix golangci-lint 2019-04-12 15:48:36 +01:00
.pkgr.yml Updated .pkgr.yml file to use rclone as its own cli. 2017-03-29 17:48:53 +01:00
.travis.yml build: fix up CI and CI badges after repo move 2019-07-28 20:07:04 +01:00
azure-pipelines.yml build: add azure pipelines build 2019-08-06 10:31:32 +01:00
CONTRIBUTING.md build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
COPYING Initial commit - some small parts working 2012-11-18 17:32:31 +00:00
go.mod build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
go.sum vendor: add github.com/sirupsen/logrus 2019-07-28 12:05:50 +01:00
MAINTAINERS.md build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
Makefile build: add azure pipelines build 2019-08-06 10:31:32 +01:00
MANUAL.html build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
MANUAL.md build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
MANUAL.txt build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
notes.txt Replace test_all.sh with test_all.go which is cross platform and parallel 2015-12-30 09:26:34 +00:00
rclone.1 build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
rclone.go build: fix up package paths after repo move 2019-07-28 18:47:38 +01:00
README.md build: add Azure Pipelines build status to README 2019-08-06 10:46:36 +01:00
RELEASE.md vendor: update all dependencies 2019-04-15 20:12:56 +01:00


Website | Documentation | Download | Contributing | Changelog | Installation | Forum


Rclone

Rclone ("rsync for cloud storage") is a command line program to sync files and directories to and from different cloud storage providers.

Storage providers

Please see the full list of all storage providers and their features

Features

  • MD5/SHA-1 hashes checked at all times for file integrity
  • Timestamps preserved on files
  • Partial syncs supported on a whole file basis
  • Copy mode to just copy new/changed files
  • Sync (one way) mode to make a directory identical
  • Check mode to check for file hash equality
  • Can sync to and from network, e.g. two different cloud accounts
  • Optional encryption (Crypt)
  • Optional cache (Cache)
  • Optional FUSE mount (rclone mount)
  • Multi-threaded downloads to local disk
  • Can serve local or remote files over HTTP/WebDAV/FTP/SFTP/DLNA

Installation & documentation

Please see the rclone website for:

Downloads

License

This is free software under the terms of the MIT license (check the COPYING file included in this package).