rclone/docs/content/tardigrade.md
Elek, Márton bcb07a67f6 tardigrade: update docs to explain differences between s3 and this backend
Co-authored-by: Caleb Case <calebcase@gmail.com>
2022-02-08 11:40:29 +00:00

411 lines
14 KiB
Markdown

---
title: "Tardigrade"
description: "Rclone docs for Tardigrade"
---
# {{< icon "fas fa-dove" >}} Tardigrade
[Tardigrade](https://tardigrade.io) is an encrypted, secure, and
cost-effective object storage service that enables you to store, back up, and
archive large amounts of data in a decentralized manner.
## Backend options
Storj can be used both with this native backend and with the [s3
backend using the Storj S3 compatible gateway](/s3/#storj) (shared or private).
Use this backend to take advantage of client-side encryption as well
as to achieve the best possible download performance. Uploads will be
erasure-coded locally, thus a 1gb upload will result in 2.68gb of data
being uploaded to storage nodes across the network.
Use the s3 backend and one of the S3 compatible Hosted Gateways to
increase upload performance and reduce the load on your systems and
network. Uploads will be encrypted and erasure-coded server-side, thus
a 1GB upload will result in only in 1GB of data being uploaded to
storage nodes across the network.
Side by side comparison with more details:
* Characteristics:
* *Tardigrade backend*: Uses native RPC protocol, connects directly
to the storage nodes which hosts the data. Requires more CPU
resource of encoding/decoding and has network amplification
(especially during the upload), uses lots of TCP connections
* *S3 backend*: Uses S3 compatible HTTP Rest API via the shared
gateways. There is no network amplification, but performance
depends on the shared gateways and the secret encryption key is
shared with the gateway.
* Typical usage:
* *Tardigrade backend*: Server environments and desktops with enough
resources, internet speed and connectivity - and applications
where tardigrades client-side encryption is required.
* *S3 backend*: Desktops and similar with limited resources,
internet speed or connectivity.
* Security:
* *Tardigrade backend*: __strong__. Private encryption key doesn't
need to leave the local computer.
* *S3 backend*: __weaker__. Private encryption key is [shared
with](https://docs.storj.io/dcs/api-reference/s3-compatible-gateway#security-and-encryption)
the authentication service of the hosted gateway, where it's
stored encrypted. It can be stronger when combining with the
rclone [crypt](/crypt) backend.
* Bandwidth usage (upload):
* *Tardigrade backend*: __higher__. As data is erasure coded on the
client side both the original data and the parities should be
uploaded. About ~2.7 times more data is required to be uploaded.
Client may start to upload with even higher number of nodes (~3.7
times more) and abandon/stop the slow uploads.
* *S3 backend*: __normal__. Only the raw data is uploaded, erasure
coding happens on the gateway.
* Bandwidth usage (download)
* *Tardigrade backend*: __almost normal__. Only the minimal number
of data is required, but to avoid very slow data providers a few
more sources are used and the slowest are ignored (max 1.2x
overhead).
* *S3 backend*: __normal__. Only the raw data is downloaded, erasure coding happens on the shared gateway.
* CPU usage:
* *Tardigrade backend*: __higher__, but more predictable. Erasure
code and encryption/decryption happens locally which requires
significant CPU usage.
* *S3 backend*: __less__. Erasure code and encryption/decryption
happens on shared s3 gateways (and as is, it depends on the
current load on the gateways)
* TCP connection usage:
* *Tardigrade backend*: __high__. A direct connection is required to
each of the Storj nodes resulting in 110 connections on upload and
35 on download per 64 MB segment. Not all the connections are
actively used (slow ones are pruned), but they are all opened.
[Adjusting the max open file limit](/tardigrade/#known-issues) may
be required.
* *S3 backend*: __normal__. Only one connection per download/upload
thread is required to the shared gateway.
* Overall performance:
* *Tardigrade backend*: with enough resources (CPU and bandwidth)
*tardigrade* backend can provide even 2x better performance. Data
is directly downloaded to / uploaded from to the client instead of
the gateway.
* *S3 backend*: Can be faster on edge devices where CPU and network
bandwidth is limited as the shared S3 compatible gateways take
care about the encrypting/decryption and erasure coding and no
download/upload amplification.
* Decentralization:
* *Tardigrade backend*: __high__. Data is downloaded directly from
the distributed cloud of storage providers.
* *S3 backend*: __low__. Requires a running S3 gateway (either
self-hosted or Storj-hosted).
* Limitations:
* *Tardigrade backend*: `rclone checksum` is not possible without
download, as checksum metadata is not calculated during upload
* *S3 backend*: secret encryption key is shared with the gateway
## Configuration
To make a new Tardigrade configuration you need one of the following:
* Access Grant that someone else shared with you.
* [API Key](https://documentation.tardigrade.io/getting-started/uploading-your-first-object/create-an-api-key)
of a Tardigrade project you are a member of.
Here is an example of how to make a remote called `remote`. First run:
rclone config
This will guide you through an interactive setup process:
### Setup with access grant
```
No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> remote
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
[snip]
XX / Tardigrade Decentralized Cloud Storage
\ "tardigrade"
[snip]
Storage> tardigrade
** See help for tardigrade backend at: https://rclone.org/tardigrade/ **
Choose an authentication method.
Enter a string value. Press Enter for the default ("existing").
Choose a number from below, or type in your own value
1 / Use an existing access grant.
\ "existing"
2 / Create a new access grant from satellite address, API key, and passphrase.
\ "new"
provider> existing
Access Grant.
Enter a string value. Press Enter for the default ("").
access_grant> your-access-grant-received-by-someone-else
Remote config
--------------------
[remote]
type = tardigrade
access_grant = your-access-grant-received-by-someone-else
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
```
### Setup with API key and passphrase
```
No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> remote
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
[snip]
XX / Tardigrade Decentralized Cloud Storage
\ "tardigrade"
[snip]
Storage> tardigrade
** See help for tardigrade backend at: https://rclone.org/tardigrade/ **
Choose an authentication method.
Enter a string value. Press Enter for the default ("existing").
Choose a number from below, or type in your own value
1 / Use an existing access grant.
\ "existing"
2 / Create a new access grant from satellite address, API key, and passphrase.
\ "new"
provider> new
Satellite Address. Custom satellite address should match the format: `<nodeid>@<address>:<port>`.
Enter a string value. Press Enter for the default ("us-central-1.tardigrade.io").
Choose a number from below, or type in your own value
1 / US Central 1
\ "us-central-1.tardigrade.io"
2 / Europe West 1
\ "europe-west-1.tardigrade.io"
3 / Asia East 1
\ "asia-east-1.tardigrade.io"
satellite_address> 1
API Key.
Enter a string value. Press Enter for the default ("").
api_key> your-api-key-for-your-tardigrade-project
Encryption Passphrase. To access existing objects enter passphrase used for uploading.
Enter a string value. Press Enter for the default ("").
passphrase> your-human-readable-encryption-passphrase
Remote config
--------------------
[remote]
type = tardigrade
satellite_address = 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S@us-central-1.tardigrade.io:7777
api_key = your-api-key-for-your-tardigrade-project
passphrase = your-human-readable-encryption-passphrase
access_grant = the-access-grant-generated-from-the-api-key-and-passphrase
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
```
{{< rem autogenerated options start" - DO NOT EDIT - instead edit fs.RegInfo in backend/tardigrade/tardigrade.go then run make backenddocs" >}}
### Standard options
Here are the standard options specific to tardigrade (Tardigrade Decentralized Cloud Storage).
#### --tardigrade-provider
Choose an authentication method.
- Config: provider
- Env Var: RCLONE_TARDIGRADE_PROVIDER
- Type: string
- Default: "existing"
- Examples:
- "existing"
- Use an existing access grant.
- "new"
- Create a new access grant from satellite address, API key, and passphrase.
#### --tardigrade-access-grant
Access grant.
- Config: access_grant
- Env Var: RCLONE_TARDIGRADE_ACCESS_GRANT
- Type: string
- Default: ""
#### --tardigrade-satellite-address
Satellite address.
Custom satellite address should match the format: `<nodeid>@<address>:<port>`.
- Config: satellite_address
- Env Var: RCLONE_TARDIGRADE_SATELLITE_ADDRESS
- Type: string
- Default: "us-central-1.tardigrade.io"
- Examples:
- "us-central-1.tardigrade.io"
- US Central 1
- "europe-west-1.tardigrade.io"
- Europe West 1
- "asia-east-1.tardigrade.io"
- Asia East 1
#### --tardigrade-api-key
API key.
- Config: api_key
- Env Var: RCLONE_TARDIGRADE_API_KEY
- Type: string
- Default: ""
#### --tardigrade-passphrase
Encryption passphrase.
To access existing objects enter passphrase used for uploading.
- Config: passphrase
- Env Var: RCLONE_TARDIGRADE_PASSPHRASE
- Type: string
- Default: ""
{{< rem autogenerated options stop >}}
## Usage
Paths are specified as `remote:bucket` (or `remote:` for the `lsf`
command.) You may put subdirectories in too, e.g. `remote:bucket/path/to/dir`.
Once configured you can then use `rclone` like this.
### Create a new bucket
Use the `mkdir` command to create new bucket, e.g. `bucket`.
rclone mkdir remote:bucket
### List all buckets
Use the `lsf` command to list all buckets.
rclone lsf remote:
Note the colon (`:`) character at the end of the command line.
### Delete a bucket
Use the `rmdir` command to delete an empty bucket.
rclone rmdir remote:bucket
Use the `purge` command to delete a non-empty bucket with all its content.
rclone purge remote:bucket
### Upload objects
Use the `copy` command to upload an object.
rclone copy --progress /home/local/directory/file.ext remote:bucket/path/to/dir/
The `--progress` flag is for displaying progress information.
Remove it if you don't need this information.
Use a folder in the local path to upload all its objects.
rclone copy --progress /home/local/directory/ remote:bucket/path/to/dir/
Only modified files will be copied.
### List objects
Use the `ls` command to list recursively all objects in a bucket.
rclone ls remote:bucket
Add the folder to the remote path to list recursively all objects in this folder.
rclone ls remote:bucket/path/to/dir/
Use the `lsf` command to list non-recursively all objects in a bucket or a folder.
rclone lsf remote:bucket/path/to/dir/
### Download objects
Use the `copy` command to download an object.
rclone copy --progress remote:bucket/path/to/dir/file.ext /home/local/directory/
The `--progress` flag is for displaying progress information.
Remove it if you don't need this information.
Use a folder in the remote path to download all its objects.
rclone copy --progress remote:bucket/path/to/dir/ /home/local/directory/
### Delete objects
Use the `deletefile` command to delete a single object.
rclone deletefile remote:bucket/path/to/dir/file.ext
Use the `delete` command to delete all object in a folder.
rclone delete remote:bucket/path/to/dir/
### Print the total size of objects
Use the `size` command to print the total size of objects in a bucket or a folder.
rclone size remote:bucket/path/to/dir/
### Sync two Locations
Use the `sync` command to sync the source to the destination,
changing the destination only, deleting any excess files.
rclone sync -i --progress /home/local/directory/ remote:bucket/path/to/dir/
The `--progress` flag is for displaying progress information.
Remove it if you don't need this information.
Since this can cause data loss, test first with the `--dry-run` flag
to see exactly what would be copied and deleted.
The sync can be done also from Tardigrade to the local file system.
rclone sync -i --progress remote:bucket/path/to/dir/ /home/local/directory/
Or between two Tardigrade buckets.
rclone sync -i --progress remote-us:bucket/path/to/dir/ remote-europe:bucket/path/to/dir/
Or even between another cloud storage and Tardigrade.
rclone sync -i --progress s3:bucket/path/to/dir/ tardigrade:bucket/path/to/dir/
## Limitations
`rclone about` is not supported by the rclone Tardigrade backend. Backends without
this capability cannot determine free space for an rclone mount or
use policy `mfs` (most free space) as a member of an rclone union
remote.
See [List of backends that do not support rclone about](https://rclone.org/overview/#optional-features)
See [rclone about](https://rclone.org/commands/rclone_about/)
## Known issues
If you get errors like `too many open files` this usually happens when the default `ulimit` for system max open files is exceeded. Native Storj protocol opens a large number of TCP connections (each of which is counted as an open file). For a single upload stream you can expect 110 TCP connections to be opened. For a single download stream you can expect 35. This batch of connections will be opened for every 64 MiB segment and you should also expect TCP connections to be reused. If you do many transfers you eventually open a connection to most storage nodes (thousands of nodes).
To fix these, please raise your system limits. You can do this issuing a `ulimit -n 65536` just before you run rclone. To change the limits more permanently you can add this to your shell startup script, e.g. `$HOME/.bashrc`, or change the system-wide configuration, usually `/etc/sysctl.conf` and/or `/etc/security/limits.conf`, but please refer to your operating system manual.