Commit graph

147 commits

Author SHA1 Message Date
Nick Craig-Wood
03295bbc3c azureblob: fix data corruption bug #7590
It was reported that rclone copy occasionally uploaded corrupted data
to azure blob.

This turned out to be a race condition updating the block count which
caused blocks to be duplicated.

This bug was introduced in this commit in v1.64.0 and will be fixed in v1.65.2

0427177857 azureblob: implement OpenChunkWriter and multi-thread uploads #7056

This race only seems to happen if `--checksum` is used but can happen otherwise.

Unfortunately Azure blob does not check the MD5 that we send them so
despite sending incorrect data this corruption is not detected. The
corruption is detected when rclone tries to download the file, so
attempting to copy the files back to local disk will result in errors
such as:

    ERROR : file.pokosuf5.partial: corrupted on transfer: md5 hash differ "XXX" vs "YYY"

This adds a check to test the blocklist we upload is as we expected
which would have caught the problem had it been in place earlier.
2024-01-24 11:28:05 +00:00
Nick Craig-Wood
519fe98e6e azureblob: implement --azureblob-delete-snapshots
This flag controls what happens when we try to delete a blob with a
snapshot. The UI follows the azcopy tool.

See: https://forum.rclone.org/t/how-to-delete-undeleted-blobs-on-azure/43911/
2024-01-13 14:27:54 +00:00
Ivan Yanitra
0ee6d0b4bf azureblob: add support cold tier 2023-10-18 17:54:25 +01:00
Dimitri Papadopoulos Orfanos
3d473eb54e
docs: fix typos found by codespell in docs and code comments 2023-09-23 12:20:01 +01:00
Nick Craig-Wood
55c12c9a2d azureblob: fix "fatal error: concurrent map writes"
Before this change, the metadata map could be accessed from multiple
goroutines at once, sometimes causing this error.

This fix adds a global mutex for adjusting the metadata map to make
all accesses safe.

See: https://forum.rclone.org/t/azure-blob-storage-with-vfs-cache-concurrent-map-writes-exception/41686
2023-09-16 11:33:03 +01:00
Nick Craig-Wood
bd23ea028e azureblob: fix purging with directory markers 2023-09-05 17:07:44 +01:00
Nick Craig-Wood
b1c0ae5e7d azureblob: fix creation of directory markers
This also fixes the integration tests which is why we didn't notice this before!
2023-09-03 18:09:31 +01:00
Nick Craig-Wood
2db0e23584 backends: change OpenChunkWriter interface to allow backend concurrency override
Before this change the concurrency used for an upload was rather
inconsistent.

- if size below `--backend-upload-cutoff` (default 200M) do single part upload.

- if size below `--multi-thread-cutoff` (default 256M) or using streaming
  uploads (eg `rclone rcat) do multipart upload using
  `--backend-upload-concurrency` to set the concurrency used by the uploader.

- otherwise do multipart upload using `--multi-thread-streams` to set the
  concurrency.

This change makes the default for the concurrency used be the
`--backend-upload-concurrency`. If `--multi-thread-streams` is set and larger
than the `--backend-upload-concurrency` then that will be used instead.

This means that if the user sets `--backend-upload-concurrency` then it will be
obeyed for all multipart/multi-thread transfers and the user can override them
all with `--multi-thread-streams`.

See: #7056
2023-09-03 11:47:05 +01:00
Nick Craig-Wood
0427177857 azureblob: implement OpenChunkWriter and multi-thread uploads #7056
This implements the OpenChunkWriter interface for azureblob which
enables multi-thread uploads.

This makes the memory controls of the s3 backend inoperative; they are
replaced with the global ones.

    --azureblob-memory-pool-flush-time
    --azureblob-memory-pool-use-mmap

By using the buffered reader this fixes excessive memory use when
uploading large files as it will share memory pages between all
readers.
2023-08-24 12:39:27 +01:00
Anagh Kumar Baranwal
0ef0e908ca build: update to go1.21rc3 and make go1.19 the minimum required version
Signed-off-by: Anagh Kumar Baranwal <6824881+darthShadow@users.noreply.github.com>
2023-07-16 10:09:25 +01:00
Nick Craig-Wood
d0d41fe847 rclone config redacted: implement support mechanism for showing redacted config
This introduces a new fs.Option flag, Sensitive and uses this along
with IsPassword to redact the info in the config file for support
purposes.

It adds this flag into backends where appropriate. It was necessary to
add oauthutil.SharedOptions to some backends as they were missing
them.

Fixes #5209
2023-07-07 16:25:14 +01:00
Nick Craig-Wood
72b79504ea azureblob: fix "Entry doesn't belong in directory" errors when using directory markers
Before this change we were incorrectly identifying the root directory
of the listing and adding it into the listing.

This caused higher layers of rclone to emit the error above.

See #7038
2023-06-23 18:01:11 +01:00
Nick Craig-Wood
f080ec437c azureblob: empty directory markers #3453 2023-05-07 12:47:09 +01:00
Nick Craig-Wood
70fe2ac852 azureblob: fix azure blob uploads with multiple bits of metadata 2023-05-07 12:47:09 +01:00
Roel Arents
a79db20bcd azureblob: send nil tier if empty string 2023-04-05 15:08:32 +01:00
albertony
fb316123ec azureblob: remove unused code (fixes issue reported by the unused linter) 2023-03-26 14:28:15 +02:00
Dimitri Papadopoulos
9183618082 backend: fix typos found by codespell 2023-03-24 20:42:45 +00:00
Nick Craig-Wood
22daeaa6f3 build: update dependencies
This fixes the azureblob backend so it builds again after the SDK
changes.

This doesn't update bazil.org/fuse because it doesn't build on FreeBSD

https://github.com/bazil/fuse/issues/295
2023-03-10 11:15:07 +00:00
Nick Craig-Wood
559157cb58 azureblob: remove workarounds for SDK bugs after v0.6.1 update 2023-01-16 11:19:16 +00:00
Nick Craig-Wood
10bf8a769e build: update dependencies
This fixes the azureblob backend so it builds again after the SDK
changes.
2023-01-16 11:19:16 +00:00
Nick Craig-Wood
54c0f17f2a azureblob: fix "409 Public access is not permitted on this storage account"
This error was caused by rclone supplying an empty
`x-ms-blob-public-access:` header when creating a container for
private access, rather than omitting it completely.

This is a valid way of specifying containers should be private, but if
the storage account has the flag "Blob public access" unset then it
gives "409 Public access is not permitted on this storage account".

This patch fixes the problem by only supplying the header if the
access is set.

Fixes #6645
2022-12-23 12:28:07 +00:00
Abdullah Saglam
7be9855a70 azureblob: implement --use-server-modtime
This patch implements --use-server-modtime for the Azureblob backend.

It does this by not reading the time from the metadata if the global
flag is set.
2022-12-15 15:58:36 +00:00
Nick Craig-Wood
3d291da0f6 azureblob: fix directory marker detection after SDK upgrade
When the SDK was upgraded it started delivering metadata where the
keys were not in lower case as per the old SDK.

Rclone normalises the case of the keys for storage in the Object, but
the directory marker check was being done with the unnormalised keys
as it needs to be done before the Object is created.

This fixes the directory marker check to do a case insensitive compare
of the metadata keys.
2022-12-14 14:24:26 +00:00
Nick Craig-Wood
d7cb17848d azureblob: revamp authentication to include all methods and docs
The updates the authentication to include

- Auth from the environment
    1. Environment Variables
    2. Managed Service Identity Credentials
    3. Azure CLI credentials (as used by the az tool)
- Account and Shared Key
- SAS URL
- Service principal with client secret
- Service principal with certificate
- User with username and password
- Managed Service Identity Credentials

And rationalises the auth order.
2022-12-06 15:07:01 +00:00
Nick Craig-Wood
f3c8b7a948 azureblob: add --azureblob-no-check-container to assume container exists
Normally rclone will check the container exists before uploading if it
hasn't listed the container yet.

Often rclone will be running with a limited set of permissions which
means rclone can't create the container anyway, so this stops the
check.

This will save a transaction.
2022-12-06 15:07:01 +00:00
Nick Craig-Wood
914fbe242c azureblob: ignore AuthorizationFailure when trying to create a create a container
If we get AuthorizationFailure when trying to create a container, then
assume the container has already been created
2022-12-06 15:07:01 +00:00
Nick Craig-Wood
f746b2fe85 azureblob: port old authentication methods to new SDK
Co-authored-by: Brad Ackerman <brad@facefault.org>
2022-12-06 15:07:01 +00:00
Nick Craig-Wood
a131da2c35 azureblob: Port to new SDK
This commit switches from using the old Azure go modules

    github.com/Azure/azure-pipeline-go/pipeline
    github.com/Azure/azure-storage-blob-go/azblob
    github.com/Azure/go-autorest/autorest/adal

To the new SDK

    github.com/Azure/azure-sdk-for-go/

This stops rclone using deprecated code and enables the full range of
authentication with Azure.

See #6132 and #5284
2022-12-06 15:07:01 +00:00
Nathaniel Wesley Filardo
f7cdf318db azureblob: support simple "environment credentials"
As per
https://learn.microsoft.com/en-us/dotnet/api/azure.identity.environmentcredential?view=azure-dotnet

This supports only AZURE_CLIENT_SECRET-based authentication, as with the
existing service principal support.

Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
2022-11-24 12:06:14 +00:00
Nathaniel Wesley Filardo
6f3682c12f azureblob: make newServicePrincipalTokenRefresher take parsed principal structure 2022-11-24 12:06:14 +00:00
rkettelerij
87fa9f8e46 azureblob: Add support for custom upload headers 2022-11-14 15:12:28 +00:00
Roel Arents
e455940f71
azureblob: allow emulator account/key override 2022-11-08 20:24:06 +00:00
albertony
5d6b8141ec Replace deprecated ioutil
As of Go 1.16, the same functionality is now provided by package io or
package os, and those implementations should be preferred in new code.
2022-11-07 11:41:47 +00:00
albertony
555def2da7 build: add package comments to silence revive linter 2022-08-28 13:43:51 +02:00
Nick Craig-Wood
0501773db1 azureblob,b2,s3: fix chunksize calculations producing too many parts
Before this fix, the chunksize calculator was using the previous size
of the object, not the new size of the object to calculate the chunk
sizes.

This meant that uploading a replacement object which needed a new
chunk size would fail, using too many parts.

This fix fixes the calculator to take the size explicitly.
2022-08-09 12:57:38 +01:00
Nick Craig-Wood
6fd9e3d717 build: reformat comments to pass go1.19 vet
See: https://go.dev/doc/go1.19#go-doc
2022-08-05 16:35:41 +01:00
Lorenzo Maiorfi
b5efffee9d
azureblob: allow remote emulator (azurite) - fixes #6290 2022-07-06 11:54:04 +01:00
albertony
7822df565e staticcheck: unused func 2022-07-04 11:24:59 +02:00
albertony
fdd2f8e6d2 Error strings should not be capitalized
Reported by staticcheck 2022.1.2 (v0.3.2)

See: staticcheck.io
2022-06-23 23:26:02 +02:00
Nick Craig-Wood
bb6edb3c39 build: update dependencies
Also:

- azureblob: fix compile after API change in upstream library
2022-06-08 18:29:42 +01:00
Rob Pickerill
6d342a3c5b azureblob: case insensitive access tier 2022-05-24 09:19:08 +01:00
Derek Battams
c0985e93b7 azureblob: use chunksize lib to determine chunksize dynamically 2022-05-13 09:25:48 +01:00
Leroy van Logchem
75455d4000
azureblob: Calculate Chunksize/blocksize to stay below maxUploadParts 2022-04-26 17:37:40 +01:00
Nick Craig-Wood
e4f5912294 azureblob: fix lint error with golangci-lint 1.45.0 2022-03-22 16:33:24 +00:00
Nick Craig-Wood
c2557cc432 azureblob: fix crash with SAS URL and no container - fixes #5820
Before this change attempting NewObject on a SAS URL's root would
crash the Azure SDK.

This change detects that using the code from this previous fix

f7404f52e7 azureblob: fix crash when listing outside a SAS URL's root - fixes #4851

And returns not object not found instead.

It also prevents things being uploaded to the root of the SAS URL
which also crashes the Azure SDK.
2021-11-27 16:18:18 +00:00
Nick Craig-Wood
df07964db3 azureblob: raise --azureblob-upload-concurrency to 16 by default
After speed testing it was discovered that upload speed goes up pretty
much linearly with upload concurrency. This patch changes the default
from 4 to 16 which means that rclone will use 16 * 4M = 64M per
transfer which is OK even for low memory devices.

This adds a note that performance may be increased by increasing
upload concurrency.

See: https://forum.rclone.org/t/performance-of-rclone-vs-azcopy/27437/9
2021-11-18 16:09:02 +00:00
Nick Craig-Wood
fbc4c4ad9a azureblob: remove 100MB upper limit on chunk_size as it is no longer needed 2021-11-18 16:09:02 +00:00
Nick Craig-Wood
4454b3e1ae azureblob: implement --azureblob-upload-concurrency parameter to speed uploads
See: https://forum.rclone.org/t/performance-of-rclone-vs-azcopy/27437
2021-11-18 16:08:57 +00:00
Nick Craig-Wood
e43b5ce5e5 Remove github.com/pkg/errors and replace with std library version
This is possible now that we no longer support go1.12 and brings
rclone into line with standard practices in the Go world.

This also removes errors.New and errors.Errorf from lib/errors and
prefers the stdlib errors package over lib/errors.
2021-11-07 11:53:30 +00:00
albertony
e2f47ecdeb docs: punctuation cleanup
See #5538
2021-10-20 22:56:19 +02:00