New epoch event affects stream of object.Delete operations #1433

Closed
opened 2024-10-16 16:08:31 +00:00 by alexvanin · 1 comment
Owner

Expected Behavior

Consecutive object.Delete operations are not affected by a new epoch event.

Current Behavior

Some object.Delete operations fail with `remove object via client localhost:8080: delete object on client: status: code = 1024 message = incomplete object PUT by placement: could not write header: (writer.LocalTarget) could not put object to local storage: could not put object to any shard`

Investigation

The issue was found in an AIO environment with a single storage node. Node behaviour in a dev-env environment with various container placement policies may differ.

This is still a hypothesis, so read it with a grain of salt.

The issue appears when a new epoch triggers the collectExpiredTombstones routine. This routine may execute between these two operations in the local target writer:

  1. Inhume the object being removed by calling storage.Delete()
  2. Store the tombstone object (a rough sketch of this sequence is shown below)
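
Roughly, the write path looks like this (illustrative sketch only; the types and method names below are simplified stand-ins for the real writer and storage engine, not frostfs-node code):

```
package writer

import (
	"context"
	"fmt"
)

// localStorage stands in for the storage engine used by the local target.
type localStorage interface {
	// Delete inhumes the members: it writes graveyard records that reference the tombstone ID.
	Delete(ctx context.Context, tombstoneID string, members []string) error
	// Put stores a raw object locally.
	Put(ctx context.Context, obj []byte) error
}

type localTarget struct {
	storage localStorage
}

func (t *localTarget) writeTombstone(ctx context.Context, tombID string, rawTomb []byte, members []string) error {
	// (1) Inhume the target objects: graveyard records now reference tombID.
	if err := t.storage.Delete(ctx, tombID, members); err != nil {
		return fmt.Errorf("could not delete objects: %w", err)
	}
	// A new-epoch GC cycle running at this point reads the graveyard, fails to
	// find the tombstone in the cache or the network, and GC-marks tombID.
	// (2) Store the tombstone object itself; Put then fails because the
	// metabase Exists check rejects a GC-marked ID.
	if err := t.storage.Put(ctx, rawTomb); err != nil {
		return fmt.Errorf("could not put tombstone to local storage: %w", err)
	}
	return nil
}
```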

Between (1) and (2), GC may kick in and collect all available records from the graveyard bucket in the metabase. After that, GC checks whether each object should remain or be marked for removal with a GC mark.

To make this decision, GC searches for object expiration data in the cache and in the network. In this case, when (1) has happened and (2) has not, both checks fail, and GC decides to mark the tombstone ID with a GC mark.
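
In a simplified form, the availability check behaves like this (hypothetical names and signatures, not the real checker API):

```
package gcsketch

// tombstoneAvailable reports whether a tombstone can still be resolved. If
// neither the cache nor the network can produce it, the caller GC-marks the
// object referenced by the graveyard record.
func tombstoneAvailable(
	lookupCache func(addr string) (expEpoch uint64, ok bool),
	lookupNetwork func(addr string) (expEpoch uint64, err error),
	addr string, currentEpoch uint64,
) bool {
	if exp, ok := lookupCache(addr); ok {
		return exp >= currentEpoch // cached tombstone is still alive
	}
	exp, err := lookupNetwork(addr)
	if err != nil {
		// In the race window the tombstone has not been stored yet, so both
		// lookups fail and the tombstone ID gets a GC mark.
		return false
	}
	return exp >= currentEpoch
}
```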

Then, during step (2), the storage engine calls the Exists method of the metabase. However, the tombstone ID is already marked with a GC mark, the objectStatus check returns an error, and the put operation eventually fails.
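
A simplified view of that status check (the constants and error below are illustrative, not the actual metabase code):

```
package metabasesketch

import "errors"

const (
	statusAvailable  = iota
	statusGCMarked   // object ID found in the garbage bucket
	statusTombstoned // object covered by a stored tombstone
)

var errAlreadyRemoved = errors.New("object already removed")

// existsForPut mimics the decision made before storing an object: a GC-marked
// ID is rejected, which is what makes the tombstone Put in step (2) fail.
func existsForPut(status int) (bool, error) {
	switch status {
	case statusAvailable:
		return true, nil
	case statusGCMarked, statusTombstoned:
		return false, errAlreadyRemoved
	default:
		return false, nil
	}
}
```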

(attached image)

Possible Solution

For this exact issue the solution may be quite simple: swap the order of operations in the local target writer:

  1. store the tombstone first,
  2. inhume the object later.

Maybe there is a reason to keep the current order, but this change does not break any tests and fixes the issue.
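
Continuing the hypothetical writer sketch from the Investigation section, the proposed order would look like this:

```
// Same hypothetical types as in the sketch above, with the two steps swapped:
// the tombstone is stored before any graveyard record references it.
func (t *localTarget) writeTombstoneReordered(ctx context.Context, tombID string, rawTomb []byte, members []string) error {
	// (1) Store the tombstone object first.
	if err := t.storage.Put(ctx, rawTomb); err != nil {
		return fmt.Errorf("could not put tombstone to local storage: %w", err)
	}
	// (2) Inhume the target objects; a GC cycle triggered in between can now
	// resolve the tombstone and has no reason to GC-mark its ID.
	if err := t.storage.Delete(ctx, tombID, members); err != nil {
		return fmt.Errorf("could not delete objects: %w", err)
	}
	return nil
}
```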

Steps to Reproduce (for bugs)

  1. Start frostfs-aio
  2. Begin a stream of object.Delete operations
  3. Tick new epoch during this stream

I wrote a simple main.go.txt to create this load, but one can use k6 for the same purpose.

```
$ ./gcrace --wif KzPXA6669m2pf18XmUdoR8MnP1pi1PMmefiFujStVFnv7WR5SRmK
2024/10/16 18:01:14 INFO container has been created id=9yUfvtvh8ytyWktvet8yhNsdtXe3Tkh3fcBR5MZikVtd
2024/10/16 18:01:24 INFO objects has been uploaded amount=1000
2024/10/16 18:01:24 INFO now start to tick epochs
2024/10/16 18:01:27 WARN delete object error id=hkZePXKTbjsWpZeo7v9Qvm3oNZisvrKJj1t1ZouvGUE error="remove object via client localhost:8080: delete object on client: status: code = 1024 message = incomplete object PUT by placement: could not write header: (writer.LocalTarget) could not put object to local storage: could not put object to any shard"
2024/10/16 18:01:27 WARN delete object error id=HJBGHW8DoMT8v8mKtx99ezgKuUsp1RsL5y9MYVq3HQcb error="remove object via client localhost:8080: delete object on client: status: code = 1024 message = incomplete object PUT by placement: could not write header: (writer.LocalTarget) could not put object to local storage: could not put object to any shard"
```
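
For reference, the shape of that load is roughly the following (the client interface here is hypothetical; the attached main.go.txt does the equivalent against a real frostfs-aio instance):

```
package load

import (
	"context"
	"log"
)

// cluster is a stand-in for whatever client is used to talk to frostfs-aio.
type cluster interface {
	DeleteObject(ctx context.Context, containerID, objectID string) error
	TickEpoch(ctx context.Context) error
}

// run streams object deletions while a new epoch is ticked in the background.
func run(ctx context.Context, c cluster, containerID string, objectIDs []string) {
	go func() {
		if err := c.TickEpoch(ctx); err != nil {
			log.Printf("tick epoch: %v", err)
		}
	}()

	for _, id := range objectIDs {
		if err := c.DeleteObject(ctx, containerID, id); err != nil {
			// With the race present, some deletions fail with
			// "incomplete object PUT by placement: ... could not put object to any shard".
			log.Printf("delete object error id=%s error=%q", id, err)
		}
	}
}
```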

Context

This issue was found during minio warp runs. At the end of the benchmark, warp removes plenty of objects, and these deletions sometimes fail.

```
$ ./warp client &
$ ./warp put --host=localhost:8084 --warp-client=localhost --access-key=GqwtfpuEAype3jQL8f3PZbroVZ5PyJgfZYpGHYKguEa80FFogcYoFmZN9aEmGHPYuYpogswvJWVCFXrEibAP951RB --secret-key=1a283733518fdc6d297d35607096b2459cef78731c329dc38ba374eae4b68b56 --obj.size=1KiB --concurrent=4 --duration=5m --disable-multipart --bucket=test-bucket
```

Regression

No

Your Environment

frostfs-aio v1.6.1
frostfs-node v0.42.15 / v0.44.0-rc.5-14-g90f36693 (master)

alexvanin added the
bug
triage
labels 2024-10-16 16:08:31 +00:00
fyrchik added the
frostfs-node
label 2024-10-16 19:38:15 +00:00
Owner

I have reproduced it on the shard level; here are some tricks to make it work, for posterity:

  1. We need to send an event to the notification channel and wait for some time to pass.
  2. We need to prevent the removeGarbage routine from running (otherwise the GC mark from the description will be removed, and thus the object could be put again).
  3. collectExpiredTombstones needs to be set to a shard method, and the tombstone source needs to return false (which may be the culprit -- this is exactly the case when we put the first tombstone copy).

The GC remover interval is 100 ms; we need either to increase it or to disable the removeGarbage() routine.
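
For illustration only, the stubs could look roughly like this (the real shard test APIs and option names may differ):

```
package shardtest

import (
	"context"
	"time"
)

// stubTombstoneSource always reports the tombstone as unavailable, mimicking
// the state where step (1) has happened and step (2) has not.
type stubTombstoneSource struct{}

func (stubTombstoneSource) TombstoneAvailable(_ context.Context, _ string, _ uint64) bool {
	return false
}

// A remover interval far above the default 100 ms effectively disables the
// removeGarbage routine for the duration of the test, keeping the GC mark in
// place so the failing tombstone Put can be observed.
const gcRemoverInterval = time.Hour
```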

fyrchik added this to the v0.44.0 milestone 2024-10-17 10:58:21 +00:00
Reference: TrueCloudLab/frostfs-node#1433