engine, policer: Properly process error already removed #813

Merged
fyrchik merged 2 commits from acid-ant/frostfs-node:bugfix/799-policer-skip-already-removed into master 2024-02-01 17:49:25 +00:00
Member

Close #799

Here is a scenario:

  1. Create container with policy REP 2 ...
  2. Put object in container - distribute it on node 2.
  3. policer on node 2 choose node 3 to replicate object on it.
  4. Finish object put on node 3.
  5. Remove object from node 2 and node 3.
  6. At the same time, policer on node 2 trying to put object to node 3. Choose the first shard - got error already removed, put object on the next shard.
  7. That is why we are able to search an object in the root bucket in the metabase.

We need to skip nodes which already removed object in policer.
Also skip put on the shards when faced an error object already removed - that was initial issue.

Unfortunately, it is impossible to cover this line via unittest.

Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com

Close #799 Here is a scenario: 1. Create container with policy `REP 2 ...` 2. Put object in container - distribute it on node 2. 3. `policer` on node 2 choose node 3 to replicate object on it. 4. Finish object put on node 3. 5. Remove object from node 2 and node 3. 6. At the same time, `policer` on node 2 trying to put object to node 3. Choose the first shard - got error `already removed`, put object on the next shard. 7. That is why we are able to search an object in the root bucket in the metabase. We need to skip nodes which already removed object in policer. Also skip put on the shards when faced an error `object already removed` - that was initial issue. Unfortunately, it is impossible to cover this [line](https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/8d18fa159e085dde62336ec80f76fdfee358c412/pkg/local_object_storage/engine/put.go#L68) via unittest. Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
acid-ant changed title from [#799] policer: Skip nodes which already removed object to policer: Skip nodes which already removed object 2023-11-16 12:30:05 +00:00
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 1861bdd00a to 641070e159 2023-11-16 12:36:58 +00:00 Compare
acid-ant requested review from storage-core-committers 2023-11-16 13:02:06 +00:00
acid-ant requested review from storage-core-developers 2023-11-16 13:02:07 +00:00
dstepanov-yadro reviewed 2023-11-16 14:26:22 +00:00
@ -145,3 +135,1 @@
zap.String("error", err.Error()),
)
} else {
if err == nil || client.IsErrObjectAlreadyRemoved(err) {

If object is already removed on other node, so we can just skip it on current node, because it will be skiped on the next policer run.

If object is already removed on other node, so we can just skip it on current node, because it will be skiped on the next policer run.
Author
Member

Updated a bit, propagate error to the upper layer.

Updated a bit, propagate error to the upper layer.
dstepanov-yadro marked this conversation as resolved
fyrchik approved these changes 2023-11-17 11:58:22 +00:00
fyrchik left a comment
Owner

Am I right in describing the scenario?

  1. Node 1 sees AlreadyRemoved from 2, considers as "no copy"
  2. Node 1 replicates on node 2 with an error.
  3. Node 1 successfully replicates on node 3.

Because node 3 was offline at the moment of object removal, it doesn't has tombstone and replication successfully finishes.

In this case we still will replicate tombstone later, so the problem is only in speed?

Am I right in describing the scenario? 1. Node 1 sees AlreadyRemoved from 2, considers as "no copy" 2. Node 1 replicates on node 2 with an error. 3. Node 1 successfully replicates on node 3. Because node 3 was offline at the moment of object removal, it doesn't has tombstone and replication successfully finishes. In this case we still will replicate tombstone later, so the problem is only in speed?
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 641070e159 to e55ef24f5f 2023-11-17 12:46:11 +00:00 Compare
aarifullin approved these changes 2023-11-17 13:00:41 +00:00
acid-ant changed title from policer: Skip nodes which already removed object to WIP: policer: Skip nodes which already removed object 2023-11-17 13:28:56 +00:00
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 43112edc08 to dfad3ce06a 2023-11-21 13:46:37 +00:00 Compare
acid-ant changed title from WIP: policer: Skip nodes which already removed object to engine, policer: Properly process error already removed 2023-11-22 06:49:26 +00:00
acid-ant requested review from fyrchik 2023-11-22 06:49:29 +00:00
acid-ant requested review from aarifullin 2023-11-22 06:49:31 +00:00
acid-ant requested review from storage-core-committers 2023-11-22 06:49:39 +00:00
acid-ant requested review from storage-core-developers 2023-11-22 06:49:39 +00:00
Author
Member

Am I right in describing the scenario?
Added scenario to the description.

> Am I right in describing the scenario? Added scenario to the description.
fyrchik reviewed 2023-11-22 07:38:39 +00:00
@ -99,3 +100,1 @@
func (e *StorageEngine) putToShard(ctx context.Context, sh hashedShard, ind int, pool util.WorkerPool, addr oid.Address, obj *objectSDK.Object) (bool, bool) {
var putSuccess, alreadyExists bool
// Third return value is true iff object already removed. In this case forth value also not nil.
Owner

Can we introduce putToShardRes now? Return values are named here, but it is easy to misorder on the callsize.

Can we introduce `putToShardRes` now? Return values are named here, but it is easy to misorder on the callsize.
Author
Member

Yeah, thought about this too. If others do not mind about this, I'll add it.
@TrueCloudLab/storage-core-developers @TrueCloudLab/storage-core-committers

Yeah, thought about this too. If others do not mind about this, I'll add it. @TrueCloudLab/storage-core-developers @TrueCloudLab/storage-core-committers
@ -150,6 +152,15 @@ func (e *StorageEngine) putToShard(ctx context.Context, sh hashedShard, ind int,
zap.String("trace_id", tracingPkg.GetTraceID(ctx)))
return
}
if client.IsErrObjectAlreadyRemoved(err) {
Owner

What about a test? (2 shards, put -> delete -> put, the first shard should return AlreadyRemoved, the second should return NotFound)
As a sanity check, it should fail on master.

What about a test? (2 shards, put -> delete -> put, the first shard should return AlreadyRemoved, the second should return NotFound) As a sanity check, it should fail on master.
Author
Member

To reproduce this bug, we need three concurrent writes - two for object and one for tomb. For your proposal, both master and current branch will allow to put->delete->put and object will be on the first shard always.
It is hard to emulate concurrent usages of engine.
In the scope of this task we are not avoiding race in common, we just properly react for one of the possible error which my lead to duplication of object on shards. Because we are checking existence of the object before iteration over shards.
The same may happen when disk is full or blinking - we are not covering these errors and just go to the next shard.
Also, that may happen when policers from two nodes start putting object on third, because disk was removed and no more object on node

To reproduce this bug, we need three concurrent writes - two for object and one for tomb. For your proposal, both master and current branch will allow to `put->delete->put` and object will be on the first shard always. It is hard to emulate concurrent usages of `engine`. In the scope of this task we are not avoiding race in common, we just properly react for one of the possible error which my lead to duplication of object on shards. Because we are checking existence of the object before iteration over shards. The same may happen when disk is full or blinking - we are not covering these errors and just go to the next shard. Also, that may happen when `policers` from two nodes start putting object on third, because disk was removed and no more object on node
acid-ant force-pushed bugfix/799-policer-skip-already-removed from dfad3ce06a to ef74f8da89 2023-11-22 11:12:52 +00:00 Compare
dstepanov-yadro approved these changes 2023-11-23 08:08:36 +00:00
aarifullin approved these changes 2023-11-29 14:33:08 +00:00
acid-ant force-pushed bugfix/799-policer-skip-already-removed from ef74f8da89 to 11a63eb931 2023-12-05 08:39:50 +00:00 Compare
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 11a63eb931 to 28a53162af 2023-12-07 11:03:13 +00:00 Compare
acid-ant requested review from storage-core-committers 2023-12-07 11:42:34 +00:00
acid-ant requested review from storage-core-developers 2023-12-07 11:42:35 +00:00
dstepanov-yadro approved these changes 2023-12-07 13:03:23 +00:00
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 28a53162af to f63b23a382 2023-12-07 14:08:06 +00:00 Compare
fyrchik reviewed 2023-12-07 15:04:34 +00:00
@ -91,0 +102,4 @@
return errPutShard
case putToShardRemoved:
return shRes.err
default:
Owner

Having return nil in the default branch is dangerous IMO, because if we add and error and forget to add the case here, this is DL scenario (OK to client, don't save in reality).

Having `return nil` in the default branch is dangerous IMO, because if we add and error and forget to add the case here, this is DL scenario (OK to client, don't save in reality).
Author
Member

Agree, I've updated this switch.

Agree, I've updated this switch.
@ -91,0 +103,4 @@
case putToShardRemoved:
return shRes.err
default:
return nil
Owner

How about returning nil for explicit statuses enumeration?
It is easy to miss new here and return OK without object being put.

How about returning nil for explicit statuses enumeration? It is easy to miss new here and return OK without object being put.
Owner

old comment, didn't see

old comment, didn't see
fyrchik marked this conversation as resolved
@ -150,12 +165,21 @@ func (e *StorageEngine) putToShard(ctx context.Context, sh hashedShard, ind int,
zap.String("trace_id", tracingPkg.GetTraceID(ctx)))
return
}
if client.IsErrObjectAlreadyRemoved(err) {
Owner

If we remove an object with frostfs-cli control drop-objects, what kind of error will we receive?

If we remove an object with `frostfs-cli control drop-objects`, what kind of error will we receive?
Author
Member

frostfs-cli control drop-objects calls shard.Inhume(), so we will have here AlreadyRemoved.

`frostfs-cli control drop-objects` calls `shard.Inhume()`, so we will have here `AlreadyRemoved`.
@ -148,2 +138,4 @@
shortage--
checkedNodes.submitReplicaHolder(nodes[i])
} else {
if client.IsErrObjectAlreadyRemoved(err) {
Owner

Do we need the changes in policer even after we have changed logic in shard.put?

Do we need the changes in policer even after we have changed logic in shard.put?
Author
Member

I think yes. The idea here is to fail faster and do not process other nodes in queue.

I think yes. The idea here is to fail faster and do not process other nodes in queue.
acid-ant force-pushed bugfix/799-policer-skip-already-removed from f63b23a382 to aee3e4b14d 2023-12-10 15:00:45 +00:00 Compare
acid-ant force-pushed bugfix/799-policer-skip-already-removed from aee3e4b14d to 062152a9d7 2023-12-22 06:16:37 +00:00 Compare
acid-ant requested review from aarifullin 2023-12-22 06:17:15 +00:00
acid-ant requested review from dstepanov-yadro 2023-12-22 06:17:21 +00:00
dstepanov-yadro approved these changes 2023-12-22 07:44:48 +00:00
dstepanov-yadro reviewed 2023-12-22 17:15:12 +00:00
@ -99,3 +115,1 @@
func (e *StorageEngine) putToShard(ctx context.Context, sh hashedShard, ind int, pool util.WorkerPool, addr oid.Address, obj *objectSDK.Object) (bool, bool) {
var putSuccess, alreadyExists bool
// Third return value is true iff object already removed. In this case forth value also not nil.

Comment is not actual

Comment is not actual
Author
Member

Updated, thanks.

Updated, thanks.
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 062152a9d7 to 3a3f79c12d 2023-12-24 14:05:58 +00:00 Compare
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 3a3f79c12d to 50e916a41d 2023-12-24 14:31:19 +00:00 Compare
Author
Member

Need to think about bellow scenario:

  • Put Object on node 2,3.
  • Policer on node 2 unable to connect with node 3, put object on node 1
  • Policer on node 1 able to connect with node 3, check object on node 3 and remove from node 1
    Looks like we need to continue process nodes in policer, because inside policer we aren't able to check by whom object removed.
Need to think about bellow scenario: - Put Object on node 2,3. - Policer on node 2 unable to connect with node 3, put object on node 1 - Policer on node 1 able to connect with node 3, check object on node 3 and remove from node 1 Looks like we need to continue process nodes in policer, because inside policer we aren't able to check by whom object removed.
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 50e916a41d to ac24c3f1ac 2024-01-10 12:18:28 +00:00 Compare
Author
Member

Updated commit into policer - now it contains only refactoring for method processNodes.

Updated commit into `policer` - now it contains only refactoring for method `processNodes`.
acid-ant force-pushed bugfix/799-policer-skip-already-removed from ac24c3f1ac to a61f70c33b 2024-01-19 08:19:25 +00:00 Compare
aarifullin approved these changes 2024-01-19 08:49:09 +00:00
acid-ant force-pushed bugfix/799-policer-skip-already-removed from a61f70c33b to 3686122b0d 2024-01-25 08:33:02 +00:00 Compare
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 3686122b0d to 716e831f55 2024-01-26 09:52:52 +00:00 Compare
fyrchik reviewed 2024-01-26 10:16:44 +00:00
@ -132,21 +132,21 @@ func (p *Policer) processNodes(ctx context.Context, requirements *placementRequi
cancel()
if client.IsErrObjectNotFound(err) {
Owner

What do we achiev with this refactoring?

What do we achiev with this refactoring?
Author
Member

Had a decision to keep this refactoring because err will be nil more often, so it is not necessary to check for the type of the error at first. In normal situation, node holds replicas.

Had a decision to keep this refactoring because `err` will be `nil` more often, so it is not necessary to check for the type of the error at first. In normal situation, node holds replicas.
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 716e831f55 to e73af83b6e 2024-01-29 08:58:24 +00:00 Compare
acid-ant force-pushed bugfix/799-policer-skip-already-removed from e73af83b6e to 4ce6ad711b 2024-01-30 07:14:32 +00:00 Compare
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 4ce6ad711b to 6d39528d13 2024-01-31 08:36:57 +00:00 Compare
acid-ant force-pushed bugfix/799-policer-skip-already-removed from 6d39528d13 to db247442dd 2024-01-31 11:10:15 +00:00 Compare
fyrchik approved these changes 2024-02-01 17:49:17 +00:00
fyrchik merged commit d0eadf7ea2 into master 2024-02-01 17:49:25 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
4 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#813
No description provided.