Get rid of IsTombstoneAvailable() #1434

Open
opened 2024-10-17 11:55:31 +00:00 by fyrchik · 0 comments
Owner

We provide tombstone source to the shard level to check whether we can remove tombstone mark.

if !s.tsSource.IsTombstoneAvailable(ctx, ts.Tombstone(), epoch) {


func (g *ExpirationChecker) IsTombstoneAvailable(ctx context.Context, a oid.Address, epoch uint64) bool {

This is wrong on so many levels:

  1. The possibility of making network queries from inside the shard.
  2. Decoupling the mark from its lifetime (#1433 is just a symptom, we may have other problems too, like node suddenly becoming isolated).
  3. The bool return value is not enough to make an informed decision, there are lot's of errors and tombstone being unavailable dosen't mean it doesn't exist.

The suggestion is to:

  1. Store expiration epoch together with the tombstone mark.
  2. When removing tombstones take only epoch number into account and compare it with the attached GC mark. I would argue this is right by itself, because epoch can't go back. We take them from blockchain, so we make decision base on fact, not on a single experiment outcome (like now).
  3. We completely remove the whole class of bugs (there are variants of #1433 with shard becoming unavailable, network becoming unavailable etc.)

Refs #1433

We provide tombstone source to the shard level to check whether we can remove tombstone mark. https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/90f36693995e1b411094686e4419bb7d11831f35/pkg/local_object_storage/shard/gc.go#L522 https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/90f36693995e1b411094686e4419bb7d11831f35/pkg/services/object_manager/tombstone/checker.go#L49 This is wrong on so many levels: 1. The possibility of making network queries from inside the shard. 2. Decoupling the mark from its lifetime (#1433 is just a symptom, we may have other problems too, like node suddenly becoming isolated). 3. The `bool` return value is not enough to make an informed decision, there are lot's of errors and tombstone being unavailable dosen't mean it doesn't exist. The suggestion is to: 1. Store expiration epoch together with the tombstone mark. 2. When removing tombstones take only epoch number into account and compare it with the attached GC mark. I would argue this is _right_ by itself, because epoch can't go back. We take them from blockchain, so we make decision base on _fact_, not on a single experiment outcome (like now). 3. We completely remove the whole class of bugs (there are variants of #1433 with shard becoming unavailable, network becoming unavailable etc.) Refs #1433
fyrchik added the
discussion
frostfs-node
labels 2024-10-17 11:55:31 +00:00
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1434
No description provided.