metabase: Set storageID even if there is no object in metabase #1008

Merged
dstepanov-yadro merged 1 commit from dstepanov-yadro/frostfs-node:fix/flush_upgrade_storage_id into master 2024-02-28 13:53:42 +00:00

There may be a race condition between putting an object and flushing the writecache:

  1. Shard puts the object to the writecache.
  2. Writecache flushes the object to the blobstore, but can't set the storageID because the object does not exist in the metabase yet.
  3. Shard puts the object to the metabase and sets the writecache's storageID.
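
The three steps above can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: `metabase`, `flushUpdate`, `put`, and `raceOnMaster` are hypothetical names, the metabase is reduced to an in-memory map, and the strings `"writecache"` and `"blobovnicza/0/1"` are placeholder storageID values, not the real encoding.

```go
package main

import "fmt"

// metabase is a hypothetical in-memory stand-in: object address -> storageID.
type metabase map[string]string

// flushUpdate models the pre-fix flush path: the storageID is updated only
// if the object is already present in the metabase. It reports whether the
// update was applied.
func (m metabase) flushUpdate(addr, id string) bool {
	if _, ok := m[addr]; !ok {
		return false
	}
	m[addr] = id
	return true
}

// put models the pre-fix put path: the storageID passed by the caller is
// written unconditionally.
func (m metabase) put(addr, id string) {
	m[addr] = id
}

// raceOnMaster replays steps 1-3 in the problematic order and returns the
// storageID that is left in the metabase.
func raceOnMaster() string {
	m := metabase{}
	const addr = "container/object" // placeholder address

	// Step 1: the object lands in the writecache (not modelled here).
	// Step 2: flush moves the object to the blobstore; the storageID update
	// is skipped because the metabase has no record of the object yet.
	m.flushUpdate(addr, "blobovnicza/0/1")
	// Step 3: the delayed metabase put records the writecache as the location.
	m.put(addr, "writecache")

	return m[addr]
}

func main() {
	fmt.Println(raceOnMaster()) // prints "writecache"
}
```

The model ends with the metabase pointing at the writecache although the object has already been flushed out of it, which is the stale state described above.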

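To reproduce, insert time.Sleep(10 * time.Second) here (https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/abea258b657cb5208afae58b08f79e9385fe1825/pkg/local_object_storage/shard/put.go#L87):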

res, err := s.metaBase.Put(ctx, pPrm)

In logs:

2024-02-26T16:50:45.005+0300	debug	get/get.go:89	serving request...	{"component": "Object.Get service", "request": "GET", "address": "3PoAs8FJYD23QKyTwyZ8E4bRZn8ht3MUQaB6q3gd13hQ/85cDGCCcKYsy6KcXUuF5g7TUgJ2bNRf1c5Zk5mqPf3ET", "raw": false, "local": false, "with session": false, "with bearer": false}

2024-02-26T16:50:45.005+0300	debug	shard/get.go:146	object is missing in write-cache	{"shard_id": "YVwpjfCnmM648NtVETYQ6D", "addr": "3PoAs8FJYD23QKyTwyZ8E4bRZn8ht3MUQaB6q3gd13hQ/85cDGCCcKYsy6KcXUuF5g7TUgJ2bNRf1c5Zk5mqPf3ET", "skip_meta": false, "trace_id": ""}

2024-02-26T16:50:45.005+0300	warn	shard/get.go:137	fetching object without meta	{"shard_id": "YVwpjfCnmM648NtVETYQ6D", "addr": "3PoAs8FJYD23QKyTwyZ8E4bRZn8ht3MUQaB6q3gd13hQ/85cDGCCcKYsy6KcXUuF5g7TUgJ2bNRf1c5Zk5mqPf3ET"}
2024-02-26T16:50:45.005+0300	debug	shard/get.go:146	object is missing in write-cache	{"shard_id": "YVwpjfCnmM648NtVETYQ6D", "addr": "3PoAs8FJYD23QKyTwyZ8E4bRZn8ht3MUQaB6q3gd13hQ/85cDGCCcKYsy6KcXUuF5g7TUgJ2bNRf1c5Zk5mqPf3ET", "skip_meta": true, "trace_id": ""}

2024-02-26T16:50:45.006+0300	warn	engine/get.go:106	meta info was present, but the object is missing	{"shard_id": "YVwpjfCnmM648NtVETYQ6D", "error": "status: code = 2049 message = object not found", "address": "3PoAs8FJYD23QKyTwyZ8E4bRZn8ht3MUQaB6q3gd13hQ/85cDGCCcKYsy6KcXUuF5g7TUgJ2bNRf1c5Zk5mqPf3ET", "trace_id": ""}

2024-02-26T16:50:45.007+0300	debug	get/get.go:101	operation finished successfully	{"component": "Object.Get service", "request": "GET", "address": "3PoAs8FJYD23QKyTwyZ8E4bRZn8ht3MUQaB6q3gd13hQ/85cDGCCcKYsy6KcXUuF5g7TUgJ2bNRf1c5Zk5mqPf3ET", "raw": false, "local": false, "with session": false, "with bearer": false}
dstepanov-yadro force-pushed fix/flush_upgrade_storage_id from 222fbbe100 to 0eff54cdbb 2024-02-26 13:36:54 +00:00 Compare
dstepanov-yadro force-pushed fix/flush_upgrade_storage_id from 0eff54cdbb to d5f2ab2ea5 2024-02-26 14:10:48 +00:00 Compare
dstepanov-yadro force-pushed fix/flush_upgrade_storage_id from d5f2ab2ea5 to e926f061ec 2024-02-26 14:11:51 +00:00 Compare
dstepanov-yadro changed title from WIP: metabase: Do not update storageID on put to metabase: Set storageID even if there is no object in metabase 2024-02-26 14:43:50 +00:00
dstepanov-yadro requested review from storage-core-committers 2024-02-26 14:44:31 +00:00
dstepanov-yadro requested review from storage-core-developers 2024-02-26 14:44:31 +00:00
fyrchik reviewed 2024-02-26 16:07:05 +00:00
@@ -242,3 +239,1 @@
-val: id,
-})
-if err != nil {
+if err = setStorageID(tx, objectCore.AddressOf(obj), id, true); err != nil {
Owner

Correct me if I am wrong:
Now we will overwrite this ID, so if flush happened before PUT, the stored id will be invalid and won't change because the object will be removed from the writecache?

Author
Member

No. Flush doesn't set the storageID if there is no object in the metabase (exists = false).

		exists, err := db.exists(tx, prm.addr, currEpoch)
		if err == nil && exists {
			err = updateStorageID(tx, prm.addr, prm.id)
		} else if errors.As(err, new(logicerr.Logical)) {
			err = updateStorageID(tx, prm.addr, prm.id)
		}
Owner

Then I don't understand, what is the problem now (on master)?
If PUT happened before the UpdateStorageID() everything is ok.
If PUT happened after the UpdateStorageID() everything is ok too, as you described in the comment.

Author
Member

> If PUT happened after the UpdateStorageID() everything is ok too, as you described in the comment.

No.
What I described was:

...
3. Shard puts object to the metabase, set writecache's storageID

This means that after Shard.Put the metabase has storageID = writecache, but the object is no longer in the writecache (it has already been flushed). The correct storageID is the blobovnicza path, not the writecache.

@@ -494,2 +489,2 @@
-return bkt.Put(objectKey(addr.Object(), key), id)
+key = objectKey(addr.Object(), key)
+if !insertOnly || (bkt.Get(key) == nil && insertOnly) {
Owner

&& insertOnly is not needed, as it is always true in this branch

Author
Member

done

fyrchik marked this conversation as resolved
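
The simplified condition from this thread can be illustrated in isolation. The sketch below is an assumption-laden reduction, not the code from the PR: `bucket` is a hypothetical map-based stand-in for a bbolt bucket, and this `setStorageID` takes the bucket and key directly instead of a transaction and an address.

```go
package main

import "fmt"

// bucket is a hypothetical in-memory stand-in for a bbolt bucket.
// A missing key yields a nil value, as bbolt's Bucket.Get does.
type bucket map[string][]byte

// setStorageID writes id under key. With insertOnly set, an existing value
// is preserved; otherwise the value is always overwritten.
func setStorageID(b bucket, key string, id []byte, insertOnly bool) {
	// "&& insertOnly" is redundant here: the right-hand side is evaluated
	// only when insertOnly is already true.
	if !insertOnly || b[key] == nil {
		b[key] = id
	}
}

func main() {
	b := bucket{}

	// An unconditional write (insertOnly = false) stores the value.
	setStorageID(b, "obj", []byte("blobovnicza/0/1"), false)
	// An insert-only write does not replace the value that is already there.
	setStorageID(b, "obj", []byte("writecache"), true)
	fmt.Println(string(b["obj"])) // prints "blobovnicza/0/1"

	// An insert-only write for a new key stores the value.
	setStorageID(b, "other", []byte("writecache"), true)
	fmt.Println(string(b["other"])) // prints "writecache"
}
```

The storageID values are placeholders chosen for readability.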
dstepanov-yadro force-pushed fix/flush_upgrade_storage_id from e926f061ec to a91ad86e35 2024-02-27 06:08:44 +00:00 Compare
aarifullin reviewed 2024-02-27 11:55:52 +00:00
@@ -495,1 +489,3 @@
-return bkt.Put(objectKey(addr.Object(), key), id)
+key = objectKey(addr.Object(), key)
+if !insertOnly || bkt.Get(key) == nil {
+return bkt.Put(key, id)
Member

Just out of curiosity: if a shard inserts the storageID for an object and, at the same time, an updated Blobstor causes the storageID for the same object to be updated, can we expect an error like "g2 cannot Put because g1 is Getting"?

Author
Member

Only one write tx is allowed at a time, so the order can only be put->update or update->put.

Owner

You can see the mechanics of it inside bbolt Batch function -- we amortize only fsyncs, the transactions themselves are executed sequentially.
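
A rough model of the point made in this thread: if write transactions are serialized, the two writers can only interleave as put->update or update->put, and with an insert-only put both orders end in the same state. The sketch uses a `sync.Mutex` in place of bbolt's single-writer rule; `metabase`, `putTx`, `flushTx`, and `runConcurrently` are hypothetical names, and the behaviour of the two transactions is my reading of the discussion, not code from the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// metabase is a hypothetical stand-in; mu models the rule that only one
// write transaction runs at a time.
type metabase struct {
	mu  sync.Mutex
	ids map[string]string
}

// update runs fn as an exclusive "write transaction".
func (m *metabase) update(fn func(ids map[string]string)) {
	m.mu.Lock()
	defer m.mu.Unlock()
	fn(m.ids)
}

// putTx records the writecache as the location only if no storageID is set
// yet (insert-only behaviour, assumed from the discussion).
func putTx(addr string) func(map[string]string) {
	return func(ids map[string]string) {
		if _, ok := ids[addr]; !ok {
			ids[addr] = "writecache"
		}
	}
}

// flushTx sets the storageID unconditionally, even if the object has not
// been put to the metabase yet.
func flushTx(addr, id string) func(map[string]string) {
	return func(ids map[string]string) { ids[addr] = id }
}

// runConcurrently starts both transactions in goroutines and returns the
// storageID that remains once both have finished.
func runConcurrently(addr string) string {
	m := &metabase{ids: map[string]string{}}
	txs := []func(map[string]string){putTx(addr), flushTx(addr, "blobovnicza/0/1")}

	var wg sync.WaitGroup
	for _, tx := range txs {
		wg.Add(1)
		go func(tx func(map[string]string)) {
			defer wg.Done()
			m.update(tx)
		}(tx)
	}
	wg.Wait()
	return m.ids[addr]
}

func main() {
	// The execution order is not fixed, but the result is the same either way.
	fmt.Println(runConcurrently("container/object")) // prints "blobovnicza/0/1"
}
```

In the put->update order the flush overwrites the writecache marker; in the update->put order the insert-only put leaves the flushed location untouched.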

aarifullin approved these changes 2024-02-27 13:15:42 +00:00
dstepanov-yadro force-pushed fix/flush_upgrade_storage_id from a91ad86e35 to 6c3f9ec8a2 2024-02-28 08:01:07 +00:00 Compare
dstepanov-yadro force-pushed fix/flush_upgrade_storage_id from 6c3f9ec8a2 to 918613546f 2024-02-28 08:02:05 +00:00 Compare
fyrchik approved these changes 2024-02-28 08:58:32 +00:00
dstepanov-yadro requested review from aarifullin 2024-02-28 09:56:19 +00:00
aarifullin approved these changes 2024-02-28 11:52:34 +00:00
dstepanov-yadro merged commit 918613546f into master 2024-02-28 13:53:42 +00:00
dstepanov-yadro deleted branch fix/flush_upgrade_storage_id 2024-02-28 13:53:42 +00:00
Reference: TrueCloudLab/frostfs-node#1008