node: Implement Lock\Delete requests for EC object #1147

Merged
fyrchik merged 10 commits from acid-ant/frostfs-node:feature/ec-del-lock into master 2024-05-30 08:13:07 +00:00
Member

Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com

Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
acid-ant force-pushed feature/ec-del-lock from 8be17871f4 to fc92eb7b5f 2024-05-27 19:17:33 +00:00 Compare
acid-ant requested review from storage-core-committers 2024-05-27 19:55:42 +00:00
acid-ant requested review from storage-core-developers 2024-05-27 19:55:53 +00:00
Owner

Sanity tests pass?

Sanity tests pass?
Author
Member

Sanity tests pass?

Yes.

> Sanity tests pass? Yes.
fyrchik approved these changes 2024-05-28 08:37:09 +00:00
@ -362,3 +362,1 @@
default:
commonCmd.ExitOnErr(cmd, "failed to get raw object header: %w", err)
case err == nil:
if err == nil {
Owner

Why have you replaced switch with else if chain?

Why have you replaced switch with `else if` chain?
Author
Member

Because previous implementation was only for Split Info. Thought it should be more readable. Reverted switch back.

Because previous implementation was only for `Split Info`. Thought it should be more readable. Reverted switch back.
Owner

IMO it is exactly the opposite -- else if is ok once, switch is less verbose for multiple branches.

IMO it is exactly the opposite -- `else if` is ok once, switch is less verbose for multiple branches.
fyrchik marked this conversation as resolved
@ -369,0 +383,4 @@
if err != nil {
commonCmd.ExitOnErr(cmd, "failed to create object id: %w", err)
}
chain = append(chain, objID)
Owner

So we add each chunk to the tombstone/lock? It is a problem, because chunks may be missing (with split it cannot be the case, it means DL, with EC it is ok).

So we add each chunk to the tombstone/lock? It is a problem, because chunks may be missing (with split it cannot be the case, it means DL, with EC it is ok).
Author
Member

Oh, thanks, that was from previous implementation, removed.

Oh, thanks, that was from previous implementation, removed.
Owner

Does the new implementation still pass sanity tests?

Does the new implementation still pass sanity tests?
Author
Member

Execute each time when changed sensitive part of the code.

Execute each time when changed sensitive part of the code.
fyrchik marked this conversation as resolved
@ -174,0 +201,4 @@
zap.Stringer("addr", addr),
zap.String("err", err.Error()),
zap.String("trace_id", tracingPkg.GetTraceID(ctx)))
continue
Owner

There is an error which we have ignored. What happens with this yet-to-be-removed chunk?

There is an error which we have ignored. What happens with this yet-to-be-removed chunk?
Author
Member

It will be removed by remover at next iteration. The behavior is the same as for complex object, see deleteChildren().

It will be removed by `remover` at next iteration. The behavior is the same as for complex object, see [deleteChildren()](https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/3627b44e92395d2be7eeda9790513021b9f345ca/pkg/local_object_storage/engine/delete.go#L136).
fyrchik marked this conversation as resolved
@ -63,3 +64,3 @@
b.ResetTimer()
for i := 0; i < b.N; i++ {
ok, err := e.exists(context.Background(), addr)
ok, _, err := e.exists(context.Background(), addr, oid.Address{})
Owner

The interface is confusing, we have 2 identical types with different meaning.
What about accepting shard.ExistsPrm?

The interface is confusing, we have 2 identical types with different meaning. What about accepting `shard.ExistsPrm`?
Author
Member

In this case we need to make fields of ExistsPrm public, are you ok?

In this case we need to make fields of `ExistsPrm` public, are you ok?
Author
Member

Implemented in a separate commit.

Implemented in a separate commit.
@ -223,0 +235,4 @@
e.iterateOverUnsortedShards(func(h hashedShard) (stop bool) {
ld, err := h.Shard.GetLocked(ctx, addr)
if err != nil {
e.reportShardError(h, "can't get object's lockers", err, zap.Stringer("addr", addr),
Owner

It is a log message, should be a const. Also, why didn't the linter fail? cc @achuprov

It is a log message, should be a const. Also, why didn't the linter fail? cc @achuprov
Author
Member

Thanks, updated.

Thanks, updated.
fyrchik marked this conversation as resolved
@ -88,0 +96,4 @@
return err
}
for _, locker := range lockers {
err = e.lock(ctx, addr.Container(), locker, []oid.ID{addr.Object()})
Owner

Do we lock an object before we have put it? It seems like a problem, because this lock record can persist indefinitely.

Do we lock an object before we have put it? It seems like a problem, because this lock record can persist indefinitely.
Author
Member

Didn't catch the problem. Here we are persisting lock for a chunk before put, because we need to avoid gc removing it. This is reconstruction scenario - when we need to put chunk on the node. If there is no lock for a chunk, gc will inhume it.

Didn't catch the problem. Here we are persisting lock for a chunk before put, because we need to avoid `gc` removing it. This is reconstruction scenario - when we need to put chunk on the node. If there is no lock for a chunk, `gc` will inhume it.
Owner

The problem is atomicity -- lock -> CRASH -> put and we now have some garbage about locks which will (?) be removed eventually.
We could do it atomically in put instead, this would also ensure we put info on the same shard.

The problem is atomicity -- `lock -> CRASH -> put` and we now have some garbage about locks which will (?) be removed eventually. We could do it atomically in `put` instead, this would also ensure we put info on the same shard.
Author
Member

As a result of discussion, we need to move gc on a storage engine level. Created #1151 for tracking.

As a result of discussion, we need to move `gc` on a storage engine level. Created https://git.frostfs.info/TrueCloudLab/frostfs-node/issues/1151 for tracking.
@ -477,0 +498,4 @@
offset := 0
for offset < len(val) {
if bytes.Equal(objKey, val[offset:offset+objectKeySize]) {
val = append(val[:offset], val[offset+objectKeySize:]...)
Owner

val is received from getFromBucket. Is it taken from bbolt or freshly allocated? Bbolt prohibits changing values in some cases.

`val` is received from `getFromBucket`. Is it taken from bbolt or freshly allocated? Bbolt prohibits changing values in some cases.
Author
Member

According to doc, val should be valid for the life of the transaction. Let's clone it.

According to doc, `val` should be valid for the life of the transaction. Let's clone it.
Owner

This line is more important // The returned memory is owned by bbolt and must never be modified; writing to this memory might corrupt the database.

This line is more important `// The returned memory is owned by bbolt and must never be modified; writing to this memory might corrupt the database.`
Author
Member

This line is from the newest version. Looks like we need to update bbolt.

This line is from the newest version. Looks like we need to update `bbolt`.
fyrchik marked this conversation as resolved
@ -526,3 +526,3 @@
log.Debug(logs.ShardHandlingExpiredTombstonesBatch, zap.Int("number", len(tssExp)))
s.expiredTombstonesCallback(ctx, tssExp)
if len(tssExp) > 0 {
Owner

To be clear: is this an optimization or a functional change?

To be clear: is this an optimization or a functional change?
Author
Member

It is an optimization - here we do nothing but getting lock on metabase, because call db.boltDB.Update(...).

It is an optimization - here we do nothing but getting lock on metabase, because call `db.boltDB.Update(...)`.
fyrchik marked this conversation as resolved
Owner

Approved accidentally, please disregard

Approved accidentally, please disregard
Author
Member

Fixed gopls-run. Resolved issue from gopls.

Fixed `gopls-run`. Resolved issue from `gopls`.
fyrchik reviewed 2024-05-28 10:55:08 +00:00
Makefile Outdated
@ -221,3 +224,3 @@
make gopls-install; \
fi
@if [[ $$(find . -type f -name "*.go" -print | xargs $(GOPLS_VERSION_DIR)/gopls check | tee /dev/tty | wc -l) -ne 0 ]]; then \
$(GOPLS_VERSION_DIR)/gopls check $(SOURCES) 2>&1 >$(GOPLS_TEMP_FILE)
Owner

Was there any problem with the previous implementation (pipe instead of temp file)?

Was there any problem with the previous implementation (pipe instead of temp file)?
Author
Member

We are unable to use tee /dev/tty for security reason. If replace it for ... check 2>&1 | tee | wc -l there are no output for error.

We are unable to use `tee /dev/tty` for security reason. If replace it for `... check 2>&1 | tee | wc -l` there are no output for error.
fyrchik marked this conversation as resolved
acid-ant force-pushed feature/ec-del-lock from 998d6a86d7 to 2ea7b6331d 2024-05-28 12:53:15 +00:00 Compare
acid-ant force-pushed feature/ec-del-lock from 2ea7b6331d to 13f6770f25 2024-05-28 12:54:48 +00:00 Compare
acid-ant force-pushed feature/ec-del-lock from 13f6770f25 to 9ba4a97276 2024-05-28 13:46:44 +00:00 Compare
aarifullin approved these changes 2024-05-29 11:19:29 +00:00
acid-ant force-pushed feature/ec-del-lock from 9ba4a97276 to 2a4f637861 2024-05-29 14:11:36 +00:00 Compare
acid-ant added 1 commit 2024-05-30 06:26:35 +00:00
[#1147] node: Use public fields for shard.ExistsPrm
All checks were successful
Vulncheck / Vulncheck (pull_request) Successful in 4m54s
DCO action / DCO (pull_request) Successful in 5m6s
Build / Build Components (1.21) (pull_request) Successful in 5m25s
Build / Build Components (1.22) (pull_request) Successful in 5m33s
Tests and linters / gopls check (pull_request) Successful in 5m50s
Tests and linters / Staticcheck (pull_request) Successful in 6m59s
Tests and linters / Lint (pull_request) Successful in 8m26s
Pre-commit hooks / Pre-commit (pull_request) Successful in 9m38s
Tests and linters / Tests with -race (pull_request) Successful in 10m49s
Tests and linters / Tests (1.21) (pull_request) Successful in 11m7s
Tests and linters / Tests (1.22) (pull_request) Successful in 11m37s
f1f267a9a5
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
fyrchik approved these changes 2024-05-30 08:12:59 +00:00
fyrchik merged commit 92e19feb57 into master 2024-05-30 08:13:07 +00:00
fyrchik referenced this pull request from a commit 2024-05-30 08:13:09 +00:00
fyrchik referenced this pull request from a commit 2024-05-30 08:13:10 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1147
No description provided.