node: Implement Lock\Delete
requests for EC object #1147
Reference: TrueCloudLab/frostfs-node#1147
No description provided.
Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com
8be17871f4 to fc92eb7b5f
Sanity tests pass?
Yes.
@ -362,3 +362,1 @@
default:
commonCmd.ExitOnErr(cmd, "failed to get raw object header: %w", err)
case err == nil:
if err == nil {
Why have you replaced the switch with an else-if chain?

Because the previous implementation was only for SplitInfo, I thought it would be more readable. Reverted the switch back.

IMO it is exactly the opposite: else-if is fine once, but a switch is less verbose for multiple branches.

@ -369,0 +383,4 @@
if err != nil {
commonCmd.ExitOnErr(cmd, "failed to create object id: %w", err)
}
chain = append(chain, objID)
So we add each chunk to the tombstone/lock? That is a problem, because chunks may be missing (with split it cannot be the case, as it would mean DL; with EC it is OK).
Oh, thanks, that was from previous implementation, removed.
Does the new implementation still pass sanity tests?
Yes, they are executed each time a sensitive part of the code is changed.
@ -174,0 +201,4 @@
zap.Stringer("addr", addr),
zap.String("err", err.Error()),
zap.String("trace_id", tracingPkg.GetTraceID(ctx)))
continue
There is an error here that we ignore. What happens to this yet-to-be-removed chunk?
It will be removed by the remover at the next iteration. The behavior is the same as for a complex object, see deleteChildren().

@ -63,3 +64,3 @@
b.ResetTimer()
for i := 0; i < b.N; i++ {
ok, err := e.exists(context.Background(), addr)
ok, _, err := e.exists(context.Background(), addr, oid.Address{})
The interface is confusing: we have two parameters of the same type with different meanings. What about accepting shard.ExistsPrm?

In this case we need to make the fields of ExistsPrm public, are you OK with that?

Implemented in a separate commit.
@ -223,0 +235,4 @@
e.iterateOverUnsortedShards(func(h hashedShard) (stop bool) {
ld, err := h.Shard.GetLocked(ctx, addr)
if err != nil {
e.reportShardError(h, "can't get object's lockers", err, zap.Stringer("addr", addr),
This is a log message, so it should be a const. Also, why didn't the linter fail? cc @achuprov
Thanks, updated.
@ -88,0 +96,4 @@
return err
}
for _, locker := range lockers {
err = e.lock(ctx, addr.Container(), locker, []oid.ID{addr.Object()})
Do we lock an object before we have put it? It seems like a problem, because this lock record can persist indefinitely.
I didn't catch the problem. Here we persist the lock for a chunk before the put, because we need to prevent gc from removing it. This is the reconstruction scenario, when we need to put a chunk on the node: if there is no lock for a chunk, gc will inhume it.

The problem is atomicity: lock -> CRASH -> put, and now we have some garbage lock records which will (?) be removed eventually. We could do it atomically in put instead; this would also ensure we put the info on the same shard.

As a result of the discussion, we need to move gc to the storage engine level. Created #1151 for tracking.

@ -477,0 +498,4 @@
offset := 0
for offset < len(val) {
if bytes.Equal(objKey, val[offset:offset+objectKeySize]) {
val = append(val[:offset], val[offset+objectKeySize:]...)
val is received from getFromBucket. Is it taken from bbolt or freshly allocated? Bbolt prohibits changing values in some cases.

According to the doc, val should be valid for the life of the transaction. Let's clone it.

This line is more important: "// The returned memory is owned by bbolt and must never be modified; writing to this memory might corrupt the database."

That line is from the newest version. Looks like we need to update bbolt.

@ -526,3 +526,3 @@
log.Debug(logs.ShardHandlingExpiredTombstonesBatch, zap.Int("number", len(tssExp)))
s.expiredTombstonesCallback(ctx, tssExp)
if len(tssExp) > 0 {
To be clear: is this an optimization or a functional change?
It is an optimization: with an empty batch we do nothing but take a lock on the metabase, because the callback calls db.boltDB.Update(...).

Approved accidentally, please disregard.

Fixed gopls-run; resolved the issue reported by gopls.

@ -221,3 +224,3 @@
make gopls-install; \
fi
@if [[ $$(find . -type f -name "*.go" -print | xargs $(GOPLS_VERSION_DIR)/gopls check | tee /dev/tty | wc -l) -ne 0 ]]; then \
$(GOPLS_VERSION_DIR)/gopls check $(SOURCES) 2>&1 >$(GOPLS_TEMP_FILE)
Was there any problem with the previous implementation (pipe instead of temp file)?
We are unable to use tee /dev/tty for security reasons. If we replace it with ... check 2>&1 | tee | wc -l, there is no output on error.

998d6a86d7 to 2ea7b6331d
2ea7b6331d to 13f6770f25
13f6770f25 to 9ba4a97276
9ba4a97276 to 2a4f637861