Create tombstone source when reload SUPPORT #392

Merged
fyrchik merged 1 commit from dstepanov-yadro/frostfs-node:fix/gc_panic_support into support/v0.36 2023-07-26 21:07:58 +00:00

The node panics when its config is reloaded and GC deletes objects with tombstones:

2023/05/25 10:58:24 worker exits from a panic: runtime error: invalid memory address or nil pointer dereference
2023/05/25 10:58:24 worker exits from panic: goroutine 7480 [running]:
github.com/panjf2000/ants/v2.(*goWorker).run.func1.1()
        github.com/panjf2000/ants/v2@v2.4.0/worker.go:58 +0x10c
panic({0x104dd40, 0x1d436e0})
        runtime/panic.go:838 +0x207
git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/shard.(*Shard).collectExpiredTombstones(0xc0006d11d0, {0x1487990, 0xc000626000}, {0x14778a0?, 0x1d095d8?})
        git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/shard/gc.go:410 +0x7de
git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/shard.(*gc).listenEvents.func1()
        git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/shard/gc.go:149 +0x74
github.com/panjf2000/ants/v2.(*goWorker).run.func1()
        github.com/panjf2000/ants/v2@v2.4.0/worker.go:68 +0x97
created by github.com/panjf2000/ants/v2.(*goWorker).run
        github.com/panjf2000/ants/v2@v2.4.0/worker.go:48 +0x65

2023-05-25T10:58:24.027Z        info    frostfs-node/config.go:983      bootstrapping with online state {"previous": "ONLINE"}

Because of the panic, workgroup.Done() is never called for GC. As a result, there is no receiver for the new epoch event, and new epoch notifications stay blocked after the panic.
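The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the frostfs-node code: it assumes the "workgroup" is a `sync.WaitGroup`, uses a plain goroutine with `recover` as a stand-in for the ants worker pool, and all names (`tombstoneSource`, `runHandler`, `waitWithTimeout`, `handlerFinished`) are hypothetical. It only illustrates why a panic inside the handler leaves the waiter blocked when `Done()` is not deferred; it is not the change made in this PR.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tombstoneSource is a hypothetical stand-in for the dependency that is nil after reload.
type tombstoneSource interface {
	IsExpired() bool
}

// runHandler runs handler in a goroutine tracked by wg.
// With deferDone set, wg.Done() is deferred and runs even if handler panics;
// otherwise it is called only after handler returns normally.
func runHandler(wg *sync.WaitGroup, deferDone bool, handler func()) {
	wg.Add(1)
	go func() {
		// Stand-in for the worker pool, which recovers the panic and logs it.
		defer func() { _ = recover() }()
		if deferDone {
			defer wg.Done()
		}
		handler()
		if !deferDone {
			wg.Done() // never reached when handler panics
		}
	}()
}

// waitWithTimeout reports whether wg.Wait() returned within d.
func waitWithTimeout(wg *sync.WaitGroup, d time.Duration) bool {
	done := make(chan struct{})
	go func() {
		wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(d):
		return false
	}
}

// handlerFinished runs a handler that calls a method on a nil source
// and reports whether the waiter was released.
func handlerFinished(deferDone bool) bool {
	var wg sync.WaitGroup
	runHandler(&wg, deferDone, func() {
		var src tombstoneSource // nil, as if it was not created on reload
		_ = src.IsExpired()     // nil pointer dereference
	})
	return waitWithTimeout(&wg, 200*time.Millisecond)
}

func main() {
	fmt.Println("Done() after handler:", handlerFinished(false))
	fmt.Println("deferred Done():", handlerFinished(true))
}
```

In this sketch the first call times out because `Done()` is skipped by the panic, while the second returns because the deferred `Done()` still runs. Deferring `Done()` only keeps the waiter from hanging; the nil source itself still has to be created, which is what the PR title describes.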

dstepanov-yadro force-pushed fix/gc_panic_support from 5f9da55ac0 to 2360cf263b 2023-05-25 12:53:30 +00:00
dstepanov-yadro requested review from storage-core-committers 2023-05-25 12:55:02 +00:00
dstepanov-yadro requested review from storage-core-developers 2023-05-25 12:55:02 +00:00
acid-ant approved these changes 2023-05-25 14:18:36 +00:00
JuliaKovshova approved these changes 2023-05-25 15:54:53 +00:00
fyrchik approved these changes 2023-05-26 12:13:44 +00:00
fyrchik merged commit 2360cf263b into support/v0.36 2023-05-26 12:13:48 +00:00
Reference: TrueCloudLab/frostfs-node#392