storage node: Drain internal error's channel #878

Merged
fyrchik merged 1 commit from dstepanov-yadro/frostfs-node:fix/shutdown_panic into master 2023-12-19 16:38:06 +00:00

Scenario:

  1. Some morph connection gets an error and passes it to the internalErr channel
  2. The storage node starts to shut down and closes the internalErr channel
  3. Another morph connection gets an error and tries to pass it to the internalErr channel => panic:
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: panic: send on closed channel
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: goroutine 4136 [running]:
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event.(*listener).listenLoop(0xc000176780, {0x162fe20, 0xc0002a5c70}, 0xc0002bc240, 0xc001054c60)
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event/listener.go:238 +0x433
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event.(*listener).listen(0xc000176780, {0x162fe20, 0xc0002a5c70}, 0x43?)
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event/listener.go:169 +0xb6
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event.(*listener).ListenWithError.func1()
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event/listener.go:152 +0x45
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: sync.(*Once).doSlow(0x193af2b?, 0x43?)
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         sync/once.go:74 +0xc2
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: sync.(*Once).Do(...)
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         sync/once.go:65
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event.(*listener).ListenWithError(0xc00440c270?, {0x162fe20?, 0xc0002a5c70?}, 0x0?)
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/pkg/morph/event/listener.go:151 +0x67
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: main.listenMorphNotifications.func1.1({0x162fe20?, 0xc0002a5c70?}, 0xc000d37f10?)
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/cmd/frostfs-node/morph.go:213 +0x39
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: main.runAndLog({0x162fe20, 0xc0002a5c70}, 0xc000297800, {0x13c0fcc, 0x12}, 0x0, 0xc000276f68)
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/cmd/frostfs-node/main.go:116 +0xc4
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: main.listenMorphNotifications.func1({0x162fe20?, 0xc0002a5c70?})
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/cmd/frostfs-node/morph.go:212 +0x68
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: main.startWorker.func1({{0x0?, 0xc004c94400?}, 0xc00272ca60?})
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/cmd/frostfs-node/worker.go:28 +0x37
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]: created by main.startWorker
дек 12 02:37:31 blackmetal-node16 frostfs-node[7839]:         git.frostfs.info/TrueCloudLab/frostfs-node/cmd/frostfs-node/worker.go:27 +0x138
дек 12 02:37:31 blackmetal-node16 systemd[1]: frostfs-storage.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
дек 12 02:37:31 blackmetal-node16 systemd[1]: frostfs-storage.service: Failed with result 'exit-code'.
dstepanov-yadro requested review from storage-core-committers 2023-12-18 15:23:06 +00:00
dstepanov-yadro requested review from storage-core-developers 2023-12-18 15:23:07 +00:00
dstepanov-yadro force-pushed fix/shutdown_panic from d03042fcd4 to 3c16114465 2023-12-18 15:47:50 +00:00
aarifullin approved these changes 2023-12-18 16:16:30 +00:00
dstepanov-yadro force-pushed fix/shutdown_panic from 3c16114465 to 484836b9f9 2023-12-19 05:27:30 +00:00
acid-ant approved these changes 2023-12-19 06:34:56 +00:00
fyrchik merged commit d69d318cb0 into master 2023-12-19 16:38:06 +00:00
Reference: TrueCloudLab/frostfs-node#878