Fix/tree svc panic #322

Merged
fyrchik merged 1 commit from carpawell/frostfs-node:fix/panic-in-tree-svc into master 2023-05-05 07:06:16 +00:00
Collaborator

If a connection has not been established earlier, nil is stored in the LRU
cache. Cache eviction tries to close every connection (even a nil one) and
panics, but the app does not crash because we are using pools.
That ugly bug also leads to a deadlock because Unlock is not called via
a deferred func (and that is how I found it).

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>

Could be the reason for #260 but should be rechecked.
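
The failure mode described above can be illustrated with a minimal Go sketch. This is not the actual frostfs-node code: the `conn` type and the `closeOnEvict` / `unsafeCloseOnEvict` helpers are hypothetical names, and the LRU cache itself is left out. It only assumes that a `nil` connection ends up in the cache after a failed dial and that eviction calls `Close` on whatever was cached.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a cached client connection.
type conn struct{ closed bool }

// Close writes to a field, so calling it on a nil *conn panics
// with a nil pointer dereference.
func (c *conn) Close() error {
	c.closed = true
	return nil
}

// unsafeCloseOnEvict mimics an eviction callback without a nil check.
// The recover stands in for a worker pool that swallows panics.
func unsafeCloseOnEvict(c *conn) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = c.Close()
	return false
}

// closeOnEvict is the guarded variant: a nil entry is skipped.
func closeOnEvict(c *conn) error {
	if c == nil {
		return nil
	}
	return c.Close()
}

func main() {
	var failed *conn // nil was cached because the dial failed

	fmt.Println("unguarded eviction panicked:", unsafeCloseOnEvict(failed))
	fmt.Println("guarded eviction error:", closeOnEvict(failed))
}
```

An alternative under the same assumptions would be to never put `nil` into the cache at all, so the eviction callback needs no special case.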
carpawell added the bug, frostfs-node, P0 labels 2023-05-04 16:47:55 +00:00
carpawell self-assigned this 2023-05-04 16:47:55 +00:00
carpawell force-pushed fix/panic-in-tree-svc from 1774705f93 to 479c5a65e1 2023-05-04 16:49:00 +00:00
carpawell changed title from [#xxx] node: Fix tree svc panic to Fix/tree svc panic 2023-05-04 16:49:07 +00:00
carpawell requested review from storage-core-developers 2023-05-04 16:49:19 +00:00
carpawell requested review from storage-core-committers 2023-05-04 16:49:19 +00:00
carpawell requested review from fyrchik 2023-05-04 16:49:23 +00:00
fyrchik approved these changes 2023-05-04 16:54:34 +00:00

We need the same for 1.2 (support/v0.36)
Poster
Collaborator

@fyrchik, no problemo: #323.

Could you explain how exactly it could be the reason for #260?
Poster
Collaborator

@fyrchik, that is a theory. Every tree svc operation was blocked because of that bug when I looked at them. Tree sync is related to handling morph events (and an event should be handled correctly, without any blocking, so that it can be handled again next time). At least it stopped some operations and locked them forever. That was the first thing I saw in the pprof, and there were absolutely no locks in any morph code.
dstepanov-yadro approved these changes 2023-05-05 06:26:17 +00:00
acid-ant approved these changes 2023-05-05 06:36:58 +00:00
fyrchik merged commit 479c5a65e1 into master 2023-05-05 07:06:16 +00:00
Reference: TrueCloudLab/frostfs-node#322