Validate advertised node addresses before adding to netmap #1497

Open
opened 2024-11-14 10:23:24 +00:00 by potyarkin · 0 comments
Member

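Is your feature request related to a problem? Please describe.

I've encountered all sorts of weird problems (OOM, cryptic errors returned for PUT requests, etc.) due to a misconfiguration on my part: storage nodes were (mis)configured to advertise both their own and their neighbors' addresses in node.addresses[] (see https://git.frostfs.info/TrueCloudLab/frostfs-node/src/commit/d77a218f7c1a449369eb6d63e00ae1906984aed4/config/example/node.yaml#L27-L31):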

addresses: # list of addresses announced by Storage node in the Network map
- s01.frostfs.devenv:8080
- /dns4/s02.frostfs.devenv/tcp/8081
- grpc://127.0.0.1:8082
- grpcs://localhost:8083

Describe the solution you'd like

Let's discuss whether innerring should intervene and gracefully handle such scenarios. This is especially relevant for public FrostFS deployments where untrusted actors may intentionally add misconfigured storage nodes to the network.

An innerring node could check (a) whether the advertised address is responsive and (b) whether the node replying on that address is the one that's advertising it. I think that dropping unresponsive addresses from the netmap is a step too far (e.g. nodes may want to advertise their LAN address for local peering), but what about dropping addresses whose replies are signed with the wrong key? Theoretically, there is a chance of a false positive (a LAN address collision between different LANs), but is that risk significant enough?
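To make the proposal concrete, here is a minimal sketch of the decision rule in Go. It is not how innerring works today: `Prober`, `CheckAddress`, `FilterAddresses` and `fakeProbe` are hypothetical names invented for illustration, and the real check would need a signed RPC to each address. The addresses come from the config snippet above and the two keys from the error message under "Additional context".

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnreachable is what a Prober returns when nothing answers at the address.
var ErrUnreachable = errors.New("address unreachable")

// Prober contacts addr and returns the hex-encoded public key of whoever
// replied. Hypothetical: a real one would verify a signed response.
type Prober func(addr string) (pubKeyHex string, err error)

// Verdict is the outcome of checking one advertised address.
type Verdict int

const (
	Keep           Verdict = iota // replied with the advertising node's key
	KeepUnverified                // no reply; may be a LAN-only address
	Drop                          // replied with some other node's key
)

// CheckAddress applies the rule from the proposal: only a reply signed
// with the wrong key is grounds for dropping an address.
func CheckAddress(probe Prober, nodeKey, addr string) Verdict {
	got, err := probe(addr)
	switch {
	case err != nil:
		return KeepUnverified
	case got != nodeKey:
		return Drop
	default:
		return Keep
	}
}

// FilterAddresses returns the advertised addresses that are not dropped.
func FilterAddresses(probe Prober, nodeKey string, addrs []string) []string {
	kept := make([]string, 0, len(addrs))
	for _, a := range addrs {
		if CheckAddress(probe, nodeKey, a) != Drop {
			kept = append(kept, a)
		}
	}
	return kept
}

// Sample keys, taken from the error message in this issue.
const (
	selfKey     = "0383cafefa22109a9c1a0feac60d0ca464bcd5432adfad35b863d08e81e2790bbf"
	neighborKey = "02604799e1413e07d2e749c05401abaaf15571856b20db488380dd59c8e5f2a79e"
)

// fakeProbe simulates the network: s01 is the node itself, s02 is a
// neighbor, and every other address is unreachable.
func fakeProbe(addr string) (string, error) {
	switch addr {
	case "s01.frostfs.devenv:8080":
		return selfKey, nil
	case "/dns4/s02.frostfs.devenv/tcp/8081":
		return neighborKey, nil
	default:
		return "", ErrUnreachable
	}
}

func main() {
	advertised := []string{
		"s01.frostfs.devenv:8080",
		"/dns4/s02.frostfs.devenv/tcp/8081",
		"grpc://127.0.0.1:8082",
	}
	fmt.Println(FilterAddresses(fakeProbe, selfKey, advertised))
}
```

In this simulation the s02 address is dropped because the reply carries the neighbor's key, while the unreachable loopback address is kept. That is also where the LAN-collision false positive would show up: a reachable address in another LAN that belongs to a different node would be dropped too.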

Describe alternatives you've considered

Fixing all dysfunctional behaviors caused by nodes advertising wrong addresses on netmap would be quite an effort, but I guess that's still an alternative to consider.

Additional context

  • Example of cryptic error caused by misconfigured storage nodes (private chat):
    status: code = 1024 message = incomplete object PUT by placement: internal/key.go: public key is different from the key in the network map: want 0383cafefa22109a9c1a0feac60d0ca464bcd5432adfad35b863d08e81e2790bbf, got 02604799e1413e07d2e749c05401abaaf15571856b20db488380dd59c8e5f2a79e
    
potyarkin added the enhancement, discussion, frostfs-ir, P3, triage labels 2024-11-14 10:23:24 +00:00
potyarkin changed title from innerring: Validate advertised node addresses before adding to netmap to Validate advertised node addresses before adding to netmap 2024-11-14 10:23:50 +00:00
Reference: TrueCloudLab/frostfs-node#1497