AIO fails client connections after some time #32

Closed
opened 2024-01-11 13:30:06 +00:00 by alexvanin · 2 comments
Owner

There are reports that AIO services become unavailable after the AIO has been running for some time (days, weeks). The AIO contains 14k containers with small objects. Clients can't connect to the storage node and receive `no healthy client` errors while using SDK Pool.

Try to reproduce it by running a longevity AIO stand with debug settings added (enable pprof, etc.).
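A minimal sketch of how such a longevity check might be scripted, so that the first failure is recorded with a timestamp. The helper names `probe_once` and `probe_loop` are hypothetical and not part of frostfs-aio; the example invocation reuses the `frostfs-cli -c c.yml container list` command that appears later in this thread. It only exercises the CLI path, so an SDK Pool test app would have to be probed the same way separately.

```shell
#!/bin/sh
# Hypothetical longevity probe: run a check command at a fixed interval
# and log each result with a UTC timestamp.

# probe_once CMD...: run the check once, print one log line, return 1 on failure.
probe_once() {
    if "$@" >/dev/null 2>&1; then
        echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) ok"
    else
        status=$?   # exit status of the failed check command
        echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) FAIL exit=$status"
        return 1
    fi
}

# probe_loop INTERVAL COUNT CMD...: run the check COUNT times, sleeping
# INTERVAL seconds between runs; succeed only if no run failed.
probe_loop() {
    interval=$1
    count=$2
    shift 2
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        probe_once "$@" || failures=$((failures + 1))
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    echo "failures=$failures"
    [ "$failures" -eq 0 ]
}

# Example (assumed values): probe once a minute for roughly a day.
# probe_loop 60 1440 frostfs-cli -c c.yml container list >> probe.log
```

Logging every run rather than only failures makes it possible to tell a node that stopped answering from a probe that stopped running.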

alexvanin changed title from AIO connection after some time to AIO fails client connections after some time 2024-01-11 13:30:22 +00:00
Owner

Could the problem be somehow related to the connection handling in SDK pool, though?

ironbee was assigned by alexvanin 2024-01-31 09:28:12 +00:00
Author
Owner

Didn't reproduce. The AIO image ran for 3 months, preloaded with thousands of containers and objects.

$ docker ps
CONTAINER ID   IMAGE                            COMMAND                  CREATED        STATUS                  PORTS                                                                                                                                                                                                                        NAMES
cde08cad4e5d   truecloudlab/frostfs-aio:1.3.0   "/usr/bin/init-aio.sh"   3 months ago   Up 3 months (healthy)   0.0.0.0:6661-6664->6661-6664/tcp, :::6661-6664->6661-6664/tcp, 0.0.0.0:8080-8086->8080-8086/tcp, :::8080-8086->8080-8086/tcp, 0.0.0.0:16513->16513/tcp, :::16513->16513/tcp, 0.0.0.0:30333->30333/tcp, :::30333->30333/tcp   aio

$ frostfs-cli -c c.yml container list | wc -l
13999

All new connections with frostfs-cli and SDK Pool test app are handled just fine.

$ ./frostfs-cli -c c.yml object put --cid mFnxUV9CUUjGBKQ7vPBCSbbseNehkKmPooCfqdbUA1S --file ./c.yml       
[./c.yml] Object successfully stored
  OID: 5zPA3WBepEAp1HMe6XC8NTxTDdFnscVGdw7t2hH1Jvtz
  CID: mFnxUV9CUUjGBKQ7vPBCSbbseNehkKmPooCfqdbUA1S

An issue with environment restart was found during this long run; it was described and fixed in #34.

Closed.

Reference: TrueCloudLab/frostfs-aio#32