Commit graph

17 commits

Author SHA1 Message Date
Roman Khimov
b8c09f509f network: add random slight delay to connection attempts
Small (especially dockerized/virtualized) networks often start all nodes at
ones and then we see a lot of connection flapping in the log. This happens
because nodes try to connect to each other simultaneously, establish two
connections, then each one finds a duplicate and drops it, but this can be
different duplicate connections on other sides, so they retry and it all
happens for some time. Eventually everything settles, but we have a lot of
garbage in the log and a lot of useless attempts.

This random waiting timeout doesn't change the logic much, adds a minimal
delay, but increases chances for both nodes to establish a proper single
connection on both sides to only then see another one and drop it on both
sides as well. It leads to almost no flapping in small networks, doesn't
affect much bigger ones. The delay is close to unnoticeable especially if
there is something in the DB for node to process during startup.
2022-11-17 18:42:43 +03:00
Roman Khimov
23f118a1a9 network: rework discoverer/server interaction
* treat connected/handshaked peers separately in the discoverer, save
   "original" address for connected ones, it can be a name instead of IP and
   it's important to keep it to avoid reconnections
 * store name->IP mapping for seeds if and when they're connected to avoid
   reconnections
 * block seed if it's detected to be our own node (which is often the case for
   small private networks)
 * add an event for handshaked peers in the server, connected but
   non-handshaked ones are not really helpful for MinPeers or GetAddr logic

Fixes #2796.
2022-11-17 17:07:19 +03:00
Roman Khimov
215e8704f1 network: simplify discoverer, make it almost a lib
We already have two basic lists: connected and unconnected nodes, we don't
need an additional channel and we don't need a goroutine to handle it.
2022-10-14 15:53:32 +03:00
Roman Khimov
c1ef326183 network: re-add addresses to the pool on UnregisterConnectedAddr
That's what we do anyway, but this way we can be a bit more efficient.
2022-10-14 14:12:33 +03:00
Roman Khimov
631f166709 network: broadcast to log-dependent number of nodes
Fixes #608.
2022-10-14 14:12:33 +03:00
Roman Khimov
b8192d0958 network: optimize waiting in test
require.Eventually polls more often reducing average waiting time.
2021-07-08 11:14:35 +03:00
Roman Khimov
9cc4f42a71 network: fix discoverer test
Asynchronous tryAddress() routines may get dial result AFTER the switch to
another test, so we need to ensure that they'll get the result intended for
this particular call. Fixes:

2021-07-07T20:25:40.1624521Z === RUN   TestDefaultDiscoverer
2021-07-07T20:25:40.1625316Z     discovery_test.go:159: timeout expecting for transport dial; i: 2, j: 1
2021-07-07T20:25:40.1626319Z --- FAIL: TestDefaultDiscoverer (1.19s)
2021-07-08 11:03:30 +03:00
Roman Khimov
685ff8bbf6 network: reduce dial timeout in discoverer test
We don't care much about dialing, but the same constant is used in outer
discoverer loop in case no connections are established and we have no
connections established.
2021-07-08 10:40:54 +03:00
Roman Khimov
78bf172108 network: drop some not really useful test code
SA4010: this result of append is never used, except maybe in other appends (staticcheck)
2021-05-12 19:29:45 +03:00
Roman Khimov
163d90c866 network: don't register addresses before version handshake
1) It duplicates registration in `version` message handler and no valid
   connection can work without version exchange.
2) On public networks we have seed nodes defined by names, so we register
   connections to them using these names, but then if connection is dropped we
   delist them by IP:PORT combinations which can lead to zero PeerCount() with
   all seeds still being registered as connected in the discovery subsystem
   and thus no reconnection attempts being made.
2021-01-18 21:10:06 +03:00
Evgenii Stratonikov
27624946d9 network/test: add tests for server commands 2020-12-09 15:23:49 +03:00
Roman Khimov
38a22b44b2 network: try connecting to seeds indefinitely, use them with 0 pool
If the node is to start with seeds unavailable it will try connecting to each
of them three times, blacklist them and then sit forever waiting for
something. It's not a good behavior, it should always try connecting to seeds
if nothing else works.
2020-10-13 19:02:10 +03:00
Roman Khimov
8028e08abc network: an address should either be good or bad, but not both 2020-10-13 19:01:45 +03:00
Anna Shaleva
8c5c248e79 protocol: add capabilities to address payload
Part of #871
2020-05-27 19:02:25 +03:00
Anna Shaleva
c590cc02f4 protocol: add capabilities to version payload
closes #871
2020-05-27 19:01:14 +03:00
Roman Khimov
77624a8847 network: add Close() to discoverer, shut it down on exit 2020-02-28 16:22:04 +03:00
Roman Khimov
006337b1f8 network: rework discovery with rwmutex, add test
Keeping run() as the owner of all maps would mean adding at least three more
channels to keep address getters with thread-safety. But then there also is a
race between requestToWork() and run() which is way harder to solve with
channels because there are lots of possibilities for deadlocks. So rework all
of this with good old mutexes.

While at it, fix `requestCh` handling in the inner select of run, it will waste
one loop to handle it, so we should add one to the `requested`.

Fixes #445.
2019-10-28 13:37:27 +03:00