Commit graph

37 commits

Author SHA1 Message Date
Roman Khimov
8f45d57612 *: stop using math/rand
Mostly this switches to math/rand/v2, but sometimes randomness is not really needed.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
2024-08-30 17:00:11 +03:00
Roman Khimov
b8c09f509f network: add random slight delay to connection attempts
Small (especially dockerized/virtualized) networks often start all nodes at
once and then we see a lot of connection flapping in the log. This happens
because nodes try to connect to each other simultaneously, establish two
connections, then each one finds a duplicate and drops it, but this can be
different duplicate connections on other sides, so they retry and it all
happens for some time. Eventually everything settles, but we have a lot of
garbage in the log and a lot of useless attempts.

This random waiting timeout doesn't change the logic much, adds a minimal
delay, but increases chances for both nodes to establish a proper single
connection on both sides to only then see another one and drop it on both
sides as well. It leads to almost no flapping in small networks and doesn't
affect bigger ones much. The delay is close to unnoticeable, especially if
there is something in the DB for the node to process during startup.
2022-11-17 18:42:43 +03:00
Roman Khimov
075a54192c network: don't try too many connections
Consider mainnet: it has an AttemptConnPeers of 20, so a node may already have
3 peers and request 20 more, then have a 4th connected and attempt 20 more
again. This easily leads to a huge number of connections.
2022-11-17 18:03:04 +03:00
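One way to avoid the pile-up described in 075a54192c is to count attempts that are still in flight against the limit. The sketch below assumes that approach; `attemptsToStart` and the `inFlight` counter are hypothetical names, and the commit itself may bound the requests differently.

```go
package main

import "fmt"

// attemptsToStart returns how many new connection attempts may be started,
// given a per-round limit and the number of attempts still in flight.
func attemptsToStart(limit, inFlight int) int {
	if inFlight >= limit {
		return 0
	}
	return limit - inFlight
}

func main() {
	const attemptConnPeers = 20 // the mainnet value mentioned in the commit message

	inFlight := 0
	n := attemptsToStart(attemptConnPeers, inFlight)
	inFlight += n
	fmt.Println(n) // first round: the full budget is available

	// A new peer connects and triggers another round while the first
	// 20 attempts are still pending: nothing more is requested.
	fmt.Println(attemptsToStart(attemptConnPeers, inFlight))
}
```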
Roman Khimov
23f118a1a9 network: rework discoverer/server interaction
* treat connected/handshaked peers separately in the discoverer, save
   "original" address for connected ones, it can be a name instead of IP and
   it's important to keep it to avoid reconnections
 * store name->IP mapping for seeds if and when they're connected to avoid
   reconnections
 * block seed if it's detected to be our own node (which is often the case for
   small private networks)
 * add an event for handshaked peers in the server, connected but
   non-handshaked ones are not really helpful for MinPeers or GetAddr logic

Fixes #2796.
2022-11-17 17:07:19 +03:00
Roman Khimov
af24051bf5 network: sleep a bit before retrying reconnects
If Dial() is to exit quickly we can end up in a retry loop eating CPU.
2022-10-24 14:34:48 +03:00
Roman Khimov
851cbc7dab network: implement adaptive peer requests
When the network is big enough, MinPeers may be suboptimal for good network
connectivity, but if we know the network size we can do some estimation on the
number of sufficient peers.
2022-10-14 15:53:32 +03:00
Roman Khimov
215e8704f1 network: simplify discoverer, make it almost a lib
We already have two basic lists: connected and unconnected nodes, we don't
need an additional channel and we don't need a goroutine to handle it.
2022-10-14 15:53:32 +03:00
Roman Khimov
c1ef326183 network: re-add addresses to the pool on UnregisterConnectedAddr
That's what we do anyway, but this way we can be a bit more efficient.
2022-10-14 14:12:33 +03:00
Roman Khimov
631f166709 network: broadcast to log-dependent number of nodes
Fixes #608.
2022-10-14 14:12:33 +03:00
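A "log-dependent" broadcast in the sense of 631f166709 could look roughly like the sketch below, where the number of peers to broadcast to grows with the natural logarithm of the estimated network size. The factor 2.5, the rounding and the clamping are assumptions made for this example; the commit message does not state the formula actually used.

```go
package main

import (
	"fmt"
	"math"
)

// fanOutFactor is a hypothetical multiplier, chosen only for illustration.
const fanOutFactor = 2.5

// fanOut returns how many peers to broadcast to for a network of netSize
// nodes (including ourselves). The result grows logarithmically and is
// clamped to the range [1, netSize-1] when there are other nodes at all.
func fanOut(netSize int) int {
	others := netSize - 1
	if others <= 0 {
		return 0
	}
	n := int(math.Ceil(fanOutFactor * math.Log(float64(others))))
	if n < 1 {
		n = 1
	}
	if n > others {
		n = others
	}
	return n
}

func main() {
	for _, size := range []int{2, 4, 10, 100, 1000} {
		fmt.Println(size, fanOut(size))
	}
}
```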
Roman Khimov
dc62046019 network: add network size estimation metric
2022-10-12 22:29:55 +03:00
Roman Khimov
779a5c070f network: wait for exit in discoverer
And synchronize other threads with channels instead of mutexes. Overall this
scheme is more reliable.
2022-08-19 22:23:47 +03:00
Elizaveta Chichindaeva
28908aa3cf [#2442] English Check
Signed-off-by: Elizaveta Chichindaeva <elizaveta@nspcc.ru>
2022-05-04 19:48:27 +03:00
Roman Khimov
9d2712573f *: enable godot linter and fix all its warnings
It's important for NeoGo to have clean documentation. No functional changes.
2021-05-12 23:17:03 +03:00
Roman Khimov
d0634a7829 network: don't attempt to connect to the same node twice
We can have multiple copies of the same address in the pool and we should only
proceed to connect once per attempt.
2021-03-26 12:26:45 +03:00
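The per-attempt deduplication described in d0634a7829 can be sketched with a small set of already-picked addresses. The `pickUnique` helper and the slice-based pool are assumptions for illustration only.

```go
package main

import "fmt"

// pickUnique returns up to n distinct addresses from pool, preserving order.
// Duplicate entries in the pool are picked only once.
func pickUnique(pool []string, n int) []string {
	if n <= 0 {
		return nil
	}
	seen := make(map[string]struct{}, n)
	picked := make([]string, 0, n)
	for _, addr := range pool {
		if len(picked) == n {
			break
		}
		if _, ok := seen[addr]; ok {
			continue // already scheduled in this attempt
		}
		seen[addr] = struct{}{}
		picked = append(picked, addr)
	}
	return picked
}

func main() {
	pool := []string{"a:20333", "a:20333", "b:20333", "a:20333", "c:20333"}
	fmt.Println(pickUnique(pool, 2))
}
```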
Roman Khimov
163d90c866 network: don't register addresses before version handshake
1) It duplicates registration in `version` message handler and no valid
   connection can work without version exchange.
2) On public networks we have seed nodes defined by names, so we register
   connections to them using these names, but then if connection is dropped we
   delist them by IP:PORT combinations which can lead to zero PeerCount() with
   all seeds still being registered as connected in the discovery subsystem
   and thus no reconnection attempts being made.
2021-01-18 21:10:06 +03:00
Evgenii Stratonikov
27624946d9 network/test: add tests for server commands
2020-12-09 15:23:49 +03:00
Roman Khimov
1526772663 network: drop requests to discovery pool when it can't be handled
It happens from time to time in a four-node private network where there are
seeds (aka CNs) and not a lot of other nodes to connect to.

I don't know how to test for an infinite loop that has no side-effects, so no
test added here.
2020-12-04 21:39:50 +03:00
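Dropping a request that cannot be handled, as 1526772663 describes, is commonly done in Go with a non-blocking send. The sketch below shows that general technique under the assumption that requests travel over a channel; `tryRequest` is a hypothetical helper, not a function from the repository.

```go
package main

import "fmt"

// tryRequest attempts to queue a request for n addresses without blocking.
// It reports false when the receiver is not ready and the request is dropped.
func tryRequest(ch chan<- int, n int) bool {
	select {
	case ch <- n:
		return true
	default:
		return false // nobody can handle it right now, drop instead of waiting
	}
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(tryRequest(ch, 5)) // fits into the buffer
	fmt.Println(tryRequest(ch, 5)) // buffer is full, request is dropped
}
```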
Roman Khimov
38a22b44b2 network: try connecting to seeds indefinitely, use them with 0 pool
If the node is to start with seeds unavailable it will try connecting to each
of them three times, blacklist them and then sit forever waiting for
something. It's not a good behavior, it should always try connecting to seeds
if nothing else works.
2020-10-13 19:02:10 +03:00
Roman Khimov
8028e08abc network: an address should either be good or bad, but not both
2020-10-13 19:01:45 +03:00
Anna Shaleva
8c5c248e79 protocol: add capabilities to address payload
Part of #871
2020-05-27 19:02:25 +03:00
Evgenii Stratonikov
b7dee156e2 network: fix a deadlock in DefaultDiscovery
Why a deadlock can occur:
1. (*DefaultDiscovery).run() has a for loop over requestCh channel.
2. (*DefaultDiscovery).RequestRemote() sends to this channel while
    holding a mutex.
3. (*DefaultDiscovery).RegisterBadAddr() tries to take mutex for write.
4. Second select-case can't take mutex for read because of (3).
2020-03-10 15:40:23 +03:00
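The deadlock in b7dee156e2 follows a common pattern: a blocking channel send is performed while a lock is held. One usual remedy, sketched below, is to read the shared state under the lock and release it before sending. The `discovery` type, its fields and method bodies are simplified assumptions and do not reproduce the actual `DefaultDiscovery` code or the actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// discovery is a simplified stand-in for the real type.
type discovery struct {
	mu        sync.RWMutex
	closed    bool
	badAddrs  map[string]bool
	requestCh chan int
}

func newDiscovery() *discovery {
	return &discovery{
		badAddrs:  make(map[string]bool),
		requestCh: make(chan int), // unbuffered: a send blocks until received
	}
}

// requestRemote reads shared state under the read lock, then releases the
// lock before the potentially blocking send.
func (d *discovery) requestRemote(n int) {
	d.mu.RLock()
	closed := d.closed
	d.mu.RUnlock()
	if closed {
		return
	}
	d.requestCh <- n // no lock is held here, so writers can proceed
}

// registerBadAddr needs the write lock; it would block forever if
// requestRemote were parked on the send while still holding the read lock.
func (d *discovery) registerBadAddr(addr string) {
	d.mu.Lock()
	d.badAddrs[addr] = true
	d.mu.Unlock()
}

func main() {
	d := newDiscovery()
	go d.requestRemote(3)
	d.registerBadAddr("10.0.0.1:20333") // completes even if the send is pending
	fmt.Println(<-d.requestCh)
}
```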
Roman Khimov
77624a8847 network: add Close() to discoverer, shut it down on exit
2020-02-28 16:22:04 +03:00
Roman Khimov
eb4ec61b8b network: register connected addr in handleVersionCmd()
Prevent useless attempts to connect to this peer if the peer has already made
a connection to us.
2020-01-30 14:03:52 +03:00
Vsevolod Brekelov
d374175170 monitoring: add prometheus monitoring
add init metrics service which uses prometheus;
add configuration for metrics service;
add monitoring metrics for blockchain,rpc,server;
2019-10-29 20:51:17 +03:00
Roman Khimov
006337b1f8 network: rework discovery with rwmutex, add test
Keeping run() as the owner of all maps would mean adding at least three more
channels to keep address getters with thread-safety. But then there also is a
race between requestToWork() and run() which is way harder to solve with
channels because there are lots of possibilities for deadlocks. So rework all
of this with good old mutexes.

While at it, fix `requestCh` handling in the inner select of run, it will waste
one loop to handle it, so we should add one to the `requested`.

Fixes #445.
2019-10-28 13:37:27 +03:00
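The mutex-based design chosen in 006337b1f8 can be illustrated with maps guarded by a `sync.RWMutex`, where getters return copies so callers never touch the shared maps directly. The `addrPool` type and its methods are hypothetical and only show the general pattern.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// addrPool keeps address maps behind a single RWMutex.
type addrPool struct {
	mu          sync.RWMutex
	unconnected map[string]bool
	bad         map[string]bool
}

func newAddrPool() *addrPool {
	return &addrPool{
		unconnected: make(map[string]bool),
		bad:         make(map[string]bool),
	}
}

// add registers an address unless it is already known to be bad.
func (p *addrPool) add(addr string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !p.bad[addr] {
		p.unconnected[addr] = true
	}
}

// markBad moves an address to the bad list.
func (p *addrPool) markBad(addr string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.unconnected, addr)
	p.bad[addr] = true
}

// unconnectedAddrs returns a sorted copy of the unconnected addresses.
func (p *addrPool) unconnectedAddrs() []string {
	p.mu.RLock()
	defer p.mu.RUnlock()
	addrs := make([]string, 0, len(p.unconnected))
	for a := range p.unconnected {
		addrs = append(addrs, a)
	}
	sort.Strings(addrs)
	return addrs
}

func main() {
	p := newAddrPool()
	var wg sync.WaitGroup
	for _, a := range []string{"a:20333", "b:20333", "c:20333"} {
		wg.Add(1)
		go func(addr string) { // concurrent writers are safe behind the mutex
			defer wg.Done()
			p.add(addr)
		}(a)
	}
	wg.Wait()
	p.markBad("b:20333")
	fmt.Println(p.unconnectedAddrs())
}
```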
Roman Khimov
77a50d6dc6 network: remove useless checks in discovery
These are useless.
2019-10-27 16:11:32 +03:00
Vsevolod Brekelov
8ee421db14 fix spelling and godoc comments
2019-10-22 17:56:03 +03:00
Roman Khimov
3fc2bf5452 *: fix some misspellings
Goreport:
   neo-go/pkg/core/contract_state_test.go
        Line 21: warning: "Contracto" is a misspelling of "Contraction" (misspell)
        Line 64: warning: "Contracto" is a misspelling of "Contraction" (misspell)

   neo-go/pkg/core/interop_neo.go
        Line 420: warning: "succeedes" is a misspelling of "succeeds" (misspell)

   neo-go/pkg/network/discovery.go
        Line 118: warning: "succeded" is a misspelling of "succeeded" (misspell)
        Line 128: warning: "successfuly" is a misspelling of "successfully" (misspell)
2019-10-17 12:30:24 +03:00
Roman Khimov
773ccc2b92 network: allow discoverer to reuse addresses
...and don't try to connect to the nodes we're already connected to.

Before this change we had a problem of the discoverer throwing away good valid
addresses just because they were already known, which led to the pool draining
over time (as address reuse was basically forbidden and getaddr may not get
enough new nodes).
2019-09-16 16:32:04 +03:00
Roman Khimov
2a49e68d77 network: start worker goroutine for every connection attempt
Prevents blocking on write to workCh which can be dangerous for the server.
2019-09-16 16:26:30 +03:00
Roman Khimov
b4e284f301 discovery: make pool management more reliable
Just drop excessive addresses, otherwise we can block for no good reason.
2019-09-16 16:26:30 +03:00
Roman Khimov
85f19936dd network: implement connection retries
It's worth trying a bit more than once.
2019-09-16 16:26:30 +03:00
Roman Khimov
be6c905e5d network: use and improve discovery mechanism for reconnections
This makes our node reconnect to other nodes if connection drops for some
reason. Fixes #390.
2019-09-16 16:26:30 +03:00
Evgeniy Kulikov
67cbcac643 Fix typos (#133)
* Fix typos

* revert chains/unit_testnet

* revert chains

* fix review comments (thx @AlexVanin)
2019-02-13 18:01:10 +00:00
Evgeniy Kulikov
630919bf7d Fix typos and warnings for GoReport / GolangCiLinter (#132)
- typos
- gofmt -s
- govet warnings
- golangci-lint run
2019-02-09 16:53:58 +01:00
Steven Jack
19a430b262 RPC server (#50)
* Adds basic RPC supporting files

* Adds interrupt handling and error chan

* Add getblock RPC method

* Update request structure

* Update names of nodes

* Allow bad addresses to be registered in discovery externally

* Small tidy up

* Few tweaks

* Check if error is close error in tcp transport

* Fix tests

* Fix priv port

* Small tweak to param name

* Comment fix

* Remove version from server

* Moves submitblock to TODO block

* Remove old field

* Bumps version and fix hex issues
2018-03-23 21:36:59 +01:00
Anthony De Meulemeester
aa4bc1b6e8 Node improvements (#47)
* block partial persist

* replaced refactored files with old one.

* removed gokit/log from deps

* Tweaks to not overburden remote nodes with getheaders/getblocks

* Changed Transporter interface to not take the server as argument because it caused a race warning from the compiler

* started server test suite

* more test + return errors from message handlers

* removed --race from build

* Little improvements.
2018-03-14 10:36:59 +01:00