It happens from time to time in a four-node private network where there are
seeds (aka CNs) and not a lot of other nodes to connect to.
I don't know how to test for an infinite loop that has no side-effects, so no
test added here.
1) It duplicates registration in `version` message handler and no valid
connection can work without version exchange.
2) On public networks we have seed nodes defined by names, so we register
connections to them using these names, but then if connection is dropped we
delist them by IP:PORT combinations which can lead to zero PeerCount() with
all seeds still being registered as connected in the discovery subsystem
and thus no reconnection attempts being made.
If the node is to start with seeds unavailable it will try connecting to each
of them three times, blacklist them and then sit forever waiting for
something. It's not a good behavior, it should always try connecting to seeds
if nothing else works.
Why a deadlock can occur:
1. (*DefaultDiscovery).run() has a for loop over requestCh channel.
2. (*DefaultDiscovery).RequestRemote() send to this channel while
holding a mutex.
3. (*DefaultDiscovery).RegisterBadAddr() tries to take mutex for write.
4. Second select-case can't take mutex for read because of (3).
Keeping run() as the owner of all maps would mean adding at least three more
channels to keep address getters with thread-safety. But then there also is a
race between requestToWork() and run() which is way harder to solve with
channels because there are lots of possibilities for deadlocks. So rework all
of this with good old mutexes.
While at it, fix `requestCh` handling in the inner select of run, it will waste
one loop to handle it, so we should add one to the `requested`.
Fixes#445.
Goreport:
neo-go/pkg/core/contract_state_test.go
Line 21: warning: "Contracto" is a misspelling of "Contraction" (misspell)
Line 64: warning: "Contracto" is a misspelling of "Contraction" (misspell)
neo-go/pkg/core/interop_neo.go
Line 420: warning: "succeedes" is a misspelling of "succeeds" (misspell)
neo-go/pkg/network/discovery.go
Line 118: warning: "succeded" is a misspelling of "succeeded" (misspell)
Line 128: warning: "successfuly" is a misspelling of "successfully" (misspell)
...and don't try to connect to the nodes we're already connected to.
Before this change we had a problem of discoverer throwing away good valid
addresses just because they are already known which lead to pool draining over
time (as address reuse was basically forbidden and getaddr may not get enough
new nodes).
* Adds basic RPC supporting files
* Adds interrupt handling and error chan
* Add getblock RPC method
* Update request structure
* Update names of nodes
* Allow bad addresses to be registered in discovery externally
* Small tidy up
* Few tweaks
* Check if error is close error in tcp transport
* Fix tests
* Fix priv port
* Small tweak to param name
* Comment fix
* Remove version from server
* Moves submitblock to TODO block
* Remove old field
* Bumps version and fix hex issues
* block partial persist
* replaced refactored files with old one.
* removed gokit/log from deps
* Tweaks to not overburden remote nodes with getheaders/getblocks
* Changed Transporter interface to not take the server as argument due to a cause of race warning from the compiler
* started server test suite
* more test + return errors from message handlers
* removed --race from build
* Little improvements.