Commit graph

39 commits

Author SHA1 Message Date
Roman Khimov
e1d5f18ff4 network: fix outdated Peer interface comments 2022-10-12 10:16:07 +03:00
Roman Khimov
e80c60a3b9 network: rework broadcast logic
We have a number of queues for different purposes:
 * regular broadcast queue
 * direct p2p queue
 * high-priority queue

And two basic egress scenarios:
 * direct p2p messages (replies to requests in Server's handle* methods)
 * broadcasted messages

Low priority broadcasted messages:
 * transaction inventories
 * block inventories
 * notary inventories
 * non-consensus extensibles

High-priority broadcasted messages:
 * consensus extensibles
 * getdata transaction requests from consensus process
 * getaddr requests

P2P messages are a bit more complicated, most of the time they use p2p queue,
but extensible message requests/replies use HP queue.

Server's handle* code is run from Peer's handleIncoming, every peer has this
thread that handles incoming messages. When working with the peer it's
important to reply to requests and blocking this thread until we send (queue)
a reply is fine, if the peer is slow we just won't get anything new from
it. The queue used is irrelevant wrt this issue.

Broadcasted messages are radically different, we want them to be delivered to
many peers, but we don't care about specific ones. If it's delivered to 2/3 of
the peers we're fine, if it's delivered to more of them --- it's not an
issue. But doing this fairly is not an easy thing, current code tries performing
unblocked sends and if this doesn't yield enough results it then blocks (but
has a timeout, we can't wait indefinitely). But it does so in sequential
manner, once the peer is chosen the code will wait for it (and only it) until
timeout happens.

What can be done instead is an attempt to push the message to all of the peers
simultaneously (or close to that). If they all deliver --- OK, if some block
and wait then we can wait until _any_ of them pushes the message through (or
global timeout happens, we still can't wait forever). If we have enough
deliveries then we can cancel pending ones and it's again not an error if
these canceled threads still do their job.

This makes the system more dynamic and adds some substantial processing
overhead, but it's a networking code, any of this overhead is much lower than
the actual packet delivery time. It also allows to spread the load more
fairly, if there is any spare queue it'll get the packet and release the
broadcaster. On the next broadcast iteration another peer is more likely to be
chosen just because it didn't get a message previously (and had some time to
deliver already queued messages).

It works perfectly in tests, with optimal networking conditions we have much
better block times and TPS increases by 5-25%% depending on the scenario.

I'd go as far as to say that it fixes the original problem of #2678, because
in this particular scenario we have empty queues in ~100% of the cases and
this new logic will likely lead to 100% fan out in this case (cancelation just
won't happen fast enough). But when the load grows and there is some waiting
in the queue it will optimize out the slowest links.
2022-10-11 18:42:40 +03:00
Elizaveta Chichindaeva
28908aa3cf [#2442] English Check
Signed-off-by: Elizaveta Chichindaeva <elizaveta@nspcc.ru>
2022-05-04 19:48:27 +03:00
Evgenii Stratonikov
0a5049658f network: support non-blocking broadcast
Right now a single slow peer can slow down whole network.
Do broadcast in 2 parts:
1. Perform non-blocking send to all peers if possible.
2. Perform blocking sends until message is sent to 2/3 of good peers.
2020-12-25 14:36:52 +03:00
Roman Khimov
2ce3c8b75f network: treat unsolicited addr commands as errors
See neo-project/neo#2097.
2020-11-25 13:34:38 +03:00
Roman Khimov
c8cc91eeee network: request blocks when there is a ping with bigger than ours height
Turns out, C# node no longer broadcasts an Inv when it's creating a block,
instead it sends a ping and if we're not paying attention to the height
specified there we're technically missing a new block. Of course we'll get it
later after ping timer expiration and regular ping/pong sequence, but that's
delaying it for no good reason.
2020-08-14 16:22:15 +03:00
Anna Shaleva
c590cc02f4 protocol: add capabilities to version payload
closes #871
2020-05-27 19:01:14 +03:00
Roman Khimov
e41d434a49 *: move all packages from CityOfZion to nspcc-dev 2020-03-03 17:21:42 +03:00
Roman Khimov
9eafec0d1d network: introduce peer-to-peer message queue
This one is designed to give more priority to direct nodes communication, that
is that their messaging would have more priority than generic broadcasts. It
should improve consensus process under TX pressure and allow to handle
pings in time (preventing disconnects).
2020-01-30 14:03:52 +03:00
Roman Khimov
06c3fbe455 network: rework ping sends, fix overpinging
Our node was too pingy because of wrong timer setups (that divided timeout
Duration by time.Second), it also was wrong in its time calculations (using
UTC time to calculate intervals). At the same time missing block is a
server-wide problem, so it's better solved with server-wide protocol loop.
2020-01-28 17:39:52 +03:00
Roman Khimov
1f672e0da7 network: move SendVersion() to the Peer
Only leave server-specific `getVersionMsg()` in the Server, all the other
logic is peer-related.
2020-01-21 17:26:08 +03:00
Roman Khimov
2c4ace022e network/config: redesign ping timeout handling a bit
1) Make timeout a timeout, don't do magic ping counts.
2) Drop additional timer from the main peer's protocol loop, create it
   dynamically and make it disconnect the peer.
3) Don't expose the ping counter to the outside, handle more logic inside the
   Peer.

Relates to #430.
2020-01-20 19:37:17 +03:00
Roman Khimov
0ba6b2a754 network: introduce peer sending queues
Two queues for high-priority and ordinary messages. Fixes #590. These queues
are deliberately made small to avoid buffer bloat problem, there is gonna be
another queueing layer above them to compensate for that. The queues are
designed to be synchronous in enqueueing, async capabilities are to be added
layer above later.
2020-01-20 17:23:26 +03:00
Roman Khimov
7f0882767c network: remove useless Done() method from the peer
It's internal state of the peer that no one should care about.
2020-01-20 17:23:26 +03:00
Roman Khimov
907a236285 network: move per-peer goroutines into the TCPPeer
As they're directly tied to it.
2020-01-20 17:23:26 +03:00
Vsevolod Brekelov
4e6ed9021c network: add ping pong processing
add pingInterval same as used in ref C# implementation with the same logic
add pingTimeout which is used to check whether pong received. If not -- drop the peer.
add pingLimit which is hardcoded to 4 in TCPPeer. It's limit for unsuccessful ping/pong calls (where pong wasn't received in pingTimeout interval)
2020-01-17 13:24:14 +03:00
Roman Khimov
e859e03240 network: split Peer's NetAddr into RemoteAddr and PeerAddr
As they are different things used for different purposes.
2019-11-06 15:26:24 +03:00
Roman Khimov
76c7cff67f network: make node strictly follow handshake procedure
Don't accept other messages before handshake is completed, check handshake
message sequence.
2019-09-16 16:32:04 +03:00
Roman Khimov
8d9bc83214 util: drop Endpoint structure, fix #321
I think it's useless, buggy and hides parsing errors for no good reason.
2019-09-09 17:54:38 +03:00
Roman Khimov
a9b9c9226d *: add/fix godoc comments to satisfy golint
Fixes things like:
 * exported type/method/function X should have comment or be unexported
 * comment on exported type/method/function X should be of the form "X ..."
   (with optional leading article)

Refs. #213.
2019-09-03 17:57:51 +03:00
Anthony De Meulemeester
ab2568cc51
Fixed some networking issues (#68)
* Faster persist timer

* fixed networking issues.
2018-04-13 12:14:08 +02:00
Anthony De Meulemeester
aa4bc1b6e8
Node improvements (#47)
* block partial persist

* replaced refactored files with old one.

* removed gokit/log from deps

* Tweaks to not overburden remote nodes with getheaders/getblocks

* Changed Transporter interface to not take the server as argument due to a cause of race warning from the compiler

* started server test suite

* more test + return errors from message handlers

* removed --race from build

* Little improvements.
2018-03-14 10:36:59 +01:00
Anthony De Meulemeester
aa4bd34b6b
Node network improvements (#45)
* small improvements.

* Fixed datarace + cleanup node and peer

* bumped version.

* removed race flag to pass build
2018-03-10 13:04:06 +01:00
Anthony De Meulemeester
4023661cf1
Refactor of the Go node (#44)
* added headersOp for safely processing headers

* Better handling of protocol messages.

* housekeeping + cleanup tests

* Added more blockchain logic + unit tests

* fixed unreachable error.

* added structured logging for all (node) components.

* added relay flag + bumped version
2018-03-09 16:55:25 +01:00
Steven Jack
42195b1af4 Refactor peer message sending into single interface method .Send() (#40)
* Adds Send method to Peer interface and removes redundant methods

* Fix imports

* Bumps version
2018-03-04 14:47:56 +01:00
Anthony De Meulemeester
b6d8271b8d
Fixed header sync issue (#17)
* headers can now sync till infinity

* fixed empty hashStop getBlock payload + test

* added more test + more binary decoding/encoding

* bump version
2018-02-07 15:16:50 +01:00
Anthony De Meulemeester
046494dd68
Implemented processing headers + added leveldb as a dependency. (#16)
* Implemented processing headers + added leveldb as a dependency.

* version 0.7.0

* put glide get and install in build_cli section
2018-02-06 07:43:32 +01:00
Anthony De Meulemeester
628656483a
bug fixes (TCP + uint256) and started core part (#14)
* Fixed TCP read + Uint256 reversed array + started on some core pieces

* Disabled some debug output + muted test

* 0.5.0
2018-02-04 20:54:51 +01:00
Anthony De Meulemeester
6e3f1ec43e
Refactor tcp transport (#11)
* refactored tcp transport

* return errors on outgoing messages

* TCP transport should report its error after reading from connection

* handle error returned from peer transport

* bump version

* cleaned up error
2018-02-02 11:02:25 +01:00
Charlie Revett
dd94086a22
CircleCI 2 & Releases (#9) 2018-02-01 10:54:23 -08:00
anthdm
b416a51db7 tweaked TCP transport + finished version + verack. 2018-02-01 14:53:49 +01:00
anthdm
626a82b93e deleted proxy functions + moved TCPPeer to tcp file 2018-01-31 22:14:13 +01:00
anthdm
861882ff83 refactor server RPC. 2018-01-31 20:11:08 +01:00
anthdm
0c9d2dd04e Block message + handle the length of the user agent better. 2018-01-31 09:27:08 +01:00
anthdm
d4a96267c6 Refactor version msg 2018-01-29 19:17:49 +01:00
anthdm
1821ff1a0e handle address list message. 2018-01-28 14:59:32 +01:00
anthdm
d7826a4d43 wip implement inv command 2018-01-27 16:47:43 +01:00
anthdm
058459c65d Added initiator field to peer to detect in the peer initiated the connected. 2018-01-26 21:42:43 +01:00
anthdm
536a499236 initial commit. 2018-01-26 19:04:13 +01:00