neo-go

mirror of https://github.com/nspcc-dev/neo-go.git synced 2024-11-26 19:42:23 +00:00

Author	SHA1	Message	Date
Anna Shaleva	7028930bd6	network: do not use error channel to start network srv It's an obsolete thing, we have a logger and it perfectly suits our needs. Signed-off-by: Anna Shaleva <shaleva.ann@nspcc.ru>	2023-08-01 17:22:01 +03:00
Anna Shaleva	0cbef58b3c	consensus: enqueue newly created blocks Do not add them directly to chain, it will be done by the block queue manager. Close https://github.com/nspcc-dev/neo-go/issues/2923. However, this commit is not valid without https://github.com/roman-khimov/dbft/pull/4. It's the neo-go's duty to initialize consensus after subsequent block addition; the dBFT itself must wait for the neo-go to complete the block addition and notify the dBFT, so that it can initialize at 0-th view to collect the next block.	2023-03-15 17:37:47 +03:00
Anna Shaleva	04d0b45ceb	network: move blockqueue to a separate package	2023-03-15 17:37:47 +03:00
Anna Shaleva	91a77c25a2	network: refactor blockqueuer interface Remove unused argument.	2023-03-15 17:37:47 +03:00
Roman Khimov	4f708c037d	network: drain send queues on peer disconnection Fix potential memory leak with a lot of connected clients that keep requesting things from node and then disconnect.	2023-02-21 16:19:06 +03:00
Anna Shaleva	da757fa387	network: fix grammar typo in the error message	2023-02-20 11:08:07 +03:00
Anna Shaleva	28927228f0	*: adjust subscription-related doc Add a warning about received events modification where applicable.	2023-01-17 17:11:19 +03:00
Anna Shaleva	9b364aa7ee	network: do not allow to request invalid block count The problem is in peer disconnection due to invalid GetBlockByIndex payload (the logs are from some patched neo-go version): ``` Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.490Z INFO new peer connected {"addr": "10.78.69.115:50846", "peerCount": 3} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.490Z WARN peer disconnected {"addr": "10.78.69.115:50846", "error": "invalid block count", "peerCount": 2} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.490Z INFO started protocol {"addr": "10.78.69.115:50846", "userAgent": "/NEO-GO:1.0.0/", "startHeight": 0, "id": 1339571820} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.491Z INFO new peer connected {"addr": "10.78.69.115:50856", "peerCount": 3} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.492Z WARN peer disconnected {"addr": "10.78.69.115:50856", "error": "invalid block count", "peerCount": 2} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.492Z INFO started protocol {"addr": "10.78.69.115:50856", "userAgent": "/NEO-GO:1.0.0/", "startHeight": 0, "id": 1339571820} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.492Z INFO new peer connected {"addr": "10.78.69.115:50858", "peerCount": 3} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.493Z INFO started protocol {"addr": "10.78.69.115:50858", "userAgent": "/NEO-GO:1.0.0/", "startHeight": 0, "id": 1339571820} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.493Z WARN peer disconnected {"addr": "10.78.69.115:50858", "error": "invalid block count", "peerCount": 2} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.494Z INFO new peer connected {"addr": "10.78.69.115:50874", "peerCount": 3} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.494Z INFO started protocol {"addr": "10.78.69.115:50874", "userAgent": "/NEO-GO:1.0.0/", "startHeight": 0, "id": 1339571820} Dec 15 16:02:39 glagoli neo-go[928530]: 2022-12-15T16:02:39.494Z WARN peer disconnected {"addr": "10.78.69.115:50874", "error": "invalid block count", "peerCount": 2} ``` GetBlockByIndex payload can't be decoded, and the only possible cause is zero (or <-1, but it's probably not the case) block count requested. Error is improved as far.	2022-12-28 13:04:56 +03:00
Anna Shaleva	c0a453a53b	network: adjust requestBlocks logic If the lastQueued block index is the same as the one we'd like to request in payload, then we need to increment the payload's count.	2022-12-28 12:50:30 +03:00
Roman Khimov	e79dec15f9	*: use zap.Stringer instead of zap.String where it can be used It's a bit more efficient in case we're not logging the message (mostly for debug), makes the code somewhat simpler as well.	2022-12-13 12:44:54 +03:00
Roman Khimov	7589733017	config: add a special Blockchain type to configure Blockchain And include some node-specific configurations there with backwards compatibility. Note that in the future we'll remove Ledger's fields from the ProtocolConfiguration and it'll be possible to access them in Blockchain directly (not via .Ledger). The other option tried was using two configuration types separately, but that incurs more changes to the codebase, single structure that behaves almost like the old one is better for backwards compatibility. Fixes #2676.	2022-12-07 17:35:53 +03:00
Anna Shaleva	82221b0ca7	*: fix Neo and NeoGo misuses	2022-12-07 17:29:09 +03:00
Anna Shaleva	54c2aa8582	config: move P2P options to a separate config section And convert time-related settings to a Duration format along the way.	2022-12-07 13:06:05 +03:00
Anna Shaleva	9cf6cc61f4	network: allow multiple bind addresses for server And replace Transporter.Address() with Transporter.HostPort() along the way.	2022-12-07 13:06:03 +03:00
Roman Khimov	c2adbf768b	config: add TimePerBlock to replace SecondsPerBlock It's more generic and convenient than MillisecondsPerBlock. This setting is made in backwards-compatible fashion, but it'll override SecondsPerBlock if both are used. Configurations are specifically not changed here, it's important to check compatibility. Fixes #2675.	2022-12-02 19:52:14 +03:00
Roman Khimov	0ad6e295ea	core: make GetHeaderHash accept uint32 It should've always been this way because block indexes are uint32.	2022-11-25 14:30:51 +03:00
Roman Khimov	b8c09f509f	network: add random slight delay to connection attempts Small (especially dockerized/virtualized) networks often start all nodes at once and then we see a lot of connection flapping in the log. This happens because nodes try to connect to each other simultaneously, establish two connections, then each one finds a duplicate and drops it, but this can be different duplicate connections on other sides, so they retry and it all happens for some time. Eventually everything settles, but we have a lot of garbage in the log and a lot of useless attempts. This random waiting timeout doesn't change the logic much, adds a minimal delay, but increases chances for both nodes to establish a proper single connection on both sides to only then see another one and drop it on both sides as well. It leads to almost no flapping in small networks, doesn't affect much bigger ones. The delay is close to unnoticeable especially if there is something in the DB for node to process during startup.	2022-11-17 18:42:43 +03:00
Roman Khimov	075a54192c	network: don't try too many connections Consider mainnet, it has an AttemptConnPeers of 20, so may already have 3 peers and request 20 more, then have 4th connected and attempt 20 more again, this leads to a huge number of connections easily.	2022-11-17 18:03:04 +03:00
Roman Khimov	6bce973ac2	network: drop duplication check from handleAddrCmd() It was relevant with the queue-based discoverer, now it's not, discoverer handles this internally.	2022-11-17 17:42:36 +03:00
Roman Khimov	1c7487b8e4	network: add a timer to check for peers Consider initial connection phase for public networks: * simultaneous connections to seeds * very quick handshakes * got five handshaked peers and some getaddr requests sent * but addr replies won't trigger new connections * so we can stay with just five connections until any of them breaks or a (long) address checking timer fires This new timer solves the problem, it's adaptive at the same time. If we have enough peers we won't be waking up often.	2022-11-17 17:32:05 +03:00
Roman Khimov	23f118a1a9	network: rework discoverer/server interaction * treat connected/handshaked peers separately in the discoverer, save "original" address for connected ones, it can be a name instead of IP and it's important to keep it to avoid reconnections * store name->IP mapping for seeds if and when they're connected to avoid reconnections * block seed if it's detected to be our own node (which is often the case for small private networks) * add an event for handshaked peers in the server, connected but non-handshaked ones are not really helpful for MinPeers or GetAddr logic Fixes #2796.	2022-11-17 17:07:19 +03:00
Roman Khimov	6ba4afc977	network: consider handshaked peers only when comparing with MinPeers We don't know a lot about non-handshaked ones, so it's safer to try more connections.	2022-11-17 16:40:29 +03:00
Anna Shaleva	6f3a0a6b4c	network: adjust warning for deposit expiration Provide additional info for better user experience.	2022-11-15 14:16:34 +03:00
Roman Khimov	c405092953	network: pre-filter transactions going into dbft Drop some load from dbft loop during consensus process.	2022-11-11 15:32:51 +03:00
Roman Khimov	e19d867d4e	Merge pull request #2761 from nspcc-dev/fancy-getaddr Fancy getaddr	2022-10-25 16:51:38 +07:00
Roman Khimov	28f54d352a	network: do getaddr requests periodically, fix #2745 Every 1000 blocks seems to be OK for big networks (that only had done some initial requests previously and then effectively never requested addresses again because there was a sufficient number of addresses), won't hurt smaller ones as well (that effectively keep doing this on every connect/disconnect, peer changes are very rare there, but when they happen we want to have some quick reaction to these changes).	2022-10-24 15:10:51 +03:00
Roman Khimov	9efc110058	network: it is 42 32 is a very good number, but we all know 42 is a better one. And it can even be proven by tests with higher peaking TPS values. You may wonder why is it so good? Because we're using packet-switching networks mostly and a packet is a packet almost irrespectively of how big it is. Yet a packet has some maximum possible size (hi, MTU) and this size most of the time is 1500 (or a little less than that, hi VPN). Subtract IP header (20 for IPv4 or 40 for IPv6 not counting options), TCP header (another 20) and Neo message/payload headers (~8 for this case) and we have just a little more than 1400 bytes for our dear hashes. Which means that in a single packet most of the time we can have 42-44 of them, maybe 45. Choosing between these numbers is not hard then.	2022-10-24 14:44:19 +03:00
Roman Khimov	9d6b18adec	network: drop minPoolCount magic constant We have AttemptConnPeers that is closely related, the more we have there the bigger the network supposedly is, so it's much better than magic minPoolCount.	2022-10-24 14:36:10 +03:00
Roman Khimov	af24051bf5	network: sleep a bit before retrying reconnects If Dial() is to exit quickly we can end up in a retry loop eating CPU.	2022-10-24 14:34:48 +03:00
Roman Khimov	f42b8e78fc	Merge pull request #2758 from nspcc-dev/check-inflight-tx-invs network: check inv against currently processed transactions	2022-10-24 14:16:33 +07:00
Roman Khimov	e26055190e	network: check inv against currently processed transactions Sometimes we already have it, but it's not yet processed, so we can save on getdata request. It only affects very high-speed networks like 4-1 scenario and it doesn't affect it a lot, but still we can do it.	2022-10-21 21:16:18 +03:00
Roman Khimov	cfb5058018	network: batch getdata replies This is not exactly the protocol-level batching as was tried in #1770 and proposed by neo-project/neo#2365, but it's a TCP-level change in that we now Write() a set of messages and given that Go sets up TCP sockets with TCP_NODELAY by default this is a substantial change, we have fewer packets generated with the same amount of data. It doesn't change anything on properly connected networks, but the ones with delays benefit from it a lot. This also improves queueing because we no longer generate 32 messages to deliver on transaction's GetData, it's just one stream of bytes with 32 messages inside. Do the same with GetBlocksByIndex, we can have a lot of messages there too. But don't forget about potential peer DoS attacks, if a peer is to request a lot of big blocks we need to flush them before we process the whole set.	2022-10-21 17:16:32 +03:00
Roman Khimov	e1b5ac9b81	network: separate tx handling from msg handling This allows to naturally scale transaction processing if we have some peer that is sending a lot of them while others are mostly silent. It also can help somewhat in the event we have 50 peers that all send transactions. 4+1 scenario benefits a lot from it, while 7+2 slows down a little. Delayed scenarios don't care. Surprisingly, this also makes disconnects (#2744) much more rare, 4-node scenario almost never sees it now. Most probably this is the case where peers affect each other a lot, single-threaded transaction receiver can be slow enough to trigger some timeout in getdata handler of its peer (because it tries to push a number of replies).	2022-10-21 12:11:24 +03:00
Roman Khimov	e003b67418	network: reuse inventory hash list for request hashes Microoptimization, we can do this because we only use them in handleInvCmd().	2022-10-21 11:28:40 +03:00
Roman Khimov	0f625f04f0	Merge pull request #2748 from nspcc-dev/stop-tx-flow network/consensus: use new dbft StopTxFlow callback	2022-10-18 16:29:37 +07:00
Roman Khimov	73ce898e27	network/consensus: use new dbft StopTxFlow callback It makes sense in general (further narrowing down the time window when transactions are processed by consensus thread) and it improves block times a little too, especially in the 7+2 scenario. Related to #2744.	2022-10-18 11:06:20 +03:00
Roman Khimov	2791127ee4	network: add prometheus histogram with cmd processing time It can be useful to detect some performance issues.	2022-10-17 22:51:16 +03:00
Roman Khimov	73079745ab	Merge pull request #2746 from nspcc-dev/optimize-tx-callbacks network: only call tx callback if we're waiting for transactions	2022-10-17 16:39:41 +07:00
Roman Khimov	dce9f80585	Merge pull request #2743 from nspcc-dev/log-fan-out Logarithmic gossip fan out	2022-10-14 23:18:34 +07:00
Roman Khimov	4dd3fd4ac0	network: only call tx callback if we're waiting for transactions Until the consensus process starts for a new block and until it really needs some transactions we can spare some cycles by not delivering transactions to it. In tests this doesn't affect TPS, but makes block delays a bit more stable. Related to #2744, I think it also may cause timeouts during transaction processing (waiting on the consensus process channel while it does something dBFT-related).	2022-10-14 18:45:48 +03:00
Roman Khimov	65f0fadddb	network: register peer only if it's not a duplicate	2022-10-14 15:53:32 +03:00
Roman Khimov	851cbc7dab	network: implement adaptive peer requests When the network is big enough, MinPeers may be suboptimal for good network connectivity, but if we know the network size we can do some estimation on the number of sufficient peers.	2022-10-14 15:53:32 +03:00
Roman Khimov	c17b2afab5	network: add BroadcastFactor to control gossip, fix #2678	2022-10-14 15:53:32 +03:00
Roman Khimov	215e8704f1	network: simplify discoverer, make it almost a lib We already have two basic lists: connected and unconnected nodes, we don't need an additional channel and we don't need a goroutine to handle it.	2022-10-14 15:53:32 +03:00
Roman Khimov	c1ef326183	network: re-add addresses to the pool on UnregisterConnectedAddr That's what we do anyway, but this way we can be a bit more efficient.	2022-10-14 14:12:33 +03:00
Roman Khimov	631f166709	network: broadcast to log-dependent number of nodes Fixes #608.	2022-10-14 14:12:33 +03:00
Roman Khimov	dc62046019	network: add network size estimation metric	2022-10-12 22:29:55 +03:00
Roman Khimov	bcf77c3c42	network: filter out not-yet-ready nodes when broadcasting They can fail right in the getPeers or they can fail later when packet send is attempted. Of course they can complete handshake in-between these events, but most likely they won't and we'll waste more resources on this attempt. So rule out bad peers immediately.	2022-10-12 16:51:01 +03:00
Roman Khimov	137f2cb192	network: deduplicate TCPPeer code a bit context.Background() is never canceled and has no deadline, so we can avoid duplicating some code.	2022-10-12 15:43:31 +03:00
Roman Khimov	104da8caff	network: broadcast messages, enqueue packets Drop EnqueueP2PPacket, replace EnqueueHPPacket with EnqueueHPMessage. We use Enqueue* when we have a specific per-peer message, it makes zero sense duplicating serialization code for it (unlike Broadcast*).	2022-10-12 15:39:20 +03:00

580 commits