We're paging these hashes, so we need a previous full page and a current one
plus some cache for various requests. Storing 1M of hashes is 32M of memory
and it grows quickly. It also seriously affects node startup time, most of
what it's doing is reading these hashes, the longer the chain the more time it
needs to do that.
Notice that this doesn't change the underlying DB scheme in any way.
If we only have genesis block (or <2000 headers) then we might as well use
generic logic below with zero targetHash because genesis block has zero
PrevHash (and its hash will naturally be the last on the chain going
backwards).
Let upper-layer APIs like actor.Send() return it as well. Server can return
"already exists" which is an error and yet at the same time a very special
one, in many cases it means we can proceed with waiting for the TX to settle.
Sometimes it can be hard to persist all changes at ones, the process
can take almost all RAM and a lot of time. Here's the example of reset
for mainnet from 2.4M to 1:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:16:48.236+0300 INFO MaxBlockSize is not set or wrong, setting default value {"MaxBlockSize": 262144}
2022-11-20T17:16:48.236+0300 INFO MaxBlockSystemFee is not set or wrong, setting default value {"MaxBlockSystemFee": 900000000000}
2022-11-20T17:16:48.237+0300 INFO MaxTransactionsPerBlock is not set or wrong, using default value {"MaxTransactionsPerBlock": 512}
2022-11-20T17:16:48.237+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:16:48.240+0300 INFO restoring blockchain {"version": "0.2.6"}
2022-11-20T17:16:48.297+0300 INFO initialize state reset {"target height": 1}
2022-11-20T17:16:48.300+0300 INFO trying to reset blocks, transactions and AERs
2022-11-20T17:19:29.313+0300 INFO blocks, transactions ans AERs are reset {"took": "2m41.015126493s", "keys": 3958420}
...
```
To avoid OOM killer, split blocks reset into multiple stages. It increases
operation time due to intermediate DB persists, but makes things cleaner, the
result for almost the same DB height with the new approach:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:39:42.023+0300 INFO MaxBlockSize is not set or wrong, setting default value {"MaxBlockSize": 262144}
2022-11-20T17:39:42.023+0300 INFO MaxBlockSystemFee is not set or wrong, setting default value {"MaxBlockSystemFee": 900000000000}
2022-11-20T17:39:42.023+0300 INFO MaxTransactionsPerBlock is not set or wrong, using default value {"MaxTransactionsPerBlock": 512}
2022-11-20T17:39:42.023+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:39:42.026+0300 INFO restoring blockchain {"version": "0.2.6"}
2022-11-20T17:39:42.071+0300 INFO initialize state reset {"target height": 1}
2022-11-20T17:39:42.073+0300 INFO trying to reset blocks, transactions and AERs
2022-11-20T17:40:11.735+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 1, "took": "29.66363737s", "keys": 210973}
2022-11-20T17:40:33.574+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 2, "took": "21.839208683s", "keys": 241203}
2022-11-20T17:41:29.325+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 3, "took": "55.750698386s", "keys": 250593}
2022-11-20T17:42:12.532+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 4, "took": "43.205892757s", "keys": 321896}
2022-11-20T17:43:07.978+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 5, "took": "55.445398156s", "keys": 334822}
2022-11-20T17:43:35.603+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 6, "took": "27.625292032s", "keys": 317131}
2022-11-20T17:43:51.747+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 7, "took": "16.144359017s", "keys": 355832}
2022-11-20T17:44:05.176+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 8, "took": "13.428733899s", "keys": 357690}
2022-11-20T17:44:32.895+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 9, "took": "27.718548783s", "keys": 393356}
2022-11-20T17:44:51.814+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 10, "took": "18.917954658s", "keys": 366492}
2022-11-20T17:45:07.208+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 11, "took": "15.392642196s", "keys": 326030}
2022-11-20T17:45:18.776+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 12, "took": "11.568255716s", "keys": 299884}
2022-11-20T17:45:25.862+0300 INFO last batch of removed blocks, transactions and AERs is persisted {"batches persisted": 13, "took": "7.086079594s", "keys": 190399}
2022-11-20T17:45:25.862+0300 INFO blocks, transactions ans AERs are reset {"took": "5m43.791214084s", "overall persisted keys": 3966301}
...
```
We need to keep the headers information consistent with header batches
and headers. This comit fixes the bug with failing blockchain
initialization on recovering from state reset interrupted after the
second stage (blocks/txs/AERs removal):
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T16:28:29.437+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T16:28:29.440+0300 INFO restoring blockchain {"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: could not get header 1898cd356a4a2688ed1c6c7ba1fd6ba7d516959d8add3f8dd26232474d4539bd: key not found
```
Don't use cache because it's not yet initialized. Also, perform
safety checks only if state reset wasn't yet started. These fixes
alloww to solve the following problem while recovering from
interrupted state reset:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T15:51:31.431+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T15:51:31.434+0300 INFO restoring blockchain {"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: current block height is 0, can't reset state to height 83000
```
We don't use all of the Stack functionality for it, so drop useless methods
and avoid some interface conversions. It increases single-node TPS by about
0.9%, so nothing really important there, but not a bad change either. Maybe it
can be reworked again with generics though.
Blockchain's subscriptions, unsubscriptions and notifications are
handled by a single notificationDispatcher routine. Thus, on attempt
to send the subsequent event to Blockchain's subscribers, dispatcher
can't handle subscriptions\unsubscriptions. Make subscription and
unsubscription to be a non-blocking operation for blockchain on the
server side, otherwise it may cause the dispatcher locks.
To achieve this, use a separate lock for those code that make calls
to blockchain's subscription API and for subscription counters on
the server side.
Small (especially dockerized/virtualized) networks often start all nodes at
ones and then we see a lot of connection flapping in the log. This happens
because nodes try to connect to each other simultaneously, establish two
connections, then each one finds a duplicate and drops it, but this can be
different duplicate connections on other sides, so they retry and it all
happens for some time. Eventually everything settles, but we have a lot of
garbage in the log and a lot of useless attempts.
This random waiting timeout doesn't change the logic much, adds a minimal
delay, but increases chances for both nodes to establish a proper single
connection on both sides to only then see another one and drop it on both
sides as well. It leads to almost no flapping in small networks, doesn't
affect much bigger ones. The delay is close to unnoticeable especially if
there is something in the DB for node to process during startup.
Consider mainnet, it has an AttemptConnPeers of 20, so may already have 3
peers and request 20 more, then have 4th connected and attemtp 20 more again,
this leads to a huge number of connections easily.
Consider initial connection phase for public networks:
* simultaneous connections to seeds
* very quick handshakes
* got five handshaked peers and some getaddr requests sent
* but addr replies won't trigger new connections
* so we can stay with just five connections until any of them breaks or a
(long) address checking timer fires
This new timers solves the problem, it's adaptive at the same time. If we have
enough peers we won't be waking up often.
* treat connected/handshaked peers separately in the discoverer, save
"original" address for connected ones, it can be a name instead of IP and
it's important to keep it to avoid reconnections
* store name->IP mapping for seeds if and when they're connected to avoid
reconnections
* block seed if it's detected to be our own node (which is often the case for
small private networks)
* add an event for handshaked peers in the server, connected but
non-handshaked ones are not really helpful for MinPeers or GetAddr logic
Fixes#2796.
If VUB-th block is received, we still can't guaranty that transaction
wasn't accepted to chain. Back this situation by rolling back to a
poll-based waiter.
Do not block subscribers until the unsubscription request to RPC server
is completed. Otherwise, another notification may be received from the
RPC server which will block the unsubscription process.
At the same time, fix event-based waiter. We must not block the receiver
channel during unsubscription because there's a chance that subsequent
event will be sent by the server. We need to read this event in order not
to block the WSClient's readloop.
Bad contract -> no contract. Unfortunately we've got a broken
6f1837723768f27a6f6a14452977e3e0e264f2cc contract on the mainnet which can't
be decoded (even though it had been saved successfully), so this is a
temporary fix for #2801 to be able to start mainnet node after shutdown.