They can stay in the memory pool forever because consensus process will never
accept these transactions (and maybe even block consensus process at all).
We're paging these hashes, so we need a previous full page and a current one
plus some cache for various requests. Storing 1M of hashes is 32M of memory
and it grows quickly. It also seriously affects node startup time, most of
what it's doing is reading these hashes, the longer the chain the more time it
needs to do that.
Notice that this doesn't change the underlying DB scheme in any way.
If we only have genesis block (or <2000 headers) then we might as well use
generic logic below with zero targetHash because genesis block has zero
PrevHash (and its hash will naturally be the last on the chain going
backwards).
Let upper-layer APIs like actor.Send() return it as well. Server can return
"already exists" which is an error and yet at the same time a very special
one, in many cases it means we can proceed with waiting for the TX to settle.
Sometimes it can be hard to persist all changes at ones, the process
can take almost all RAM and a lot of time. Here's the example of reset
for mainnet from 2.4M to 1:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:16:48.236+0300 INFO MaxBlockSize is not set or wrong, setting default value {"MaxBlockSize": 262144}
2022-11-20T17:16:48.236+0300 INFO MaxBlockSystemFee is not set or wrong, setting default value {"MaxBlockSystemFee": 900000000000}
2022-11-20T17:16:48.237+0300 INFO MaxTransactionsPerBlock is not set or wrong, using default value {"MaxTransactionsPerBlock": 512}
2022-11-20T17:16:48.237+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:16:48.240+0300 INFO restoring blockchain {"version": "0.2.6"}
2022-11-20T17:16:48.297+0300 INFO initialize state reset {"target height": 1}
2022-11-20T17:16:48.300+0300 INFO trying to reset blocks, transactions and AERs
2022-11-20T17:19:29.313+0300 INFO blocks, transactions ans AERs are reset {"took": "2m41.015126493s", "keys": 3958420}
...
```
To avoid OOM killer, split blocks reset into multiple stages. It increases
operation time due to intermediate DB persists, but makes things cleaner, the
result for almost the same DB height with the new approach:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:39:42.023+0300 INFO MaxBlockSize is not set or wrong, setting default value {"MaxBlockSize": 262144}
2022-11-20T17:39:42.023+0300 INFO MaxBlockSystemFee is not set or wrong, setting default value {"MaxBlockSystemFee": 900000000000}
2022-11-20T17:39:42.023+0300 INFO MaxTransactionsPerBlock is not set or wrong, using default value {"MaxTransactionsPerBlock": 512}
2022-11-20T17:39:42.023+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:39:42.026+0300 INFO restoring blockchain {"version": "0.2.6"}
2022-11-20T17:39:42.071+0300 INFO initialize state reset {"target height": 1}
2022-11-20T17:39:42.073+0300 INFO trying to reset blocks, transactions and AERs
2022-11-20T17:40:11.735+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 1, "took": "29.66363737s", "keys": 210973}
2022-11-20T17:40:33.574+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 2, "took": "21.839208683s", "keys": 241203}
2022-11-20T17:41:29.325+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 3, "took": "55.750698386s", "keys": 250593}
2022-11-20T17:42:12.532+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 4, "took": "43.205892757s", "keys": 321896}
2022-11-20T17:43:07.978+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 5, "took": "55.445398156s", "keys": 334822}
2022-11-20T17:43:35.603+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 6, "took": "27.625292032s", "keys": 317131}
2022-11-20T17:43:51.747+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 7, "took": "16.144359017s", "keys": 355832}
2022-11-20T17:44:05.176+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 8, "took": "13.428733899s", "keys": 357690}
2022-11-20T17:44:32.895+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 9, "took": "27.718548783s", "keys": 393356}
2022-11-20T17:44:51.814+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 10, "took": "18.917954658s", "keys": 366492}
2022-11-20T17:45:07.208+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 11, "took": "15.392642196s", "keys": 326030}
2022-11-20T17:45:18.776+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 12, "took": "11.568255716s", "keys": 299884}
2022-11-20T17:45:25.862+0300 INFO last batch of removed blocks, transactions and AERs is persisted {"batches persisted": 13, "took": "7.086079594s", "keys": 190399}
2022-11-20T17:45:25.862+0300 INFO blocks, transactions ans AERs are reset {"took": "5m43.791214084s", "overall persisted keys": 3966301}
...
```
We need to keep the headers information consistent with header batches
and headers. This comit fixes the bug with failing blockchain
initialization on recovering from state reset interrupted after the
second stage (blocks/txs/AERs removal):
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T16:28:29.437+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T16:28:29.440+0300 INFO restoring blockchain {"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: could not get header 1898cd356a4a2688ed1c6c7ba1fd6ba7d516959d8add3f8dd26232474d4539bd: key not found
```
Don't use cache because it's not yet initialized. Also, perform
safety checks only if state reset wasn't yet started. These fixes
alloww to solve the following problem while recovering from
interrupted state reset:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T15:51:31.431+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T15:51:31.434+0300 INFO restoring blockchain {"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: current block height is 0, can't reset state to height 83000
```
It's useful to reproduce different execution problems including executions
stopped because of GAS limit. Satoshi representation is deliberately used,
because that's what is usually found in logs.
We don't use all of the Stack functionality for it, so drop useless methods
and avoid some interface conversions. It increases single-node TPS by about
0.9%, so nothing really important there, but not a bad change either. Maybe it
can be reworked again with generics though.
Blockchain's subscriptions, unsubscriptions and notifications are
handled by a single notificationDispatcher routine. Thus, on attempt
to send the subsequent event to Blockchain's subscribers, dispatcher
can't handle subscriptions\unsubscriptions. Make subscription and
unsubscription to be a non-blocking operation for blockchain on the
server side, otherwise it may cause the dispatcher locks.
To achieve this, use a separate lock for those code that make calls
to blockchain's subscription API and for subscription counters on
the server side.
Small (especially dockerized/virtualized) networks often start all nodes at
ones and then we see a lot of connection flapping in the log. This happens
because nodes try to connect to each other simultaneously, establish two
connections, then each one finds a duplicate and drops it, but this can be
different duplicate connections on other sides, so they retry and it all
happens for some time. Eventually everything settles, but we have a lot of
garbage in the log and a lot of useless attempts.
This random waiting timeout doesn't change the logic much, adds a minimal
delay, but increases chances for both nodes to establish a proper single
connection on both sides to only then see another one and drop it on both
sides as well. It leads to almost no flapping in small networks, doesn't
affect much bigger ones. The delay is close to unnoticeable especially if
there is something in the DB for node to process during startup.
Consider mainnet, it has an AttemptConnPeers of 20, so may already have 3
peers and request 20 more, then have 4th connected and attemtp 20 more again,
this leads to a huge number of connections easily.