Sometimes it can be hard to persist all changes at ones, the process
can take almost all RAM and a lot of time. Here's the example of reset
for mainnet from 2.4M to 1:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:16:48.236+0300 INFO MaxBlockSize is not set or wrong, setting default value {"MaxBlockSize": 262144}
2022-11-20T17:16:48.236+0300 INFO MaxBlockSystemFee is not set or wrong, setting default value {"MaxBlockSystemFee": 900000000000}
2022-11-20T17:16:48.237+0300 INFO MaxTransactionsPerBlock is not set or wrong, using default value {"MaxTransactionsPerBlock": 512}
2022-11-20T17:16:48.237+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:16:48.240+0300 INFO restoring blockchain {"version": "0.2.6"}
2022-11-20T17:16:48.297+0300 INFO initialize state reset {"target height": 1}
2022-11-20T17:16:48.300+0300 INFO trying to reset blocks, transactions and AERs
2022-11-20T17:19:29.313+0300 INFO blocks, transactions ans AERs are reset {"took": "2m41.015126493s", "keys": 3958420}
...
```
To avoid OOM killer, split blocks reset into multiple stages. It increases
operation time due to intermediate DB persists, but makes things cleaner, the
result for almost the same DB height with the new approach:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:39:42.023+0300 INFO MaxBlockSize is not set or wrong, setting default value {"MaxBlockSize": 262144}
2022-11-20T17:39:42.023+0300 INFO MaxBlockSystemFee is not set or wrong, setting default value {"MaxBlockSystemFee": 900000000000}
2022-11-20T17:39:42.023+0300 INFO MaxTransactionsPerBlock is not set or wrong, using default value {"MaxTransactionsPerBlock": 512}
2022-11-20T17:39:42.023+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:39:42.026+0300 INFO restoring blockchain {"version": "0.2.6"}
2022-11-20T17:39:42.071+0300 INFO initialize state reset {"target height": 1}
2022-11-20T17:39:42.073+0300 INFO trying to reset blocks, transactions and AERs
2022-11-20T17:40:11.735+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 1, "took": "29.66363737s", "keys": 210973}
2022-11-20T17:40:33.574+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 2, "took": "21.839208683s", "keys": 241203}
2022-11-20T17:41:29.325+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 3, "took": "55.750698386s", "keys": 250593}
2022-11-20T17:42:12.532+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 4, "took": "43.205892757s", "keys": 321896}
2022-11-20T17:43:07.978+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 5, "took": "55.445398156s", "keys": 334822}
2022-11-20T17:43:35.603+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 6, "took": "27.625292032s", "keys": 317131}
2022-11-20T17:43:51.747+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 7, "took": "16.144359017s", "keys": 355832}
2022-11-20T17:44:05.176+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 8, "took": "13.428733899s", "keys": 357690}
2022-11-20T17:44:32.895+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 9, "took": "27.718548783s", "keys": 393356}
2022-11-20T17:44:51.814+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 10, "took": "18.917954658s", "keys": 366492}
2022-11-20T17:45:07.208+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 11, "took": "15.392642196s", "keys": 326030}
2022-11-20T17:45:18.776+0300 INFO intermediate batch of removed blocks, transactions and AERs is persisted {"batches persisted": 12, "took": "11.568255716s", "keys": 299884}
2022-11-20T17:45:25.862+0300 INFO last batch of removed blocks, transactions and AERs is persisted {"batches persisted": 13, "took": "7.086079594s", "keys": 190399}
2022-11-20T17:45:25.862+0300 INFO blocks, transactions ans AERs are reset {"took": "5m43.791214084s", "overall persisted keys": 3966301}
...
```
We need to keep the headers information consistent with header batches
and headers. This comit fixes the bug with failing blockchain
initialization on recovering from state reset interrupted after the
second stage (blocks/txs/AERs removal):
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T16:28:29.437+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T16:28:29.440+0300 INFO restoring blockchain {"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: could not get header 1898cd356a4a2688ed1c6c7ba1fd6ba7d516959d8add3f8dd26232474d4539bd: key not found
```
Don't use cache because it's not yet initialized. Also, perform
safety checks only if state reset wasn't yet started. These fixes
alloww to solve the following problem while recovering from
interrupted state reset:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T15:51:31.431+0300 INFO MaxValidUntilBlockIncrement is not set or wrong, using default value {"MaxValidUntilBlockIncrement": 5760}
2022-11-20T15:51:31.434+0300 INFO restoring blockchain {"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: current block height is 0, can't reset state to height 83000
```
When block is being spread through the network we can get a lot of invs with
the same hash. Some more stale nodes may also announce previous or some
earlier block. We can avoid full DB lookup for them and minimize inv handling
time (timeouts in inv handler had happened in #2744).
It doesn't affect tests, just makes node a little less likely to spend some
considerable amount of time in the inv handler.
Blockchain's notificationDispatcher sends events to channels and these
channels must be read from. Unfortunately, regular service shutdown procedure
does unsubscription first (outside of the read loop) and only then drains the
channel. While it waits for unsubscription request to be accepted
notificationDispatcher can try pushing more data into the same channel which
will lead to a deadlock. Reading in the same method solves this, any number of
events can be pushed until unsub channel accepts the data.
Which allows to enable/disable the service, change nodes, keys and other
settings. Unfortunately, atomic.Value doesn't allow Store(nil), so we have to
store a pointer there that can point to nil interface.
It's not an ideal solution, but at least it solves the problem for
now. Caveats:
* consensus only needs one method, so it's mirrored to Blockchain
* rpcsrv uses core.* definition of the StateRoot (so technically it might as
well not have an internal Ledger), but it uses core already unfortunately
1. It's not good for pkg/core to import anything from pkg/neorpc.
2. The type is closely tied to the state package, even though it's not stored
in the DB
Specifying a certain stateroot R as `invoke*historic` RPC-call
parameter, we're willing to perform historic call based on the storage
state of root R. Thus, next block should be of the height h(R)+1. This
allows to use historic functionality for the current blockchain height.
name old time/op new time/op delta
TokenTransferLog_Append-8 93.0µs ±170% 46.8µs ±152% ~ (p=0.053 n=10+9)
name old alloc/op new alloc/op delta
TokenTransferLog_Append-8 53.8kB ± 4% 38.6kB ±39% -28.26% (p=0.004 n=8+10)
name old allocs/op new allocs/op delta
TokenTransferLog_Append-8 384 ± 0% 128 ± 0% -66.67% (p=0.000 n=10+10)
We shouldn't use StoragePrice from Blockchain because its dao doesn't
contain the whole set of changes from previouse transactions in the
current block. Instead, we should use an updated storage price for
each transaction and retrieve the price from cached DAO.
The usage of the Blockchain's one leads to the same ExecFeeFactor within
a single block. What we need is to update ExecFeeFactor after each
transaction invocation, thus, cached DAO should be used as it contains
all relevant changes.
We don't have a need to iterate over them at the moment, but since we're
changing the DB format in the next release anyway let's add this ability also,
just in case.
It couldn't be done previously with two maps and mixed storage, but now all of
the storage changes are located in a single map, so it's trivial to do exact
slice allocations and avoid string->[]byte conversions.
Private DAO is only used in a single thread which means we can safely reuse
key/data buffers most of the time and handle it all in DAO.
Doesn't affect any benchmarks.
Most of the time we don't need locking on the higher-level stores and we drop
them after Persist, so that's what private MemCachedStore is for.
It doesn't improve things in any noticeable way, some ~1% can be observed in
neo-bench under various loads and even less than that in chain processing. But
it seems to be a bit better anyway (less allocations, less locks).
They never return errors, so their interface should reflect that. This allows
to remove quite a lot of useless and never tested code.
Notice that Get still does return an error. It can be made not to do that, but
usually we need to differentiate between successful/unsuccessful accesses
anyway, so this doesn't help much.
Initially I thought of doing it in the next persist cycle, but testing shows
that it needs just ~2-5% of the time MPT GC does, so doing it in the same
cycle doesn't affect anything.
The key idea here is that even though we can't ensure MPT code won't make the
node active again we can order the changes made to the persistent store in
such a way that it practically doesn't matter. What happens is:
* after persist if it's time to collect our garbage we do it synchronously
right in the same thread working the underlying persistent store directly
* all the other node code doesn't see much of it, it works with bc.dao or
layers above it
* if MPT doesn't find some stale deactivated node in the storage it's OK,
it'll recreate it in bc.dao
* if MPT finds it and activates it, it's OK too, bc.dao will store it
* while GC is being performed nothing else changes the persistent store
* all subsequent bc.dao persists only happen after the GC is completed which
means that any changes to the (potentially) deleted nodes have a priority,
it's OK for GC to delete something that'll be recreated with the next
persist cycle
Otherwise it's a simple scheme with node status/last active height stored in
the value. Preliminary tests show that it works ~18% worse than the simple
KeepOnlyLatest scheme, but this seems to be the best result so far.
Fixes#2095.
Add "active" flag into the node data and make the remainder modal, for active
nodes it's a reference counter, for inactive ones the deactivation height is
stored.
Technically, refcounted chains storing just one trie don't need a flag, but
it's a bit simpler this way.
They're misleading now that we have variable number of committee
members/validators. The standby list can be seen in the configuration and the
appropriate numbers can be received from it also.
It should be performed before we're able to process blocks or before we
shutdown (which uses addition lock), fix
panic: assignment to entry in nil map
goroutine 5755 [running]:
github.com/nspcc-dev/neo-go/pkg/core/storage.(*MemoryStore).drop(...)
/home/rik/dev/neo-go/pkg/core/storage/memory_store.go:74
github.com/nspcc-dev/neo-go/pkg/core/storage.(*MemoryStore).PutChangeSet(0xc00093e700, 0xc0002051a0, 0xc000205200, 0x1, 0x1)
/home/rik/dev/neo-go/pkg/core/storage/memory_store.go:100 +0x25c
github.com/nspcc-dev/neo-go/pkg/core/storage.(*MemoryStore).PutBatch(...)
/home/rik/dev/neo-go/pkg/core/storage/memory_store.go:90
github.com/nspcc-dev/neo-go/pkg/core.(*Blockchain).removeOldStorageItems(0xc000206a00)
/home/rik/dev/neo-go/pkg/core/blockchain.go:495 +0x33a
created by github.com/nspcc-dev/neo-go/pkg/core.(*Blockchain).jumpToStateInternal
/home/rik/dev/neo-go/pkg/core/blockchain.go:553 +0x68a
FAIL github.com/nspcc-dev/neo-go/pkg/core 52.084s