It should be performed before we're able to process blocks or before we
shutdown (which uses addition lock), fix
panic: assignment to entry in nil map
goroutine 5755 [running]:
github.com/nspcc-dev/neo-go/pkg/core/storage.(*MemoryStore).drop(...)
/home/rik/dev/neo-go/pkg/core/storage/memory_store.go:74
github.com/nspcc-dev/neo-go/pkg/core/storage.(*MemoryStore).PutChangeSet(0xc00093e700, 0xc0002051a0, 0xc000205200, 0x1, 0x1)
/home/rik/dev/neo-go/pkg/core/storage/memory_store.go:100 +0x25c
github.com/nspcc-dev/neo-go/pkg/core/storage.(*MemoryStore).PutBatch(...)
/home/rik/dev/neo-go/pkg/core/storage/memory_store.go:90
github.com/nspcc-dev/neo-go/pkg/core.(*Blockchain).removeOldStorageItems(0xc000206a00)
/home/rik/dev/neo-go/pkg/core/blockchain.go:495 +0x33a
created by github.com/nspcc-dev/neo-go/pkg/core.(*Blockchain).jumpToStateInternal
/home/rik/dev/neo-go/pkg/core/blockchain.go:553 +0x68a
FAIL github.com/nspcc-dev/neo-go/pkg/core 52.084s
It doesn't cost much, but it's used _a lot_, so optimizing it makes sense.
name old time/op new time/op delta
TxHash-8 4.89ns ± 5% 0.54ns ± 2% -88.86% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
TxHash-8 0.00B 0.00B ~ (all equal)
name old allocs/op new allocs/op delta
TxHash-8 0.00 0.00 ~ (all equal)
We're likely to have something comparable to the current changeset in the
subsequent one. If it's bigger, no big deal, it'll be reallocated, if it's
smaller, no big deal, the next one will be preallocated smaller.
It's very effective in avoiding allocations for big.Int, we don't have a
microbenchmark for memppol, but this improves TPS metrics by ~1-2%, so it's
noticeable.
Problem: transactions with wrong hashes are accepted to the chain if
consensus nodes are designated as Oracle nodes. The result is wrong
MerkleRoot for the accepted block. Consensus nodes got such blocks
right from the dbft and store them without errors, but if
non-consensus nodes are present in the network, they just can't accept
these "bad" blocks:
```
2021-11-29T12:56:40.533+0300 WARN blockQueue: failed adding block into the blockchain {"error": "invalid block: MerkleRoot mismatch (expected a866b57ad637934f7a7700e3635a549387e644970b42681d865a54c3b3a46122, calculated d465aafabaf4539a3f619d373d178eeeeab7acb9847e746e398706c8c1582bf8)", "blockHeight": 17, "nextIndex": 18}
```
This problem happens because of transaction hash caching. We can't set
transaction hash if transaction construction wasn't yet completed.
Problem:
```
--- FAIL: TestMemCachedPersist (0.07s)
--- FAIL: TestMemCachedPersist/BoltDBStore (0.07s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestMemCachedPersist_BoltDBStore294966711\001\test_bolt_db: The process cannot access the file because it is being used by another process.
```
Solution:
Release the resources occupied by the DB.
We use 2 prefixes for storing items because of state synchronization.
This commit allows to parametrize dao with the default prefix.
Signed-off-by: Evgeniy Stratonikov <evgeniy@nspcc.ru>
b9be892bf9 has made Persist asynchronous which
is very effective in allowing the system to continue processing
blocks/transactions while flushing things to disk. It at the same time is very
dangerous in that if the disk is slow and it takes much time to flush KV set
(more than persisting interval), there might be even bigger new KV set in
MemCachedStore by the time it finishes. Even if the system immediately starts
to flush this new data set it (being bigger) can take more time than the
previous one. And while doing so a new data set will appear in memory,
potentially again bigger than this.
So we can easily end up with the system going out of control, consuming more
and more memory and taking more and more time to persist a single set of
data. To avoid this we need to detect such condition and just wait for Persist
to really finish its job and release the resources.
Everywhere it matters (and that's callExFromNative() now) it's incremented
already, so when we're doing Call() at the same time (and it's done to invoke
`_initialize` method) we're effectively double-incrementing it.
Standards are NEP-11 and NEP-17, not NEP11, not NEP17, not anything
else. Variable/function names of course can use whatever fits, but documents
and comments should be consistent wrt this.
Oracle responses must use the same set of signers as oracle requests even
though the transaction itself is signed by oracle nodes/contract.
We can probably improve interop.Context by removing Tx field completely and
adding more functionality to Container, but it's not very convenient for
VerifyWitness and will require adding more stub-like methods for Block, so Tx
is used for now (and we do have it in every relevant case).
I don't think it's possible with regular service functioning, but it happens
during testing because of pointer reuse:
WARNING: DATA RACE
Read at 0x00c003a0e3f0 by goroutine 114:
github.com/nspcc-dev/neo-go/pkg/services/notary.(*Notary).verifyIncompleteWitnesses()
/home/runner/work/neo-go/neo-go/pkg/services/notary/notary.go:441 +0x1dc
github.com/nspcc-dev/neo-go/pkg/services/notary.(*Notary).OnNewRequest()
/home/runner/work/neo-go/neo-go/pkg/services/notary/notary.go:188 +0x205
github.com/nspcc-dev/neo-go/pkg/core.TestNotary.func11()
/home/runner/work/neo-go/neo-go/pkg/core/notary_test.go:347 +0x612
github.com/nspcc-dev/neo-go/pkg/core.TestNotary()
/home/runner/work/neo-go/neo-go/pkg/core/notary_test.go:443 +0xe33
testing.tRunner()
/opt/hostedtoolcache/go/1.16.10/x64/src/testing/testing.go:1193 +0x202
Previous write at 0x00c003a0e3f0 by goroutine 104:
github.com/nspcc-dev/neo-go/pkg/services/notary.(*Notary).finalize()
/home/runner/work/neo-go/neo-go/pkg/services/notary/notary.go:338 +0x50a
github.com/nspcc-dev/neo-go/pkg/services/notary.(*Notary).PostPersist()
/home/runner/work/neo-go/neo-go/pkg/services/notary/notary.go:314 +0x297
github.com/nspcc-dev/neo-go/pkg/services/notary.(*Notary).Run()
/home/runner/work/neo-go/neo-go/pkg/services/notary/notary.go:169 +0x4a7
Serializing/deserializing the payload yields this:
Error: Received unexpected error:
both main and fallback transactions should have the same ValidUntil value
See neo-project/neo#2622. The implementation is somewhat asymmetric (and not
very efficient) for binary/JSON encoding/decoding, but it should be
sufficient.
Eventually this will be replaced by `pkg/neotest` invocations but for
now it allows us to remove NNS constants together with the tests.
Signed-off-by: Evgeniy Stratonikov <evgeniy@nspcc.ru>
Use circular buffer which is a bit more appropriate. The problem is that
priority queue accepts and stores equal items which wastes memory even in
normal usage scenario, but it's especially dangerous if the node is stuck for
some reason. In this case it'll accept from peers and put into queue the same
blocks again and again leaking memory up to OOM condition.
Notice that queue length calculation might be wrong in case circular buffer
wraps, but it's not very likely to happen (usually blocks not coming from the
queue are added by consensus and it's not very fast in doing so).
Notes for witnesses:
* [N sig + M multisig + K contract] combination is possible where N, M, K >=0.
* Each verification script should be properly filled in.
* Each invocation script should either be empty or contain exactly one
signature.
Real persistent storage guarantees that result of Seek is sorted
by keys. The idea of optimisation is to merge two sorted seek
results into one (memStore+persistentStore), so that
(*MemCachedStore).Seek will return sorted list. The only thing
that remains is to sort items got from (*MemoryStore).Seek.
MemoryStore is used in a MemCachedStore as a persistent layer in tests.
Further commits suppose that persistent storage returns sorted values
from Seek, so sort the result of MemoryStore.Seek.
Benchmark results for 10000 matching items in MemoryStore compared to
master:
name old time/op new time/op delta
MemorySeek-8 712µs ± 0% 3850µs ± 0% +440.52% (p=0.000 n=8+8)
name old alloc/op new alloc/op delta
MemorySeek-8 160kB ± 0% 2724kB ± 0% +1602.61% (p=0.000 n=10+8)
name old allocs/op new allocs/op delta
MemorySeek-8 10.0k ± 0% 10.0k ± 0% +0.24% (p=0.000 n=10+10)
For details on implementation efficiency see the
https://github.com/nspcc-dev/neo-go/pull/2193#discussion_r722993358.
(*Billet).Traverse changes:
1. Get rid of the `offset` argument. We can cut `from` and pass just the
part that remains. This implies that node with path matching `from` will
also be included in the result, so additional check needs to be added to
the callback function.
2. Pass `path` and `from` without search prefix. Append prefix to the
result inside the callback.
3. Remove duplicating code.
(*Trie).Find changes:
1. Properly prepare `from` argument for traversing function. It closly
depends on the `path` argument.
Instead of flushing everything to `cache` and then to `bc.dao`, wrap `bc.dao`
directly for block/tx data and AERs and then flush to it. Block/transactions
are usually processed more quickly than other components, so they easily end
up in `cache` where they directly affect Seek performance for any executing
transaction.
Simple as it is this change improves voter NEO transfer benchmark with 1000
accounts by more than 25%, from ~18500 TPS to ~23500 TPS. It doesn't affect
much other cases.
GAS can only be distributed once in a block for particular address, so it
makes little sense trying to calculate it again and again. This fixes
neo-bench for NEO voter, because without it we get ~2500 TPS for
single-address test and with it it jumps 13-fold to normal values like
~33500.
We need to store NEO balance's LastUpdateHeight before GAS mint,
because mint can call onNEP17Payment and onNEP17Payment can call NEO
transfer which also calls GAS mint. Storing balance height allows to
avoid recursion.
We need to copy the result of `TryGet` method, otherwice the slice can
be modified inside `Add` or `Update` methods, which leads to
inconsistent MPT pool state.
We need several stages to manage state jump process in order not to mess
up old and new contract storage items and to be sure about genesis state data
are properly removed from the storage. Other operations do not require
separate stage and can be performed each time `jumpToStateInternal` is
called.
We don't need this method to be exposed, the only its user is the
StateSync module. At the same time StateSync module manages its state by
itself which guarantees that (*Blockchain).jumpToState will be called
with proper StateSync stage.
State jump should be an atomic operation, we can't modify contract
storage items state on-the-fly. Thus, store fresh items under temp
prefix and replase the outdated ones after state sync is completed.
Related
https://github.com/nspcc-dev/neo-go/pull/2019#discussion_r693350460.
Before state sync process can be started, outdated MPT nodes
should be removed from storage. After state sync is completed,
outdated blocks/transactions/AERs should also be removed.
In this commit:
1. Request unknown MPT nodes from peers. Note, that StateSync module itself
shouldn't be responsible for nodes requests, that's a server duty.
2. Do not request the same node twice, check if it is in storage
already. If so, then the only thing remaining is to update refcounter.
MPT restore process is much simpler then regular MPT maintaining: trie
has a fixed structure, we don't need to remove or rebuild MPT nodes. The
only thing we should do is to replace Hash nodes to their unhashed
counterparts and increment refcount. It's better not to touch the
regular MPT code and create a separate structure for this.
C# node does not return empty proof enymore in case if path is bad. C#
node also throws an exception on bad Put.
Our node does not return an error on delete if the key is empty.
Allow it for (*Trie).Put. And distinguish empty value and nil value for
(*Trie).PutBatch, because batch is already capable of handling both nil
and empty value. For (*Trie).PutBatch putting nil value means deletion,
while putting empty value means just putting LeafNode with an empty
value.