If it's the end of epoch, then it contains the updated validators list recalculated
during the last block's PostPersist. If it's middle of the epoch, then it contains
previously calculated value (value for the previous completed epoch) that is equal
to the current nextValidators cache value.
Signed-off-by: Anna Shaleva <shaleva.ann@nspcc.ru>
Recalculate them once per epoch. Consensus is aware of it and must
call CalculateNextValidators exactly when needed.
Signed-off-by: Anna Shaleva <shaleva.ann@nspcc.ru>
We have two similar blockchain APIs: GetNextBlockValidators and GetValidators.
It's hard to distinguish them, thus renaming it to match the meaning, so what
we have now is:
GetNextBlockValidators literally just returns the top of the committee that
was elected in the start of batch of CommitteeSize blocks batch. It doesn't
change its valie every block.
ComputeNextBlockValidators literally computes the list of validators based on
the most fresh committee members information got from the NeoToken's storage
and based on the latest register/unregister/vote events. The list returned by
this method may be updated every block.
Signed-off-by: Anna Shaleva <shaleva.ann@nspcc.ru>
Blockchain passes his own pure unwrapped DAO to
(*Blockchain).ComputeNextBlockValidators which means that native
RW NEO cache structure stored inside this DAO can be modified by
anyone who uses exported ComputeNextBlockValidators Blockchain API,
and technically it's valid, and we should allow this, because it's
the only purpose of `validators` caching. However, at the same time
some RPC server is allowed to request a subsequent wrapped DAO for
some test invocation. It means that descendant wrapped DAO
eventually will request RW NEO cache and try to `Copy()`
the underlying's DAO cache which is in direct use of
ComputeNextBlockValidators. Here's the race:
ComputeNextBlockValidators called by Consensus service tries to
update cached `validators` value, and descendant wrapped DAO
created by the RPC server tries to copy DAO's native cache and
read the cached `validators` value.
So the problem is that native cache not designated to handle
concurrent access between parent DAO layer and derived (wrapped)
DAO layer. I've carefully reviewed all the usages of native cache,
and turns out that the described situation is the only place where
parent DAO is used directly to modify its cache concurrently with
some descendant DAO that is trying to access the cache. All other
usages of native cache (not only NEO, but also all other native
contrcts) strictly rely on the hierarchical DAO structure and don't
try to perform these concurrent operations between DAO layers.
There's also persist operation, but it keeps cache RW lock taken,
so it doesn't have this problem as far. Thus, in this commit we rework
NEO's `validators` cache value so that it always contain the relevant
list for upper Blockchain's DAO and is updated every PostPersist (if
needed).
Note: we must be very careful extending our native cache in the
future, every usage of native cache must be checked against the
described problem.
Close#2989.
Signed-off-by: Anna Shaleva <shaleva.ann@nspcc.ru>
go.uber.org/atomic deprecated CAS methods in version 1.10 (that introduced
CompareAndSwap), so we need to fix it.
Signed-off-by: Roman Khimov <roman@nspcc.ru>
Move them to the core/network packages, close#2950. The name of
mempool's unsorted transactions metrics has been changed along the
way to match the core's metrics naming convention.
Signed-off-by: Anna Shaleva <shaleva.ann@nspcc.ru>
Everywhere including examples, external interop APIs, bindings generators
code and in other valuable places. A couple of `interface{}` usages are
intentionally left in the CHANGELOG.md, documentation and tests.
Do not add them directly to chain, it will be done by the block queue
manager. Close https://github.com/nspcc-dev/neo-go/issues/2923. However,
this commit is not valid without
https://github.com/roman-khimov/dbft/pull/4.
It's the neo-go's duty to initialize consensus after subsequent block
addition; the dBFT itself must wait for the neo-go to complete the block
addition and notify the dBFT, so that it can initialize at 0-th view to
collect the next block.
And include some node-specific configurations there with backwards
compatibility. Note that in the future we'll remove Ledger's
fields from the ProtocolConfiguration and it'll be possible to access them in
Blockchain directly (not via .Ledger).
The other option tried was using two configuration types separately, but that
incurs more changes to the codebase, single structure that behaves almost like
the old one is better for backwards compatibility.
Fixes#2676.
It makes sense in general (further narrowing down the time window when
transactions are processed by consensus thread) and it improves block times a
little too, especially in the 7+2 scenario.
Related to #2744.
Blockchain's notificationDispatcher sends events to channels and these
channels must be read from. Unfortunately, regular service shutdown procedure
does unsubscription first (outside of the read loop) and only then drains the
channel. While it waits for unsubscription request to be accepted
notificationDispatcher can try pushing more data into the same channel which
will lead to a deadlock. Reading in the same method solves this, any number of
events can be pushed until unsub channel accepts the data.
It's not an ideal solution, but at least it solves the problem for
now. Caveats:
* consensus only needs one method, so it's mirrored to Blockchain
* rpcsrv uses core.* definition of the StateRoot (so technically it might as
well not have an internal Ledger), but it uses core already unfortunately
consensus.OnTransaction is a callback, so it can be called at any time.
We need to check whether service (and dBFT) is started before the
subsequent transaction handling like it is done inside the OnPayload
callback.
Notice that it makes the node accept Extensible payloads with any category
which is the same way C# node works. We're trusting Extensible senders,
improper payloads are harmless until they DoS the network, but we have some
protections against that too (and spamming with proper category doesn't differ
a lot).
It was broken somewhere between 2f490a3403 and
85ce207f40 leading to panic on watch only node:
2021-07-21T16:21:39.201+0200 INFO received Commit {"validator": 3}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0xbcc59e]
goroutine 486 [running]:
github.com/nspcc-dev/neo-go/pkg/consensus.(*service).newBlockFromContext(0xc0001629a0, 0xc000308000, 0xc0010fa000, 0x2cb417800)
github.com/nspcc-dev/neo-go/pkg/consensus/consensus.go:664 +0xbe
github.com/nspcc-dev/dbft.(*Context).MakeHeader(...)
github.com/nspcc-dev/dbft@v0.0.0-20210302103605-cc75991b7cfb/context.go:270
github.com/nspcc-dev/dbft.(*DBFT).onCommit(0xc000308000, 0x138c998, 0xc000115110)
github.com/nspcc-dev/dbft@v0.0.0-20210302103605-cc75991b7cfb/dbft.go:487 +0x575
github.com/nspcc-dev/dbft.(*DBFT).OnReceive(0xc000308000, 0x138c998, 0xc000115110)
github.com/nspcc-dev/dbft@v0.0.0-20210302103605-cc75991b7cfb/dbft.go:251 +0xef5
github.com/nspcc-dev/neo-go/pkg/consensus.(*service).eventLoop(0xc0001629a0)
github.com/nspcc-dev/neo-go/pkg/consensus/consensus.go:312 +0x7d6
created by github.com/nspcc-dev/neo-go/pkg/consensus.(*service).Start
github.com/nspcc-dev/neo-go/pkg/consensus/consensus.go:262 +0xdc
In fact, nonce is correctly provided by dbft library (since Legacy), we just
need to use it here.
It's only used to sign/verify it and is not a part of the structure. It's
still neded in consensus.Payload though because that's the way dbft library
is.
It's not network-tied any more, network is only needed to
sign/verify. Unfortunately we still have to keep network in consensus data
structures because of dbft library interface.