Why a deadlock can occur:
1. (*DefaultDiscovery).run() has a for loop over requestCh channel.
2. (*DefaultDiscovery).RequestRemote() send to this channel while
holding a mutex.
3. (*DefaultDiscovery).RegisterBadAddr() tries to take mutex for write.
4. Second select-case can't take mutex for read because of (3).
When `return` or `break` statement is encountered inside
a for/range/switch statement, top stack items can be auxilliary.
They need to be cleaned up before returning from the function.
C# uses ToArray() or UintXXX(bytes) here which interprets hashes as they
should be interpreted (BE, although they always convert to LE when converting
to String just for the fun of it). It leads to state difference for us at
block 2025204 where even though we have the same value for the key, the key
itself differs, ours:
dd2b538e2a0c1db1ae5061c15be14f916bd1e678e512ffcda6d9499d8e7fe97ee71fd6b8004583d9afe09cc4dadbd5deb63d01e061009b7cffdaa674beae0f930ebe6085af900093e5fe56b34a5c220ccdcf6efc336fc5000000000000000000000000000000000010
theirs:
dd2b538e2a0c1db1ae5061c15be14f916bd1e67861e0013db6ded5dbdac49ce0afd9834500b8d61fe77ee97f8e9d49d9a6cdff12e5009b7cffdaa674beae0f930ebe6085af900093e5fe56b34a5c220ccdcf6efc336fc5000000000000000000000000000000000010
In this key there is a tx hash encoded
(e512ffcda6d9499d8e7fe97ee71fd6b84583d9afe09cc4dadbd5deb63d01e061 in LE used
by all the tools like neoscan).
I love Neo.
Both verification and invocation scripts need to
be unmarshaled from hex.
Also fix failing RPC tests: block contains non-pointer
`transaction.Witness` field and (*Witness).MarshalJSON method
is not called.
They have it specified right in the transaction. Unfortunately, this little
change rendered invalid our RPC test chain, but I think it became even better
after it, especially given that chain generation is a nice test by itself, so
it should be running as a regular test.
And drop doc.go from server package as it duplicates docs/rpc.md and has no
value of its own (TODO list is managed with GitHub issues, really). Also, RPC
server is not really expected to be used by non-neo-go packages (contrary to
the client).
1) Add marshaller and tests for smartcontract.Parameter
2) Add unmarshaller and tests for missing types of smartcontract.Parameter:
- MapType
- BoolType
Merged two types:
- smartcontract.ParamType
- rpc.StackParamType
into single one:
- smartcontract.ParamType
as they duplicated the functionality.
NOTE: type smartcontract.MapType was added (as in C# implementation).
From now, list of supported smartcontract parameter types:
UnknownType
SignatureType
BoolType
IntegerType
Hash160Type
Hash256Type
ByteArrayType
PublicKeyType
StringType
ArrayType
MapType
InteropInterfaceType
VoidType
Current implementation of short-circuting is just plain wrong
as it uses `last` or `before-last` labels which meaning depend
on context. It doesn't even handle simple assignements like
`a := x == 1 && y == 2`.
This commit makes all jumps in such conditions local
and adds tests.
Closes#699, #700.
Close transport and disconnect peers right in the Shutdown(), so that no new
connections would be accepted and so that all the peers would be disconnected
correctly (avoiding the same deadlock as in e2116e4c3f).
prevHash == input.PrevHash, so make less DB accesses and more real work. Fix
some bugs along the way:
* spentCoins structure may already be present in the DB when persisting TX,
there is nothing wrong with that and we shouldn't overwrite it
* it's only used for NEO and only to check for claim validity. Thus, when
processing claim tx the corresponding spentCoins should always be present
in the DB
Everywhere in this code prevHash == input.PrevHash, thus we can easily move
some common code out of the loop saving on DB accesses and
serialization/deserialization.
sendrawtransaction just returns a bool, sendtoaddress returns a proper
transaction and that should be the same as the one we have in
TransactionOutputRaw.
Mostly as is, no real effort done yet to optimize them, so there are still a
lot of duplicates there, but at least we sort them out into different smaller
packages.
Implement mempool and consensus block creation policies, almost the same as
SimplePolicy plugin for C# node provides with two caveats:
* HighPriorityTxType is not configured and hardcoded to ClaimType
* BlockedAccounts are not supported
Other than that it allows us to run successfuly as testnet CN, previously our
proposals were rejected because we were proposing blocks with oversized
transactions (that are rejected by PoolTx() now).
Mainnet and testnet configuration files are updated accordingly, but privnet
is left as is with no limits.
Configuration is currently attached to the Blockchain and so is the code that
does policying, it may be moved somewhere in the future, but it works for
now.
Both are very useful outside of the core, this change also makes respective
transactions initialize with the package as they don't depend on any kind of
input and it makes no sense recreating them again and again on every use.
It wasn't sorted when all validators were elected. There is also no need to do
`Unique()` on the result because validators are distinguished by the key, so
no two registered validators can have the same key.
Changed returncode of getrowtransaction method in case when transaction
with specified hash does not exists. Now it returns error with code -100
instead of -32602 (as in c# node)
- Attribute should have 2 fields (usage, data)
- VOut should have 4 (5) fields (asset, value, address, n)
- Script should have 2 fields (invocation, verification)
As C# node does it. Technically it's only needed for consensus and could be
implemented in the appropriate package, but for better compatibility with C#
node we're better returning it sorted right here.
We can still lock the (*Server).run with dead peers:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: goroutine 40 [select, 871 minutes]:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).putPacketIntoQueue(0xc030ab5320, 0xc02f251f20, 0xc00af0dcc0, 0x18, 0x40, 0x100000000000000, 0xffffffffffffffff)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:82 +0xf4
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).EnqueueHPPacket(0xc030ab5320, 0xc00af0dcc0, 0x18, 0x40, 0x1367240, 0xc03090ef98)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:124 +0x52
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).iteratePeersWithSendMsg(0xc0000ca000, 0xc00af35800, 0xcb2a58, 0x0)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:720 +0x12a
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).broadcastHPMessage(...)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:731
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).run(0xc0000ca000)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:203 +0xee4
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).Start(0xc0000ca000, 0xc000072ba0)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:173 +0x2ec
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: created by github.com/CityOfZion/neo-go/cli/server.startServer
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/cli/server/server.go:331 +0x476
...
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: goroutine 2199 [chan send, 870 minutes]:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).Disconnect.func1()
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:366 +0x85
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: sync.(*Once).Do(0xc030ab403c, 0xc02f262788)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/usr/local/go/src/sync/once.go:44 +0xb3
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).Disconnect(0xc030ab4000, 0xd92440, 0xc000065a00)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:365 +0x6d
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).SendPing.func1()
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:394 +0x42
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: created by time.goFunc
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/usr/local/go/src/time/sleep.go:169 +0x44
...
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: goroutine 3448 [chan send, 854 minutes]:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).handleConn(0xc01ed203f0)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:143 +0x6c
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: created by github.com/CityOfZion/neo-go/pkg/network.(*TCPTransport).Accept
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_transport.go:62 +0x44c
...
The problem is that the select in putPacketIntoQueue() only works the way it
was intended to after the `close(p.done)`, but that happens only after
successful unregistration request send. Thus, do disconnects the other way
around, first unblock queueing and exit goroutines, then destroy the
connection (if it wasn't previously destroyed) and only after that signal to
the Server.
We were completely lacking ValidatorsCount that is supposed to track the
number of votes with particular count of consensus nodes which in theory can
change the number of active consensus nodes (if it ever to exceed the number
of standby validators), so we were not producing the right count and based on
that not giving the right set of validators.
Fixes#512.
Simple as that: UnregisteredAndHasNoVotes != !RegisteredAndHasVotes
Registered validators should stay in the DB, we might be in the process of
updating votes for them and that starts with subtraction.
While initializing a struct, it is a top item on ALTSTACK.
This means that if we need to load a local variable,
DUPFROMALTSTACK won't longer push an array of locals on stack
but rather a currently initializing struct.
Closes#656.
Old implementation could view 0x62 byte in
a script as a JMP instruction irregardless of whether it is
a real opcode or a part of a parameter of another instruction.
In this commit instructions are decoded together with parameters
during jump label rewriting.
We can leak sending goroutines and stall broadcasts because of already gone
peers that happened to be cached by some s.Peers() user (more than 800 of
these can be seen in nodoka log along with (*Server).run blocking on
CMDGetAddr send):
Feb 10 16:35:15 nodoka neo-go[1563]: goroutine 41 [chan send, 3320 minutes]:
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).putPacketIntoQueue(...)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:81
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).EnqueueHPPacket(0xc0083d57a0, 0xc017206100, 0x18, 0x40, 0x136a240, 0xc018ef9720)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:119 +0x98
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).iteratePeersWithSendMsg(0xc0000ca000, 0xc0001848a0, 0xcb4550, 0x0)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:720 +0x12a
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).broadcastHPMessage(...)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:731
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).run(0xc0000ca000)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:203 +0xee4
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).Start(0xc0000ca000, 0xc000072c60)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:173 +0x2ec
Feb 10 16:35:15 nodoka neo-go[1563]: created by github.com/CityOfZion/neo-go/cli/server.startServer
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/cli/server/server.go:331 +0x476
Lack of FreeGasLimit in privnet leads to gas limit exceeding in case of transactions with small amount of GAS to be used for invoke operation (< real cost of the transaction). Solution: Fixed constraint in case when FreeGasLimit == 0. So now we are able to perform transactions in privnet with FreeGasLimit = 0 for free.
Tesnet sync failed with:
Feb 07 00:04:19 nodoka neo-go[1747]: 2020-02-07T00:04:19.838+0300 WARN blockQueue: failed adding block into the blockchain {"error": "failed to store notifications: not supported", "blockHeight": 713984, "nextIndex": 713985}
because some (not so) smart contract emitted a notification with an
InteropItem inside.
Seeing some
blockQueue: failed adding block into the blockchain {"error": "failed to store notifications: not supported", "blockHeight": 713984, "nextIndex": 713985}
in logs is not very helpful.
Fix mempool and chain locking
This allows us easily make 1000 Tx/s in 4-nodes privnet, fixes potential
double spends and improves mempool testing coverage.
Fixes GolangCI:
Error return value of
(*github.com/CityOfZion/neo-go/pkg/core/mempool.Pool).Add is not checked
(from errcheck)
and allows us to almost completely forget about mempool here.
Our mempool only contains valid verified transactions all the time, it never
has any unverified ones. Unverified pool made some sense for quick unverifying
after the new block acceptance (and gradual background reverification), but
reverification needs some non-trivial locking between blockchain and mempool
and internal mempool state locking (reverifying tx and moving it between
unverified and verified pools must be atomic). But our current reverification
is fast enough (and has all the appropriate locks), so bothering with
unverified pool makes little sense.
We not only need to remove transactions stored in the block, but also
invalidate some potential double spends caused by these transactions. Usually
new block contains a substantial number of transactions from the pool, so it's
easier to make one pass over it only keeping valid items rather than remove
them one by one and make an additional pass to recheck inputs/witnesses.
It doesn't harm as we have transactions naturally ordered by fee anyway and it
makes managing them a little easier. This also makes slices store item itself
instead of pointers to it which reduces the pressure on the memory subsystem.
They shouldn't depend on the chain state and for the same transaction they
should always produce the same result. Thus, it makes no sense recalculating
them over and over again.
We can only add one block of the given height and we have two competing
goroutines to do that --- consensus and block queue. Whomever adds the block
first shouldn't trigger an error in another one.
Fix block relaying for blocks added via the block queue also, previously one
consensus-generated blocks were broadcasted.
Eliminate races between tx checks and adding them to the mempool, ensure the
chain doesn't change while we're working with the new tx. Ensure only one
block addition attempt could be in progress.
The chain may already be more current than our dBFT state (like when the node
has commited something at view 0, but all the other nodes changed view and
accepted something at view 1), so in this case we should reinit dBFT on new
height.
Because the constants are loaded directly via `emitLoadConst`, there is no need to store
them in an array of locals. It can have a big overhead, because it
is done at the beginning of every function.
It can lead to some goroutine explosion, but supposedly it's better than
stalling other processing and eventually all of these goroutines should finish
their sends. Note that this doesn't change the behavior for RPC-relayed
transactions that are still waiting for the broadcast to finish ensuring
proper transaction distribution before returning the result to the client.
If we have already got Version message, we don't need the rest of handshake to
complete before being able to properly answer the PeerAddr() requests. Fixes
some duplicate connections between machines.
This one is designed to give more priority to direct nodes communication, that
is that their messaging would have more priority than generic broadcasts. It
should improve consensus process under TX pressure and allow to handle
pings in time (preventing disconnects).
They have the opposite order, height first and nonce second. It was done wrong
in 4e6ed902 and never fixed since. Fixes sending wrong peer state leading to
useless getheaders messages (and disconnects when the other side is lagging
behind).
We can have more than one connection attempt in progress and not yet completed
the handshake, so if there is a Version already received we should look it.