We can still lock the (*Server).run with dead peers:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: goroutine 40 [select, 871 minutes]:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).putPacketIntoQueue(0xc030ab5320, 0xc02f251f20, 0xc00af0dcc0, 0x18, 0x40, 0x100000000000000, 0xffffffffffffffff)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:82 +0xf4
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).EnqueueHPPacket(0xc030ab5320, 0xc00af0dcc0, 0x18, 0x40, 0x1367240, 0xc03090ef98)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:124 +0x52
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).iteratePeersWithSendMsg(0xc0000ca000, 0xc00af35800, 0xcb2a58, 0x0)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:720 +0x12a
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).broadcastHPMessage(...)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:731
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).run(0xc0000ca000)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:203 +0xee4
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*Server).Start(0xc0000ca000, 0xc000072ba0)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:173 +0x2ec
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: created by github.com/CityOfZion/neo-go/cli/server.startServer
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/cli/server/server.go:331 +0x476
...
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: goroutine 2199 [chan send, 870 minutes]:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).Disconnect.func1()
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:366 +0x85
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: sync.(*Once).Do(0xc030ab403c, 0xc02f262788)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/usr/local/go/src/sync/once.go:44 +0xb3
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).Disconnect(0xc030ab4000, 0xd92440, 0xc000065a00)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:365 +0x6d
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).SendPing.func1()
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:394 +0x42
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: created by time.goFunc
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/usr/local/go/src/time/sleep.go:169 +0x44
...
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: goroutine 3448 [chan send, 854 minutes]:
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).handleConn(0xc01ed203f0)
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:143 +0x6c
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: created by github.com/CityOfZion/neo-go/pkg/network.(*TCPTransport).Accept
Feb 13 16:14:50 neo-go-node-2 neo-go[9448]: #011/go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_transport.go:62 +0x44c
...
The problem is that the select in putPacketIntoQueue() only works the way it
was intended to after the `close(p.done)`, but that happens only after
successful unregistration request send. Thus, do disconnects the other way
around, first unblock queueing and exit goroutines, then destroy the
connection (if it wasn't previously destroyed) and only after that signal to
the Server.
We can leak sending goroutines and stall broadcasts because of already gone
peers that happened to be cached by some s.Peers() user (more than 800 of
these can be seen in nodoka log along with (*Server).run blocking on
CMDGetAddr send):
Feb 10 16:35:15 nodoka neo-go[1563]: goroutine 41 [chan send, 3320 minutes]:
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).putPacketIntoQueue(...)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:81
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*TCPPeer).EnqueueHPPacket(0xc0083d57a0, 0xc017206100, 0x18, 0x40, 0x136a240, 0xc018ef9720)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/tcp_peer.go:119 +0x98
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).iteratePeersWithSendMsg(0xc0000ca000, 0xc0001848a0, 0xcb4550, 0x0)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:720 +0x12a
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).broadcastHPMessage(...)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:731
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).run(0xc0000ca000)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:203 +0xee4
Feb 10 16:35:15 nodoka neo-go[1563]: github.com/CityOfZion/neo-go/pkg/network.(*Server).Start(0xc0000ca000, 0xc000072c60)
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/pkg/network/server.go:173 +0x2ec
Feb 10 16:35:15 nodoka neo-go[1563]: created by github.com/CityOfZion/neo-go/cli/server.startServer
Feb 10 16:35:15 nodoka neo-go[1563]: /go/src/github.com/CityOfZion/neo-go/cli/server/server.go:331 +0x476
We can only add one block of the given height and we have two competing
goroutines to do that --- consensus and block queue. Whomever adds the block
first shouldn't trigger an error in another one.
Fix block relaying for blocks added via the block queue also, previously one
consensus-generated blocks were broadcasted.
Eliminate races between tx checks and adding them to the mempool, ensure the
chain doesn't change while we're working with the new tx. Ensure only one
block addition attempt could be in progress.
It can lead to some goroutine explosion, but supposedly it's better than
stalling other processing and eventually all of these goroutines should finish
their sends. Note that this doesn't change the behavior for RPC-relayed
transactions that are still waiting for the broadcast to finish ensuring
proper transaction distribution before returning the result to the client.
If we have already got Version message, we don't need the rest of handshake to
complete before being able to properly answer the PeerAddr() requests. Fixes
some duplicate connections between machines.
This one is designed to give more priority to direct nodes communication, that
is that their messaging would have more priority than generic broadcasts. It
should improve consensus process under TX pressure and allow to handle
pings in time (preventing disconnects).
They have the opposite order, height first and nonce second. It was done wrong
in 4e6ed902 and never fixed since. Fixes sending wrong peer state leading to
useless getheaders messages (and disconnects when the other side is lagging
behind).
We can have more than one connection attempt in progress and not yet completed
the handshake, so if there is a Version already received we should look it.
Our node was too pingy because of wrong timer setups (that divided timeout
Duration by time.Second), it also was wrong in its time calculations (using
UTC time to calculate intervals). At the same time missing block is a
server-wide problem, so it's better solved with server-wide protocol loop.
1) Make timeout a timeout, don't do magic ping counts.
2) Drop additional timer from the main peer's protocol loop, create it
dynamically and make it disconnect the peer.
3) Don't expose the ping counter to the outside, handle more logic inside the
Peer.
Relates to #430.
We don't and we won't have synchronized clocks in the network so the only
timestamp that we can compare our local time with is the one made
ourselves. What this ping mechanism is used for is to recover from missing the
block broadcast, thus it's appropriate for it to trigger after X seconds of
the local time since the last block received.
Relates to #430.
In reality it will never be true exactly in the case where we want this ping
mechanism to work --- when the node failed to get a block from the net. It
won't get the header either and thus its block height will be equal to header
height. The only moment when this condition is met is when the node does
initial synchronization and this synchronization works just fine without any
pings.
Relates to #430.
Two queues for high-priority and ordinary messages. Fixes#590. These queues
are deliberately made small to avoid buffer bloat problem, there is gonna be
another queueing layer above them to compensate for that. The queues are
designed to be synchronous in enqueueing, async capabilities are to be added
layer above later.
add pingInterval same as used in ref C# implementation with the same logic
add pingTimeout which is used to check whether pong received. If not -- drop the peer.
add pingLimit which is hardcoded to 4 in TCPPeer. It's limit for unsuccessful ping/pong calls (where pong wasn't received in pingTimeout interval)
It wasn't actually requesting transactions but rather sending an inventory
message telling everyone that we have them which is completely wrong and
easily leads to ChangeView that could be avoided.
go vet is not happy about them:
pkg/io/binaryReader.go:92:21: method ReadByte() byte should have signature ReadByte() (byte, error)
pkg/io/binaryWriter.go:75:21: method WriteByte(u8 byte) should have signature WriteByte(byte) error
This seriously improves the serialization/deserialization performance for
several reasons:
* no time spent in `binary` reflection
* no memory allocations being made on every read/write
* uses fast ReadBytes everywhere it's appropriate
It also makes Fixed8 Serializable just for convenience.
add dao which takes care about all CRUD operations on storage
remove blockchain state since everything is stored on change
remove storage operations from structs(entities)
move structs to entities package
It reduces heap pressure a little for these elements as we don't have to
allocate/free them individually. And they're directly tied to transactions or
block, not being shared or anything like that, so it makes little sense for
them to be pointer-based. It only makes building transactions a little easier,
but that's obviously a minor usecase.
Before this patch on block import we could easily be spending more than 6
seconds out of 30 in Uint256 encoding for UnspentBalance, now it's completely
off the radar.
Only request headers from the other peer if his height is bigger than
ours. Otherwise we routinely ask 0-height newcomers for some random headers
that they know nothing about.
This one is essential for the consensus nodes as otherwise they won't give out
the blocks they generate making their generation almost useless. It also makes
our networking part more complete.
We have a race between reader and writer goroutines for the same connection
that leads to handshake failures when reader is faster to read the incoming
version (and try to reply to it) than writer is to write our own Version:
WARN[0000] peer disconnected addr="172.200.0.4:20334" peerCount=5 reason="invalid handshake: tried to send VersionAck, but didn't send Version yet
Fix it by moving Version sending before the reader loop starts.
add processing of validators while block persist;
add validator structure with decoding/encoding;
add validator get from store;
add EnrollmentTX and StateTX processing;
add pubkey decode bytes, unique and contains functions;
Right now message can be written in several Write's so
concurrent calls of writeMsg() can in theory interleave.
This commit fixes it.
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Otherwise the node might crash in `startProtocol` because of missing Version
field in the peer. And it also keeps the sequence correct, Version MUST be
sent first and ACKs can only follow it.
Our node didn't respect the MaxPeers setting, fix it with a drop of random
connection when this limit is reached (to give a chance for newcomers to
communicate), but also introduce AttemptConnPeers setting to tune the number
of attempted connections.
This also raises the default MaxPeers for testnet/mainnet to 100, because
neo-go nodes love making friends.
This allows to start handshaking from both client and server (mainnet/testnet
nodes were seen to not care about string ordering for it), but still maintains
some sane checks in the process. It also makes functions thread-safe because
we have two goroutines servicing read and write side of the Peer connection,
so they can clash on access to the struct fields.
Add a test for it also.
Extend Blockchainer with one more method to spawn a VM for test runs and use
it to run scripts. Gas consumption is not counted or limited in any way at the
moment (see #424).
Keeping run() as the owner of all maps would mean adding at least three more
channels to keep address getters with thread-safety. But then there also is a
race between requestToWork() and run() which is way harder to solve with
channels because there are lots of possibilities for deadlocks. So rework all
of this with good old mutexes.
While at it, fix `requestCh` handling in the inner select of run, it will waste
one loop to handle it, so we should add one to the `requested`.
Fixes#445.
Goreport:
neo-go/pkg/core/contract_state_test.go
Line 21: warning: "Contracto" is a misspelling of "Contraction" (misspell)
Line 64: warning: "Contracto" is a misspelling of "Contraction" (misspell)
neo-go/pkg/core/interop_neo.go
Line 420: warning: "succeedes" is a misspelling of "succeeds" (misspell)
neo-go/pkg/network/discovery.go
Line 118: warning: "succeded" is a misspelling of "succeeded" (misspell)
Line 128: warning: "successfuly" is a misspelling of "successfully" (misspell)
golint suggests:
pkg/network/payload/address.go:48:12: should omit type net.IP from declaration of var netip; it will be inferred from the right-hand side
It's a temporary stub until proper encoding/decoding is implemented. It's
useful for testnet/mainnet connections because without it consensus message
receival leads to peer disconnection.
It's bogus and no other node implementation has anything like that. It fires
up for no good reason in the case when some other node connects to us and it
obviously doesn't use its listening port for it.
If the block references two ouputs in some other transaction the code failed
to verify it because of key collision. C# code implements it properly by using
full CoinReference type as a key, so let's do it in a similar fashion.
In the very specific case when the list of headers received is exactly one
block ahead of the chain of full blocks requestBlocks() failed to generate
request to get the next full block.
This one will replace blockCache in Blockchain itself as it can and should be
external from it. The idea is that we only feed successive blocks into the
Blockchain and it only stores valid proper Blockchain and nothing else.
For example, at the moment our node can't handle `consensus` message, so when
it received it before the patch it just crashed because of uninitialized `p`.
It's mostly used for Serializable and in other cases where one needs to
estimate binary-encoded size of the stucture. This also simplifies future
removal of the Size() from Serializable.
The logic here is that we'll have all binary encoding/decoding done via our io
package, which simplifies error handling. This functionality doesn't belong to
util, so it's moved.
This also expands BufBinWriter with Reset() method to fit the needs of core
package.
...and don't try to connect to the nodes we're already connected to.
Before this change we had a problem of discoverer throwing away good valid
addresses just because they are already known which lead to pool draining over
time (as address reuse was basically forbidden and getaddr may not get enough
new nodes).
Queuing one message is not reliable enough, the peer that gets it can fail to
actually make a request, so make this queue a bit deeper to have a higher
chance of success.
This makes writer side handle errors properly and fixes communication between
reader and writer goroutine to always correctly unregister the peer. This is
especially important for the case where error occurs before handshake
completes as in this case we don't even have goroutine in startProtocol()
running.
pkg/core/transaction/attribute.go:67:14: should omit type uint8 from declaration of var urllen; it will be inferred from the right-hand side
pkg/crypto/keys/publickey.go:184:8: should omit type []byte from declaration of var b; it will be inferred from the right-hand side
pkg/network/payload/version_test.go:15:12: should omit type bool from declaration of var relay; it will be inferred from the right-hand side
Refs. #213.
Fixes things like:
* exported type/method/function X should have comment or be unexported
* comment on exported type/method/function X should be of the form "X ..."
(with optional leading article)
Refs. #213.
Simplifies a lot of code and removes some duplication. Unfortunately I had to
move test_util random functions in same commit to avoid cycle
dependencies. One of these random functions was also used in core/transaction
testing, to simplify things I've just dropped it there and used a static
string (which is nice to have for a test anyway).
There is still sha256 left in wallet (but it needs to pass Hash structure into
the signing function).
* Added new config attributes: 'SecondsPerBlock','LowPriorityThreshold'
* Added new files:
* Added new method: CompareTo
* Fixed empty Slice case
* Added new methods: LessThan, GreaterThan, Equal, CompareTo
* Added new method: InputIntersection
* Added MaxTransactionSize, GroupOutputByAssetID
* Added ned method: ScriptHash
* Added new method: IsDoubleSpend
* Refactor blockchainer, Added Feer interface, Verify and GetMemPool method
* 1) Added MemPool
2) Added new methods to satisfy the blockchainer interface: IsLowPriority, Verify, GetMemPool
* Added new methods: RelayTxn, RelayDirectly
* Fixed tests
* Implemented RPC server method sendrawtransaction
* Refactor getrawtransaction, sendrawtransaction in separate methods
* Moved 'secondsPerBlock' to config file
* Implemented Kim suggestions:
1) Fixed data race issues
2) refactor Verify method
3) Get rid of unused InputIntersection method due to refactoring Verify method
4) Fixed bug in https://github.com/CityOfZion/neo-go/pull/174#discussion_r264108135
5) minor simplications of the code
* Fixed minor issues related to
1) space
2) getter methods do not need pointer on the receiver
3) error message
4) refactoring CompareTo method in uint256.go
* Fixed small issues
* Use sync.RWMutex instead of sync.Mutex
* Refined (R)Lock/(R)Unlock
* return error instead of bool in Verify methods
* Added utility function GetVarSize
* 1) Added Size method: this implied that Fixed8 implements now the serializable interface. 2) Added few arithmetic operation (Add, Sub, div): this will be used to calculated networkfeeand feePerByte. Changed return value of the Value() method to int instead of int64. Modified fixed8_test accordingly.
* Implemented Size or MarshalJSON method.
- Structs accepting the Size method implement the serializable interface.
- Structs accepting the MarshalJSON method implements the customized json marshaller interface.
* Added fee calculation
* Implemented rcp server method GetRawTransaction
* Updated Tests
* Fixed:
1) NewFixed8 will accept as input int64
2) race condition affecting configDeafault, blockchainDefault
* Simplified Size calculation
* 1) Removed global variable blockchainDefault, configDefault
2) Extended Blockchainer interface to include the methods: References, FeePerByte, SystemFee, NetworkFee
3) Deleted fee_test.go, fee.go. Moved corresponding methods to blockchain_test.go and blockchain.go respectively
4) Amended tx_raw_output.go
* Simplified GetVarSize Method
* Replaced ValueAtAndType with ValueWithType
* Cosmetic changes + Added test case getrawtransaction_7
* Clean up Print statement
* Filled up keys
* Aligned verbose logic with the C#-neo implementation
* Implemented @Kim requests
Refactor server_test.go
* Small fixes
* Fixed verbose logic
Added more tests
Cosmetic changes
* Replaced assert.NoError with require.NoError
* Fixed tests by adding context.Background() as argument
* Fixed tests
* Implemented rcp method GetAccountState
* code clean up
* Removed empty line
* Used consistently github.com/pkg/errors package. Amended error message
* Get rid of fmt.Sprintf and use either errors.Errorf or errors.Wrapf
* cosmetic changes
* Made Encode/Decode message public.
* Added Redis storage driver and made some optimizations for the initialising the blockchain
* removed log lines in tcp_peer
* Added missing comments on exported methods.
* bumped version
* Adds basic RPC supporting files
* Adds interrupt handling and error chan
* Add getblock RPC method
* Update request structure
* Update names of nodes
* Allow bad addresses to be registered in discovery externally
* Small tidy up
* Few tweaks
* Check if error is close error in tcp transport
* Fix tests
* Fix priv port
* Small tweak to param name
* Comment fix
* Remove version from server
* Moves submitblock to TODO block
* Remove old field
* Bumps version and fix hex issues
* Created test_data folder with block json files for testing + create separate file for block base.
* Fixed bug in WriteVarUint + Trim logic + unit tests
* Refactored store and add more tests for it.
* restore headerList from chain file
* Fix tx decode bug + lots of housekeeping.
* Implemented Node restore state from chain file.
* Created standalone package for storage. Added couple more methods to Batch and Store interfaces.
* Block persisting + tests
* bumped version -> 0.31.0
* Adds struct for config settings
* Bumps version
* Move settings around
* Only push version tag up not whole branch as well
* Move config into own path and add test and main nets
* Update config for timeouts
* Convert to yaml
* block partial persist
* replaced refactored files with old one.
* removed gokit/log from deps
* Tweaks to not overburden remote nodes with getheaders/getblocks
* Changed Transporter interface to not take the server as argument due to a cause of race warning from the compiler
* started server test suite
* more test + return errors from message handlers
* removed --race from build
* Little improvements.
* deleted transfer_output added asset type and transaction result to core
* removed writing 0x00 when buffer length is 0
* Refactored emit into VM package + moved tx to own package.
* implemented transaction along with claimTransaction.
* refactored naming of transaction + added decode address for uint160 types
* removed unnecessary folder and files.
* transaction/smartcontract logic
* bumped version 0.24.0
* Disconnect once go routines signal so
* Send msg if cmd is other than cmdUnknown
* Remove typo
* Fix lock file
* Updates README
* Comment typo
* Fix issue
* Revert old changes
* Handle error
* Initial draft of the neo-go wallet
* Cleanup + more test for util package
* integrated wallet into neo-cli partially
* base wallet implementation + smartcontract code.
* refactored tcp transport
* return errors on outgoing messages
* TCP transport should report its error after reading from connection
* handle error returned from peer transport
* bump version
* cleaned up error