Attempts to reuse elliptic.Unmarshal() and elliptic.UnmarshalCompressed() lead
to this:
name                 old time/op    new time/op    delta
PublicDecodeBytes-8    59.5µs ± 2%    61.8µs ± 1%  +3.78%  (p=0.000 n=10+9)

name                 old alloc/op   new alloc/op   delta
PublicDecodeBytes-8    3.99kB ± 0%    4.27kB ± 0%  +6.81%  (p=0.000 n=9+10)

name                 old allocs/op  new allocs/op  delta
PublicDecodeBytes-8       136 ± 0%       135 ± 0%  -0.74%  (p=0.000 n=10+10)
So the switch makes no sense. Refs. #1319.
Go 1.15 provides a native (*ecdsa.PublicKey).Equal method, but we can't drop
our own Equal: the types are different and there is still code using it
(forcing that code to convert types is counterproductive). Changing
(*PublicKey).Equal to call (*ecdsa.PublicKey).Equal internally with something
like
(*ecdsa.PublicKey)(p).Equal((*ecdsa.PublicKey)(key))
slows it down:
name           old time/op    new time/op    delta
PublicEqual-8    14.9ns ± 1%    18.4ns ± 2%  +23.55%  (p=0.000 n=9+10)

name           old alloc/op   new alloc/op   delta
PublicEqual-8     0.00B          0.00B          ~     (all equal)

name           old allocs/op  new allocs/op  delta
PublicEqual-8      0.00           0.00          ~     (all equal)
So leave it as is, but add this micro-bench. Refs. #1319.
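The two variants being compared can be sketched roughly as follows. The
PublicKey type here is a hypothetical stand-in defined on top of
ecdsa.PublicKey, and the direct coordinate comparison is only an assumption
about what a hand-written Equal might do; newKey is a helper invented for the
example.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"fmt"
	"math/big"
)

// PublicKey is a hypothetical project-specific key type sharing its layout
// with ecdsa.PublicKey.
type PublicKey ecdsa.PublicKey

// Equal compares the point coordinates directly.
func (p *PublicKey) Equal(key *PublicKey) bool {
	return p.X.Cmp(key.X) == 0 && p.Y.Cmp(key.Y) == 0
}

// EqualStd converts both sides and delegates to the standard library method
// (available since Go 1.15), which also compares the curves.
func (p *PublicKey) EqualStd(key *PublicKey) bool {
	return (*ecdsa.PublicKey)(p).Equal((*ecdsa.PublicKey)(key))
}

// newKey (hypothetical helper) returns the P-256 public key k*G, which gives
// deterministic distinct keys without needing a random source.
func newKey(k int64) *PublicKey {
	c := elliptic.P256()
	x, y := c.ScalarBaseMult(big.NewInt(k).Bytes())
	return &PublicKey{Curve: c, X: x, Y: y}
}

func main() {
	a, b := newKey(1), newKey(2)
	fmt.Println(a.Equal(a), a.EqualStd(a))
	fmt.Println(a.Equal(b), a.EqualStd(b))
}
```

Both methods give the same answers for keys on the same curve; the difference
measured above is purely in speed.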
Similar to c69670c85b, this eliminates one allocation and reduces the memory
footprint a bit (tested on tx decoding):
name               old time/op    new time/op    delta
DecodeFromBytes-8    1.78µs ± 3%    1.79µs ± 2%    ~     (p=1.000 n=10+10)

name               old alloc/op   new alloc/op   delta
DecodeFromBytes-8      888B ± 0%      800B ± 0%  -9.91%  (p=0.000 n=10+10)

name               old allocs/op  new allocs/op  delta
DecodeFromBytes-8      11.0 ± 0%      10.0 ± 0%  -9.09%  (p=0.000 n=10+10)
Functions are usually replaced immediately (and it's OK for them to be nil,
searching through a zero-length array is fine), while Notifications are
usually appended to (and are absolutely useless in verification contexts).
* both 'to' and 'from' are either Null or Hash160, there is no other
possibility for valid NEP-17. So returning util.Uint160{} in case of
parsing error is wrong.
* but this is what allowed burns/mints to work at the expense of error
allocation inside of util.Uint160DecodeBytesBE()
* Uint160 can technically fit into a regular VM integer, so even though it'd
be quite surprising to see it there, TryBytes() is more correct (and
easier!) to use
* same thing with `amount`, we have `TryInteger()` that easily covers all
possible cases and does appropriate error checking inside
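A rough sketch of the parsing rules listed above. Everything in it is
hypothetical and heavily simplified: Item, Null, ByteArray and parseTransfer
are stand-ins invented for the example, not the real stack item or util types,
and the integer conversion ignores the VM's actual encoding.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// Item is a hypothetical, simplified stand-in for a VM stack item.
type Item interface {
	TryBytes() ([]byte, error)
	TryInteger() (*big.Int, error)
}

// Null models the VM null item, which converts to neither bytes nor integer.
type Null struct{}

func (Null) TryBytes() ([]byte, error)     { return nil, errors.New("null: no bytes") }
func (Null) TryInteger() (*big.Int, error) { return nil, errors.New("null: no integer") }

// ByteArray models a byte string item.
type ByteArray []byte

func (b ByteArray) TryBytes() ([]byte, error) { return b, nil }

// TryInteger is simplified to big-endian unsigned; the real VM differs.
func (b ByteArray) TryInteger() (*big.Int, error) { return new(big.Int).SetBytes(b), nil }

const uint160Size = 20

// Uint160 is a stand-in for a 160-bit script hash.
type Uint160 [uint160Size]byte

// parseAddress returns nil for Null (the mint/burn side of a transfer), a
// hash for exactly 20 bytes and an error for anything else.
func parseAddress(it Item) (*Uint160, error) {
	if _, ok := it.(Null); ok {
		return nil, nil
	}
	b, err := it.TryBytes()
	if err != nil {
		return nil, err
	}
	if len(b) != uint160Size {
		return nil, fmt.Errorf("expected %d bytes, got %d", uint160Size, len(b))
	}
	var u Uint160
	copy(u[:], b)
	return &u, nil
}

// parseTransfer validates all three fields of a transfer notification.
func parseTransfer(from, to, amount Item) (*Uint160, *Uint160, *big.Int, error) {
	f, err := parseAddress(from)
	if err != nil {
		return nil, nil, nil, fmt.Errorf("from: %w", err)
	}
	t, err := parseAddress(to)
	if err != nil {
		return nil, nil, nil, fmt.Errorf("to: %w", err)
	}
	a, err := amount.TryInteger()
	if err != nil {
		return nil, nil, nil, fmt.Errorf("amount: %w", err)
	}
	return f, t, a, nil
}

func main() {
	to := make(ByteArray, uint160Size)
	f, t, a, err := parseTransfer(Null{}, to, ByteArray{0x05}) // a mint
	fmt.Println(f == nil, t != nil, a, err)
}
```

Checking for Null before calling TryBytes() keeps mints and burns off the
error path, so no error value has to be built for them.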
Squash (*DAO).StoreAsTransaction and
(*DAO).StoreConflictingTransactions. It's better to keep them together,
because StoreAsTransaction is always followed by
StoreConflictingTransactions, so it's effectively one atomic operation.
The logic wasn't changed.
Uint256DecodeStringLE is used a lot in clients (including our benchmark).
`Uint160` is already optimized.
```
name                     old time/op    new time/op    delta
Uint256DecodeStringLE-8     150ns ±15%     112ns ± 3%  -25.23%  (p=0.000 n=10+10)

name                     old alloc/op   new alloc/op   delta
Uint256DecodeStringLE-8     96.0B ± 0%     64.0B ± 0%  -33.33%  (p=0.000 n=10+10)

name                     old allocs/op  new allocs/op  delta
Uint256DecodeStringLE-8      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
```
Signed-off-by: Evgeniy Stratonikov <evgeniy@nspcc.ru>
We're using batches the wrong way during persist: we already have all changes
accumulated in two maps, then we move them to a batch, and then the batch is
applied. For some DBs like BoltDB this batch is just another MemoryStore, so
we essentially just shuffle the changeset from one map to another. For others
like LevelDB the batch is just a serialized set of KV pairs, which doesn't
help much on the subsequent PutBatch, we just duplicate the changeset again.
So introduce PutChangeSet that takes the two maps with sets and deletes
directly. It also simplifies the MemCachedStore logic.
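The idea can be sketched like this. The types below are simplified,
hypothetical versions written for illustration (the method set, signatures and
locking are assumptions, not the real storage package); the point is only that
Persist hands the two accumulated maps to the lower store without building a
batch first.

```go
package main

import (
	"fmt"
	"sync"
)

// Store is a hypothetical minimal persistent store interface.
type Store interface {
	Get(key []byte) ([]byte, bool)
	PutChangeSet(puts map[string][]byte, dels map[string]bool) error
}

// MemoryStore is a trivial map-backed Store.
type MemoryStore struct {
	mu  sync.RWMutex
	mem map[string][]byte
}

func NewMemoryStore() *MemoryStore { return &MemoryStore{mem: make(map[string][]byte)} }

func (s *MemoryStore) Get(key []byte) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.mem[string(key)]
	return v, ok
}

// PutChangeSet applies sets and deletes directly from the two maps.
func (s *MemoryStore) PutChangeSet(puts map[string][]byte, dels map[string]bool) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	for k, v := range puts {
		s.mem[k] = v
	}
	for k := range dels {
		delete(s.mem, k)
	}
	return nil
}

// MemCachedStore accumulates changes in two maps until Persist is called.
type MemCachedStore struct {
	puts map[string][]byte
	dels map[string]bool
	ps   Store
}

func NewMemCachedStore(ps Store) *MemCachedStore {
	return &MemCachedStore{puts: map[string][]byte{}, dels: map[string]bool{}, ps: ps}
}

func (c *MemCachedStore) Put(key, value []byte) {
	delete(c.dels, string(key))
	c.puts[string(key)] = value
}

func (c *MemCachedStore) Delete(key []byte) {
	delete(c.puts, string(key))
	c.dels[string(key)] = true
}

// Persist passes the accumulated maps to the lower store as is and returns
// the number of changes flushed.
func (c *MemCachedStore) Persist() (int, error) {
	n := len(c.puts) + len(c.dels)
	if err := c.ps.PutChangeSet(c.puts, c.dels); err != nil {
		return 0, err
	}
	c.puts, c.dels = map[string][]byte{}, map[string]bool{}
	return n, nil
}

func main() {
	c := NewMemCachedStore(NewMemoryStore())
	c.Put([]byte("key"), []byte("value"))
	fmt.Println(c.Persist())
}
```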
neo-bench for single node with 10 workers, LevelDB:

  Reference:
  RPS     30189.132 30556.448 30390.482 ≈ 30379 ±  0.61%
  TPS     29427.344 29418.687 29434.273 ≈ 29427 ±  0.03%
  CPU %      33.304    27.179    33.860 ≈ 31.45 ± 11.79%
  Mem MB    800.677   798.389   715.042 ≈   771 ±  6.33%

  Patched:
  RPS     30264.326 30386.364 30166.231 ≈ 30272 ±  0.36% ⇅
  TPS     29444.673 29407.440 29452.478 ≈ 29435 ±  0.08% ⇅
  CPU %      34.012    32.597    33.467 ≈ 33.36 ±  2.14% ⇅
  Mem MB    549.126   523.656   517.684 ≈   530 ±  3.15% ↓ 31.26%

BoltDB:

  Reference:
  RPS     31937.647 31551.684 31850.408 ≈ 31780 ±  0.64%
  TPS     31292.049 30368.368 31307.724 ≈ 30989 ±  1.74%
  CPU %      33.792    22.339    35.887 ≈ 30.67 ± 23.78%
  Mem MB   1271.687  1254.472  1215.639 ≈  1247 ±  2.30%

  Patched:
  RPS     31746.818 30859.485 31689.761 ≈ 31432 ±  1.58% ⇅
  TPS     31271.499 30340.726 30342.568 ≈ 30652 ±  1.75% ⇅
  CPU %      34.611    34.414    31.553 ≈ 33.53 ±  5.11% ⇅
  Mem MB   1262.960  1231.389  1335.569 ≈  1277 ±  4.18% ⇅
The VM always has an istack and it never changes, so doing this
microallocation makes no sense. Notice that estack is a bit harder to change:
we do replace it in some cases and we compare pointers to it as well.
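In Go terms the difference is between a struct field held by value and one
held by pointer. The sketch below uses made-up, minimal Stack and VM types
(not the real ones) to show the shape of it: the by-value istack is allocated
together with the VM, while estack stays a pointer because it can be swapped
and compared.

```go
package main

import "fmt"

// Stack is a hypothetical minimal stack.
type Stack struct {
	elems []int
}

func (s *Stack) Push(v int) { s.elems = append(s.elems, v) }
func (s *Stack) Len() int   { return len(s.elems) }

// VM is a hypothetical, stripped-down VM.
type VM struct {
	istack Stack  // held by value: part of the VM allocation itself
	estack *Stack // held by pointer: may be replaced and compared
}

// NewVM needs a separate allocation only for estack.
func NewVM() *VM {
	return &VM{estack: &Stack{}}
}

// Istack returns a pointer into the VM itself, so it is stable for the
// lifetime of the VM.
func (v *VM) Istack() *Stack { return &v.istack }

func main() {
	v := NewVM()
	v.Istack().Push(1)
	old := v.estack
	v.estack = &Stack{} // replacing estack changes the pointer identity
	fmt.Println(v.Istack().Len(), old == v.estack)
}
```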
It requires only two methods from Blockchainer: AddBlock and
BlockHeight. The new interface will make it easy to reuse the block queue
for state exchange purposes.
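A narrowed dependency of this kind could look roughly like the sketch below.
All names and signatures here are hypothetical (Blockqueuer, Block, Queue and
memChain are invented for the example), and the queue logic is reduced to the
minimum needed to show that only AddBlock and BlockHeight are required.

```go
package main

import (
	"errors"
	"fmt"
)

// Block is a hypothetical block carrying only its index.
type Block struct {
	Index uint32
}

// Blockqueuer is a hypothetical name for the narrowed interface: the only
// two methods the queue needs from the chain.
type Blockqueuer interface {
	AddBlock(b *Block) error
	BlockHeight() uint32
}

// Queue buffers out-of-order blocks and feeds them to the chain in order.
type Queue struct {
	chain   Blockqueuer
	pending map[uint32]*Block
}

func NewQueue(bc Blockqueuer) *Queue {
	return &Queue{chain: bc, pending: make(map[uint32]*Block)}
}

// PutBlock stores the block and then adds every consecutive block available.
func (q *Queue) PutBlock(b *Block) error {
	if b.Index <= q.chain.BlockHeight() {
		return nil // already processed
	}
	q.pending[b.Index] = b
	for {
		next, ok := q.pending[q.chain.BlockHeight()+1]
		if !ok {
			return nil
		}
		if err := q.chain.AddBlock(next); err != nil {
			return err
		}
		delete(q.pending, next.Index)
	}
}

// memChain is any consumer satisfying the interface, e.g. a test double or
// some other component that needs ordered blocks.
type memChain struct {
	height uint32
}

func (c *memChain) AddBlock(b *Block) error {
	if b.Index != c.height+1 {
		return errors.New("unexpected block index")
	}
	c.height = b.Index
	return nil
}

func (c *memChain) BlockHeight() uint32 { return c.height }

func main() {
	chain := &memChain{}
	q := NewQueue(chain)
	_ = q.PutBlock(&Block{Index: 2}) // waits for block 1
	_ = q.PutBlock(&Block{Index: 1}) // releases blocks 1 and 2
	fmt.Println(chain.BlockHeight())
}
```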