Commit graph

7287 commits

Author SHA1 Message Date
Anna Shaleva
b82374823e core: increase persist batch size for reset storage changes 2022-11-22 11:53:39 +03:00
Anna Shaleva
bdc42cd595 core: reset blocks, txs and AERs in several stages
Sometimes it can be hard to persist all changes at ones, the process
can take almost all RAM and a lot of time. Here's the example of reset
for mainnet from 2.4M to 1:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:16:48.236+0300	INFO	MaxBlockSize is not set or wrong, setting default value	{"MaxBlockSize": 262144}
2022-11-20T17:16:48.236+0300	INFO	MaxBlockSystemFee is not set or wrong, setting default value	{"MaxBlockSystemFee": 900000000000}
2022-11-20T17:16:48.237+0300	INFO	MaxTransactionsPerBlock is not set or wrong, using default value	{"MaxTransactionsPerBlock": 512}
2022-11-20T17:16:48.237+0300	INFO	MaxValidUntilBlockIncrement is not set or wrong, using default value	{"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:16:48.240+0300	INFO	restoring blockchain	{"version": "0.2.6"}
2022-11-20T17:16:48.297+0300	INFO	initialize state reset	{"target height": 1}
2022-11-20T17:16:48.300+0300	INFO	trying to reset blocks, transactions and AERs
2022-11-20T17:19:29.313+0300	INFO	blocks, transactions ans AERs are reset	{"took": "2m41.015126493s", "keys": 3958420}
...
```
To avoid OOM killer, split blocks reset into multiple stages. It increases
operation time due to intermediate DB persists, but makes things cleaner, the
result for almost the same DB height with the new approach:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -m --height 1
2022-11-20T17:39:42.023+0300	INFO	MaxBlockSize is not set or wrong, setting default value	{"MaxBlockSize": 262144}
2022-11-20T17:39:42.023+0300	INFO	MaxBlockSystemFee is not set or wrong, setting default value	{"MaxBlockSystemFee": 900000000000}
2022-11-20T17:39:42.023+0300	INFO	MaxTransactionsPerBlock is not set or wrong, using default value	{"MaxTransactionsPerBlock": 512}
2022-11-20T17:39:42.023+0300	INFO	MaxValidUntilBlockIncrement is not set or wrong, using default value	{"MaxValidUntilBlockIncrement": 5760}
2022-11-20T17:39:42.026+0300	INFO	restoring blockchain	{"version": "0.2.6"}
2022-11-20T17:39:42.071+0300	INFO	initialize state reset	{"target height": 1}
2022-11-20T17:39:42.073+0300	INFO	trying to reset blocks, transactions and AERs
2022-11-20T17:40:11.735+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 1, "took": "29.66363737s", "keys": 210973}
2022-11-20T17:40:33.574+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 2, "took": "21.839208683s", "keys": 241203}
2022-11-20T17:41:29.325+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 3, "took": "55.750698386s", "keys": 250593}
2022-11-20T17:42:12.532+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 4, "took": "43.205892757s", "keys": 321896}
2022-11-20T17:43:07.978+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 5, "took": "55.445398156s", "keys": 334822}
2022-11-20T17:43:35.603+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 6, "took": "27.625292032s", "keys": 317131}
2022-11-20T17:43:51.747+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 7, "took": "16.144359017s", "keys": 355832}
2022-11-20T17:44:05.176+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 8, "took": "13.428733899s", "keys": 357690}
2022-11-20T17:44:32.895+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 9, "took": "27.718548783s", "keys": 393356}
2022-11-20T17:44:51.814+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 10, "took": "18.917954658s", "keys": 366492}
2022-11-20T17:45:07.208+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 11, "took": "15.392642196s", "keys": 326030}
2022-11-20T17:45:18.776+0300	INFO	intermediate batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 12, "took": "11.568255716s", "keys": 299884}
2022-11-20T17:45:25.862+0300	INFO	last batch of removed blocks, transactions and AERs is persisted	{"batches persisted": 13, "took": "7.086079594s", "keys": 190399}
2022-11-20T17:45:25.862+0300	INFO	blocks, transactions ans AERs are reset	{"took": "5m43.791214084s", "overall persisted keys": 3966301}
...
```
2022-11-22 11:53:39 +03:00
Anna Shaleva
d67f0df516 core: reset block headers together with header height info
We need to keep the headers information consistent with header batches
and headers. This comit fixes the bug with failing blockchain
initialization on recovering from state reset interrupted after the
second stage (blocks/txs/AERs removal):
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T16:28:29.437+0300	INFO	MaxValidUntilBlockIncrement is not set or wrong, using default value	{"MaxValidUntilBlockIncrement": 5760}
2022-11-20T16:28:29.440+0300	INFO	restoring blockchain	{"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: could not get header 1898cd356a4a2688ed1c6c7ba1fd6ba7d516959d8add3f8dd26232474d4539bd: key not found
```
2022-11-22 11:53:39 +03:00
Anna Shaleva
283da8f599 core: use DAO-provided block height during during state reset
Don't use cache because it's not yet initialized. Also, perform
safety checks only if state reset wasn't yet started. These fixes
alloww to solve the following problem while recovering from
interrupted state reset:
```
anna@kiwi:~/Documents/GitProjects/nspcc-dev/neo-go$ ./bin/neo-go db reset -t --height 83000
2022-11-20T15:51:31.431+0300	INFO	MaxValidUntilBlockIncrement is not set or wrong, using default value	{"MaxValidUntilBlockIncrement": 5760}
2022-11-20T15:51:31.434+0300	INFO	restoring blockchain	{"version": "0.2.6"}
failed to create Blockchain instance: could not initialize blockchain: current block height is 0, can't reset state to height 83000
```
2022-11-22 11:53:39 +03:00
Anna Shaleva
7d55bf2cc1 core: log persisted storage item batches count during state reset 2022-11-22 11:53:39 +03:00
Anna Shaleva
f52451e582 core: fix state reset with broken contract
Sync up with #2802, bad contract -> no contract ID at all.
2022-11-22 11:53:39 +03:00
Anna Shaleva
ecda07736e core: stop storage items reset after any seek error 2022-11-22 11:53:39 +03:00
Anna Shaleva
bfe7aeae7b core: stop storage items reset after the first persist error
It's a bug, we mustn't continue if something bad had happend on persist,
otherwise this error will be overwritten by subsequent successfull persist.
2022-11-22 11:53:39 +03:00
Anna Shaleva
235518eb6c core: reset batch counter to zero after each persist in resetStateInternal
It's a bug, otherwise we'll persist each storage item after 10K-th one,
that's the reason of abnormous long storage items resetting stage.
2022-11-22 11:53:39 +03:00
Anna Shaleva
9f23fafc03 core: improve logging of resetStateInternal
Inform when starting subsequent stage, inform about keys persisted.
2022-11-22 11:53:36 +03:00
Roman Khimov
0039615ae3
Merge pull request #2816 from nspcc-dev/fix-pointer-serialization
Fix pointer serialization
2022-11-20 22:59:52 +07:00
Roman Khimov
9ba18b5dfa stackitem: serialize/deserialize pointers, fix #2815
They of course can't be serialized, but in protected mode we still need to
handle them somehow.
2022-11-20 16:02:50 +03:00
Roman Khimov
286f40b828 cli/vm: use tx system fee by default 2022-11-20 16:02:50 +03:00
Roman Khimov
2908965a8d cli/vm: add --gas parameter to load* commands
It's useful to reproduce different execution problems including executions
stopped because of GAS limit. Satoshi representation is deliberately used,
because that's what is usually found in logs.
2022-11-20 16:02:50 +03:00
Roman Khimov
48140320db
Merge pull request #2812 from nspcc-dev/improve-vm-context-handling
Improve vm istack/estack handling
2022-11-20 19:42:35 +07:00
Roman Khimov
ca9fde745b
Merge pull request #2809 from nspcc-dev/fix-subs
rpcsrv: do not block blockchain events receiver by subscription requests
2022-11-18 16:16:41 +07:00
Roman Khimov
8e7f65be17 vm: use proper estack for exception handler
v.estack might be some inner invoked contract and its stack must not be used
for exception handler set up by higher-order contract.
2022-11-18 11:36:38 +03:00
Roman Khimov
cb64957af5 vm: don't use Stack for istack
We don't use all of the Stack functionality for it, so drop useless methods
and avoid some interface conversions. It increases single-node TPS by about
0.9%, so nothing really important there, but not a bad change either. Maybe it
can be reworked again with generics though.
2022-11-18 11:35:29 +03:00
Anna Shaleva
4df9a5e379 rpcsrv: refactor subscribe routine
Move shutdown check after subsCounterLock is taken in the end of
`(s *Server) subscribe` in order to avoid extra locks holding.
2022-11-18 10:54:10 +03:00
Roman Khimov
bb9f17108e
Merge pull request #2811 from nspcc-dev/network-connections-fix
Network connections fix
2022-11-18 14:26:25 +07:00
Anna Shaleva
b7f19a54d5 services: fix chain locked by WS subscriptions handlers
Blockchain's subscriptions, unsubscriptions and notifications are
handled by a single notificationDispatcher routine. Thus, on attempt
to send the subsequent event to Blockchain's subscribers, dispatcher
can't handle subscriptions\unsubscriptions. Make subscription and
unsubscription to be a non-blocking operation for blockchain on the
server side, otherwise it may cause the dispatcher locks.

To achieve this, use a separate lock for those code that make calls
to blockchain's subscription API and for subscription counters on
the server side.
2022-11-18 09:30:12 +03:00
Roman Khimov
2bcb7bd06f compiler: don't use (*VM).Istack when it's not needed 2022-11-17 20:46:06 +03:00
Roman Khimov
b8c09f509f network: add random slight delay to connection attempts
Small (especially dockerized/virtualized) networks often start all nodes at
ones and then we see a lot of connection flapping in the log. This happens
because nodes try to connect to each other simultaneously, establish two
connections, then each one finds a duplicate and drops it, but this can be
different duplicate connections on other sides, so they retry and it all
happens for some time. Eventually everything settles, but we have a lot of
garbage in the log and a lot of useless attempts.

This random waiting timeout doesn't change the logic much, adds a minimal
delay, but increases chances for both nodes to establish a proper single
connection on both sides to only then see another one and drop it on both
sides as well. It leads to almost no flapping in small networks, doesn't
affect much bigger ones. The delay is close to unnoticeable especially if
there is something in the DB for node to process during startup.
2022-11-17 18:42:43 +03:00
Roman Khimov
075a54192c network: don't try too many connections
Consider mainnet, it has an AttemptConnPeers of 20, so may already have 3
peers and request 20 more, then have 4th connected and attemtp 20 more again,
this leads to a huge number of connections easily.
2022-11-17 18:03:04 +03:00
Roman Khimov
5d7b37a6ff
Merge pull request #2810 from nspcc-dev/fix-loadgo
cli: split TestLoad into multiple parts
2022-11-17 21:50:46 +07:00
Roman Khimov
6bce973ac2 network: drop duplicationg check from handleAddrCmd()
It was relevant with the queue-based discoverer, now it's not, discoverer
handles this internally.
2022-11-17 17:42:36 +03:00
Roman Khimov
1c7487b8e4 network: add a timer to check for peers
Consider initial connection phase for public networks:
 * simultaneous connections to seeds
 * very quick handshakes
 * got five handshaked peers and some getaddr requests sent
 * but addr replies won't trigger new connections
 * so we can stay with just five connections until any of them breaks or a
   (long) address checking timer fires

This new timers solves the problem, it's adaptive at the same time. If we have
enough peers we won't be waking up often.
2022-11-17 17:32:05 +03:00
Anna Shaleva
e73c3c7ec4 services: adjust WS waiter test
Make it more stable.
2022-11-17 17:15:01 +03:00
Roman Khimov
23f118a1a9 network: rework discoverer/server interaction
* treat connected/handshaked peers separately in the discoverer, save
   "original" address for connected ones, it can be a name instead of IP and
   it's important to keep it to avoid reconnections
 * store name->IP mapping for seeds if and when they're connected to avoid
   reconnections
 * block seed if it's detected to be our own node (which is often the case for
   small private networks)
 * add an event for handshaked peers in the server, connected but
   non-handshaked ones are not really helpful for MinPeers or GetAddr logic

Fixes #2796.
2022-11-17 17:07:19 +03:00
Roman Khimov
6ba4afc977 network: consider handshaked peers only when comparing with MinPeers
We don't know a lot about non-handshaked ones, so it's safer to try more
connection.
2022-11-17 16:40:29 +03:00
Anna Shaleva
f8f8d5effe cli: split TestLoad into multiple parts
Problem: failing part of TestLoad:
```
=== RUN   TestLoad/loadgo,_check_signers
    cli_test.go:160:
        	Error Trace:	/home/circleci/go/src/github.com/nspcc-dev/neo-go/cli/vm/cli_test.go:160
        	            				/home/circleci/go/src/github.com/nspcc-dev/neo-go/cli/vm/cli_test.go:147
        	            				/home/circleci/go/src/github.com/nspcc-dev/neo-go/cli/vm/cli_test.go:444
        	Error:      	command took too long time
        	Test:       	TestLoad/loadgo,_check_signers
```
Solution: split the test into multiple parts to reduce test execution time.
2022-11-17 16:23:33 +03:00
Roman Khimov
f8949564c8
Merge pull request #2807 from nspcc-dev/release-0.99.6
CHANGELOG: release 0.99.6
2022-11-17 04:29:10 +07:00
Roman Khimov
ad912392b4 CHANGELOG: release 0.99.6 2022-11-17 00:25:53 +03:00
Roman Khimov
ab0ff63ce1
Merge pull request #2804 from nspcc-dev/check-aer-sub
rpc: fix subscribers locking logic and properly drain poll-based waiter receiver
2022-11-17 04:24:35 +07:00
Anna Shaleva
1399496dfb rpcclient: refactor event-based waiting loop
Avoid receiver channels locks.
2022-11-16 23:57:00 +03:00
Anna Shaleva
95e23c8e46 actor: fix event-based tx awaiting
If VUB-th block is received, we still can't guaranty that transaction
wasn't accepted to chain. Back this situation by rolling back to a
poll-based waiter.
2022-11-16 23:44:31 +03:00
Anna Shaleva
6dbae7edc4 rpcclient: fix WS-client unsubscription process
Do not block subscribers until the unsubscription request to RPC server
is completed. Otherwise, another notification may be received from the
RPC server which will block the unsubscription process.

At the same time, fix event-based waiter. We must not block the receiver
channel during unsubscription because there's a chance that subsequent
event will be sent by the server. We need to read this event in order not
to block the WSClient's readloop.
2022-11-16 23:44:30 +03:00
Anna Shaleva
ddaba9e74d rpcsrv: fix "subscribe" parameters handling
If it's a subscription for AERs, we need to check the filter's state only
if it has been provided, otherwise filter is always valid.
2022-11-16 14:05:13 +03:00
Anna Shaleva
d043139b66 rpcsrv: adjust "subscribe" response error
Make it more detailed for better debugging experience.
2022-11-16 13:35:19 +03:00
Anna Shaleva
3f122fd591 rpcclient: adjust WS waiter error formatting
Follow the other errors formatting style.
2022-11-16 12:22:18 +03:00
Roman Khimov
5f763a26e7
Merge pull request #2802 from nspcc-dev/fix-node-startup-with-broken-contract
Fix node startup with broken contract
2022-11-16 16:21:56 +07:00
Roman Khimov
822722bd2e native: ignore decoding errors during cache init
Bad contract -> no contract. Unfortunately we've got a broken
6f1837723768f27a6f6a14452977e3e0e264f2cc contract on the mainnet which can't
be decoded (even though it had been saved successfully), so this is a
temporary fix for #2801 to be able to start mainnet node after shutdown.
2022-11-16 12:00:28 +03:00
Roman Khimov
0402a75d20 cli: fix error message formatting
This is not very useful:
   could not initialize blockchain: %!w([]interface {}=[0xc0018da2c0])
2022-11-16 11:34:29 +03:00
Roman Khimov
bd67fe5371
Merge pull request #2800 from nspcc-dev/fix-vm-bugz
Fix VM bug and istack printing
2022-11-16 13:24:07 +07:00
Roman Khimov
aef01bf663 vm: fix istack marshaling, fix #2799 2022-11-16 00:40:12 +03:00
Roman Khimov
90582faacd vm: save current stack slice when loading new context
v.estack is used throughout the code to work with estack, while ctx.sc.estack
is (theoretically) just a reference to it that is saved on script load and
restored to v.estack on context unload. The problem is that v.estack can grow
as we use it and can be reallocated away from its original slice (saved in the
ctx.sc.estack), so either ctx.sc.estack should be a pointer or we need to
ensure that it's correct when loading a new script. The second approach is a
bit safer for now and it fixes #2798.
2022-11-15 23:48:02 +03:00
Roman Khimov
e9407e2054
Merge pull request #2793 from nspcc-dev/ga-upd
.github: upgrade stale GithubActions workflow dependencies
2022-11-15 20:25:51 +07:00
Anna Shaleva
eaa0d42b9c .github: fix condition for adding latest tag to buider job 2022-11-15 16:16:02 +03:00
Anna Shaleva
59741bbffc .github: update deprecated workflows 2022-11-15 16:16:01 +03:00
Roman Khimov
5197723dfe
Merge pull request #2797 from nspcc-dev/adj-deposit-warning
network: adjust warning for deposit expiration
2022-11-15 18:40:22 +07:00