The key idea here is that even though we can't ensure that MPT code won't make
a node active again, we can order the changes made to the persistent store in
such a way that it practically doesn't matter. What happens is:
* after persist, if it's time to collect garbage, we do it synchronously in
the same thread, working with the underlying persistent store directly
* all the other node code doesn't see much of it, it works with bc.dao or
layers above it
* if MPT doesn't find some stale deactivated node in the storage it's OK,
it'll recreate it in bc.dao
* if MPT finds it and activates it, it's OK too, bc.dao will store it
* while GC is being performed nothing else changes the persistent store
* all subsequent bc.dao persists only happen after the GC is completed, which
means that any changes to the (potentially) deleted nodes take priority, so
it's OK for GC to delete something that'll be recreated with the next
persist cycle
Otherwise it's a simple scheme with node status/last active height stored in
the value. Preliminary tests show that it works ~18% worse than the simple
KeepOnlyLatest scheme, but this seems to be the best result so far.
Fixes #2095.
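The ordering described above can be illustrated with a toy model. This is only
a sketch under stated assumptions: `store`, `nodeValue`, `persist` and
`collectGarbage` are hypothetical names, plain maps stand in for bc.dao's
memory layer and the persistent store, and none of it is the actual neo-go
code.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeValue is a hypothetical stored value: a node is either active or
// has been inactive since some height.
type nodeValue struct {
	active     bool
	inactiveAt uint32
}

// store is a toy stand-in for both the memory layer and the persistent store.
type store map[string]nodeValue

// persist flushes all pending changes from mem into disk and clears mem.
func persist(disk, mem store) {
	for k, v := range mem {
		disk[k] = v
		delete(mem, k)
	}
}

// collectGarbage removes nodes deactivated at or before upTo. It is meant to
// run synchronously right after persist, so nothing else touches disk.
func collectGarbage(disk store, upTo uint32) int {
	removed := 0
	for k, v := range disk {
		if !v.active && v.inactiveAt <= upTo {
			delete(disk, k)
			removed++
		}
	}
	return removed
}

// runCycle plays one persist, GC, persist sequence and returns the disk state.
func runCycle() store {
	disk := store{}
	mem := store{
		"a": {active: false, inactiveAt: 5},
		"b": {active: true},
		"c": {active: false, inactiveAt: 7},
	}
	persist(disk, mem)
	// Higher-level code reactivates "a", but only in the memory layer.
	mem["a"] = nodeValue{active: true}
	// GC works on disk directly and deletes the stale "a" and "c".
	collectGarbage(disk, 10)
	// The next persist happens after GC, so the reactivated "a" wins.
	persist(disk, mem)
	return disk
}

// summary lists the surviving keys in sorted order.
func summary(s store) string {
	keys := make([]string, 0, len(s))
	for k := range s {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return fmt.Sprint(keys)
}

func main() {
	fmt.Println(summary(runCycle())) // prints "[a b]"
}
```

The point of the sketch is that GC may safely delete "a" from the persistent
store, because the version held in the memory layer is written back by the
persist that follows.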
Add an "active" flag to the node data and make the remainder modal: for active
nodes it's a reference counter, for inactive ones it's the deactivation
height.
Technically, refcounted chains storing just one trie don't need a flag, but
it's a bit simpler this way.
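A minimal sketch of such a modal value, assuming a hypothetical layout of one
flag byte followed by a 4-byte little-endian number (the real on-disk layout
may differ; `nodeMeta`, `encodeMeta`, `decodeMeta` and `release` are
illustrative names only):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// nodeMeta is a hypothetical trailer of the node data. Value is a reference
// counter when Active is true and the deactivation height otherwise.
type nodeMeta struct {
	Active bool
	Value  uint32
}

// encodeMeta serializes the flag and the modal number into 5 bytes
// (assumed layout, not taken from the actual code).
func encodeMeta(m nodeMeta) []byte {
	buf := make([]byte, 5)
	if m.Active {
		buf[0] = 1
	}
	binary.LittleEndian.PutUint32(buf[1:], m.Value)
	return buf
}

// decodeMeta is the inverse of encodeMeta.
func decodeMeta(b []byte) (nodeMeta, error) {
	if len(b) != 5 {
		return nodeMeta{}, fmt.Errorf("invalid meta length: %d", len(b))
	}
	return nodeMeta{
		Active: b[0] == 1,
		Value:  binary.LittleEndian.Uint32(b[1:]),
	}, nil
}

// release drops one reference. When the last reference goes away the node
// becomes inactive and the number switches to the deactivation height.
func release(m nodeMeta, height uint32) nodeMeta {
	if !m.Active {
		return m
	}
	if m.Value > 1 {
		m.Value--
		return m
	}
	return nodeMeta{Active: false, Value: height}
}

func main() {
	m := nodeMeta{Active: true, Value: 2}
	m = release(m, 100) // still active, one reference left
	m = release(m, 101) // last reference dropped at height 101
	fmt.Println(m.Active, m.Value) // prints "false 101"
}
```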
Turns out that caching the golang:windowsservercore-ltsc2022 image between
GitHub Actions workflow runs is a bad idea because the `docker load` command
still takes too long to load the image from the cached archive (~9-10min on a
standard Windows runner). And after that the runner still needs to build the
neo-go image itself.
However, the standard GA Windows runner is supplied with a prefetched latest
mcr.microsoft.com/windows/servercore:ltsc2022 image, so using it costs almost
nothing. Thus, the other approach is implemented: we use the standard
mcr.microsoft.com/windows/servercore:ltsc2022 image as both build-base and
final-base. Then we manually install all the tools required for building
neo-go on the build-base image (these tools are git and go 1.17). With this
approach the publishing job (including build and publish to DockerHub)
takes ~7-8min to finish, which is less than the image load alone took with
the first approach.
Some tests are failing on Windows due to slow runners with errors like the following:
```
2022-02-09T17:11:20.3127016Z --- FAIL: TestGetData/transaction (1.82s)
2022-02-09T17:11:20.3127385Z server_test.go:500:
2022-02-09T17:11:20.3127878Z Error Trace: server_test.go:500
2022-02-09T17:11:20.3128533Z server_test.go:520
2022-02-09T17:11:20.3128978Z Error: Condition never satisfied
2022-02-09T17:11:20.3129479Z Test: TestGetData/transaction
```
These tests are slow on Windows, so refactor them a bit to avoid the following
error:
```
--- FAIL: TestRunWithDifferentArguments (4.01s)
cli_test.go:96:
Error Trace: cli_test.go:96
cli_test.go:321
Error: command took too long time
Test: TestRunWithDifferentArguments
```
Currently we can't properly stop a running server on Windows, and SIGHUP
is not supported there either. This leads to occupied resources and failed
test cleanup:
```
--- FAIL: TestServerStart (0.35s)
--- FAIL: TestServerStart/good (0.10s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_good337747932\001\neogotestchain\000001.log:
The process cannot access the file because it is being used by another process.
2022-02-08T14:11:20.959+0300 INFO persisted to disk {"blocks": 0, "keys": 112, "headerHeight": 0, "blockHeight": 0, "took": "10.0049ms"}
```
Blockchain occupies resources (e.g. it opens log files for the DB, etc.)
on creation and while running. We need to release these resources if something
goes wrong during execution of chain-related commands.
This commit solves the following problem on Windows:
```
--- FAIL: TestServerStart (0.32s)
--- FAIL: TestServerStart/stateroot_service_is_on_&&_StateRootInHeader=true (0.04s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_stateroot_service_is_on_&&_StateRootInHeader=true460557297\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_Oracle_config (0.03s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_Oracle_config810064028\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_consensus_config (0.04s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_consensus_config217270091\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_Notary_config (0.07s)
--- FAIL: TestServerStart/invalid_Notary_config/malformed_config (0.04s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_Notary_config_malformed_config754934830\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_Notary_config/invalid_wallet (0.03s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_Notary_config_invalid_wallet934249397\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/good (0.11s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_good596150160\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
```
This commit also unifies blockchain and services releasing code.
zap never closes open sinks except in its own tests. This behaviour
prevents TestHandleLoggingParams from cleaning up successfully because
the temporary log output file is never closed, which leads to the following
error:
```
TempDir RemoveAll cleanup: remove C:\\Users\\Anna\\AppData\\Local\\Temp\\TestHandleLoggingParams_debug579688333\\001\\file.log: The process cannot access the file because it is being used by another process.
```
So this tremendous kludge is made mostly for our testing code.
It is not safe for concurrent use (we don't have cases of
multithreaded access to the logger output sink).
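The idea can be sketched with the standard library alone. This does not use
zap's API and is not the actual implementation: `lastSink`, `openLogSink`,
`closeLogSink` and `demoCleanup` are hypothetical names, and like the kludge
itself the sketch is not safe for concurrent use.

```go
package main

import (
	"os"
	"path/filepath"
)

// lastSink remembers the most recently opened log file so that a test can
// close it explicitly. Not safe for concurrent use.
var lastSink *os.File

// openLogSink opens a log file for appending and remembers the handle.
func openLogSink(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	lastSink = f
	return f, nil
}

// closeLogSink closes the remembered file, if any, and forgets it.
func closeLogSink() error {
	if lastSink == nil {
		return nil
	}
	err := lastSink.Close()
	lastSink = nil
	return err
}

// demoCleanup mimics a test: write a log file into a temporary directory,
// close the sink, then remove the directory.
func demoCleanup() error {
	dir, err := os.MkdirTemp("", "sinkdemo")
	if err != nil {
		return err
	}
	f, err := openLogSink(filepath.Join(dir, "file.log"))
	if err != nil {
		return err
	}
	if _, err = f.WriteString("hello\n"); err != nil {
		return err
	}
	// Without this call the removal below is what fails on Windows.
	if err = closeLogSink(); err != nil {
		return err
	}
	return os.RemoveAll(dir)
}

func main() {
	if err := demoCleanup(); err != nil {
		panic(err)
	}
}
```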
Let's keep only one coverage job, so that CircleCI and GitHub Actions
coverage do not conflict with each other.
Also, let's not require all other tests to pass to report the
coverage.
They're misleading now that we have a variable number of committee
members/validators. The standby list can be seen in the configuration and the
appropriate numbers can be derived from it as well.