Most of the time we don't need locking on the higher-level stores and we drop
them after Persist, so that's what private MemCachedStore is for.
It doesn't improve things in any noticeable way, some ~1% can be observed in
neo-bench under various loads and even less than that in chain processing. But
it seems to be a bit better anyway (less allocations, less locks).
They never return errors, so their interface should reflect that. This allows
to remove quite a lot of useless and never tested code.
Notice that Get still does return an error. It can be made not to do that, but
usually we need to differentiate between successful/unsuccessful accesses
anyway, so this doesn't help much.
Simple and dumb as it is, this allows to separate contract storage from other
things and dramatically improve Seek() time over storage (even though it's
still unordered!) which in turn improves block processing speed.
LevelDB LevelDB (KeepOnlyLatest) BoltDB BoltDB (KeepOnlyLatest)
Master real 16m27,936s real 10m9,440s real 16m39,369s real 8m1,227s
user 20m12,619s user 26m13,925s user 18m9,162s user 18m5,846s
sys 2m56,377s sys 1m32,051s sys 9m52,576s sys 2m9,455s
2 maps real 10m49,495s real 8m53,342s real 11m46,204s real 5m56,043s
user 14m19,922s user 24m6,225s user 13m25,691s user 15m4,694s
sys 1m53,021s sys 1m23,006s sys 4m31,735s sys 2m8,714s
neo-bench performance is mostly unaffected, ~0.5% for 1-1 test and 4% for
10K-10K test both fall within regular test error range.
Initially I thought of doing it in the next persist cycle, but testing shows
that it needs just ~2-5% of the time MPT GC does, so doing it in the same
cycle doesn't affect anything.
It's very special, single-purpose thing, but it improves cumulative time spent
in GC by ~10% for LevelDB and by ~36% for BoltDB during 1050K mainnet chain
processing. While the overall chain import time doesn't change in any
noticeable way (~1%), I think it's still worth it, for machines with slower
disks the difference might be more noticeable.
Batch is only relevant in multithreaded context, internally it'll do some
magic and use the same locking/updating Update does, so it makes little sense
for us. This doesn't change benchmarks in any noticeable way.
The key idea here is that even though we can't ensure MPT code won't make the
node active again we can order the changes made to the persistent store in
such a way that it practically doesn't matter. What happens is:
* after persist if it's time to collect our garbage we do it synchronously
right in the same thread working the underlying persistent store directly
* all the other node code doesn't see much of it, it works with bc.dao or
layers above it
* if MPT doesn't find some stale deactivated node in the storage it's OK,
it'll recreate it in bc.dao
* if MPT finds it and activates it, it's OK too, bc.dao will store it
* while GC is being performed nothing else changes the persistent store
* all subsequent bc.dao persists only happen after the GC is completed which
means that any changes to the (potentially) deleted nodes have a priority,
it's OK for GC to delete something that'll be recreated with the next
persist cycle
Otherwise it's a simple scheme with node status/last active height stored in
the value. Preliminary tests show that it works ~18% worse than the simple
KeepOnlyLatest scheme, but this seems to be the best result so far.
Fixes#2095.
Add "active" flag into the node data and make the remainder modal, for active
nodes it's a reference counter, for inactive ones the deactivation height is
stored.
Technically, refcounted chains storing just one trie don't need a flag, but
it's a bit simpler this way.
Turns out that caching the golang:windowsservercore-ltsc2022 image between
GithubAction workflow runs is a bad idea because `docker load` command still takes
too long to load image from cached archive (~9-10min on standard windows runner).
And after that runner still needs to build the neo-go image itself.
However, standard GA windows runner is supplied with prefetched latest
mcr.microsoft.com/windows/servercore:ltsc2022 image, so using it costs almost
nothing. Thus, the other approach is implemented: we use standard
mcr.microsoft.com/windows/servercore:ltsc2022 image as both build-base and
final-base. Then we install all required tools for building neo-go manually
on build-base image (these tools are git and go 1.17). Compared to the first
approach, the publishing job (included build and publish to DockerHub)
takes ~7-8min to finish.
Some tests are failing on Windows due to slow runners with errors like the following:
```
2022-02-09T17:11:20.3127016Z --- FAIL: TestGetData/transaction (1.82s)
2022-02-09T17:11:20.3127385Z server_test.go:500:
2022-02-09T17:11:20.3127878Z Error Trace: server_test.go:500
2022-02-09T17:11:20.3128533Z server_test.go:520
2022-02-09T17:11:20.3128978Z Error: Condition never satisfied
2022-02-09T17:11:20.3129479Z Test: TestGetData/transaction
```
These tests are slow on Windows, so refactor them a bit and avoid the following
error:
```
--- FAIL: TestRunWithDifferentArguments (4.01s)
cli_test.go:96:
Error Trace: cli_test.go:96
cli_test.go:321
Error: command took too long time
Test: TestRunWithDifferentArguments
```
Currently we can't properly stop running server on Windows and SIGHUP
is also not supported. This leads to occupied resources and failed
test cleanup:
```
--- FAIL: TestServerStart (0.35s)
--- FAIL: TestServerStart/good (0.10s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_good337747932\001\neogotestchain\000001.log:
The process cannot access the file because it is being used by another process.
2022-02-08T14:11:20.959+0300 INFO persisted to disk {"blocks": 0, "keys": 112, "headerHeight": 0, "blockHeight": 0, "took": "10.0049ms"}
```
Blockchain occupies resources (e.g. it opens log files for DB, etc.)
on creation and running. We need to release these resources if something
goes wrong during execution chain-related commands.
This commit solves the following problem on Windows:
```
--- FAIL: TestServerStart (0.32s)
--- FAIL: TestServerStart/stateroot_service_is_on_&&_StateRootInHeader=true (0.04s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_stateroot_service_is_on_&&_StateRootInHeader=true460557297\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_Oracle_config (0.03s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_Oracle_config810064028\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_consensus_config (0.04s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_consensus_config217270091\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_Notary_config (0.07s)
--- FAIL: TestServerStart/invalid_Notary_config/malformed_config (0.04s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_Notary_config_malformed_config754934830\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/invalid_Notary_config/invalid_wallet (0.03s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_invalid_Notary_config_invalid_wallet934249397\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
--- FAIL: TestServerStart/good (0.11s)
testing.go:894: TempDir RemoveAll cleanup: remove C:\Users\Anna\AppData\Local\Temp\TestServerStart_good596150160\001\neogotestchain\000001.log: The process cannot access the file because it is being used by another process.
```
This commit also unifies blockchain and services releasing code.
zap never closes open sinks except its own tests. This behaviour
prevents TestHandleLoggingParams from successful cleanup because
temp log output file can't be closed due to the following error:
```
TempDir RemoveAll cleanup: remove C:\\Users\\Anna\\AppData\\Local\\Temp\\TestHandleLoggingParams_debug5796883
33\\001\\file.log: The process cannot access the file because it is being used by another process.
```
So this tremendous cludge is made mosetly for our testing code.
It is not for concurrent usage (we don't have cases of
multithreaded access to logger output sink).