Commit graph

9 commits

Author SHA1 Message Date
Anna Shaleva
7ba88e98e2 core: optimize (*MemCachedStore).Seek operation
Real persistent storage guarantees that the result of Seek is sorted
by key. The idea of the optimisation is to merge the two sorted seek
results (memStore + persistentStore) into one, so that
(*MemCachedStore).Seek returns a sorted list. The only thing
that remains is to sort the items returned by (*MemoryStore).Seek.
2021-10-21 10:05:12 +03:00
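The merge step described in this commit can be sketched as a standard two-way merge of key-sorted slices. This is a minimal illustration, not the code from the commit: the `KeyValue` type and `mergeSorted` function are hypothetical names, the rule that the in-memory entry wins on equal keys is an assumption, and deleted keys are not handled.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// KeyValue is a hypothetical stand-in for a storage key-value pair.
type KeyValue struct {
	Key   []byte
	Value []byte
}

// mergeSorted merges two key-sorted slices into one key-sorted slice.
// When both sides hold the same key, the entry from mem is kept, on the
// assumption that the in-memory layer holds the newer value.
func mergeSorted(mem, ps []KeyValue) []KeyValue {
	out := make([]KeyValue, 0, len(mem)+len(ps))
	i, j := 0, 0
	for i < len(mem) && j < len(ps) {
		switch c := bytes.Compare(mem[i].Key, ps[j].Key); {
		case c < 0:
			out = append(out, mem[i])
			i++
		case c > 0:
			out = append(out, ps[j])
			j++
		default:
			out = append(out, mem[i])
			i++
			j++
		}
	}
	out = append(out, mem[i:]...)
	return append(out, ps[j:]...)
}

func main() {
	// Items collected from a map-based memory store come in no particular
	// order, so they are sorted before merging.
	mem := []KeyValue{
		{Key: []byte("c"), Value: []byte("3")},
		{Key: []byte("a"), Value: []byte("1m")},
	}
	sort.Slice(mem, func(i, j int) bool {
		return bytes.Compare(mem[i].Key, mem[j].Key) < 0
	})
	// The persistent store is assumed to return already sorted results.
	ps := []KeyValue{
		{Key: []byte("a"), Value: []byte("1p")},
		{Key: []byte("b"), Value: []byte("2")},
	}
	for _, kv := range mergeSorted(mem, ps) {
		fmt.Printf("%s=%s\n", kv.Key, kv.Value)
	}
}
```

Only the memory-side slice needs an explicit sort here; the merge itself is linear in the total number of items.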
Anna Shaleva
d8210c0137 core: add benchmarks for iterator.Next, MemCached.Seek, Mem.Seek 2021-10-21 10:05:12 +03:00
Anna Shaleva
8d8071f97e core: distinguish storage.KeyValue and storage.KeyValueExists
We need the Exists field for storage batch related code; other cases can do
without it, so add a new KeyValue structure and refactor the related code.
2021-10-21 10:05:12 +03:00
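One way to picture the split between the two types is a plain pair plus a wrapper that embeds it and adds the flag. The field layout below is an assumption made for illustration, and `markExisting` is a hypothetical helper, not a function from the repository.

```go
package main

import "fmt"

// KeyValue is the plain pair used where existence does not matter.
type KeyValue struct {
	Key   []byte
	Value []byte
}

// KeyValueExists adds the Exists flag needed by batch-related code.
// Embedding KeyValue is an assumed layout, chosen for this sketch.
type KeyValueExists struct {
	KeyValue
	Exists bool
}

// markExisting builds batch entries for a change set, setting Exists for
// keys that were already present in the store before the change.
func markExisting(changes []KeyValue, stored map[string][]byte) []KeyValueExists {
	out := make([]KeyValueExists, 0, len(changes))
	for _, kv := range changes {
		_, ok := stored[string(kv.Key)]
		out = append(out, KeyValueExists{KeyValue: kv, Exists: ok})
	}
	return out
}

func main() {
	stored := map[string][]byte{"a": []byte("old")}
	changes := []KeyValue{
		{Key: []byte("a"), Value: []byte("new")},
		{Key: []byte("b"), Value: []byte("2")},
	}
	for _, e := range markExisting(changes, stored) {
		fmt.Printf("%s exists=%v\n", e.Key, e.Exists)
	}
}
```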
Roman Khimov
ae071d4542 storage: introduce PutChangeSet and use it for Persist
We're using batches the wrong way during persist: all changes are already
accumulated in two maps, and then we move them to a batch which is then
applied. For some DBs like BoltDB this batch is just another MemoryStore, so
we essentially just shuffle the changeset from one map to another. For others
like LevelDB the batch is just a serialized set of KV pairs, which doesn't
help much on the subsequent PutBatch; we just duplicate the changeset again.

So introduce PutChangeSet, which takes the two maps with sets and deletes
directly. It also simplifies the MemCachedStore logic.
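
As a rough sketch of the idea, a store can accept the two maps as they are and apply them without an intermediate batch. The signature, the `mapStore` type and the assumption that a key appears in only one of the two maps are illustrative choices, not taken from the commit.

```go
package main

import "fmt"

// Store is a hypothetical minimal interface with a change-set write.
type Store interface {
	PutChangeSet(puts map[string][]byte, dels map[string]bool) error
}

// mapStore is a toy in-memory backend used only for this sketch.
type mapStore struct {
	data map[string][]byte
}

// PutChangeSet applies sets and deletes directly from the caller's maps.
// The two maps are assumed to be disjoint, so the order does not matter.
func (s *mapStore) PutChangeSet(puts map[string][]byte, dels map[string]bool) error {
	for k, v := range puts {
		s.data[k] = v
	}
	for k := range dels {
		delete(s.data, k)
	}
	return nil
}

func main() {
	var s Store = &mapStore{data: map[string][]byte{"old": []byte("x")}}
	puts := map[string][]byte{"a": []byte("1")}
	dels := map[string]bool{"old": true}
	if err := s.PutChangeSet(puts, dels); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(s.(*mapStore).data))
}
```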

neo-bench for single node with 10 workers, LevelDB:

  Reference:

  RPS    30189.132 30556.448 30390.482 ≈ 30379    ±  0.61%
  TPS    29427.344 29418.687 29434.273 ≈ 29427    ±  0.03%
  CPU %     33.304    27.179    33.860 ≈    31.45 ± 11.79%
  Mem MB   800.677   798.389   715.042 ≈   771    ±  6.33%

  Patched:

  RPS    30264.326 30386.364 30166.231 ≈ 30272    ± 0.36% ⇅
  TPS    29444.673 29407.440 29452.478 ≈ 29435    ± 0.08% ⇅
  CPU %     34.012    32.597    33.467 ≈   33.36  ± 2.14% ⇅
  Mem MB   549.126   523.656   517.684 ≈  530     ± 3.15% ↓ 31.26%

BoltDB:

  Reference:

  RPS    31937.647 31551.684 31850.408 ≈ 31780    ±  0.64%
  TPS    31292.049 30368.368 31307.724 ≈ 30989    ±  1.74%
  CPU %     33.792    22.339    35.887 ≈    30.67 ± 23.78%
  Mem MB  1271.687  1254.472  1215.639 ≈  1247    ±  2.30%

  Patched:

  RPS    31746.818 30859.485 31689.761 ≈ 31432    ± 1.58% ⇅
  TPS    31271.499 30340.726 30342.568 ≈ 30652    ± 1.75% ⇅
  CPU %     34.611    34.414    31.553 ≈    33.53 ± 5.11% ⇅
  Mem MB  1262.960  1231.389  1335.569 ≈  1277    ± 4.18% ⇅
2021-08-12 17:42:16 +03:00
Roman Khimov
b9be892bf9 storage: allow accessing MemCachedStore during Persist
Persist by its definition doesn't change the visible state of MemCachedStore:
all KV pairs that were accessible via it before Persist remain accessible
after Persist. The only thing it does is flush the current set of KV pairs
from memory to the persistent store. To do that it needs read-only access to
the current KV pair set, but technically it then replaces maps, so we have to
use a full write lock, which makes MemCachedStore inaccessible for the
duration of Persist. And Persist can take a lot of time, since it involves
disk access for regular DBs.

What we do here is create new in-memory maps for MemCachedStore before
flushing the old ones to the persistent store. Then a fake persistent store
is created, which is actually a MemCachedStore with the old maps, so it has
exactly the same visible state. This Store is never accessed for writes, so
we can read it without taking any internal locks, and at the same time we no
longer need write locks for the original MemCachedStore, as we're not using
it. All of this makes it possible to use MemCachedStore as usual: reads are
handled by going down to whatever level is needed and writes are handled by
the new maps. So while Persist for (*Blockchain).dao does its most
time-consuming work we can process other blocks (reading data for
transactions and persisting storeBlock caches to (*Blockchain).dao).
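
The map-swapping scheme can be sketched as follows. This is a simplified illustration under several assumptions: all type and method names are hypothetical, deletes are left out, and only one Persist call is assumed to run at a time.

```go
package main

import (
	"fmt"
	"sync"
)

// reader is anything that can answer point lookups.
type reader interface {
	Get(key string) ([]byte, bool)
}

// diskStore is a hypothetical stand-in for a slow persistent store.
type diskStore struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func (d *diskStore) Get(key string) ([]byte, bool) {
	d.mu.RLock()
	defer d.mu.RUnlock()
	v, ok := d.data[key]
	return v, ok
}

func (d *diskStore) putAll(m map[string][]byte) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for k, v := range m {
		d.data[k] = v
	}
}

// frozenLayer exposes the old map read-only on top of the disk store.
// Nothing writes to mem once it is frozen, so Get takes no lock.
type frozenLayer struct {
	mem   map[string][]byte
	lower reader
}

func (f *frozenLayer) Get(key string) ([]byte, bool) {
	if v, ok := f.mem[key]; ok {
		return v, true
	}
	return f.lower.Get(key)
}

// cachedStore keeps fresh writes in mem and reads through to lower.
type cachedStore struct {
	mu    sync.RWMutex
	mem   map[string][]byte
	lower reader
	disk  *diskStore
}

func newCachedStore(d *diskStore) *cachedStore {
	return &cachedStore{mem: map[string][]byte{}, lower: d, disk: d}
}

func (c *cachedStore) Put(key string, value []byte) {
	c.mu.Lock()
	c.mem[key] = value
	c.mu.Unlock()
}

func (c *cachedStore) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	v, ok := c.mem[key]
	lower := c.lower
	c.mu.RUnlock()
	if ok {
		return v, true
	}
	return lower.Get(key)
}

// Persist swaps in a fresh map under a short lock, flushes the old map
// without holding that lock, then drops the temporary frozen layer.
// It returns the number of flushed keys.
func (c *cachedStore) Persist() int {
	c.mu.Lock()
	old := c.mem
	c.mem = make(map[string][]byte)
	c.lower = &frozenLayer{mem: old, lower: c.disk}
	c.mu.Unlock()

	c.disk.putAll(old) // slow part: c.mu is not held here

	c.mu.Lock()
	c.lower = c.disk
	c.mu.Unlock()
	return len(old)
}

func main() {
	c := newCachedStore(&diskStore{data: map[string][]byte{}})
	c.Put("a", []byte("1"))
	n := c.Persist()
	v, ok := c.Get("a")
	fmt.Println(n, string(v), ok)
}
```

While the flush runs, readers see the old data through the frozen layer and writers fill the new map, so the visible state stays the same throughout.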

The change was tested for performance with neo-bench (single node, 10 workers,
LevelDB) on two machines and block dump processing (RC4 testnet up to 62800
with VerifyBlocks set to false) on i7-8565U.

Reference results (bbe4e9cd7b):

Ryzen 9 5950X:
RPS     23616.969 22817.086 23222.378  ≈ 23218   ± 1.72%
TPS     23047.316 22608.578 22735.540  ≈ 22797   ± 0.99%
CPU %      23.434    25.553    23.848  ≈    24.3 ± 4.63%
Mem MB    600.636   503.060   582.043  ≈   562   ± 9.22%

Core i7-8565U:
RPS     6594.007 6499.501 6572.902  ≈ 6555   ± 0.76%
TPS     6561.680 6444.545 6510.120  ≈ 6505   ± 0.90%
CPU %     58.452   60.568   62.474  ≈   60.5 ± 3.33%
Mem MB   234.893  285.067  269.081  ≈  263   ± 9.75%

DB restore:
real    0m22.237s 0m23.471s 0m23.409s  ≈ 23.04 ± 3.02%
user    0m35.435s 0m38.943s 0m39.247s  ≈ 37.88 ± 5.59%
sys      0m3.085s  0m3.360s  0m3.144s  ≈  3.20 ± 4.53%

After the change:

Ryzen 9 5950X:
RPS     27747.349 27407.726 27520.210  ≈ 27558   ± 0.63%  ↑ 18.69%
TPS     26992.010 26993.468 27010.966  ≈ 26999   ± 0.04%  ↑ 18.43%
CPU %      28.928    28.096    29.105  ≈    28.7 ± 1.88%  ↑ 18.1%
Mem MB    760.385   726.320   756.118  ≈   748   ± 2.48%  ↑ 33.10%

Core i7-8565U:
RPS     7783.229 7628.409 7542.340  ≈ 7651   ± 1.60%  ↑ 16.72%
TPS     7708.436 7607.397 7489.459  ≈ 7602   ± 1.44%  ↑ 16.85%
CPU %     74.899   71.020   72.697  ≈   72.9 ± 2.67%  ↑ 20.50%
Mem MB   438.047  436.967  416.350  ≈  430   ± 2.84%  ↑ 63.50%

DB restore:
real    0m20.838s 0m21.895s 0m21.794s  ≈ 21.51 ± 2.71%  ↓ 6.64%
user    0m39.091s 0m40.565s 0m41.493s  ≈ 40.38 ± 3.00%  ↑ 6.60%
sys      0m3.184s  0m2.923s  0m3.062s  ≈  3.06 ± 4.27%  ↓ 4.38%

It obviously uses more memory now and utilizes CPU more aggressively, but at
the same time it improves all relevant metrics and finally reaches a
situation where we process 50K transactions in less than a second on Ryzen 9
5950X (going higher than 25K TPS). The other observation is a much more
stable block time; on Ryzen 9 it's as close to 1 second as it could be.
2021-08-02 16:33:00 +03:00
Roman Khimov
4758de71ec storage: optimize (*MemCachedStore).Persist for memory-backed ps
Most of the time it's persisted into a MemoryStore or MemCachedStore. When
that's the case there is no real need to go through the Batch mechanism, as
it incurs multiple copies of the data.
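
A fast path of this kind can be sketched with a type assertion on the target store. The types and the `persist` function below are hypothetical and only illustrate the shape of the shortcut: write straight into the target map when the target is memory-backed, otherwise fall back to building a batch.

```go
package main

import "fmt"

// KeyValue is a hypothetical batch element.
type KeyValue struct {
	Key   []byte
	Value []byte
}

// Store is a hypothetical persistent-store interface with a batch write.
type Store interface {
	PutBatch(batch []KeyValue)
}

// MemoryStore is a toy map-backed store.
type MemoryStore struct {
	mem map[string][]byte
}

func (s *MemoryStore) PutBatch(batch []KeyValue) {
	for _, kv := range batch {
		s.mem[string(kv.Key)] = kv.Value
	}
}

// diskStore stands in for any non-memory backend; it only counts batches.
type diskStore struct {
	batches int
}

func (d *diskStore) PutBatch(batch []KeyValue) { d.batches++ }

// persist flushes changes into ps and reports which path was taken.
func persist(changes map[string][]byte, ps Store) string {
	if ms, ok := ps.(*MemoryStore); ok {
		// Memory-backed target: copy entries directly, no batch needed.
		for k, v := range changes {
			ms.mem[k] = v
		}
		return "direct"
	}
	// Generic target: build a batch first, which copies the data again.
	batch := make([]KeyValue, 0, len(changes))
	for k, v := range changes {
		batch = append(batch, KeyValue{Key: []byte(k), Value: v})
	}
	ps.PutBatch(batch)
	return "batch"
}

func main() {
	changes := map[string][]byte{"a": []byte("1")}
	fmt.Println(persist(changes, &MemoryStore{mem: map[string][]byte{}}))
	fmt.Println(persist(changes, &diskStore{}))
}
```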

Importing 1.5M mainnet blocks with verification turned off, before:
real    12m39,484s
user    20m48,300s
sys     2m25,022s

After:
real    11m15,053s
user    18m2,755s
sys     2m4,162s

So it's an improvement of around 10%, which looks good enough.
2020-03-28 17:21:50 +03:00
Evgenii Stratonikov
0a894db7f8 storage: add Exists flag to KeyValue in batch
Set Exists flag if an item with the specified key was already
present in storage before persisting.
2020-02-12 12:16:31 +03:00
Evgenii Stratonikov
fb9af98179 storage: implement GetBatch() to view storage changes
GetBatch returns changes to be persisted.
2020-02-12 12:16:31 +03:00
Roman Khimov
fc0031e5aa core: move write caching layer into MemCacheStore
Simplify Blockchain and associated functions, deduplicate code, fix Get() and
Seek() implementations.
2019-10-16 17:33:45 +03:00