Commit graph

698 commits

Author SHA1 Message Date
25d5995cef [#2210] pilorama: Allocate bucket name outside of batches
1. Reduce allocations inside transactions.
2. Do not encode container ID to string: it allocates a lot and takes more
space.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
165a600624 [#2210] pilorama: Reduce the amount of keys per node
Under high load we are limited by the _amount_ of keys we need to update
in a single transaction. In this commit we try storing all state
with a single key.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
64a5294b27 [#2200] shard: Do not fetch big objects from blobovniczas
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
91757329ae [#2200] shard: Fix blobstor obj fetching
In the previous implementation any non-nil error that preceded object
fetching from blobstor led to iterating over every storage (in other words,
no storage ID information was taken into account). Now storage ID is
skipped only if metabase (storage ID source) returns any error.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
cf1a91a758 [#2206] blobovnicza: Use Latin letters in the code
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
6451f019d2 [#2203] shard: Do not panic in Close after unsuccessful Init
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov
ac81c70c09 [#1621] pilorama: Batch related operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
9009612a82 [#2198] blobovniczatree: Properly handle concurrent active blobovnicza update
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
cedbd380f2 [#2197] pilorama: Close database in degraded mode
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
b0ad1b9ed2 [#2193] pilorama: Use do in TreeMove
It should be similar to a `TreeAddByPath`. `applyOperation` is used for
`Apply` when the operation can be inserted in the middle of a log.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
ba393e3e91 [#2188] engine: Fix panic during setting shard mode
Under load changing shard mode can lead to it being removed from the
list during some other PUT.
```
Dec 28 07:01:26 az neofs-node[364505]: panic: runtime error: invalid memory address or nil pointer dereference
Dec 28 07:01:26 az neofs-node[364505]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0xc9fbb1]
Dec 28 07:01:26 az neofs-node[364505]: goroutine 11791912 [running]:
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).putToShard(0xc000435490, {0xc0003f7a28?, 0xc0001192c0?}, 0x2, {0x0, 0x>
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:91 +0x1b1
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).put.func1(0xc000435490?, {0xc0003f7a28?, 0xc0001192c0?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:71 +0x19c
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).iterateOverSortedShards(0x1?, {{0x62, 0x23, 0xfe, 0x60, 0x67, 0xd5, 0x>
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/shards.go:225 +0xc8
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).put(0xc000435490, {0x1?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:66 +0x2a9
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).Put.func1()
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:43 +0x2a
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).execIfNotBlocked(0x8?, 0x38?)
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/control.go:147 +0xcf
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).Put(0xc4df775a80?, {0x0?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:42 +0x65
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.Put(0xc06d928b80?, 0xc06b1b8dc8?)
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:158 +0x19
Dec 28 07:01:26 az neofs-node[364505]: main.engineWithoutNotifications.Put({0x20301b?}, 0x20301b?)
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
3d57f4c961 [#2179] test: Fix test TestEvacuateNetwork/multiple_shards,_evacuate_many
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-01-24 13:37:49 +03:00
9936b112b8 [#5] blobstor: Use generic LRU cache
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
4155c1bdff [#5] writecache: Use generic LRU cache
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
0272218eb9 [#2184] compression: Properly calculate upper bound
If the data is not compressible allocating `len(data)` will lead to a
slice reallocation. For a compressible data the results for small size
are flaky and we allocate a bit more. However, it feels right to use a
provided function if we need to pick any size at all.

```
name                                                           old time/op    new time/op    delta
Compression/size=128/zeroed_slice-8                              2.23µs ±12%    2.06µs ± 6%   -7.35%  (p=0.009 n=10+10)
Compression/size=128/not_so_random_slice_(block_=_123)-8         19.0µs ±10%    15.8µs ±16%  -17.09%  (p=0.000 n=9+10)
Compression/size=128/random_slice-8                              17.6µs ±15%    16.1µs ±16%     ~     (p=0.075 n=10+10)
Compression/size=1024/zeroed_slice-8                             3.05µs ±11%    2.84µs ±10%     ~     (p=0.089 n=10+10)
Compression/size=1024/not_so_random_slice_(block_=_123)-8        18.1µs ± 6%    18.2µs ±12%     ~     (p=0.971 n=10+10)
Compression/size=1024/random_slice-8                             48.6µs ± 6%    45.6µs ± 5%   -6.07%  (p=0.006 n=10+9)
Compression/size=32768/zeroed_slice-8                            26.8µs ± 3%    28.7µs ± 8%   +7.23%  (p=0.001 n=10+10)
Compression/size=32768/not_so_random_slice_(block_=_123)-8       44.3µs ± 8%    43.7µs ±13%     ~     (p=0.762 n=8+10)
Compression/size=32768/random_slice-8                            97.3µs ±32%    68.9µs ±15%  -29.13%  (p=0.000 n=10+10)
Compression/size=33554432/zeroed_slice-8                         29.8ms ± 9%    30.3ms ±17%     ~     (p=1.000 n=9+9)
Compression/size=33554432/not_so_random_slice_(block_=_123)-8    33.1ms ±14%    30.3ms ±11%   -8.61%  (p=0.043 n=10+10)
Compression/size=33554432/random_slice-8                         41.7ms ± 3%    30.1ms ± 8%  -27.72%  (p=0.000 n=9+10)

name                                                           old alloc/op   new alloc/op   delta
Compression/size=128/zeroed_slice-8                                128B ± 0%      144B ± 0%  +12.50%  (p=0.000 n=10+10)
Compression/size=128/not_so_random_slice_(block_=_123)-8           384B ± 0%      144B ± 0%  -62.50%  (p=0.000 n=10+10)
Compression/size=128/random_slice-8                                384B ± 0%      144B ± 0%  -62.50%  (p=0.000 n=10+10)
Compression/size=1024/zeroed_slice-8                             1.02kB ± 0%    1.15kB ± 0%  +12.50%  (p=0.000 n=10+10)
Compression/size=1024/not_so_random_slice_(block_=_123)-8        1.02kB ± 0%    1.15kB ± 0%  +12.50%  (p=0.000 n=10+10)
Compression/size=1024/random_slice-8                             2.56kB ± 0%    1.15kB ± 0%  -55.00%  (p=0.000 n=10+10)
Compression/size=32768/zeroed_slice-8                            32.8kB ± 0%    41.0kB ± 0%  +25.00%  (p=0.000 n=10+10)
Compression/size=32768/not_so_random_slice_(block_=_123)-8       32.8kB ± 0%    41.0kB ± 0%  +25.00%  (p=0.000 n=10+10)
Compression/size=32768/random_slice-8                            81.9kB ± 0%    41.0kB ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=33554432/zeroed_slice-8                         33.6MB ± 0%    33.6MB ± 0%   +0.02%  (p=0.000 n=9+9)
Compression/size=33554432/not_so_random_slice_(block_=_123)-8    33.6MB ± 0%    33.6MB ± 0%   +0.02%  (p=0.000 n=8+10)
Compression/size=33554432/random_slice-8                         75.5MB ± 0%    33.6MB ± 0%  -55.55%  (p=0.000 n=10+10)

name                                                           old allocs/op  new allocs/op  delta
Compression/size=128/zeroed_slice-8                                1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=128/not_so_random_slice_(block_=_123)-8           2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=128/random_slice-8                                2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=1024/zeroed_slice-8                               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=1024/not_so_random_slice_(block_=_123)-8          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=1024/random_slice-8                               2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=32768/zeroed_slice-8                              1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=32768/not_so_random_slice_(block_=_123)-8         1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=32768/random_slice-8                              2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=33554432/zeroed_slice-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=33554432/not_so_random_slice_(block_=_123)-8      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=33554432/random_slice-8                           2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
0ace28e43d [#2175] blobovniczatree: Close all non-active blobovniczas
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
c1cf418956 [#2175] blobovniczatree: Make function parameters more descriptive
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
b4e90cdf51 [#2165] pilorama: Optimize TreeApply when used for synchronization
Because synchronization _most likely_ will have apply already existing
operations, it is much faster to check their presence in a read
transaction. However, always doing this will degrade the perfomance
for normal `Apply`. And, let's be honest, it is already not good.
Thus we add a separate parameter which specifies whether this logic is
enabled.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
21717262ec [#2016] shard: Check meta first on Get
`meta` should prevent returning removed objects (`GCMark` and `TS` relations
are `meta` abstractions).

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
74ec71446f [#2167] shard: Do not use write-cache by default in Head
Both `meta` and `write-cache` are expected to have a fast underlying disk,
so it does not seem like an optimisation. Moreover, `write-cache`'s `Head`
is a `Get` with payload cutting, it _must_ use more memory for no reason
(`meta` was created for such requests). Also, `write-cache` does not allow
performing any "meta" relations checks (such as locking, tombstoning).

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
1608fd1c07 [#2167] write-cache: Add "write-cache" to its logs
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
eea2892109 [#1956] node: Lock shard's mode on its methods switch
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
e1c3bdbfa6 [#1621] pilorama: Remove Timestamp field from nodeInfo
It is already present in `Meta`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
1044adbe94 [#1621] pilorama: Improve memory allocation
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
2539d466a6 [#1621] pilorama: Seek after cursor invalidation
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
e9ba8931f8 [#1621] pilorama: Simplify bucket creation
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
fe7ddfdc6a [#1621] pilorama: Compare memory forests properly
Node children are not sorted and could occur in any order.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
edb1428248 [#2022] Add metric readonly to get shards mode
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-12-30 11:07:35 +03:00
e5c304536b [#2161] pilorama: Do not apply already existing operations
Speeds up synchronization a bit.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
21c58c92a9 [#2145] meta: Do allow force inhuming a locked object
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
0b78af467e [#2140] engine: Fix error handling in TreeMove
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
923f84722a Move to frostfs-node
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-28 15:04:29 +03:00
Evgenii Stratonikov
42554a9298 [#2068] writecache: Remove deleted objects from the writecache
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
4a49ea0855 [#2068] writecache: Allow to open FSTree in read-only mode
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
857d2dc3f5 [#2068] writecache: Optimize initial flush existence checking
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
63f604e948 [#2068] blobstor: Allow to provide storage ID in Exists
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
6ad2b5d5b8 [#2068] blobovnicza: Add Exists method
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Pavel Karpy
3d0768a1d3 [#2061] node: Unify meta.Get benchmarks
Make them get exactly one (different) object per a bench iteration.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-30 18:29:14 +03:00
Pavel Karpy
bc905f169d [#2061] meta: Add parallel bench for Get
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-30 18:29:14 +03:00
Evgenii Stratonikov
7335a52f29 [#1732] pilorama: Improve logical error handling
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-30 16:58:52 +03:00
Evgenii Stratonikov
ae7b473768 [#2064] blobovniczatree: Remove index too big log
There is no need to log about a situation which is expected.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-30 16:53:18 +03:00
Pavel Karpy
ed4351aab0 [#2074] write-cache: Do not flush same object twice
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
dd225906a0 [#2074] write-cache: Remove unused variables
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
b1025bdb42 [#2057] meta: Fail write operations in R/O mode
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
fdeea1dfac [#2057] meta: Fix concurrent mode changes
Includes:
1. mode change read lock operation in every exported method that r/w the
underlying database;
2. returning `ErrDegradedMode` logical error if any exported method is
called in degraded (without a metabase) mode.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
3d6defd3e8 [#2057] meta: Do not lock the whole meta on GET
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
fa231b8c56 [#2057] blobstor: Block operations on a mode change
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
b673d9e472 [#2053] engine: Do not switch mode because of logical errors
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
9a20498f34 [#1940] Removing all trees by container ID if tree ID is empty in pilorama.Forest.TreeDrop
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-11-19 11:01:04 +03:00
Pavel Karpy
634792077e [#1502] node: Store lock object on every container node
Includes extending listing methods in the Storage Engine with object types.
It allows tuning replication/policer algorithms: container nodes do
not remove `LOCK` objects as redundant and try to fulfill `LOCK` placement
on the ohter container nodes.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00