Commit graph

780 commits

Author SHA1 Message Date
Pavel Karpy
f1f3c80dbf [#32] node: Init write-cache asynchronously
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-03-09 11:07:33 +00:00
Pavel Karpy
381e363a8b [#32] node: Always close general components after testing
It will prevent test fails with `-race` flag on components that have
background processes and make some actions on test framework.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-03-09 11:07:33 +00:00
20de74a505 Rename package name
Due to source code relocation from GitHub.

Signed-off-by: Alex Vanin <a.vanin@yadro.com>
2023-03-07 16:38:26 +03:00
e9f3c24229 [#65] Use strings.Cut instead of strings.Split* where possible
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-02-28 13:39:14 +03:00
6925fb4c59 [TrueCloudLab/hrw#2] node: Use typed HRW methods
Update HRW lib and use typed HRW methods to sort shards and nodes

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-02-28 13:36:25 +03:00
c3a7039801 [TrueCloudLab/hrw#2] node: Optimize shard hash
Compute shard hash only once

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2023-02-28 13:36:25 +03:00
cb5468abb8 [#66] node: Replace interface{} with any
Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>
2023-02-21 16:47:07 +03:00
Pavel Karpy
337049b2ce [#56] node: Allow reading expired locked object
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-21 09:56:57 +03:00
Pavel Karpy
3beef10f89 [#61] node: Do not fetch missing objects
If an object is missing in a `meta`, shard should not look for it in
a `blobstor`.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 14:47:38 +03:00
d1d123d180 [#2234] writecache: Fix possible panic in initFlushMarks
In case we have many small objects in the write-cache, `indices` should
not be reused between iterations.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
315141dc2c [#2252] fstree: Allow concurrent writes
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
Pavel Karpy
07ec51ea60 [#2244] node: Add object address to WC's operations
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 13:53:27 +03:00
Pavel Karpy
dbbbef9ddb [#2244] node: Update expired storage ID by WC
Previously, node could get an "infinite" small object: it could be expired
and thus could not be flushed (update its storage ID) to metabase => could
not be marked as flushed => node never removes such object and repeat all
the cycle one more time. If object exists and is not marked with GC (meta
returns `ErrObjectIsExpired`, not `ObjectNotFound` and not
`ObjectAlreadyRemoved`), its ID is safe to update _in the same_ bbolt
transaction.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 13:53:27 +03:00
5cb2c5ae62 [#2238] engine: Add test for component initialization failures
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
427fe276f2 [#2238] shard: Try closing all components
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
c53903ccd0 [#2238] engine: Make Open and Init similar
1. Both could initialize shards in parallel.
2. Both should close shards after an error.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
e0309e398c [#2239] writecache: Fix possible deadlock
LRU `Peek`/`Contains` take LRU mutex _inside_ of a `View` transaction.
`View` transaction itself takes `mmapLock` [1], which is lifted after tx
finishes (in `tx.Commit()` -> `tx.close()` -> `tx.db.removeTx`)

When we evict items from LRU cache mutex order is different:
first we take LRU mutex and then execute `Batch` which _does_ take
`mmapLock` in case we need to remap. Thus the deadlock.

[1] 8f4a7e1f92/db.go (L708)

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
58367e4df6 [#2232] pilorama: Merge in-queue batches
To achieve high performance we must choose proper values for both
batch size and delay. For user operations we want to set low delay.
However it would prevent tree synchronization operations to form big
enough batches. For these operations, batching gives the most benefit
not only in terms of on-CPU execution cost, but also by speeding up
transaction persist (`fsync`).
In this commit we try merging batches that are already
_triggered_, but not yet _started to execute_. This way we can still
query batches for execution after the provided delay while also allowing
multiple formed batches to execute faster.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
Pavel Karpy
40822adb51 [#2213] node: Do not return object expired object
"Object is expired" means that object is presented in `meta` but it is not
`ObjectNotFound` error. Previous implementation made `shard` search for an
object without `meta` which was an error.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 13:53:27 +03:00
362f24953a [#47] shard: Switch container size metric from physical to logical capacity
Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>
2023-02-17 12:03:42 +03:00
204cd3a11c [#31] fstree: Optimize treePath
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-10 12:49:31 +03:00
dee4498c1e [#31] fstree: Do not check for a file existence twice
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-10 12:49:31 +03:00
abbecf49d6 [#31] fstree: Speedup string-to-address conversion
```
name                  old time/op    new time/op    delta
_addressFromString-8    1.25µs ±30%    1.02µs ± 6%  -18.49%  (p=0.000 n=9+9)

name                  old alloc/op   new alloc/op   delta
_addressFromString-8      352B ± 0%      256B ± 0%  -27.27%  (p=0.000 n=9+10)

name                  old allocs/op  new allocs/op  delta
_addressFromString-8      6.00 ± 0%      4.00 ± 0%  -33.33%  (p=0.000
n=10+10)
```

Also, assure compiler that `s` doesn't escape:
Before this commit:
```
./fstree.go:74:24: leaking param: s
./fstree.go:90:6: moved to heap: addr
```

After this commit:
```
./fstree.go:74:24: s does not escape
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-10 12:49:31 +03:00
ab21d90cfb [#1794] shard: Add increasing case for the payload size metric
Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>
2023-02-09 13:30:23 +03:00
cb016d53a6 [#1] Fix comments and error messages
Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>
2023-02-06 17:41:14 +03:00
Pavel Karpy
73bc1b0b68 [#38] node: Fix linter warnings
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-06 17:27:54 +03:00
Pavel Karpy
89a0266f5e [#1794] metrics: Track physical object capacity per shard
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-01-26 20:06:28 +03:00
Evgenii Stratonikov
9513f163aa [#2116] metrics: Track physical object capacity in the container
Currently we track based on `PayloadSize`, because it is already stored
in the metabase and it is easier to calculate without slowing down the
whole system.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-01-26 20:06:28 +03:00
d65a95a2c6 [#28] pilorama: Remove LogMove struct
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
c72576e72f [#2208] engine: Log time-consuming shard operations
Currently the only way to tell whether `evacuate/set-mode` is finished
is to set a very big timeout and _hope_ that the operation will finish.
In this commit we add INFO logs for such operations which should
simplify the life of an administrator.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
87f0e3ea25 [#2208] fstree: Rename file after write
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
792319a044 [#2208] fstree: Remove file if there was an error during write
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
25d5995cef [#2210] pilorama: Allocate bucket name outside of batches
1. Reduce allocations inside transactions.
2. Do not encode container ID to string: it allocates a lot and takes more
space.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
165a600624 [#2210] pilorama: Reduce the amount of keys per node
Under high load we are limited by the _amount_ of keys we need to update
in a single transaction. In this commit we try storing all state
with a single key.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
64a5294b27 [#2200] shard: Do not fetch big objects from blobovniczas
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
91757329ae [#2200] shard: Fix blobstor obj fetching
In the previous implementation any non-nil error that preceded object
fetching from blobstor led to iterating over every storage (in other words,
no storage ID information was taken into account). Now storage ID is
skipped only if metabase (storage ID source) returns any error.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
cf1a91a758 [#2206] blobovnicza: Use Latin letters in the code
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
6451f019d2 [#2203] shard: Do not panic in Close after unsuccessful Init
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov
ac81c70c09 [#1621] pilorama: Batch related operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
9009612a82 [#2198] blobovniczatree: Properly handle concurrent active blobovnicza update
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
cedbd380f2 [#2197] pilorama: Close database in degraded mode
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
b0ad1b9ed2 [#2193] pilorama: Use do in TreeMove
It should be similar to a `TreeAddByPath`. `applyOperation` is used for
`Apply` when the operation can be inserted in the middle of a log.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
ba393e3e91 [#2188] engine: Fix panic during setting shard mode
Under load changing shard mode can lead to it being removed from the
list during some other PUT.
```
Dec 28 07:01:26 az neofs-node[364505]: panic: runtime error: invalid memory address or nil pointer dereference
Dec 28 07:01:26 az neofs-node[364505]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0xc9fbb1]
Dec 28 07:01:26 az neofs-node[364505]: goroutine 11791912 [running]:
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).putToShard(0xc000435490, {0xc0003f7a28?, 0xc0001192c0?}, 0x2, {0x0, 0x>
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:91 +0x1b1
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).put.func1(0xc000435490?, {0xc0003f7a28?, 0xc0001192c0?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:71 +0x19c
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).iterateOverSortedShards(0x1?, {{0x62, 0x23, 0xfe, 0x60, 0x67, 0xd5, 0x>
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/shards.go:225 +0xc8
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).put(0xc000435490, {0x1?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:66 +0x2a9
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).Put.func1()
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:43 +0x2a
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).execIfNotBlocked(0x8?, 0x38?)
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/control.go:147 +0xcf
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).Put(0xc4df775a80?, {0x0?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:42 +0x65
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.Put(0xc06d928b80?, 0xc06b1b8dc8?)
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:158 +0x19
Dec 28 07:01:26 az neofs-node[364505]: main.engineWithoutNotifications.Put({0x20301b?}, 0x20301b?)
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
3d57f4c961 [#2179] test: Fix test TestEvacuateNetwork/multiple_shards,_evacuate_many
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-01-24 13:37:49 +03:00
9936b112b8 [#5] blobstor: Use generic LRU cache
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
4155c1bdff [#5] writecache: Use generic LRU cache
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-31 23:04:06 +03:00
0272218eb9 [#2184] compression: Properly calculate upper bound
If the data is not compressible allocating `len(data)` will lead to a
slice reallocation. For a compressible data the results for small size
are flaky and we allocate a bit more. However, it feels right to use a
provided function if we need to pick any size at all.

```
name                                                           old time/op    new time/op    delta
Compression/size=128/zeroed_slice-8                              2.23µs ±12%    2.06µs ± 6%   -7.35%  (p=0.009 n=10+10)
Compression/size=128/not_so_random_slice_(block_=_123)-8         19.0µs ±10%    15.8µs ±16%  -17.09%  (p=0.000 n=9+10)
Compression/size=128/random_slice-8                              17.6µs ±15%    16.1µs ±16%     ~     (p=0.075 n=10+10)
Compression/size=1024/zeroed_slice-8                             3.05µs ±11%    2.84µs ±10%     ~     (p=0.089 n=10+10)
Compression/size=1024/not_so_random_slice_(block_=_123)-8        18.1µs ± 6%    18.2µs ±12%     ~     (p=0.971 n=10+10)
Compression/size=1024/random_slice-8                             48.6µs ± 6%    45.6µs ± 5%   -6.07%  (p=0.006 n=10+9)
Compression/size=32768/zeroed_slice-8                            26.8µs ± 3%    28.7µs ± 8%   +7.23%  (p=0.001 n=10+10)
Compression/size=32768/not_so_random_slice_(block_=_123)-8       44.3µs ± 8%    43.7µs ±13%     ~     (p=0.762 n=8+10)
Compression/size=32768/random_slice-8                            97.3µs ±32%    68.9µs ±15%  -29.13%  (p=0.000 n=10+10)
Compression/size=33554432/zeroed_slice-8                         29.8ms ± 9%    30.3ms ±17%     ~     (p=1.000 n=9+9)
Compression/size=33554432/not_so_random_slice_(block_=_123)-8    33.1ms ±14%    30.3ms ±11%   -8.61%  (p=0.043 n=10+10)
Compression/size=33554432/random_slice-8                         41.7ms ± 3%    30.1ms ± 8%  -27.72%  (p=0.000 n=9+10)

name                                                           old alloc/op   new alloc/op   delta
Compression/size=128/zeroed_slice-8                                128B ± 0%      144B ± 0%  +12.50%  (p=0.000 n=10+10)
Compression/size=128/not_so_random_slice_(block_=_123)-8           384B ± 0%      144B ± 0%  -62.50%  (p=0.000 n=10+10)
Compression/size=128/random_slice-8                                384B ± 0%      144B ± 0%  -62.50%  (p=0.000 n=10+10)
Compression/size=1024/zeroed_slice-8                             1.02kB ± 0%    1.15kB ± 0%  +12.50%  (p=0.000 n=10+10)
Compression/size=1024/not_so_random_slice_(block_=_123)-8        1.02kB ± 0%    1.15kB ± 0%  +12.50%  (p=0.000 n=10+10)
Compression/size=1024/random_slice-8                             2.56kB ± 0%    1.15kB ± 0%  -55.00%  (p=0.000 n=10+10)
Compression/size=32768/zeroed_slice-8                            32.8kB ± 0%    41.0kB ± 0%  +25.00%  (p=0.000 n=10+10)
Compression/size=32768/not_so_random_slice_(block_=_123)-8       32.8kB ± 0%    41.0kB ± 0%  +25.00%  (p=0.000 n=10+10)
Compression/size=32768/random_slice-8                            81.9kB ± 0%    41.0kB ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=33554432/zeroed_slice-8                         33.6MB ± 0%    33.6MB ± 0%   +0.02%  (p=0.000 n=9+9)
Compression/size=33554432/not_so_random_slice_(block_=_123)-8    33.6MB ± 0%    33.6MB ± 0%   +0.02%  (p=0.000 n=8+10)
Compression/size=33554432/random_slice-8                         75.5MB ± 0%    33.6MB ± 0%  -55.55%  (p=0.000 n=10+10)

name                                                           old allocs/op  new allocs/op  delta
Compression/size=128/zeroed_slice-8                                1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=128/not_so_random_slice_(block_=_123)-8           2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=128/random_slice-8                                2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=1024/zeroed_slice-8                               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=1024/not_so_random_slice_(block_=_123)-8          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=1024/random_slice-8                               2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=32768/zeroed_slice-8                              1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=32768/not_so_random_slice_(block_=_123)-8         1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=32768/random_slice-8                              2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
Compression/size=33554432/zeroed_slice-8                           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=33554432/not_so_random_slice_(block_=_123)-8      1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Compression/size=33554432/random_slice-8                           2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
0ace28e43d [#2175] blobovniczatree: Close all non-active blobovniczas
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
c1cf418956 [#2175] blobovniczatree: Make function parameters more descriptive
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
b4e90cdf51 [#2165] pilorama: Optimize TreeApply when used for synchronization
Because synchronization _most likely_ will have apply already existing
operations, it is much faster to check their presence in a read
transaction. However, always doing this will degrade the perfomance
for normal `Apply`. And, let's be honest, it is already not good.
Thus we add a separate parameter which specifies whether this logic is
enabled.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
21717262ec [#2016] shard: Check meta first on Get
`meta` should prevent returning removed objects (`GCMark` and `TS` relations
are `meta` abstractions).

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
74ec71446f [#2167] shard: Do not use write-cache by default in Head
Both `meta` and `write-cache` are expected to have a fast underlying disk,
so it does not seem like an optimisation. Moreover, `write-cache`'s `Head`
is a `Get` with payload cutting, it _must_ use more memory for no reason
(`meta` was created for such requests). Also, `write-cache` does not allow
performing any "meta" relations checks (such as locking, tombstoning).

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
1608fd1c07 [#2167] write-cache: Add "write-cache" to its logs
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
eea2892109 [#1956] node: Lock shard's mode on its methods switch
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
e1c3bdbfa6 [#1621] pilorama: Remove Timestamp field from nodeInfo
It is already present in `Meta`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
1044adbe94 [#1621] pilorama: Improve memory allocation
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
2539d466a6 [#1621] pilorama: Seek after cursor invalidation
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
e9ba8931f8 [#1621] pilorama: Simplify bucket creation
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
Evgenii Stratonikov
fe7ddfdc6a [#1621] pilorama: Compare memory forests properly
Node children are not sorted and could occur in any order.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-30 11:07:35 +03:00
edb1428248 [#2022] Add metric readonly to get shards mode
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-12-30 11:07:35 +03:00
e5c304536b [#2161] pilorama: Do not apply already existing operations
Speeds up synchronization a bit.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
21c58c92a9 [#2145] meta: Do allow force inhuming a locked object
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-30 11:07:35 +03:00
0b78af467e [#2140] engine: Fix error handling in TreeMove
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2022-12-30 11:07:35 +03:00
Pavel Karpy
923f84722a Move to frostfs-node
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2022-12-28 15:04:29 +03:00
Evgenii Stratonikov
42554a9298 [#2068] writecache: Remove deleted objects from the writecache
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
4a49ea0855 [#2068] writecache: Allow to open FSTree in read-only mode
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
857d2dc3f5 [#2068] writecache: Optimize initial flush existence checking
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
63f604e948 [#2068] blobstor: Allow to provide storage ID in Exists
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Evgenii Stratonikov
6ad2b5d5b8 [#2068] blobovnicza: Add Exists method
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-12-02 11:52:05 +03:00
Pavel Karpy
3d0768a1d3 [#2061] node: Unify meta.Get benchmarks
Make them get exactly one (different) object per a bench iteration.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-30 18:29:14 +03:00
Pavel Karpy
bc905f169d [#2061] meta: Add parallel bench for Get
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-30 18:29:14 +03:00
Evgenii Stratonikov
7335a52f29 [#1732] pilorama: Improve logical error handling
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-30 16:58:52 +03:00
Evgenii Stratonikov
ae7b473768 [#2064] blobovniczatree: Remove index too big log
There is no need to log about a situation which is expected.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-30 16:53:18 +03:00
Pavel Karpy
ed4351aab0 [#2074] write-cache: Do not flush same object twice
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
dd225906a0 [#2074] write-cache: Remove unused variables
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
b1025bdb42 [#2057] meta: Fail write operations in R/O mode
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
fdeea1dfac [#2057] meta: Fix concurrent mode changes
Includes:
1. mode change read lock operation in every exported method that r/w the
underlying database;
2. returning `ErrDegradedMode` logical error if any exported method is
called in degraded (without a metabase) mode.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
3d6defd3e8 [#2057] meta: Do not lock the whole meta on GET
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
fa231b8c56 [#2057] blobstor: Block operations on a mode change
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
b673d9e472 [#2053] engine: Do not switch mode because of logical errors
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
9a20498f34 [#1940] Removing all trees by container ID if tree ID is empty in pilorama.Forest.TreeDrop
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2022-11-19 11:01:04 +03:00
Pavel Karpy
634792077e [#1502] node: Store lock object on every container node
Includes extending listing methods in the Storage Engine with object types.
It allows tuning replication/policer algorithms: container nodes do
not remove `LOCK` objects as redundant and try to fulfill `LOCK` placement
on the ohter container nodes.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
3b61cb4f49 [#1502] engine: Check all shards for LOCK'ing before inhuming
It allows keeping all the locked objects safe after metabase
resynchronization. Currently, all `LOCK` objects are broadcast to all nodes
in a container, it guarantees `LOCK` object presence in a regular situation.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
34e8d2ba56 [#1502] shard: Add IsLocked method
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
9a039ba582 [#1502] meta: Add IsLocked method
It gets an object and returns its locking status.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
d65604ad30 [#1985] blobstor: Allow to report multiple errors to caller
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
f2d7e65e39 [#2035] engine: Allow moving to degraded from background workers
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
b0e94b6a6b [#1906] writecache: Do not require read-only mode in Flush
It was needed before we started to flush during transition to
`degraded` mode. Now it is confusing.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Pavel Karpy
2849e465f9 [#1699] meta: Do not return SplitInfoError on Delete
It is not an error: removing virtual object is expected and should be just
skipped. Getting a virtual object with `raw` flag is considered as an
impossible action, all the virtual objects removals will be handled via
their children's removals implicitly.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
a3e7365cbd [#1732] pilorama: Fill parent mark correctly
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
134f2ba02e [#1732] pilorama: Fix backwards log insertion
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
2ef38cfbc4 [#1996] engine: Ignore pilorama.ErrTreeNotFound for write operations
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-19 11:01:04 +03:00
Evgenii Stratonikov
d8d3588e1b [#1996] engine: Always select proper shard for a tree
Currently there is a possibility for modifying operations to fail
because of I/O errors and a new tree to be created on another shard.
This commit adds existence check for modifying operations.
Read operations remain as they are, not to slow things.
`TreeDrop` is an exception, because this is a tree removal and trying
multiple shards is not an unwanted behaviour.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-03 15:29:23 +03:00
Evgenii Stratonikov
777fd32d4f [#1818] writecache: Increase error counter on background errors
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-02 14:24:02 +03:00
Evgenii Stratonikov
bffb0f894c [#1818] writecache: Update storage ID during flush
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-02 14:24:02 +03:00
Evgenii Stratonikov
5cf75404dc [#1818] metabase: Add UpdateStorageID operation
By default writecache puts the whole object to update storage ID.
This logic comes from the times when we needed to put objects
in the metabase by the writecache itself. Now this is done by the
blobstor at unmarshaling objects during flush only to update storage ID
is an overkill.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-02 14:24:02 +03:00
Evgenii Stratonikov
b64b14eb54 [#1818] writecache: Reuse FSTree flushing code between flushes
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-02 14:24:02 +03:00
Evgenii Stratonikov
a56927e3d4 [#1818] writecache: Remove unused variable
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-02 14:24:02 +03:00
Evgenii Stratonikov
98a152256b [#1992] writecache: Allow to open in NOSYNC mode
Applicable only to FSTree as we cannot handle corrupted databases
properly yet.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-01 09:42:26 +03:00
Evgenii Stratonikov
f564430b90 [#1992] fstree: Allow working in SYNC mode
Make O_SYNC the default and allow to opt-out explicitly.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-11-01 09:42:26 +03:00