Commit graph

2257 commits

Author SHA1 Message Date
Pavel Karpy
95ee905861 [#2244] node: Fix subscriptions lock
Subscribing without async listening could lead to a dead-lock in the
`neo-go` client.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 13:53:27 +03:00
Pavel Karpy
07ec51ea60 [#2244] node: Add object address to WC's operations
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 13:53:27 +03:00
Pavel Karpy
dbbbef9ddb [#2244] node: Update expired storage ID by WC
Previously, node could get an "infinite" small object: it could be expired
and thus could not be flushed (update its storage ID) to metabase => could
not be marked as flushed => node never removes such object and repeat all
the cycle one more time. If object exists and is not marked with GC (meta
returns `ErrObjectIsExpired`, not `ObjectNotFound` and not
`ObjectAlreadyRemoved`), its ID is safe to update _in the same_ bbolt
transaction.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 13:53:27 +03:00
6fd88a036f [#2241] metrics: Fix request count metrics names
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
5cb2c5ae62 [#2238] engine: Add test for component initialization failures
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
427fe276f2 [#2238] shard: Try closing all components
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
c53903ccd0 [#2238] engine: Make Open and Init similar
1. Both could initialize shards in parallel.
2. Both should close shards after an error.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
e0309e398c [#2239] writecache: Fix possible deadlock
LRU `Peek`/`Contains` take LRU mutex _inside_ of a `View` transaction.
`View` transaction itself takes `mmapLock` [1], which is lifted after tx
finishes (in `tx.Commit()` -> `tx.close()` -> `tx.db.removeTx`)

When we evict items from LRU cache mutex order is different:
first we take LRU mutex and then execute `Batch` which _does_ take
`mmapLock` in case we need to remap. Thus the deadlock.

[1] 8f4a7e1f92/db.go (L708)

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
58367e4df6 [#2232] pilorama: Merge in-queue batches
To achieve high performance we must choose proper values for both
batch size and delay. For user operations we want to set low delay.
However it would prevent tree synchronization operations to form big
enough batches. For these operations, batching gives the most benefit
not only in terms of on-CPU execution cost, but also by speeding up
transaction persist (`fsync`).
In this commit we try merging batches that are already
_triggered_, but not yet _started to execute_. This way we can still
query batches for execution after the provided delay while also allowing
multiple formed batches to execute faster.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
Pavel Karpy
40822adb51 [#2213] node: Do not return object expired object
"Object is expired" means that object is presented in `meta` but it is not
`ObjectNotFound` error. Previous implementation made `shard` search for an
object without `meta` which was an error.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-20 13:53:27 +03:00
9afe86ba3e [#2212] morph: Fix subscription restoration
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-20 13:53:27 +03:00
85cf1f47ac [#1465] node: Prevent process from killing by systemd when shutting down
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-02-17 12:13:00 +03:00
362f24953a [#47] shard: Switch container size metric from physical to logical capacity
Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>
2023-02-17 12:03:42 +03:00
Pavel Karpy
901d62567d [#57] node: Broadcast link objects
It boosts object assembling by an _average_ container node.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-17 11:58:27 +03:00
204cd3a11c [#31] fstree: Optimize treePath
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-10 12:49:31 +03:00
dee4498c1e [#31] fstree: Do not check for a file existence twice
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-10 12:49:31 +03:00
abbecf49d6 [#31] fstree: Speedup string-to-address conversion
```
name                  old time/op    new time/op    delta
_addressFromString-8    1.25µs ±30%    1.02µs ± 6%  -18.49%  (p=0.000 n=9+9)

name                  old alloc/op   new alloc/op   delta
_addressFromString-8      352B ± 0%      256B ± 0%  -27.27%  (p=0.000 n=9+10)

name                  old allocs/op  new allocs/op  delta
_addressFromString-8      6.00 ± 0%      4.00 ± 0%  -33.33%  (p=0.000
n=10+10)
```

Also, assure compiler that `s` doesn't escape:
Before this commit:
```
./fstree.go:74:24: leaking param: s
./fstree.go:90:6: moved to heap: addr
```

After this commit:
```
./fstree.go:74:24: s does not escape
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-02-10 12:49:31 +03:00
ab21d90cfb [#1794] shard: Add increasing case for the payload size metric
Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>
2023-02-09 13:30:23 +03:00
cb016d53a6 [#1] Fix comments and error messages
Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>
2023-02-06 17:41:14 +03:00
c761a95eef [#1] Fix project name in control service
Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>
2023-02-06 17:41:14 +03:00
Pavel Karpy
73bc1b0b68 [#38] node: Fix linter warnings
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-02-06 17:27:54 +03:00
515c60bdf4 [#1889] adm: Add command morph netmap-candidates
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-02-06 17:26:34 +03:00
Pavel Karpy
89a0266f5e [#1794] metrics: Track physical object capacity per shard
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-01-26 20:06:28 +03:00
Evgenii Stratonikov
9513f163aa [#2116] metrics: Track physical object capacity in the container
Currently we track based on `PayloadSize`, because it is already stored
in the metabase and it is easier to calculate without slowing down the
whole system.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
2023-01-26 20:06:28 +03:00
d65a95a2c6 [#28] pilorama: Remove LogMove struct
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
46c62be7e8 [#28] Fix linter issues
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
c72576e72f [#2208] engine: Log time-consuming shard operations
Currently the only way to tell whether `evacuate/set-mode` is finished
is to set a very big timeout and _hope_ that the operation will finish.
In this commit we add INFO logs for such operations which should
simplify the life of an administrator.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
87f0e3ea25 [#2208] fstree: Rename file after write
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
792319a044 [#2208] fstree: Remove file if there was an error during write
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
67c97c6804 [#2210] services/tree: Drop messages not in queue
Currently, under high load clients are blocked on channel send
and the number of goroutines can increase indefinitely.
In this commit we drop replication messages if send/recv queue is full
and rely on a background synchronization.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
25d5995cef [#2210] pilorama: Allocate bucket name outside of batches
1. Reduce allocations inside transactions.
2. Do not encode container ID to string: it allocates a lot and takes more
space.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
165a600624 [#2210] pilorama: Reduce the amount of keys per node
Under high load we are limited by the _amount_ of keys we need to update
in a single transaction. In this commit we try storing all state
with a single key.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
64a5294b27 [#2200] shard: Do not fetch big objects from blobovniczas
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
91757329ae [#2200] shard: Fix blobstor obj fetching
In the previous implementation any non-nil error that preceded object
fetching from blobstor led to iterating over every storage (in other words,
no storage ID information was taken into account). Now storage ID is
skipped only if metabase (storage ID source) returns any error.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
cf1a91a758 [#2206] blobovnicza: Use Latin letters in the code
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
c33ad3c474 [#2164] node: Use reconnect_interval from config
Not always the default one.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
0d8366f475 [#2207] object/acl: Return status error for expired session token
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
6451f019d2 [#2203] shard: Do not panic in Close after unsuccessful Init
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov
6efa93be0a [#1621] services/tree: Return Apply result asyncronously
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Evgenii Stratonikov
ac81c70c09 [#1621] pilorama: Batch related operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
9009612a82 [#2198] blobovniczatree: Properly handle concurrent active blobovnicza update
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
cedbd380f2 [#2197] pilorama: Close database in degraded mode
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
Pavel Karpy
1d21b1e3e8 [#1978] node: Do not drop clients on split errors
After the reconnection interval feature there was an bug related to the big
objects collecting: split error is returned from a client directly, not
via API status and was considered as a connection error.

Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
554b85411f [#2190] services/object: Log service error with INFO level
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
b0ad1b9ed2 [#2193] pilorama: Use do in TreeMove
It should be similar to a `TreeAddByPath`. `applyOperation` is used for
`Apply` when the operation can be inserted in the middle of a log.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
ba393e3e91 [#2188] engine: Fix panic during setting shard mode
Under load changing shard mode can lead to it being removed from the
list during some other PUT.
```
Dec 28 07:01:26 az neofs-node[364505]: panic: runtime error: invalid memory address or nil pointer dereference
Dec 28 07:01:26 az neofs-node[364505]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0xc9fbb1]
Dec 28 07:01:26 az neofs-node[364505]: goroutine 11791912 [running]:
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).putToShard(0xc000435490, {0xc0003f7a28?, 0xc0001192c0?}, 0x2, {0x0, 0x>
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:91 +0x1b1
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).put.func1(0xc000435490?, {0xc0003f7a28?, 0xc0001192c0?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:71 +0x19c
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).iterateOverSortedShards(0x1?, {{0x62, 0x23, 0xfe, 0x60, 0x67, 0xd5, 0x>
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/shards.go:225 +0xc8
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).put(0xc000435490, {0x1?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:66 +0x2a9
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).Put.func1()
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:43 +0x2a
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).execIfNotBlocked(0x8?, 0x38?)
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/control.go:147 +0xcf
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.(*StorageEngine).Put(0xc4df775a80?, {0x0?})
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:42 +0x65
Dec 28 07:01:26 az neofs-node[364505]: github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine.Put(0xc06d928b80?, 0xc06b1b8dc8?)
Dec 28 07:01:26 az neofs-node[364505]:         github.com/nspcc-dev/neofs-node/pkg/local_object_storage/engine/put.go:158 +0x19
Dec 28 07:01:26 az neofs-node[364505]: main.engineWithoutNotifications.Put({0x20301b?}, 0x20301b?)
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-25 15:31:47 +03:00
3d57f4c961 [#2179] test: Fix test TestEvacuateNetwork/multiple_shards,_evacuate_many
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2023-01-24 13:37:49 +03:00
3d1d2ee7b1 [#11] Regenerate proto files
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-12 08:55:47 +03:00
054bc4a727 [#11] Rename contract-related NeoFS occurences
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-12 08:55:47 +03:00
9cb4b4cc17 [#11] Rename neofsid contract to frostfsid
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-01-12 08:55:47 +03:00