TrueCloudLab/frostfs-node

Author	SHA1	Message	Date
Evgenii Stratonikov	bf1e59bb83	[#2260 ] network/cache: Ignore `context cancelled` errors Timeouts on client side should not affect inter-node communication. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	2567f8020e	[#2260 ] services/object: Do not assemble object with TTL=1 Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	d1d123d180	[#2234 ] writecache: Fix possible panic in `initFlushMarks` In case we have many small objects in the write-cache, `indices` should not be reused between iterations. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	315141dc2c	[#2252 ] fstree: Allow concurrent writes Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	b422ac9f94	[#2164 ] node: Fix multi-client error reporting Missing `ReportError` method did not allow casting multi-client interface to `errorReporter` interface and dropping broken connections. `replicationClient` embeds that interface, and it is widely used across node's code. Embedded interface does not allow casting its parent structure to `errorReporter` and breaks multi client error reporting logic. Multi-client scheme is extremely hard to maintain, it makes unpredictable casts and does not allow tracking code flow, so it will be refactored in the future anyway. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	95ee905861	[#2244 ] node: Fix subscriptions lock Subscribing without async listening could lead to a dead-lock in the `neo-go` client. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	07ec51ea60	[#2244 ] node: Add object address to WC's operations Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	dbbbef9ddb	[#2244 ] node: Update expired storage ID by WC Previously, node could get an "infinite" small object: it could be expired and thus could not be flushed (update its storage ID) to metabase => could not be marked as flushed => node never removes such an object and repeats the whole cycle one more time. If object exists and is not marked with GC (meta returns `ErrObjectIsExpired`, not `ObjectNotFound` and not `ObjectAlreadyRemoved`), its ID is safe to update _in the same_ bbolt transaction. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	6fd88a036f	[#2241 ] metrics: Fix request count metrics names Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	5cb2c5ae62	[#2238 ] engine: Add test for component initialization failures Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	427fe276f2	[#2238 ] shard: Try closing all components Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	c53903ccd0	[#2238 ] engine: Make `Open` and `Init` similar 1. Both could initialize shards in parallel. 2. Both should close shards after an error. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	e0309e398c	[#2239 ] writecache: Fix possible deadlock LRU `Peek`/`Contains` take LRU mutex _inside_ of a `View` transaction. `View` transaction itself takes `mmapLock` [1], which is lifted after tx finishes (in `tx.Commit()` -> `tx.close()` -> `tx.db.removeTx`) When we evict items from LRU cache mutex order is different: first we take LRU mutex and then execute `Batch` which _does_ take `mmapLock` in case we need to remap. Thus the deadlock. [1] `8f4a7e1f92/db.go (L708)` Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	58367e4df6	[#2232 ] pilorama: Merge in-queue batches To achieve high performance we must choose proper values for both batch size and delay. For user operations we want to set low delay. However, it would prevent tree synchronization operations from forming big enough batches. For these operations, batching gives the most benefit not only in terms of on-CPU execution cost, but also by speeding up transaction persist (`fsync`). In this commit we try merging batches that are already _triggered_, but not yet _started to execute_. This way we can still query batches for execution after the provided delay while also allowing multiple formed batches to execute faster. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	40822adb51	[#2213 ] node: Do not return expired object "Object is expired" means that the object is present in `meta`, so it is not an `ObjectNotFound` error. Previous implementation made `shard` search for an object without `meta` which was an error. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	9afe86ba3e	[#2212 ] morph: Fix subscription restoration Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Anton Nikiforov	85cf1f47ac	[#1465 ] node: Prevent process from being killed by systemd when shutting down Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>	2023-02-17 12:13:00 +03:00
Artem Tataurov	362f24953a	[#47 ] shard: Switch container size metric from physical to logical capacity Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>	2023-02-17 12:03:42 +03:00
Pavel Karpy	901d62567d	[#57 ] node: Broadcast link objects It boosts object assembling by an _average_ container node. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-17 11:58:27 +03:00
Evgenii Stratonikov	204cd3a11c	[#31 ] fstree: Optimize `treePath` Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-10 12:49:31 +03:00
Evgenii Stratonikov	dee4498c1e	[#31 ] fstree: Do not check for a file existence twice Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-10 12:49:31 +03:00
Evgenii Stratonikov	abbecf49d6	[#31 ] fstree: Speedup string-to-address conversion ``` name old time/op new time/op delta _addressFromString-8 1.25µs ±30% 1.02µs ± 6% -18.49% (p=0.000 n=9+9) name old alloc/op new alloc/op delta _addressFromString-8 352B ± 0% 256B ± 0% -27.27% (p=0.000 n=9+10) name old allocs/op new allocs/op delta _addressFromString-8 6.00 ± 0% 4.00 ± 0% -33.33% (p=0.000 n=10+10) ``` Also, assure compiler that `s` doesn't escape: Before this commit: ``` ./fstree.go:74:24: leaking param: s ./fstree.go:90:6: moved to heap: addr ``` After this commit: ``` ./fstree.go:74:24: s does not escape ``` Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-10 12:49:31 +03:00
Artem Tataurov	ab21d90cfb	[#1794 ] shard: Add increasing case for the payload size metric Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>	2023-02-09 13:30:23 +03:00
Stanislav Bogatyrev	cb016d53a6	[#1 ] Fix comments and error messages Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>	2023-02-06 17:41:14 +03:00
Stanislav Bogatyrev	c761a95eef	[#1 ] Fix project name in control service Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>	2023-02-06 17:41:14 +03:00
Pavel Karpy	73bc1b0b68	[#38 ] node: Fix linter warnings Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-06 17:27:54 +03:00
Anton Nikiforov	515c60bdf4	[#1889 ] adm: Add command `morph netmap-candidates` Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>	2023-02-06 17:26:34 +03:00
Pavel Karpy	89a0266f5e	[#1794 ] metrics: Track physical object capacity per shard Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-01-26 20:06:28 +03:00
Evgenii Stratonikov	9513f163aa	[#2116 ] metrics: Track physical object capacity in the container Currently we track based on `PayloadSize`, because it is already stored in the metabase and it is easier to calculate without slowing down the whole system. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com> Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-01-26 20:06:28 +03:00
Evgenii Stratonikov	d65a95a2c6	[#28 ] pilorama: Remove `LogMove` struct Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	46c62be7e8	[#28 ] Fix linter issues Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	c72576e72f	[#2208 ] engine: Log time-consuming shard operations Currently the only way to tell whether `evacuate/set-mode` is finished is to set a very big timeout and _hope_ that the operation will finish. In this commit we add INFO logs for such operations which should simplify the life of an administrator. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	87f0e3ea25	[#2208 ] fstree: Rename file after write Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	792319a044	[#2208 ] fstree: Remove file if there was an error during write Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	67c97c6804	[#2210 ] services/tree: Drop messages not in queue Currently, under high load clients are blocked on channel send and the number of goroutines can increase indefinitely. In this commit we drop replication messages if send/recv queue is full and rely on a background synchronization. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	25d5995cef	[#2210 ] pilorama: Allocate bucket name outside of batches 1. Reduce allocations inside transactions. 2. Do not encode container ID to string: it allocates a lot and takes more space. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	165a600624	[#2210 ] pilorama: Reduce the amount of keys per node Under high load we are limited by the _amount_ of keys we need to update in a single transaction. In this commit we try storing all state with a single key. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	64a5294b27	[#2200 ] shard: Do not fetch big objects from blobovniczas Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	91757329ae	[#2200 ] shard: Fix blobstor obj fetching In the previous implementation any non-nil error that preceded object fetching from blobstor led to iterating over every storage (in other words, no storage ID information was taken into account). Now storage ID is skipped only if metabase (storage ID source) returns any error. Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	cf1a91a758	[#2206 ] blobovnicza: Use Latin letters in the code Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	c33ad3c474	[#2164 ] node: Use `reconnect_interval` from config Not always the default one. Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	0d8366f475	[#2207 ] object/acl: Return status error for expired session token Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	6451f019d2	[#2203 ] shard: Do not panic in `Close` after unsuccessful `Init` Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	6efa93be0a	[#1621 ] services/tree: Return `Apply` result asynchronously Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	ac81c70c09	[#1621 ] pilorama: Batch related operations Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru> Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	9009612a82	[#2198 ] blobovniczatree: Properly handle concurrent active blobovnicza update Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	cedbd380f2	[#2197 ] pilorama: Close database in degraded mode Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	1d21b1e3e8	[#1978 ] node: Do not drop clients on split errors After the reconnection interval feature there was a bug related to big object collection: split error is returned from a client directly, not via API status and was considered as a connection error. Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	554b85411f	[#2190 ] services/object: Log service error with INFO level Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	b0ad1b9ed2	[#2193 ] pilorama: Use `do` in `TreeMove` It should be similar to a `TreeAddByPath`. `applyOperation` is used for `Apply` when the operation can be inserted in the middle of a log. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
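
Commit 67c97c6804 in the log above describes dropping replication messages when the send/recv queue is full instead of blocking on a channel send. A minimal Go sketch of that general pattern follows. It is not the code from the commit: `replicationTask` and `tryEnqueue` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// replicationTask is a hypothetical stand-in for a replication message.
type replicationTask struct {
	id int
}

// tryEnqueue (hypothetical helper) performs a non-blocking send.
// It returns false when the queue is full, so the caller never blocks
// and no goroutine piles up waiting on the channel.
func tryEnqueue(queue chan<- replicationTask, t replicationTask) bool {
	select {
	case queue <- t:
		return true
	default:
		// Queue is full: drop the message. A background synchronization
		// process is assumed to repair the gap later.
		return false
	}
}

func main() {
	queue := make(chan replicationTask, 2) // small buffer to force drops
	dropped := 0
	for i := 0; i < 5; i++ {
		if !tryEnqueue(queue, replicationTask{id: i}) {
			dropped++
		}
	}
	fmt.Printf("queued=%d dropped=%d\n", len(queue), dropped)
}
```

The trade-off is that dropped messages are lost to the fast path, so this approach only makes sense when some slower mechanism, such as the background synchronization the commit message mentions, eventually delivers the same state.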


2262 commits