aarifullin/frostfs-node

Author	SHA1	Message	Date
Anton Nikiforov	e9f3c24229	[#65 ] Use `strings.Cut` instead of `strings.Split*` where possible Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>	2023-02-28 13:39:14 +03:00
Dmitrii Stepanov	6925fb4c59	[TrueCloudLab/hrw#2 ] node: Use typed HRW methods Update HRW lib and use typed HRW methods to sort shards and nodes Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>	2023-02-28 13:36:25 +03:00
Dmitrii Stepanov	c3a7039801	[TrueCloudLab/hrw#2 ] node: Optimize shard hash Compute shard hash only once Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>	2023-02-28 13:36:25 +03:00
Alejandro Lopez	73bb590cb1	[#64 ] node: Use pool_size_local and separate pool for local puts Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>	2023-02-22 13:43:19 +03:00
Alejandro Lopez	cb5468abb8	[#66 ] node: Replace interface{} with any Signed-off-by: Alejandro Lopez <a.lopez@yadro.com>	2023-02-21 16:47:07 +03:00
Denis Kirillov	87e69b9349	[#44 ] node: Support multiple configs Signed-off-by: Denis Kirillov <d.kirillov@yadro.com>	2023-02-21 10:00:28 +03:00
Pavel Karpy	337049b2ce	[#56 ] node: Allow reading expired locked object Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-21 09:56:57 +03:00
Pavel Karpy	3beef10f89	[#61 ] node: Do not fetch missing objects If an object is missing in a `meta`, shard should not look for it in a `blobstor`. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 14:47:38 +03:00
Anton Nikiforov	22f3c7d080	[#1868 ] Reload config for pprof and metrics on SIGHUP Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	0b61a3c961	[#2260 ] network/cache: Ignore clients only on `Dial` errors The problem is that accidental timeout errors can make us ignore other nodes for some time. The primary purpose of the whole ignore mechanism is not to degrade in case of failover. For this case, closing the connection and limiting the number of dials is enough. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	bf1e59bb83	[#2260 ] network/cache: Ignore `context cancelled` errors Timeouts on client side should not affect inter-node communication. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	2567f8020e	[#2260 ] services/object: Do not assemble object with TTL=1 Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	d1d123d180	[#2234 ] writecache: Fix possible panic in `initFlushMarks` In case we have many small objects in the write-cache, `indices` should not be reused between iterations. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	315141dc2c	[#2252 ] fstree: Allow concurrent writes Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	b422ac9f94	[#2164 ] node: Fix multi-client error reporting Missing `ReportError` method did not allow casting the multi-client interface to the `errorReporter` interface and dropping broken connections. `replicationClient` embeds that interface, and it is widely used across node's code. Embedded interface does not allow casting its parent structure to `errorReporter` and breaks multi-client error reporting logic. Multi-client scheme is extremely hard to maintain, it makes unpredictable casts and does not allow tracking code flow, so it will be refactored in the future anyway. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	95ee905861	[#2244 ] node: Fix subscriptions lock Subscribing without async listening could lead to a dead-lock in the `neo-go` client. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	07ec51ea60	[#2244 ] node: Add object address to WC's operations Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	dbbbef9ddb	[#2244 ] node: Update expired storage ID by WC Previously, node could get an "infinite" small object: it could be expired and thus could not be flushed (update its storage ID) to metabase => could not be marked as flushed => node never removes such an object and repeats the whole cycle one more time. If object exists and is not marked with GC (meta returns `ErrObjectIsExpired`, not `ObjectNotFound` and not `ObjectAlreadyRemoved`), its ID is safe to update _in the same_ bbolt transaction. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	6fd88a036f	[#2241 ] metrics: Fix request count metrics names Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	5cb2c5ae62	[#2238 ] engine: Add test for component initialization failures Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	427fe276f2	[#2238 ] shard: Try closing all components Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	c53903ccd0	[#2238 ] engine: Make `Open` and `Init` similar 1. Both could initialize shards in parallel. 2. Both should close shards after an error. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	e0309e398c	[#2239 ] writecache: Fix possible deadlock LRU `Peek`/`Contains` take LRU mutex _inside_ of a `View` transaction. `View` transaction itself takes `mmapLock` [1], which is lifted after tx finishes (in `tx.Commit()` -> `tx.close()` -> `tx.db.removeTx`) When we evict items from LRU cache mutex order is different: first we take LRU mutex and then execute `Batch` which _does_ take `mmapLock` in case we need to remap. Thus the deadlock. [1] `8f4a7e1f92/db.go (L708)` Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	58367e4df6	[#2232 ] pilorama: Merge in-queue batches To achieve high performance we must choose proper values for both batch size and delay. For user operations we want to set low delay. However, it would prevent tree synchronization operations from forming big enough batches. For these operations, batching gives the most benefit not only in terms of on-CPU execution cost, but also by speeding up transaction persist (`fsync`). In this commit we try merging batches that are already _triggered_, but not yet _started to execute_. This way we can still query batches for execution after the provided delay while also allowing multiple formed batches to execute faster. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	40822adb51	[#2213 ] node: Do not return object expired object "Object is expired" means that the object is present in `meta`, but it is not an `ObjectNotFound` error. Previous implementation made `shard` search for an object without `meta`, which was an error. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-20 13:53:27 +03:00
Evgenii Stratonikov	9afe86ba3e	[#2212 ] morph: Fix subscription restoration Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Anton Nikiforov	85cf1f47ac	[#1465 ] node: Prevent process from killing by systemd when shutting down Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>	2023-02-17 12:13:00 +03:00
Artem Tataurov	362f24953a	[#47 ] shard: Switch container size metric from physical to logical capacity Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>	2023-02-17 12:03:42 +03:00
Pavel Karpy	901d62567d	[#57 ] node: Broadcast link objects It boosts object assembling by an _average_ container node. Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-17 11:58:27 +03:00
Evgenii Stratonikov	204cd3a11c	[#31 ] fstree: Optimize `treePath` Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-10 12:49:31 +03:00
Evgenii Stratonikov	dee4498c1e	[#31 ] fstree: Do not check for a file existence twice Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-10 12:49:31 +03:00
Evgenii Stratonikov	abbecf49d6	[#31 ] fstree: Speedup string-to-address conversion ``` name old time/op new time/op delta _addressFromString-8 1.25µs ±30% 1.02µs ± 6% -18.49% (p=0.000 n=9+9) name old alloc/op new alloc/op delta _addressFromString-8 352B ± 0% 256B ± 0% -27.27% (p=0.000 n=9+10) name old allocs/op new allocs/op delta _addressFromString-8 6.00 ± 0% 4.00 ± 0% -33.33% (p=0.000 n=10+10) ``` Also, assure compiler that `s` doesn't escape: Before this commit: ``` ./fstree.go:74:24: leaking param: s ./fstree.go:90:6: moved to heap: addr ``` After this commit: ``` ./fstree.go:74:24: s does not escape ``` Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-10 12:49:31 +03:00
Artem Tataurov	ab21d90cfb	[#1794 ] shard: Add increasing case for the payload size metric Signed-off-by: Artem Tataurov <a.tataurov@yadro.com>	2023-02-09 13:30:23 +03:00
Stanislav Bogatyrev	cb016d53a6	[#1 ] Fix comments and error messages Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>	2023-02-06 17:41:14 +03:00
Stanislav Bogatyrev	c761a95eef	[#1 ] Fix project name in control service Signed-off-by: Stanislav Bogatyrev <s.bogatyrev@yadro.com>	2023-02-06 17:41:14 +03:00
Pavel Karpy	73bc1b0b68	[#38 ] node: Fix linter warnings Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-06 17:27:54 +03:00
Anton Nikiforov	515c60bdf4	[#1889 ] adm: Add command `morph netmap-candidates` Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>	2023-02-06 17:26:34 +03:00
Pavel Karpy	89a0266f5e	[#1794 ] metrics: Track physical object capacity per shard Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-01-26 20:06:28 +03:00
Evgenii Stratonikov	9513f163aa	[#2116 ] metrics: Track physical object capacity in the container Currently we track based on `PayloadSize`, because it is already stored in the metabase and it is easier to calculate without slowing down the whole system. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com> Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-01-26 20:06:28 +03:00
Evgenii Stratonikov	d65a95a2c6	[#28 ] pilorama: Remove `LogMove` struct Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	46c62be7e8	[#28 ] Fix linter issues Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	c72576e72f	[#2208 ] engine: Log time-consuming shard operations Currently the only way to tell whether `evacuate/set-mode` is finished is to set a very big timeout and _hope_ that the operation will finish. In this commit we add INFO logs for such operations which should simplify the life of an administrator. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	87f0e3ea25	[#2208 ] fstree: Rename file after write Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	792319a044	[#2208 ] fstree: Remove file if there was an error during write Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	67c97c6804	[#2210 ] services/tree: Drop messages not in queue Currently, under high load clients are blocked on channel send and the number of goroutines can increase indefinitely. In this commit we drop replication messages if send/recv queue is full and rely on a background synchronization. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	25d5995cef	[#2210 ] pilorama: Allocate bucket name outside of batches 1. Reduce allocations inside transactions. 2. Do not encode container ID to string: it allocates a lot and takes more space. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	165a600624	[#2210 ] pilorama: Reduce the amount of keys per node Under high load we are limited by the _amount_ of keys we need to update in a single transaction. In this commit we try storing all state with a single key. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	64a5294b27	[#2200 ] shard: Do not fetch big objects from blobovniczas Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	91757329ae	[#2200 ] shard: Fix blobstor obj fetching In the previous implementation any non-nil error that preceded object fetching from blobstor led to iterating over every storage (in other words, no storage ID information was taken into account). Now storage ID is skipped only if metabase (storage ID source) returns any error. Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Pavel Karpy	cf1a91a758	[#2206 ] blobovnicza: Use Latin letters in the code Signed-off-by: Pavel Karpy <p.karpy@yadro.com> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
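
Commit 67c97c6804 above describes dropping replication messages when the send/recv queue is full, rather than blocking the client on a channel send. The following is a minimal Go sketch of that general pattern, not the actual frostfs-node code: `trySend` and the queue capacity are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// trySend is a hypothetical helper: it attempts a non-blocking send and
// reports whether the value was queued. When the buffered channel is full,
// the value is dropped instead of blocking the caller.
func trySend[T any](queue chan<- T, v T) bool {
	select {
	case queue <- v:
		return true
	default:
		return false
	}
}

func main() {
	queue := make(chan string, 2) // assumed capacity, for illustration only

	for _, msg := range []string{"op-1", "op-2", "op-3"} {
		if !trySend(queue, msg) {
			// A dropped message is expected to be recovered later by a
			// background synchronization, as the commit message describes.
			fmt.Println("dropped:", msg)
		}
	}
	fmt.Println("queued:", len(queue))
}
```

With this shape the number of blocked goroutines stays bounded, at the cost of relying on a later synchronization pass to deliver whatever was dropped.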


2272 commits
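
Commit e9f3c24229 at the top of the log replaces `strings.Split*` calls with `strings.Cut` where possible. As a small illustration of why that swap is usually straightforward (a sketch, not code taken from the repository; `splitKeyValue` and the sample input are hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// splitKeyValue is a hypothetical helper that parses a "key=value" string.
// strings.Cut (available since Go 1.18) splits around the first separator
// and returns both halves directly, so no intermediate slice is needed as
// with strings.SplitN(s, "=", 2).
func splitKeyValue(s string) (key, value string, ok bool) {
	return strings.Cut(s, "=")
}

func main() {
	key, value, ok := splitKeyValue("endpoint=localhost:8080")
	fmt.Println(key, value, ok)

	// When the separator is absent, ok is false and key holds the whole input.
	key, _, ok = splitKeyValue("no-separator")
	fmt.Println(key, ok)
}
```

The boolean result also removes the need for a separate length check on the slice returned by `strings.SplitN`.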