frostfs-node

Author	SHA1	Message	Date
Evgenii Stratonikov	58367e4df6	[#2232 ] pilorama: Merge in-queue batches To achieve high performance we must choose proper values for both batch size and delay. For user operations we want to set a low delay. However, that would prevent tree synchronization operations from forming big enough batches. For these operations, batching gives the most benefit not only in terms of on-CPU execution cost, but also by speeding up transaction persist (`fsync`). In this commit we try merging batches that are already _triggered_, but not yet _started to execute_. This way we can still query batches for execution after the provided delay while also allowing multiple formed batches to execute faster. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-02-20 13:53:27 +03:00
Pavel Karpy	73bc1b0b68	[#38 ] node: Fix linter warnings Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2023-02-06 17:27:54 +03:00
Evgenii Stratonikov	d65a95a2c6	[#28 ] pilorama: Remove `LogMove` struct Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	25d5995cef	[#2210 ] pilorama: Allocate bucket name outside of batches 1. Reduce allocations inside transactions. 2. Do not encode container ID to string: it allocates a lot and takes more space. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	165a600624	[#2210 ] pilorama: Reduce the amount of keys per node Under high load we are limited by the _amount_ of keys we need to update in a single transaction. In this commit we try storing all state with a single key. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	ac81c70c09	[#1621 ] pilorama: Batch related operations Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru> Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru> Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	cedbd380f2	[#2197 ] pilorama: Close database in degraded mode Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	b0ad1b9ed2	[#2193 ] pilorama: Use `do` in `TreeMove` It should be similar to a `TreeAddByPath`. `applyOperation` is used for `Apply` when the operation can be inserted in the middle of a log. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2023-01-25 15:31:47 +03:00
Evgenii Stratonikov	b4e90cdf51	[#2165 ] pilorama: Optimize `TreeApply` when used for synchronization Because synchronization _most likely_ will have to apply already existing operations, it is much faster to check their presence in a read transaction. However, always doing this will degrade the performance for normal `Apply`. And, let's be honest, it is already not good. Thus we add a separate parameter which specifies whether this logic is enabled. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2022-12-30 11:07:35 +03:00
Evgenii Stratonikov	e1c3bdbfa6	[#1621 ] pilorama: Remove `Timestamp` field from `nodeInfo` It is already present in `Meta`. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru> Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-12-30 11:07:35 +03:00
Evgenii Stratonikov	1044adbe94	[#1621 ] pilorama: Improve memory allocation Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru> Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-12-30 11:07:35 +03:00
Evgenii Stratonikov	2539d466a6	[#1621 ] pilorama: Seek after cursor invalidation Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru> Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-12-30 11:07:35 +03:00
Evgenii Stratonikov	e9ba8931f8	[#1621 ] pilorama: Simplify bucket creation Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru> Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-12-30 11:07:35 +03:00
Evgenii Stratonikov	fe7ddfdc6a	[#1621 ] pilorama: Compare memory forests properly Node children are not sorted and could occur in any order. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru> Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-12-30 11:07:35 +03:00
Evgenii Stratonikov	e5c304536b	[#2161 ] pilorama: Do not apply already existing operations Speeds up synchronization a bit. Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>	2022-12-30 11:07:35 +03:00
Pavel Karpy	923f84722a	Move to frostfs-node Signed-off-by: Pavel Karpy <p.karpy@yadro.com>	2022-12-28 15:04:29 +03:00
Evgenii Stratonikov	7335a52f29	[#1732 ] pilorama: Improve logical error handling Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-11-30 16:58:52 +03:00
Anton Nikiforov	9a20498f34	[#1940 ] Removing all trees by container ID if tree ID is empty in `pilorama.Forest.TreeDrop` Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>	2022-11-19 11:01:04 +03:00
Evgenii Stratonikov	a3e7365cbd	[#1732 ] pilorama: Fill parent mark correctly Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-11-19 11:01:04 +03:00
Evgenii Stratonikov	134f2ba02e	[#1732 ] pilorama: Fix backwards log insertion Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-11-19 11:01:04 +03:00
Evgenii Stratonikov	d8d3588e1b	[#1996 ] engine: Always select proper shard for a tree Currently there is a possibility for modifying operations to fail because of I/O errors and a new tree to be created on another shard. This commit adds an existence check for modifying operations. Read operations remain as they are, so as not to slow things down. `TreeDrop` is an exception, because this is a tree removal and trying multiple shards is not an unwanted behaviour. Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-11-03 15:29:23 +03:00
Evgenii Stratonikov	56de2f1363	[#1969 ] local_object_storage: Simplify logic error construction Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-10-31 11:41:24 +03:00
Evgenii Stratonikov	fcdbf5e509	[#1969 ] local_object_storage: Add a type for logical errors All logic errors are wrapped in `logicerr.Logical` type and do not affect shard error counter. Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-10-31 11:41:24 +03:00
Pavel Karpy	24e9e3f3bf	[#1902 ] engine, shard: Implement `TreeList` method Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>	2022-10-20 16:17:57 +03:00
Pavel Karpy	19850ef157	[#1902 ] pilorama: Add `TreeList` method To both `bolt` and `memory` forests; extend `Forest` interface. Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>	2022-10-20 16:17:57 +03:00
Evgenii Stratonikov	d772e35aba	[#1910 ] .golangci.yml: Add `godot` linter Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-10-18 15:08:26 +03:00
Evgenii Stratonikov	87bd49563e	[#1761 ] pilorama: Use batch size of 1 for tests Improve tests speed by a lot and use more iterations in `ApplyRandom` test. Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-09-12 09:54:15 +03:00
Evgenii Stratonikov	a2bb3a2a96	[#1630 ] pilorama: Support dropping trees Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-09-12 09:54:15 +03:00
Evgenii Stratonikov	ae52d53609	[#1698 ] pilorama: Add a test for the empty FileName Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-08-30 16:52:51 +03:00
Evgenii Stratonikov	482a7e7f2f	[#1686 ] pilorama: Add generic tests Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>	2022-08-30 10:02:43 +03:00
Evgenii Stratonikov	3df62769c0	[#1559 ] local_object_storage: Allow to set mode for all components Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 17:56:06 +03:00
Evgenii Stratonikov	1e786233bf	[#1559 ] local_object_storage: Provide readOnly flag to `Open` We should be able to reopen storage in read-only mode at runtime. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 17:56:06 +03:00
Evgenii Stratonikov	d62723f038	[#1505 ] pilorama: Provide timeout to `bbolt.Open` Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	26041f18bf	[#1505 ] pilorama: Allow to customize database parameters Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	5e843a73f9	[#1333 ] services/control: Return pilorama info in `ListShards` RPC Do not return backend type from the service for now, because memory backend is expected to vanish. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	735931c842	[#1481 ] pilorama: Fix `TreeApply` Current implementation prevents invalid operations from becoming valid at some later point (consider adding a child to a non-existent parent and then adding the parent). This seems to diverge from the paper algorithm and complicates implementation. Make it simpler. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	ad3038d16d	[#1444 ] pilorama: Fix `TreeMove` in bbolt backend Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	8027b7bb6b	[#1444 ] pilorama: Optimize internal encoding/decoding ``` name old time/op new time/op delta ApplySequential/bbolt-8 55.5µs ± 4% 55.5µs ± 3% ~ (p=1.000 n=10+7) ApplyReorderLast/bbolt-8 108µs ± 6% 112µs ± 8% ~ (p=0.077 n=9+9) name old alloc/op new alloc/op delta ApplySequential/bbolt-8 28.8kB ± 3% 27.7kB ± 6% -3.79% (p=0.005 n=10+10) ApplyReorderLast/bbolt-8 41.4kB ± 5% 38.9kB ± 5% -6.19% (p=0.001 n=10+9) name old allocs/op new allocs/op delta ApplySequential/bbolt-8 262 ± 2% 235 ±10% -10.41% (p=0.000 n=10+10) ApplyReorderLast/bbolt-8 684 ± 6% 616 ± 7% -10.04% (p=0.000 n=10+9) ``` Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	4437cd7113	[#1442 ] pilorama: Generate timestamp based on node position in the container Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	3312924b82	[#1431 ] pilorama: Use `Batch` for write transactions Helps a lot in case of concurrent request flow. ``` name old time/op new time/op delta ApplySequential/bbolt-8 78.0µs ± 9% 59.8µs ± 4% -23.39% (p=0.000 n=10+9) ApplyReorderLast/bbolt-8 143µs ± 5% 113µs ±15% -21.06% (p=0.000 n=10+10) name old alloc/op new alloc/op delta ApplySequential/bbolt-8 56.9kB ± 8% 28.9kB ± 3% -49.22% (p=0.000 n=10+10) ApplyReorderLast/bbolt-8 87.3kB ± 3% 40.9kB ±10% -53.16% (p=0.000 n=10+10) name old allocs/op new allocs/op delta ApplySequential/bbolt-8 224 ±11% 262 ± 5% +16.93% (p=0.000 n=9+10) ApplyReorderLast/bbolt-8 518 ± 4% 674 ±11% +30.09% (p=0.000 n=10+10) ``` Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	f0a67f948d	[#1431 ] pilorama: Cache attributes in the index Currently to find a node by path we iterate over all the children on each level. This is far from optimal and scales badly with the number of nodes on a single level. Thus we introduce "indexed attributes" for which additional information is stored and which can be used in `*ByPath` operations. Currently this set only includes the `FileName` attribute but this may change in the future. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	835170e452	[#1329 ] pilorama: Allow to benchmark all tree backends Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	536857ea5a	[#1329 ] services/tree: Implement `GetOpLog` RPC Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	7703dd5d7f	[#1419 ] pilorama: Create new nodes in path if needed Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add a new node by path `["dir", "file.txt"]`, create a new intermediate node with a single attribute. `GetByPath` now also considers only nodes with a single attribute while building a path. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	ad48918a97	[#1406 ] pilorama: Return parent from `TreeGetMeta` Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	8844d9b2db	[#1344 ] pilorama: Document errors for Get* methods Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	910db42748	[#1344 ] pilorama: Use `require.ErrorIs` in tests Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	aea855e8f3	[#1326 ] services/tree: Implement GetSubTree RPC Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00
Evgenii Stratonikov	8cf71b7f1c	[#1324 ] local_object_storage: Implement tree service backend In this commit we implement the algorithm for CRDT trees from https://martin.klepmann.com/papers/move-op.pdf Each tree is identified by the ID of the container it belongs to and the tree name itself. Essentially, it is a sequence of operations which should be applied in chronological order to get the usual tree representation. There are 2 backends for now: bbolt database and in-memory. The in-memory backend is here for debugging and will eventually act as a memory cache for the on-disk database. Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>	2022-07-21 15:08:24 +03:00

49 commits