Commit graph

4461 commits

Author SHA1 Message Date
abd502215f [#970] fstree: Move file locking to the generic writer
It is not a part of FSTree itself, but rather a way to solve concurrent
counter update on non-linux implementations. New linux implementations
is pretty simple: link fails when the file exists, unlink fails when the
file doesn't exist.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-09 16:12:11 +00:00
fb74524ac7 [#970] fstree: Move delete implementation to a separate file
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-09 16:12:11 +00:00
7f692409cf [#970] fstree: Handle unsupported O_TMPFILE
Metabase test relied on this behaviour, so fix the test too.

Cherry-picking was hard and did too many conflicts,
here is an original PR:
https://github.com/nspcc-dev/neofs-node/pull/2624

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-09 16:12:11 +00:00
Roman Khimov
fc31b9c947 [#970] fstree: Add linux-specific file writer using O_TMPFILE
O_TMPFILE is implemented for all modern FSes and it's much easier and safer to
use. If application crashes in the middle of writing this file would be gone
and won't leave any garbage.

Notice that this implementation makes a different choice wrt EEXIST handling,
generic one always overwrites, while this one keeps the old data.

There is no real performance difference.

SSD (slow&old), XFS, Core i7-8565U:

Sync
```
name                                  old time/op    new time/op    delta
Put/size=1024,thread=1/fstree-8         1.74ms ± 3%    0.06ms ± 7%  -96.31%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        10.0ms ±41%     1.1ms ±18%  -88.95%  (p=0.000 n=9+10)
Put/size=1024,thread=100/fstree-8       32.3ms ±60%     6.5ms ±14%  -79.97%  (p=0.000 n=10+10)
Put/size=1048576,thread=1/fstree-8      17.8ms ±90%     3.4ms ±70%  -81.08%  (p=0.000 n=10+10)
Put/size=1048576,thread=20/fstree-8     103ms ±174%    112ms ±158%     ~     (p=0.971 n=10+10)
Put/size=1048576,thread=100/fstree-8     949ms ±78%    583ms ±132%     ~     (p=0.089 n=10+10)

name                                  old alloc/op   new alloc/op   delta
Put/size=1024,thread=1/fstree-8         3.17kB ± 1%    1.96kB ± 0%  -38.09%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        59.6kB ± 1%    39.2kB ± 1%  -34.30%  (p=0.000 n=8+10)
Put/size=1024,thread=100/fstree-8        299kB ± 0%     198kB ± 0%  -33.90%  (p=0.000 n=7+9)
Put/size=1048576,thread=1/fstree-8      3.38kB ± 1%    2.36kB ± 1%  -30.22%  (p=0.000 n=10+10)
Put/size=1048576,thread=20/fstree-8     65.7kB ± 4%    47.7kB ± 6%  -27.27%  (p=0.000 n=10+10)
Put/size=1048576,thread=100/fstree-8     351kB ± 8%     245kB ± 8%  -30.22%  (p=0.000 n=10+10)

name                                  old allocs/op  new allocs/op  delta
Put/size=1024,thread=1/fstree-8           30.3 ± 2%      21.0 ± 0%  -30.69%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8           554 ± 1%       413 ± 0%  -25.35%  (p=0.000 n=8+10)
Put/size=1024,thread=100/fstree-8        2.77k ± 0%     2.07k ± 0%  -25.27%  (p=0.000 n=7+10)
Put/size=1048576,thread=1/fstree-8        32.0 ± 0%      25.0 ± 0%  -21.88%  (p=0.000 n=9+8)
Put/size=1048576,thread=20/fstree-8        609 ± 5%       494 ± 6%  -18.93%  (p=0.000 n=10+10)
Put/size=1048576,thread=100/fstree-8     3.25k ± 9%     2.50k ± 8%  -23.21%  (p=0.000 n=10+10)
```

No sync
```
name                                  old time/op    new time/op    delta
Put/size=1024,thread=1/fstree-8         71.3µs ±10%    59.8µs ±10%  -16.21%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        1.43ms ± 6%    1.22ms ±13%  -14.53%  (p=0.000 n=10+10)
Put/size=1024,thread=100/fstree-8       8.12ms ± 3%    6.36ms ± 2%  -21.67%  (p=0.000 n=8+9)
Put/size=1048576,thread=1/fstree-8      1.88ms ±70%    1.61ms ±78%     ~     (p=0.393 n=10+10)
Put/size=1048576,thread=20/fstree-8     32.7ms ±28%   34.2ms ±112%     ~     (p=0.968 n=9+10)
Put/size=1048576,thread=100/fstree-8     262ms ±56%     226ms ±34%     ~     (p=0.447 n=10+9)

name                                  old alloc/op   new alloc/op   delta
Put/size=1024,thread=1/fstree-8         2.89kB ± 0%    1.96kB ± 0%  -32.28%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        58.2kB ± 0%    39.5kB ± 0%  -32.09%  (p=0.000 n=8+8)
Put/size=1024,thread=100/fstree-8        291kB ± 0%     198kB ± 0%  -32.19%  (p=0.000 n=9+9)
Put/size=1048576,thread=1/fstree-8      3.05kB ± 1%    2.13kB ± 1%  -30.16%  (p=0.000 n=10+9)
Put/size=1048576,thread=20/fstree-8     62.6kB ± 0%    44.3kB ± 0%  -29.23%  (p=0.000 n=9+9)
Put/size=1048576,thread=100/fstree-8     302kB ± 0%     210kB ± 1%  -30.39%  (p=0.000 n=9+9)

name                                  old allocs/op  new allocs/op  delta
Put/size=1024,thread=1/fstree-8           27.0 ± 0%      21.0 ± 0%  -22.22%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8           539 ± 0%       415 ± 0%  -22.98%  (p=0.000 n=10+10)
Put/size=1024,thread=100/fstree-8        2.69k ± 0%     2.07k ± 0%  -23.09%  (p=0.000 n=9+9)
Put/size=1048576,thread=1/fstree-8        28.0 ± 0%      22.3 ± 3%  -20.36%  (p=0.000 n=8+10)
Put/size=1048576,thread=20/fstree-8        577 ± 0%       458 ± 0%  -20.72%  (p=0.000 n=9+9)
Put/size=1048576,thread=100/fstree-8     2.76k ± 0%     2.15k ± 0%  -22.05%  (p=0.000 n=9+8)
```

HDD (LVM), ext4, Ryzen 5 1600:

Sync
```
                                      │ fs.sync-generic │            fs.sync-linux            │
                                      │     sec/op      │    sec/op     vs base               │
Put/size=1024,thread=1/fstree-12           34.70m ± 19%   33.59m ± 16%       ~ (p=0.529 n=10)
Put/size=1024,thread=20/fstree-12          188.8m ±  8%   189.2m ± 16%       ~ (p=0.739 n=10)
Put/size=1024,thread=100/fstree-12         264.8m ± 22%   273.6m ± 28%       ~ (p=0.353 n=10)
Put/size=1048576,thread=1/fstree-12        54.90m ± 14%   47.08m ± 18%       ~ (p=0.063 n=10)
Put/size=1048576,thread=20/fstree-12       244.1m ± 14%   220.4m ± 22%       ~ (p=0.579 n=10)
Put/size=1048576,thread=100/fstree-12      847.2m ±  5%   893.6m ±  3%  +5.48% (p=0.000 n=10)
geomean                                    164.3m         158.9m        -3.29%

                                      │ fs.sync-generic │            fs.sync-linux             │
                                      │      B/op       │     B/op      vs base                │
Put/size=1024,thread=1/fstree-12           3.375Ki ± 1%   2.471Ki ± 1%  -26.80% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12          66.62Ki ± 6%   49.21Ki ± 6%  -26.15% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12         319.2Ki ± 1%   230.9Ki ± 2%  -27.64% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12        3.457Ki ± 1%   2.559Ki ± 1%  -25.97% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12       66.91Ki ± 1%   49.16Ki ± 1%  -26.52% (p=0.000 n=10)
Put/size=1048576,thread=100/fstree-12      338.8Ki ± 2%   252.3Ki ± 3%  -25.54% (p=0.000 n=10)
geomean                                    42.17Ki        31.02Ki       -26.44%

                                      │ fs.sync-generic │            fs.sync-linux            │
                                      │    allocs/op    │  allocs/op   vs base                │
Put/size=1024,thread=1/fstree-12             33.00 ± 0%    27.00 ± 0%  -18.18% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12            639.5 ± 1%    519.0 ± 2%  -18.84% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12          3.059k ± 1%   2.478k ± 2%  -18.99% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12          33.50 ± 1%    28.00 ± 4%  -16.42% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12         638.5 ± 1%    520.0 ± 1%  -18.56% (p=0.000 n=10)
Put/size=1048576,thread=100/fstree-12       3.209k ± 2%   2.655k ± 2%  -17.28% (p=0.000 n=10)
geomean                                      405.3         332.1       -18.05%
```

No sync
```
                                      │ fs.nosync-generic │             fs.nosync-linux              │
                                      │      sec/op       │    sec/op     vs base                    │
Put/size=1024,thread=1/fstree-12           148.2µ ± 20%     136.6µ ± 19%   -7.89% (p=0.029 n=10)
Put/size=1024,thread=20/fstree-12          1.140m ± 26%     1.364m ± 16%        ~ (p=0.143 n=10)
Put/size=1024,thread=100/fstree-12         11.93m ± 68%     26.89m ± 62%        ~ (p=0.123 n=10)
Put/size=1048576,thread=1/fstree-12        1.302m ±  3%     1.287m ±  5%        ~ (p=0.481 n=10)
Put/size=1048576,thread=20/fstree-12       77.52m ±  8%     74.07m ±  7%        ~ (p=0.278 n=10+9)
Put/size=1048576,thread=100/fstree-12      226.1m ±   ∞ ¹
geomean                                    5.986m           3.434m        +18.60%                  ²
¹ need >= 6 samples for confidence interval at level 0.95
² benchmark set differs from baseline; geomeans may not be comparable

                                      │ fs.nosync-generic │             fs.nosync-linux              │
                                      │       B/op        │     B/op      vs base                    │
Put/size=1024,thread=1/fstree-12           2.879Ki ± 0%     1.972Ki ± 0%  -31.51% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12          55.94Ki ± 1%     37.90Ki ± 1%  -32.25% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12         272.6Ki ± 0%     182.1Ki ± 9%  -33.21% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12        3.158Ki ± 0%     2.259Ki ± 0%  -28.46% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12       58.87Ki ± 0%     41.03Ki ± 0%  -30.30% (p=0.000 n=10+9)
Put/size=1048576,thread=100/fstree-12      299.8Ki ±  ∞ ¹
geomean                                    36.71Ki          16.60Ki       -31.17%                  ²
¹ need >= 6 samples for confidence interval at level 0.95
² benchmark set differs from baseline; geomeans may not be comparable

                                      │ fs.nosync-generic │            fs.nosync-linux            │
                                      │     allocs/op     │  allocs/op   vs base                  │
Put/size=1024,thread=1/fstree-12             28.00 ± 0%      22.00 ± 0%  -21.43% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12            530.0 ± 0%      407.5 ± 1%  -23.11% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12          2.567k ± 0%     1.956k ± 9%  -23.77% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12          30.00 ± 0%      24.00 ± 0%  -20.00% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12         553.5 ± 0%      434.0 ± 0%  -21.59% (n=10+9)
Put/size=1048576,thread=100/fstree-12       2.803k ±  ∞ ¹
geomean                                      347.9           178.8       -21.99%                ²
¹ need >= 6 samples for confidence interval at level 0.95
² benchmark set differs from baseline; geomeans may not be comparable
```

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-09 16:12:11 +00:00
ff488b53a1 [#970] fstree: Move write functions to a separate file
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-09 16:12:11 +00:00
9a622a750d [#970] fstree: Move temporary path handling in a separate function
Allow to easier test different implementations.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-09 16:12:11 +00:00
d19ade23c8 [#959] node: Set mode to shard's components when open it
Avoid opening database for `metabase` and `cache` in `Degraded` mode.

Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-09 14:04:01 +00:00
60527abb65 [#947] docs: Extend evacuation docs
Add pilorama evacuation description and examples.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:33:15 +03:00
db67c21d55 [#947] engine: Evacuate trees to remote nodes
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:33:15 +03:00
728150d1d2 [#947] engine: Evacuate trees to local shards
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:33:15 +03:00
e4064c4394 [#947] cli: Print tree evacuation stat
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:20:39 +03:00
15d853ea22 [#947] controlSvc: Return tree evacuation stat
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:20:39 +03:00
b3f3505ada [#947] cli: Allow to specify evacuation scope
It may be required to evacuate only objects or only tree or all, so
now it spossible to specify.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:20:38 +03:00
a6eb66bf9c [#947] evacuate: Refactor evacuate parameters
Drop methods to make it easier to extend.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:20:38 +03:00
8e2a0611f4 [#947] tree: Add method to list all trees
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-09 11:20:38 +03:00
80b581d499 [#466] adm: Allow to download contracts from Gitea
Signed-off-by: Olga Konstantinova <kola43843@gmail.com>
2024-02-08 21:07:49 +00:00
805862f4b7 [#956] node: Allow to reload goroutine pool sizes
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-08 15:49:43 +00:00
426cf58b98 [#956] node: Remove pool sizes from config struct
They are available through the pool methods and unused outside of the
function that sets them.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-08 15:49:43 +00:00
edbe06e07e [#956] policer/test: Reuse testPool helper
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-08 15:49:43 +00:00
cbfeb72466 [#956] policer: Remove WithMaxCapacity option
We already provide the pool and this argument is used only for
preallocation. No functional changes.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-08 15:49:43 +00:00
053a195ac2 [#968] adm: Allow concurrent epoch ticks
Previous fix was incomplete, there are two possible places for this
error to occur.

Refs #592

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-08 08:10:24 +00:00
cfc5ce7853 [#964] metabase: Drop GC marks if object not found
GC inhumes expired locks and tombstones on all the shards.
So it could be GC mark without object.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-08 07:54:39 +00:00
c3fa902780 [#969] policer: Restrict the number of remembered errors
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-08 10:10:41 +03:00
6010dfdf3d [#969] policer: Make error skip thread-safe
Introduces in afd2ba9a66.
Refs #914

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-08 10:10:41 +03:00
a6c9a337cd [#965] morph: Get rid of container.List invocations
ContainersOf() is better in almost every aspect, besides creating a
session when the containers number is between 1024 and 2048 (prefetch
script does limited unwrapping). Making List() private helps to ensure
it is no longer used and can be safely removed in future.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-07 08:56:27 +00:00
b1a1b2107d [#909] cli: Make add-rule and list-rules recieve namespace param
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2024-02-07 06:54:41 +00:00
d7838790c6 [#917] dev: Extend launch.json example
Add storage and dev wallets to control authorized keys list.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-06 14:49:47 +03:00
20b4447df7 [#917] docs: Extend shard mode description
Add shards detach details.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-06 14:49:47 +03:00
9ba48c582d [#917] engine: Allow to detach shards
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-06 14:49:47 +03:00
4358d3c423 [#917] controlSvc: Add DetachShards handler
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-02-06 14:47:52 +03:00
afd2ba9a66 [#110] Add check for repeated error log in policer
processObject() returns 3 types of errors: container not found errors,
could not get container error and placement vector building error. Every
error will occur for all objects in container simultaneously, so we can
log each error once and safely ignore the rest.

Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
2024-02-06 00:56:41 +03:00
602ee11123 [#934] containersvc: Marhal public key in short format for APE
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-02 17:51:38 +00:00
c1a5b831b6 [#955] chainbase: Fix rule chain unmarshalling
* Use correct way DecodeBytes instead unmarshalling by json.

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2024-02-02 20:28:04 +03:00
befbaf9d56 [#922] cli: Add new command control list-targets
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-02 12:09:51 +00:00
f64c48b157 [#922] Fix linter issue
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-02 12:09:51 +00:00
9916598dfb [#922] control: Extend api with ListOverrideDefinedTargets
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-02 12:09:51 +00:00
95e15f499f [#922] Update files generated by protoc
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-02 12:09:51 +00:00
2cb04379a4 [#922] go.mod: Update APE
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-02 12:09:51 +00:00
a5446bc17d [#952] object: Pass namespace within context in ACL service
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2024-02-02 14:48:11 +03:00
d0eadf7ea2 [#799] engine: Skip put when object removed from shard
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-01 17:49:22 +00:00
6534252c22 [#799] policer: Refactor method processNodes
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-02-01 17:49:22 +00:00
5be2af881a [#934] container: Make container APE middleware read namespaces
* Those methods that can access already existing containers and thus
  can get container properties should read namespace from Zone
  property. If Zone is not set, take a namespace for root.
* Otherwise, define namespaces by owner ID via frostfs-id contract.
* Improve unit-tests, consider more cases.

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2024-02-01 17:38:24 +00:00
96c86c4637 [#934] adm: Make frostfsid commands read alphabet wallets
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2024-02-01 17:38:24 +00:00
4352bd0e8e [#934] ape: Transform empty namespace within chainbase
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2024-02-01 17:38:24 +00:00
483a67b170 [#937] ape: Validate chain resource name
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-01-31 11:34:35 +03:00
e3573de6db [#930] gc: Stop internal activity by context
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-01-31 08:30:34 +00:00
c441296592 [#930] policer: Release task pool when context cancelled
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2024-01-31 08:30:34 +00:00
675eec91f3 [#938] shard: Update only changed counters
If metric value hasn't changed, but we update metric, then
non existed metric will apear with zero value.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-01-30 12:37:48 +03:00
c681354afd [#938] engine: Fix container count removal
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2024-01-30 12:37:48 +03:00
df055fead5 [#931] morph: Provide batch size for container listing explicitly
Besides VM stack item limit we also have restrictions on the size of
JSON for a stackitem, which is 128k. This limit is much harder to
calculate, because JSON representation includes type and the encoding is
different for different items. Thus is makes no sense to invent our own
default, so use the one provided by neo-go. But for container listing we
know exactly what we process, so use big enough value, which is tested.

Introduced in be8607a1f6.

Refs #902
Refs https://github.com/nspcc-dev/neo-go/blob/v0.105.0/pkg/vm/stackitem/json.go#L353

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-01-29 14:00:11 +00:00