frostfs-node/pkg/local_object_storage
Roman Khimov fc31b9c947 [#970] fstree: Add linux-specific file writer using O_TMPFILE
O_TMPFILE is implemented for all modern FSes and it's much easier and safer to
use. If application crashes in the middle of writing this file would be gone
and won't leave any garbage.

Notice that this implementation makes a different choice wrt EEXIST handling,
generic one always overwrites, while this one keeps the old data.

There is no real performance difference.

SSD (slow&old), XFS, Core i7-8565U:

Sync
```
name                                  old time/op    new time/op    delta
Put/size=1024,thread=1/fstree-8         1.74ms ± 3%    0.06ms ± 7%  -96.31%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        10.0ms ±41%     1.1ms ±18%  -88.95%  (p=0.000 n=9+10)
Put/size=1024,thread=100/fstree-8       32.3ms ±60%     6.5ms ±14%  -79.97%  (p=0.000 n=10+10)
Put/size=1048576,thread=1/fstree-8      17.8ms ±90%     3.4ms ±70%  -81.08%  (p=0.000 n=10+10)
Put/size=1048576,thread=20/fstree-8     103ms ±174%    112ms ±158%     ~     (p=0.971 n=10+10)
Put/size=1048576,thread=100/fstree-8     949ms ±78%    583ms ±132%     ~     (p=0.089 n=10+10)

name                                  old alloc/op   new alloc/op   delta
Put/size=1024,thread=1/fstree-8         3.17kB ± 1%    1.96kB ± 0%  -38.09%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        59.6kB ± 1%    39.2kB ± 1%  -34.30%  (p=0.000 n=8+10)
Put/size=1024,thread=100/fstree-8        299kB ± 0%     198kB ± 0%  -33.90%  (p=0.000 n=7+9)
Put/size=1048576,thread=1/fstree-8      3.38kB ± 1%    2.36kB ± 1%  -30.22%  (p=0.000 n=10+10)
Put/size=1048576,thread=20/fstree-8     65.7kB ± 4%    47.7kB ± 6%  -27.27%  (p=0.000 n=10+10)
Put/size=1048576,thread=100/fstree-8     351kB ± 8%     245kB ± 8%  -30.22%  (p=0.000 n=10+10)

name                                  old allocs/op  new allocs/op  delta
Put/size=1024,thread=1/fstree-8           30.3 ± 2%      21.0 ± 0%  -30.69%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8           554 ± 1%       413 ± 0%  -25.35%  (p=0.000 n=8+10)
Put/size=1024,thread=100/fstree-8        2.77k ± 0%     2.07k ± 0%  -25.27%  (p=0.000 n=7+10)
Put/size=1048576,thread=1/fstree-8        32.0 ± 0%      25.0 ± 0%  -21.88%  (p=0.000 n=9+8)
Put/size=1048576,thread=20/fstree-8        609 ± 5%       494 ± 6%  -18.93%  (p=0.000 n=10+10)
Put/size=1048576,thread=100/fstree-8     3.25k ± 9%     2.50k ± 8%  -23.21%  (p=0.000 n=10+10)
```

No sync
```
name                                  old time/op    new time/op    delta
Put/size=1024,thread=1/fstree-8         71.3µs ±10%    59.8µs ±10%  -16.21%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        1.43ms ± 6%    1.22ms ±13%  -14.53%  (p=0.000 n=10+10)
Put/size=1024,thread=100/fstree-8       8.12ms ± 3%    6.36ms ± 2%  -21.67%  (p=0.000 n=8+9)
Put/size=1048576,thread=1/fstree-8      1.88ms ±70%    1.61ms ±78%     ~     (p=0.393 n=10+10)
Put/size=1048576,thread=20/fstree-8     32.7ms ±28%   34.2ms ±112%     ~     (p=0.968 n=9+10)
Put/size=1048576,thread=100/fstree-8     262ms ±56%     226ms ±34%     ~     (p=0.447 n=10+9)

name                                  old alloc/op   new alloc/op   delta
Put/size=1024,thread=1/fstree-8         2.89kB ± 0%    1.96kB ± 0%  -32.28%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8        58.2kB ± 0%    39.5kB ± 0%  -32.09%  (p=0.000 n=8+8)
Put/size=1024,thread=100/fstree-8        291kB ± 0%     198kB ± 0%  -32.19%  (p=0.000 n=9+9)
Put/size=1048576,thread=1/fstree-8      3.05kB ± 1%    2.13kB ± 1%  -30.16%  (p=0.000 n=10+9)
Put/size=1048576,thread=20/fstree-8     62.6kB ± 0%    44.3kB ± 0%  -29.23%  (p=0.000 n=9+9)
Put/size=1048576,thread=100/fstree-8     302kB ± 0%     210kB ± 1%  -30.39%  (p=0.000 n=9+9)

name                                  old allocs/op  new allocs/op  delta
Put/size=1024,thread=1/fstree-8           27.0 ± 0%      21.0 ± 0%  -22.22%  (p=0.000 n=10+10)
Put/size=1024,thread=20/fstree-8           539 ± 0%       415 ± 0%  -22.98%  (p=0.000 n=10+10)
Put/size=1024,thread=100/fstree-8        2.69k ± 0%     2.07k ± 0%  -23.09%  (p=0.000 n=9+9)
Put/size=1048576,thread=1/fstree-8        28.0 ± 0%      22.3 ± 3%  -20.36%  (p=0.000 n=8+10)
Put/size=1048576,thread=20/fstree-8        577 ± 0%       458 ± 0%  -20.72%  (p=0.000 n=9+9)
Put/size=1048576,thread=100/fstree-8     2.76k ± 0%     2.15k ± 0%  -22.05%  (p=0.000 n=9+8)
```

HDD (LVM), ext4, Ryzen 5 1600:

Sync
```
                                      │ fs.sync-generic │            fs.sync-linux            │
                                      │     sec/op      │    sec/op     vs base               │
Put/size=1024,thread=1/fstree-12           34.70m ± 19%   33.59m ± 16%       ~ (p=0.529 n=10)
Put/size=1024,thread=20/fstree-12          188.8m ±  8%   189.2m ± 16%       ~ (p=0.739 n=10)
Put/size=1024,thread=100/fstree-12         264.8m ± 22%   273.6m ± 28%       ~ (p=0.353 n=10)
Put/size=1048576,thread=1/fstree-12        54.90m ± 14%   47.08m ± 18%       ~ (p=0.063 n=10)
Put/size=1048576,thread=20/fstree-12       244.1m ± 14%   220.4m ± 22%       ~ (p=0.579 n=10)
Put/size=1048576,thread=100/fstree-12      847.2m ±  5%   893.6m ±  3%  +5.48% (p=0.000 n=10)
geomean                                    164.3m         158.9m        -3.29%

                                      │ fs.sync-generic │            fs.sync-linux             │
                                      │      B/op       │     B/op      vs base                │
Put/size=1024,thread=1/fstree-12           3.375Ki ± 1%   2.471Ki ± 1%  -26.80% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12          66.62Ki ± 6%   49.21Ki ± 6%  -26.15% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12         319.2Ki ± 1%   230.9Ki ± 2%  -27.64% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12        3.457Ki ± 1%   2.559Ki ± 1%  -25.97% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12       66.91Ki ± 1%   49.16Ki ± 1%  -26.52% (p=0.000 n=10)
Put/size=1048576,thread=100/fstree-12      338.8Ki ± 2%   252.3Ki ± 3%  -25.54% (p=0.000 n=10)
geomean                                    42.17Ki        31.02Ki       -26.44%

                                      │ fs.sync-generic │            fs.sync-linux            │
                                      │    allocs/op    │  allocs/op   vs base                │
Put/size=1024,thread=1/fstree-12             33.00 ± 0%    27.00 ± 0%  -18.18% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12            639.5 ± 1%    519.0 ± 2%  -18.84% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12          3.059k ± 1%   2.478k ± 2%  -18.99% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12          33.50 ± 1%    28.00 ± 4%  -16.42% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12         638.5 ± 1%    520.0 ± 1%  -18.56% (p=0.000 n=10)
Put/size=1048576,thread=100/fstree-12       3.209k ± 2%   2.655k ± 2%  -17.28% (p=0.000 n=10)
geomean                                      405.3         332.1       -18.05%
```

No sync
```
                                      │ fs.nosync-generic │             fs.nosync-linux              │
                                      │      sec/op       │    sec/op     vs base                    │
Put/size=1024,thread=1/fstree-12           148.2µ ± 20%     136.6µ ± 19%   -7.89% (p=0.029 n=10)
Put/size=1024,thread=20/fstree-12          1.140m ± 26%     1.364m ± 16%        ~ (p=0.143 n=10)
Put/size=1024,thread=100/fstree-12         11.93m ± 68%     26.89m ± 62%        ~ (p=0.123 n=10)
Put/size=1048576,thread=1/fstree-12        1.302m ±  3%     1.287m ±  5%        ~ (p=0.481 n=10)
Put/size=1048576,thread=20/fstree-12       77.52m ±  8%     74.07m ±  7%        ~ (p=0.278 n=10+9)
Put/size=1048576,thread=100/fstree-12      226.1m ±   ∞ ¹
geomean                                    5.986m           3.434m        +18.60%                  ²
¹ need >= 6 samples for confidence interval at level 0.95
² benchmark set differs from baseline; geomeans may not be comparable

                                      │ fs.nosync-generic │             fs.nosync-linux              │
                                      │       B/op        │     B/op      vs base                    │
Put/size=1024,thread=1/fstree-12           2.879Ki ± 0%     1.972Ki ± 0%  -31.51% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12          55.94Ki ± 1%     37.90Ki ± 1%  -32.25% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12         272.6Ki ± 0%     182.1Ki ± 9%  -33.21% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12        3.158Ki ± 0%     2.259Ki ± 0%  -28.46% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12       58.87Ki ± 0%     41.03Ki ± 0%  -30.30% (p=0.000 n=10+9)
Put/size=1048576,thread=100/fstree-12      299.8Ki ±  ∞ ¹
geomean                                    36.71Ki          16.60Ki       -31.17%                  ²
¹ need >= 6 samples for confidence interval at level 0.95
² benchmark set differs from baseline; geomeans may not be comparable

                                      │ fs.nosync-generic │            fs.nosync-linux            │
                                      │     allocs/op     │  allocs/op   vs base                  │
Put/size=1024,thread=1/fstree-12             28.00 ± 0%      22.00 ± 0%  -21.43% (p=0.000 n=10)
Put/size=1024,thread=20/fstree-12            530.0 ± 0%      407.5 ± 1%  -23.11% (p=0.000 n=10)
Put/size=1024,thread=100/fstree-12          2.567k ± 0%     1.956k ± 9%  -23.77% (p=0.000 n=10)
Put/size=1048576,thread=1/fstree-12          30.00 ± 0%      24.00 ± 0%  -20.00% (p=0.000 n=10)
Put/size=1048576,thread=20/fstree-12         553.5 ± 0%      434.0 ± 0%  -21.59% (n=10+9)
Put/size=1048576,thread=100/fstree-12       2.803k ±  ∞ ¹
geomean                                      347.9           178.8       -21.99%                ²
¹ need >= 6 samples for confidence interval at level 0.95
² benchmark set differs from baseline; geomeans may not be comparable
```

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2024-02-09 16:12:11 +00:00
..
blobovnicza [#874] engine: Check object existance concurrently 2024-01-23 09:28:29 +03:00
blobstor [#970] fstree: Add linux-specific file writer using O_TMPFILE 2024-02-09 16:12:11 +00:00
engine [#947] engine: Evacuate trees to remote nodes 2024-02-09 11:33:15 +03:00
internal [#959] node: Set mode to shard's components when open it 2024-02-09 14:04:01 +00:00
metabase [#959] node: Set mode to shard's components when open it 2024-02-09 14:04:01 +00:00
metrics [#661] metrcis: Add rebuild percent metric 2023-12-07 15:37:33 +03:00
pilorama [#959] node: Set mode to shard's components when open it 2024-02-09 14:04:01 +00:00
shard [#959] node: Set mode to shard's components when open it 2024-02-09 14:04:01 +00:00
util [#587] Do not use math/rand.Read 2023-08-09 16:02:44 +03:00
writecache [#959] node: Set mode to shard's components when open it 2024-02-09 14:04:01 +00:00