Commit graph

1302 commits

Author SHA1 Message Date
0ee7467da5 [#1715] config: Add compression config section
To group all `compression_*` parameters together.

Change-Id: I11ad9600f731903753fef1adfbc0328ef75bbf87
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-15 15:05:50 +00:00
8c746a914a [#1715] compression: Decouple Config and Compressor
Refactoring.

Change-Id: Ide2e1378f30c39045d4bacd13a902331bd4f764f
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-15 15:05:45 +00:00
98308d0cad [#1715] blobstor: Allow to specify custom compression level
Change-Id: I140c39b9dceaaeb58767061b131777af22242b19
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-15 15:05:39 +00:00
0712c113de
[#1700] gc: Fix deadlock
`HandleExpiredLocks` gets read lock, then `shard.Close` tries to acquire
write lock, but `HandleExpiredLocks` calls `inhumeUnlockedIfExpired` or
`selectExpired`, that try to acquire read lock again.

Change-Id: Ib2ed015e859328045b5a542a4f569e5e0ff8b05b
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-15 10:06:05 +03:00
f37babdc54
[#1700] shard: Lock shard's mode mutex on close
To prevent race between GC handlers and close.

Change-Id: I06219230964f000f666a56158d3563c760518c3b
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-14 14:35:16 +03:00
fd37cea443
[#1700] engine: Drop unused block execution methods
`BlockExecution` and `ResumeExecution` were used only by unit test.
So drop them and simplify code.

Change-Id: Ib3de324617e8a27fc1f015542ac5e94df5c60a6e
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-14 14:35:15 +03:00
e06ecacf57
[#1705] engine: Use condition var for evacuation unit tests
To know exactly when the evacuation was completed,
a conditional variable was added.

Closes #1705

Change-Id: I86f6d7d2ad2b9759905b6b5e9341008cb74f5dfd
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-11 09:20:46 +03:00
4f9d237042
[#1689] linter: Fix staticcheck warning: 'probably want to use time.Time.Equal instead'
Change-Id: Idb119d3f4f167c9e42ed48633d301185589553ed
Signed-off-by: Alexander Chuprov <a.chuprov@yadro.com>
2025-04-09 16:55:35 +03:00
faec499b38
[#1689] linter: Fix staticcheck warning: 'variable naming format'
Change-Id: I8f8b63a6a5f9b6feb7c91f70fe8ac092575b145c
Signed-off-by: Alexander Chuprov <a.chuprov@yadro.com>
2025-04-09 16:55:35 +03:00
b0ef737a74 [#1689] linter: Fix testifylint warning: 'len: use require.Len'
Change-Id: I7a08f09c169ac237647dcb20b0737f1c51c441ad
Signed-off-by: Alexander Chuprov <a.chuprov@yadro.com>
2025-04-08 09:32:30 +00:00
6f7b6b65f3 [#1689] linter: Fix staticcheck warning: 'embedded field can be simplified'
Change-Id: I8f454f7d09973cdea096495c3949b88cdd01102e
Signed-off-by: Alexander Chuprov <a.chuprov@yadro.com>
2025-04-08 09:32:24 +00:00
b4b053cecd
[#1679] node: Fix 'gocognit' warning
Change-Id: I6e2a278af51869c05c306c2910ba85130e39532e
Signed-off-by: Alexander Chuprov <a.chuprov@yadro.com>
2025-04-07 14:10:25 +03:00
5350632e01 [#1705] engine/test: Increase evacuation timeout
This test was flaky in CI probably because of runner load fluctuations.
Let's increase the timeout and see if the flakiness goes away.

(close #1705)

Change-Id: I76f96e3d6f4adb3d5de0e27b8ee6b47685236277
Signed-off-by: Vitaliy Potyarkin <v.potyarkin@yadro.com>
2025-04-04 14:41:56 +00:00
634de97509
[#1704] metabase: Do not ignore errors by Delete
Change-Id: Ie7b89071a007f53f55879ff9e7e0c25d24ad5dbf
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-03 10:09:05 +03:00
e142d25fac
[#1700] gc: Wait for handlers on GC stopping
First wait for goroutine handles epoch events to not to get data race
on `gc.newEpochHandlers.cancelFunc`.

Then cancel handlers and wait for them.

Change-Id: I71f11f8526961f8356f582a95b10eb8340c0aedd
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-01 16:00:41 +03:00
f62d81e26a
[#1700] gc: Take mode mutex in locks handlers
Change-Id: I4408eae3aed936f85427b6246dcf727bd6813a0d
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-01 14:23:03 +03:00
27899598dc
[#1700] gc: Drop Event interface
There is only one event: new epoch.

Change-Id: I982f3650f7bc753ff2782393625452f0f8cdcc35
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-01 14:23:02 +03:00
bc6cc9ae2a
[#1700] engine: Print stacks on test request limiter
Change-Id: I4952769ca431d1049955823b41b99b0984b385fc
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-04-01 14:23:02 +03:00
d144abc977 [#1692] metabase: Remove useless count variable
It is always equal to `len(to)`.

Change-Id: Id7a4c26e9711216b78f96e6b2511efa0773e3471
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-28 12:16:01 +00:00
a2053870e2 [#1692] metabase: Use bucket cache in ListWithCursor()
No changes in speed, but unified approach:
```
goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/metabase
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                           │    master    │                 new                 │
                           │    sec/op    │    sec/op     vs base               │
ListWithCursor/1_item-8      6.067µ ±  8%   5.731µ ± 10%       ~ (p=0.052 n=10)
ListWithCursor/10_items-8    25.40µ ± 11%   26.12µ ± 13%       ~ (p=0.971 n=10)
ListWithCursor/100_items-8   210.7µ ±  9%   203.2µ ±  6%       ~ (p=0.280 n=10)
geomean                      31.90µ         31.22µ        -2.16%

                           │    master    │                  new                  │
                           │     B/op     │     B/op      vs base                 │
ListWithCursor/1_item-8      3.287Ki ± 0%   3.287Ki ± 0%       ~ (p=1.000 n=10) ¹
ListWithCursor/10_items-8    15.63Ki ± 0%   15.62Ki ± 0%       ~ (p=0.328 n=10)
ListWithCursor/100_items-8   138.1Ki ± 0%   138.1Ki ± 0%       ~ (p=0.340 n=10)
geomean                      19.21Ki        19.21Ki       -0.00%
¹ all samples are equal

                           │   master    │                 new                  │
                           │  allocs/op  │  allocs/op   vs base                 │
ListWithCursor/1_item-8       109.0 ± 0%    109.0 ± 0%       ~ (p=1.000 n=10) ¹
ListWithCursor/10_items-8     380.0 ± 0%    380.0 ± 0%       ~ (p=1.000 n=10) ¹
ListWithCursor/100_items-8   3.082k ± 0%   3.082k ± 0%       ~ (p=1.000 n=10) ¹
geomean                       503.5         503.5       +0.00%
¹ all samples are equal
```

Change-Id: Ic11673427615053656b2a60068a6d4dbd27af2cb
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-28 12:16:01 +00:00
5470b205fd [#1619] gc: Fix metric frostfs_node.garbage_collector.marking_duration_seconds
Change-Id: I957f930d1babf179d0fb6de624a90f4fe9977862
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
2025-03-26 14:14:26 +03:00
3f4717a37f [#1692] metabase: Do not allocate map in cache unless needed
Change-Id: I8b1015a8c7c3df4153a08fdb788117d9f0d6c333
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-21 08:56:32 +00:00
60cea8c714 [#1692] metabase/test: Fix end of iteration error check
This is not good:
```
BenchmarkListWithCursor/1_item-8                --- FAIL: BenchmarkListWithCursor/1_item-8
    list_test.go:63: error: end of object listing
```

Change-Id: I61b70937ce30fefaf16ebeb0cdb51bdd39096061
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-21 08:56:18 +00:00
e8801dbf49 [#1691] metabase: Move cheaper conditions to the front in ListWithCursor()
`objectLocked` call is expensive, it does IO. We may omit it if object
is not expired.

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-20 12:52:36 +00:00
eb9df85b98 [#1685] metabase: Cache primary bucket
```
goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/metabase
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                          │   expired    │               primary               │
                          │    sec/op    │    sec/op     vs base               │
Select/string_equal-8       3.529m ± 11%   3.689m ±  7%  +4.55% (p=0.023 n=10)
Select/string_not_equal-8   3.440m ±  7%   3.543m ± 13%       ~ (p=0.190 n=10)
Select/common_prefix-8      3.240m ±  6%   3.050m ±  5%  -5.85% (p=0.005 n=10)
Select/unknown-8            3.198m ±  6%   2.928m ±  8%  -8.44% (p=0.003 n=10)
geomean                     3.349m         3.287m        -1.84%

                          │   expired    │               primary               │
                          │     B/op     │     B/op      vs base               │
Select/string_equal-8       1.885Mi ± 0%   1.786Mi ± 0%  -5.23% (p=0.000 n=10)
Select/string_not_equal-8   1.885Mi ± 0%   1.786Mi ± 0%  -5.23% (p=0.000 n=10)
Select/common_prefix-8      1.885Mi ± 0%   1.786Mi ± 0%  -5.23% (p=0.000 n=10)
Select/unknown-8            1.877Mi ± 0%   1.779Mi ± 0%  -5.26% (p=0.000 n=10)
geomean                     1.883Mi        1.784Mi       -5.24%

                          │   expired   │              primary               │
                          │  allocs/op  │  allocs/op   vs base               │
Select/string_equal-8       46.04k ± 0%   43.04k ± 0%  -6.50% (p=0.000 n=10)
Select/string_not_equal-8   46.04k ± 0%   43.04k ± 0%  -6.50% (p=0.000 n=10)
Select/common_prefix-8      46.04k ± 0%   43.04k ± 0%  -6.50% (p=0.000 n=10)
Select/unknown-8            45.05k ± 0%   42.05k ± 0%  -6.65% (p=0.000 n=10)
geomean                     45.79k        42.79k       -6.54%
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-20 12:52:01 +00:00
21bed3362c [#1685] metabase: Cache expired bucket
```
goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/metabase
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                          │    master    │               expired                │
                          │    sec/op    │    sec/op     vs base                │
Select/string_equal-8       4.007m ± 10%   3.529m ± 11%  -11.94% (p=0.000 n=10)
Select/string_not_equal-8   3.834m ± 12%   3.440m ±  7%  -10.29% (p=0.029 n=10)
Select/common_prefix-8      3.470m ±  9%   3.240m ±  6%        ~ (p=0.105 n=10)
Select/unknown-8            3.156m ±  3%   3.198m ±  6%        ~ (p=0.631 n=10)
geomean                     3.602m         3.349m         -7.03%

                          │    master    │               expired               │
                          │     B/op     │     B/op      vs base               │
Select/string_equal-8       1.907Mi ± 0%   1.885Mi ± 0%  -1.18% (p=0.000 n=10)
Select/string_not_equal-8   1.907Mi ± 0%   1.885Mi ± 0%  -1.18% (p=0.000 n=10)
Select/common_prefix-8      1.907Mi ± 0%   1.885Mi ± 0%  -1.18% (p=0.000 n=10)
Select/unknown-8            1.900Mi ± 0%   1.877Mi ± 0%  -1.18% (p=0.000 n=10)
geomean                     1.905Mi        1.883Mi       -1.18%

                          │   master    │              expired               │
                          │  allocs/op  │  allocs/op   vs base               │
Select/string_equal-8       47.03k ± 0%   46.04k ± 0%  -2.12% (p=0.000 n=10)
Select/string_not_equal-8   47.03k ± 0%   46.04k ± 0%  -2.12% (p=0.000 n=10)
Select/common_prefix-8      47.03k ± 0%   46.04k ± 0%  -2.12% (p=0.000 n=10)
Select/unknown-8            46.04k ± 0%   45.05k ± 0%  -2.16% (p=0.000 n=10)
geomean                     46.78k        45.79k       -2.13%
```

Change-Id: I9c7a5e1f5c8b9eb3f25a563fd74c6ad2a9d1b92e
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-20 12:52:01 +00:00
a49f0717b3 [#1685] metabase: Cache frequently accessed singleton buckets
There are some buckets we access almost always, to check whether an
object is alive. In search we also iterate over lots of objects, and
`tx.Bucket()` shows itself a lot in pprof.
```
goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/metabase
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                          │      1      │                  2                   │
                          │   sec/op    │    sec/op     vs base                │
Select/string_equal-8       4.753m ± 6%   3.969m ± 14%  -16.50% (p=0.000 n=10)
Select/string_not_equal-8   4.247m ± 9%   3.486m ± 11%  -17.93% (p=0.000 n=10)
Select/common_prefix-8      4.163m ± 5%   3.323m ±  5%  -20.18% (p=0.000 n=10)
Select/unknown-8            3.557m ± 3%   3.064m ±  8%  -13.85% (p=0.001 n=10)
geomean                     4.158m        3.445m        -17.15%

                          │      1       │                  2                   │
                          │     B/op     │     B/op      vs base                │
Select/string_equal-8       2.250Mi ± 0%   1.907Mi ± 0%  -15.24% (p=0.000 n=10)
Select/string_not_equal-8   2.250Mi ± 0%   1.907Mi ± 0%  -15.24% (p=0.000 n=10)
Select/common_prefix-8      2.250Mi ± 0%   1.907Mi ± 0%  -15.24% (p=0.000 n=10)
Select/unknown-8            2.243Mi ± 0%   1.900Mi ± 0%  -15.29% (p=0.000 n=10)
geomean                     2.248Mi        1.905Mi       -15.26%

                          │      1      │                  2                  │
                          │  allocs/op  │  allocs/op   vs base                │
Select/string_equal-8       56.02k ± 0%   47.03k ± 0%  -16.05% (p=0.000 n=10)
Select/string_not_equal-8   56.02k ± 0%   47.03k ± 0%  -16.05% (p=0.000 n=10)
Select/common_prefix-8      56.02k ± 0%   47.03k ± 0%  -16.05% (p=0.000 n=10)
Select/unknown-8            55.03k ± 0%   46.04k ± 0%  -16.34% (p=0.000 n=10)
geomean                     55.78k        46.78k       -16.12%
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-20 10:17:42 +00:00
39f549a7ab [#1642] tree: Intoduce a helper LastChild
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2025-03-20 10:12:49 +00:00
760b6a44ea [#1642] tree: Fix sorted getSubtree for multiversion filenames
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2025-03-20 10:12:49 +00:00
a11b2d27e4 [#1642] tree: Introduce Cursor type
* Use `Cursor` as parameter for `TreeSortedByFilename`

Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
2025-03-20 10:12:49 +00:00
a405fb1f39 [#1683] metabase: Check object status once in Select()
objectStatus() is called twice for the same object:
First, in selectObject() to filter removed objects.
Then, again, in getObjectForSlowFilters() via db.get().
The second call will return the same result, so remove useless branch.

```
goos: linux
goarch: amd64
pkg: git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/metabase
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
                          │     old     │                status                │
                          │   sec/op    │    sec/op     vs base                │
Select/string_equal-8       5.022m ± 7%   3.968m ±  8%  -20.98% (p=0.000 n=10)
Select/string_not_equal-8   4.953m ± 9%   3.990m ± 10%  -19.44% (p=0.000 n=10)
Select/common_prefix-8      4.962m ± 8%   3.971m ±  9%  -19.98% (p=0.000 n=10)
Select/unknown-8            5.246m ± 9%   3.548m ±  5%  -32.37% (p=0.000 n=10)
geomean                     5.045m        3.865m        -23.39%

                          │     old      │                status                │
                          │     B/op     │     B/op      vs base                │
Select/string_equal-8       2.685Mi ± 0%   2.250Mi ± 0%  -16.20% (p=0.000 n=10)
Select/string_not_equal-8   2.685Mi ± 0%   2.250Mi ± 0%  -16.20% (p=0.000 n=10)
Select/common_prefix-8      2.685Mi ± 0%   2.250Mi ± 0%  -16.20% (p=0.000 n=10)
Select/unknown-8            2.677Mi ± 0%   2.243Mi ± 0%  -16.24% (p=0.000 n=10)
geomean                     2.683Mi        2.248Mi       -16.21%

                          │     old     │               status                │
                          │  allocs/op  │  allocs/op   vs base                │
Select/string_equal-8       69.03k ± 0%   56.02k ± 0%  -18.84% (p=0.000 n=10)
Select/string_not_equal-8   69.03k ± 0%   56.02k ± 0%  -18.84% (p=0.000 n=10)
Select/common_prefix-8      69.03k ± 0%   56.02k ± 0%  -18.84% (p=0.000 n=10)
Select/unknown-8            68.03k ± 0%   55.03k ± 0%  -19.11% (p=0.000 n=10)
geomean                     68.78k        55.77k       -18.90%
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-18 11:48:51 +00:00
a7319bc979 [#1683] metabase/test: Report allocs in benchmarkSelect()
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-18 11:48:51 +00:00
07a660fbc4
[#1677] writecache: Add QoS limiter usage
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-03-14 16:23:33 +03:00
460e5cbccf [#1671] Use slices.Delete() where possible
gopatch is missing for this one, because
https://github.com/uber-go/gopatch/issues/179

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-13 08:12:20 +00:00
155d3ddb6e [#1671] Use min builtin where possible
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-13 08:12:20 +00:00
40536d8a06 [#1671] Use fmt.Appendf where warranted
Fix gopls warnings:
```
cmd/frostfs-adm/internal/modules/morph/config/config.go:68:20-64: Replace []byte(fmt.Sprintf...) with fmt.Appendf
````

gopatch:
```
@@
var f expression
@@
-[]byte(fmt.Sprintf(f, ...))
+fmt.Appendf(nil, f, ...)
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-13 08:12:20 +00:00
bcc84c85a0 [#1671] Replace interface{} with any
gopatch:
```
@@
@@
-interface{}
+any
```

Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2025-03-13 08:12:20 +00:00
2005fdda09
[#1667] shard: Drop shard pool
After adding an ops limiter, shard's `put` pool is redundant.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-03-11 13:59:51 +03:00
3727d60331 [#1653] qos: Add metrics
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-03-11 10:57:47 +00:00
8643e0abc5
[#1668] writecache: Use object size to check free space
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-03-10 17:52:57 +03:00
9e31cb249f [#1635] control: Add method to search shards by object
Added method `ListShardsForObject` to ControlService and to
StorageEngine. It returns information about shards storing
object on the node.

Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
2025-03-07 14:32:01 +03:00
4685afb1dc
[#1636] engine: Validate limiter release in unit tests
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:30 +03:00
eb8b9b2b3b
[#1636] blobovniczatree: Validate limiter release in rebuild unit tests
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:30 +03:00
6c6e463b73
[#1636] shard: Change ops limiter on shard reload
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:29 +03:00
c2d855aedd
[#1636] qos: Return Resource Exhausted error
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:28 +03:00
b9360be1dc
[#1636] blobovniczatree: Use RebuildLimiter
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:27 +03:00
ceff5e1f6a
[#1636] storage: Refactor shard rebuild
Drop redundant interfaces.
Rename fields.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:27 +03:00
e0dc3c3d0c
[#1636] shard: Add limiter usage
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 17:25:26 +03:00
0991077cb3 [#1657] engine: Fix data race in evacuation tests
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
2025-02-28 14:14:12 +00:00
02f3a7f65c
[#1648] writecache: Fix race condition when reporting cache size metrics
There is a race condition when multiple cache operation try to report
the cache size metrics simultaneously. Consider the following example:
- the initial total size of objects stored in the cache size is 2
- worker X deletes an object and reads the cache size, which is 1
- worker Y deletes an object and reads the cache size, which is 0
- worker Y reports the cache size it learnt, which is 0
- worker X reports the cache size it learnt, which is 1

As a result, the observed cache size is 1 (i. e. one object remains
in the cache), which is incorrect because the actual cache size is 0.

To fix this, let's report the metrics periodically in the flush loop.

Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
2025-02-19 17:05:40 +03:00