node: Evacuate objects without setting mode to MAINTENANCE #1349

acid-ant · 2024-09-03T12:47:27Z

acid-ant commented

2024-09-03 12:47:27 +00:00

Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com

Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>

acid-ant added 1 commit 2024-09-03 12:47:33 +00:00

[#xx] node: Evacuate objects without setting mode to MAINTENANCE

DCO action / DCO (pull_request) Failing after 3m5s

Details

Tests and linters / Run gofumpt (pull_request) Successful in 3m3s

Details

Tests and linters / Tests (1.23) (pull_request) Successful in 4m45s

Details

Tests and linters / Tests (1.22) (pull_request) Successful in 4m47s

Details

Tests and linters / Lint (pull_request) Successful in 5m3s

Details

Pre-commit hooks / Pre-commit (pull_request) Successful in 5m12s

Details

Vulncheck / Vulncheck (pull_request) Successful in 4m57s

Details

Tests and linters / Staticcheck (pull_request) Successful in 5m39s

Details

Tests and linters / Tests with -race (pull_request) Successful in 6m47s

Details

Build / Build Components (1.23) (pull_request) Successful in 6m58s

Details

Build / Build Components (1.22) (pull_request) Successful in 8m34s

Details

Tests and linters / gopls check (pull_request) Successful in 8m50s

Details

d2c27341a9

Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>

acid-ant force-pushed feature/evac-without-mm from d2c27341a9 to 280c981cb2

2024-09-03 12:49:51 +00:00

Compare

~~acid-ant referenced this pull request 2024-09-03 12:49:58 +00:00~~

node: Add ability to evacuate objects from REP 1 only #1350

acid-ant requested review from storage-core-committers 2024-09-03 13:44:31 +00:00

acid-ant requested review from storage-core-developers 2024-09-03 13:44:33 +00:00

fyrchik reviewed 2024-09-03 14:06:53 +00:00

pkg/local_object_storage/engine/evacuate.go Outdated

					
				@ -366,6 +366,7 @@ func (e *StorageEngine) evacuateShardObjects(ctx context.Context, shardID string

					listPrm.WithCount(defaultEvacuateBatchSize)

					sh := shardsToEvacuate[shardID]

					sh.SetEvacInProgress(true)

fyrchik commented

2024-09-03 14:00:24 +00:00

When is this disabled? What if we get error in ListWithCursor below?

When is this disabled? What if we get error in `ListWithCursor` below?

acid-ant commented

2024-09-04 06:23:45 +00:00

When we move shard to DISABLED mode. Think this should be the one-way trip. Otherwise, it is possible to lose the data.

When we move shard to `DISABLED` mode. Think this should be the one-way trip. Otherwise, it is possible to lose the data.

dstepanov-yadro commented

2024-09-04 07:30:41 +00:00

But for evacuation shard mode READ_ONLY is only required...

But for evacuation shard mode `READ_ONLY` is only required...

pkg/local_object_storage/shard/head.go Outdated

					
				@ -70,6 +72,11 @@ func (s *Shard) Head(ctx context.Context, prm HeadPrm) (HeadRes, error) {

						res, err = s.Get(ctx, getPrm)

						obj = res.Object()

					} else {

						s.m.RLock()

fyrchik commented

2024-09-03 14:03:42 +00:00

On an unrelated note: we should take mode for the whole operation duration.
I am surprised it wasn't the case before.
Not for this PR

On an unrelated note: we should take mode for the whole operation duration. I am surprised it wasn't the case before. Not for this PR

acid-ant commented

2024-09-04 06:06:03 +00:00

That was done in a such way because we take the same lock here res, err = s.Get(ctx, getPrm). Maybe that was the reason why we don't take lock for the whole operation.

That was done in a such way because we take the same lock here `res, err = s.Get(ctx, getPrm)`. Maybe that was the reason why we don't take lock for the whole operation.

pkg/local_object_storage/shard/info.go Outdated

					
				@ -17,2 +17,4 @@

					Mode mode.Mode

					// True when evacuation is in progress.

					EvacInProgress bool

fyrchik commented

2024-09-03 14:01:36 +00:00

EvacuationInProgress?

`EvacuationInProgress`?

acid-ant commented

2024-09-04 06:30:19 +00:00

Renamed.

pkg/local_object_storage/shard/shard.go Outdated

					
				@ -580,2 +580,4 @@

					}

				}

				func (s *Shard) SetEvacInProgress(val bool) {

fyrchik commented

2024-09-03 14:06:52 +00:00

Evacuation is done on the engine level, but we have a flag on the shard level and because of this we have SkipEvacCheck parameter to the operation.
Was having evacuationInProgress flag considered to be present somewhere in the engine structure?

Evacuation is done on the engine level, but we have a flag on the shard level _and_ because of this we have `SkipEvacCheck` parameter to the operation. Was having `evacuationInProgress` flag considered to be present somewhere in the engine structure?

acid-ant commented

2024-09-04 07:03:18 +00:00

It looks more complex and slower. Once it will be on an engine level, we should check for each request is evacuation for shard in progress, and go to internal map for checking shard and exclude it from iteration. If this overhead is ok for us, we can do that in a such way.

It looks more complex and slower. Once it will be on an `engine` level, we should check for each request is evacuation for shard in progress, and go to internal map for checking shard and exclude it from iteration. If this overhead is ok for us, we can do that in a such way.

acid-ant force-pushed feature/evac-without-mm from 280c981cb2 to 785403affd

2024-09-04 06:30:10 +00:00

Compare

acid-ant force-pushed feature/evac-without-mm from 785403affd to 659c5989ce

2024-09-04 13:07:13 +00:00

Compare

acid-ant commented

2024-09-04 13:09:47 +00:00

Propagate evacuation status into metrics:

# HELP frostfs_node_engine_evacuation_in_progress Shard evacuation in progress
# TYPE frostfs_node_engine_evacuation_in_progress gauge
frostfs_node_engine_evacuation_in_progress{mode="false",shard_id="ALW7GShy8aBd8CyG9VUKdW"} 1

Now evacuation status resetting by request StopShardEvacuationRequest.

Propagate evacuation status into metrics: ``` # HELP frostfs_node_engine_evacuation_in_progress Shard evacuation in progress # TYPE frostfs_node_engine_evacuation_in_progress gauge frostfs_node_engine_evacuation_in_progress{mode="false",shard_id="ALW7GShy8aBd8CyG9VUKdW"} 1 ``` Now evacuation status resetting by request `StopShardEvacuationRequest`.

acid-ant force-pushed feature/evac-without-mm from 659c5989ce to 5717d86173

2024-09-04 13:11:08 +00:00

Compare

dstepanov-yadro approved these changes 2024-09-04 13:28:41 +00:00

Dismissed

fyrchik reviewed 2024-09-05 08:08:39 +00:00

fyrchik left a comment

Please, adjust the documentation https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/docs/evacuation.md (need to mention new flag and the behaviour)

acid-ant force-pushed feature/evac-without-mm from 5717d86173 to fb01817d3d

2024-09-05 13:07:12 +00:00

Compare

acid-ant dismissed dstepanov-yadro's review 2024-09-05 13:07:12 +00:00