node: Evacuate objects without setting mode to MAINTENANCE #1349

Merged
fyrchik merged 1 commit from acid-ant/frostfs-node:feature/evac-without-mm into master 2024-09-05 13:12:56 +00:00
Member

Signed-off-by: Anton Nikiforov an.nikiforov@yadro.com

Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
acid-ant added 1 commit 2024-09-03 12:47:33 +00:00
[#xx] node: Evacuate objects without setting mode to MAINTENANCE
Some checks failed
DCO action / DCO (pull_request) Failing after 3m5s
Tests and linters / Run gofumpt (pull_request) Successful in 3m3s
Tests and linters / Tests (1.23) (pull_request) Successful in 4m45s
Tests and linters / Tests (1.22) (pull_request) Successful in 4m47s
Tests and linters / Lint (pull_request) Successful in 5m3s
Pre-commit hooks / Pre-commit (pull_request) Successful in 5m12s
Vulncheck / Vulncheck (pull_request) Successful in 4m57s
Tests and linters / Staticcheck (pull_request) Successful in 5m39s
Tests and linters / Tests with -race (pull_request) Successful in 6m47s
Build / Build Components (1.23) (pull_request) Successful in 6m58s
Build / Build Components (1.22) (pull_request) Successful in 8m34s
Tests and linters / gopls check (pull_request) Successful in 8m50s
d2c27341a9
Signed-off-by: Anton Nikiforov <an.nikiforov@yadro.com>
acid-ant force-pushed feature/evac-without-mm from d2c27341a9 to 280c981cb2 2024-09-03 12:49:51 +00:00 Compare
acid-ant requested review from storage-core-committers 2024-09-03 13:44:31 +00:00
acid-ant requested review from storage-core-developers 2024-09-03 13:44:33 +00:00
fyrchik reviewed 2024-09-03 14:06:53 +00:00
@ -366,6 +366,7 @@ func (e *StorageEngine) evacuateShardObjects(ctx context.Context, shardID string
listPrm.WithCount(defaultEvacuateBatchSize)
sh := shardsToEvacuate[shardID]
sh.SetEvacInProgress(true)
Owner

When is this disabled? What if we get error in ListWithCursor below?

When is this disabled? What if we get error in `ListWithCursor` below?
Author
Member

When we move shard to DISABLED mode. Think this should be the one-way trip. Otherwise, it is possible to lose the data.

When we move shard to `DISABLED` mode. Think this should be the one-way trip. Otherwise, it is possible to lose the data.

But for evacuation shard mode READ_ONLY is only required...

But for evacuation shard mode `READ_ONLY` is only required...
@ -70,6 +72,11 @@ func (s *Shard) Head(ctx context.Context, prm HeadPrm) (HeadRes, error) {
res, err = s.Get(ctx, getPrm)
obj = res.Object()
} else {
s.m.RLock()
Owner

On an unrelated note: we should take mode for the whole operation duration.
I am surprised it wasn't the case before.
Not for this PR

On an unrelated note: we should take mode for the whole operation duration. I am surprised it wasn't the case before. Not for this PR
Author
Member

That was done in a such way because we take the same lock here res, err = s.Get(ctx, getPrm). Maybe that was the reason why we don't take lock for the whole operation.

That was done in a such way because we take the same lock here `res, err = s.Get(ctx, getPrm)`. Maybe that was the reason why we don't take lock for the whole operation.
@ -17,2 +17,4 @@
Mode mode.Mode
// True when evacuation is in progress.
EvacInProgress bool
Owner

EvacuationInProgress?

`EvacuationInProgress`?
Author
Member

Renamed.

Renamed.
@ -580,2 +580,4 @@
}
}
func (s *Shard) SetEvacInProgress(val bool) {
Owner

Evacuation is done on the engine level, but we have a flag on the shard level and because of this we have SkipEvacCheck parameter to the operation.
Was having evacuationInProgress flag considered to be present somewhere in the engine structure?

Evacuation is done on the engine level, but we have a flag on the shard level _and_ because of this we have `SkipEvacCheck` parameter to the operation. Was having `evacuationInProgress` flag considered to be present somewhere in the engine structure?
Author
Member

It looks more complex and slower. Once it will be on an engine level, we should check for each request is evacuation for shard in progress, and go to internal map for checking shard and exclude it from iteration. If this overhead is ok for us, we can do that in a such way.

It looks more complex and slower. Once it will be on an `engine` level, we should check for each request is evacuation for shard in progress, and go to internal map for checking shard and exclude it from iteration. If this overhead is ok for us, we can do that in a such way.
acid-ant force-pushed feature/evac-without-mm from 280c981cb2 to 785403affd 2024-09-04 06:30:10 +00:00 Compare
acid-ant force-pushed feature/evac-without-mm from 785403affd to 659c5989ce 2024-09-04 13:07:13 +00:00 Compare
Author
Member

Propagate evacuation status into metrics:

# HELP frostfs_node_engine_evacuation_in_progress Shard evacuation in progress
# TYPE frostfs_node_engine_evacuation_in_progress gauge
frostfs_node_engine_evacuation_in_progress{mode="false",shard_id="ALW7GShy8aBd8CyG9VUKdW"} 1

Now evacuation status resetting by request StopShardEvacuationRequest.

Propagate evacuation status into metrics: ``` # HELP frostfs_node_engine_evacuation_in_progress Shard evacuation in progress # TYPE frostfs_node_engine_evacuation_in_progress gauge frostfs_node_engine_evacuation_in_progress{mode="false",shard_id="ALW7GShy8aBd8CyG9VUKdW"} 1 ``` Now evacuation status resetting by request `StopShardEvacuationRequest`.
acid-ant force-pushed feature/evac-without-mm from 659c5989ce to 5717d86173 2024-09-04 13:11:08 +00:00 Compare
dstepanov-yadro approved these changes 2024-09-04 13:28:41 +00:00
Dismissed
fyrchik reviewed 2024-09-05 08:08:39 +00:00
fyrchik left a comment
Owner

Please, adjust the documentation https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/docs/evacuation.md (need to mention new flag and the behaviour)

Please, adjust the documentation https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/docs/evacuation.md (need to mention new flag and the behaviour)
acid-ant force-pushed feature/evac-without-mm from 5717d86173 to fb01817d3d 2024-09-05 13:07:12 +00:00 Compare
acid-ant dismissed dstepanov-yadro's review 2024-09-05 13:07:12 +00:00
Reason:

New commits pushed, approval review dismissed automatically according to repository settings

acid-ant force-pushed feature/evac-without-mm from fb01817d3d to 108e4e07be 2024-09-05 13:08:46 +00:00 Compare
Author
Member

Updated evacuation.md.

Updated `evacuation.md`.
fyrchik approved these changes 2024-09-05 13:12:51 +00:00
fyrchik merged commit 108e4e07be into master 2024-09-05 13:12:56 +00:00
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-core-developers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1349
No description provided.