engine: Allow to remove redundant object copies #191

Merged
fyrchik merged 3 commits from fyrchik/frostfs-node:shard-reinsertion into master 2023-04-07 17:25:51 +00:00

RemoveDuplicates() accepts a single shard and removes all copies stored on
other shards. The naming comes from the fact that it could support
objects removal from the shard which are lower in the HRW vector.

What to test:

  1. Increasing --concurrency flag (1 vs 1000) should have noticeable effect on the command execution time.
  2. Cancelling CLI operation should cancel background command (no delete operations in logs).
  3. All deleted object should be available from the same node object get --ttl 1.
RemoveDuplicates() accepts a single shard and removes all copies stored on other shards. The naming comes from the fact that it could support objects removal from the shard which are lower in the HRW vector. What to test: 1. Increasing `--concurrency` flag (1 vs 1000) should have noticeable effect on the command execution time. 2. Cancelling CLI operation should cancel background command (no delete operations in logs). 3. All deleted object should be available from the same node `object get --ttl 1`.
Poster
Owner

Blocked until further discussion.

Blocked until further discussion.
fyrchik added the
blocked
label 2023-04-05 11:14:27 +00:00
fyrchik force-pushed shard-reinsertion from 7d2e088e68 to 945367bb94 2023-04-07 11:52:40 +00:00 Compare
fyrchik changed title from WIP: engine: Allow to remove redundant object copies to engine: Allow to remove redundant object copies 2023-04-07 11:52:48 +00:00
fyrchik added the
frostfs-node
label 2023-04-07 11:53:18 +00:00
aarifullin reviewed 2023-04-07 12:05:13 +00:00
@ -0,0 +24,4 @@
pk := key.Get(cmd)
req := &control.DoctorRequest{Body: new(control.DoctorRequest_Body)}
req.Body.Concurrency, _ = cmd.Flags().GetUint32(concurrencyFlag)
Collaborator

[Optinal]

Using GetUint32, GetBool is fine, but if new flags are added to doctor command, then it will be not obvious that concurrencyFlag is uint32 and someNewFlag is someType

I think that is fine to use global variables to initialize flags

ff := doctorCmd.Flags()
ff.BoolVar(&concurrencyVarFlag, concurrencyFlag, false, "Remove duplicate objects")
[Optinal] Using `GetUint32`, `GetBool` is fine, but if new flags are added to `doctor` command, then it will be not obvious that `concurrencyFlag` is uint32 and `someNewFlag` is `someType` I think that is fine to use global variables to initialize flags ``` ff := doctorCmd.Flags() ff.BoolVar(&concurrencyVarFlag, concurrencyFlag, false, "Remove duplicate objects") ```
Poster
Owner

We had a problem with this, because in some cases "the same" flag should have different descriptions/defaults in different commands. With many global variables this had become a mess.

Anyway, I suggest discussing it separately and implementing in all CLI commands atomically, after reaching consensus.

We had a problem with this, because in some cases "the same" flag should have different descriptions/defaults in different commands. With many global variables this had become a mess. Anyway, I suggest discussing it separately and implementing in all CLI commands atomically, after reaching consensus.
aarifullin marked this conversation as resolved
aarifullin reviewed 2023-04-07 12:28:22 +00:00
@ -0,0 +120,4 @@
}
var deletePrm shard.DeletePrm
deletePrm.SetAddresses(addr)
Collaborator

Wouldn't be helpful to log shards that have had the same object?

Wouldn't be helpful to log shards that have had the same object?
Poster
Owner

Given the amount of logs we have, no. The only use-case I see is for testing.
Deletion operation is already logged, may be we can add a single log entry when we start processing a shard.

Given the amount of logs we have, no. The only use-case I see is for testing. Deletion operation is already logged, may be we can add a single log entry when we start processing a shard.
aarifullin marked this conversation as resolved
dstepanov-yadro reviewed 2023-04-07 13:12:14 +00:00
@ -0,0 +61,4 @@
var cursor *meta.Cursor
for {
var listPrm shard.ListWithCursorPrm
listPrm.WithCount(uint32(prm.Concurrency))

Why count = prm.Concurrency?

Why count = ```prm.Concurrency```?
Poster
Owner

Even named constant is magic in this case and it seems logic to depend on the number of workers which process listed object.

What else could we use here?

Even named constant is magic in this case and it seems logic to depend on the number of workers which process listed object. What else could we use here?

If prm.Concurrency = 1 then there will be too many bbolt requests, it seems to me.

What else could we use here?

If I knew for sure... But if you don't have any other ideas, I agree with this approach.

If ```prm.Concurrency = 1``` then there will be too many bbolt requests, it seems to me. > What else could we use here? If I knew for sure... But if you don't have any other ideas, I agree with this approach.
dstepanov-yadro marked this conversation as resolved
@ -0,0 +81,4 @@
}
})
for i := 0; i < defaultRemoveDuplicatesConcurrency; i++ {

Not defaultRemoveDuplicatesConcurrency, but prm.Concurrency ?

Not ```defaultRemoveDuplicatesConcurrency```, but ```prm.Concurrency``` ?
Poster
Owner

Fixed.

Fixed.
dstepanov-yadro marked this conversation as resolved
@ -0,0 +109,4 @@
var existsPrm shard.ExistsPrm
existsPrm.SetAddress(addr)
res, err := shards[i].Exists(existsPrm)

Do we need to exclude shard, where object was found? If object is placed on single shard, will it be deleted?

Do we need to exclude shard, where object was found? If object is placed on single shard, will it be deleted?
Collaborator

If object is placed on single shard, will it be deleted?

It won't and I guess that why found flag is needed. The deletion of the first object is ignored by the flag. If object is met again, then _, err = shards[i].Delete(deletePrm)

> If object is placed on single shard, will it be deleted? It won't and I guess that why `found` flag is needed. The deletion of the first object is ignored by the flag. If object is met again, then `_, err = shards[i].Delete(deletePrm)`
Poster
Owner

Here is the logic:

  1. Take object X from the shard A.
  2. Sort shards with HRW.
  3. The first shard an object is found in is considered "the best".
  4. The object is removed from all other shards.
Here is the logic: 1. Take object X from the shard A. 2. Sort shards with HRW. 3. The first shard an object is found in is considered "the best". 4. The object is removed from all other shards.
dstepanov-yadro marked this conversation as resolved
@ -194,0 +198,4 @@
wResp := &doctorResponseWrapper{new(DoctorResponse)}
wReq := &requestWrapper{m: req}
err := client.SendUnary(cli, common.CallMethodInfoUnary(serviceName, rpcDoctor), wReq, wResp, opts...)

There is no timeout for RPC call?

There is no timeout for RPC call?
Poster
Owner
It is hidden inside the client. https://git.frostfs.info/TrueCloudLab/frostfs-node/src/branch/master/cmd/frostfs-cli/internal/client/sdk.go#L53 (yes, we could improve this)
dstepanov-yadro marked this conversation as resolved
@ -0,0 +9,4 @@
"google.golang.org/grpc/status"
)
func (s *Server) Doctor(ctx context.Context, req *control.DoctorRequest) (*control.DoctorResponse, error) {

Will context be canceled if the command execution is interrupted?

Will ```context``` be canceled if the command execution is interrupted?
Poster
Owner

Haven't tested this, but I would expect this from the gRPC.

Haven't tested this, but I would expect this from the gRPC.
dstepanov-yadro marked this conversation as resolved
fyrchik force-pushed shard-reinsertion from 945367bb94 to 4235c6425f 2023-04-07 13:31:25 +00:00 Compare
fyrchik force-pushed shard-reinsertion from 4235c6425f to d11b15e5da 2023-04-07 13:36:08 +00:00 Compare
Poster
Owner

Added some logs to aid testing (not too much), please look.

Added some logs to aid testing (not too much), please look.
fyrchik requested review from storage-core-committers 2023-04-07 13:43:45 +00:00
fyrchik requested review from storage-core-developers 2023-04-07 13:43:45 +00:00
fyrchik requested review from anikeev-yadro 2023-04-07 13:43:49 +00:00
fyrchik force-pushed shard-reinsertion from d11b15e5da to 988c55c6b9 2023-04-07 13:52:41 +00:00 Compare
aarifullin approved these changes 2023-04-07 13:56:08 +00:00
dstepanov-yadro approved these changes 2023-04-07 13:56:10 +00:00
anikeev-yadro approved these changes 2023-04-07 15:31:49 +00:00
acid-ant approved these changes 2023-04-07 15:58:07 +00:00
fyrchik merged commit b689027d57 into master 2023-04-07 17:25:51 +00:00
fyrchik deleted branch shard-reinsertion 2023-04-07 17:25:51 +00:00
Sign in to join this conversation.
No Milestone
No Assignees
5 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#191
There is no content yet.