Add EC replication #1129

Merged
dstepanov-yadro merged 7 commits from dstepanov-yadro/frostfs-node:feat/ec_restore into master 2024-09-04 19:51:08 +00:00
  • Replicate EC chunk to another node

  • Pull EC chunk from another node

  • Lost EC chunk restore

Check scenario:

  • create an EC container
  • check object nodes with frostfs-cli
  • remove one node from the netmap with frostfs-adm
  • put objects into the created container
  • wait until the removed node returns to the netmap
  • check object nodes with frostfs-cli

Need to implement EC delete/inhume operations to get correct results.
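The three bullets above map naturally onto replicator task handlers. Below is a minimal sketch of that mapping; only the two handler names (`HandleReplicationTask`, `HandlePullTask`) are taken from this PR's diff, while `replicator.TaskResult`, the boolean flags, and the helper itself are illustrative assumptions.

```go
// Sketch only, not the PR's actual code. Handler names appear in this
// PR; TaskResult and the boolean flags are assumptions.
func (p *Policer) processECChunkSketch(ctx context.Context, task replicator.Task,
	storedLocally, requiredLocally, lostEverywhere bool,
) {
	if storedLocally && !requiredLocally {
		// Replicate: push the chunk to the node that must hold it;
		// the redundant local copy can then be collected.
		var res replicator.TaskResult
		p.replicator.HandleReplicationTask(ctx, task, &res)
	} else if !storedLocally && requiredLocally {
		// Pull: fetch a chunk this node must store but lacks.
		p.replicator.HandlePullTask(ctx, task)
	} else if lostEverywhere {
		// Restore: rebuild the missing chunk from the surviving chunks
		// and distribute it again (see the Reconstruct/Split thread below).
	}
}
```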

dstepanov-yadro force-pushed feat/ec_restore from 238aeee47c to 93b2e25f73 2024-05-13 14:11:13 +00:00
dstepanov-yadro force-pushed feat/ec_restore from c3cfd4d6d6 to c5e5c204c9 2024-05-15 07:25:07 +00:00
dstepanov-yadro force-pushed feat/ec_restore from c5e5c204c9 to 1afdf3409a 2024-05-15 11:40:56 +00:00
dstepanov-yadro changed title from WIP: Add EC replication to Add EC replication 2024-05-15 11:58:32 +00:00
dstepanov-yadro requested review from storage-core-committers 2024-05-15 11:58:39 +00:00
dstepanov-yadro requested review from storage-core-developers 2024-05-15 11:58:40 +00:00
fyrchik requested changes 2024-05-15 12:25:24 +00:00
@@ -77,3 +77,3 @@
 		Nodes: nodes,
 	}
-	s.replicator.HandleTask(ctx, task, &res)
+	s.replicator.HandleReplicateTask(ctx, task, &res)
Owner

`HandleReplicationTask`? `handle` is a verb, `replicate` is a verb too

Author (Member)

Done
fyrchik marked this conversation as resolved
@@ -53,11 +54,6 @@ func (p *Policer) processObject(ctx context.Context, addrWithType objectcore.Add
c := &placementRequirements{}
var numOfContainerNodes int
Owner

Seems like a separate fix, unrelated to the commit

Author (Member)

Done
fyrchik marked this conversation as resolved
@@ -0,0 +41,4 @@
c := &placementRequirements{}
checkedNodes := newNodeCache()
for i := range nn {
Owner

`len(nn)` is currently 1 for all EC policies.
Can we simplify the code having this in mind?
Because when it won't be 1, we will also likely have different `EC X.Y` or `REP N` descriptors for different indexes in this array.

The distinction between `REP`/`EC` could be done based on each `nn` element, not on the uppermost level.

Author (Member)

fixed
fyrchik marked this conversation as resolved
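For orientation, the per-descriptor dispatch settled on here could be shaped roughly as follows. `ReplicaDescriptor`, `GetECDataCount`, `processECChunk`, `placementRequirements`, and `nodeCache` all appear in this review; the exact signatures and `processRepNodes` are assumptions.

```go
// Sketch: decide REP vs EC per replica descriptor instead of once for
// the whole placement policy. Signatures are illustrative.
func (p *Policer) processPlacementSketch(ctx context.Context, objInfo objectcore.Info,
	policy netmap.PlacementPolicy, nn [][]netmap.NodeInfo,
) {
	c := &placementRequirements{}
	checkedNodes := newNodeCache()
	for i := range nn {
		rd := policy.ReplicaDescriptor(i)
		if rd.GetECDataCount() > 0 {
			// EC X.Y descriptor: nn[i] lists the chunk holders.
			p.processECChunk(ctx, objInfo, nn[i])
		} else {
			// REP N descriptor: keep rd.NumberOfObjects() copies among nn[i].
			p.processRepNodes(ctx, c, objInfo, nn[i], rd.NumberOfObjects(), checkedNodes)
		}
	}
}
```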
@@ -0,0 +48,4 @@
default:
}
rd := policy.ReplicaDescriptor(i)
// processNodes replaces rd.GetECDataCount() + rd.GetECParityCount() for len(nn[i]) for valid object type.
Owner

Could you elaborate a bit on what this comment is trying to explain and why it is here?

Author (Member)

Refactored, see last commit.
fyrchik marked this conversation as resolved
@@ -0,0 +96,4 @@
}
// processECChunk replicates EC chunk if needed.
// Returns True if current chunk should be stored on current node.
Owner

It returns a `struct`.

Author (Member)

Fixed
fyrchik marked this conversation as resolved
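A sketch of the kind of result struct the corrected comment refers to; the field names are assumptions (`removeLocal` is mentioned later in this review):

```go
// Assumed shape: the bare bool return became a struct so one pass can
// report both decisions to the caller.
type ecChunkProcessResult struct {
	validPlacement bool // chunk already sits on a node the policy requires
	removeLocal    bool // local copy is redundant and may be removed
}
```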
@@ -0,0 +177,4 @@
return false
}
func (p *Policer) collectRequiredECChunks(nodes []netmap.NodeInfo, objInfo objectcore.Info) map[uint32][]netmap.NodeInfo {
Owner

_By design_ there is only 1 required chunk to store on the node (EC policy implies `UNIQUE`).
Why do we return a map here? It seems the code could be greatly simplified, having this in mind.

Author (Member)

Not always: evacuation or incomplete replication may lead to a chunk being stored on different nodes.

Owner

Evacuation does not affect the set of required chunks in any way.

Author (Member)

Agree, the current implementation does not affect it. But I think that in case of evacuation from a node, an EC chunk should move to another node, even if that does not comply with the storage policy.

Owner

Yes, but it is moved to a node which _should not_ store it, and I thought this function determines whether the node _should_ store it?
Another way to look at it: the policer _must_ depend only on the network map and placement policy; evacuation is a separate process.

Owner

Actually, it may happen in a situation when we have `EC 3.1` and 1 node went offline, so we have 3 nodes in the network map.
What is the behaviour in this case?

Author (Member)

> this function determines whether the node should store it

This function collects all nodes that store the required chunks.

> What is the behaviour in this case?

The policer will get an error from the offline node and will skip the object.
fyrchik marked this conversation as resolved
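The map-of-slices return type is the crux of this thread: after interrupted replication, one chunk index can be held by several nodes. A simplified sketch of such a collector, with everything beyond the signature shown above assumed:

```go
// Simplified sketch: group placement nodes by the EC chunk index they
// are expected to hold. UNIQUE placement implies node i holds chunk i,
// but a map of slices also admits several holders per index.
func (p *Policer) collectRequiredECChunksSketch(nodes []netmap.NodeInfo, objInfo objectcore.Info) map[uint32][]netmap.NodeInfo {
	required := make(map[uint32][]netmap.NodeInfo)
	for i, n := range nodes {
		if uint32(i) >= objInfo.ECInfo.Total {
			break // trailing nodes hold no required chunk
		}
		required[uint32(i)] = append(required[uint32(i)], n)
	}
	return required
}
```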
dstepanov-yadro force-pushed feat/ec_restore from 1afdf3409a to 76bbd193da 2024-05-15 13:09:19 +00:00
dstepanov-yadro force-pushed feat/ec_restore from 779eb553a9 to ed7f18e3f6 2024-05-15 13:18:06 +00:00
dstepanov-yadro force-pushed feat/ec_restore from 851c929b62 to 66f540df65 2024-05-16 07:13:17 +00:00
dstepanov-yadro force-pushed feat/ec_restore from 783ed62992 to a78805c555 2024-05-16 13:29:14 +00:00
aarifullin approved these changes 2024-05-16 20:31:53 +00:00
fyrchik reviewed 2024-05-17 07:29:51 +00:00
@@ -248,2 +247,4 @@
return remoteReader.Head(ctx, prm)
},
),
policer.WithLocalObjectHeaderFunc(func(ctx context.Context, a oid.Address) (*objectSDK.Object, error) {
Owner

Why have you decided to go with 2 func options instead of an interface?

Author (Member)

Because this approach has already been used in the policer.
fyrchik marked this conversation as resolved
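For reference, the func-option pattern in question. `WithLocalObjectHeaderFunc` and its callback signature come from this diff; `cfg` and its field are assumed stand-ins, and imports are elided:

```go
// cfg stands in for the policer's private config (assumed).
type cfg struct {
	localHeader func(ctx context.Context, a oid.Address) (*objectSDK.Object, error)
}

type Option func(*cfg)

// WithLocalObjectHeaderFunc wires a local head function into the config.
func WithLocalObjectHeaderFunc(f func(ctx context.Context, a oid.Address) (*objectSDK.Object, error)) Option {
	return func(c *cfg) {
		c.localHeader = f
	}
}
```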
@@ -97,2 +107,4 @@
return res.Header(), nil
}
func (h *RemoteReader) Get(ctx context.Context, prm *RemoteRequestPrm) (*objectSDK.Object, error) {
Owner

I am surprised to see `Get` in the `head` package, given that we also have `get` on the same level.
Why is it here?

Author (Member)

Moved to objectSvc.
fyrchik marked this conversation as resolved
@@ -0,0 +293,4 @@
return
}
if objInfo.ECInfo.Total-uint32(len(resolved)) > policy.ReplicaDescriptor(0).GetECParityCount() {
var founded []uint32
Owner

Do you mean `found`? It is already the past participle of `find`; `founded` is what happened to Saint-Petersburg in 1703.

Author (Member)

Fixed
fyrchik marked this conversation as resolved
@@ -0,0 +323,4 @@
p.log.Error(logs.PolicerFailedToRestoreObject, zap.Stringer("object", parentAddress), zap.Error(err))
return
}
parts, err = c.Split(parentObject, key)
Owner

You call `Reconstruct()`, then you call `Split` on the result. These operations are inverse to each other.
What is the purpose?

Author (Member)

Fixed
fyrchik marked this conversation as resolved
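To see why the pair drew a question: `Reconstruct` goes from chunks to the parent object and `Split` goes back from the parent to the full chunk set, so chaining them recomputes chunks that were inputs a moment earlier. The flagged round trip, sketched with assumed variable names:

```go
// Reconstruct: chunks -> parent object. Split: parent -> all
// dataCount+parityCount chunks. Chaining the two re-derives every
// chunk, including the ones that were just used as input.
parentObject, err := c.Reconstruct(availableParts)
if err != nil {
	return err
}
parts, err := c.Split(parentObject, key) // recomputes chunks already held
```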
dstepanov-yadro force-pushed feat/ec_restore from a78805c555 to 4e772b8eb4 2024-05-17 07:56:52 +00:00
dstepanov-yadro force-pushed feat/ec_restore from 4e772b8eb4 to 7727b7fe21 2024-05-17 07:59:37 +00:00
aarifullin reviewed 2024-05-17 08:44:59 +00:00
@@ -0,0 +168,4 @@
var addr oid.Address
addr.SetContainer(objInfo.Address.Container())
addr.SetObject(indexToObjectID[index])
p.replicator.HandlePullTask(ctx, replicator.Task{
Member

Just for curiosity: do we *always* need to pull missing chunks from a remote node?
`processECChunk` sets `removeLocal = true`, but a chunk that is being replicated is not removed until `pullRequiredECChunks` runs. Could the policer try to `reconstruct` a missing chunk if the local node, *by good fortune*, keeps the required chunks for reconstruction?

I suppose this is not a big deal and barely gives a big profit...

Author (Member)

By default the local node should keep only one chunk.
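The reply is an arithmetic point: reconstruction needs at least the EC data count of chunks, and with `UNIQUE` placement a node keeps just one, so a guard like the following (purely hypothetical, not in this PR) would almost never take the local branch:

```go
// Hypothetical guard, not code from this PR.
if uint32(len(localChunks)) >= rd.GetECDataCount() {
	// Enough chunks locally: rebuild without network traffic.
	if parent, err := c.Reconstruct(localChunks); err == nil {
		storeMissingChunkFrom(parent) // hypothetical helper
	}
} else {
	// Typical case: the node keeps a single chunk, so pull remotely.
	p.replicator.HandlePullTask(ctx, task)
}
```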
aarifullin reviewed 2024-05-17 09:10:56 +00:00
@@ -0,0 +272,4 @@
p.log.Error(logs.PolicerFailedToDecodeECChunkID, zap.Error(err), zap.Stringer("object", parentAddress))
return
}
if existed, exist := chunkIDs[ch.Index]; exist && existed != chunkID {
Member

I find `existed, exist` an inappropriate pair of names. You could rename `chunkID` to `ecInfoChunkID` and `existed` to `chunkID`.

Author (Member)

Done
aarifullin marked this conversation as resolved
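Applied to the line above, the suggested rename would read roughly as follows; log fields beyond what the diff shows are assumed:

```go
// After the rename: the map value is the chunk ID recorded earlier,
// ecInfoChunkID is the one just decoded from the EC info.
if chunkID, found := chunkIDs[ch.Index]; found && chunkID != ecInfoChunkID {
	p.log.Error(logs.PolicerDifferentObjectIDForTheSameECChunk,
		zap.Stringer("first", chunkID),
		zap.Stringer("second", ecInfoChunkID), // assumed field
		zap.Uint32("chunk index", ch.Index))   // assumed field
}
```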
aarifullin reviewed 2024-05-17 09:19:07 +00:00
@@ -0,0 +273,4 @@
return
}
if existed, exist := chunkIDs[ch.Index]; exist && existed != chunkID {
p.log.Error(logs.PolicerDifferentObjectIDForTheSameECChunk, zap.Stringer("first", existed),
Member

Could you give a brief explanation of how it can happen that chunks related to the same parent differ by ID?

Author (Member)

A bug that I hope we don't have :)
aarifullin marked this conversation as resolved
dstepanov-yadro force-pushed feat/ec_restore from 7727b7fe21 to e9e5ba549e 2024-05-17 09:19:39 +00:00
aarifullin approved these changes 2024-05-17 09:30:29 +00:00
dstepanov-yadro force-pushed feat/ec_restore from e9e5ba549e to 436c9f5558 2024-05-17 11:36:32 +00:00
fyrchik approved these changes 2024-05-17 12:01:23 +00:00
acid-ant approved these changes 2024-05-17 14:20:56 +00:00
dstepanov-yadro merged commit 436c9f5558 into master 2024-05-17 14:32:38 +00:00
dstepanov-yadro referenced this pull request from a commit 2024-05-17 14:32:38 +00:00
dstepanov-yadro deleted branch feat/ec_restore 2024-05-17 14:32:39 +00:00
Reference: TrueCloudLab/frostfs-node#1129