Fix Get EC object from non container node #1253

Merged
dstepanov-yadro merged 1 commit from dstepanov-yadro/frostfs-node:fix/ec_get_non_container_node into master 2024-09-04 19:51:10 +00:00

How it was: first, we collected information about the chunks of the EC object, taking into account the TTL, then we requested the chunks. This led to the fact that there was not enough TTL to collect information about the chunks: client -> node1 (non container, ttl=2) -> node2 (container, ttl=1), node2 can't get other chunks because of ttl=1.
How it became: first we get information that the object is an EU object, and then we collect information about the chunks and the chunks themselves in separate requests from the container node.

How it was: first, we collected information about the chunks of the EC object, taking into account the TTL, then we requested the chunks. This led to the fact that there was not enough TTL to collect information about the chunks: client -> node1 (non container, ttl=2) -> node2 (container, ttl=1), node2 can't get other chunks because of ttl=1. How it became: first we get information that the object is an EU object, and then we collect information about the chunks and the chunks themselves in separate requests from the container node.
dstepanov-yadro force-pushed fix/ec_get_non_container_node from 19db4bd93d to 795490b790 2024-07-17 06:11:28 +00:00 Compare
dstepanov-yadro force-pushed fix/ec_get_non_container_node from 795490b790 to ead691417b 2024-07-17 06:12:35 +00:00 Compare
dstepanov-yadro reviewed 2024-07-17 06:24:09 +00:00
@ -32,2 +36,3 @@
}
p := getsvc.HeadPrm{}
p.SetCommonParameters(exec.commonParameters())
p.SetCommonParameters(&commonParameters)
Author
Member

commonParameters may be changed during Head request (ForgetTokens, WithLocalOnly), so need a copy. Founded with autotests.

`commonParameters` may be changed during Head request (`ForgetTokens`, `WithLocalOnly`), so need a copy. Founded with autotests.
Owner

Previously it was possible to pass nil, now we pass a pointer to the empty struct.
This is a change in behaviour, can anything bad happen?

Previously it was possible to pass `nil`, now we pass a pointer to the empty struct. This is a change in behaviour, can anything bad happen?
Owner

Previously it was possible to pass nil, now we pass a pointer to the empty struct.
This is a change in behaviour, can anything bad happen?

Previously it was possible to pass `nil`, now we pass a pointer to the empty struct. This is a change in behaviour, can anything bad happen?
Author
Member

Fixed

Fixed
dstepanov-yadro reviewed 2024-07-17 06:29:51 +00:00
@ -36,2 +37,4 @@
r.log.Debug(logs.GetTryingToAssembleTheECObject)
// initialize epoch number
ok := r.initEpoch()
Author
Member

Logging and status change - inside r.initEpoch() method.

Logging and status change - inside `r.initEpoch()` method.
dstepanov-yadro force-pushed fix/ec_get_non_container_node from ead691417b to 84d58795ce 2024-07-17 06:55:46 +00:00 Compare
dstepanov-yadro force-pushed fix/ec_get_non_container_node from 84d58795ce to 47de122454 2024-07-17 07:05:46 +00:00 Compare
dstepanov-yadro force-pushed fix/ec_get_non_container_node from 47de122454 to acc6c917b1 2024-07-17 07:22:36 +00:00 Compare
dstepanov-yadro force-pushed fix/ec_get_non_container_node from acc6c917b1 to 6917d9b285 2024-07-17 07:31:51 +00:00 Compare
fyrchik reviewed 2024-07-17 07:37:57 +00:00
@ -104,2 +142,2 @@
parts := a.retrieveParts(ctx, false)
c, err := a.getConstructor()
objID := a.addr.Object()
trav, cnr, err := a.traverserGenerator.GenerateTraverser(a.addr.Container(), &objID, a.epoch)
Owner

This part seems to be common for all reconstruct*() helpers. If we move it one level above, will it make the code cleaner?

trav, cnr, err := a.traverserGenerator.GenerateTraverser(a.addr.Container(), &objID, a.epoch)
if err != nil {
	return nil, err
}
c, err := a.getConstructor(cnr)
if err != nil {
	return nil, err
}
This part seems to be common for all `reconstruct*()` helpers. If we move it one level above, will it make the code cleaner? ``` trav, cnr, err := a.traverserGenerator.GenerateTraverser(a.addr.Container(), &objID, a.epoch) if err != nil { return nil, err } c, err := a.getConstructor(cnr) if err != nil { return nil, err } ```
Author
Member

Done

Done
@ -156,0 +200,4 @@
chunksGuard.Lock()
defer chunksGuard.Unlock()
for _, ch := range nodeChunks {
chunks[string(ch.ID.GetValue())] = ch
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Owner

We can override the old value here, does it matter?

We can override the old value here, does it matter?
Author
Member

ch.ID is objectID. In certain situations, you can get the same chunk from different nodes.

`ch.ID` is objectID. In certain situations, you can get the same chunk from different nodes.
fyrchik approved these changes 2024-07-17 07:43:19 +00:00
@ -159,1 +233,4 @@
}
func (a *assemblerec) processECNodesRequests(ctx context.Context, nodes []placement.Node, dataCount, parityCount int) ([]*objectSDK.Object, error) {
foundedChunks := make(map[uint32]*objectSDK.Object)
Owner

s/founded/found, founded is what happened with Saint-Petersburg in 1703 :)

`s/founded/found`, founded is what happened with Saint-Petersburg in 1703 :)
Author
Member

Fixed

Fixed
@ -474,6 +478,7 @@ func testNodeMatrix(t testing.TB, dim []int) ([][]netmap.NodeInfo, [][]string) {
var ni netmap.NodeInfo
ni.SetNetworkEndpoints(a)
ni.SetPublicKey([]byte(a))
Owner

Why is it important?

Why is it important?
Owner

Why is it important?

Why is it important?
Author
Member

Get service uses node's public key as map key for EC chunks. Without this change all chunks belong to the same node without public key in unit tests, so unit test fails.

`Get` service uses node's public key as map key for EC chunks. Without this change all chunks belong to the same node without public key in unit tests, so unit test fails.
dstepanov-yadro force-pushed fix/ec_get_non_container_node from 6917d9b285 to 0d76caf8df 2024-07-17 08:02:57 +00:00 Compare
aarifullin reviewed 2024-07-17 11:15:00 +00:00
@ -160,0 +257,4 @@
return ctx.Err()
default:
}
chunks := a.tryGetChunkListFromNode(ctx, info)
Member

Just for my curiosity:

We get the list of chunks available on the observing node - could these sets of chunks be intersected by some reason?
It still works as you're using foundChunks map but can be some chunks is gotten by tryGetChunkFromRemoteStorage a few times?

Just for my curiosity: We get the list of chunks available on the observing `node` - could these sets of `chunks` be intersected by some reason? It still works as you're using `foundChunks` map but can be some chunks is gotten by `tryGetChunkFromRemoteStorage` a few times?
Author
Member

Yes, in case of remove shard/return shard scenario. But it looks like rare case.

Yes, in case of remove shard/return shard scenario. But it looks like rare case.
aarifullin approved these changes 2024-07-17 11:19:53 +00:00
dstepanov-yadro force-pushed fix/ec_get_non_container_node from 0d76caf8df to e83d39e33f 2024-07-17 11:25:00 +00:00 Compare
fyrchik approved these changes 2024-07-17 12:08:03 +00:00
@ -156,0 +203,4 @@
dataCount := policy.ECDataCount(cnr.Value.PlacementPolicy())
parityCount := policy.ECParityCount(cnr.Value.PlacementPolicy())
remoteNodes := make([]placement.Node, 0)
Owner

var remoteNodes []placement.Node?

`var remoteNodes []placement.Node`?
dstepanov-yadro merged commit e83d39e33f into master 2024-07-17 14:58:31 +00:00
dstepanov-yadro deleted branch fix/ec_get_non_container_node 2024-07-17 14:59:15 +00:00
Sign in to join this conversation.
No reviewers
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#1253
No description provided.