netmap: Introduce Replica method #211

Merged
fyrchik merged 1 commit from aarifullin/frostfs-sdk-go:fix/ec_total_count into master 2024-09-04 19:51:15 +00:00
Member

* Logically, ECShardsCount is mutually exclusive with ReplicaNumberByIndex.
* ECShardsCount returns non-zero results if a policy is set by EC and zero if it is set by REP. Thus ECShardsCount and ReplicaNumberByIndex can be used to distinguish policies.
* Add explanatory comments for the methods.

  • Make ReplicaNumberByIndex deprecated.
  • Introduce Replica method that access i-th replica directly. This allows the direct access GetCount() and GetECDataCount, GetECParityCount.
  • Also, fix setIDWithSignature (we calculate checksum before ID is calculated)
~~* Logically, ECShardsCount is mutually exclusive with ReplicaNumberByIndex.~~ ~~* ECShardsCount returns non-zero results if a policy is set by `EC` and zero if it is set by `REP`. Thus ECShardsCount and ReplicaNumberByIndex can be used to distinguish policies.~~ ~~* Add explanatory comments for the methods.~~ * Make ReplicaNumberByIndex deprecated. * Introduce Replica method that access i-th replica directly. This allows the direct access `GetCount()` and `GetECDataCount`, `GetECParityCount`. * Also, fix `setIDWithSignature` (we calculate checksum before ID is calculated)
aarifullin requested review from storage-core-committers 2024-03-25 21:03:28 +00:00
aarifullin requested review from storage-core-developers 2024-03-25 21:03:29 +00:00
aarifullin force-pushed fix/ec_total_count from 2b712ac16f to 38a239fd15 2024-03-25 21:03:59 +00:00 Compare
fyrchik reviewed 2024-03-26 07:31:17 +00:00
netmap/policy.go Outdated
@ -185,1 +185,4 @@
// ECTotalCountByIndex returns the total count of data and parity parts used by EC
// from the i-th replica descriptor. Index MUST be in range [0; NumberOfReplicas()).
func (p PlacementPolicy) ECTotalCountByIndex(i int) uint32 {
Owner

The interface for EC is intentionally minimal, we haven't even written in node anything yet, why this PR?

The interface for EC is intentionally minimal, we haven't even written in node anything yet, why this PR?
Author
Member

Firstly, I thought we need to distinguish how we set replicas[i] - either by REP or by EC. Anyway, even the introduced function seems inconvenient, we need to fix ReplicaNumberByIndex then - it should use countNodes(p.replicas[i]) instread, because it retruns 0 for EC policy right now

Firstly, I thought we need to distinguish how we set `replicas[i]` - either by `REP` or by `EC`. Anyway, even the introduced function seems inconvenient, we need to fix `ReplicaNumberByIndex` then - it should use `countNodes(p.replicas[i])` instread, because it retruns `0` for EC policy right now
Author
Member

I have removed this method and fixed ReplicaNumberByIndex.
Commit message is also fixed. Please, check this out. We need this fix because we need to get EC parts total count. But still here is a problem how we can distinguish a replica is set - by REP or by EC?

I have removed this method and fixed `ReplicaNumberByIndex`. Commit message is also fixed. Please, check this out. We need this fix because we need to get `EC` parts total count. But still here is a problem how we can distinguish a replica is set - by `REP` or by `EC`?
Owner

Oh, I see what you did.
So it is motivated by internal SDK needs?
Doesn't node selection work right now? Probably missed the test in my initial PR.

Anyway, what about ECShardsCount (uint32, uint32) then? It seems if we need total amount from a public method, we also need data and parity?

Oh, I see what you did. So it is motivated by internal SDK needs? Doesn't node selection work right now? Probably missed the test in my initial PR. Anyway, what about `ECShardsCount (uint32, uint32)` then? It seems if we need total amount from a public method, we also need data and parity?
Author
Member

This sounds goods. Anyway, as I said, we need to distinguish the approach and thus ECShardsCount seems OK for that

This sounds goods. Anyway, as I said, we need to distinguish the approach and thus `ECShardsCount` seems OK for that
aarifullin force-pushed fix/ec_total_count from 38a239fd15 to 5007db4579 2024-03-26 08:36:11 +00:00 Compare
aarifullin force-pushed fix/ec_total_count from 5007db4579 to 56e75743ec 2024-03-26 08:38:22 +00:00 Compare
aarifullin changed title from netmap: Introduce ECTotalCountByIndex method to netmap: Fix ReplicaNumberByIndex method 2024-03-26 08:39:11 +00:00
fyrchik reviewed 2024-03-26 12:04:51 +00:00
netmap/policy.go Outdated
@ -181,3 +181,3 @@
// Zero PlacementPolicy has no replicas.
func (p PlacementPolicy) ReplicaNumberByIndex(i int) uint32 {
return p.replicas[i].GetCount()
return countNodes(p.replicas[i])
Owner

Comment needs to be adjusted too.
How do we distinguish between REP and EC policies in existing places?

Comment needs to be adjusted too. How do we distinguish between REP and EC policies in existing places?
Author
Member

You suggested the idea I liked - ECShardsCount (uint32, uint32). So, ReplicaNumberByIndex will be used for REP polices and ECShardsCount - for EC

You suggested the idea I liked - `ECShardsCount (uint32, uint32)`. So, `ReplicaNumberByIndex` will be used for `REP` polices and `ECShardsCount` - for `EC`
aarifullin force-pushed fix/ec_total_count from 56e75743ec to b42bee3454 2024-03-27 11:46:59 +00:00 Compare
aarifullin reviewed 2024-03-27 11:47:31 +00:00
go.mod Outdated
@ -2,6 +2,8 @@ module git.frostfs.info/TrueCloudLab/frostfs-sdk-go
go 1.20
replace git.frostfs.info/TrueCloudLab/frostfs-api-go/v2 v2.16.1-0.20240319122301-1772b921826b => git.frostfs.info/aarifullin/frostfs-api-go/v2 v2.15.1-0.20240327095603-491a47e7fe24
Author
Member

This will be fixed as soon as #70 is merged

This will be fixed as soon as [#70](https://git.frostfs.info/TrueCloudLab/frostfs-api-go/pulls/70/files) is merged
Owner

It is merged.

It is merged.
aarifullin marked this conversation as resolved
aarifullin force-pushed fix/ec_total_count from b42bee3454 to 9536991d7d 2024-03-27 11:48:21 +00:00 Compare
aarifullin force-pushed fix/ec_total_count from 9536991d7d to fb4a31d03e 2024-03-27 11:49:14 +00:00 Compare
aarifullin force-pushed fix/ec_total_count from fb4a31d03e to b59dac7aee 2024-03-27 11:50:22 +00:00 Compare
aarifullin changed title from netmap: Fix ReplicaNumberByIndex method to netmap: Introduce ECShardsCount method 2024-03-27 12:06:38 +00:00
fyrchik reviewed 2024-03-28 08:18:50 +00:00
netmap/policy.go Outdated
@ -182,1 +182,4 @@
//
// If the result is zero, then EC policy might be used - use ECShardsCountByIndex
// to check that.
func (p PlacementPolicy) ReplicaNumberByIndex(i int) uint32 {
Owner

I am not wondering:
Consider policer. It could happily remove the replica because ReplicaNumberByIndex returns 0. Not saying it will happen, but looks like a probable mistake. In general it seems dangerous for automatic processing. However, returning a non-zero number here is also not perfect.

How about returning replica with Replica(i int) Replicaor may be TypeOf(i int) and then switch based on type?
We can panic if ReplicaNumberByIndex is used for EC then.
Another option is to add , bool) to the return value which signifies success.

I am not wondering: Consider policer. It _could_ happily remove the replica because `ReplicaNumberByIndex` returns 0. Not saying it will happen, but looks like a probable mistake. In general it seems dangerous for automatic processing. However, returning a non-zero number here is also not perfect. How about returning replica with `Replica(i int) Replica`or may be `TypeOf(i int)` and then switch based on type? We can panic if `ReplicaNumberByIndex` is used for EC then. Another option is to add `, bool)` to the return value which signifies success.

Consider policer.

If policer uses old SDK version, then ReplicaNumberByIndex returns 0 for EC anyway.
If policer uses new (with EC) SDK version, then policer must know how to deal with EC.

> Consider policer. If policer uses old SDK version, then `ReplicaNumberByIndex` returns 0 for EC anyway. If policer uses new (with EC) SDK version, then policer must know how to deal with EC.
Author
Member

then policer must know how to deal with EC.

That's the good point. Anyway, we cannot distinguish how Replica was set - either by REP or by EC because there is no indicating flag. The only way is to use these numbers (approaches suggested by @fyrchik will use counting anyway): if ReplicaNumberByIndex > 0, then REP is used; if ECShardsCount > 0, then EC is used; if they are both zero, then the policer can consider that Replica is REP 0.

> then policer must know how to deal with EC. That's the good point. Anyway, we cannot distinguish how `Replica` was set - either by `REP` or by `EC` because there is no indicating flag. **The only way is to use these numbers** (approaches suggested by @fyrchik will use counting *anyway*): if `ReplicaNumberByIndex > 0`, then `REP` is used; if `ECShardsCount > 0`, then `EC` is used; if they are both zero, then the policer can consider that `Replica` is `REP 0`.
Owner

My point is that it is really easy to forget EC case and it is can be hard to see what was forgotten.

My point is that it is really easy to forget EC case and it is can be hard to see what was forgotten.
Author
Member

Alright. To avoid the confusion, I've introduced Replica method and made ReplicaNumberByIndex deprecated. So, Replica allows to check GetCount() and if it's > 0 then the policy is set by REP. EC is set if GetECDataCount() + GetECParityCount() > 0. The SDK's client won't forget this fact since he accesses GetECDataCount(), GetECParityCount()

Alright. To avoid the confusion, I've introduced `Replica` method and made `ReplicaNumberByIndex` deprecated. So, `Replica` allows to check `GetCount()` and if it's `> 0` then the policy is set by `REP`. `EC` is set if `GetECDataCount() + GetECParityCount() > 0`. The SDK's client won't forget this fact since he accesses `GetECDataCount(), GetECParityCount()`
aarifullin force-pushed fix/ec_total_count from b59dac7aee to 1c0244baf1 2024-03-28 09:34:34 +00:00 Compare
aarifullin force-pushed fix/ec_total_count from 1c0244baf1 to a6dffd2cb9 2024-03-28 12:37:17 +00:00 Compare
aarifullin changed title from netmap: Introduce ECShardsCount method to netmap: Introduce Replica method 2024-03-28 12:38:19 +00:00
aarifullin force-pushed fix/ec_total_count from a6dffd2cb9 to e1f70fa79d 2024-03-28 12:59:36 +00:00 Compare
fyrchik approved these changes 2024-03-28 15:29:36 +00:00
@ -44,12 +44,12 @@ func (c *Constructor) Split(obj *objectSDK.Object, key *ecdsa.PrivateKey) ([]*ob
}
func setIDWithSignature(obj *objectSDK.Object, key *ecdsa.PrivateKey) error {
objectSDK.CalculateAndSetPayloadChecksum(obj)
Owner

Is there some test that fails with the old behaviour? The change seems pretty important, could you write such test?

Is there some test that fails with the old behaviour? The change seems pretty important, could you write such test?

To check: create parts, then validate header by CheckHeaderVerificationFields

To check: create parts, then validate header by `CheckHeaderVerificationFields`
Author
Member

I added header verification in TestSplitMaxShardCount for both cases. Is that okay?
It seems it is better to verify it not with separte unit-test for setIDWithSignature but when parts are created!

I added header verification in `TestSplitMaxShardCount` for both cases. Is that okay? It seems it is better to verify it not with separte unit-test for `setIDWithSignature` but when parts are created!
dstepanov-yadro reviewed 2024-03-28 15:31:20 +00:00
netmap/policy.go Outdated
@ -184,2 +186,4 @@
}
// Replica returns i-th replica descriptor. Index MUST be in range [0; NumberOfReplicas()).
func (p PlacementPolicy) Replica(i int) *netmap.Replica {

netmap.Replica - it is api-go type. I think it is better to add some sdk wrapper.

`netmap.Replica` - it is `api-go` type. I think it is better to add some `sdk` wrapper.
Author
Member

Alright. Actually, the same can be said about object and container - they are also confusing names, because we've got the same in api-go :) I renamed the method to ReplicaDescriptor to make entities more distinguishable

Alright. Actually, the same can be said about `object` and `container` - they are also confusing names, because we've got the same in api-go :) I renamed the method to `ReplicaDescriptor` to make entities more distinguishable
Owner

The point is thatObject and Container have aliases and are used in node as objectSDK.Object

The point is that`Object` and `Container` have aliases and are used in node as `objectSDK.Object`
aarifullin force-pushed fix/ec_total_count from dc68f99dad to 0dfd57aa6b 2024-03-28 15:55:38 +00:00 Compare
aarifullin requested review from fyrchik 2024-03-28 15:57:22 +00:00
aarifullin force-pushed fix/ec_total_count from 0dfd57aa6b to 34ede4cfaf 2024-03-28 21:25:19 +00:00 Compare
dstepanov-yadro approved these changes 2024-03-29 06:19:08 +00:00
acid-ant approved these changes 2024-03-29 06:48:09 +00:00
aarifullin force-pushed fix/ec_total_count from 34ede4cfaf to 9cf5d4b797 2024-03-29 08:45:53 +00:00 Compare
aarifullin force-pushed fix/ec_total_count from 9cf5d4b797 to 903e33b311 2024-03-29 08:47:17 +00:00 Compare
aarifullin requested review from dstepanov-yadro 2024-03-29 08:49:09 +00:00
aarifullin requested review from acid-ant 2024-03-29 08:49:09 +00:00
aarifullin force-pushed fix/ec_total_count from 903e33b311 to 0632bfb190 2024-03-29 08:51:46 +00:00 Compare
dstepanov-yadro approved these changes 2024-03-29 09:31:50 +00:00
acid-ant approved these changes 2024-03-29 10:34:19 +00:00
aarifullin force-pushed fix/ec_total_count from 0632bfb190 to ec0cb2169f 2024-03-29 10:48:32 +00:00 Compare
aarifullin requested review from dstepanov-yadro 2024-03-29 10:48:37 +00:00
aarifullin requested review from acid-ant 2024-03-29 10:48:38 +00:00
dstepanov-yadro approved these changes 2024-03-29 10:57:18 +00:00
fyrchik approved these changes 2024-03-29 10:57:56 +00:00
fyrchik merged commit ec0cb2169f into master 2024-03-29 10:58:13 +00:00
fyrchik referenced this pull request from a commit 2024-03-29 10:58:15 +00:00
Sign in to join this conversation.
No reviewers
TrueCloudLab/storage-core-developers
No milestone
No project
No assignees
4 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-sdk-go#211
No description provided.