object/transformer: Add expiration epoch to each part of big object #319

Merged
fyrchik merged 1 commit from a-savchuk/frostfs-sdk-go:parent-parent-attrs-in-ec into master 2025-02-03 11:13:41 +00:00
Member

Suppose you have a container with an EC placement policy and you want to store a big object. The object is divided as follows: object -> parts -> chunks. After splitting, the original object's attributes are stored in the last part and in the linking object, as the parent's header. When the parts are divided into chunks, the chunks don't carry the original object's attributes, in particular the expiration epoch. As a result, chunks of a big object do not expire.

Let's add the expiration epoch to each part (for now, to parts only) when splitting a big object.

I've checked this solution not only for the case described above, but also for every combination of object size (small and big objects) and placement policy (REP and EC). Checked with dev-env; for details, see [my script](https://git.frostfs.info/a-savchuk/misc/src/commit/b7ed457b617a37a558be1a667b8792e49f8c19b5/object_expiration.py). I also attached the script output here.
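To illustrate why an object that lacks the attribute never expires, here is a minimal Go sketch of an expiration check. The `Attribute` type and both functions are hypothetical stand-ins, not SDK code; the key value is assumed to match `objectV2.SysAttributeExpEpoch`, and the "expired once the current epoch is past the stored one" comparison is also an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// expEpochKey is ASSUMED to equal objectV2.SysAttributeExpEpoch;
// check the real constant before relying on this value.
const expEpochKey = "__SYSTEM__EXPIRATION_EPOCH"

// Attribute is a hypothetical stand-in for an object header attribute.
type Attribute struct {
	Key, Value string
}

// expirationEpoch returns the epoch stored in attrs, if present and valid.
func expirationEpoch(attrs []Attribute) (uint64, bool) {
	for _, a := range attrs {
		if a.Key != expEpochKey {
			continue
		}
		epoch, err := strconv.ParseUint(a.Value, 10, 64)
		if err != nil {
			return 0, false
		}
		return epoch, true
	}
	return 0, false
}

// isExpired reports whether an object with attrs is expired at the
// current epoch. Without the attribute the object never expires,
// which is the situation described for chunks of a big object.
func isExpired(attrs []Attribute, current uint64) bool {
	epoch, ok := expirationEpoch(attrs)
	return ok && current > epoch
}

func main() {
	part := []Attribute{{Key: expEpochKey, Value: "100"}}
	fmt.Println("part with epoch expired:", isExpired(part, 101))
	fmt.Println("chunk without epoch expired:", isExpired(nil, 101))
}
```

With a check of this shape, a chunk whose header has no expiration attribute is indistinguishable from an object that was never meant to expire.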
a-savchuk added 1 commit 2025-01-15 11:20:37 +00:00
[#xx] object: Store EC chunk's parent parent's attributes
Some checks failed
DCO / DCO (pull_request) Failing after 26s
Tests and linters / Tests (pull_request) Successful in 50s
Tests and linters / Lint (pull_request) Successful in 2m14s
cda1c5b40d
In FrostFS we can:
- Split a big object into parts: Object -> Parts
- Split an object into chunks: Object -> Chunks
- Do both: Object -> Parts -> Chunks

And object's attributes are propagated in the following way:
- Object (attributes) -> Parts (no attributes)
- Object (attributes) -> Chunks (attributes)
- Object (attributes) -> Parts (no attributes) -> Chunks (no attributes)

As a result, in a FrostFS node, there's no way to determine the expiration epoch
of an expirable regular object. Now attributes are stored in the following way:
- If a chunk's parent has no parent, store chunk's parent's attributes
- If a chunk's parent has a parent, store chunk's parent's parent's attributes

Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
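The selection rule in this commit message can be sketched as follows. This is only an illustration of the rule as written above, using hypothetical `Object` and `Attribute` types rather than the SDK's; it is not the code from the commit.

```go
package main

import "fmt"

// Attribute is a hypothetical stand-in for an object header attribute.
type Attribute struct {
	Key, Value string
}

// Object is a hypothetical, simplified model of an object header.
// Parent is nil when the object was not produced by splitting.
type Object struct {
	Attributes []Attribute
	Parent     *Object
}

// attributesForChunk picks the attributes to store in an EC chunk
// whose parent is obj:
//   - obj has no parent: use obj's own attributes;
//   - obj has a parent:  use that parent's attributes.
func attributesForChunk(obj *Object) []Attribute {
	if obj.Parent == nil {
		return obj.Attributes
	}
	return obj.Parent.Attributes
}

func main() {
	original := &Object{Attributes: []Attribute{{Key: "FileName", Value: "big.bin"}}}
	part := &Object{Parent: original} // a part carries no attributes of its own

	fmt.Println(attributesForChunk(original)) // Object -> Chunks
	fmt.Println(attributesForChunk(part))     // Object -> Parts -> Chunks
}
```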
requested reviews from storage-services-developers, storage-core-committers, storage-core-developers, storage-services-committers 2025-01-15 11:20:37 +00:00
a-savchuk force-pushed parent-parent-attrs-in-ec from cda1c5b40d to 0da5be5632 2025-01-15 11:22:03 +00:00 Compare
fyrchik reviewed 2025-01-15 11:56:08 +00:00
@ -33,19 +33,21 @@ func (c *Constructor) Split(obj *objectSDK.Object, key *ecdsa.PrivateKey) ([]*ob
chunk.SetPayload(payloadShards[i])
chunk.SetPayloadSize(uint64(len(payloadShards[i])))
attributes := obj.Attributes()
Owner
  1. We cannot modify Split() function so easily -- we have already created EC objects on deployed systems, what will happen with them?
  2. What about obj.Attributes()? Does it contain any info if obj.Parent() != nil?
  3. We tried to explicitly avoid copying parent attributes to EC attributes, now we duplicate them in every chunk?
  4. ParentAttributes() has parent attributes, but Attributes() has actually Parent.Parent.Attributes(). It seems misleading.
Author
Member

> We cannot modify `Split()` function so easily -- we have already created EC objects on deployed systems, what will happen with them?

I think nothing will happen. It'll make _new_ objects expire correctly, but _old_ objects won't be deleted upon expiration.

> What about `obj.Attributes()`? Does it contain any info if `obj.Parent() != nil`?

For now, no. After the recent changes, each part will have an expiration epoch.

> We tried to explicitly avoid copying parent attributes to EC attributes, now we duplicate them in every chunk?

Yes, this is a problem. The new solution is to add only the expiration epoch.

> `ParentAttributes()` has parent attributes, but `Attributes()` has actually `Parent.Parent.Attributes()`. It seems misleading.

I think that's not accurate. I only changed the attributes returned by `ParentAttributes()`; as far as I can see, `Attributes()` currently returns nothing.

**Anyway, now I add an expiration epoch during the first splitting of an object instead.**

a-savchuk changed title from object: Store EC chunk's parent parent's attributes to WIP: object: Store EC chunk's parent parent's attributes 2025-01-15 12:49:47 +00:00
a-savchuk force-pushed parent-parent-attrs-in-ec from 0da5be5632 to 48b48ced15 2025-01-28 11:40:12 +00:00 Compare
a-savchuk force-pushed parent-parent-attrs-in-ec from 48b48ced15 to 8389887a34 2025-02-02 15:16:45 +00:00 Compare
a-savchuk changed title from WIP: object: Store EC chunk's parent parent's attributes to object/transformer: Add expiration epoch to each part of big object 2025-02-02 19:04:40 +00:00
dstepanov-yadro approved these changes 2025-02-03 06:27:18 +00:00
acid-ant approved these changes 2025-02-03 06:33:26 +00:00
achuprov approved these changes 2025-02-03 06:57:27 +00:00
fyrchik approved these changes 2025-02-03 09:34:48 +00:00
@ -330,0 +331,4 @@
// add expiration epoch to each part
for _, attr := range s.parAttrs {
if attr.Key() == objectV2.SysAttributeExpEpoch {
Owner

We may rely on `s.current` having an empty set of attributes in other places.

  1. Could you check that if you emit the first part and then override the attribute set (for some reason), there would be no data races?
  2. This code belongs to `prepareFirstChild`, why does the second child inherit the attribute?
Author
Member

> Could you check that if you emit the first part and then override the attribute set (for some reason), there would be no data races?

I'll check.

> This code belongs to `prepareFirstChild`, why does the second child inherit the attribute?

Because each child is initialized from the previous one:

`s.current = fromObject(s.current)`

https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/src/commit/593dd77d841aa6652377d3755684d0a968e25fff/object/transformer/transformer.go#L87
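A minimal sketch of why every later part ends up with the attribute once the first part has it. Everything here is hypothetical: `Part`, `prepareFirstPart` and `fromPrevious` only mimic the idea of initializing each child from the previous one and are not the transformer's real code. The sketch copies the attribute slice on each step so that parts do not share a backing array; whether the real `fromObject` copies or shares is not something this thread confirms.

```go
package main

import "fmt"

// expEpochKey is ASSUMED to equal objectV2.SysAttributeExpEpoch.
const expEpochKey = "__SYSTEM__EXPIRATION_EPOCH"

// Attribute and Part are hypothetical stand-ins for SDK types.
type Attribute struct {
	Key, Value string
}

type Part struct {
	Attributes []Attribute
}

// prepareFirstPart copies only the expiration epoch from the parent's
// attributes into the first part; other attributes are not propagated.
func prepareFirstPart(parentAttrs []Attribute) *Part {
	first := &Part{}
	for _, a := range parentAttrs {
		if a.Key == expEpochKey {
			first.Attributes = append(first.Attributes, a)
		}
	}
	return first
}

// fromPrevious initializes the next part from the previous one.
// The slice is copied so that later changes to one part's attributes
// cannot affect a part that has already been emitted.
func fromPrevious(prev *Part) *Part {
	return &Part{Attributes: append([]Attribute(nil), prev.Attributes...)}
}

func main() {
	parentAttrs := []Attribute{
		{Key: "FileName", Value: "big.bin"},
		{Key: expEpochKey, Value: "100"},
	}

	current := prepareFirstPart(parentAttrs)
	for i := 0; i < 3; i++ {
		fmt.Printf("part %d attributes: %v\n", i, current.Attributes)
		current = fromPrevious(current)
	}
}
```

Copying on each step is one way to address the data-race concern raised above: overriding the attribute set of the current part then cannot touch a part that was already handed off.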
fyrchik merged commit 8389887a34 into master 2025-02-03 11:13:41 +00:00
Reference: TrueCloudLab/frostfs-sdk-go#319