Optimize complex object assembly #539

Merged
fyrchik merged 2 commits from dstepanov-yadro/frostfs-node:fix/assembling_optimizations into master 2023-07-27 14:30:36 +00:00

As is: The node reads objects in memory sequentially, allocating a new buffer for each child object.

Optimization, version 1: Removed the allocation of a new buffer for each child object.

Optimization, version 2: The child object is written direct to the output stream. Side effect: previously, the object was validated first, then the payload was sent; now we send the payload first, then we validate.

See commits for realization details.

Dev-env test results on local PC: 10 readers, 10 minutes, object size 512MB

version/metric rate, MB/s Max process_resident_memory, GB GC count
master branch 204 1.55 3.5K
opt v1 200 1.55 (but avg lower than master) 2.7K
opt v2 225 1.3 (1.1 avg) 2.6K

master
image

v1
image

v2
image

As is: The node reads objects in memory sequentially, allocating a new buffer for each child object. Optimization, version 1: Removed the allocation of a new buffer for each child object. Optimization, version 2: The child object is written direct to the output stream. Side effect: previously, the object was validated first, then the payload was sent; now we send the payload first, then we validate. See commits for realization details. Dev-env test results on local PC: 10 readers, 10 minutes, object size 512MB | version/metric| rate, MB/s | Max process_resident_memory, GB | GC count | | ------------- | ---------- | ---------------------------------| -------- | | master branch | 204 | 1.55 | 3.5K | | opt v1 | 200 | 1.55 (but avg lower than master) | 2.7K | | opt v2 | 225 | 1.3 (1.1 avg) | 2.6K | master ![image](/attachments/33aecd05-427b-45c7-9dc4-2c52b206ebf0) v1 ![image](/attachments/e4586d1c-37aa-43ed-b8df-9d7498b96d05) v2 ![image](/attachments/6763a694-c752-47fb-ad35-aad43f172d06)
134 KiB
134 KiB
137 KiB
dstepanov-yadro force-pushed fix/assembling_optimizations from 5962672fae to d3128378a1 2023-07-24 11:51:49 +00:00 Compare
dstepanov-yadro requested review from storage-core-committers 2023-07-24 12:03:51 +00:00
dstepanov-yadro requested review from storage-core-developers 2023-07-24 12:03:51 +00:00
fyrchik approved these changes 2023-07-24 13:17:12 +00:00
fyrchik reviewed 2023-07-24 13:17:33 +00:00
@ -51,9 +51,7 @@ func NewSimpleObjectWriter() *SimpleObjectWriter {
func (s *SimpleObjectWriter) WriteHeader(_ context.Context, obj *objectSDK.Object) error {
s.obj = obj

argh

argh
dstepanov-yadro force-pushed fix/assembling_optimizations from d3128378a1 to 4e9e0e66fc 2023-07-25 14:29:52 +00:00 Compare
aarifullin reviewed 2023-07-27 09:04:00 +00:00
@ -11,3 +11,3 @@
type objectGetter interface {
GetObject(ctx context.Context, id oid.ID, rng *objectSDK.Range) (*objectSDK.Object, error)
GetObjectAndWritePayload(ctx context.Context, id oid.ID, rng *objectSDK.Range, writer ChunkWriter) (*objectSDK.Object, error)
Collaborator

I find the idea for the such optimization as excellent, but didn't you consider to return payload from the method instead passing the writer?

type ObjectAndFriends struct {
   object *objectSDK.Object
   pld    []byte
}

for

func (r *request) GetObjectAndWritePayload(ctx context.Context, id oid.ID, rng *objectSDK.Range, writer ChunkWriter) (ObjectAndFriends, error) {
}

I am afraid that with this hack (that works perfectly anyway) we may break SRP (I am really confused when one method gets and sends something in the same place) and dependency inversion (the point to pass an instance that implements the interface gives us the opportunity that we barely can use - we pass NewSimpleObjectWriter but will we ever wrap something else?)

Fix me, if I skipped or didn't get something!

I find the idea for the such optimization as excellent, but didn't you consider to return payload from the method instead passing the writer? ``` type ObjectAndFriends struct { object *objectSDK.Object pld []byte } ``` for ``` func (r *request) GetObjectAndWritePayload(ctx context.Context, id oid.ID, rng *objectSDK.Range, writer ChunkWriter) (ObjectAndFriends, error) { } ``` I am afraid that with this hack (that works perfectly anyway) we may break SRP (I am really confused when one method gets and sends something in the same place) and dependency inversion (the point to pass an instance that implements the interface gives us the opportunity that we barely can use - we pass `NewSimpleObjectWriter` but will we ever wrap something else?) Fix me, if I skipped or didn't get something!
Poster
Collaborator

return payload from the method instead passing the writer?

Then we will have to allocate memory for the entire object, and I'm just trying to avoid this.

we pass NewSimpleObjectWriter but will we ever wrap something else

NewSimpleObjectWriter() passes only in one place. See assemblePayloadByObjectIDs: writer writes directly to response stream.

> return payload from the method instead passing the writer? Then we will have to allocate memory for the entire object, and I'm just trying to avoid this. > we pass NewSimpleObjectWriter but will we ever wrap something else `NewSimpleObjectWriter()` passes only in one place. See `assemblePayloadByObjectIDs`: `writer` writes directly to response stream.
dstepanov-yadro force-pushed fix/assembling_optimizations from 4e9e0e66fc to 99bb488ebd 2023-07-27 14:02:28 +00:00 Compare
aarifullin approved these changes 2023-07-27 14:26:08 +00:00
aarifullin left a comment
Collaborator

Awesome 👍

Awesome 👍
fyrchik merged commit 99bb488ebd into master 2023-07-27 14:30:36 +00:00
Sign in to join this conversation.
No reviewers
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#539
There is no content yet.