GetSubTree with sort order does not return more than 1000 elements with the same FileName attribute #1642
Reference: TrueCloudLab/frostfs-node#1642
Expected Behavior
GetSubTree RPC returns all available nodes if they have a non-empty FileName attribute.
Current Behavior
GetSubTree RPC returns up to 1000 nodes for nodes with the same non-empty FileName attribute.
Steps to Reproduce (for bugs)
Context
The S3 gateway may store multiple versions of the same object, so these tree nodes all have the same FilePath attribute but a different object ID in the meta. When the gateway tries to list all available versions, it does not show more than 1000 versions of the same object. Meanwhile, listing works as expected when the FileName attributes differ.
We've already met this 1000-object limit while listing multipart data: TrueCloudLab/frostfs-s3-gw#472. Multipart tree nodes do not have the FileName attribute, so we had to disable sorting to return more than 1000 objects. However, this is not the case now.
Regression
Seems like no.
Your Environment
frostfs-node v0.42.18
frostfs-node v0.44.8
The problem can't be solved so easily. We create "same-version" nodes outgoing from the same parent ID:
Let's have a look at getSortedSubTree. The invocation performs a depth-first traversal.
When parentID is processed here, we go into boltForest [...] 1000 nodes. 1000 comes from batchSize [...] "same-version" [...]).
The problem is that fillSortedChildren aims not to duplicate results, but it can't seek to the correct childID to resume picking nodes with the same filename.
Temporary solution: increase batch size
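The behaviour described above can be illustrated with a minimal Go sketch. It is a simplified model, not the frostfs-node code: the `node` and `cursor` types and the `listByName` and `listByNameAndID` functions are hypothetical, and it assumes that the batch cursor carries only the last returned filename. Under that assumption, children sharing one filename are cut off after the first batch, a larger batch only moves the limit, and a compound (filename, ID) cursor is one possible way to resume inside a run of equal filenames.

```go
package main

import (
	"fmt"
	"sort"
)

// node is a hypothetical, simplified tree node (not the frostfs-node type).
type node struct {
	ID       uint64
	FileName string
}

// cursor is a hypothetical compound position: filename plus node ID.
type cursor struct {
	FileName string
	ID       uint64
}

// makeVersions builds n children of one parent that share a filename,
// already ordered by (FileName, ID).
func makeVersions(n int) []node {
	nodes := make([]node, 0, n)
	for i := 1; i <= n; i++ {
		nodes = append(nodes, node{ID: uint64(i), FileName: "obj"})
	}
	return nodes
}

// batchFrom returns at most batchSize nodes starting at index start.
func batchFrom(sorted []node, start, batchSize int) []node {
	end := start + batchSize
	if end > len(sorted) {
		end = len(sorted)
	}
	return sorted[start:end]
}

// listByName resumes each batch strictly after the last returned filename.
// It never returns duplicates, but it skips the rest of an equal-name run.
func listByName(sorted []node, batchSize int) []node {
	var out []node
	last := ""
	for {
		start := sort.Search(len(sorted), func(i int) bool {
			return sorted[i].FileName > last
		})
		batch := batchFrom(sorted, start, batchSize)
		if len(batch) == 0 {
			return out
		}
		out = append(out, batch...)
		last = batch[len(batch)-1].FileName
	}
}

// listByNameAndID resumes strictly after the last (FileName, ID) pair,
// so it can continue inside a run of equal filenames.
func listByNameAndID(sorted []node, batchSize int) []node {
	var out []node
	var last cursor
	for {
		start := sort.Search(len(sorted), func(i int) bool {
			n := sorted[i]
			return n.FileName > last.FileName ||
				(n.FileName == last.FileName && n.ID > last.ID)
		})
		batch := batchFrom(sorted, start, batchSize)
		if len(batch) == 0 {
			return out
		}
		out = append(out, batch...)
		tail := batch[len(batch)-1]
		last = cursor{FileName: tail.FileName, ID: tail.ID}
	}
}

func main() {
	versions := makeVersions(2500)
	fmt.Println(len(listByName(versions, 1000)))      // 1000
	fmt.Println(len(listByName(versions, 2000)))      // 2000
	fmt.Println(len(listByNameAndID(versions, 1000))) // 2500
}
```

In this model, raising the batch size from 1000 to 2000 still truncates the 2500 versions, which matches the point that a larger batch is only a temporary workaround.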
I think we can keep it as it is for now. In my opinion, a 1000 limit is no worse than a 2000 or 3000 limit.
Can we say that this batch limit can be reworked in the next version of the tree service?