[#118] node: add unit concurrent tests for blobstor #233
Reference: TrueCloudLab/frostfs-node#233
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
@@ -176,0 +236,4 @@
	})
}

func TestConcurrentDelete(t *testing.T) {
No more good scenarios came to my mind apart from these two. I'd be glad to hear your ideas.
@@ -176,0 +182,4 @@
	smallSizeLimit = 512
	// concurrentPutCount is fstree implementation specific
	concurrentPutCount = 5
@fyrchik
I've got this from this constant.
It looks like the constant sets the restriction for concurrent putting. Should we get this documented somehow?
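For context, a hypothetical sketch (not FSTree's actual code) of the kind of bounded-put restriction being discussed; maxConcurrentPuts, errBusy, and putLimiter are made up for illustration:

package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// maxConcurrentPuts stands in for the FSTree constant under discussion.
const maxConcurrentPuts = 5

var errBusy = errors.New("too many concurrent puts")

type putLimiter struct{ inflight atomic.Int64 }

// put fails fast when more than maxConcurrentPuts writers are in flight.
func (l *putLimiter) put() error {
	if l.inflight.Add(1) > maxConcurrentPuts {
		l.inflight.Add(-1)
		return errBusy
	}
	defer l.inflight.Add(-1)
	// the actual file write would happen here
	return nil
}

func main() {
	var l putLimiter
	var wg sync.WaitGroup
	for i := 0; i < maxConcurrentPuts+1; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := l.put(); err != nil {
				fmt.Println(err) // may or may not fire, depending on scheduling
			}
		}()
	}
	wg.Wait()
}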
I'm a bit skeptical about the value of these tests. A few questions:
I guess they will still be useful to catch obvious/basic concurrency issues, just wondering whether we can or should do better.
@@ -176,0 +264,4 @@
prm.Address = object.AddressOf(obj)
if _, err := b.Delete(prm); err != nil {
	require.ErrorContains(t, err, "object not found")
}
Maybe add a check that the object does not exist?
There are many ways to check whether an object exists. But to me it's more logically correct to check the error from deletion, because I'm testing concurrent deletes.
Yes, it's true. I'm talking about the fact that after deletion the object should no longer exist. Maybe delete does not return an error, but the object is not deleted.
You're right. That must be checked. I've added a check for the object's existence.
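For the record, a rough sketch of such a check, in the style of the hunk above; ExistsPrm and the Exists result field are assumptions about the BlobStor API and may differ between versions:

// after wg.Wait(), the object must be gone (prm/result field names assumed)
var existsPrm common.ExistsPrm
existsPrm.Address = object.AddressOf(obj)
res, err := b.Exists(existsPrm)
require.NoError(t, err)
require.False(t, res.Exists, "object should not exist after concurrent deletes")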
I think at least we should add the -race flag for go test.
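E.g. (the package path is an assumption):

go test -race ./pkg/local_object_storage/blobstor/...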
Force-pushed from a3a54534af to 91213fede3
As far as I understood the issue, here we should check logical correctness and data consistency (in the case of concurrent writes)
Yes, I did. See the test "put the same big object with error".
From my point of view they should be run as regular tests, because we're trying to catch logical errors rather than incorrect resource sharing. But I don't mind enabling the -race option for go test.
Force-pushed from 91213fede3 to ea6ad1e13b
@@ -176,0 +182,4 @@
	smallSizeLimit = 512
	// concurrentPutCount is fstree implementation specific
	concurrentPutCount = 5
Kinda a magic number to me: the const from the FSTree may vary independently.
@fyrchik, what problem cases are expected to be tested by that PR?
Agree. That constant in FSTree affects concurrent writes: if there are more concurrent writers than concurrentPutCount, then errors occur.
@@ -176,0 +185,4 @@
	concurrentPutCount = 5
)

newBlobStor := func(t *testing.T, compress bool) *BlobStor {
Unused var here and below?
@@ -176,0 +202,4 @@
}

testPut := func(t *testing.T, wg *sync.WaitGroup, b *BlobStor, obj *objectSDK.Object) {
	defer wg.Done()
Why defer here? Looks useless without multiple returns.
Same in testPutFileExistsError.
wg is no longer used in such methods.
@@ -176,0 +235,4 @@
bigObj := testObject(smallSizeLimit * 2)
wg := &sync.WaitGroup{}
for range [concurrentPutCount + 1]int{} {
Have never seen such a [0; n) range -- live and learn :) But might for i := 0; ...; i++ be more expected by a regular Go reader? Also, your way should allocate more memory for nothing.
@@ -176,0 +304,4 @@
wg := &sync.WaitGroup{}
for range [2]int{} {
	wg.Add(1)
	go testDelete(t, wg, blobStor, bigObj)
How about doing the wg bookkeeping in the caller and not passing wg to the testing funcs?
Sounds reasonable!
But with one fix: wg.Add must happen before the goroutine starts. Otherwise wg.Wait() can be passed while a goroutine has not been started yet (no wg.Add).
Oh, yeah, for sure. Fast browser programming may lead to stupid bugs.
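I.e., the final shape, sketched (Add before go, Done inside the goroutine, wg no longer passed to testDelete; exact details assumed):

wg := &sync.WaitGroup{}
for i := 0; i < 2; i++ {
	wg.Add(1) // before the goroutine starts, so Wait cannot return early
	go func() {
		defer wg.Done()
		testDelete(t, blobStor, bigObj)
	}()
}
wg.Wait()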
Force-pushed from ea6ad1e13b to 1dafa8fb07
The issue only suggests testing the behavior when get/put is done concurrently, but doesn't say how to do it exactly or what properties the author (who's the author, btw?) had in mind. That's why I'm asking.
This is not what I meant. What I meant is: have you checked whether introducing a relevant bug in the codebase itself causes this verbatim test to catch it, and how often, if at all?
I don't quite get the "check logical error but not incorrect resource sharing" part. Can you elaborate? It would seem like "incorrect resource sharing" (i.e. a concurrency bug) is a potential source of "logical errors", no?
My concern is simply to evaluate the utility of these tests somehow. Using the race detector is a good builtin tool to use random scheduling to try to surface some of the possible bugs, but it's probably not the only way.
@@ -176,0 +215,4 @@
bigObj := testObject(smallSizeLimit * 2)
wg := &sync.WaitGroup{}
for range [concurrentPutCount]int{} {
cute loop. But why? Just because it's slightly shorter to type or is there another reason?
It shouldn't be here. I just wanted to see whether this expression would be valid, and it worked out :) I'll replace it with for i := 0; ...
It certainly works. It's probably useful for code golfing :) First time I've seen it, so TIL.
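For comparison, both spellings in a tiny runnable example:

package main

import "fmt"

const n = 5

func main() {
	// array-literal form: ranges over a zero-valued [n]int array
	for range [n]int{} {
		fmt.Println("tick")
	}
	// index form: what most Go readers expect
	for i := 0; i < n; i++ {
		fmt.Println("tick", i)
	}
}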
@@ -176,0 +214,4 @@
t.Run("put the same big object", func(t *testing.T) {
	bigObj := testObject(smallSizeLimit * 2)
	wg := &sync.WaitGroup{}
The zero WaitGroup is valid:
Fixed 👍
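(The elided suggestion above was presumably the zero-value declaration:)

var wg sync.WaitGroup // the zero value is ready to use; no &sync.WaitGroup{} needed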
@ale64bit we would like to test the behaviour when different operations are being done on the same object: there should be no data corruption and consistent results for both.
@fyrchik can we enable the -race flag for go test then?
Force-pushed from 1dafa8fb07 to 1478cf631a