Flaky test in CI: testWaitForEvacuationCompleted #1705
Reference: TrueCloudLab/frostfs-node#1705
I've noticed another flaky test in Jenkins: one, two, three, four. In all cases reruns were successful.
The failing tests are TestEvacuateShardObjects, TestEvacuateShardObjectsRepOneOnly and TestEvacuateTreesLocal, but they all fail in the same place:
func testWaitForEvacuationCompleted(t *testing.T, e *StorageEngine) *EvacuationState {
	var st *EvacuationState
	var err error
	require.Eventually(t, func() bool {
		st, err = e.GetEvacuationState(context.Background())
		require.NoError(t, err)
		return st.ProcessingStatus() == EvacuateProcessStateCompleted
	}, 3*time.Second, 10*time.Millisecond)
	return st
}
Error message:
I suppose the failure is caused by the CI runner being slow while it executes other jobs in parallel: the test code just does not reach the desired state fast enough. If this suggestion seems sensible, let's increase the timeout: https://review.frostfs.info/c/TrueCloudLab/frostfs-node/+/100
I also welcome anyone coming up with a better (not time-based) solution.
Increasing the timeout from 3 to 6 seconds did not help: build 361, build 363 and build 402 have failed even with the increased timeout.
It is possible that 6 seconds is just not enough and we should increase the timeout further. I'm not familiar enough with the codebase to make that judgement, so I'll stop my guesswork here.
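To reduce the guesswork, the wait could report what it last observed when it gives up, which might show whether the process was still running (too slow) or never left its initial state (stuck). A minimal sketch, assuming the status can be read as a string through some getter; pollStatus, examplePoll and the status strings are hypothetical and not taken from frostfs-node:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// pollStatus calls get every interval until it returns want or ctx expires.
// On expiry the error carries the last observed status and the poll count,
// so a CI failure shows how far the process got before the deadline.
func pollStatus(ctx context.Context, interval time.Duration, want string, get func() (string, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	polls := 0
	for {
		status, err := get()
		polls++
		if err != nil {
			return fmt.Errorf("poll %d failed: %w", polls, err)
		}
		if status == want {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("status %q not reached after %d polls, last seen %q: %w",
				want, polls, status, ctx.Err())
		case <-ticker.C:
		}
	}
}

// examplePoll runs one wait that succeeds and one that times out.
func examplePoll() (okErr, timeoutErr error) {
	calls := 0
	progressing := func() (string, error) {
		calls++
		if calls >= 3 {
			return "completed", nil
		}
		return "running", nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	okErr = pollStatus(ctx, time.Millisecond, "completed", progressing)

	// A getter that never reaches the wanted status, to show the error path.
	stuck := func() (string, error) { return "running", nil }
	shortCtx, shortCancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer shortCancel()
	timeoutErr = pollStatus(shortCtx, time.Millisecond, "completed", stuck)
	return okErr, timeoutErr
}
```

If the reported status at the deadline is still "running", a larger timeout may plausibly help; if it is something else, the timeout is probably not the real problem.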