mclock: Fix timer-based scheduling #12

Merged
fyrchik merged 3 commits from dstepanov-yadro/frostfs-qos:fix/mclock_timer_scheduling into master 2025-03-21 13:41:31 +00:00

Let's assume that there are two requests in the queue with execution time t1 and t2.
The timer is set to t1. The timer is triggered, schedules the t1 request,
calculates the time for the next timer t2 to be triggered.
But it doesn't schedules timer to this time because of the
q.timeBasedScheduleTs > nextTs check.

Let's assume that there are two requests in the queue with execution time t1 and t2. The timer is set to t1. The timer is triggered, schedules the t1 request, calculates the time for the next timer t2 to be triggered. But it doesn't schedules timer to this time because of the `q.timeBasedScheduleTs > nextTs` check.
dstepanov-yadro added 1 commit 2025-03-19 13:00:23 +00:00
[#9999] mclock: Fix timer-based scheduling
All checks were successful
DCO action / DCO (pull_request) Successful in 25s
Tests and linters / Run gofumpt (pull_request) Successful in 28s
Tests and linters / Tests (pull_request) Successful in 43s
Vulncheck / Vulncheck (pull_request) Successful in 40s
Tests and linters / Tests with -race (pull_request) Successful in 50s
Tests and linters / Staticcheck (pull_request) Successful in 59s
Pre-commit hooks / Pre-commit (pull_request) Successful in 1m28s
Tests and linters / Lint (pull_request) Successful in 1m27s
Tests and linters / gopls check (pull_request) Successful in 1m48s
480b9aee3d
Let's assume that there are two requests in the queue with execution time t1 and t2.
The timer is set to t1. The timer is triggered, schedules the t1 request,
calculates the time for the next timer t2 to be triggered.
But it doesn't schedules timer to this time because of the
`q.timeBasedScheduleTs > nextTs` check.

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
requested review from fyrchik 2025-03-19 13:00:23 +00:00
dstepanov-yadro force-pushed fix/mclock_timer_scheduling from 480b9aee3d to f546741c3a 2025-03-19 13:01:13 +00:00 Compare
requested reviews from storage-core-committers, storage-core-developers 2025-03-19 13:01:29 +00:00
dstepanov-yadro reviewed 2025-03-19 13:07:37 +00:00
dstepanov-yadro changed title from mclock: Fix timer-based scheduling to WIP: mclock: Fix timer-based scheduling 2025-03-19 13:20:50 +00:00
fyrchik approved these changes 2025-03-19 13:25:19 +00:00
fyrchik left a comment
Owner

How have you caught this bug?

How have you caught this bug?
@ -300,3 +295,1 @@
q.scheduleRequest()
})
q.timeBasedScheduleTs = nextTs
if nextTs == math.MaxFloat64 {
Owner

Comparing float64 with == rubs me the wrong way, do we have any alternative?

Comparing `float64` with `==` rubs me the wrong way, do we have any alternative?
Author
Member

fixed

fixed
dstepanov-yadro force-pushed fix/mclock_timer_scheduling from f546741c3a to 1ebed20b9d 2025-03-19 14:02:17 +00:00 Compare
dstepanov-yadro force-pushed fix/mclock_timer_scheduling from 1ebed20b9d to 346752477b 2025-03-19 14:12:07 +00:00 Compare
Author
Member

@fyrchik wrote in #12 (comment):

How have you caught this bug?

Hardware testing

@fyrchik wrote in https://git.frostfs.info/TrueCloudLab/frostfs-qos/pulls/12#issuecomment-70762: > How have you caught this bug? Hardware testing
dstepanov-yadro reviewed 2025-03-19 14:14:49 +00:00
@ -63,0 +59,4 @@
if currentTask != nil {
c.wg.Add(1)
f := currentTask
go func() {
Author
Member

Run f() on separate goroutine to not to stuck on channel push.

Run f() on separate goroutine to not to stuck on channel push.
dstepanov-yadro added 1 commit 2025-03-20 07:35:33 +00:00
[#12] mclock: Fix deadlock caused by mclock.Close
All checks were successful
DCO action / DCO (pull_request) Successful in 21s
Vulncheck / Vulncheck (pull_request) Successful in 42s
Pre-commit hooks / Pre-commit (pull_request) Successful in 1m14s
Tests and linters / Lint (pull_request) Successful in 1m14s
Tests and linters / Run gofumpt (pull_request) Successful in 1m10s
Tests and linters / Staticcheck (pull_request) Successful in 1m17s
Tests and linters / Tests (pull_request) Successful in 1m6s
Tests and linters / Tests with -race (pull_request) Successful in 1m22s
Tests and linters / gopls check (pull_request) Successful in 1m29s
1ca213ee7c
Deadlock scenario:
- mclock closed by `Close` method, it locks mutex and calls `clock.close`
- clock starts `scheduleRequest` goroutine, it tries to lock mutex
- `clock.Close` waits for all goroutines

Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
dstepanov-yadro added 1 commit 2025-03-20 14:25:05 +00:00
[#12] grpc: Fix method name
All checks were successful
DCO action / DCO (pull_request) Successful in 27s
Vulncheck / Vulncheck (pull_request) Successful in 40s
Pre-commit hooks / Pre-commit (pull_request) Successful in 1m17s
Tests and linters / Run gofumpt (pull_request) Successful in 1m15s
Tests and linters / Tests (pull_request) Successful in 1m21s
Tests and linters / Staticcheck (pull_request) Successful in 1m21s
Tests and linters / Tests with -race (pull_request) Successful in 1m22s
Tests and linters / Lint (pull_request) Successful in 1m25s
Tests and linters / gopls check (pull_request) Successful in 1m25s
Vulncheck / Vulncheck (push) Successful in 48s
Tests and linters / Run gofumpt (push) Successful in 1m11s
Tests and linters / Staticcheck (push) Successful in 1m16s
Tests and linters / Lint (push) Successful in 1m22s
Pre-commit hooks / Pre-commit (push) Successful in 1m34s
Tests and linters / Tests (push) Successful in 1m43s
Tests and linters / Tests with -race (push) Successful in 1m45s
Tests and linters / gopls check (push) Successful in 1m46s
32079ad7c2
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
dstepanov-yadro changed title from WIP: mclock: Fix timer-based scheduling to mclock: Fix timer-based scheduling 2025-03-21 07:34:23 +00:00
requested reviews from fyrchik, storage-core-committers, storage-core-developers 2025-03-21 07:34:32 +00:00
aarifullin reviewed 2025-03-21 11:27:54 +00:00
@ -68,0 +73,4 @@
if s.ts >= currentTs {
// current timer will fire earlier
// so next scheduleRequest will push new schedule event
continue
Member

But what does happen to the scheduled task? It's gone? Will it be rescheduled?

But what does happen to the scheduled task? It's gone? Will it be rescheduled?
Author
Member

current timer will fire and call scheduleRequest, and scheduleRequest will set timer again on s.ts

current timer will fire and call `scheduleRequest`, and `scheduleRequest` will set timer again on `s.ts`
aarifullin marked this conversation as resolved
aarifullin approved these changes 2025-03-21 13:09:03 +00:00
fyrchik merged commit 32079ad7c2 into master 2025-03-21 13:41:31 +00:00
fyrchik referenced this pull request from a commit 2025-03-21 13:41:32 +00:00
Sign in to join this conversation.
No description provided.