Possible panics when client has zero free space on disk #125

Closed
opened 2023-07-19 12:03:38 +00:00 by alexvanin · 5 comments
Owner

Not sure this is an issue we have to solve, but it would be nice to look at it anyway.

When the client's hard disk is out of space, the S3 gateway might panic like this:

Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: 2023-07-18T14:51:45.252Z        info        http/server.go:3228        http: panic serving 10.78.70.146:44480: runtime error: index out of range [-1]
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: goroutine 1382670212 [running]:
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: net/http.(*conn).serve.func1()
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         net/http/server.go:1850 +0xbf
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: panic({0x1012500, 0xc009dee0a8})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         runtime/panic.go:890 +0x262
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*rngSource).Uint64(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rng.go:249
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*rngSource).Int63(0x40dc27?)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rng.go:234 +0x92
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*Rand).Int63(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rand.go:84
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*Rand).Float64(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rand.go:195
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*sampler).Next(0xc00ac187c0)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/sampler.go:64 +0x54
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*innerPool).connection(0xc00091ac80)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/pool.go:1870 +0x1c5
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*Pool).connection(0x0?)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/pool.go:1849 +0x56
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*Pool).GetContainer(0x0?, {0x145ec98, 0xc0097a1500}, {{0x58, 0xfe, 0x6c, 0x16, 0x93, 0x3c, 0xa2, ...}})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/pool.go:2344 +0x5d
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/internal/frostfs.(*FrostFS).Container(0xc001bca1f8, {0x145ec98, 0xc0097a1500}, {0x58, 0xfe, 0x6c, 0x16, 0x93, 0x3c, 0xa2, ...})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/internal/frostfs/frostfs.go:95 +0xef
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer.(*layer).containerInfo(0xc00031e000, {0x145ec98?, 0xc0097a1500}, {0x58, 0xfe, 0x6c, 0x16, 0x93, 0x3c, 0xa2, ...})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer/container.go:45 +0x345
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer.(*layer).GetBucketInfo(0xc00031e000, {0x145ec98, 0xc0097a1500}, {0xc014e28705?, 0xc0097a1590?})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer/layer.go:358 +0x351
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler.(*handler).getBucketAndCheckOwner(0x145ec98?, 0xc007895b00, {0xc014e28705?, 0x4a?}, {0x0, 0x0, 0x145?})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler/util.go:63 +0x6c
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler.(*handler).GetObjectHandler(0xc0004c0450, {0x145d040?, 0xc0097a1530}, 0xc007895b00)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler/get.go:129 +0xe5
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: net/http.HandlerFunc.ServeHTTP(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         net/http/server.go:2109	
alexvanin added the bug, pool labels 2023-07-19 12:03:38 +00:00
alexvanin self-assigned this 2023-07-20 07:40:38 +00:00
Owner

@alexvanin Could you elaborate a bit on how free disk space is connected to this panic?

Author
Owner

> @alexvanin Could you elaborate a bit on how free disk space is connected to this panic?

The panic happened when the disk was 100% full; that is the only connection. It may be pure coincidence.

Author
Owner

This hasn't reproduced for a while, so closing.

Owner

Given that stdlib is well tested, the behavior could be the result of a data race.

pool/pool.go, line 2134 in 02c936f: https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/src/commit/02c936f397c7bb9e1e7ca4e71b548f4deaaa32a1/pool/pool.go#L2134

func (p *innerPool) connection() (client, error) {
	p.lock.RLock() // need lock because of using p.sampler
	defer p.lock.RUnlock()

Why do we have RLock here, not Lock?

Member

We don't change the p.sampler field (we only read it), which is why we use RLock.
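
For context on the lock choice: sync.RWMutex.RLock lets any number of readers hold the lock at once, so it guards the p.sampler field against concurrent replacement but does not by itself serialize calls made through it. A small sketch of that property, using a hypothetical helper simultaneousReaders that is not taken from the SDK:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// simultaneousReaders starts n goroutines that each take the read lock and
// refuse to release it until all n of them hold it. The function can only
// return if RLock admits n holders at the same time; with an exclusive
// lock it would deadlock instead.
func simultaneousReaders(n int) int {
	var (
		mu       sync.RWMutex
		arrived  sync.WaitGroup // released once every reader holds RLock
		finished sync.WaitGroup // released once every reader has unlocked
		holders  int64
	)
	arrived.Add(n)
	finished.Add(n)

	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			mu.RLock()
			defer mu.RUnlock()

			atomic.AddInt64(&holders, 1)
			arrived.Done()
			arrived.Wait() // blocks until all n readers are inside the lock
		}()
	}

	finished.Wait()
	return int(atomic.LoadInt64(&holders))
}

func main() {
	fmt.Println(simultaneousReaders(4)) // prints 4
}
```

Whether this matters here depends on whether the code reached through p.sampler mutates shared state; the stack trace suggests sampler.Next advances a random number generator, which is a write even though the field itself is only read.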

Reference: TrueCloudLab/frostfs-sdk-go#125