Possible panics when client has zero free space on disk

alexvanin commented

2023-07-19 12:03:38 +00:00

Owner

Not sure this is an issue we have to solve, but it would be nice to look at it anyway.

When the client's hard disk is out of space, the S3 gateway may panic like this:

Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: 2023-07-18T14:51:45.252Z        info        http/server.go:3228        http: panic serving 10.78.70.146:44480: runtime error: index out of range [-1]
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: goroutine 1382670212 [running]:
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: net/http.(*conn).serve.func1()
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         net/http/server.go:1850 +0xbf
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: panic({0x1012500, 0xc009dee0a8})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         runtime/panic.go:890 +0x262
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*rngSource).Uint64(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rng.go:249
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*rngSource).Int63(0x40dc27?)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rng.go:234 +0x92
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*Rand).Int63(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rand.go:84
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: math/rand.(*Rand).Float64(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         math/rand/rand.go:195
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*sampler).Next(0xc00ac187c0)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/sampler.go:64 +0x54
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*innerPool).connection(0xc00091ac80)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/pool.go:1870 +0x1c5
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*Pool).connection(0x0?)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/pool.go:1849 +0x56
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-sdk-go/pool.(*Pool).GetContainer(0x0?, {0x145ec98, 0xc0097a1500}, {{0x58, 0xfe, 0x6c, 0x16, 0x93, 0x3c, 0xa2, ...}})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-sdk-go@v0.0.0-20230329125804-552219b8e130/pool/pool.go:2344 +0x5d
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/internal/frostfs.(*FrostFS).Container(0xc001bca1f8, {0x145ec98, 0xc0097a1500}, {0x58, 0xfe, 0x6c, 0x16, 0x93, 0x3c, 0xa2, ...})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/internal/frostfs/frostfs.go:95 +0xef
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer.(*layer).containerInfo(0xc00031e000, {0x145ec98?, 0xc0097a1500}, {0x58, 0xfe, 0x6c, 0x16, 0x93, 0x3c, 0xa2, ...})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer/container.go:45 +0x345
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer.(*layer).GetBucketInfo(0xc00031e000, {0x145ec98, 0xc0097a1500}, {0xc014e28705?, 0xc0097a1590?})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/layer/layer.go:358 +0x351
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler.(*handler).getBucketAndCheckOwner(0x145ec98?, 0xc007895b00, {0xc014e28705?, 0x4a?}, {0x0, 0x0, 0x145?})
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler/util.go:63 +0x6c
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler.(*handler).GetObjectHandler(0xc0004c0450, {0x145d040?, 0xc0097a1530}, 0xc007895b00)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         git.frostfs.info/TrueCloudLab/frostfs-s3-gw/api/handler/get.go:129 +0xe5
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]: net/http.HandlerFunc.ServeHTTP(...)
Jul 18 14:51:45 vedi frostfs-s3-gw[3148786]:         net/http/server.go:2109


alexvanin added the

bug

pool

labels 2023-07-19 12:03:38 +00:00

alexvanin self-assigned this 2023-07-20 07:40:38 +00:00

fyrchik commented

2023-08-18 08:22:26 +00:00

Owner

@alexvanin Could you elaborate a bit on how free disk space is connected to this panic?

alexvanin commented

2023-08-21 08:26:23 +00:00

Author

Owner

> @alexvanin Could you elaborate a bit on how free disk space is connected to this panic?

The panic happened when the disk was 100% full; that is the only connection. It may be pure coincidence.

alexvanin commented

2023-09-28 14:39:30 +00:00

Author

Owner

It hasn't reproduced for a while, so closing.

alexvanin closed this issue

2023-09-28 14:39:30 +00:00

fyrchik commented

2024-05-08 11:05:31 +00:00

Owner

Given that the stdlib is well tested, this behavior could be the result of a data race.

https://git.frostfs.info/TrueCloudLab/frostfs-sdk-go/src/commit/02c936f397c7bb9e1e7ca4e71b548f4deaaa32a1/pool/pool.go#L2134

```golang
func (p *innerPool) connection() (client, error) {
	p.lock.RLock() // need lock because of using p.sampler
	defer p.lock.RUnlock()
```

Why do we have `RLock` here, not `Lock`?
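To illustrate the concern: a read lock does not serialize its holders, so any number of goroutines can be inside `connection()` at the same moment and call `p.sampler.Next()` together. The trace above ends inside `math/rand.(*rngSource).Uint64`, which, as far as I can tell, steps indices into its internal state array on every call, so unsynchronized concurrent calls could plausibly push an index to -1. The sketch below is not taken from the SDK; `concurrentReaders` is a hypothetical helper that only shows several goroutines holding the same `RLock` simultaneously.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// concurrentReaders starts n goroutines that all take the same read lock and
// then wait for each other while still holding it. The function can only
// return if all n goroutines held RLock at the same time; with an exclusive
// Lock in place of RLock it would deadlock.
func concurrentReaders(n int) int {
	var (
		mu      sync.RWMutex
		entered atomic.Int32   // goroutines that acquired the read lock
		arrived sync.WaitGroup // barrier: released once all n are inside
		done    sync.WaitGroup
	)
	arrived.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			mu.RLock()
			defer mu.RUnlock()
			entered.Add(1)
			arrived.Done()
			arrived.Wait() // every goroutine is inside the read lock here
		}()
	}
	done.Wait()
	return int(entered.Load())
}

func main() {
	fmt.Println("goroutines holding RLock at once:", concurrentReaders(5))
}
```

If the code executed under the read lock mutates state, as a random number generator does on each call, `RLock` alone gives no mutual exclusion for that state.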

dkirillov commented

2024-05-08 11:16:02 +00:00

Member

We don't change the `p.sampler` field (we only read it), which is why we use `RLock`.
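Reading the pointer is indeed safe under `RLock`, but the object behind it may still be mutated: a `*rand.Rand` built with `rand.NewSource` is documented as not safe for concurrent use, and every call advances its internal state. A minimal sketch of one possible mitigation follows, assuming the sampler owns its own mutex. `lockedSampler`, `newLockedSampler`, and `hammer` are hypothetical names, and `Intn` stands in for the real sampling logic; this is not the SDK's implementation.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"sync/atomic"
)

// lockedSampler is a hypothetical stand-in for the pool's sampler.
// It guards the generator with its own mutex, so callers that only hold
// a read lock on the pool still get exclusive access to the RNG state.
type lockedSampler struct {
	mu  sync.Mutex
	rnd *rand.Rand // not safe for concurrent use without mu
	n   int        // number of items to choose from
}

func newLockedSampler(n int, seed int64) *lockedSampler {
	return &lockedSampler{rnd: rand.New(rand.NewSource(seed)), n: n}
}

// Next returns an index in [0, n). Uniform choice is a simplification;
// a weighted sampler would do its own computation under the same lock.
func (s *lockedSampler) Next() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.rnd.Intn(s.n)
}

// hammer calls Next from many goroutines and counts out-of-range results.
func hammer(s *lockedSampler, goroutines, calls int) int {
	var wg sync.WaitGroup
	var bad atomic.Int64
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < calls; i++ {
				if v := s.Next(); v < 0 || v >= s.n {
					bad.Add(1)
				}
			}
		}()
	}
	wg.Wait()
	return int(bad.Load())
}

func main() {
	s := newLockedSampler(4, 1)
	fmt.Println("out-of-range results:", hammer(s, 16, 1000))
}
```

Running a program like this with `go run -race` should report no data race, whereas dropping the mutex from `Next` would be expected to trigger the race detector.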

fyrchik referenced this issue

2024-05-08 11:53:11 +00:00

pool: Sampler isn't protected by mutex #221

Possible panics when client has zero free space on disk #125