Do not change shard mode to `DEGRADED_READ_ONLY` in case of `no space left` from blobovnicza #1166
4 participants
Reference: TrueCloudLab/frostfs-node#1166
Now the engine doesn't change the shard mode on `no space left` errors when an error threshold is defined.
118da6a174 to e916b5765c
@ -80,2 +75,4 @@
res, err = b.deleteObjectFromLevel(ctx, bPrm, p)
if err != nil {
if isErrNoSpaceLeft(err) {
return false, common.ErrNoSpace // stop iteration if no space left
It is tricky: one db may have no space because it wanted to do a remap, another one can have a large freelist with already allocated memory.
If we exit prematurely, we can make it harder to free space, e.g. there would be no place to put a tombstone.
No longer relevant.
@ -84,3 +82,4 @@
i.B.log.Debug(logs.BlobovniczatreeCouldNotGetActiveBlobovnicza,
zap.String("error", err.Error()),
zap.String("trace_id", tracingPkg.GetTraceID(ctx)))
} else if isErrNoSpaceLeft(err) { // stop iteration if no space left
Why have you decided to add these handlers on the `blobovniczatree` level and not on the `blobovnicza` level? It seems easier to miss something here, and in blobovnicza we can easily ensure that each `Update` or `Batch` is wrapped, for example.
To stop iteration over databases as soon as possible.
e916b5765c to 447741ca7f
447741ca7f to c5c22c632e
c5c22c632e to f991f4d4fb
changed title from "Do not change shard mode to `DEGRADED_READ_ONLY` in case of `no space left` from blobovnicza" to "WIP: Do not change shard mode to `DEGRADED_READ_ONLY` in case of `no space left` from blobovnicza"
f991f4d4fb to 84b8e0bd41
@ -0,0 +3,4 @@
import "git.frostfs.info/TrueCloudLab/frostfs-node/pkg/local_object_storage/util/logicerr"
// ErrNoSpace returned if blobovnicza failed to perform an operation because of syscall.ENOSPC.
var ErrNoSpace = logicerr.New("no space left on device with blobovnicza")
To avoid using blobstor's ErrNoSpace: blobstor should depend on blobovnicza, not vice versa.
changed title from "WIP: Do not change shard mode to `DEGRADED_READ_ONLY` in case of `no space left` from blobovnicza" to "Do not change shard mode to `DEGRADED_READ_ONLY` in case of `no space left` from blobovnicza"
achuprov referenced this pull request 2024-06-07 10:54:41 +00:00
@ -95,6 +97,8 @@ func (b *Blobovnicza) Put(ctx context.Context, prm PutPrm) (PutRes, error) {
})
if err == nil {
b.itemAdded(recordSize)
} else if errors.Is(err, syscall.ENOSPC) {
Any modifying method can allocate new pages, even delete.
I thought about it, but have found this comment of contributor: https://github.com/etcd-io/bbolt/issues/288#issuecomment-919971605
Anyway, ok, I will fix it.
Done
The comment has a different context (shrinking the DB). We have already experienced situations where deletions led to a DB remap, which in turn led to a deadlock (with the write-cache).
84b8e0bd41 to 815e87df74
815e87df74 to 6cf512e574
@ -110,2 +111,3 @@
}
if errors.Is(err, blobovnicza.ErrNoSpace) {
i.AllFull = true
Again, do we exit if we received this error from at least 1 blobovnicza? Until we have vacuum I think it is not worth having this optimization, as other blobovniczas may still have free pages.
No, blobstor will try all databases:
I don't understand, why do we need this change in this PR? Is something wrong without it?
Without this change, the blobovnicza tree will return a non-logical error:
return common.PutRes{}, errPutFailed
So the shard will increase its error counter.
Hm, but why does `iterateDeepest` return a non-nil error?
By design.