When AddByPath() is called concurrently on 2 different nodes,
internal path components may be created twice. This violates some
of our assumptions in GetByPath() and, indirectly, in S3 handling of
GetSubTree() results.
Add a test for the correct behaviour, fixes will follow.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
`fmt.Errorf can be replaced with errors.New` and `fmt.Sprintf can be replaced with string addition`
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
* If treeID is empty then deleting buckets for cursor may get
invalidated. So, buckets should be gathered before deleting.
Signed-off-by: Airat Arifullin <a.arifullin@yadro.com>
Some of our pilorama tests fail on CI.
The reasons are not obvious, but one possible improvement
is using `WithNoSync` option for these. It should have much effect,
because we are writing on the tmpfs, but doesn't hurt anyway.
If I replace `t.TempDir()` with a local directory, test execution time
goes down from 5s (sync) to 0.4s (nosync), which is the same time as
with `t.TempDir()`. Maybe we have some strange CI configuration.
```
panic: test timed out after 10m0s
running tests:
TestForest_ApplyRandom (8m22s)
TestForest_ApplyRandom/bbolt (8m21s)
...
goroutine 170 [syscall]:
syscall.Syscall(0xc000100000?, 0xc00047b758?, 0x6aff9a?, 0xc00041c1b0?)
/opt/hostedtoolcache/go/1.20.7/x64/src/syscall/syscall_linux.go:69 +0x27
syscall.Fdatasync(0x9e35c0?)
/opt/hostedtoolcache/go/1.20.7/x64/src/syscall/zsyscall_linux_amd64.go:418 +0x2a
go.etcd.io/bbolt.fdatasync(0xc000189000?)
```
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
1. In redo() we save the old state.
2. If we do redo() for the same operation twice, the old state will be
overritten with the new one.
3. This in turn affects undo() and subsequent isAncestor() check.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Memory forest is here to check the correctness of boltdb optimized
implementation. Let's keep it simple.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Initially it was there to check whether an update is being initiated by
a proper node. It is now obsolete for 2 reasons:
1. Background synchronization fetches all operations from a single node.
2. There are a lot more problems with trust in the tree service, it is
only used in controlled environments.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
To achieve high performance we must choose proper values for both
batch size and delay. For user operations we want to set low delay.
However it would prevent tree synchronization operations to form big
enough batches. For these operations, batching gives the most benefit
not only in terms of on-CPU execution cost, but also by speeding up
transaction persist (`fsync`).
In this commit we try merging batches that are already
_triggered_, but not yet _started to execute_. This way we can still
query batches for execution after the provided delay while also allowing
multiple formed batches to execute faster.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>