Commit graph

518 commits

Author SHA1 Message Date
Evgenii Stratonikov
4e361e2ab4 [] writecache: Open DB in sync mode
Close .

Initially, `Sync: false` was provided because we can already lose
objects cached in memory. However, future changes in writecache will
remove inmemory cache and speed up it via other means.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-08-16 15:27:18 +04:00
Pavel Karpy
da2975a2f9 [] write-cache: Fix panic on Delete operation
If an object is found in the Write-cache and is placed at the end of
the in-memory cache, the memory counter update operation tries to
dereference the index that is out of the sliced array. Moreover, even if
panic does not appear, the counter is updated with the wrong value.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-08-09 20:18:26 +03:00
Pavel Karpy
a5da665e69 [] meta: Fix MatchCommonPrefix operation
Return all the objects on the empty common prefix search without search
optimizations that breaks boltDB logic.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-08-04 16:32:00 +03:00
Pavel Karpy
156ba85326 [] node: Do not return expired objects
If an object has not been marked for removal by the GC in the current epoch
yet but has already expired, respond with `ErrObjectNotFound` api status.
Also, optimize shard iteration: a node must stop any iteration if the object
 is found but gonna be removed soon.
All the checks are performed by the Metabase.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-08-04 16:31:49 +03:00
Pavel Karpy
9aba0ba512 [] meta: Add epoch state
It allows performing expiration checks on the stored objects.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-08-04 16:31:49 +03:00
Evgenii Stratonikov
8ffc2fdf5e [] engine: Do not increase error counter if the pilorama is disabled
After a4adb79db new logical error could be returned. Do not increase
error counter in this case.

Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-07-30 18:39:22 +03:00
Evgenii Stratonikov
8d0884e74f [] storagelog: Fix doc comment
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-07-30 18:39:08 +03:00
Evgenii Stratonikov
4abe5a7245 shard: add more checks for GetRange parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-07-29 11:22:04 +03:00
Evgenii Stratonikov
72586f17d4 shard: fix GetRange for objects stored in the write-cache
Signed-off-by: Evgenii Stratonikov <evgeniy@morphbits.ru>
2022-07-29 11:22:04 +03:00
Evgenii Stratonikov
1691364653 [] local_object_storage: Fix tests and some data races
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
50e28f22f9 [] shard: Change Degraded mode string representation
It is a flag, but is a `degraded-read-write` mode for a user.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Leonard Lyubich
fabe717d32 [] shard: Turn to read-only mode on metabase failure
If metabase can't be opened in the default mode, try opening shard
first in `ReadOnly` mode and then in `DegradedReadOnly`.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
4944490ffb [] local_object_storage: Move shard to the DegradedReadOnly mode
`Degraded` mode can be set by the administrator if needed.
Modifying operations in this mode can lead node into an inconsistent state
because metabase checks such as lock checking are not performed.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
3df62769c0 [] local_object_storage: Allow to set mode for all components
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
1e786233bf [] local_object_storage: Provide readOnly flag to Open
We should be able to reopen storage in readonly in runtime.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Leonard Lyubich
e38b0aa4ba [] engine: Disable shard on blobovnicza init failure
There is a need to support working w/o shard if it has problems with
blobovnicza tree.

Make `BlobStor.Init` to return new `ErrInitBlobovniczas` error. Remove
shard from storage engine's shard set if it returned this error from
`Init` call. So if some of the shards (but not all) return this error,
the node will be able to continue working without them.

Signed-off-by: Leonard Lyubich <leonard@nspcc.ru>
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
40a56c6b42 [] engine: Do not count logical errors as storage ones
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
c8911d72d0 [] shard: Do not consult metabase in a degraded mode
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
339864b720 [] local_object_storage: Move shard.Mode to a separate package
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
d8ba954aff [] shard: Use Set prefix for parameter setting
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
7b882b26d8 [] shard: Remove public functions
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
34d319fed2 [] metabase: Use Set prefix for parameter setting
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
f58234aa2f [] metabase: Remove public functions
Reduce public interface of this package. Later each result will contain
an additional status, so it makes more sense to use the same functions
and result processing everywhere.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 17:56:06 +03:00
Evgenii Stratonikov
a4adb79db7 [] pilorama: Enable tree service explicitly
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
d62723f038 [] pilorama: Provide timeout to bbolt.Open
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
26041f18bf [] pilorama: Allow to customize database parameters
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
c7b10598f9 [] engine: Increase error counter for pilorama errors
1. Modifying operations are not expected to fail, unless the shard is
   read-only.
2. `Get*` operations should increase error counter too, unless the
   error is `ErrTreeNotFound`.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
5e843a73f9 [] services/control: Return pilorama info in ListShards RPC
Do not return backend type from the service for now, because memory
backend is expected to vanish.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8f4ee1aded [] local_object_storage: Support ReadOnly mode in pilorama
The tricky part here is the engine itself: we stop iteration on
`ErrReadOnly` because it is better to synchronize the shard later than
to have partial trees stored in 2 shards.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
735931c842 [] pilorama: Fix TreeApply
Current implementation prevents invalid operations to become valid at
some later point (consider adding a child to the non-existent parent and
then adding the parent). This seems to diverge from the paper algorithm
and complicates implementation. Make it simpler.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
ad3038d16d [] pilorama: Fix TreeMove in bbolt backend
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8027b7bb6b [] pilorama: Optimize internal encoding/decoding
```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     55.5µs ± 4%    55.5µs ± 3%     ~     (p=1.000 n=10+7)
ApplyReorderLast/bbolt-8     108µs ± 6%     112µs ± 8%     ~     (p=0.077 n=9+9)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     28.8kB ± 3%    27.7kB ± 6%   -3.79%  (p=0.005 n=10+10)
ApplyReorderLast/bbolt-8    41.4kB ± 5%    38.9kB ± 5%   -6.19%  (p=0.001 n=10+9)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        262 ± 2%       235 ±10%  -10.41%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8       684 ± 6%       616 ± 7%  -10.04%  (p=0.000 n=10+9)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
4437cd7113 [] pilorama: Generate timestamp based on node position in the container
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
3312924b82 [] pilorama: Use Batch for write transactions
Helps a lot in case of concurrent request flow.

```
name                      old time/op    new time/op    delta
ApplySequential/bbolt-8     78.0µs ± 9%    59.8µs ± 4%  -23.39%  (p=0.000 n=10+9)
ApplyReorderLast/bbolt-8     143µs ± 5%     113µs ±15%  -21.06%  (p=0.000 n=10+10)

name                      old alloc/op   new alloc/op   delta
ApplySequential/bbolt-8     56.9kB ± 8%    28.9kB ± 3%  -49.22%  (p=0.000 n=10+10)
ApplyReorderLast/bbolt-8    87.3kB ± 3%    40.9kB ±10%  -53.16%  (p=0.000 n=10+10)

name                      old allocs/op  new allocs/op  delta
ApplySequential/bbolt-8        224 ±11%       262 ± 5%  +16.93%  (p=0.000 n=9+10)
ApplyReorderLast/bbolt-8       518 ± 4%       674 ±11%  +30.09%  (p=0.000 n=10+10)
```

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
f0a67f948d [] pilorama: Cache attributes in the index
Currently to find a node by path we iterate over all the children on
each level. This is far from optimal and scales badly with the number of
nodes on a single level. Thus we introduce "indexed attributes" for
which an additional information is stored and which can be use in
`*ByPath` operations. Currently this set only includes `FileName`
attribute but this may change in future.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
f5d35571d0 [] engine: Add benchmark for Select vs TreeGetByPath
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
835170e452 [] pilorama: Allow to benchmark all tree backends
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
536857ea5a [] services/tree: Implement GetOpLog RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
7703dd5d7f [] pilorama: Create new nodes in path if needed
Consider a node `{FileName: "dir", Attribute: "xxx"}`. In case we add
a new node by path `["dir", "file.txt"]`, create a new intermediate node
with a single attribute.

`GetByPath` now also considers only nodes with a single attribute while building a path.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
ad48918a97 [] pilorama: Return parent from TreeGetMeta
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8844d9b2db [] pilorama: Document errors for Get* methods
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
d8ad68d613 [] engine: Log errors in Tree* operations
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
910db42748 [] pilorama: Use require.ErrorIs in tests
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
aea855e8f3 [] services/tree: Implement GetSubTree RPC
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
46f4ce2773 [] engine: Implement Forest interface for storage engine
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
8cf71b7f1c [] local_object_storage: Implement tree service backend
In this commit we implement algorithm for CRDT trees from
https://martin.klepmann.com/papers/move-op.pdf

Each tree is identified by the ID of a container it belongs to
and the tree name itself. Essentially, it is a sequence of operations
which should be applied in chronological order to get a usual tree
representation.

There are 2 backends for now: bbolt database and in-memory.
In-memory backend is here for debugging and will eventually act
as a memory-cache for the on-disk database.

Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-21 15:08:24 +03:00
Evgenii Stratonikov
11c9df6f00 [] shard: Print shard ID in component logs
Signed-off-by: Evgenii Stratonikov <evgeniy@nspcc.ru>
2022-07-19 15:57:53 +03:00
Pavel Karpy
9f7a22e2aa [] object: Return OUT_OF_RANGE status
Replace `ErrRangeOutOfBounds` error from `pkg/core/object` package with
`ObjectOutOfRange` from `apistatus` package. That error is returned by
storage node's server as NeoFS API statuses.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-19 13:29:45 +03:00
Pavel Karpy
cb0bb7207c [] shard: Add a separate ErrLockObjectRemoval
Do not return `meta.ErrLockObjectRemoval` from shard's methods, add shard's
own error for that.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00
Pavel Karpy
558cc1193a [] engine: Clarify force removal
Document force removal behaviour in all the Storage engine parts.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
2022-07-18 11:42:25 +03:00