Added `frostfs-cli object locate` subcommand. It lists info
about shards storing an object.
Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
Added method `ListShardsForObject` to ControlService and to
StorageEngine. It returns information about the shards storing
an object on the node.
Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
There is a race condition when multiple cache operations try to report
the cache size metrics simultaneously. Consider the following example:
- the initial total size of objects stored in the cache is 2
- worker X deletes an object and reads the cache size, which is 1
- worker Y deletes an object and reads the cache size, which is 0
- worker Y reports the cache size it learnt, which is 0
- worker X reports the cache size it learnt, which is 1
As a result, the observed cache size is 1 (i.e. one object remains
in the cache), which is incorrect because the actual cache size is 0.
To fix this, let's report the metrics periodically in the flush loop.
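A minimal sketch of the idea, not the actual implementation: workers
only update an atomic counter, and a single flush loop is the sole
reporter of the metric, so a stale value cannot overwrite a newer one.
The names `sizeGauge`, `cache`, `flushLoop` and `runDemo` are
hypothetical, and the final report on shutdown is an assumption made
to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// sizeGauge is a hypothetical stand-in for the real metrics sink.
type sizeGauge struct{ v atomic.Int64 }

func (g *sizeGauge) Set(n int64) { g.v.Store(n) }
func (g *sizeGauge) Get() int64  { return g.v.Load() }

// cache keeps its size in an atomic counter; workers never touch the gauge.
type cache struct {
	size  atomic.Int64
	gauge *sizeGauge
}

// deleteObject only updates the counter and reports nothing.
func (c *cache) deleteObject() { c.size.Add(-1) }

// flushLoop is the only goroutine that reports the size, so reports
// are ordered and each one carries the size read at report time.
func (c *cache) flushLoop(done <-chan struct{}, period time.Duration) {
	t := time.NewTicker(period)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			c.gauge.Set(c.size.Load())
		case <-done:
			// Assumption: report once more on shutdown.
			c.gauge.Set(c.size.Load())
			return
		}
	}
}

// runDemo replays the example: size 2, workers X and Y delete one object each.
func runDemo() int64 {
	c := &cache{gauge: &sizeGauge{}}
	c.size.Store(2)

	done := make(chan struct{})
	var loop sync.WaitGroup
	loop.Add(1)
	go func() {
		defer loop.Done()
		c.flushLoop(done, time.Millisecond)
	}()

	var workers sync.WaitGroup
	for i := 0; i < 2; i++ {
		workers.Add(1)
		go func() {
			defer workers.Done()
			c.deleteObject()
		}()
	}
	workers.Wait()

	close(done)
	loop.Wait()
	return c.gauge.Get()
}

func main() {
	fmt.Println(runDemo()) // prints 0
}
```

The observed value may lag behind the real size by up to one period,
but it always converges to the actual size.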
Signed-off-by: Aleksey Savchuk <a.savchuk@yadro.com>
Previously, when opening the database failed, the `context cancelled`
error was returned along with the original error, because the
`iterateIncompletedRebuildDBPaths` method checks `ctx.Done()` and
egCtx was passed to it.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
If applyOperationStream() exits prematurely, other goroutines will block
on send and errgroup will never finish waiting. In this commit we also
check whether the context is cancelled.
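A minimal sketch of the pattern, assuming an unbuffered channel and
using `sync.WaitGroup` from the standard library in place of errgroup.
The names `send` and `runPipeline` are hypothetical and are not taken
from the actual code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// send delivers v on ch unless ctx is cancelled first, so a sender
// cannot block forever once the receiver is gone.
func send(ctx context.Context, ch chan<- int, v int) error {
	select {
	case ch <- v:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runPipeline starts the given number of producers and a consumer that
// exits after the first value. It returns how many producers were
// released by cancellation instead of a successful send.
func runPipeline(producers int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	ch := make(chan int) // unbuffered: a send blocks until someone receives
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		cancelled int
	)
	for i := 0; i < producers; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			if err := send(ctx, ch, v); errors.Is(err, context.Canceled) {
				mu.Lock()
				cancelled++
				mu.Unlock()
			}
		}(i)
	}

	<-ch      // the consumer reads one value and exits prematurely
	cancel()  // without this, the remaining producers would block on send
	wg.Wait() // returns because every blocked send observes ctx.Done()
	return cancelled
}

func main() {
	fmt.Println(runPipeline(3)) // prints 2
}
```

With a plain `ch <- v` instead of the `select`, `wg.Wait()` would
never return in this example.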
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
The constant string `testOwnerID` used in these tests has an invalid
format. It has 11 bytes instead of the 25 required for `user.ID`.
It worked because:
1. `user.ID` was a byte slice and didn't check length
and format of byte slices decoded from strings.
2. in these tests `testOwnerID` was used only to decode
container owner id and to compare it with owner id encoded
back to string.
Since the `user.ID` implementation has changed, the problem arose.
Now `testOwnerID` is valid.
Signed-off-by: Ekaterina Lebedeva <ekaterina.lebedeva@yadro.com>
The current flow is hard to reason about; #1601 is a notorious example of
accidental complexity.
1. Remove multiple nested ifs, use depth=1.
2. Process each status exactly once, hopefully preventing bugs like
#1601.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Consider `REP 1 REP 1` placement (selects/filters are omitted).
The placement is `[1, 2], [1, 0]`. We are the 0-th node.
Node 1 is under maintenance, so we do not replicate the object
to node 2. In the second replication group node 1 is also under maintenance,
but the current caching logic considers it a "replica holder" and removes
the local copy. Voilà, we have data loss (DL) if the object is missing from node 1.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
The node can have MAINTENANCE status in the network map, but can also be
ONLINE while responding with MAINTENANCE. These are 2 different code
paths, so let's test them separately.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>