All operations must ensure the shard is not in a degraded mode.
Write operations must also ensure the shard is not in a read-only mode.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
With non-blocking pool restricted by 10 in capacity, the probability of
dropping events is unexpectedly big. Notifications are an essential part of the FrostFS,
we should not drop anything, especially new epochs.
```
Mar 31 07:07:03 vedi neofs-ir[19164]: 2023-03-31T07:07:03.901Z debug subscriber/subscriber.go:154 new notification event from sidechain {"name": "NewEpoch"}
Mar 31 07:07:03 vedi neofs-ir[19164]: 2023-03-31T07:07:03.901Z warn event/listener.go:248 listener worker pool drained {"chain": "morph", "capacity": 10}
```
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
It led to a neo-go dead-lock in the `subscriber` component. Subscribing to
notifications is the same RPC as any others, so it could also be blocked
forever if no async listening (reading the notification channel) routine
exists. If a number of subscriptions is big enough (or a caller is lucky
enough) subscribing loop might have not finished subscribing before the
first notification is received and then: subscribing RPC is blocked by
received notification (non)handling and listening notifications routine is
blocked by not finished subscription loop.
That commit starts listening notification channel _before_ any subscription
actions.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
We have already had and solved plenty of deposit issues and notary balance
is a really important thing. Deserves to be INFO even before the huge logs
severity refactor, happens on an app start only.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
It does not use deprecated methods anymore but also adds more code that
removes. Future refactor that will affect more components will optimize
usage of the updated API.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
A directory is read and files are saved to a local variable. The iteration
over such files may lead to a non-existing files reading due to a normal SN
operation cycle and, therefore, may lead to a returning the OS error to a
caller. Skip just removed (or lost) files as the golang std library does in
similar situations:
5f1a0320b9/src/os/dir_unix.go (L128-L133).
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
In case of session token (ST) with object IDs search should
return only objects allowed in static session
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
Initially it was there to check whether an update is being initiated by
a proper node. It is now obsolete for 2 reasons:
1. Background synchronization fetches all operations from a single node.
2. There are a lot more problems with trust in the tree service, it is
only used in controlled environments.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Fix sending GAS to an empty extra wallets receivers list. Also, send GAS to
extra wallets even if netmap is empty.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
GC deletes expired locks and objects sequentially. Expired locks and
objects are now being deleted concurrently in batches. Added a config
parameter that controls the number of concurrent workers and batch size.
Signed-off-by: Dmitrii Stepanov <d.stepanov@yadro.com>
This reverts commit 2567f8020e. It assumes
that assembling logic could break some failover scenarios if request
forwarding is done. However, it also breaks requesting big objects via a
non-container node with TTL=2. Failover has been rechecked without that
commit and no problems were found. Any (if found) other bugs related to
the forwarding and object assembling must be solved more carefully.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Allow replication of any (expired too) locked object. Information about
object locking is considered to be presented on the _container nodes_.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Previously a token could've expired in the middle of an object.PUT
stream, leading to upload being interrupted. This is bad, because user
doesn't always now what is the right values for the session token
lifetime. More than that, setting it to a very high value will
eventually blow up the session token database.
In this commit we read the session token once and reuse it for the whole
stream duration.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
It will prevent test fails with `-race` flag on components that have
background processes and make some actions on test framework.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
The problem is that accidental timeout errors can make us to ignore
other nodes for some time. The primary purpose of the whole ignore
mechanism is not to degrade in case of failover. For this case,
closing connection and limiting the amount of dials is enough.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
In case we have many small objects in the write-cache, `indices` should
not be reused between iterations.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Missing `ReportError` method did not allow casing multi-client interface to
`errorReporter` interface and dropping broken connections.
`replicationClient` embeds that interface, and it is widely used across
node's code. Embedded interface does not allow casting its parent structure
to `errorReporter` and breaks multi client error reporting logic.
Multi-client scheme is extremely hard to maintain, it makes unpredictable
casts and does not allow tracking code flow, so it will be refactored in the
future anyway.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Previously, node could get an "infinite" small object: it could be expired
and thus could not be flushed (update its storage ID) to metabase => could
not be marked as flushed => node never removes such object and repeat all
the cycle one more time. If object exists and is not marked with GC (meta
returns `ErrObjectIsExpired`, not `ObjectNotFound` and not
`ObjectAlreadyRemoved`), its ID is safe to update _in the same_ bbolt
transaction.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
LRU `Peek`/`Contains` take LRU mutex _inside_ of a `View` transaction.
`View` transaction itself takes `mmapLock` [1], which is lifted after tx
finishes (in `tx.Commit()` -> `tx.close()` -> `tx.db.removeTx`)
When we evict items from LRU cache mutex order is different:
first we take LRU mutex and then execute `Batch` which _does_ take
`mmapLock` in case we need to remap. Thus the deadlock.
[1] 8f4a7e1f92/db.go (L708)
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
To achieve high performance we must choose proper values for both
batch size and delay. For user operations we want to set low delay.
However it would prevent tree synchronization operations to form big
enough batches. For these operations, batching gives the most benefit
not only in terms of on-CPU execution cost, but also by speeding up
transaction persist (`fsync`).
In this commit we try merging batches that are already
_triggered_, but not yet _started to execute_. This way we can still
query batches for execution after the provided delay while also allowing
multiple formed batches to execute faster.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
"Object is expired" means that object is presented in `meta` but it is not
`ObjectNotFound` error. Previous implementation made `shard` search for an
object without `meta` which was an error.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
```
name old time/op new time/op delta
_addressFromString-8 1.25µs ±30% 1.02µs ± 6% -18.49% (p=0.000 n=9+9)
name old alloc/op new alloc/op delta
_addressFromString-8 352B ± 0% 256B ± 0% -27.27% (p=0.000 n=9+10)
name old allocs/op new allocs/op delta
_addressFromString-8 6.00 ± 0% 4.00 ± 0% -33.33% (p=0.000
n=10+10)
```
Also, assure compiler that `s` doesn't escape:
Before this commit:
```
./fstree.go:74:24: leaking param: s
./fstree.go:90:6: moved to heap: addr
```
After this commit:
```
./fstree.go:74:24: s does not escape
```
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Currently we track based on `PayloadSize`, because it is already stored
in the metabase and it is easier to calculate without slowing down the
whole system.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Currently the only way to tell whether `evacuate/set-mode` is finished
is to set a very big timeout and _hope_ that the operation will finish.
In this commit we add INFO logs for such operations which should
simplify the life of an administrator.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Currently, under high load clients are blocked on channel send
and the number of goroutines can increase indefinitely.
In this commit we drop replication messages if send/recv queue is full
and rely on a background synchronization.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
1. Reduce allocations inside transactions.
2. Do not encode container ID to string: it allocates a lot and takes more
space.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
Under high load we are limited by the _amount_ of keys we need to update
in a single transaction. In this commit we try storing all state
with a single key.
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
In the previous implementation any non-nil error that preceded object
fetching from blobstor led to iterating over every storage (in other words,
no storage ID information was taken into account). Now storage ID is
skipped only if metabase (storage ID source) returns any error.
Signed-off-by: Pavel Karpy <p.karpy@yadro.com>
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>