The shard errors count was increased when a "file not found" error occurred when flushing the cache. #98

Closed
opened 2023-03-08 23:30:59 +00:00 by snegurochka · 2 comments

Original issue: https://github.com/nspcc-dev/neofs-node/issues/2202

Expected Behavior

The shard error count should not increase when a "file not found" error occurs while flushing the cache, because this is a logical error.

Current Behavior

The shard error count increases when a "file not found" error occurs while flushing the cache.

Steps to Reproduce (for bugs)

  1. Stop neofs-storage service
  2. Clear all data from writecache
  3. Start neofs-storage service
  4. See errors in log:
Jan 17 15:09:16 glagoli neofs-node[821701]: 2023-01-17T15:09:16.244Z        warn        engine/engine.go:160        can't read a file        {"shard_id": "CRcrDrZwLCYBShtfWCKHKa", "error count": 106, "error": "open /srv/neofs/meta0/write_cache8/5/wS8sqzsxwguVwqwNWdMKiyPywrbXqghvdHahSo54WVW.FWb1rLnqmcazDq5Z9cB2BzspgwK5Bap9FxEfn6EiCtRM: no such file or directory"}
  5. The shard moved to degraded-read-only when the error count reached the threshold.
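The last step can be illustrated with a small sketch of a threshold-based mode switch. Everything here is an assumption for illustration: the `shard` type, its fields, and the threshold value of 100 are hypothetical and do not come from the neofs-node sources or configuration. Only the error count of 106 is taken from the log above.

```go
package main

import "fmt"

// mode is a hypothetical, simplified shard mode.
type mode string

const (
	modeReadWrite        mode = "read-write"
	modeDegradedReadOnly mode = "degraded-read-only"
)

// shard is a hypothetical stand-in, not the neofs-node type.
type shard struct {
	errorCount uint32
	threshold  uint32 // assumed to be configurable; 0 disables the switch
	mode       mode
}

// reportError counts one error and switches the mode once the
// threshold is reached.
func (s *shard) reportError() {
	s.errorCount++
	if s.threshold > 0 && s.errorCount >= s.threshold && s.mode == modeReadWrite {
		s.mode = modeDegradedReadOnly
	}
}

func main() {
	// The threshold of 100 is illustrative only.
	s := &shard{threshold: 100, mode: modeReadWrite}

	// Every missing write-cache file is reported as a shard error.
	for i := 0; i < 106; i++ {
		s.reportError()
	}
	fmt.Println(s.errorCount, s.mode)
}
```

Under these assumptions, a cleared write cache alone is enough to degrade an otherwise healthy shard, which is why the missing-file case should not be counted.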

Version:

NeoFS Storage node
Version: v0.34.0-149-ge5f1af53-dirty
GoVersion: go1.18.4

Your Environment

Server setup and configuration:
HW, 4 servers, 4 SN, 4 http gw, 4 s3 gw

Operating System and version (uname -a):
linux vedi 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64 GNU/Linux

carpawell self-assigned this 2023-03-16 17:05:32 +00:00

@anikeev-yadro, are you sure that the storage cleaning was done after the service had been stopped? Is the issue reproducible? Was there any load?


@carpawell Yes, I'm sure. The issue is not reproduced on this version:

FrostFS Storage node
Version: v0.0.1-205-g9929dcf5
GoVersion: go1.18.4
Reference: TrueCloudLab/frostfs-node#98