Shard moved to READ-ONLY due to errors meta info was present, but object is missing #679

Closed
opened 2023-09-07 12:38:26 +00:00 by anikeev-yadro · 2 comments

Current Behavior

Shard was moved to READ-ONLY due to errors meta info was present, but object is missing after shard mount point was broken.

Steps to Reproduce (for bugs)

  1. Stop storage node
  2. Umount disk with data
  3. Start storage node
  4. See log errors:
Sep 07 09:24:13 buky frostfs-node[1476381]: 2023-09-07T09:24:13.778Z        warn        engine/engine.go:167        meta info was present, but object is missing        {"shard_id": "GH9U7V51uubW83kvpP77NC", "error count": 133, "error": "status: code = 2049 message = object not found", "address": "5hzPmfUjGMS282Hx4ckT4CHSVCCJDobAxh3ndv7LdEyq/CuHBkEtQHQirvbqWdnuvpgqGmA5bXV2tc42ofHi2UAzx"}
Sep 07 09:24:13 buky frostfs-node[1476381]: 2023-09-07T09:24:13.779Z        info        engine/engine.go:113        shard is moved in read-only mode due to error threshold        {"shard_id": "GH9U7V51uubW83kvpP77NC", "error count": 133}

Version

FrostFS Storage node
Version: v0.0.1-883-gd9de9e2b
GoVersion: go1.20.5

Your Environment

HW

## Current Behavior Shard was moved to READ-ONLY due to errors `meta info was present, but object is missing` after shard mount point was broken. ## Steps to Reproduce (for bugs) 1. Stop storage node 2. Umount disk with data 3. Start storage node 4. See log errors: ``` Sep 07 09:24:13 buky frostfs-node[1476381]: 2023-09-07T09:24:13.778Z warn engine/engine.go:167 meta info was present, but object is missing {"shard_id": "GH9U7V51uubW83kvpP77NC", "error count": 133, "error": "status: code = 2049 message = object not found", "address": "5hzPmfUjGMS282Hx4ckT4CHSVCCJDobAxh3ndv7LdEyq/CuHBkEtQHQirvbqWdnuvpgqGmA5bXV2tc42ofHi2UAzx"} Sep 07 09:24:13 buky frostfs-node[1476381]: 2023-09-07T09:24:13.779Z info engine/engine.go:113 shard is moved in read-only mode due to error threshold {"shard_id": "GH9U7V51uubW83kvpP77NC", "error count": 133} ``` ## Version ``` FrostFS Storage node Version: v0.0.1-883-gd9de9e2b GoVersion: go1.20.5 ``` ## Your Environment HW
anikeev-yadro added the
bug
triage
labels 2023-09-07 12:38:26 +00:00
fyrchik added the
frostfs-node
label 2023-09-07 13:34:02 +00:00

I suggest making this error logical now, but don't close the task until we figure out which events lead to this situation.

I suggest making this error logical now, but don't close the task until we figure out which events lead to this situation.
fyrchik added this to the v0.37.0 milestone 2023-09-08 09:22:14 +00:00

I cannot reproduce this issue with the following steps:

  1. PUT object
Sep 08 10:24:41 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:24:41.806Z        info        log/log.go:13        local object storage operation        {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "op": "PUT", "type": "fstree", "storage_id": ""}
Sep 08 10:24:41 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:24:41.831Z        info        log/log.go:13        local object storage operation        {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "op": "metabase PUT"}
  1. delete object from file system
root@aanikeev-node1:~# mv /srv/frostfs/data1/H/L/a/o/AzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn.FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ /tmp/
  1. try to GET object with --ttl 1
service@loader02:/etc/k6$ sudo frostfs-cli object get --rpc-endpoint 172.26.161.15:8080 --wallet scenarios/files/wallet.json --cid FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ --oid HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn --ttl 1 --file /tmp/111
Enter password >
 0 / ? [----------------------------------------------------------------------------------------------=]   0.00% 2562047h47m16s
rpc error: read object header: status: code = 2049 message = object not found
  1. Those steps weren'r lead to increase shard error count
Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z        debug        get/get.go:88        serving request...        {"component": "Object.Get service", "request": "GET", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "raw": false, "local": true, "with session": false, "with bearer": false}
Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z        debug        shard/get.go:145        object is missing in write-cache        {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": false}
Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z        warn        shard/get.go:136        fetching object without meta        {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"}
Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z        debug        shard/get.go:145        object is missing in write-cache        {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true}
Sep 08 10:28:50 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:50.100Z        warn        shard/get.go:136        fetching object without meta        {"shard_id": "8w9rPMQPMbWNXDPyrnVYup", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"}
Sep 08 10:28:50 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:50.100Z        debug        shard/get.go:145        object is missing in write-cache        {"shard_id": "8w9rPMQPMbWNXDPyrnVYup", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true}
Sep 08 10:28:51 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:51.394Z        warn        shard/get.go:136        fetching object without meta        {"shard_id": "8HWWy3754pAVLbKxUys7yr", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"}
Sep 08 10:28:51 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:51.394Z        debug        shard/get.go:145        object is missing in write-cache        {"shard_id": "8HWWy3754pAVLbKxUys7yr", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true}
Sep 08 10:28:52 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:52.718Z        warn        shard/get.go:136        fetching object without meta        {"shard_id": "WdkDRoELwCb5BakjFi1g3L", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"}
Sep 08 10:28:52 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:52.718Z        debug        shard/get.go:145        object is missing in write-cache        {"shard_id": "WdkDRoELwCb5BakjFi1g3L", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true}
Sep 08 10:28:54 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:54.633Z        warn        shard/get.go:136        fetching object without meta        {"shard_id": "B1jNFFLvMc4sAswKFyLGzm", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"}
Sep 08 10:28:54 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:54.633Z        debug        shard/get.go:145        object is missing in write-cache        {"shard_id": "B1jNFFLvMc4sAswKFyLGzm", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true}
Sep 08 10:28:55 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:55.887Z        debug        get/local.go:33        local get failed        {"component": "Object.Get service", "request": "GET", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "raw": false, "local": true, "with session": false, "with bearer": false, "error": "status: code = 2049 message = object not found"}
Sep 08 10:28:55 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:55.887Z        debug        get/get.go:109        operation finished with error        {"component": "Object.Get service", "request": "GET", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "raw": false, "local": true, "with session": false, "with bearer": false, "error": "status: code = 2049 message = object not found"}
I cannot reproduce this issue with the following steps: 1. PUT object ``` Sep 08 10:24:41 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:24:41.806Z info log/log.go:13 local object storage operation {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "op": "PUT", "type": "fstree", "storage_id": ""} Sep 08 10:24:41 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:24:41.831Z info log/log.go:13 local object storage operation {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "op": "metabase PUT"} ``` 2. delete object from file system ``` root@aanikeev-node1:~# mv /srv/frostfs/data1/H/L/a/o/AzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn.FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ /tmp/ ``` 3. try to GET object with `--ttl 1` ``` service@loader02:/etc/k6$ sudo frostfs-cli object get --rpc-endpoint 172.26.161.15:8080 --wallet scenarios/files/wallet.json --cid FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ --oid HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn --ttl 1 --file /tmp/111 Enter password > 0 / ? [----------------------------------------------------------------------------------------------=] 0.00% 2562047h47m16s rpc error: read object header: status: code = 2049 message = object not found ``` 4. Those steps weren'r lead to increase shard error count ``` Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z debug get/get.go:88 serving request... {"component": "Object.Get service", "request": "GET", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "raw": false, "local": true, "with session": false, "with bearer": false} Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z debug shard/get.go:145 object is missing in write-cache {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": false} Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z warn shard/get.go:136 fetching object without meta {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"} Sep 08 10:28:48 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:48.799Z debug shard/get.go:145 object is missing in write-cache {"shard_id": "RJB2LiicwcLP9mmViAHd7P", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true} Sep 08 10:28:50 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:50.100Z warn shard/get.go:136 fetching object without meta {"shard_id": "8w9rPMQPMbWNXDPyrnVYup", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"} Sep 08 10:28:50 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:50.100Z debug shard/get.go:145 object is missing in write-cache {"shard_id": "8w9rPMQPMbWNXDPyrnVYup", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true} Sep 08 10:28:51 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:51.394Z warn shard/get.go:136 fetching object without meta {"shard_id": "8HWWy3754pAVLbKxUys7yr", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"} Sep 08 10:28:51 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:51.394Z debug shard/get.go:145 object is missing in write-cache {"shard_id": "8HWWy3754pAVLbKxUys7yr", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true} Sep 08 10:28:52 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:52.718Z warn shard/get.go:136 fetching object without meta {"shard_id": "WdkDRoELwCb5BakjFi1g3L", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"} Sep 08 10:28:52 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:52.718Z debug shard/get.go:145 object is missing in write-cache {"shard_id": "WdkDRoELwCb5BakjFi1g3L", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true} Sep 08 10:28:54 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:54.633Z warn shard/get.go:136 fetching object without meta {"shard_id": "B1jNFFLvMc4sAswKFyLGzm", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn"} Sep 08 10:28:54 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:54.633Z debug shard/get.go:145 object is missing in write-cache {"shard_id": "B1jNFFLvMc4sAswKFyLGzm", "addr": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "skip_meta": true} Sep 08 10:28:55 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:55.887Z debug get/local.go:33 local get failed {"component": "Object.Get service", "request": "GET", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "raw": false, "local": true, "with session": false, "with bearer": false, "error": "status: code = 2049 message = object not found"} Sep 08 10:28:55 aanikeev-node1 frostfs-node[960]: 2023-09-08T10:28:55.887Z debug get/get.go:109 operation finished with error {"component": "Object.Get service", "request": "GET", "address": "FQQ9k4RY7oQD8PKgoRPnaQwssK1Y6yWajaHfQEVUZwkQ/HLaoAzjhcDCJn6RwNCG9KvZpH9oNmxHm9yCG4VNv3Ptn", "raw": false, "local": true, "with session": false, "with bearer": false, "error": "status: code = 2049 message = object not found"} ```
fyrchik self-assigned this 2023-09-13 08:10:17 +00:00
Sign in to join this conversation.
No Milestone
No Assignees
2 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: TrueCloudLab/frostfs-node#679
There is no content yet.