Move shard to degraded mode if metabase open failed #918

Merged
fyrchik merged 1 commits from dstepanov-yadro/frostfs-node:fix/degraded_mode into master 2024-01-23 19:00:27 +00:00

After this fix, a shard falls back to degraded read-only mode if the metabase cannot be opened.

Before the fix: updating the shard ID failed, so only one of the two shards was attached:

2024-01-18T12:01:32.900+0300	error	frostfs-node/config.go:1023	failed to attach shard to engine	{"error": "could not create a shard: could not update shard ID: can't open boltDB database: open /home/dstepanov/src/frostfs-node/.cache/storage/meta0: permission denied"}
2024-01-18T12:01:32.902+0300	info	frostfs-node/config.go:1026	shard attached to engine	{"id": "DonciCDQ5JPkDBQzqnCLHk"}
2024-01-18T12:01:32.902+0300	info	frostfs-node/main.go:76	initializing storage engine service...
2024-01-18T12:01:32.914+0300	info	frostfs-node/main.go:78	storage engine service has been successfully initialized

After the fix:
If the metabase file is locked for writing, the shard opens in read-only mode:

2024-01-18T13:12:20.138+0300	warn	engine/shards.go:127	failed to update shard id	{"shard_id": "5Uj4CJBKqy3o238X59Dmvc", "metabase_path": "/home/dstepanov/src/frostfs-node/.cache/storage/meta0", "error": "failed to open metabase: can't open boltDB database: open /home/dstepanov/src/frostfs-node/.cache/storage/meta0: permission denied"}
2024-01-18T13:12:20.138+0300	info	frostfs-node/config.go:1026	shard attached to engine	{"id": "5Uj4CJBKqy3o238X59Dmvc"}
2024-01-18T13:12:20.138+0300	info	frostfs-node/config.go:1026	shard attached to engine	{"id": "DonciCDQ5JPkDBQzqnCLHk"}
2024-01-18T13:12:20.138+0300	info	frostfs-node/main.go:76	initializing storage engine service...
2024-01-18T13:12:20.139+0300	error	shard/control.go:21	metabase failure, switching mode	{"shard_id": "5Uj4CJBKqy3o238X59Dmvc", "stage": "open", "mode": "READ_ONLY", "error": "can't open boltDB database: open /home/dstepanov/src/frostfs-node/.cache/storage/meta0: permission denied"}
2024-01-18T13:12:20.139+0300	info	shard/mode.go:29	setting shard mode	{"shard_id": "5Uj4CJBKqy3o238X59Dmvc", "old_mode": "READ_WRITE", "new_mode": "READ_ONLY"}
2024-01-18T13:12:20.141+0300	info	shard/mode.go:72	shard mode set successfully	{"shard_id": "5Uj4CJBKqy3o238X59Dmvc", "mode": "READ_ONLY"}

If the metabase file is locked for both reading and writing, the shard opens in degraded read-only mode:

2024-01-18T13:18:16.821+0300	warn	engine/shards.go:127	failed to update shard id	{"shard_id": "PGd8mPpSm3aaxmvUDW93x9", "metabase_path": "/home/dstepanov/src/frostfs-node/.cache/storage/meta0", "error": "failed to open metabase: can't open boltDB database: open /home/dstepanov/src/frostfs-node/.cache/storage/meta0: permission denied"}
2024-01-18T13:18:16.821+0300	info	frostfs-node/config.go:1026	shard attached to engine	{"id": "PGd8mPpSm3aaxmvUDW93x9"}
2024-01-18T13:18:16.821+0300	info	frostfs-node/config.go:1026	shard attached to engine	{"id": "DonciCDQ5JPkDBQzqnCLHk"}
2024-01-18T13:18:16.821+0300	info	frostfs-node/main.go:76	initializing storage engine service...
2024-01-18T13:18:16.828+0300	error	shard/control.go:21	metabase failure, switching mode	{"shard_id": "PGd8mPpSm3aaxmvUDW93x9", "stage": "open", "mode": "READ_ONLY", "error": "can't open boltDB database: open /home/dstepanov/src/frostfs-node/.cache/storage/meta0: permission denied"}
2024-01-18T13:18:16.828+0300	info	shard/mode.go:29	setting shard mode	{"shard_id": "PGd8mPpSm3aaxmvUDW93x9", "old_mode": "READ_WRITE", "new_mode": "READ_ONLY"}
2024-01-18T13:18:16.830+0300	error	shard/control.go:31	can't move shard to readonly, switch mode	{"shard_id": "PGd8mPpSm3aaxmvUDW93x9", "stage": "open", "mode": "DEGRADED_READ_ONLY", "error": "can't set metabase mode (old=READ_WRITE, new=READ_ONLY): can't open boltDB database: open /home/dstepanov/src/frostfs-node/.cache/storage/meta0: permission denied"}
2024-01-18T13:18:16.830+0300	info	shard/mode.go:29	setting shard mode	{"shard_id": "PGd8mPpSm3aaxmvUDW93x9", "old_mode": "READ_WRITE", "new_mode": "DEGRADED_READ_ONLY"}
2024-01-18T13:18:16.831+0300	info	shard/mode.go:72	shard mode set successfully	{"shard_id": "PGd8mPpSm3aaxmvUDW93x9", "mode": "DEGRADED_READ_ONLY"}
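
The fallback order visible in the logs above (READ_WRITE, then READ_ONLY, then DEGRADED_READ_ONLY) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Mode` type, the `metabase` struct with its `canRead`/`canWrite` flags, and `openShard` are hypothetical names invented for the example; only the mode names are taken from the log output.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode is a hypothetical stand-in for the shard mode; the values mirror the
// names seen in the logs, not the real frostfs-node types.
type Mode string

const (
	ReadWrite        Mode = "READ_WRITE"
	ReadOnly         Mode = "READ_ONLY"
	DegradedReadOnly Mode = "DEGRADED_READ_ONLY"
)

// errPermission imitates the "permission denied" error from the logs.
var errPermission = errors.New("permission denied")

// metabase is a hypothetical component that may refuse to open for
// writing, for reading, or both.
type metabase struct {
	canRead, canWrite bool
}

// open succeeds only if the file permits the access the mode requires.
func (m metabase) open(mode Mode) error {
	if !m.canRead || (mode == ReadWrite && !m.canWrite) {
		return fmt.Errorf("can't open boltDB database: %w", errPermission)
	}
	return nil
}

// openShard tries the modes from most to least capable and returns the
// first one that works. In this sketch degraded read-only does not touch
// the metabase, so it is the last resort and always succeeds.
func openShard(m metabase) Mode {
	for _, mode := range []Mode{ReadWrite, ReadOnly} {
		if err := m.open(mode); err == nil {
			return mode
		}
	}
	return DegradedReadOnly
}

func main() {
	fmt.Println(openShard(metabase{canRead: true, canWrite: true}))
	fmt.Println(openShard(metabase{canRead: true, canWrite: false}))
	fmt.Println(openShard(metabase{canRead: false, canWrite: false}))
}
```

The three calls in `main` correspond to a healthy metabase, a file locked for writing, and a file locked for both reading and writing.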
dstepanov-yadro force-pushed fix/degraded_mode from 1835e7cbe0 to bc348e7e8b 2024-01-18 10:53:19 +00:00
dstepanov-yadro force-pushed fix/degraded_mode from bc348e7e8b to 237ba80db5 2024-01-18 10:53:56 +00:00
dstepanov-yadro requested review from storage-core-committers 2024-01-18 11:19:54 +00:00
dstepanov-yadro requested review from storage-core-developers 2024-01-18 11:19:58 +00:00
dstepanov-yadro force-pushed fix/degraded_mode from 237ba80db5 to d9208294ef 2024-01-19 12:56:07 +00:00
acid-ant approved these changes 2024-01-23 06:34:46 +00:00
fyrchik approved these changes 2024-01-23 07:25:01 +00:00
fyrchik left a comment
Owner

We should also check the behaviour when the blobstor cannot be opened: the shard should be disabled.

@@ -128,4 +128,1 @@
_, err := e.AddShard(context.Background(), opts...)
if errOnAdd {
	require.Error(t, err)
	// This branch is only taken when we cannot update shard ID in the metabase.

Was this comment wrong?

Poster
Collaborator

Yes. I think the comment described the current behavior, not the correct behavior.

fyrchik marked this conversation as resolved
@@ -51,0 +42,4 @@
defer func() {
	cErr := s.metaBase.Close()
	if cErr != nil {
		err = fmt.Errorf("failed to close metabase: %w", cErr)

We discussed this but did not reach a conclusion: do we need `failed to` in the error text? It will probably be in the log message anyway.

Poster
Collaborator

Both variants are now present in the code.

fyrchik marked this conversation as resolved
Poster
Collaborator

> The thing we should also check is the behaviour in case blobstor cannot be opened -- shard should be disabled.

See shard/control.go: it should work as you described.

dstepanov-yadro force-pushed fix/degraded_mode from d9208294ef to 931a5e9aaf 2024-01-23 08:17:13 +00:00
fyrchik merged commit 931a5e9aaf into master 2024-01-23 19:00:27 +00:00
Reference: TrueCloudLab/frostfs-node#918