frostfs-sdk-go/doc/policy.md
Evgenii Stratonikov ac8fc6d440
All checks were successful
DCO / DCO (pull_request) Successful in 1m6s
Tests and linters / Tests (1.19) (pull_request) Successful in 1m38s
Tests and linters / Tests (1.20) (pull_request) Successful in 1m34s
Tests and linters / Lint (pull_request) Successful in 5m27s
[#162] netmap: Allow to parse single unnamed selectors
Signed-off-by: Evgenii Stratonikov <e.stratonikov@yadro.com>
2023-09-11 15:22:24 +03:00

16 KiB

Placement Policy

This document describes placement policies, their purpose, syntax and semantics.

Index

Introduction

The purpose of a placement policy is to determine whichs node(s) of a frostfs system will store an object. Namely, given a netmap (a set of nodes) and a placement policy, a subset of those nodes is selected to store a given object. An important aspect is that since nodes in a netmap come and go due to distributed nature of the system, this selection must be deterministic and consistent, i.e. different nodes must come to the same conclusion as long as they share the same view of the netmap.

ℹ️ Throughout this document, we will consider each node as a dictionary of attributes and a global unique ID.

One way to think about the placement policy, is as a pipeline of operations which begins with a set of nodes (the entire netmap) and gradually refine it into only the nodes that will be in charge of storing the object. More specifically, each operation in this pipeline takes a set of nodes and transforms it into a subset of those nodes. The transformation is done purely on the basis of the node attributes.

Placement policy as a pipeline

The three main operations are:

  1. FILTER: filters a set of nodes based on their attributes.
  2. SELECT: selects a specific amount of nodes from a set of nodes based on certain conditions.
  3. REP: specifies how many nodes (and which ones) from a set of nodes are used to store an object.

In the next sections, we will explore each of them in detail.

Operations

Basic Expressions

Before exploring the operations in detail, we must get acquainted with the basic expressions that appear in a placement policy. As mentioned above, the placement policy operates solely on the basis of node attributes, and as such, basic expressions mostly revolve around node attribute comparison.

A comparison expression expresses whether a node attribute equals a specified value:

AttributeName Operation AttributeValue

For example, the following expression

City EQ 'Moscow'

asserts that the node attribute City equals the value Moscow.

Comparison expressions can be nested via boolean operators and parentheses. For example, the following expression:

(City EQ 'Moscow') AND (Disks GT 2)

asserts that the node attribute City equals the value Moscow and the node attribute Disks must be greater than 2. Note that the arguments can be either a string or a number.

See Appendix 1 for a complete list of supported operators.

FILTER

A FILTER operation takes as input a set of nodes and returns a subset of those nodes. It's useful for selecting nodes that have (or lack) specific attributes. Its basic syntax is as follows:

FILTER <expr> AS <id>

For example, the following filter

FILTER Color EQ 'Red' AS RedNodes

selects those nodes for which the Color attribute equals Red, and discards the rest. The filter's identifier is RedNodes, which can be used to reference it in other parts of the placement policy. For example, you could reference the above filter in another filter as follows

FILTER @RedNodes AND (City EQ 'Moscow') AS RedMoscowNodes

which would select those nodes for which the Color attribute equals Red and the City attribute equals Moscow. You can think of the @ operator as embedding the referenced filter expression verbatim where it's used. This makes it easy to compose filters. However, filters can be referenced via @ only within filter expressions. In other places you can simply use the filter identifier directly.

⚠️ Every filter requires a unique identifier. What would be the use of a filter that you cannot reference?

The following diagram illustrates the filter operation

FILTER

where the nodes are represented as colored circles, with their color representing the value of their Color attribute, respectively.

ℹ️ A filter referring to all nodes in the netmap always exists and can be referenced by *.

SELECT

A SELECT operation specifies how many and which nodes from a subset previously obtained from a FILTER will be available to build replica groups for object storage. It's not that different from a FILTER in that it transforms a set of nodes into a subset of those, but while a FILTER cannot control the size of the resulting subset and other characteristics, a SELECT can.

Its basic syntax is as follows:

SELECT <count> {IN (SAME|DISTINCT) <attribute>} FROM <filter> {AS <id>}

In a nutshell, a SELECT takes a filter result as input and outputs a specific number of nodes, optionally enforcing that all output nodes must either share or differ in a specific attribute. Note that only the output node count and the source filter are required.

Let's see some examples

-- Selects exactly one node from the entire netmap
SELECT 1 FROM *  

-- Same as above, but with an identifier for the selection
SELECT 1 FROM * AS ONE  

-- Selects two nodes from the RedOrBlueNodes filter, such that both selected nodes
-- share the same value for the Color attribute, i.e. both red or both blue.
SELECT 2 IN SAME Color FROM RedOrBlueNodes

-- Selects two nodes from the RedOrBlueNodes filter, such that the selected nodes
-- have distinct values for the Color attribute, i.e. one red and one blue.
-- The selection is also given an identifier.
SELECT 2 IN DISTINCT Color FROM RedOrBlueNodes AS MyNodes

The last example is illustrated in the following diagram:

SELECT

ℹ️ At this point, notice that while FILTER's output is always unique (namely, every node in the input is either filtered in or out), that is not always the case for SELECT. In the last example above, there is more than one way to select two nodes with distinct Color attribute. Because we require the output to be deterministic and consistent (so that all nodes agree on which nodes to store a given object without having to commmunicate with each other), we need a way to reach this consensus efficiently. Internally, the policy engine uses Rendezvouz Hashing to ensure this. If you want more control over what nodes are actually selected, you can always use narrower filters/selections to ensure this.

REP

A REP operation specifies how many copies of an object need to be stored (REP stands for "replica"). A placement policy can contain multiple replica operations, with each of them representing a replica group, i.e. a group of objects associated with the same replica. Following our analogy with a pipeline, REP operations are the sink or output nodes.

Its basic syntax is as follows:

REP <count> {IN <select>}

If a select is not specified, then the entire netmap is used as input. The only exception to this rule is when exactly 1 replica and 1 selector are being present: in this case the only selector is being used instead of the whole netmap. The resulting nodes will be used to actually store objects and they constitute a replica group (or simply, "a replica").

Examples

-- A replica consisting of a single copy, stored in an arbitrary node of the netmap.
REP 1

-- A replica consisting of three copies, each stored in a different node from the selection
-- identified as 'MyNodes'.
REP 3 IN MyNodes

The following diagram illustrates the REP operation:

REP

⚠️ Notice that although we use REP 1 in the examples, in real life scenarios you almost always want to have more than a single node in each replica for redundancy.

Policies

In order to specify a complete placement policy, we just need to assemble it from the operations described above.

Its basic (simplified) syntax is as follows:

<rep>+ <select>* <filter>*

We begin by stating all our REP operations, followed by all the SELECT operations and finally all the FILTER operations. Note that this is the reverse order in which they are applied. Also note that at least one REP operation is required.

Here's a complete example:

REP 1 IN MyNodes
SELECT 2 IN DISTINCT Color FROM RedOrBlueNodes AS MyNodes
FILTER Color EQ 'Red' AS RedNodes
FILTER Color EQ 'Blue' AS BlueNodes
FILTER @RedNodes OR @BlueNodes AS RedOrBlueNodes

In additional to this basic syntax, there are a couple of additional useful options to specify which nodes and how many nodes are actually selected to store objects. We explore these in the next sections.

The policy playground

ℹ️ This section assumes you have an up-to-date version of the frostfs-cli.

While simple placement policies have predictable results that can be understood at a glance, more complex ones need careful consideration before deployment. In order to simplify understanding a policy's outcome and experimenting while learning, a builtin tool is provided as part of the frostfs-cli for this purpose: the policy playground.

For the remainder of this guide, we will use the policy playground to setup a virtual netmap (that is, one that doesn't require any networking or deployment) and test various policies. In order to visualize this netmap easily, each node will have three attributes: a character, a shape and a color

Sample Netmap

We can start the policy playground as follows:

$ frostfs-cli container policy-playground
>

Since we didn't pass any endpoint, the initial netmap is empty, which we can verify with the ls command (to list the nodes in the netmap):

> ls
>

Nows let's add virtual nodes to represent our test netmap in the figure above

> add 01 Char:A Shape:Circle  Color:Blue
> add 02 Char:B Shape:Circle  Color:Green
> add 03 Char:C Shape:Circle  Color:Red
> add 04 Char:D Shape:Square  Color:Blue
> add 05 Char:E Shape:Square  Color:Green
> add 06 Char:F Shape:Square  Color:Red
> add 07 Char:G Shape:Diamond Color:Blue
> add 08 Char:H Shape:Diamond Color:Green
> add 09 Char:I Shape:Diamond Color:Red

and verify that the netmap now contains what we expect

> ls
	 1: id=06 attrs={Char:F Shape:Square Color:Red}
	 2: id=08 attrs={Char:H Shape:Diamond Color:Green}
	 3: id=01 attrs={Char:A Shape:Circle Color:Blue}
	 4: id=04 attrs={Char:D Shape:Square Color:Blue}
	 5: id=05 attrs={Char:E Shape:Square Color:Green}
	 6: id=09 attrs={Char:I Shape:Diamond Color:Red}
	 7: id=02 attrs={Char:B Shape:Circle Color:Green}
	 8: id=03 attrs={Char:C Shape:Circle Color:Red}
	 9: id=07 attrs={Char:G Shape:Diamond Color:Blue}

With our sample netmap setup, we can now continue.

CBF

Consider the following policy:

REP 1

It builds a replica consisting of one copy, selected from the entire netmap. If we evaluate this policy in our sample netmap, we obtain a result which is probably unexpected:

> eval REP 1
	 1: [06 05 02]

The eval commands evaluates a policy and lists in a separate line the nodes selected for each REP operation, in the order they appear in the policy. We were expecting a single node, but we got three instead. The reason is that there's a policy-wide parameter called container backup factor (or CBF). This parameter is a multiplier which controls the maximum number of storage nodes: for example, if a policy requires 10 nodes and the CBF is 3, it means that the policy can store an object in up to 10 × 3 nodes. However, if there are not enough nodes and fewer (but at least 10) are used, this is not considered an error.

The default value for CBF is 3, which explains our result above, given than every node in the netmap agrees with the policy. The CBF can be explicitly set in the policy right after the REP operations. For example

> eval REP 1 CBF 1
	 1: [06]

results in what we expected in the first example. On the other hand

> eval REP 1 IN MyNodes SELECT 1 IN SAME Char FROM * AS MyNodes
	 1: [01]

results in a single node despite the default CBF, because there are not enough nodes compatible with the selection.

UNIQUE

Consider the following policy:

REP 1
REP 1
CBF 2

If we evaluate it

> eval REP 1 REP 1 CBF 2
	 1: [06 05]
	 2: [06 05]

we find that each replica gets two nodes, in accordance with the CBF. However, these nodes are exactly the same and this might not be desirable. In order to force the policy engine to select different nodes for each replica, we can use the UNIQUE option, which is specified right before the REP operations.

In our example, if we change it to

UNIQUE
REP 1
REP 1
CBF 2

and evaluate it

> eval UNIQUE REP 1 REP 1 CBF 2
	 1: [06 05]
	 2: [02 03]

we now find that the nodes selected for each replica are now distinct from each other.

More examples

This section presents some more examples of placement policies and their result when applied to the sample netmap. Try to figure out the result before looking at it or evaluating the policy.

Example #1

REP 1 IN TwoRedNodes
SELECT 2 FROM RedNodes AS TwoRedNodes
FILTER Color EQ 'Red' AS RedNodes
Result
> eval REP 1 IN TwoRedNodes SELECT 2 FROM RedNodes AS TwoRedNodes FILTER Color EQ 'Red' AS RedNodes
	 1: [06 09 03]

Example #2

REP 1 IN TwoRedNodes
REP 1 IN TwoRedNodes
SELECT 2 FROM RedNodes AS TwoRedNodes
FILTER Color EQ 'Red' AS RedNodes
Result
> eval REP 1 REP 1 IN TwoRedNodes SELECT 2 FROM RedNodes AS TwoRedNodes FILTER Color EQ 'Red' AS RedNodes
	 1: [06 09 03]
	 2: [06 09 03]

Example #3

REP 2 IN MyNodes
REP 2 IN MyNodes
SELECT 2 FROM RedOrBlueNodes AS MyNodes
FILTER Color EQ 'Red' AS RedNodes
FILTER Color EQ 'Blue' AS BlueNodes
FILTER @RedNodes OR @BlueNodes AS RedOrBlueNodes
Result
> eval REP 2 IN MyNodes REP 2 IN MyNodes SELECT 2 FROM RedOrBlueNodes AS MyNodes FILTER Color EQ 'Red' AS RedNodes FILTER Color EQ 'Blue' AS BlueNodes FILTER @RedNodes OR @BlueNodes AS RedOrBlueNodes
	 1: [06 01 04 03 09 07]
	 2: [06 01 04 03 09 07]

Example #4

REP 2 IN MyRedNodes
REP 2 IN MyBlueNodes
CBF 1
SELECT 2 FROM RedNodes AS MyRedNodes
SELECT 2 FROM BlueNodes AS MyBlueNodes
FILTER Color EQ 'Red' AS RedNodes
FILTER Color EQ 'Blue' AS BlueNodes
Result
> eval REP 2 IN MyRedNodes REP 2 IN MyBlueNodes CBF 1 SELECT 2 FROM RedNodes AS MyRedNodes SELECT 2 FROM BlueNodes AS MyBlueNodes FILTER Color EQ 'Red' AS RedNodes FILTER Color EQ 'Blue' AS BlueNodes
	 1: [06 03]
	 2: [01 04]

Example #5

UNIQUE
REP 1 IN MyGreenNodes
REP 1 IN MyGreenNodes
REP 1 IN MyGreenNodes
CBF 1
SELECT 1 FROM GreenNodes AS MyGreenNodes
FILTER Color EQ 'Green' AS GreenNodes
Result
> eval UNIQUE REP 1 IN MyGreenNodes REP 1 IN MyGreenNodes REP 1 IN MyGreenNodes CBF 1 SELECT 1 FROM GreenNodes AS MyGreenNodes FILTER Color EQ 'Green' AS GreenNodes
	 1: [05]
	 2: [02]
	 3: [08]

Example #6

REP 1 IN MyNodes
REP 2
CBF 2
SELECT 1 FROM CuteNodes AS MyNodes
FILTER (Color EQ 'Blue') AND NOT (Shape EQ 'Circle' OR Shape EQ 'Square') AS CuteNodes
Result
eval REP 1 IN MyNodes REP 2  CBF 2 SELECT 1 FROM CuteNodes AS MyNodes FILTER (Color EQ 'Blue') AND NOT (Shape EQ 'Circle' OR Shape EQ 'Square') AS CuteNodes
	 1: [07]
	 2: [06 05 02 03]

Appendix 1: Operators

Comparison operators (all binary):

  • EQ: equals
  • NE: not equal
  • GE: greater or equal
  • GT: greater than
  • LE: less or equal
  • LT: less than

Logical operators:

  • NOT: negation (unary)
  • AND: conjunction (binary)
  • OR: disjunction (binary)

Others:

  • @: filter reference
  • (: left parenthesis
  • ): right parenthesis

Appendix 2: Policy playground commands

  • ls: list nodes in the current netmap and their attributes
  • add: add a node to the current netmap. If it already exists, it will be overwritten.
  • remove: remove a node from the current netmap.
  • eval: evaluate a placement policy on the current netmap.