frostfs-s3-gw/docs/authentication.md
Denis Kirillov 94bd1dfe28 [#334] Add auth doc
Signed-off-by: Denis Kirillov <d.kirillov@yadro.com>
2024-03-21 12:12:29 +03:00

313 lines
18 KiB
Markdown

# Authentication and authorization scheme
This document describes s3-gw authentication and authorization mechanism.
## General overview
Basic provisions:
* A request to s3-gw can be signed or not (request that isn't signed we will cal anonymous or just anon)
* To manage resources (buckets/objects) using s3-gw you must have appropriate access rights
Each request must be authenticated (at least as anonymous) and authorized. The following scheme shows components that
are involved to this
process.
<a>
<img src="images/authentication/auth-overview.svg" alt="Auth general overview"/>
</a>
There are several participants of this process:
1. User that make a request
2. S3-GW that accepts a request
3. FrostFS Storage that stores AccessObjects (objects are needed for authentication)
4. Blockchain smart contracts (`frostfsid`, `policy`) that stores user info and access rules.
## Data auth process
Let's look at the process in more detail:
<a>
<img src="images/authentication/auth-sequence.svg" alt="Auth sequence diagram"/>
</a>
* First of all, someone make a request. If request is signed we will check its signature (`Authentication`) after that
we will check access rights using policies (`Auhorization`). For anonymous requests only authorization be performed.
* **Authentication steps**:
* Each signed request is provided with `AccessKeyId` and signature. So if request is signed we must check its
signature. To do this we must know the `AccessKeyId`/`SecretAccessKey` pair (How the signature is calculated
using this pair see [signing](#aws-signing). Client and server (s3-gw) use the same credentials and algorithm to
compute signature). The `AccessKeyId` is a public part of credentials, and it's passed to gate in request. The
private part of credentials is `SecretAccessKey` and it's encrypted and stored in [AccessBox](#accessbox). So on
this step we must find appropriate `AccessBox` in FrostFS storage node (How to find appropriate `AccessBox`
knowing `AccessKeyId` see [search algorithm](#search-algorithm)). On this stage we can get `AccessDenied` from
FrostFS storage node if the s3-gw doesn't have permission to read this `AccessBox` object.
* After successful retrieving object we must extract `SecretAccessKey` from it. Since it's encrypted the s3-gw must
decrypt (see [encryption](#encryption)) this object using own private key and `SeedKey` from `AccessBox`
(see [AccessBox inner structure](#accessbox)). After s3-gw have got the `AccessKeyId`/`SecretAccessKey` pair it
[calculate signature](#aws-signing) and compare got signature with provided withing request. If signature doesn't
match the `AccessDenied` is returned.
* `AccessBox` also contains `OwnerID` that is related to `AccessKeyId` that was provided. So we have to check if
such `OwnerID` exists in `frsotfsid` contract (that stores all registered valid users). If user doesn't exist in
contract the `AccessDenied` is returned.
* **Authorization steps**:
* To know if user has access right to do what he wants to do we must find appropriate access policies. Such policies
are stored in `policy` contract and locally (can be manged using [control api](#control-auth-process)). So we need
to get policies from contract and [check them](#policies) along with local to decide if user has access right. If
he doesn't have such right the `AccessDenied` is returned.
* After successful authentication and authorization the request will be processed by s3-gw business logic and finally be
propagated to FrostFS storage node which also performs some auth checks and can return `AccessDenied`. If this happens
s3-gw also returns `AccessDenied` as response.
### AWS Signing
Every interaction with FrostFS S3 gateway is either authenticated or anonymous. This section explains request
authentication with the AWS Signature Version 4 algorithm. More info in AWS documentation:
* [Authenticating Requests (AWS Signature Version 4)](https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html)
* [Signing AWS API requests](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_aws-signing.html)
#### Authentication Methods
You can express authentication information by using one of the following methods:
* **HTTP Authorization header** - Using the HTTP Authorization header is the most common method of authenticating an
FrostFS S3 request. All the FrostFS S3 REST operations (except for browser-based uploads using POST requests) require
this header. For more information about the Authorization header value, and how to calculate signature and related
options,
see [Authenticating Requests: Using the Authorization Header (AWS Signature Version 4)](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html).
* **Query string parameters** - You can use a query string to express a request entirely in a URL. In this case, you use
query parameters to provide request information, including the authentication information. Because the request
signature is part of the URL, this type of URL is often referred to as a presigned URL. You can use presigned URLs to
embed clickable links, which can be valid for up to seven days, in HTML. For more information,
see [Authenticating Requests: Using Query Parameters (AWS Signature Version 4)](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html).
FrostFS S3 also supports browser-based uploads that use HTTP POST requests. With an HTTP POST request, you can upload
content to FrostFS S3 directly from the browser. For information about authenticating POST requests,
see [Browser-Based Uploads Using POST (AWS Signature Version 4)](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-UsingHTTPPOST.html).
#### Introduction to Signing Requests
Authentication information that you send in a request must include a signature. To calculate a signature, you first
concatenate select request elements to form a string, referred to as the string to sign. You then use a signing key to
calculate the hash-based message authentication code (HMAC) of the string to sign.
In AWS Signature Version 4, you don't use your secret access key to sign the request. Instead, you first use your secret
access key to derive a signing key. The derived signing key is specific to the date, service, and Region. For more
information about how to derive a signing key in different programming languages, see Examples of how to derive a
signing key for Signature Version 4.
The following diagram illustrates the general process of computing a signature.
<a>
<img src="images/authentication/aws-signing.png" alt="AWS Signing"/>
</a>
The string to sign depends on the request type. For example, when you use the HTTP Authorization header or the query
parameters for authentication, you use a varying combination of request elements to create the string to sign. For an
HTTP POST request, the POST policy in the request is the string you sign. For more information about computing string to
sign, follow links provided at the end of this section.
For signing key, the diagram shows series of calculations, where result of each step you feed into the next step. The
final step is the signing key.
Upon receiving an authenticated request, FrostFS S3 servers re-create the signature by using the authentication
information that is contained in the request. If the signatures match, FrostFS S3 processes your request; otherwise, the
request is rejected.
##### Signature Calculations for the Authorization Header
To calculate a signature, you first need a string to sign. You then calculate a HMAC-SHA256 hash of the string to sign
by using a signing key. The following diagram illustrates the process, including the various components of the string
that you create for signing.
When FrostFS S3 receives an authenticated request, it computes the signature and then compares it with the signature
that you provided in the request. For that reason, you must compute the signature by using the same method that is used
by FrostFS S3. The process of putting a request in an agreed-upon form for signing is called canonicalization.
<a>
<img src="images/authentication/auth-header-signing.png" alt="Signature Calculations for the Authorization Header"/>
</a>
See detains in [AWS documentation](https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-header-based-auth.html).
#### s3-gw
s3-gw support the following ways to provide the singed request:
* [HTTP Authorization header](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html)
* [Query string parameters](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html)
* [Browser-Based Uploads Using POST](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-UsingHTTPPOST.html)
All these methods provide `AccessKeyId` and signature. Using `AccessKeyId` s3-gw can get `SecretAccessKey`
(see [data auth](#data-auth-process)) to compute signature using exactly the same mechanics
as [client does](#introduction-to-signing-requests). After signature calculation the s3-gw just compares signatures and
if they don't match the access denied is returned.
### AccessBox
`AccessBox` is an ordinary object in FrostFS storage. It contains all information that can be used by s3-gw to
successfully authenticate request. Also, it contains data that is required to successful authentication in FrostFS
storage node.
Based on this object s3 credentials are formed:
* `AccessKeyId` - is concatenated container id and object id (`<cid>0<oid>`) of `AccessBox` (
e.g. `2XGRML5EW3LMHdf64W2DkBy1Nkuu4y4wGhUj44QjbXBi05ZNvs8WVwy1XTmSEkcVkydPKzCgtmR7U3zyLYTj3Snxf`)
* `SecretAccessKey` - hex-encoded random generated 32 bytes (that is encrypted and stored in object payload)
> **Note**: sensitive info in `AccessBox` is [encrypted](#encryption), so only someone who posses specific private key
> can decrypt such info.
`AccessBox` has the following structure:
<a>
<img src="images/authentication/accessbox-object.svg" alt="AccessBox object structure"/>
</a>
**Headers:**
`AccessBox` object has the following attributes (at least them, it also can contain custom one):
* `Timestamp` - unix timestamp when object was created
* `__SYSTEM__EXPIRATION_EPOCH` - epoch after which the object isn't available anymore
* `S3-CRDT-Versions-Add` - comma separated list of previous versions of `AccessBox` (
see [AccessBox versions](#accessbox-versions))
* `S3-Access-Box-CRDT-Name` - `AccessKeyId` of credentials to which current `AccessBox` is related (
see [AccessBox versions](#accessbox-versions))
* `FilePath` - just object name
**Payload:**
The `AccessBox` payload is an encoded [AccessBox protobuf type](../creds/accessbox/accessbox.proto) .
It contains:
* Seed key - hex-encoded public seed key to compute shared secret using ECDH (see [encryption](#encryption))
* List of gate data:
* Gate public key (so that gate (when it will decrypt data later) know which one item from list it should process)
* Encrypted tokens:
* `SecretAccessKey` - hex-encoded random generated 32 bytes
* Marshaled bearer token - more detail
in [spec](https://git.frostfs.info/TrueCloudLab/frostfs-api/src/commit/4c68d92468503b10282c8a92af83a56f170c8a3a/acl/types.proto#L189)
* Marshaled session token - more detail
in [spec](https://git.frostfs.info/TrueCloudLab/frostfs-api/src/commit/4c68d92468503b10282c8a92af83a56f170c8a3a/session/types.proto#L89)
* Container placement policies:
* `LocationsConstraint` - name of location constraint that can be used to create bucket/container using s3
credentials related to this `AccessBox`
* Marshaled placement policy - more detail
in [spec](https://git.frostfs.info/TrueCloudLab/frostfs-api/src/commit/4c68d92468503b10282c8a92af83a56f170c8a3a/netmap/types.proto#L111)
#### AccessBox versions
Imagine the following scenario:
* There is a system where only one s3-gw exist
* There is a `AccessBox` that can be used by this s3-gw
* User has s3 credentials (`AccessKeyId`/`SecretAccessKey`) related to corresponded `AccessBox` and can successfully
make request to s3-gw
* The system is expanded and new one s3-gw is added
* User must be able to use the credentials (that he has already had) to make request to new one s3-gw
Since `AccessBox` object is immutable and `SecretAccessKey` is encrypted only for restricted list of keys (can be used
(decrypted) only by limited number of s3-gw) we have to create new `AccessBox` that has encrypted secrets for new list
of s3-gw and be related to initial s3 credentials (`AccessKeyId`/`SecretAccessKey`). Such relationship is done
by `S3-Access-Box-CRDT-Name`.
##### Search algorithm
To support scenario from previous section and find appropriate version of `AccessBox` (that contains more recent and
relevant data) the following sequence is used:
<a>
<img src="images/authentication/accessbox-search.svg" alt="AccessBox search process"/>
</a>
* Search all object whose attribute `S3-Access-Box-CRDT-Name` is equal to `AccessKeyId` (extract container id
from `AccessKeyId` that has format: `<cid>0<oid>`).
* Get metadata for these object using `HEAD` requests (not `Get` to reduce network traffic)
* Sort all these objects by creation epoch and object id
* Pick last object id (If no object is found then extract object id from `AccessKeyId` that has format: `<cid>0<oid>`.
We need to do this because versions of `AccessBox` can miss the `S3-Access-Box-CRDT-Name` attribute.)
* Get appropriate object from FrostFS storage
* Decrypt `AccessBox` (see [encryption](#encryption))
#### Encryption
Each `AccessBox` contains sensitive information (`AccessSecretKey`, bearer/session tokens etc.) that must be protected
and available only to trusted parties (in our case it's a s3-gw).
To encrypt/decrypt data the authenticated encryption with associated
data ([AEAD](https://en.wikipedia.org/wiki/Authenticated_encryption)) is used. The encryption algorithm
is [ChaCha20-Poly1305](https://en.wikipedia.org/wiki/ChaCha20-Poly1305) ([RFC](https://datatracker.ietf.org/doc/html/rfc7905)).
Is the following algorithm the ECDSA keys (with curve implements NIST P-256 (FIPS 186-3, section D.2.3) also known as
secp256r1 or prime256v1) is used (unless otherwise stated).
**Encryption:**
* Create ephemeral key (`SeedKey`), it's need to generate shared secret
* Generate random 32-byte (that after hex-encoded be `SecretAccessKey`) or use existing secret access key
(if `AccessBox` is being updated rather than creating brand new)
* Generate shared secret as [ECDH](https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman)
* Derive 32-byte key using shared secret from previous step with key derivation function based on
HMAC with SHA256 [HKDF](https://en.wikipedia.org/wiki/HKDF)
* Encrypt marshaled [Tokens](../creds/accessbox) using derived key
with [ChaCha20-Poly1305](https://en.wikipedia.org/wiki/ChaCha20-Poly1305) algorithm without additional data.
**Decryption:**
* Get public part of `SeedKey` from `AccessBox`
* Generate shared secret as follows:
* Make scalar curve multiplication of public part of `SeedKey` and private part of s3-gw key
* Use `X` part of multiplication (with zero padding at the beginning to fit 32-byte)
* Derive 32-byte key using shared secret from previous step with key derivation function based on
HMAC with SHA256 [HKDF](https://en.wikipedia.org/wiki/HKDF)
* Decrypt encrypted marshaled [Tokens](../creds/accessbox) using derived key
with [ChaCha20-Poly1305](https://en.wikipedia.org/wiki/ChaCha20-Poly1305) algorithm without additional data.
### Policies
The main repository that contains policy implementation is https://git.frostfs.info/TrueCloudLab/policy-engine.
Policies can be stored locally (using [control api](#control-auth-process)) or in `policy` contract. When policies check
is performed the following algorithm is applied:
* Check local policies:
* If any rule was matched return checking result.
* Check contract policies:
* If any rule was matched return checking result.
* If no rules were matched return `deny` status.
To local and contract policies `deny first` scheme is applied. This means that if several rules were matched for
reqeust (with both statuses `allow` and `deny`) the resulting status be `deny`.
Policy rules validate if specified request can be performed on the specific resource. Request and resource can contain
some properties and rules can contain conditions on some such properties.
In s3-gw resource is `/bucket/object`, `/bucket` or just `/` (if request is trying to list buckets).
Currently, request that is checked contains the following properties (so policy rule can contain conditions on them):
* `Owner` - address of owner that is performing request (this is taken from bearer token from `AccessBox`)
* `frostfsid:groupID` - groups to which the owner belongs (this is taken from `frostfsid` contract)
## Control auth process
There are control path [grpc api](../pkg/service/control/service.proto) in s3-gw that also has their own authentication
and authorization process.
But this process is quite straight forward:
* Get grpc request
* Check if signing key belongs to [allowed key list](configuration.md#control-section) (that is located in config file)
* Validate signature
For signing process the asymmetric encryption based on elliptic curves (`ECDSA_SHA512`) is used.
For more details see the appropriate code
in [frostfs-api](https://git.frostfs.info/TrueCloudLab/frostfs-api/src/commit/4c68d92468503b10282c8a92af83a56f170c8a3a/refs/types.proto#L94)
and [frostfs-api-go](https://git.frostfs.info/TrueCloudLab/frostfs-api-go/src/commit/a85146250b312fcdd6da9a71285527fed544234f/refs/types.go#L38).