frostfs-s3-gw/docs/authentication.md
Denis Kirillov 6a90f4e624 [#509] Update docs
Signed-off-by: Denis Kirillov <d.kirillov@yadro.com>
2024-10-23 15:01:31 +03:00

Authentication and authorization scheme

This document describes the s3-gw authentication and authorization mechanisms.

General overview

Basic provisions:

  • A request to s3-gw can be signed or unsigned (an unsigned request is called anonymous, or just anon)
  • To manage resources (buckets/objects) using s3-gw you must have appropriate access rights

Each request must be authenticated (at least as anonymous) and authorized. The following scheme shows components that are involved in this process.

Auth general overview

There are several participants in this process:

  1. User that makes a request
  2. S3-GW that accepts a request
  3. FrostFS Storage that stores AccessObjects (objects are needed for authentication)
  4. Blockchain smart contracts (frostfsid, policy) that store user info and access rules.

Data auth process

Let's look at the process in more detail:

Auth sequence diagram
  • First of all, someone makes a request. If the request is signed, its signature is checked (Authentication), and then access rights are checked using policies (Authorization). For anonymous requests only authorization is performed.

  • Authentication steps:

    • Each signed request carries an AccessKeyId and a signature, so the signature of every signed request must be checked. To do this, the AccessKeyId/SecretAccessKey pair must be known (for how the signature is calculated using this pair, see signing; the client and the server (s3-gw) use the same credentials and algorithm to compute the signature). The AccessKeyId is the public part of the credentials and is passed to the gate in the request. The private part of the credentials is the SecretAccessKey, which is encrypted and stored in an AccessBox. So at this step the appropriate AccessBox must be found in the FrostFS storage node (for how to find the appropriate AccessBox knowing the AccessKeyId, see search algorithm). At this stage the FrostFS storage node can return AccessDenied if the s3-gw doesn't have permission to read this AccessBox object.

    • After successfully retrieving the object, the SecretAccessKey must be extracted from it. Since it's encrypted, the s3-gw must decrypt (see encryption) this object using its own private key and the SeedKey from the AccessBox (see AccessBox inner structure). Once the s3-gw has the AccessKeyId/SecretAccessKey pair, it calculates the signature and compares it with the one provided in the request. If the signatures don't match, AccessDenied is returned.

    • The AccessBox also contains the OwnerID related to the provided AccessKeyId. So we have to check that this OwnerID exists in the frostfsid contract (which stores all registered valid users). If the user doesn't exist in the contract, AccessDenied is returned.

  • Authorization steps:

    • To know whether the user has the right to perform the requested action, the appropriate access policies must be found. Such policies are stored in the policy contract and locally (local ones can be managed using the control api). So we need to get the policies from the contract and check them along with the local ones to decide whether the user has the access right. If not, AccessDenied is returned.
  • After successful authentication and authorization the request is processed by the s3-gw business logic and finally propagated to the FrostFS storage node, which also performs some auth checks and can return AccessDenied. If this happens, s3-gw also returns AccessDenied as the response.

AWS Signing

Every interaction with the FrostFS S3 gateway is either authenticated or anonymous. This section explains request authentication with the AWS Signature Version 4 algorithm. More information is available in the AWS documentation.

Authentication Methods

You can express authentication information by using one of the following methods:

  • HTTP Authorization header - Using the HTTP Authorization header is the most common method of authenticating FrostFS S3 request. All the FrostFS S3 REST operations (except for browser-based uploads using POST requests) require this header. For more information about the Authorization header value, and how to calculate signature and related options, see Authenticating Requests: Using the Authorization Header (AWS Signature Version 4).
  • Query string parameters - You can use a query string to express a request entirely in a URL. In this case, you use query parameters to provide request information, including the authentication information. Because the request signature is part of the URL, this type of URL is often referred to as a presigned URL. You can use presigned URLs to embed clickable links, which can be valid for up to seven days, in HTML. For more information, see Authenticating Requests: Using Query Parameters (AWS Signature Version 4).

FrostFS S3 also supports browser-based uploads that use HTTP POST requests. With an HTTP POST request, you can upload content to FrostFS S3 directly from the browser. For information about authenticating POST requests, see Browser-Based Uploads Using POST (AWS Signature Version 4).

Introduction to Signing Requests

Authentication information that you send in a request must include a signature. To calculate a signature, you first concatenate select request elements to form a string, referred to as the string to sign. You then use a signing key to calculate the hash-based message authentication code (HMAC) of the string to sign.

In AWS Signature Version 4, you don't use your secret access key to sign the request. Instead, you first use your secret access key to derive a signing key. The derived signing key is specific to the date, service, and Region. For more information about how to derive a signing key in different programming languages, see Examples of how to derive a signing key for Signature Version 4.

The following diagram illustrates the general process of computing a signature.

AWS Signing

The string to sign depends on the request type. For example, when you use the HTTP Authorization header or the query parameters for authentication, you use a varying combination of request elements to create the string to sign. For an HTTP POST request, the POST policy in the request is the string you sign. For more information about computing string to sign, follow links provided at the end of this section.

For the signing key, the diagram shows a series of calculations in which the result of each step is fed into the next. The result of the final step is the signing key.
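The chain of calculations for the signing key can be sketched as follows. This is a minimal illustration of the publicly documented AWS Signature Version 4 derivation, not code taken from s3-gw; the function names and the input values are placeholders chosen for this example.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return HMAC-SHA256(key, msg) as raw bytes."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key; each step's output is the next step's HMAC key."""
    # The first key is the literal prefix "AWS4" followed by the secret access key.
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date)  # date: YYYYMMDD
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    # The final step uses the fixed terminator string "aws4_request".
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder inputs for illustration only.
signing_key = derive_signing_key("example-secret", "20241023", "us-east-1", "s3")
print(signing_key.hex())
```

Because the date, Region and service are mixed into the key, a signing key derived for one day or service cannot be reused for another.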

Upon receiving an authenticated request, FrostFS S3 servers re-create the signature by using the authentication information that is contained in the request. If the signatures match, FrostFS S3 processes your request; otherwise, the request is rejected.

Signature Calculations for the Authorization Header

To calculate a signature, you first need a string to sign. You then calculate an HMAC-SHA256 hash of the string to sign by using a signing key. The following diagram illustrates the process, including the various components of the string that you create for signing.

When FrostFS S3 receives an authenticated request, it computes the signature and then compares it with the signature that you provided in the request. For that reason, you must compute the signature by using the same method that is used by FrostFS S3. The process of putting a request in an agreed-upon form for signing is called canonicalization.
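As a rough sketch of the final steps, assuming the canonical request has already been built and a signing key has already been derived, the string to sign and the signature could be computed like this. The layout follows the public AWS Signature Version 4 specification; the helper names, the host name and the all-zero signing key are placeholders for illustration.

```python
import hashlib
import hmac


def string_to_sign(amz_date: str, scope: str, canonical_request: str) -> str:
    """Build the SigV4 string to sign for header or query authentication."""
    hashed_request = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    return "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, hashed_request])


def calculate_signature(signing_key: bytes, to_sign: str) -> str:
    """Return the hex-encoded HMAC-SHA256 of the string to sign."""
    return hmac.new(signing_key, to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder request for illustration only.
empty_payload_hash = hashlib.sha256(b"").hexdigest()
canonical_headers = (
    "host:s3.example.com\n"
    "x-amz-content-sha256:" + empty_payload_hash + "\n"
    "x-amz-date:20241023T120000Z\n"
)
canonical_request = "\n".join([
    "GET",                                   # HTTP method
    "/bucket/object",                        # canonical URI
    "",                                      # canonical query string (empty here)
    canonical_headers,                       # each header line ends with a newline
    "host;x-amz-content-sha256;x-amz-date",  # signed headers
    empty_payload_hash,                      # hash of the (empty) payload
])

to_sign = string_to_sign("20241023T120000Z", "20241023/us-east-1/s3/aws4_request", canonical_request)
placeholder_key = bytes(32)  # stands in for a real derived signing key
print(calculate_signature(placeholder_key, to_sign))
```

Any difference in canonicalization between client and server changes the hashed request and therefore the signature, which is why both sides must follow exactly the same rules.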

Signature Calculations for the Authorization Header

See details in the AWS documentation.

s3-gw

s3-gw supports the following ways to provide the signed request:

All these methods provide an AccessKeyId and a signature. Using the AccessKeyId, s3-gw can get the SecretAccessKey (see data auth) and compute the signature using exactly the same mechanics as the client. After calculating the signature, the s3-gw compares the two signatures; if they don't match, access denied is returned.
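For the Authorization header case, extracting the AccessKeyId and comparing signatures might look like the sketch below. It follows the public SigV4 header layout and is not the s3-gw implementation; the function names are hypothetical, and the constant-time comparison is a design choice of this example rather than something the document states.

```python
import hashlib
import hmac
import re

_AUTH_RE = re.compile(
    r"^AWS4-HMAC-SHA256\s+Credential=(?P<credential>[^,]+),\s*"
    r"SignedHeaders=(?P<signed_headers>[^,]+),\s*"
    r"Signature=(?P<signature>[0-9a-f]{64})$"
)


def parse_authorization(header: str) -> dict:
    """Split a SigV4 Authorization header into its components."""
    m = _AUTH_RE.match(header.strip())
    if m is None:
        raise ValueError("not a SigV4 Authorization header")
    # Credential is <AccessKeyId>/<date>/<region>/<service>/aws4_request.
    # Split from the right so the AccessKeyId itself is never cut.
    access_key_id, date, region, service, terminator = m["credential"].rsplit("/", 4)
    if terminator != "aws4_request":
        raise ValueError("unexpected credential scope terminator")
    return {
        "access_key_id": access_key_id,
        "date": date,
        "region": region,
        "service": service,
        "signed_headers": m["signed_headers"].split(";"),
        "signature": m["signature"],
    }


def signatures_match(signing_key: bytes, to_sign: str, provided_signature: str) -> bool:
    """Recompute the signature and compare it with the provided one in constant time."""
    expected = hmac.new(signing_key, to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("ascii"), provided_signature.encode("utf-8"))


# Placeholder header for illustration only.
header = (
    "AWS4-HMAC-SHA256 Credential=AKID/20241023/us-east-1/s3/aws4_request, "
    "SignedHeaders=host;x-amz-date, Signature=" + "a" * 64
)
print(parse_authorization(header)["access_key_id"])
```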

AccessBox

AccessBox is an ordinary object in FrostFS storage. It contains all the information that s3-gw needs to successfully authenticate a request. It also contains data required for successful authentication in the FrostFS storage node.

The s3 credentials related to the object are formed as follows:

  • AccessKeyId - the concatenated container id and object id (<cid>0<oid>) of the AccessBox (e.g. 2XGRML5EW3LMHdf64W2DkBy1Nkuu4y4wGhUj44QjbXBi05ZNvs8WVwy1XTmSEkcVkydPKzCgtmR7U3zyLYTj3Snxf). Alternatively, it can be an arbitrary user-provided unique string with a minimum length of 4 and a maximum length of 128.
  • SecretAccessKey - 32 randomly generated bytes, hex-encoded (encrypted and stored in the object payload). Alternatively, it can be an arbitrary user-provided unique string with a minimum length of 4 and a maximum length of 128.
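To illustrate the default AccessKeyId format, the sketch below splits a key into container id and object id. It assumes that both ids are base58 strings (an assumption of this example, not stated above); since the base58 alphabet has no `0` character, the separator is then unambiguous. The function name is hypothetical.

```python
from typing import Optional, Tuple

# Bitcoin-style base58 alphabet: no "0", "O", "I" or "l".
BASE58_ALPHABET = set("123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz")


def split_access_key_id(access_key_id: str) -> Optional[Tuple[str, str]]:
    """Return (cid, oid) for a default-format key, or None for a custom key."""
    cid, separator, oid = access_key_id.partition("0")
    if not separator or not cid or not oid:
        return None
    # Both halves must be pure base58, otherwise this is treated as a custom key.
    if not (set(cid) <= BASE58_ALPHABET and set(oid) <= BASE58_ALPHABET):
        return None
    return cid, oid


key = "2XGRML5EW3LMHdf64W2DkBy1Nkuu4y4wGhUj44QjbXBi05ZNvs8WVwy1XTmSEkcVkydPKzCgtmR7U3zyLYTj3Snxf"
print(split_access_key_id(key))
print(split_access_key_id("my-custom-key"))
```

A user-provided AccessKeyId is not guaranteed to fail this check, so real code cannot rely on the format alone to tell the two kinds of key apart.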

Note: sensitive info in the AccessBox is encrypted, so only someone who possesses the specific private key can decrypt it.

AccessBox has the following structure:

AccessBox object structure

Headers:

An AccessBox object has at least the following attributes (it can also contain custom ones):

  • Timestamp - unix timestamp indicating when the object was created
  • __SYSTEM__EXPIRATION_EPOCH - epoch after which the object isn't available anymore
  • S3-CRDT-Versions-Add - comma separated list of previous versions of AccessBox (see AccessBox versions)
  • S3-Access-Box-CRDT-Name - AccessKeyId of credentials to which current AccessBox is related (see AccessBox versions)
  • FilePath - just object name

Payload:

The AccessBox payload is an encoded AccessBox protobuf type. It contains:

  • Seed key - hex-encoded public seed key to compute shared secret using ECDH (see encryption)
  • List of gate data:
    • Gate public key (so that the gate, when it decrypts the data later, knows which item from the list it should process)
    • Encrypted tokens:
      • SecretAccessKey - 32 randomly generated bytes, hex-encoded (or an arbitrary user-provided string)
      • Marshaled bearer token - more detail in spec
      • Marshaled session token - more detail in spec
  • Container placement policies:
    • LocationsConstraint - name of location constraint that can be used to create bucket/container using s3 credentials related to this AccessBox
    • Marshaled placement policy - more detail in spec
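The payload layout above can be pictured with the following sketch. The class and field names are hypothetical and chosen only to mirror the list; the real protobuf definition may name and type these fields differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Tokens:
    """Plaintext tokens; inside the AccessBox they are stored only in encrypted form."""
    secret_access_key: str
    bearer_token: bytes   # marshaled bearer token
    session_token: bytes  # marshaled session token


@dataclass
class GateData:
    gate_public_key: bytes   # identifies the gate this item was encrypted for
    encrypted_tokens: bytes  # encrypted, marshaled Tokens


@dataclass
class ContainerPolicy:
    location_constraint: str
    placement_policy: bytes  # marshaled placement policy


@dataclass
class AccessBox:
    seed_key: bytes  # public seed key used to compute the shared secret
    gates: List[GateData] = field(default_factory=list)
    container_policies: List[ContainerPolicy] = field(default_factory=list)


def find_gate_data(box: AccessBox, gate_public_key: bytes) -> Optional[GateData]:
    """Return the list item encrypted for the given gate, if there is one."""
    for gate in box.gates:
        if gate.gate_public_key == gate_public_key:
            return gate
    return None


box = AccessBox(seed_key=b"seed", gates=[GateData(b"gate-1", b"blob-1"), GateData(b"gate-2", b"blob-2")])
print(find_gate_data(box, b"gate-2"))
```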

AccessBox versions

Imagine the following scenario:

  • There is a system where only one s3-gw exists
  • There is an AccessBox that can be used by this s3-gw
  • User has s3 credentials (AccessKeyId/SecretAccessKey) related to the corresponding AccessBox and can successfully make requests to s3-gw
  • The system is expanded and a new s3-gw is added
  • User must be able to use the credentials they already have to make requests to the new s3-gw

Since an AccessBox object is immutable and the SecretAccessKey is encrypted only for a restricted list of keys (it can be decrypted only by a limited number of s3-gw), we have to create a new AccessBox that holds the secrets encrypted for the new list of s3-gw and is related to the initial s3 credentials (AccessKeyId/SecretAccessKey). This relation is established by the S3-Access-Box-CRDT-Name attribute.

Search algorithm

To support the scenario from the previous section and find the appropriate version of the AccessBox (the one that contains the most recent and relevant data), the following sequence is used:

AccessBox search process
  • Search for all objects whose S3-Access-Box-CRDT-Name attribute is equal to the AccessKeyId (the container id is extracted from the AccessKeyId, which has the format <cid>0<oid> if the AccessBox was created with default parameters; otherwise it can be an arbitrary user-defined string).
  • Get metadata for these objects using HEAD requests (not GET, to reduce network traffic)
  • Sort all these objects by creation epoch and object id
  • Pick the last object id (if no object is found, extract the object id from the AccessKeyId, which has the format <cid>0<oid> if the AccessBox was created with default parameters; otherwise it can be an arbitrary user-defined string. This is needed because some versions of AccessBox can lack the S3-Access-Box-CRDT-Name attribute.)
  • Get the appropriate object from FrostFS storage
  • Decrypt the AccessBox (see encryption)
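The sort-and-pick part of the sequence above can be sketched as follows. The `BoxHeader` type and the function name are hypothetical, and comparing object ids as plain strings is an assumption of this example; the search and HEAD requests themselves are left out.

```python
from typing import List, NamedTuple, Optional


class BoxHeader(NamedTuple):
    """Metadata obtained from a HEAD request (hypothetical shape)."""
    object_id: str
    creation_epoch: int


def pick_access_box_id(headers: List[BoxHeader], fallback_object_id: Optional[str]) -> Optional[str]:
    """Pick the newest AccessBox version, or fall back to the id taken from the AccessKeyId."""
    if not headers:
        # No object carries the S3-Access-Box-CRDT-Name attribute.
        return fallback_object_id
    # Sort by creation epoch first, then by object id, and take the last entry.
    ordered = sorted(headers, key=lambda h: (h.creation_epoch, h.object_id))
    return ordered[-1].object_id


found = [BoxHeader("oid-b", 5), BoxHeader("oid-a", 7), BoxHeader("oid-c", 7)]
print(pick_access_box_id(found, "oid-from-key"))
print(pick_access_box_id([], "oid-from-key"))
```

Using the object id as a tie-breaker makes the choice deterministic when two versions were created in the same epoch, so every gate picks the same version.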

Encryption

Each AccessBox contains sensitive information (SecretAccessKey, bearer/session tokens etc.) that must be protected and available only to trusted parties (in our case, the s3-gw).

Data is encrypted and decrypted using authenticated encryption with associated data (AEAD). The encryption algorithm is ChaCha20-Poly1305 (RFC).

In the following algorithms, ECDSA keys on the NIST P-256 curve (FIPS 186-3, section D.2.3), also known as secp256r1 or prime256v1, are used unless otherwise stated.

Encryption:

  • Create an ephemeral key (SeedKey); it is needed to generate the shared secret
  • Generate 32 random bytes (which, hex-encoded, become the SecretAccessKey), or use the existing secret access key (if the AccessBox is being updated rather than created from scratch), or use an arbitrary user-provided string
  • Generate the shared secret using ECDH
  • Derive a 32-byte key from the shared secret of the previous step using the HMAC-based key derivation function (HKDF) with SHA256
  • Encrypt the marshaled Tokens using the derived key with the ChaCha20-Poly1305 algorithm without additional data.

Decryption:

  • Get public part of SeedKey from AccessBox
  • Generate shared secret as follows:
    • Multiply the public part of SeedKey (a curve point) by the private part of the s3-gw key (scalar multiplication on the curve)
    • Use the X coordinate of the result (zero-padded at the beginning to fit 32 bytes)
  • Derive a 32-byte key from the shared secret of the previous step using the HMAC-based key derivation function (HKDF) with SHA256
  • Decrypt the encrypted marshaled Tokens using the derived key with the ChaCha20-Poly1305 algorithm without additional data.
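The key derivation step shared by encryption and decryption can be sketched with the standard library alone. The example below only covers zero-padding the X coordinate and HKDF with SHA256 (written out following RFC 5869); the curve multiplication and the ChaCha20-Poly1305 step are out of scope here. The function names are hypothetical, and the empty `salt` and `info` defaults are placeholders, since this document does not state which values s3-gw uses.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def shared_secret_from_x(x: int) -> bytes:
    """Encode the X coordinate as 32 big-endian bytes, zero-padded on the left."""
    return x.to_bytes(32, "big")


def hkdf_sha256(ikm: bytes, length: int = 32, salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (extract-then-expand) with HMAC-SHA256, following RFC 5869."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key is too long for HKDF-SHA256")
    # Extract: an absent salt is replaced by HASH_LEN zero bytes.
    prk = hmac.new(salt or bytes(HASH_LEN), ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Placeholder X coordinate for illustration only (not a real curve point).
secret = shared_secret_from_x(0x1234)
encryption_key = hkdf_sha256(secret)
print(len(secret), len(encryption_key))
```

The left padding matters: an X coordinate with leading zero bytes would otherwise produce a shorter input and therefore a different derived key on the other side.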

Policies

The main repository that contains policy implementation is https://git.frostfs.info/TrueCloudLab/policy-engine.

Policies can be stored locally (using the control api) or in the policy contract. When a policy check is performed, the following algorithm is applied:

  • Check local policies:
    • If any rule matched, return the check result.
  • Check contract policies:
    • If any rule matched, return the check result.
    • If no rules matched, return the deny status.

A deny-first scheme is applied to both local and contract policies. This means that if several rules matched the request (with both allow and deny statuses), the resulting status is deny.
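The check order and the deny-first scheme can be sketched as follows. This is an illustration of the algorithm as described here, not the policy-engine API; the `Status` enum, the rule representation and the helper names are all hypothetical.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional


class Status(Enum):
    ALLOW = "allow"
    DENY = "deny"


# A rule returns a status when it matches the request, or None when it does not.
Rule = Callable[[Dict[str, object]], Optional[Status]]


def group_rule(group_id: str, status: Status) -> Rule:
    """Build a rule that matches requests whose owner belongs to the given group."""
    def rule(request: Dict[str, object]) -> Optional[Status]:
        groups = request.get("frostfsid:groupID", [])
        return status if group_id in groups else None
    return rule


def check_rules(rules: List[Rule], request: Dict[str, object]) -> Optional[Status]:
    """Apply deny-first within one policy source; None means nothing matched."""
    matched = [status for status in (rule(request) for rule in rules) if status is not None]
    if not matched:
        return None
    return Status.DENY if Status.DENY in matched else Status.ALLOW


def check_policies(local: List[Rule], contract: List[Rule], request: Dict[str, object]) -> Status:
    """Check local policies first, then contract policies; deny if nothing matched."""
    for rules in (local, contract):
        result = check_rules(rules, request)
        if result is not None:
            return result
    return Status.DENY


request = {"Owner": "example-owner", "frostfsid:groupID": ["1", "7"]}
print(check_policies([group_rule("7", Status.ALLOW)], [group_rule("1", Status.DENY)], request))
```

In this sketch a match in the local policies ends the check, so contract rules are consulted only when no local rule applies to the request.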

Policy rules validate if specified request can be performed on the specific resource. Request and resource can contain some properties, and rules can contain conditions on some of these properties.

In s3-gw the resource is /bucket/object, /bucket or just / (if the request is trying to list buckets). Currently, the request being checked contains the following properties (so policy rules can contain conditions on them):

  • Owner - address of the owner performing the request (taken from the bearer token in the AccessBox)
  • frostfsid:groupID - groups to which the owner belongs (taken from the frostfsid contract)

Control auth process

s3-gw also has a control path gRPC api with its own authentication and authorization process.

But this process is quite straightforward:

  • Get the gRPC request
  • Check that the signing key belongs to the allowed key list (located in the config file)
  • Validate signature

The signing process uses asymmetric cryptography based on elliptic curves (ECDSA_SHA512). For more details, see the appropriate code in frostfs-api and frostfs-api-go.