From e80d147d722d149e392e332af2ff4fee4278b1d9 Mon Sep 17 00:00:00 2001
From: Roman Khimov
Date: Fri, 30 Apr 2021 00:15:04 +0300
Subject: [PATCH] README: rewrite all documentation

I think we can fit into one page with this, thus how-to-check.md was removed
(and it's a bit bloated to me anyway).
---
 README.md            | 295 +++++++++++++++++++++++++++++++++++--------
 docs/how-to-check.md | 111 ----------------
 2 files changed, 239 insertions(+), 167 deletions(-)
 delete mode 100644 docs/how-to-check.md

diff --git a/README.md b/README.md
index 5e5472d..d15a01f 100644
--- a/README.md
+++ b/README.md
@@ -4,57 +4,139 @@ NeoFS HTTP Protocol Gateway bridges NeoFS internal protocol and HTTP standard.
 - you can download one file per request from NeoFS Network
 - you can upload one file per request into the NeoFS Network
 
-## Notable make targets
+## Installation
+
+```go get -u github.com/nspcc-dev/neofs-http-gate```
+
+Or you can call `make` to build it from the cloned repository (the binary will
+end up in `bin/neofs-http-gw`).
+
+### Notable make targets
 
 ```
 dep          Check and ensure dependencies
 image        Build clean docker image
-dirty-image  Build diry docker image with host-built binaries
+dirty-image  Build dirty docker image with host-built binaries
 fmts         Run all code formatters
 lint         Run linters
 version      Show current version
 ```
 
-## Install
+## Execution
 
-```go get -u github.com/nspcc-dev/neofs-http-gate```
+HTTP gateway itself is not a NeoFS node, so to access NeoFS it uses node's
+gRPC interface and you need to provide some node that it will connect to. This
+can be done either via `-p` parameter or via `HTTP_GW_PEERS_<N>_ADDRESS` and
+`HTTP_GW_PEERS_<N>_WEIGHT` environment variables (the gate supports multiple
+NeoFS nodes with weighted load balancing).
 
-## File uploading behaviors
+These two commands are functionally equivalent, they run the gate with one
+backend node (and otherwise default settings):
+```
+$ neofs-http-gw -p 192.168.130.72:8080
+$ HTTP_GW_PEERS_0_ADDRESS=192.168.130.72:8080 neofs-http-gw
+```
 
-- you can upload on file per request
-- if `FileName` not provided by Header attributes, multipart/form filename will be used instead
+### Configuration
 
-## Configuration
+In general, everything available as CLI parameter can also be specified via
+environment variables, so they're not specifically mentioned in most cases
+(see `--help` also).
+
+#### Nodes and weights
+
+You can specify multiple `-p` options to add more NeoFS nodes, this will make
+gateway spread requests equally among them (using weight 1 for every node):
 
 ```
-# Flags:
+$ neofs-http-gw -p 192.168.130.72:8080 -p 192.168.130.71:8080
+```
+If you want some specific load distribution proportions, use weights, but they
+can only be specified via environment variables:
 
-      --pprof                     enable pprof
-      --metrics                   enable prometheus
-  -h, --help                      show help
-  -v, --version                   show version
-      --key string                path to private key file, hex string or wif (the key will be autogenerated if not specified)
-      --verbose                   debug gRPC connections
-      --request_timeout duration  gRPC request timeout (default 5s)
-      --connect_timeout duration  gRPC connect timeout (default 30s)
-      --listen_address string     HTTP gate's listen address (default "0.0.0.0:8082")
-      --tls_certificate string    TLS certificate path
-      --tls_key string            TLS key path
-  -p, --peers stringArray         NeoFS nodes
+```
+$ HTTP_GW_PEERS_0_ADDRESS=192.168.130.72:8080 HTTP_GW_PEERS_0_WEIGHT=9 \
+  HTTP_GW_PEERS_1_ADDRESS=192.168.130.71:8080 HTTP_GW_PEERS_1_WEIGHT=1 neofs-http-gw
+```
+This command will make gateway use 192.168.130.72 for 90% of requests and
+192.168.130.71 for remaining 10%.
 
-# Environments:
+#### Keys
 
-HTTP_GW_KEY=string - Path to private key file, hex string or wif string
-HTTP_GW_CONNECT_TIMEOUT=duration - Timeout for connection
-HTTP_GW_REQUEST_TIMEOUT=duration - Timeout for request
-HTTP_GW_REBALANCE_TIMER=duration - Time between connections checks
-HTTP_GW_LISTEN_ADDRESS=host:port - Address to listen connections
-HTTP_GW_TLS_CERTIFICATE=path - File with TLS certificate
-HTTP_GW_TLS_KEY=path - File with TLS private key
-HTTP_GW_PEERS__ADDRESS=host:port - Address of NeoFS Node
-HTTP_GW_PEERS__WEIGHT=float - Weight of NeoFS Node (1 if not specified)
-HTTP_GW_PPROF=bool - Enable/disable pprof (/debug/pprof)
-HTTP_GW_METRICS=bool - Enable/disable prometheus metrics endpoint (/metrics)
+By default gateway autogenerates key pair it will use for NeoFS requests. If
+for some reason you need to have static keys you can pass them via `--key`
+parameter. The key can be a path to private key file (as raw bytes), a hex
+string or (unencrypted) WIF string. Example:
+
+```
+$ neofs-http-gw -p 192.168.130.72:8080 --key KxDgvEKzgSBPPfuVfw67oPQBSjidEiqTHURKSDL1R7yGaGYAeYnr
+```
+
+#### Binding and TLS
+
+Gateway binds to `0.0.0.0:8082` by default and you can change that with
+`--listen_address` option.
+
+It can also provide TLS interface for its users, just specify paths to key and
+certificate files via `--tls_key` and `--tls_certificate` parameters. Note
+that using these options makes gateway TLS-only, if you need to serve both TLS
+and plain text HTTP you either have to run two gateway instances or use some
+external redirecting solution.
+
+Example to bind to `192.168.130.130:443` and serve TLS there:
+
+```
+$ neofs-http-gw -p 192.168.130.72:8080 --listen_address 192.168.130.130:443 \
+  --tls_key=key.pem --tls_certificate=cert.pem
+```
+
+#### HTTP parameters
+
+You can tune HTTP read and write buffer sizes as well as timeouts with
+`HTTP_GW_WEB_READ_BUFFER_SIZE`, `HTTP_GW_WEB_READ_TIMEOUT`,
+`HTTP_GW_WEB_WRITE_BUFFER_SIZE` and `HTTP_GW_WEB_WRITE_TIMEOUT` environment
+variables.
+
+`HTTP_GW_WEB_STREAM_REQUEST_BODY` environment variable can be used to disable
+request body streaming (effectively it'll make gateway accept file completely
+first and only then try sending it to NeoFS).
+
+`HTTP_GW_WEB_MAX_REQUEST_BODY_SIZE` controls maximum request body size
+limiting uploads to files slightly lower than this limit.
+
+#### NeoFS parameters
+
+Gateway can automatically set timestamps for uploaded files based on local
+time source, use `HTTP_GW_UPLOAD_HEADER_USE_DEFAULT_TIMESTAMP` environment
+variable to control this behavior.
+
+#### Monitoring and metrics
+
+Pprof and Prometheus are integrated into the gateway, but not enabled by
+default. To enable them use `--pprof` and `--metrics` flags or
+`HTTP_GW_PPROF`/`HTTP_GW_METRICS` environment variables.
+
+#### Timeouts
+
+You can tune gRPC interface parameters with `--connect_timeout` (for
+connection to node) and `--request_timeout` (for request processing over
+established connection) options as well as `HTTP_GW_KEEPALIVE_TIME`
+(peer pinging interval), `HTTP_GW_KEEPALIVE_TIMEOUT` (peer pinging timeout)
+and `HTTP_GW_KEEPALIVE_PERMIT_WITHOUT_STREAM` environment variables.
+
+gRPC-level checks allow gateway to detect dead peers, but it declares them
+unhealthy at pool level once per `--rebalance_timer` interval, so check for it
+if needed.
+
+All timing options accept values with suffixes, so "15s" is 15 seconds and
+"2m" is 2 minutes.
+
+#### Logging
+
+`--verbose` flag enables gRPC logging and there is a number of environment
+variables to tune logging behavior:
+
+```
 HTTP_GW_LOGGER_FORMAT=string - Logger format
 HTTP_GW_LOGGER_LEVEL=string - Logger level
 HTTP_GW_LOGGER_NO_CALLER=bool - Logger don't show caller
@@ -62,27 +144,128 @@ HTTP_GW_LOGGER_NO_DISCLAIMER=bool - Logger don't show application
 HTTP_GW_LOGGER_SAMPLING_INITIAL=int - Logger sampling initial
 HTTP_GW_LOGGER_SAMPLING_THEREAFTER=int - Logger sampling thereafter
 HTTP_GW_LOGGER_TRACE_LEVEL=string - Logger show trace on level
-HTTP_GW_KEEPALIVE_TIME=duration - After a duration of this time if the client sees no activity
-    it pings the server to see if the transport is still alive
-HTTP_GW_KEEPALIVE_TIMEOUT=duration - After having pinged for keepalive check, the client waits for a duration
-    of Timeout and if no activity is seen even after that the connection
-    is closed
-HTTP_GW_KEEPALIVE_PERMIT_WITHOUT_STREAM=bool - If true, client sends keepalive pings even with no active RPCs.
-    If false, when there are no active RPCs, Time and Timeout will be
-    ignored and no keepalive pings will be sent
-HTTP_GW_UPLOAD_HEADER_USE_DEFAULT_TIMESTAMP=bool - Enable/disable adding current timestamp attribute when object uploads
-
-HTTP_GW_WEB_READ_BUFFER_SIZE=4096 - per-connection buffer size for requests' reading
-HTTP_GW_WEB_READ_TIMEOUT=15s - an amount of time allowed to read the full request including body
-HTTP_GW_WEB_WRITE_BUFFER_SIZE=4096 - per-connection buffer size for responses' writing
-HTTP_GW_WEB_WRITE_TIMEOUT=1m0s - maximum duration before timing out writes of the response
-HTTP_GW_WEB_STREAM_REQUEST_BODY=true - enables request body streaming, and calls the handler sooner when given
-    body is larger then the current limit
-HTTP_GW_WEB_MAX_REQUEST_BODY_SIZE=4194304 - maximum request body size, server rejects requests with bodies exceeding
-    this limit
-
-Peers preset:
-
-HTTP_GW_PEERS__ADDRESS = string
-HTTP_GW_PEERS__WEIGHT = float
 ```
+
+## HTTP API provided
+
+This gateway intentionally provides limited feature set and doesn't try to
+substitute (or completely wrap) regular gRPC NeoFS interface. You can download
+and upload objects with it, but deleting, searching, managing ACLs, creating
+containers and other activities are not supported and not planned to be
+supported.
+
+### Downloading
+
+#### Requests
+
+Basic downloading involves container and object ID and is done via GET
+requests to `/get/$CID/$OID` path, like this:
+
+```
+$ wget http://localhost:8082/get/Dxhf4PNprrJHWWTG5RGLdfLkJiSQ3AQqit1MSnEPRkDZ/2m8PtaoricLouCn5zE8hAFr3gZEBDCZFe9BEgVJTSocY
+
+```
+
+There is also more complex interface provided for attribute-based downloads,
+it's usually used to retrieve files by their names, but any other attribute
+can be used as well. The generic syntax for it looks like this:
+
+```/get_by_attribute/$CID/$ATTRIBUTE_NAME/$ATTRIBUTE_VALUE```
+
+where `$CID` is a container ID, `$ATTRIBUTE_NAME` is the name of the attribute
+we want to use and `$ATTRIBUTE_VALUE` is the value of this attribute that the
+target object should have.
+
+If multiple objects have specified attribute with specified value, then the
+first one of them is returned (and you can't get others via this interface).
+
+Example for file name attribute:
+
+```
+$ wget http://localhost:8082/get_by_attribute/88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM/FileName/cat.jpeg
+```
+
+Some other user-defined attribute:
+
+```
+$ wget http://localhost:8082/get_by_attribute/Dxhf4PNprrJHWWTG5RGLdfLkJiSQ3AQqit1MSnEPRkDZ/Ololo/100500
+```
+
+An optional `download=true` argument for `Content-Disposition` management is
+also supported (more on that below):
+
+```
+$ wget http://localhost:8082/get/Dxhf4PNprrJHWWTG5RGLdfLkJiSQ3AQqit1MSnEPRkDZ/2m8PtaoricLouCn5zE8hAFr3gZEBDCZFe9BEgVJTSocY?download=true
+
+```
+
+#### Replies
+
+You get object contents in the reply body, but at the same time you also get a
+set of reply headers generated using the following rules:
+ * `Content-Length` is set to the length of the object
+ * `Content-Type` is autodetected dynamically by gateway
+ * `Content-Disposition` is `inline` for regular requests and `attachment` for
+   requests with `download=true` argument, `filename` is also added if there
+   is `FileName` attribute set for this object
+ * `Last-Modified` header is set to `Timestamp` attribute value if it's
+   present for the object
+ * `x-container-id` contains container ID
+ * `x-object-id` contains object ID
+ * `x-owner-id` contains owner address
+ * all the other NeoFS attributes are converted to `x-*` attributes (but only
+   if they can be safely represented in HTTP header), for example `FileName`
+   attribute becomes `x-FileName`
+
+### Uploading
+
+You can POST files to `/upload/$CID` path where `$CID` is container ID. The
+request must contain multipart form with mandatory `filename` parameter. Only
+one part in multipart form will be processed, so to upload another file just
+issue new POST request.
+
+Example request:
+
+```
+$ curl -F 'file=@cat.jpeg;filename=cat.jpeg' http://localhost:8082/upload/Dxhf4PNprrJHWWTG5RGLdfLkJiSQ3AQqit1MSnEPRkDZ
+```
+
+Chunked encoding is supported by the server (but check for request read
+timeouts if you're planning some streaming). You can try streaming support
+with large file piped through named FIFO pipe:
+
+```
+$ mkfifo pipe
+$ cat video.mp4 > pipe &
+$ curl --no-buffer -F 'file=@pipe;filename=catvideo.mp4' http://localhost:8082/upload/Dxhf4PNprrJHWWTG5RGLdfLkJiSQ3AQqit1MSnEPRkDZ
+```
+
+You can also add some attributes to your file using the following rules:
+ * all "X-Attribute-*" headers get converted to object attributes with
+   "X-Attribute-" prefix stripped, that is if you add "X-Attribute-Ololo:
+   100500" header to your request the resulting object will get "Ololo:
+   100500" attribute
+ * "X-Attribute-NEOFS-*" headers are special, they're used to set internal
+   NeoFS attributes starting with `__NEOFS__` prefix, for these attributes all
+   dashes get converted to underscores and all letters are capitalized. For
+   example, you can use "X-Attribute-NEOFS-Expiration-Epoch" header to set
+   `__NEOFS__EXPIRATION_EPOCH` attribute
+ * `FileName` attribute is set from multipart's `filename` if not set
+   explicitly via `X-Attribute-FileName` header
+ * `Timestamp` attribute can be set using gateway local time if using
+   HTTP_GW_UPLOAD_HEADER_USE_DEFAULT_TIMESTAMP option and if request doesn't
+   provide `X-Attribute-Timestamp` header of its own
+
+For successful uploads you get JSON data in reply body with container and
+object ID, like this:
+```
+{
+    "object_id": "9ANhbry2ryjJY1NZbcjryJMRXG5uGNKd73kD3V1sVFsX",
+    "container_id": "Dxhf4PNprrJHWWTG5RGLdfLkJiSQ3AQqit1MSnEPRkDZ"
+}
+```
+
+### Metrics and Pprof
+
+If enabled, Prometheus metrics are available at `/metrics/` path and Pprof at
+`/debug/pprof`.
diff --git a/docs/how-to-check.md b/docs/how-to-check.md
deleted file mode 100644
index ca96116..0000000
--- a/docs/how-to-check.md
+++ /dev/null
@@ -1,111 +0,0 @@
-# How to check
-
-1. Create container
-```
-→ neofs-cli -k /path/to/user.key -r s01.neofs.devenv:8080 container create --name TestStorage --basic-acl public -p "REP 1 IN X CBF 1 SELECT 1 FROM * AS X" --await
-container ID: 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM
-awaiting...
-container has been persisted on sidechain
-```
-
-2. Put object into container
-
-```
-→ neofs-cli -k /path/to/user.key -r s01.neofs.devenv:8080 object put --cid 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM --file /path/to/1.jpeg
-[/path/to/1.jpeg] Object successfully stored
-  ID: GTUokhLMtEq1Kh1nzSVCsWybFvVHFQyhZHaVXZBrYd3W
-  CID: 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM
-```
-
-3. Check that object can be fetched by oldest API
-
-```
-→ curl -sSI -XGET http://http.neofs.devenv/get/88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM/GTUokhLMtEq1Kh1nzSVCsWybFvVHFQyhZHaVXZBrYd3W
-HTTP/1.1 200 OK
-Date: Thu, 03 Dec 2020 15:04:52 GMT
-Content-Type: image/jpeg
-Content-Length: 93077
-x-object-id: GTUokhLMtEq1Kh1nzSVCsWybFvVHFQyhZHaVXZBrYd3W
-x-owner-id: NTrezR3C4X8aMLVg7vozt5wguyNfFhwuFx
-x-container-id: 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM
-x-FileName: 1.jpeg
-x-Timestamp: 1607006318
-Last-Modified: Thu, 03 Dec 2020 17:38:38 MSK
-Content-Disposition: inline; filename=1.jpeg
-```
-
-4. Check that object can be fetched by newest API
-
-```
-→ curl -sSI -XGET http://http.neofs.devenv/get_by_attribute/88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM/FileName/1.jpeg
-HTTP/1.1 200 OK
-Date: Thu, 03 Dec 2020 15:04:52 GMT
-Content-Type: image/jpeg
-Content-Length: 93077
-x-object-id: GTUokhLMtEq1Kh1nzSVCsWybFvVHFQyhZHaVXZBrYd3W
-x-owner-id: NTrezR3C4X8aMLVg7vozt5wguyNfFhwuFx
-x-container-id: 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM
-x-FileName: 1.jpeg
-x-Timestamp: 1607006318
-Last-Modified: Thu, 03 Dec 2020 17:38:38 MSK
-Content-Disposition: inline; filename=1.jpeg
-```
-
-5. Put second object with same name
-
-```
-→ neofs-cli -k /path/to/user.key -r s01.neofs.devenv:8080 object put --cid 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM --file /path/to/1.jpeg
-[/path/to/1.jpeg] Object successfully stored
-  ID: 14Q3AhJhPyJzWrmiYMzswRDY4cXSUgKPSAEDxadkHKga
-  CID: 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM
-
-```
-
-6. Check that object can be fetched by oldest API
-
-```
-→ curl -sSI -XGET http://http.neofs.devenv/get/88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM/14Q3AhJhPyJzWrmiYMzswRDY4cXSUgKPSAEDxadkHKga
-HTTP/1.1 200 OK
-Date: Thu, 03 Dec 2020 15:07:51 GMT
-Content-Type: image/jpeg
-Content-Length: 93077
-x-object-id: 14Q3AhJhPyJzWrmiYMzswRDY4cXSUgKPSAEDxadkHKga
-x-owner-id: NTrezR3C4X8aMLVg7vozt5wguyNfFhwuFx
-x-container-id: 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM
-x-FileName: 1.jpeg
-x-Timestamp: 1607006355
-Last-Modified: Thu, 03 Dec 2020 17:39:15 MSK
-Content-Disposition: inline; filename=1.jpeg
-```
-
-7. Retry fetch object by newest API
-
-```
-→ curl -sSI -XGET http://http.neofs.devenv/get_by_attribute/88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM/FileName/1.jpeg
-HTTP/1.1 200 OK
-Date: Thu, 03 Dec 2020 15:04:28 GMT
-Content-Type: image/jpeg
-Content-Length: 93077
-x-object-id: 14Q3AhJhPyJzWrmiYMzswRDY4cXSUgKPSAEDxadkHKga
-x-owner-id: NTrezR3C4X8aMLVg7vozt5wguyNfFhwuFx
-x-container-id: 88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM
-x-FileName: 1.jpeg
-x-Timestamp: 1607006355
-Last-Modified: Thu, 03 Dec 2020 17:39:15 MSK
-Content-Disposition: inline; filename=1.jpeg
-```
-
-**http-gate log when find multiple objects**
-```
-2020-12-03T18:04:28.617+0300 debug neofs-gw/receive.go:191 find multiple objects {"cid": "88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM", "attr_key": "FileName", "attr_val": "1.jpeg", "object_ids": ["14Q3AhJhPyJzWrmiYMzswRDY4cXSUgKPSAEDxadkHKga", "GTUokhLMtEq1Kh1nzSVCsWybFvVHFQyhZHaVXZBrYd3W"], "show_object_id": "14Q3AhJhPyJzWrmiYMzswRDY4cXSUgKPSAEDxadkHKga"}
-```
-
-8. Check newest API when object not found
-
-```
-→ curl -sSI -XGET http://http.neofs.devenv/get_by_attribute/88GdaZFTcYJn1dqiSECss8kKPmmun6d6BfvC4zhwfLYM/FileName/2.jpeg
-HTTP/1.1 404 Not Found
-Date: Thu, 03 Dec 2020 15:11:07 GMT
-Content-Type: text/plain; charset=utf-8
-Content-Length: 9
-```
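
Reviewer note on the README's "Uploading" section: the header-to-attribute rules it describes can be sketched in Go roughly as follows. This is only an illustration of the rules as written in the README, not the gateway's actual code. The function name `attributeKey` and the constants are hypothetical, and the sketch assumes header names arrive with exactly the casing shown in the README (HTTP header names are case-insensitive, so a real implementation would have to normalize them first).

```go
package main

import (
	"fmt"
	"strings"
)

// Prefixes taken from the README text; the constant names are made up
// for this sketch.
const (
	attrHeaderPrefix  = "X-Attribute-"
	neofsHeaderPrefix = "NEOFS-"
	neofsAttrPrefix   = "__NEOFS__"
)

// attributeKey maps an HTTP header name to an object attribute key
// following the README rules. It reports false for headers that do not
// carry the "X-Attribute-" prefix (or carry nothing after it).
func attributeKey(header string) (string, bool) {
	if !strings.HasPrefix(header, attrHeaderPrefix) {
		return "", false
	}
	name := strings.TrimPrefix(header, attrHeaderPrefix)
	if name == "" {
		return "", false
	}
	// "X-Attribute-NEOFS-*" headers become "__NEOFS__*" attributes with
	// dashes turned into underscores and all letters capitalized.
	if strings.HasPrefix(name, neofsHeaderPrefix) {
		rest := strings.TrimPrefix(name, neofsHeaderPrefix)
		rest = strings.ReplaceAll(rest, "-", "_")
		return neofsAttrPrefix + strings.ToUpper(rest), true
	}
	// Regular attributes just lose the "X-Attribute-" prefix.
	return name, true
}

func main() {
	headers := []string{
		"X-Attribute-Ololo",
		"X-Attribute-NEOFS-Expiration-Epoch",
		"Content-Type",
	}
	for _, h := range headers {
		key, ok := attributeKey(h)
		fmt.Printf("%s -> %q (attribute: %v)\n", h, key, ok)
	}
}
```

The two README examples (`Ololo` and `__NEOFS__EXPIRATION_EPOCH`) are the cases this sketch is meant to reproduce. The `FileName` and `Timestamp` fallbacks from the same list are not covered here.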