Section 8.2 of RFC 8555 explains how retries apply to the validation
process. However, much is left up to the implementer.
Add retries every 12 seconds for 2 minutes after a client requests a
validation. The challenge status remains "processing" indefinitely until
a distinct conclusion is reached. This allows a client to continually
re-request a validation by sending a post-get to the challenge resource
until the process fails or succeeds.
Challenges in the processing state include information about why a
validation did not complete in the error field. The server also includes
a Retry-After header to help clients and servers coordinate.
Retries are inherently stateful because they're part of the public API.
When running step-ca in a highly available setup with replicas, care
must be taken to maintain a persistent identifier for each instance
"slot". In kubernetes, this implies a *stateful set*.