Delivery modes & topologies
EchoRelay sits in the middle of your HTTP traffic — the target your callers hit and the source that calls your services. A request comes in; each target receives it in one of three delivery modes. Combine endpoints and targets and you get four named topologies.
The three delivery modes
A delivery mode is a per-target setting (the API / MCP enum is `async` / `sync` / `stream`). One endpoint can mix modes across its targets.
| Mode | What it does | The caller gets |
|---|---|---|
| Queued (`async`) | Queued, retried, fanned out to every target. The default. | 202 Accepted, immediately. |
| Request/response (`sync`) | We call the target inline and return its response. Capped 1–90 s; not retried. | The target's own response and status (usually 200). |
| Streaming (`stream`) | We forward an SSE / chunked response back as the target produces it. One stream target per endpoint. | The stream, chunk by chunk. |
With `sync` or `stream` the caller gets the response back, but these are not thin proxies: `sync` round-trips our durable queue and a delivery worker, while `stream` is a direct in-process proxy. The caller gets the data; we don't promise a specific latency figure.
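On the caller side, a `stream` response that carries SSE arrives as lines of text, with a blank line closing each event. The following is a minimal sketch of reading the `data:` payloads from such a stream; the function name `iter_sse_data` is hypothetical, the input is any iterable of text lines (for example, lines read from an HTTP response), and only the `data:` field is handled. It follows the general SSE wire format and is not a description of EchoRelay's own client tooling.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line (commonly used as a keep-alive); skip it.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event that was not closed by a blank line is discarded at end of input.


sample = [": keep-alive\n", "data: Hel\n", "\n", "data: lo\n", "data: world\n", "\n"]
print(list(iter_sse_data(sample)))  # ['Hel', 'lo\nworld']
```

A stream that is cut short simply ends the iteration; any event still being assembled at that point is not yielded.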
A mixed-mode endpoint
Fan out to queued targets and return one sync target's response in the same call:
{
  "targets": [
    { "name": "Slack", "targetUrl": "https://hooks.slack.com/...", "deliveryMode": "async" },
    { "name": "Warehouse", "targetUrl": "https://logs.example.com/in", "deliveryMode": "async" },
    { "name": "Lookup", "targetUrl": "https://api.example.com/price", "deliveryMode": "sync" }
  ]
}
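Before sending a targets list like the one above, it can help to check it locally against the rules stated on this page: the mode must be one of `async`, `sync`, or `stream`, and an endpoint may have only one stream target. The sketch below does just that. The function name `check_targets` is hypothetical, and treating a missing `deliveryMode` as `async` is an assumption based on queued being described as the default.

```python
ALLOWED_MODES = {"async", "sync", "stream"}


def check_targets(config: dict) -> list:
    """Return a list of problems found; an empty list means none."""
    problems = []
    targets = config.get("targets", [])
    for i, target in enumerate(targets):
        # Assumption: an omitted deliveryMode means async (the documented default).
        mode = target.get("deliveryMode", "async")
        if mode not in ALLOWED_MODES:
            problems.append(
                f"target {i} ({target.get('name')!r}): unknown deliveryMode {mode!r}"
            )
    # The mode table allows one stream target per endpoint.
    stream_count = sum(1 for t in targets if t.get("deliveryMode") == "stream")
    if stream_count > 1:
        problems.append(f"{stream_count} stream targets; only one is allowed per endpoint")
    return problems


endpoint = {
    "targets": [
        {"name": "Slack", "targetUrl": "https://hooks.slack.com/...", "deliveryMode": "async"},
        {"name": "Warehouse", "targetUrl": "https://logs.example.com/in", "deliveryMode": "async"},
        {"name": "Lookup", "targetUrl": "https://api.example.com/price", "deliveryMode": "sync"},
    ]
}
print(check_targets(endpoint))  # []
```

This page does not say whether an endpoint may carry more than one sync target, so the sketch leaves that unchecked.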
The four topologies
The shapes you build by combining endpoints, targets, and modes. Each has a use-case page with the full story.
Fan-out · 1 → N
One request fans out to many targets — authenticated, validated, retried, delivered at-least-once with a replayable failure queue. Fan-out →
Streaming proxy · caller ⇄ target
Forward a streaming target's SSE / chunked response straight to the caller — built for LLM token streaming and long-running inference. Streaming →
Mirror / shadow · 1 → live + shadow
Deliver to your live target and a shadow copy in the same request — for migration, audit, and A/B. A shadow target is a billable delivery, so mirroring doubles a request's deliveries (and its cost). Mirror →
Origin shield / ingestion · N → 1
Clients connect to EchoRelay, never to your origin. Auth, validation, rate-limiting, and IP allowlists run at the edge — flood protection (we rate-limit and allowlist; not a scrubbing CDN). Inbound is one project-wide key validated against your schema, not per-provider webhook-signature verification. Origin shield →
Billing
Each delivery is 1 credit: a request to 3 targets is 3 credits, plus 1 per 64 KB. Every mode bills the same per delivery: `sync`, `stream`, and each mirror/shadow target are all billable deliveries. A stream bills by the size of the response it streams back (heartbeats are free; a cut-short stream bills only for what was delivered). Deliveries that fail on our side aren't charged.
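For rough budgeting, the per-delivery rule can be turned into a small estimate. This is a sketch under stated assumptions, not the billing implementation: it assumes the size charge is 1 credit for each full 64 KB of payload, applied per delivery, with 1 KB taken as 1024 bytes. The page does not spell out the rounding or whether the size charge is per delivery or per request, so treat the size part as a guess. The function name `estimate_credits` is hypothetical.

```python
BLOCK_BYTES = 64 * 1024  # Assumption: "64 KB" means 65,536 bytes.


def estimate_credits(deliveries: int, payload_bytes: int = 0) -> int:
    """Estimate credits for one request that results in `deliveries` deliveries."""
    # Assumption: 1 extra credit per full 64 KB block, charged on each delivery.
    size_credits = payload_bytes // BLOCK_BYTES
    return deliveries * (1 + size_credits)


print(estimate_credits(3))               # 3: small request fanned out to 3 targets
print(estimate_credits(2))               # 2: live target plus one shadow target
print(estimate_credits(2, 200 * 1024))   # 8 under the rounding assumption above
```

The first two figures follow directly from the text (one credit per delivery, and a shadow target counting as a delivery); the third depends entirely on the assumed size rule.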
Next: Quickstart walks the queued path end to end, including a streaming example with `curl -N`.