Concepts

Delivery modes & topologies

EchoRelay sits in the middle of your HTTP traffic: it is the target your callers hit and the source that calls your services. A request comes in; each target receives it in one of three delivery modes. Combine endpoints, targets, and modes and you get four named topologies.

The three delivery modes

A delivery mode is a per-target setting (the API / MCP enum is async / sync / stream). One endpoint can mix modes across its targets.

Mode	What it does	The caller gets
Queued `async`	Queued, retried, fanned out to every target. The default.	`202 Accepted`, immediately.
Request/response `sync`	We call the target inline and return its response. Capped 1–90 s; not retried.	The target's own response and status (usually `200`).
Streaming `stream`	We forward an SSE / chunked response back as the target produces it. One stream target per endpoint.	The stream, chunk by chunk.

With sync or stream the caller gets the response back, but the two modes take different paths. sync round-trips our queue and a delivery worker, so it is not a thin proxy; stream is a direct in-process proxy. In both cases the caller gets the data, and we don't promise a specific latency figure.

A mixed-mode endpoint

Fan out to queued targets and return one sync target's response in the same call:

{
  "targets": [
    { "name": "Slack",     "targetUrl": "https://hooks.slack.com/...", "deliveryMode": "async" },
    { "name": "Warehouse", "targetUrl": "https://logs.example.com/in", "deliveryMode": "async" },
    { "name": "Lookup",    "targetUrl": "https://api.example.com/price", "deliveryMode": "sync" }
  ]
}
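
The rules stated so far (three mode names, queued as the default, one stream target per endpoint) can be checked before a config is submitted. The sketch below is a client-side check only, not EchoRelay's own validation. The function name `validate_targets`, the example URLs, and the idea that a target without `deliveryMode` falls back to `async` are assumptions made for illustration.

```python
# Mode names come from the enum quoted above; everything else is illustrative.
DELIVERY_MODES = {"async", "sync", "stream"}
DEFAULT_MODE = "async"  # the table calls queued delivery "the default"


def validate_targets(endpoint: dict) -> list[str]:
    """Return a list of problems found in an endpoint's targets (empty if none)."""
    problems = []
    stream_count = 0
    for i, target in enumerate(endpoint.get("targets", [])):
        # Assumption: a target that omits deliveryMode uses the default mode.
        mode = target.get("deliveryMode", DEFAULT_MODE)
        if mode not in DELIVERY_MODES:
            problems.append(f"target {i}: unknown deliveryMode {mode!r}")
        if mode == "stream":
            stream_count += 1
    if stream_count > 1:
        problems.append(f"{stream_count} stream targets; an endpoint allows one")
    return problems


# Hypothetical endpoint mirroring the mixed-mode example above.
mixed = {
    "targets": [
        {"name": "Warehouse", "targetUrl": "https://logs.example.com/in", "deliveryMode": "async"},
        {"name": "Lookup", "targetUrl": "https://api.example.com/price", "deliveryMode": "sync"},
    ]
}
print(validate_targets(mixed))  # an empty list means no problems were found
```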

The four topologies

The shapes you build by combining endpoints, targets, and modes. Each has a use-case page with the full story.

Fan-out · 1 → N

One request fans out to many targets — authenticated, validated, retried, delivered at-least-once with a replayable failure queue. Fan-out →
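
At-least-once delivery means a target can receive the same request more than once, for example after a retry. One common way to cope is to make the target idempotent. The sketch below shows that idea in its simplest form. `delivery_id` is a hypothetical identifier: this page does not say what identifier, if any, EchoRelay attaches to a delivery, so substitute whatever stable key your payloads carry.

```python
class DedupingHandler:
    """Wraps a handler so each delivery ID is processed at most once."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # in-memory only; a real service would persist this

    def handle(self, delivery_id: str, payload: dict) -> bool:
        """Run the handler unless this delivery_id was already handled.

        Returns True if the handler ran, False for a duplicate.
        """
        if delivery_id in self._seen:
            return False
        self._handler(payload)
        # Record the ID only after the handler succeeds, so a failed
        # attempt can still be retried later.
        self._seen.add(delivery_id)
        return True


processed = []
receiver = DedupingHandler(processed.append)
receiver.handle("d-1", {"order": 42})
receiver.handle("d-1", {"order": 42})  # redelivery of the same ID is skipped
```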

Streaming proxy · caller ⇄ target

Forward a streaming target's SSE / chunked response straight to the caller — built for LLM token streaming and long-running inference. Streaming →
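
On the caller side, a forwarded SSE stream arrives as lines of text that have to be grouped into events. The sketch below is a minimal parser for the standard SSE wire format, working on any iterable of lines. It is not an EchoRelay client. The function name `iter_sse_data` is made up for this example, and it ignores the `event`, `id`, and `retry` fields.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # SSE comment line, ignored by the format
        field, _, value = line.partition(":")
        if field == "data":
            # The format strips one leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event, id, retry) are ignored in this sketch.


sample = ["data: Hel\n", "\n", ": comment\n", "data: lo\n", "data: world\n", "\n"]
events = list(iter_sse_data(sample))
```

Multiple `data:` lines inside one event are joined with a newline, and an event that is never closed by a blank line is dropped, which is how the SSE format defines both cases.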

Mirror / shadow · 1 → live + shadow

Deliver to your live target and a shadow copy in the same request — for migration, audit, and A/B. A shadow target is a billable delivery, so mirroring doubles a request's deliveries (and its cost). Mirror →

Origin shield / ingestion · N → 1

Clients connect to EchoRelay, never to your origin. Auth, validation, rate-limiting, and IP allowlists run at the edge. That gives you flood protection through rate limits and allowlists; EchoRelay is not a scrubbing CDN. Inbound requests are authenticated with one project-wide key and validated against your schema; there is no per-provider webhook-signature verification. Origin shield →

Billing

Each delivery is 1 credit — a request to 3 targets is 3 credits, plus 1 per 64 KB. Every mode bills the same per delivery: sync, stream, and each mirror/shadow target are all billable deliveries. A stream bills by the size of the response it streams back (heartbeats are free; a cut-short stream bills only for what was delivered). Deliveries that fail on our side aren't charged.
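
As a rough estimator, the rules above can be written as a small function. This is a sketch under loud assumptions, not the billing implementation. It assumes that "64 KB" means 65,536 bytes, that the size charge applies per delivery, and that it is 1 extra credit per full 64 KB block. The text does not state the rounding rule, so check an invoice before relying on the size term.

```python
BLOCK_BYTES = 64 * 1024  # assumption: "64 KB" means 65,536 bytes


def delivery_credits(size_bytes: int) -> int:
    """Credits for one successful delivery.

    Assumption: 1 base credit plus 1 credit per *full* 64 KB block,
    applied per delivery. The rounding rule is not stated in the text.
    """
    return 1 + size_bytes // BLOCK_BYTES


def request_credits(deliveries: list[tuple[int, bool]]) -> int:
    """Sum credits over (size_bytes, failed_on_our_side) pairs, one per target.

    For a stream target, size_bytes is the size of the response streamed back.
    """
    return sum(
        delivery_credits(size)
        for size, failed_on_our_side in deliveries
        if not failed_on_our_side  # deliveries that fail on our side aren't charged
    )


# Three targets, each with a small 2,000-byte payload.
three_targets = request_credits([(2_000, False)] * 3)  # 3, matching the example above
# A live target plus a shadow target: two billable deliveries.
mirrored = request_credits([(2_000, False), (2_000, False)])
```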

Next: Quickstart walks the queued path end to end, including a streaming example with curl -N.
