---
title: Rate limits
description: How ggui surfaces rate limits — the in-band `rate_limited` tool error on `ggui_render`, the HTTP 429 + Retry-After contract, and how host SDKs handle backoff.
---

ggui rate limiting is operator-configured: a `RateLimiter` seam the deployment wires (or doesn't). This page covers the self-hosted defaults, the two enforcement layers and their wire shapes, and how to layer retry on top of whichever MCP host SDK you're using.

:::note[Hosted ggui (coming soon)]
On hosted ggui, specific limits will depend on your plan and the endpoint family, surfaced on the `mcp.ggui.ai` dashboard. The wire contracts below are independent of the numbers.
:::

## Self-hosted defaults

- Default (strict) `ggui serve` wires **no** generation limiter — `ggui_render` is unlimited for paired callers.
- `ggui serve --public-demo` binds a per-remote-IP fixed-window limiter to `ggui_render`: **30 generations / 10 minutes / IP** (operator-pays posture for public demos).
- Library users wire their own `RateLimiter` into the render handler deps — the seam and the typed `RateLimitedError` live in `@ggui-ai/mcp-server-core`.
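
For library users, the `--public-demo` policy above (30 generations per 10 minutes per IP) is a plain fixed-window counter. The sketch below shows that algorithm in isolation. The class name `FixedWindowLimiter`, its `check` method, and the `Decision` shape are hypothetical and are not the `RateLimiter` interface exported by `@ggui-ai/mcp-server-core`; adapt the logic to whatever the real seam expects.

```typescript
// Hypothetical shape: NOT the RateLimiter interface from @ggui-ai/mcp-server-core.
type Decision = { allowed: true } | { allowed: false; retryAfterMs: number };

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit = 30, windowMs = 10 * 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` is injectable so the window logic can be exercised without real time.
  check(key: string, now: number = Date.now()): Decision {
    const w = this.windows.get(key);
    // No window yet, or the previous one has expired: start a fresh window.
    if (!w || now - w.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return { allowed: true };
    }
    if (w.count < this.limit) {
      w.count += 1;
      return { allowed: true };
    }
    // Over the limit: report how long until this window closes.
    return { allowed: false, retryAfterMs: w.start + this.windowMs - now };
  }
}
```

Calling `new FixedWindowLimiter().check(remoteIp)` once per render request reproduces the demo numbers. The sketch never evicts idle keys, so a long-running deployment would want periodic cleanup of the map.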

## Two enforcement layers

### Tool-level: `ggui_render` rejects in-band

When a rate limiter is wired into the render handler, a limited `ggui_render` call rejects with an MCP **tool error** (an `isError` tool result), not an HTTP 429. The error carries the code `rate_limited` and a `retryAfterMs` value indicating how long to wait. Catch it in your agent loop like any other tool error and back off before re-calling.
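
As a rough illustration of catching the in-band error, the helper below inspects a tool result and returns a wait time when it looks rate limited. How the `rate_limited` code and `retryAfterMs` are serialized inside the result is an assumption here (JSON text, or plain text mentioning the code); the function name `rateLimitWaitMs` and the `ToolResultLike` type are hypothetical. Check the actual result your server produces before relying on it.

```typescript
// Minimal structural type for an MCP tool result (content blocks + isError flag).
interface ToolResultLike {
  isError?: boolean;
  content?: Array<{ type: string; text?: string }>;
}

// Returns the wait in ms if the result is an in-band rate_limited error, else null.
// ASSUMPTION: the error text is either JSON carrying `code` and `retryAfterMs`,
// or plain text mentioning `rate_limited`. Verify against your server's output.
function rateLimitWaitMs(result: ToolResultLike, fallbackMs = 1000): number | null {
  if (!result.isError) return null;
  for (const block of result.content ?? []) {
    if (block.type !== "text" || !block.text) continue;
    try {
      const parsed = JSON.parse(block.text) as { code?: unknown; retryAfterMs?: unknown } | null;
      if (parsed && parsed.code === "rate_limited") {
        return typeof parsed.retryAfterMs === "number" ? parsed.retryAfterMs : fallbackMs;
      }
    } catch {
      // Not JSON: fall through to the substring check below.
    }
    if (block.text.includes("rate_limited")) return fallbackMs;
  }
  return null;
}
```

An agent loop can call this on every `ggui_render` result, sleep for the returned number of milliseconds when it is not `null`, and then re-issue the call.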

### HTTP-level: 429 on auth/pairing endpoints

The pairing/login routes enforce limits at the HTTP transport layer. Every limited request returns:

| Field                | Value                                                                                          |
| -------------------- | ---------------------------------------------------------------------------------------------- |
| HTTP status          | `429`                                                                                          |
| `Retry-After` header | Seconds before the next attempt is permitted. Optional — absent means use exponential backoff. |
| Body                 | JSON `{ "error": { "code": "rate_limited", "message": "...", "retryAfter": <seconds> } }`.     |

`Retry-After` is the authoritative signal. When present, honor it verbatim — the server has already computed the appropriate wait. The `retryAfter` field in the body mirrors the header for convenience when only the body is observable (e.g. some transport wrappers).
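
The header-first, body-second precedence can be sketched as a small pure function. The body type follows the table above; the function name `serverRetryDelayMs` is hypothetical, and the sketch only handles the delay-in-seconds form described on this page.

```typescript
// Body shape from the table above; only the fields used here are typed.
interface RateLimitBody {
  error?: { code?: string; message?: string; retryAfter?: number };
}

// Returns the server-requested wait in ms, or null when neither the header nor
// the body carries one (the caller should then use exponential backoff).
function serverRetryDelayMs(headerValue: string | null, body?: RateLimitBody): number | null {
  // The header is authoritative when it holds a usable number of seconds.
  if (headerValue !== null && headerValue.trim() !== "") {
    const seconds = Number(headerValue);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Otherwise fall back to the mirrored field in the JSON body.
  const fromBody = body?.error?.retryAfter;
  if (typeof fromBody === "number" && Number.isFinite(fromBody) && fromBody >= 0) {
    return fromBody * 1000;
  }
  return null;
}
```

With a `fetch` response, this would be called as `serverRetryDelayMs(res.headers.get("retry-after"), await res.json())`.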

:::note[JSON-RPC code reservation]
The JSON-RPC error code `-32013 RATE_LIMIT_EXCEEDED` is reserved in the protocol types for forward compatibility, but ggui does **not** emit it on the wire today — the canonical mechanisms are the in-band `rate_limited` tool error (generation) and HTTP 429 (auth/pairing endpoints). If you're authoring a custom MCP transport that needs to surface rate-limiting through the JSON-RPC envelope (e.g. stdio bridges), `-32013` is the agreed code.
:::
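
For custom-transport authors, a JSON-RPC 2.0 error response using the reserved code might look like the sketch below. Only the code `-32013` comes from this page. The message text and the `data.retryAfterMs` field are assumptions for illustration, since ggui does not define a payload for this code today; `rateLimitedResponse` is a hypothetical helper name.

```typescript
const RATE_LIMIT_EXCEEDED = -32013;

// Standard JSON-RPC 2.0 error response envelope.
interface JsonRpcErrorResponse {
  jsonrpc: "2.0";
  id: string | number | null;
  error: { code: number; message: string; data?: unknown };
}

// ASSUMPTION: carrying the wait in `data.retryAfterMs` is illustrative only;
// ggui does not define a payload for this code today.
function rateLimitedResponse(
  id: string | number | null,
  retryAfterMs: number,
): JsonRpcErrorResponse {
  return {
    jsonrpc: "2.0",
    id,
    error: { code: RATE_LIMIT_EXCEEDED, message: "Rate limit exceeded", data: { retryAfterMs } },
  };
}
```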

## Retry is the host SDK's job

ggui has no first-party client SDK to wrap retries — your MCP host owns that loop. The pattern is the same regardless of host: catch the 429, read `Retry-After`, sleep, retry, cap attempts. For `ggui_render`, additionally check the tool result's `isError` flag for the in-band `rate_limited` error and back off the same way.

### Claude Agent SDK

The Claude Agent SDK's `query()` already retries transient transport errors (including 429) using the standard Anthropic SDK retry config. You generally don't need to do anything — bursts within the retry window never surface to your code. To tune, pass `maxRetries` through the SDK's options. See [Examples → Claude Agent](/examples/claude-agent/) for a runnable scaffold.

### `@modelcontextprotocol/sdk` (generic MCP)

The official MCP SDK throws on HTTP errors without retrying. Wrap `callTool` (or whichever method you invoke) yourself:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const client = new Client({ name: "my-agent", version: "1.0.0" });
await client.connect(
  new StreamableHTTPClientTransport(new URL("http://127.0.0.1:6781/mcp"), {
    requestInit: { headers: { Authorization: "Bearer dev" } },
  })
);

async function callWithRetry<T>(
  fn: () => Promise<T>,
  { maxRetries = 3, baseDelayMs = 1000, maxDelayMs = 30000 } = {}
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
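      // Assumption: the thrown error exposes the HTTP status as `status` or `code`
      // and may carry response headers; verify against your SDK version.
      const e = err as { status?: number; code?: number; headers?: Record<string, string> };
      if ((e.status ?? e.code) !== 429 || attempt === maxRetries) throw err;

      // Retry-After is in seconds; a valid 0 means "retry immediately".
      const header = e.headers?.["retry-after"];
      const retryAfter = header != null && header !== "" ? Number(header) : NaN;
      const waitMs = Number.isFinite(retryAfter)
        ? retryAfter * 1000
        : Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);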

      await new Promise((r) => setTimeout(r, waitMs));
    }
  }
  throw new Error("unreachable");
}

const result = await callWithRetry(() =>
  client.callTool({
    name: "ggui_handshake",
    arguments: {
      /* ... */
    },
  })
);
```

Tune `maxRetries` per workload: lower on interactive (user-blocking) paths so failures bubble up fast; raise on background batch paths where backoff is cheaper than re-queuing.

Note that a rate-limited `ggui_render` does NOT throw an HTTP error on a server with a generation limiter wired (for example `--public-demo`). It resolves with `isError: true` and a `rate_limited` message, so check the result before treating the call as a success.

## Raw HTTP

If you're hitting the server directly without an MCP SDK, implement the same loop against `fetch`:

1. Read the `Retry-After` header on every 429.
2. If present, sleep that many seconds, then retry.
3. If absent, sleep `min(baseDelay * 2^attempt, maxDelay)`, then retry.
4. Cap retries (3–5 is reasonable for interactive workloads, more for batch).
5. Stop retrying on non-429 4xx (those won't resolve with backoff).
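
The five steps above can be sketched as a small wrapper around any function that sends the request. The names `retryOn429`, `ResponseLike`, and `RetryOptions` are hypothetical; the default values mirror the `callWithRetry` example earlier on this page, and `sleep` is injectable only to keep the loop testable.

```typescript
// Minimal view of a fetch Response; the global `Response` satisfies it.
interface ResponseLike {
  status: number;
  headers: { get(name: string): string | null };
}

interface RetryOptions {
  maxRetries?: number;
  baseDelayMs?: number;
  maxDelayMs?: number;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

async function retryOn429<R extends ResponseLike>(
  send: () => Promise<R>,
  opts: RetryOptions = {},
): Promise<R> {
  const {
    maxRetries = 3,
    baseDelayMs = 1000,
    maxDelayMs = 30_000,
    sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
  } = opts;

  for (let attempt = 0; ; attempt++) {
    const res = await send();
    // Steps 4 and 5: any non-429 response is returned as-is, and so is the
    // final 429 once the retry budget is spent.
    if (res.status !== 429 || attempt >= maxRetries) return res;

    // Steps 1 to 3: prefer Retry-After (seconds), else exponential backoff.
    const header = res.headers.get("retry-after");
    const seconds = header === null || header.trim() === "" ? NaN : Number(header);
    const waitMs =
      Number.isFinite(seconds) && seconds >= 0
        ? seconds * 1000
        : Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
    await sleep(waitMs);
  }
}
```

Typical usage would be `const res = await retryOn429(() => fetch(url, init));`. The caller still inspects `res.status` afterwards, because the wrapper hands back the last 429 instead of throwing once attempts run out.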

The [generic MCP example](/examples/generic-mcp/) walks through raw-HTTP usage end-to-end.

## See also

- [Examples → Claude Agent](/examples/claude-agent/) — runnable scaffold with the host SDK's native retry.
- [Cookbook → Error handling](/cookbook/error-handling/) — retry, surfacing, and dead-letter patterns.
- [Troubleshooting](/troubleshooting/) — common error patterns.
- [MCP Protocol](/api/mcp-protocol/) — full JSON-RPC method reference.