---
title: ggui serve
description: Run the self-hosted ggui runtime — MCP server plus a supervised agent, configured by ggui.json.
---

`ggui serve` boots the self-hosted ggui runtime: an MCP server (`@ggui-ai/mcp-server`) plus, by default, a supervised agent process declared in `ggui.json`. It's distinct from [`ggui dev`](/cli/dev/), the inner-loop development hub — `ggui serve` is the production-shaped self-host you'd put behind a tunnel or on a VM.

## Quick start

```bash
ggui serve
```

Binds `127.0.0.1:6781`, mounts the first-run bundle, and starts the agent declared in `ggui.json#agent.entry` alongside MCP. The CLI auto-opens the landing page once the banner prints (skip with `--no-open`).

**Point an agent at it** — the MCP endpoint is `http://127.0.0.1:6781/mcp`:

```bash
GGUI_MCP_URL=http://127.0.0.1:6781/mcp
GGUI_MCP_BEARER=dev   # i.e. `Authorization: Bearer dev` — works with `ggui serve --dev-allow-all`
```

With the default strict auth, swap `dev` for a pair-minted bearer (or one minted via `ggui keys create --keys-file <path>`).
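
As a rough sketch of how a client process might turn those two variables into connection settings, the helper below reads them from an environment map and builds the `Authorization` header. `mcpConnectionFromEnv` and the `McpConnection` shape are hypothetical names for illustration, not part of any ggui package.

```typescript
// Hypothetical helper for illustration; not part of any ggui package.
type Env = Record<string, string | undefined>;

interface McpConnection {
  url: string;
  headers: Record<string, string>;
}

function mcpConnectionFromEnv(env: Env): McpConnection {
  // Fall back to the default bind address documented above.
  const url = env.GGUI_MCP_URL ?? "http://127.0.0.1:6781/mcp";
  const headers: Record<string, string> = {};
  // Only send Authorization when a bearer is configured.
  if (env.GGUI_MCP_BEARER) {
    headers["Authorization"] = `Bearer ${env.GGUI_MCP_BEARER}`;
  }
  return { url, headers };
}
```

In a Node process you would typically call `mcpConnectionFromEnv(process.env)` and pass the result to whichever MCP client transport you use.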

For the bootstrap walkthrough, see [OSS Quick Start](/oss-quickstart/).

## The first-run bundle

A default `ggui serve` mounts these same-origin surfaces:

| Path                        | What it serves                                                                     |
| --------------------------- | ---------------------------------------------------------------------------------- |
| `/`                         | Landing page — server identity, pair-code card, links into the console.            |
| `/mcp`                      | MCP HTTP endpoint — agents call `ggui_render`, `ggui_consume`, etc. here.          |
| `/ws`                       | Live-channel WebSocket — live session plane for MCP Apps iframes and the console.  |
| `/ggui/health`              | Liveness probe — the path the [reference-deploy](/self-hosted/reference-deploys/) healthchecks hit. |
| `/r/<shortCode>`            | Signed render-viewer URL — resolves a shortCode to its session.                    |
| `/settings`                 | LLM provider-key page — paste a key; takes effect without restart.                 |
| `/pair`, `/admin/pair/init` | Pairing endpoints for paired viewer clients.                                       |

Hosts that need a different shape (no landing page, no pairing, programmatic control) should compose `createGguiServer()` directly rather than invoke this CLI. `createGguiServer({ mcpServices: [...] })` can also mount additional standalone MCP services at their own paths — see [MCP services](/architecture/mcp-services/).

## Flags

```text
ggui serve [options]
```

### Bind & lifecycle

| Flag            | Default     | Purpose                                                                                                                          |
| --------------- | ----------- | -------------------------------------------------------------------------------------------------------------------------------- |
| `--port <n>`    | `6781`      | Bind port. `0` = OS-assigned (the actual port prints in the boot banner).                                                        |
| `--host <addr>` | `127.0.0.1` | Bind host. Loopback only by default.                                                                                             |
| `--mcp-only`    | off         | Run just the MCP server; skip agent supervision even if `ggui.json` has `agent.entry`. Also implies `--no-open`.                 |
| `--no-open`     | off         | Skip auto-opening the operator's browser. Auto-open is also skipped whenever stdout is not a TTY (CI, supervised, piped output). |
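
The auto-open conditions in the table combine as a simple conjunction. The sketch below only restates the documented rules; `shouldAutoOpen` and its input shape are hypothetical names, and the real CLI may check these in a different way.

```typescript
// Restates the documented auto-open rules; names are hypothetical.
interface AutoOpenInputs {
  noOpen: boolean;      // --no-open was passed
  mcpOnly: boolean;     // --mcp-only was passed (implies --no-open)
  stdoutIsTTY: boolean; // false under CI, a supervisor, or piped output
}

function shouldAutoOpen({ noOpen, mcpOnly, stdoutIsTTY }: AutoOpenInputs): boolean {
  return !noOpen && !mcpOnly && stdoutIsTTY;
}
```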

### Auth posture (mutually exclusive)

By default, `/mcp` rejects any bearer that wasn't minted by the pairing flow. These flags change that posture for specific scenarios, and at most one of them can be set per boot:

| Flag              | Posture                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `--dev-allow-all` | Accept any bearer — or none at all (the no-bearer probe MCP custom connectors send) — as `builder`. Local-dev / tunnel smoke ONLY. **Never expose to the open internet** — the banner prints an unmissable warning when this is set.                                                                                                                                                                                                                                                 |
| `--public-demo`   | Same any-bearer auth as `--dev-allow-all`, plus a per-IP `FixedWindowRateLimiter` on `ggui_render` (default: 30 `ggui_render` calls per 10 min per IP) and a "PUBLIC DEMO — operator pays" banner. Use case: a single shared LLM key for an audience demo (Show HN, blog, classroom). Mutually exclusive with `--dev-allow-all`.                                                                                                                                                     |
| `--multi-tenant`  | Strict-auth multi-tenant posture. The console `/settings` LLM-keys gate switches from admin-token to auth-adapter so each authenticated end-user manages their OWN provider keys (scope = `userId` for `kind:'user'`, `appId` for `kind:'app'`). `kind:'builder'` identities are rejected. Mutually exclusive with `--dev-allow-all` and `--public-demo`. Note: the admin-token `/keys` pairing plane is separate from the `/settings` LLM-keys plane that `--multi-tenant` rebinds. |
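
To make the `--public-demo` limit concrete, here is a minimal sketch of the general fixed-window technique with the documented defaults (30 calls per 10 minutes per key). It is not the server's `FixedWindowRateLimiter`; the class name `FixedWindowSketch` and its `allow` method are hypothetical.

```typescript
// Illustrative fixed-window limiter; NOT the server's FixedWindowRateLimiter.
class FixedWindowSketch {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 30, windowMs = 10 * 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true when the call is allowed, false once the key has used up its window. */
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    // No window yet, or the previous window has expired: start a fresh one.
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

Keyed by client IP, a limiter like this caps each address independently; the counter resets when a new window starts rather than sliding continuously.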

### Custom connector hosts

| Flag                    | Purpose                                                                                                                                                                                                                                                                                                                                                                         |
| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--oauth`               | Mount OAuth 2.1 + PKCE + Dynamic Client Registration routes (`.well-known/oauth-*` + `/oauth/{authorize,token,register}`). Required for hosts whose Add Connector form has no field for a pre-shared bearer (claude.ai, ChatGPT). Pure-bearer clients (Claude Desktop with bearer in config) work without it.                                                                   |
| `--public-base-url <u>` | Override the public base URL used to compose the iframe-runtime + live-channel URLs written into each render's `ai.ggui/render` slice. Set to a tunnel URL (`https://<random>.trycloudflare.com`) when testing against a remote MCP host so those URLs resolve from the host's perspective. Without this, they derive from `--host:--port` and only work from the same machine. |
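
The effect of `--public-base-url` can be pictured as swapping the origin that externally visible URLs are built from. The sketch below assumes the `/mcp` and `/ws` paths from the first-run bundle and a `ws:`/`wss:` scheme for the live channel; `deriveEndpoints` is a hypothetical helper, and any path prefix on the base URL is ignored here.

```typescript
// Hypothetical helper; the server's actual URL composition may differ.
interface BindOptions {
  host: string;
  port: number;
  publicBaseUrl?: string; // value of --public-base-url, if set
}

function deriveEndpoints({ host, port, publicBaseUrl }: BindOptions): { mcp: string; ws: string } {
  // Without an override, URLs derive from the bind host and port.
  const origin = new URL(publicBaseUrl ?? `http://${host}:${port}`).origin;
  return {
    mcp: `${origin}/mcp`,
    // http -> ws, https -> wss
    ws: `${origin.replace(/^http/, "ws")}/ws`,
  };
}
```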

### Operator config

| Flag                     | Purpose                                                                                                                                                                                                                                                  |
| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--admin-token <token>`  | Pin the admin bearer that gates the console `/keys` plane. Without this, the server mints a fresh `ggui_admin_*` per boot and prints it on the banner. Pin a stable value when you want the bearer to survive restarts.                                  |
| `--keys-file <path>`     | JSON file backing the pairing service. When set, paired bearers survive restart. Stored in plaintext at `0600` perms — assume operator-controlled disk (e.g. `~/.ggui/keys.json`).                                                                       |
| `--ephemeral`            | Opt out of the default cross-restart persistence bundle (`.ggui/persistent/`). With this flag, HMAC secrets, renders, vectors, short-codes, and paired bearers all reset on every restart. Use for tests, CI loops, or incident-response nuclear-revoke. |
| `--seed-pool <dir>`      | Repeatable. Load a read-only shared blueprint pool (a [`ggui export-pool`](/cli/) artifact) for exact-contract reuse, consulted after the server's own blueprints.                                                                                       |
| `--mcp-instructions <p>` | Server-level MCP instructions preset (the string injected into the LLM's system prompt above the tool catalog). Presets: `default`, `aggressive`, `always`, `minimal`, `off`. Also accepts `GGUI_MCP_INSTRUCTIONS` env (CLI flag wins).                  |

## Agent runtime supervision

By default, `ggui serve` boots your agent alongside MCP. The entry file comes from `ggui.json`:

```json
{
  "agent": { "entry": "./agent.ts" }
}
```

Supported extensions:

- `.js` / `.mjs` / `.cjs` — runs as `node <entry>`
- `.ts` / `.tsx` / `.mts` — runs as `node --import=tsx <entry>` (`tsx` must be resolvable in your project)

Failure modes:

- **No `ggui.json`** or **no `agent.entry`** — falls back to MCP-only with a warning. Useful when you want to point an external agent runtime at the server.
- **Malformed `ggui.json`** or **unsupported entry extension** — hard error, exits 1 before binding.
- **Agent crashes after startup** — logged, MCP keeps running. No auto-restart — compose that with your own supervisor (systemd, pm2, Docker restart policy).
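
The extension rules above amount to a small lookup. A minimal sketch, assuming a hypothetical `resolveAgentCommand` helper (the CLI's internal function names are not documented here):

```typescript
// Maps an agent entry path to the argv described in the list above.
// Hypothetical helper name; only the extension rules come from the docs.
function resolveAgentCommand(entry: string): string[] {
  // Take the final extension of the last path segment, e.g. ".ts".
  const ext = /\.[^./\\]+$/.exec(entry)?.[0] ?? "";
  if ([".js", ".mjs", ".cjs"].includes(ext)) {
    return ["node", entry];
  }
  if ([".ts", ".tsx", ".mts"].includes(ext)) {
    // Requires `tsx` to be resolvable in the project.
    return ["node", "--import=tsx", entry];
  }
  // Unsupported extensions are a hard error before binding.
  throw new Error(`unsupported agent entry extension: "${ext}"`);
}
```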

## Generation (bring your own key)

Component-code generation on a self-hosted server uses **your** LLM provider key (BYOK). At boot, `ggui serve` resolves a key in this order:

1. **Provider env vars** — `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY` (falling back to `GEMINI_API_KEY`), `OPENROUTER_API_KEY`. The env layer always wins.
2. **`~/.ggui/credentials.json`** — keys pasted at `/settings` land here (plaintext, `0600`).
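
The lookup order can be sketched as follows. The env var names come from the list above; the shape of the stored credentials (a flat provider-to-key map) and the `resolveProviderKey` helper are assumptions made for illustration only.

```typescript
type Provider = "anthropic" | "openai" | "google" | "openrouter";
type Env = Record<string, string | undefined>;

// Env var names per provider, in lookup order.
const ENV_VARS: Record<Provider, readonly string[]> = {
  anthropic: ["ANTHROPIC_API_KEY"],
  openai: ["OPENAI_API_KEY"],
  google: ["GOOGLE_API_KEY", "GEMINI_API_KEY"],
  openrouter: ["OPENROUTER_API_KEY"],
};

interface ResolvedKey {
  key: string;
  source: "env" | "credentials";
}

// `stored` stands in for the parsed credentials file (assumed shape).
function resolveProviderKey(
  provider: Provider,
  env: Env,
  stored: Partial<Record<Provider, string>>,
): ResolvedKey | undefined {
  // 1. The env layer always wins.
  for (const name of ENV_VARS[provider]) {
    const value = env[name];
    if (value) return { key: value, source: "env" };
  }
  // 2. Otherwise fall back to a key saved via /settings.
  const saved = stored[provider];
  return saved ? { key: saved, source: "credentials" } : undefined;
}
```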

The model comes from `ggui.json#generation.model`, in either `provider:model` (canonical) or `provider/model` (LiteLLM) form:

```json
{
  "generation": { "model": "anthropic:claude-haiku-4-5-20251001" }
}
```

`GGUI_GENERATION_MODEL` overrides the manifest model, so one `ggui serve` instance can be pointed at a different model without editing `ggui.json`. Precedence is **`GGUI_GENERATION_MODEL` env > `ggui.json#generation.model` > per-provider default**. The env value accepts the same two forms as the manifest (`provider:model` and `provider/model`); a malformed value is a hard boot error, not a silent fallback. When neither env nor manifest sets a model, the per-provider default applies (anthropic `claude-haiku-4-5`, openai `gpt-5.5-2026-04-23`, google `gemini-3.1-flash-lite`).
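
A minimal sketch of the two accepted model-string forms and the precedence chain. `parseModelString` and `resolveModel` are hypothetical names, and splitting on the first `:` or `/` is an assumption about how the forms are told apart.

```typescript
interface ModelRef {
  provider: string;
  model: string;
}

// Accepts `provider:model` or `provider/model`; splits on the first separator.
function parseModelString(value: string): ModelRef {
  const match = /^([^:/]+)[:/](.+)$/.exec(value.trim());
  if (!match) {
    throw new Error(`malformed model "${value}": expected provider:model or provider/model`);
  }
  return { provider: match[1], model: match[2] };
}

// Precedence: GGUI_GENERATION_MODEL env > manifest model > per-provider default.
function resolveModel(
  env: Record<string, string | undefined>,
  manifestModel: string | undefined,
  providerDefault: ModelRef,
): ModelRef {
  const raw = env.GGUI_GENERATION_MODEL ?? manifestModel;
  // A malformed value throws instead of silently falling back.
  return raw === undefined ? providerDefault : parseModelString(raw);
}
```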

Rules:

- **A key without a model is a hard error** — if a boot key resolves but `generation.model` is unset, `ggui serve` exits with an actionable message showing both accepted model-string forms.
- **Bedrock routes are rejected** on the OSS path (IAM-only; supported via the hosted runtime only).
- **No key at all is not fatal** — the banner prints `⚠ no LLM key configured`, and renders fall back to a Connect-a-key card pointing at `/settings`. Pasting a key there takes effect without a restart.

## Persistent storage

By default, `ggui serve` writes a cross-restart bundle under `.ggui/persistent/` (project-local when a `ggui.json` was resolved, else `~/.ggui/persistent/`) so HMAC secrets, renders, vectors, short-codes, and paired bearers survive a restart. As a result, claude.ai chat-history revisits keep working without re-pairing.

Bundle layout:

```text
.ggui/persistent/
├── ws-token-secret.hex       (HMAC, 0600)
├── render-signer-secret.hex  (HMAC, 0600)
├── short-codes.sqlite        (signed render-URL resolution — backs the /r/<shortCode> viewer)
├── sessions.sqlite           (GguiSessionStore — renders + event history)
├── vectors.sqlite            (RAG corpus)
└── keys.json                 (paired bearers)
```

Override the directory with `GGUI_PERSISTENT_DIR`. Pass [`--ephemeral`](#operator-config) to skip the bundle entirely.
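
The directory selection described in this section can be sketched as a single function. This assumes `--ephemeral` takes priority over `GGUI_PERSISTENT_DIR`, which the text implies but does not state outright; `resolvePersistentDir` and its input names are hypothetical.

```typescript
import * as path from "node:path";

interface PersistentDirInputs {
  ephemeral: boolean;   // --ephemeral was passed
  envOverride?: string; // value of GGUI_PERSISTENT_DIR, if set
  manifestDir?: string; // directory of the resolved ggui.json, if any
  homeDir: string;      // the operator's home directory
}

// Returns the bundle directory, or undefined when nothing is persisted.
function resolvePersistentDir(inputs: PersistentDirInputs): string | undefined {
  if (inputs.ephemeral) return undefined; // assumed to win over the env override
  if (inputs.envOverride) return path.resolve(inputs.envOverride);
  // Project-local when a ggui.json was resolved, else under the home directory.
  return path.join(inputs.manifestDir ?? inputs.homeDir, ".ggui", "persistent");
}
```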

Explicit `ggui.json#storage` declarations always win. Declare them to override paths or swap a single surface back to memory while keeping the rest persistent:

```json
{
  "storage": {
    "renders": { "driver": "sqlite", "path": "./ggui-sessions.sqlite" },
    "vectors": { "driver": "sqlite", "path": "./ggui-vectors.sqlite" },
    "threads": { "driver": "sqlite", "path": "./ggui-threads.sqlite" }
  }
}
```

| Store     | Default                                                                                                  | sqlite driver requires    |
| --------- | -------------------------------------------------------------------------------------------------------- | ------------------------- |
| `renders` | sqlite under `.ggui/persistent/sessions.sqlite` (opt out with `--ephemeral` or `{ "driver": "memory" }`) | `better-sqlite3` peer dep |
| `vectors` | sqlite under `.ggui/persistent/vectors.sqlite` (opt out with `--ephemeral` or `{ "driver": "memory" }`)  | `better-sqlite3` peer dep |
| `threads` | **routes unmounted unless declared** (opt-in)                                                            | `better-sqlite3` peer dep |

Paths in `ggui.json#storage` resolve relative to the `ggui.json` directory, regardless of where `ggui serve` was invoked from.
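
In other words, a storage path is resolved against the manifest's directory rather than the current working directory. A minimal sketch using Node's `path` module (`resolveStoragePath` is a hypothetical helper name):

```typescript
import * as path from "node:path";

// Resolves a `storage.*.path` value against the directory holding ggui.json.
function resolveStoragePath(manifestPath: string, storagePath: string): string {
  // Absolute storage paths pass through unchanged; relative ones are
  // anchored at the manifest directory, not process.cwd().
  return path.resolve(path.dirname(manifestPath), storagePath);
}
```

For example, with the manifest at `/srv/app/ggui.json`, `./ggui-sessions.sqlite` resolves to `/srv/app/ggui-sessions.sqlite` no matter where the command was launched.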

For threads specifically, `driver: "memory"` mounts the routes but data resets on restart (`/ggui/health` reports `threads.durability: "ephemeral"`); `driver: "sqlite"` is durable.

## Recommended setups

**Local-only smoke (no tunnel, no remote MCP host):**

```bash
ggui serve
```

**claude.ai custom connector (over a public tunnel):**

```bash
# Terminal 1: tunnel
cloudflared tunnel --url http://localhost:6781

# Terminal 2: serve
ggui serve --oauth \
           --public-base-url https://<tunnel>.trycloudflare.com
```

Note the `PAIR_CODE` from the boot banner. Visit the public URL, complete pairing, save the bearer. Then in claude.ai → Settings → Connectors → Add custom connector, point it at `https://<tunnel>.trycloudflare.com/mcp`, leave Client ID / Secret empty (the server uses Dynamic Client Registration), click Connect, and paste the bearer when prompted.

**Quick local-dev without auth (NEVER over a public tunnel):**

```bash
ggui serve --dev-allow-all
```

For the full pairing walkthrough, see [Self-hosted pairing](/self-hosted/pairing/).

## Production hardening

The default auth shape is dev-mode pairing. For production, swap in a real `AuthAdapter` by composing `createGguiServer()` directly in your agent entrypoint instead of running the CLI:

```typescript
import { createGguiServer } from "@ggui-ai/mcp-server";

const server = createGguiServer({
  auth: {
    /* your AuthAdapter — OIDC, Cognito, custom */
  },
});
```

The adapter gates both `/mcp` and the live-channel `/ws` upgrade. See [Reference deploys](/self-hosted/reference-deploys/) for Docker / Fly / Render manifests.

## Current limits

- Strict-auth only — `/mcp` rejects any bearer not pair-minted (or relaxed via `--dev-allow-all` / `--public-demo`).
- Single-tenant by default — every request scopes to one `builder` app ID. Use `--multi-tenant` to scope per authenticated user.
- No auto-restart on agent crash — compose with your own supervisor.

## See also

- [`ggui` CLI overview](/cli/) — the full command surface.
- [`ggui login`](/cli/login/) — sign into `api.ggui.ai` for `ggui_user_*` connector keys _(Preview — managed cloud, coming soon; separate from `ggui serve`'s pairing flow)_.
- [Self-hosted pairing](/self-hosted/pairing/) — pair a viewer client to a `ggui serve` instance.
- [Reference deploys](/self-hosted/reference-deploys/) — Docker, Fly, Render manifests for `ggui serve`.