Skip to content

Conformance

read as .md

An implementation is ggui-conformant iff it passes the conformance kit — a fixture-based test suite a candidate runs against. Opinion, intent, and prose-level reading of this site are NOT the arbiter; the kit is.

This page describes what the ggui protocol promises in exchange for the name “protocol”, what each contract inside it must satisfy, and how the kit makes those promises observable.

Conformance for ggui means three concrete things:

  1. A third-party implementation can replace the first-party one — agents, hosts, and ops tools that work against the OSS ggui serve work against the third-party server too, without source-level changes.
  2. Every named failure mode produces an observable signal — operators see violations without running a debugger.
  3. Every breaking change to the protocol is reproducible — “is this PR breaking?” resolves to a kit run, not a debate.

If a layer that calls itself a “protocol” can’t honor these three promises, the layer is something else (a shape, an SDK feature, an author convention) and should be named accordingly.

A runtime contract in ggui MUST satisfy all four. Three of four is not a contract. Documenting a fourth in prose without mechanism is not a contract.

The contract states WHO is on each side. Not “producer” and “consumer” in the abstract — the concrete roles. The ggui_runtime_submit_action handler and the consume-buffer pipe. The iframe-runtime and the render-channel server. The contract author and the platform validator. If you can’t name the parties in one sentence, there is no contract — only a data type adrift.

  • ✅ “The ggui_runtime_submit_action handler appends every valid kind: 'dispatch' envelope to the render-keyed pending-events pipe; the agent drains it via ggui_consume.”
  • ❌ “Clients should respect the ordering semantics.”

Each party has a list of MUSTs and MUST-NOTs. SHOULDs are discipline, not contract — a SHOULD belongs in a style guide. The obligations must be specific enough that a reviewer can literally check a diff against them.

  • ✅ “Agent-authored streamSpec MUST NOT declare a channel whose name starts with _ggui:.”
  • ❌ “Handlers should behave predictably.”

An obligation without a verifiable test is a hope, not a contract.

When the obligation breaks, there is a named, typed, observable failure. Generic throw new Error() is not a failure mode. console.warn() is not a failure mode. “Log the outcome to telemetry and carry on” is not a failure mode.

Acceptable mechanisms:

  • A canonical error-code union (CONTRACT_VIOLATION, SESSION_NOT_FOUND, SCHEMA_MISMATCH_ERROR, the runtime-tool rejection codes INVALID_ACTION_KIND / PIPE_NOT_FOUND / CONTEXT_TOO_LARGE).
  • A typed rejection frame the caller receives on the call that caused it (e.g. a CONTRACT_VIOLATION error frame on the live channel, or a {ok:false, code} structuredContent rejection on a runtime tool).
  • Boot-time refusal with a message naming the offender (e.g. mount tool-name collision).
  • Compile-time impossibility (the type has no field → consumers can’t read it).

Unacceptable: silent noop, silent degradation, swallowed exception, “it’ll surface in the next call if it’s a real problem.”

An operator can see the violation without running a debugger. Surfaces:

  • An error envelope/frame the caller observes synchronously (the gold standard — a CONTRACT_VIOLATION error frame on the live channel when an inbound action violates the contract).
  • A validation response the caller receives synchronously.
  • An operator UI that reads the above.
  • A structured log emitted at a known event name, documented as the contract’s telemetry point.

A contract whose violations only show up in production stack traces fails this bar — an operator who inherits the system in six months can’t tell what’s working and what’s silently degraded.


A protocol is the emergent layer above a set of contracts. Every layer called “a protocol” MUST satisfy all six. Five of six is a protocol-in-progress; the gap MUST be flagged explicitly and the layer treated as experimental until closed.

A canonical prose spec describing every envelope shape, every field, every value constraint, and — critically — every intentional omission and why. Not just TypeScript types. Types say the shape; the spec says the semantics.

For ggui, this lives in the site-level wire references — Envelopes, MCP Protocol, WebSocket Protocol — which document every envelope shape, field, and intentional omission.

Given envelope X at time T in state S, which transitions are legal? What happens on reconnect? On resume? On concurrent emission? Crash recovery? Out-of-order delivery?

A protocol without sequencing is a data type with pretensions. ggui’s sequencing lives in the three-channel topology + the GguiSessionStreamBuffer interface + the fromSeq replay contract. Any new channel or envelope class MUST answer the sequencing question explicitly.

Producer and consumer MUST be able to agree on what version they speak, or explicitly refuse. Stamping a version on every envelope is infrastructure; actually rejecting a mismatched peer is the protocol feature. Pre-launch “advisory stamp only” is acceptable IF the launch-cutover plan names the policy flip (producer-stamp + consumer-reject).

Anti-pattern: calling version-stamping “versioning” and shipping. See Version policy for the post-launch handshake semantics.

A third-party implementer can verify their implementation without running the ggui server. This means:

  • A public fixture table of envelopes and expected behaviors.
  • An executable conformance kit (a test package a third party pnpm adds and runs).
  • Coverage of every canonical failure mode — a third-party implementation that passes the kit can be used interchangeably with the first-party one.

Without this, “third party adopts the protocol” means “third party reads the spec and hopes.”

5. Named failure modes — closed, or extensibly-closed

Section titled “5. Named failure modes — closed, or extensibly-closed”

Every way the protocol can fail has a name, a code, and a shape. Acceptable forms:

  • Closed union — exhaustive by design. Bumping requires a major version bump. Use when the set of values is fixed by an external spec or by a pre-launch design decision the protocol owners control end-to-end.
  • Extensibly-closed — e.g. the channel_error code union 'CHANNEL_UNKNOWN' | 'SESSION_NOT_FOUND' | 'SUBSCRIBE_UNAUTHORIZED' | 'POLL_FAILED' | (string & {}), or the open code: string on the WS error frame whose canonical literals include CONTRACT_VIOLATION and UPGRADE_REQUIRED. Consumers MUST handle unknown values gracefully. Adding new values does NOT bump the protocol version. Use when producer-side failure modes are expected to grow (new tool classes, new transports, new preconditions) without consumers needing a new version to render them.

Unacceptable: ad-hoc message: string as the only failure surface.

6. Vendor-neutral separation from implementation

Section titled “6. Vendor-neutral separation from implementation”

The protocol package imports nothing from a specific runtime, transport, cloud vendor, or framework. @ggui-ai/protocol MUST remain consumable by a third-party implementation that has never seen @ggui-ai/mcp-server.

Violation of this criterion is the loudest signal that what you have is an SDK feature pretending to be a protocol.


The kit lives at packages/protocol-conformance in the open repo — a fixture-based test suite a candidate implementation runs against to demonstrate conformance.

Terminal window
pnpm add -D @ggui-ai/protocol-conformance

Beyond the WS runner, the package ships pure-function catalogs for the gadget obligations — schema, registration, and resolution conformance — importable from @ggui-ai/protocol-conformance/{schema-conformance,registration-conformance,resolution-conformance}; the raw fixture catalog is also exported at @ggui-ai/protocol-conformance/fixtures.

Programmatic and CLI entry points:

import { runConformance } from "@ggui-ai/protocol-conformance";
const result = await runConformance({
serverUrl: "http://localhost:3000",
auth: { kind: "bearer", token: process.env.TOKEN! },
});
if (result.failed.length > 0) process.exit(1);
Terminal window
npx ggui-protocol-conformance --url http://localhost:3000 --auth bearer:$TOKEN

serverUrl / --url take the implementation’s base http:// / https:// URL — the runner appends /ws and derives the ws:// / wss:// scheme itself.

Transport. v1.0 is WebSocket-only — the canonical ggui transport (see the WebSocket Protocol reference). The kit’s TransportConfig is an extensibly-closed union, so later bindings (stdio MCP, HTTP long-poll) can be added post-v1.1 without breaking the public API.

Path-A vs Path-B fixtures. The fixture catalog spans both wire-observable claims and surface-observable claims. The runner handles Path-A — behaviors a runner can assert from WS frames alone (no MCP-Apps-host adapter, no Playwright). Path-B fixtures (e.g., bootstrap-failure, props-update) require a browser-host harness driving Playwright + page.route() fault injection + DOM assertion; they are recorded as SKIP on the WS runner, not FAIL. The partition is intentional: Path-A FAILs are vendor-neutrality bugs the server owns; Path-B SKIPs are claims a different driver is responsible for.

The kit covers:

  • Envelope round-trip — every fixture in the table can be emitted, observed, validated, and round-tripped.
  • Reserved-channel authority — the implementation rejects agent-authored streamSpec entries in the _ggui: namespace and validates the platform-owned shapes (_ggui:preview, _ggui:lifecycle).
  • Schema enforcementactionSpec[action].schema ⊆ the hinted tool’s inputSchema, with a SCHEMA_MISMATCH_ERROR push-time rejection on violation.
  • Sequencingseq is gap-free per-session, fromSeq replay returns the right tail, replayTruncated is honored when the requested cursor is unrecoverable.
  • Version handshake — schema-version stamping on every envelope; UPGRADE_REQUIRED emission on mismatch (post-launch — pre-launch is advisory).
  • Action persistence — an action frame round-trips to an ack carrying payload.sequence, proving the event landed on the GguiSession’s consume buffer (append-then-ack).
  • Contract enforcement at receipt — an action for an undeclared name is rejected with a CONTRACT_VIOLATION error frame and nothing reaches the consume buffer.

A change is breaking iff at least one fixture that passed against version N now fails against version N+1 when the only change is the protocol version. See Version policy → §2 for the policy this anchors.


Pre-launch (draft- versions), some protocol criteria are intentionally at ⚠️ — flagged gaps with owners and closing slices. The point is not zero gaps forever; it is zero silent gaps.

CriterionPre-launch postureLaunch (v1.0)
Version negotiationEnvelope schemaVersion: advisory stamp; consumers MUST NOT reject on mismatch. Subscribe handshake (supportedVersions / serverVersion): server default versionPolicy: 'reject' — emits UPGRADE_REQUIRED and closes; 'advisory' is a legacy opt-out for migration windows.Same mechanisms; envelope stamps stay advisory, the subscribe handshake stays the rejection point. Mismatched majors surface as UPGRADE_REQUIRED.
Conformance kitOptional today; available for first-party servers + early third-party implementers.Required before third-party implementers are invited to build on the protocol.
Schema-compat checkerRender-time + console blueprint-try (SCHEMA_MISMATCH_ERROR rejection on the ggui_render tool result); operator policy 'reject' | 'warn' | 'off' (default 'reject').Same; default unchanged.

At launch, all load-bearing boundaries MUST be at 4/4 contract + 6/6 protocol for any layer described as a public interop surface. Load-bearing = any boundary a third-party implementer is expected to adopt. Internal-only boundaries may sit at lower scores if the gap is explicitly scoped to internal use.


Things that feel contract-shaped but fail the bar:

Looks likeIs actuallyWhy
”TypeScript type is the contract”Shape-onlyTypes say what, not who/when/what-on-violation
”Docstring says MUST”Author disciplineProse without mechanism decays
”A validator function exists”Half-contractValidator without observable failure path wastes the check
”Error gets logged”Not observableLogs aren’t observability unless somebody reads them
”Pre-launch advisory”Infrastructure, not contractValid interim stance; MUST flag the gap + name the flip
”We cover the common cases”Denylist disciplineRock-paper-scissors with future adversaries
”Three similar implementations agree”FolkloreInterop-by-intuition is not interop

If a new “contract” maps to a row here, either strengthen it to meet the bar or pick a different word.


Used precisely throughout this site:

  • Shape — a TypeScript type with no enforcement story.
  • Convention — author discipline; no mechanism.
  • Contract — passes the 4-criterion bar.
  • Protocol — passes the 6-criterion bar; a set of contracts with sequencing + versioning + conformance-kit.

If a doc says “this contract is enforced by convention” — that’s a category error and either the contract framing or the convention framing is wrong.