
Testing ggui Integrations


ggui’s agent surface has a dedicated mock layer so you never need a live mcp.ggui.ai (or self-hosted ggui serve) connection in unit tests:

  • Agent side — mock the MCP transport (not a wrapper SDK) and assert the tool calls your agent makes.

Integration tests against a live endpoint belong in a separate tier — see Self-Hosted Reference Deploys for spinning up a throwaway stack.

ggui is consumed over standard MCP, so the right place to draw the test boundary is at the MCP transport, not at a wrapper SDK. Stub the tool-call responses your agent expects and assert against the call log — your agent code stays unmodified between test and production.

With @modelcontextprotocol/sdk — stub Client.callTool


If your agent talks to ggui through the canonical MCP Client, stub callTool and seed the responses keyed by tool name. A unit-test fixture only needs the fields your agent actually reads, so each builder returns a Pick<> of the real protocol type (GguiHandshakeOutput, GguiRenderOutput, GguiConsumeOutput from @ggui-ai/protocol) — keeping the field names and shapes wire-faithful without hand-rolling every required field of the live envelope.

import { describe, it, expect, beforeEach, vi } from "vitest";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import type {
  GguiHandshakeOutput,
  GguiRenderOutput,
  GguiConsumeOutput,
  ConsumeEventEntry,
} from "@ggui-ai/protocol";

function makeMockMcpClient() {
  const callLog: Array<{ name: string; arguments: unknown }> = [];
  // Handlers keyed by tool name; each returns that tool's structured payload.
  const responses = new Map<string, unknown>();
  const renderEvents = new Map<string, ConsumeEventEntry[]>();
  const renderStatus = new Map<string, "active" | "expired">();

  responses.set(
    "ggui_handshake",
    (): Pick<GguiHandshakeOutput, "handshakeId" | "action"> => ({
      handshakeId: `h_${Date.now()}`,
      action: "create",
    })
  );

  responses.set(
    "ggui_render",
    (): Pick<GguiRenderOutput, "sessionId" | "resourceUri" | "action"> => {
      const sessionId = `rnd_${Date.now()}`;
      renderEvents.set(sessionId, []);
      renderStatus.set(sessionId, "active");
      return {
        sessionId,
        // Spec-canonical MCP-Apps entry point. There is NO clickable
        // `url` on the wire: the host mounts the `ui://ggui/render/{id}`
        // iframe resource. (A dead `url` had the model hallucinating
        // links that resolve nowhere, so it was removed.)
        resourceUri: `ui://ggui/render/${sessionId}`,
        action: "create",
      };
    }
  );

  responses.set("ggui_consume", (args: { sessionId: string }): GguiConsumeOutput => {
    // Drain the queue so each pending event is delivered once.
    const events = renderEvents.get(args.sessionId) ?? [];
    renderEvents.set(args.sessionId, []);
    return {
      events,
      status: renderStatus.get(args.sessionId) ?? "active",
    };
  });

  const client = {
    callTool: vi.fn(async ({ name, arguments: args }: { name: string; arguments?: unknown }) => {
      callLog.push({ name, arguments: args });
      const handler = responses.get(name);
      if (!handler) {
        // JSON-RPC -32601 (Method not found), matching the live server.
        // The code is attached so tests can assert on it.
        throw Object.assign(new Error(`MCP tool not found: ${name}`), { code: -32601 });
      }
      const result = typeof handler === "function" ? handler(args) : handler;
      // MCP wraps tool output in { content: [...], structuredContent: ... }.
      return {
        content: [{ type: "text", text: JSON.stringify(result) }],
        structuredContent: result,
      };
    }),
  } as unknown as Client;

  return {
    client,
    callLog,
    // Simulate a user gesture appearing on the consume pipe.
    simulateSubmit(sessionId: string, data: ConsumeEventEntry["actionData"]) {
      const events = renderEvents.get(sessionId) ?? [];
      events.push({
        type: "action",
        sessionId,
        intent: "submit",
        actionData: data,
        uiContext: {},
        actionId: "mockactn",
        firedAt: new Date().toISOString(),
      });
      renderEvents.set(sessionId, events);
    },
    // Status semantics match the canonical protocol: `active` = more events
    // may arrive; `expired` = TTL elapsed. Flip terminal state explicitly.
    simulateExpire(sessionId: string) {
      renderStatus.set(sessionId, "expired");
    },
  };
}

describe("feedback agent", () => {
  let mock: ReturnType<typeof makeMockMcpClient>;

  beforeEach(() => {
    mock = makeMockMcpClient();
  });

  it("collects user feedback end-to-end", async () => {
    const handshake = await mock.client.callTool({
      name: "ggui_handshake",
      arguments: { intent: "Collect feedback" },
    });
    const { handshakeId } = handshake.structuredContent as GguiHandshakeOutput;

    const render = await mock.client.callTool({
      name: "ggui_render",
      // `props` is REQUIRED on ggui_render; pass `{}` when the agreed
      // contract declares no propsSpec.
      arguments: { handshakeId, props: {} },
    });
    const { sessionId, resourceUri } = render.structuredContent as GguiRenderOutput;

    // The render's entry point is the spec-canonical MCP-Apps resource
    // URI the host mounts, not a clickable link.
    expect(resourceUri).toMatch(/^ui:\/\/ggui\/render\//);

    // Pretend the user filled in the form.
    mock.simulateSubmit(sessionId, { rating: 5, comments: "Great!" });

    const consume = await mock.client.callTool({
      name: "ggui_consume",
      arguments: { sessionId },
    });
    const { events, status } = consume.structuredContent as GguiConsumeOutput;
    expect(status).toBe("active");
    expect(events[0].actionData).toEqual({ rating: 5, comments: "Great!" });

    // Assert the agent issued exactly the expected tool-call sequence.
    expect(mock.callLog.map((c) => c.name)).toEqual([
      "ggui_handshake",
      "ggui_render",
      "ggui_consume",
    ]);
  });
});

Consumers using @anthropic-ai/claude-agent-sdk typically pass an mcpServers config to query(). For unit tests, supply an in-memory server entry that returns canned tool responses instead of dialling mcp.ggui.ai. Use the SDK’s createSdkMcpServer (or the SDK-specific test helper) and register tools that produce the same structured payloads shown above.

The principle is identical: stub at the MCP transport surface so your agent prompt, tool-loop logic, and consume-pipe handling run unchanged.
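One way to keep the canned payloads reusable across SDKs is to hold them in a plain handler table and wrap each result in the MCP tool-result envelope. The sketch below is illustrative only: `cannedTools`, `toToolResult`, and `dispatch` are hypothetical names, the payloads carry just the fields the mock on this page returns, and wiring the table into createSdkMcpServer is left to the SDK's own documentation.

```typescript
// Hypothetical, SDK-agnostic table of canned ggui tool handlers.
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Record<string, unknown>;

interface ToolResult {
  content: Array<{ type: "text"; text: string }>;
  structuredContent: Record<string, unknown>;
}

// Fixed IDs keep assertions deterministic. These are partial payloads,
// not the full wire envelope.
const cannedTools: Record<string, ToolHandler> = {
  ggui_handshake: () => ({ handshakeId: "h_test", action: "create" }),
  ggui_render: () => ({
    sessionId: "rnd_test",
    resourceUri: "ui://ggui/render/rnd_test",
    action: "create",
  }),
  ggui_consume: () => ({ events: [], status: "active" }),
};

// Wrap a payload in the MCP tool-result shape: text content plus structuredContent.
function toToolResult(payload: Record<string, unknown>): ToolResult {
  return {
    content: [{ type: "text", text: JSON.stringify(payload) }],
    structuredContent: payload,
  };
}

// Look up a canned handler by tool name and return its wrapped result.
function dispatch(name: string, args: ToolArgs = {}): ToolResult {
  const handler = cannedTools[name];
  if (!handler) throw new Error(`MCP tool not found: ${name}`);
  return toToolResult(handler(args));
}
```

Each in-memory tool registered with the SDK can then delegate to `dispatch`, so the Claude Agent SDK tests and the plain MCP Client tests share one set of fixtures.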

If your agent talks to a self-hosted ggui serve over plain HTTP (http://127.0.0.1:6781/mcp), vi.spyOn(global, "fetch") (or msw) is the right boundary. Assert on the request URL and the JSON-RPC method, then return the matching structuredContent payload.
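A minimal sketch of such a fetch stub follows, under stated assumptions: `makeFakeFetch`, `respond`, and the request/response interfaces are hypothetical names, the stub answers only `tools/call` with a plain JSON body, and it ignores the rest of the MCP HTTP transport (an agent using the full MCP client would also need `initialize` answered).

```typescript
// Hypothetical JSON-RPC shapes; a simplification of the MCP HTTP transport.
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: { name?: string; arguments?: unknown };
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: { content: Array<{ type: "text"; text: string }>; structuredContent: unknown };
  error?: { code: number; message: string };
}

interface RecordedCall {
  url: string;
  method: string;
  tool: string;
}

// Build a fetch-shaped stub that records each request and answers tools/call
// with the canned structuredContent registered for that tool name.
function makeFakeFetch(canned: Record<string, unknown>) {
  const calls: RecordedCall[] = [];

  // Synchronous core, so tests can assert on it without awaiting.
  function respond(url: string, body: RpcRequest): RpcResponse {
    const tool = body.params?.name ?? "";
    calls.push({ url, method: body.method, tool });
    const known = Object.prototype.hasOwnProperty.call(canned, tool);
    if (body.method !== "tools/call" || !known) {
      return { jsonrpc: "2.0", id: body.id, error: { code: -32601, message: "Method not found" } };
    }
    const structuredContent = canned[tool];
    return {
      jsonrpc: "2.0",
      id: body.id,
      result: {
        content: [{ type: "text", text: JSON.stringify(structuredContent) }],
        structuredContent,
      },
    };
  }

  // Minimal fetch look-alike: only `ok`, `status`, and `json()` are provided.
  async function fakeFetch(url: string, init: { body: string }) {
    const payload = respond(url, JSON.parse(init.body) as RpcRequest);
    return { ok: true, status: 200, json: async () => payload };
  }

  return { fakeFetch, respond, calls };
}
```

In a test, `calls` gives the URL and JSON-RPC method to assert on. If the agent code expects a real `Response` object, the payload from `respond` would need to be wrapped accordingly before being returned from the spy.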

ggui surfaces failures at protocol level — there are no SDK-specific error classes to import. Assert against JSON-RPC error codes (-32601 method-not-found, -32602 invalid-params, etc.) or HTTP status codes (401 Unauthorized, 408 Request Timeout, 429 Too Many Requests), depending on your transport.

it("surfaces auth errors", async () => {
  const failingClient = {
    callTool: async () => {
      // Mirror the wire shape: an HTTP-401 surface from the gateway becomes a
      // JSON-RPC error on the MCP client.
      const err = new Error("Unauthorized") as Error & { code?: number };
      err.code = -32000; // JSON-RPC server-error range; httpStatus 401 upstream.
      throw err;
    },
  };

  await expect(
    failingClient.callTool({ name: "ggui_handshake", arguments: {} })
  ).rejects.toMatchObject({ code: -32000 });
});
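
The same pattern extends to other codes. As a hedged sketch, the helper below builds coded errors and uses one to reject a ggui_render call that omits `props` with -32602. `rpcError`, `renderStub`, and `codeOf` are hypothetical names, and whether the live server answers a missing `props` with exactly this code is an assumption to verify against your deployment.

```typescript
// Hypothetical helper: an Error carrying a JSON-RPC error code.
type RpcError = Error & { code: number };

function rpcError(code: number, message: string): RpcError {
  return Object.assign(new Error(message), { code });
}

// Hypothetical stub for ggui_render that enforces the required `props` argument.
function renderStub(args: { handshakeId?: string; props?: unknown }) {
  if (args.props === undefined) {
    // -32602 is the JSON-RPC "Invalid params" code.
    throw rpcError(-32602, "ggui_render: `props` is required");
  }
  return {
    sessionId: "rnd_test",
    resourceUri: "ui://ggui/render/rnd_test",
    action: "create",
  };
}

// Run a call and return the numeric error code it failed with,
// or null when it succeeded or threw without a numeric code.
function codeOf(fn: () => unknown): number | null {
  try {
    fn();
    return null;
  } catch (err) {
    const code = (err as { code?: unknown }).code;
    return typeof code === "number" ? code : null;
  }
}
```

A test can then assert `expect(codeOf(() => renderStub({ handshakeId: "h_1" }))).toBe(-32602)` without importing any error class.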

See Error Handling for retry, fallback, and graceful-degradation patterns built on these protocol-level signals.


LLM-generated component code is non-deterministic — assert behavior, not DOM structure. Pin contracts (via defineContract + useContract) and test your agent’s tool-call sequence. Leave the visual layer to live snapshots or a separate generation-quality tier.
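
For the tool-call-sequence part, an order check that tolerates extra calls can be less brittle than an exact array match, for instance when the agent polls ggui_consume more than once. The sketch below assumes a call log of tool names like the one the mock client records; `appearsInOrder` is a hypothetical helper, not part of ggui.

```typescript
// Hypothetical helper: true when `expected` occurs in `actual` as an ordered
// subsequence, ignoring any extra calls in between.
function appearsInOrder(actual: string[], expected: string[]): boolean {
  let next = 0;
  for (const name of actual) {
    if (next < expected.length && name === expected[next]) next += 1;
  }
  return next === expected.length;
}

// Example call log in which the agent polled ggui_consume twice.
const observed = ["ggui_handshake", "ggui_render", "ggui_consume", "ggui_consume"];
const required = ["ggui_handshake", "ggui_render", "ggui_consume"];
const inOrder = appearsInOrder(observed, required);
```

Use the exact-match assertion when the number of calls is itself part of the behavior under test, and the order check when only the sequence matters.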

See also: @ggui-ai/react · MCP Protocol · Troubleshooting.