Twins - Pome

A twin is a deterministic, resettable, in-process simulation of a real SaaS API—the same response shapes, status codes, and error semantics as the live service. Your agent runs against the twin, Pome records every tool call, and the judge scores the run against the scenario’s acceptance criteria. Pome ships three twins today: GitHub, Stripe, and Slack. Each twin page covers its API surface, env vars, seed shapes, and the bundled scenario catalog for that twin.

Available twins

Twin	`pome run` config	Standalone `pome twin start`	Page
GitHub	`twins: ["github"]`	Yes	GitHub twin
Stripe	`twins: ["stripe"]`	No (in-process via `pome run`)	Stripe twin
Slack	`twins: ["slack"]`	No (in-process via `pome run`)	Slack twin

Hosted sessions on app.pome.sh spawn the same twin packages inside per-session sandboxes. GitHub and Stripe are available via `pome session create`; Slack currently runs only through a local `pome run`.

How scenarios work

A scenario is a markdown file with three parts:

Seed state — the twin state before the agent runs (## Seed State)
Prompt — what the agent is asked to do (## Prompt)
Acceptance criteria — deterministic [D] and probabilistic [P] checks

Set the target twin in ## Config:

twins: ["github"]   # or stripe, slack
timeout: 60
passThreshold: 100
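
To make the three-part layout concrete, here is a minimal Python sketch that splits a scenario file into its `## ` sections and reports which expected sections are missing. Only the `## Seed State`, `## Prompt`, and `## Config` headings come from this page; the sample scenario text, the `## Acceptance Criteria` heading, the bullet form of the `[D]`/`[P]` checks, and the helper names `split_sections` and `missing_sections` are illustrative assumptions, not Pome's actual parser or format. Check a bundled scenario for the exact headings.

```python
import re

# Hypothetical scenario text. The title, the placeholder bodies, and the
# "## Acceptance Criteria" heading are assumptions made for illustration.
SAMPLE_SCENARIO = """\
# Example scenario

## Config
twins: ["github"]
timeout: 60
passThreshold: 100

## Seed State
(twin state before the agent runs)

## Prompt
(what the agent is asked to do)

## Acceptance Criteria
- [D] (a deterministic check)
- [P] (a probabilistic check)
"""

# Headings named on this page.
REQUIRED = ["Config", "Seed State", "Prompt"]


def split_sections(markdown: str) -> dict[str, str]:
    """Map each level-2 heading to the text beneath it."""
    sections: dict[str, str] = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return sections


def missing_sections(markdown: str) -> list[str]:
    """Return the required headings that the scenario does not contain."""
    found = split_sections(markdown)
    return [name for name in REQUIRED if name not in found]


if __name__ == "__main__":
    print(missing_sections(SAMPLE_SCENARIO))
```

A check like this can catch a misspelled heading before a run starts, but it says nothing about whether the seed state or criteria are valid for the target twin.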

Run a scenario

pome run scenarios/<scenario>.md --agent "<your agent command>"

During the run, Pome boots the matching twin on a localhost port and injects POME_<TWIN>_REST_URL, POME_<TWIN>_MCP_URL, and POME_AUTH_TOKEN into the agent process.
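
On the agent side, reading those variables might look like the Python sketch below. It assumes that `<TWIN>` expands to the upper-cased twin name (for example `POME_GITHUB_REST_URL`), which this page does not spell out; `TwinEndpoints` and `load_twin_endpoints` are hypothetical helper names, not part of Pome.

```python
import os
from dataclasses import dataclass


@dataclass
class TwinEndpoints:
    rest_url: str
    mcp_url: str
    auth_token: str


def load_twin_endpoints(twin: str, env=None) -> TwinEndpoints:
    """Read the variables Pome injects for one twin.

    Assumption: <TWIN> in the variable names is the upper-cased twin name.
    """
    env = os.environ if env is None else env
    prefix = f"POME_{twin.upper()}"
    names = [f"{prefix}_REST_URL", f"{prefix}_MCP_URL", "POME_AUTH_TOKEN"]
    # Fail early with a clear message if the agent was started outside `pome run`.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return TwinEndpoints(env[names[0]], env[names[1]], env[names[2]])
```

Passing `env` explicitly keeps the helper easy to exercise in unit tests without touching the real process environment.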

GitHub twin

Issues, PRs, labels, CI status, and 10 bundled scenarios.

Stripe twin

PaymentIntents, refunds, x402, and 6 bundled scenarios.

Slack twin

Channels, messaging, and 2 bundled scenarios.
