pome run is the core workflow. It loads a scenario markdown file (or every .md file in a directory), boots the matching twin, spawns your agent with injected environment variables, records the trace, and scores the acceptance criteria. Runs are hosted by default and recorded on app.pome.sh. Set POME_LOCAL=1 to use the in-process local twin instead (GitHub, Stripe, or Slack, depending on the scenario).

Usage

pome run <path> [options]

Arguments

| Argument | Required | Description |
| --- | --- | --- |
| <path> | Yes | Path to a scenario .md file or a directory. Directories run every .md file inside, in sorted order. |
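
The directory behaviour can be pictured with a small sketch. It assumes that "sorted order" means plain lexicographic ordering of file names and that only the top level of the directory is considered; neither detail is confirmed by the text above, and `selectScenarioFiles` is a hypothetical helper, not part of Pome.

```typescript
// Hypothetical helper: pick the scenario files from a directory listing.
// Assumption: "sorted order" is plain lexicographic order by file name.
function selectScenarioFiles(entries: string[]): string[] {
  return entries
    .filter((name) => name.endsWith(".md")) // keep only scenario markdown files
    .sort(); // default string sort, so "01-..." comes before "10-..."
}

const ordered = selectScenarioFiles([
  "20-slack-exfiltration.md",
  "01-bug-happy-path.md",
  "notes.txt",
  "10-stripe-create-payment-intent.md",
]);
// ordered now lists the 01-, 10-, and 20- scenarios in that order; notes.txt is skipped.
```

Under that assumption, zero-padded numeric prefixes such as 01-, 10-, and 20- are a simple way to control the order in which a batch runs.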

Options

| Flag | Default | Description |
| --- | --- | --- |
| --agent <command> | pome.config.json / examples/agents/scripted-triage-agent.ts | Shell command Pome spawns as the agent process. |
| --artifacts-dir <dir> | runs | Where to write per-run artifacts (events.jsonl, score.json, state snapshots). |
| --api-url <url> | POME_API_URL or https://api.pome.sh | Control-plane base URL for hosted runs. |
| --agent-model <name> | unknown | Informational model name recorded on the cloud run. |
| --no-fix-prompt | off | Skip CLI-side LLM fix-prompt generation. Submits fix_prompt: null to save tokens on bulk runs. |
| --no-capture | off | Self-host only. Skip the capture-server child and do not inject HTTP_PROXY / HTTPS_PROXY into the agent. No-op on hosted runs. |
| --hosted | | Deprecated no-op. Hosted is now the default. Kept for one release so existing scripts do not break. |
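
For scripts that launch pome run programmatically (for example from a CI job), the following sketch assembles an argument list from the flags in the table above. The `PomeRunOptions` interface and `buildPomeRunArgs` function are hypothetical names introduced for illustration; only the flag spellings come from this page.

```typescript
// Hypothetical options shape. Only flags listed in the table above are emitted.
interface PomeRunOptions {
  path: string;
  agent?: string;
  artifactsDir?: string;
  apiUrl?: string;
  agentModel?: string;
  noFixPrompt?: boolean;
  noCapture?: boolean;
}

// Build the argv that would follow the `pome` executable name.
function buildPomeRunArgs(opts: PomeRunOptions): string[] {
  const args: string[] = ["run", opts.path];
  if (opts.agent !== undefined) args.push("--agent", opts.agent);
  if (opts.artifactsDir !== undefined) args.push("--artifacts-dir", opts.artifactsDir);
  if (opts.apiUrl !== undefined) args.push("--api-url", opts.apiUrl);
  if (opts.agentModel !== undefined) args.push("--agent-model", opts.agentModel);
  if (opts.noFixPrompt === true) args.push("--no-fix-prompt");
  if (opts.noCapture === true) args.push("--no-capture");
  return args;
}

const bulkArgs = buildPomeRunArgs({
  path: "scenarios/",
  agent: "node ./dist/agent.js",
  noFixPrompt: true,
});
// bulkArgs: run, scenarios/, --agent, "node ./dist/agent.js", --no-fix-prompt
```

The resulting array can be handed to Node's `child_process.spawn("pome", bulkArgs)`. Passing the agent command as a single array element avoids shell quoting problems.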

Environment variables

Pome injects these into the agent process during a run:

| Variable | When |
| --- | --- |
| POME_TASK | Always. The scenario prompt from ## Prompt. |
| POME_<TWIN>_REST_URL | Always. Session-scoped REST root (GITHUB, STRIPE, or SLACK). |
| POME_<TWIN>_MCP_URL | Always. MCP transport endpoint for the active twin. |
| POME_AUTH_TOKEN | Always. Short-lived JWT for twin session auth. |
| POME_RUN_ID | Always. Unique run identifier. |
| POME_ARTIFACTS_DIR | Always. Per-run artifact directory path. |
| POME_ADAPTER_SIGNALS_PATH | When the agent adapter writes supplemental signals. |
| HTTP_PROXY / HTTPS_PROXY | Hosted and local capture-on runs. Agent outbound traffic flows through the capture-server. |
| NO_PROXY | Set to 127.0.0.1,localhost so twin traffic stays in-process. |
| POME_PREFLIGHT=1 | During agent preflight. Your agent should exit 0 immediately without doing work. |

For hosted Stripe runs, Pome may set POME_STRIPE_API_BASE and POME_STRIPE_API_KEY instead of the generic twin URL pattern. To use the local in-process twin and skip hosted auth, set POME_LOCAL=1 before running pome run.
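
As a rough illustration of how an agent might consume these variables, the sketch below collects them into one object and detects which twin is active. `PomeSession` and `readPomeEnv` are hypothetical names, and the sketch does not cover the hosted Stripe case where POME_STRIPE_API_BASE replaces the generic URL.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical shape for the values an agent cares about.
interface PomeSession {
  preflight: boolean;
  task: string;
  twin: string | null;
  restUrl: string | null;
  mcpUrl: string | null;
  authToken: string;
  runId: string;
}

// Twin names taken from the table above.
const TWINS: string[] = ["GITHUB", "STRIPE", "SLACK"];

function readPomeEnv(env: Env): PomeSession {
  // The active twin is the one whose REST URL variable was injected.
  const found = TWINS.filter((t) => env[`POME_${t}_REST_URL`] !== undefined);
  const twin = found.length > 0 ? found[0] : null;
  return {
    preflight: env["POME_PREFLIGHT"] === "1",
    task: env["POME_TASK"] ?? "",
    twin,
    restUrl: twin !== null ? (env[`POME_${twin}_REST_URL`] ?? null) : null,
    mcpUrl: twin !== null ? (env[`POME_${twin}_MCP_URL`] ?? null) : null,
    authToken: env["POME_AUTH_TOKEN"] ?? "",
    runId: env["POME_RUN_ID"] ?? "",
  };
}

// In a Node.js agent entry point this might be used as:
//   const session = readPomeEnv(process.env);
//   if (session.preflight) process.exit(0); // exit immediately during preflight
```

Taking the environment as a parameter rather than reading it globally keeps the function easy to exercise with fixture values.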

Examples

# Single scenario (hosted default)
pome run scenarios/01-bug-happy-path.md

# Custom agent command
pome run scenarios/10-stripe-create-payment-intent.md \
  --agent "npx tsx examples/agents/llm-refund-agent.ts"

# Run every scenario in a directory
pome run scenarios/ --agent "node ./dist/agent.js"

# Local in-process twin (no cloud recording)
POME_LOCAL=1 pome run scenarios/20-slack-exfiltration.md

# Bulk run without fix-prompt LLM calls
pome run scenarios/ --no-fix-prompt

Output

Each run writes artifacts under runs/<scenario>/<run-id>/:
  • events.jsonl — canonical event stream (TwinHttpEvent, LlmCallEvent, etc.)
  • score.json — per-criterion verdicts and aggregate satisfaction score
  • state_initial.json / state_final.json — twin state before and after the agent
Hosted runs also print a cloud: URL to the dashboard.
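
Since events.jsonl is a JSON Lines file, a post-run script can summarise it one line at a time. The sketch below counts events per kind; it assumes each event object carries its kind in a `type` field, which is a guess and not something this page states, so the field name is a parameter. `countEventsByType` is a hypothetical helper.

```typescript
// Count events per kind in a JSON Lines string.
// Assumption: the discriminating field is called "type"; pass another name if not.
function countEventsByType(jsonl: string, typeField: string = "type"): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines and a trailing newline
    const event = JSON.parse(trimmed) as Record<string, unknown>;
    const raw = event[typeField];
    const key = typeof raw === "string" ? raw : "unknown";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Illustrative input only; real events contain more fields.
const sample = [
  '{"type":"TwinHttpEvent"}',
  '{"type":"LlmCallEvent"}',
  '{"type":"TwinHttpEvent"}',
  "",
].join("\n");
const summary = countEventsByType(sample);
// summary has TwinHttpEvent counted twice and LlmCallEvent once.
```

In a real script the string would come from reading events.jsonl in the run's artifact directory, for example with `fs.readFileSync(path, "utf8")`.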

Exit codes

Returns the worst exit code across all scenarios in a batch. See CLI overview — exit codes.
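
The batch rule can be sketched as a simple reduction. This assumes "worst" means the numerically highest code and that an empty batch yields 0; the actual ordering of codes is defined in the CLI overview, so treat `worstExitCode` as a hypothetical illustration only.

```typescript
// Assumption: a higher exit code is "worse"; 0 means every scenario passed.
function worstExitCode(codes: number[]): number {
  return codes.reduce((worst, code) => Math.max(worst, code), 0);
}

const batchCode = worstExitCode([0, 2, 1]);
// batchCode is 2 under the assumption above.
```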