1. Install
2. Log in
Logging in opens app.pome.sh: you sign in, and a pme_… API key is stored in your macOS Keychain (preferred) or in ~/.pome/credentials.json (on Linux or Windows, or when Keychain is unavailable). CI can skip this step and set POME_API_KEY directly.
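When a run cannot authenticate, it helps to know which of these credential locations is actually populated. The helper below is a minimal sketch, not part of pome: the function name pome_key_source is hypothetical, and it assumes an exported POME_API_KEY takes precedence over the credentials file. It does not inspect the macOS Keychain.

```shell
# Report where a Pome API key would most likely be picked up from.
# Assumption: an exported POME_API_KEY wins over the credentials file.
# The macOS Keychain is not checked here.
pome_key_source() {
  creds_file="${1:-$HOME/.pome/credentials.json}"
  if [ -n "${POME_API_KEY:-}" ]; then
    echo "env"
  elif [ -f "$creds_file" ]; then
    echo "file"
  else
    echo "none"
  fi
}

pome_key_source
```

In CI, exporting POME_API_KEY from the secret store before invoking pome should make this report "env".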
3. Install the Pome skills
The Pome skills teach your coding agent how to wire itself up to pome and run scenarios. They are installed into ~/.claude/skills/ (symlinked to the installed pome-sh package, so npm i -g pome-sh@latest keeps them current):
- /pome-setup: identifies your agent and the services it talks to, registers it on the dashboard, and writes a starter TESTS.md.
- /pome-test: runs the scenarios that match your agent and reports the results.
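To confirm that the skills are in place and still linked to the installed package, you can list the symlinks. This is a rough sketch: list_pome_skills is a hypothetical helper, and it assumes the skill entries are named pome-* inside the skills directory.

```shell
# List Pome skill entries and show where each symlink points.
# Assumption: skill entries are named pome-* inside the skills directory.
list_pome_skills() {
  skills_dir="${1:-$HOME/.claude/skills}"
  for entry in "$skills_dir"/pome-*; do
    # Skip the literal pattern when the glob matched nothing.
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    if [ -L "$entry" ]; then
      echo "$(basename "$entry") -> $(readlink "$entry")"
    else
      echo "$(basename "$entry") (not a symlink)"
    fi
  done
}

list_pome_skills
```

An entry reported as "not a symlink" would not be refreshed by a package upgrade, which is worth checking if a skill seems out of date.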
4. First run
Open your coding agent (Claude Code, for example) in the project that contains your agent code, then prompt it to run the setup skill.
/pome-setup will:
- Identify your agent + the services it uses (GitHub? Stripe?) and confirm with you.
- Add minimal, non-breaking hooks so pome can capture tool calls during a test run.
- Register the agent on the dashboard and print a URL.
- Suggest 5 scenarios from the library that match your agent and ask for confirmation.
/pome-test runs the confirmed scenarios against the matching twin and reports back inline.
5. See the results
Open the dashboard to review the run. The run artifacts (events.jsonl, score.json, state snapshots) stay on disk under runs/<scenario>/<run-id>/.
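Because the artifacts stay on disk, you can also inspect a run without the dashboard. The sketch below finds the most recently modified run directory for a scenario; latest_run_dir and the scenario name my-scenario are hypothetical, and it assumes each run is a direct subdirectory of runs/<scenario>/. The contents of score.json are printed as-is, since this page does not describe its schema.

```shell
# Print the most recently modified run directory for a scenario.
# Assumption: each run is a direct subdirectory of <runs-root>/<scenario>/.
# Usage: latest_run_dir <scenario> [runs-root]
latest_run_dir() {
  scenario_dir="${2:-runs}/$1"
  newest=""
  for dir in "$scenario_dir"/*/; do
    [ -d "$dir" ] || continue          # glob matched nothing
    if [ -z "$newest" ] || [ "$dir" -nt "$newest" ]; then
      newest="$dir"
    fi
  done
  if [ -n "$newest" ]; then
    echo "${newest%/}"                 # drop the trailing slash
  fi
}

# Show the score file for the latest run of a hypothetical scenario.
run_dir="$(latest_run_dir my-scenario)"
if [ -n "$run_dir" ] && [ -f "$run_dir/score.json" ]; then
  cat "$run_dir/score.json"
fi
```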
Run the twin locally
pome run defaults to the hosted twin on app.pome.sh. To run the twin yourself, pull the image and start it with Docker.
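The pull-and-start sequence might look like the sketch below. The image reference is not spelled out on this page, so it is read from POME_TWIN_IMAGE, a hypothetical variable you would set yourself; start_local_twin and run are likewise illustrative helpers. The docker login step is included because the image is hosted on GHCR, and no port or environment flags are shown because they are not documented here.

```shell
# Start-up sketch for a local twin.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Assumption: POME_TWIN_IMAGE is a hypothetical variable holding the
# GHCR image reference, which this page does not state.
start_local_twin() {
  image="${POME_TWIN_IMAGE:?set POME_TWIN_IMAGE to the twin image reference}"
  run docker login ghcr.io            # authenticate against GHCR
  run docker pull "$image"
  run docker run -d --rm "$image"     # add any port or env flags the twin needs
}

# Example (prints the commands only):
#   DRY_RUN=1 POME_TWIN_IMAGE=<image> start_local_twin
```

Running it once with DRY_RUN=1 is a cheap way to review the exact commands before they touch Docker.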
Then set POME_LOCAL=1 when you run pome so that it targets the local twin instead of the hosted one.
Until the Stage 1 public flip, the GHCR image is private. Run docker login ghcr.io first.
Next
- /setup: Wire your agent with /pome-setup.
- /test-with-pome: Run scenarios with /pome-test.
- GitHub twin: Twin reference and bundled GitHub scenarios.
- CLI reference: Command reference for pome run, pome session, flags, and exit codes.
- Dashboard: Where runs, agents, and twins live, and how the judge surfaces fixes.