Hosted mode
pome run --hosted executes a scenario against the hosted control-plane at
https://pome-cloud-control-plane.fly.dev. The CLI:
- POSTs /v1/sessions to spawn a real twin pod on Fly.
- Runs your agent against the spawned twin via the returned twin_url and agent_token.
- Scores the run locally using the deterministic evaluator.
- POSTs the score + trace blobs to /v1/sessions/{id}/result.
- Prints a dashboard URL for the persisted run.
agent_stdout.
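The request side of the CLI flow listed above can be sketched in Python. This is an illustration under stated assumptions, not the CLI's actual implementation: the endpoint paths, the X-API-KEY header, and the scenario_source field come from this page, while the helper names, the Content-Type header, and the shape of the result payload are hypothetical.

```python
import json
from urllib.request import Request

BASE_URL = "https://pome-cloud-control-plane.fly.dev"


def _post(path: str, api_key: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST to the control-plane."""
    return Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def session_request(api_key: str, scenario_source: str) -> Request:
    # Ask the control-plane to spawn a twin for this scenario.
    return _post("/v1/sessions", api_key, {"scenario_source": scenario_source})


def result_request(api_key: str, session_id: str, result: dict) -> Request:
    # Upload the locally computed score and trace (payload shape is assumed).
    return _post(f"/v1/sessions/{session_id}/result", api_key, result)
```

Passing either object to urllib.request.urlopen would send it; the sketch stops short of that so it can be inspected without network access.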
Quickstart
Sign in through Clerk with pome login and let the control-plane mint a team API key.
Auth resolution
The CLI looks for credentials in this order:
- POME_API_KEY env var (CI / one-off / direnv).
- ~/.pome/credentials.json written by pome login (persistent).
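The lookup order can be pictured with a small Python sketch. It is a model of the documented precedence, not the CLI's code: the function name is hypothetical, and so is the api_key field, since the layout of credentials.json is not described here.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    credentials_path: Optional[Path] = None,
) -> Optional[str]:
    """Return the first credential found, or None if neither source has one."""
    env = os.environ if env is None else env
    if credentials_path is None:
        credentials_path = Path.home() / ".pome" / "credentials.json"

    # 1. The environment variable wins (CI, one-off runs, direnv).
    key = env.get("POME_API_KEY")
    if key:
        return key

    # 2. Fall back to the file written by pome login.
    try:
        data = json.loads(credentials_path.read_text())
    except (OSError, json.JSONDecodeError):
        return None

    # "api_key" is a hypothetical field name; the real file may differ.
    return data.get("api_key") if isinstance(data, dict) else None
```

Because the environment variable is checked first, a CI job can override a developer's stored login without touching the credentials file.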
To target a different control-plane, set POME_API_URL=https://staging...
(env) or pass --api-url (flag). pome login also accepts
--dashboard-url https://pome.sh while the production domain is still
settling.
V1 scope notes
- Only [D] (deterministic) criteria are scored. [P] (LLM-judge) criteria are silently skipped.
- The twin starts from its default seed (acme/api, three labels, issue #1). Custom seed override is not yet supported in hosted mode: scenario ## Seed State blocks are honored in self-host but ignored in hosted (the CLI sends scenario_source only, not seed). Coming in V1.1.
- Single twin (github) only.
- pome login uses Clerk for human auth, then stores the same pme_... runtime credential that hosted mode already sends as X-API-KEY.
Exit codes
| Code | Meaning |
|---|---|
| 0 | All scenarios passed passThreshold. |
| 1 | At least one scenario scored below threshold. |
| 2 | Twin or orchestrator error (network, 5xx, twin spawn failed). |
| 3 | Auth error (401/403). Re-mint your key. |
| 4 | Quota exceeded. Upgrade plan or wait for reset. |
| 5 | Usage error (bad flags, missing files). |
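A wrapper script or CI step might turn these codes into names and decide what to do next. The Python sketch below mirrors the table; the enum and helper names are hypothetical, and the retry policy (retry only on code 2) is an assumption, not something the CLI prescribes.

```python
from enum import IntEnum


class PomeExit(IntEnum):
    """Exit codes from the table above."""
    PASSED = 0
    BELOW_THRESHOLD = 1
    TWIN_OR_ORCHESTRATOR_ERROR = 2
    AUTH_ERROR = 3
    QUOTA_EXCEEDED = 4
    USAGE_ERROR = 5


def is_retryable(code: int) -> bool:
    """Assumed policy: only infrastructure failures merit an automatic retry."""
    return code == PomeExit.TWIN_OR_ORCHESTRATOR_ERROR


def describe(code: int) -> str:
    """Human-readable label for a return code, tolerating unknown values."""
    try:
        return PomeExit(code).name.replace("_", " ").lower()
    except ValueError:
        return f"unknown exit code {code}"
```

Feeding it the return code of a pome run --hosted invocation keeps a failing score (1) distinct from an outage (2) in CI logs.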
Differences from self-host
| Aspect | Self-host | Hosted |
|---|---|---|
| Twin location | In-process (Hono) | Fly Machine spawned per session |
| Auth | Random per-run JWT | HS256 JWT signed by control-plane |
| Seed | Scenario’s ## Seed State | Twin’s defaultSeedState() |
| Result storage | runs/<slug>/<run_id>/ only | + cloud row + trace blobs in Supabase Storage |
| Cleanup | Server stops at end of run | DELETE /v1/sessions/ on completion |
Both modes hand the twin to your agent through the same environment variables
(POME_GITHUB_REST_URL, POME_AUTH_TOKEN, etc.), so your agent code does
not need to know which mode it’s running in.
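An agent that reads POME_GITHUB_REST_URL and POME_AUTH_TOKEN from the environment stays mode-agnostic. The following is a minimal sketch: the helper name is hypothetical, and sending the token as a Bearer credential is an assumption about how the twin expects it.

```python
import os
from typing import Dict, Mapping, Tuple


def twin_request(
    path: str, env: Mapping[str, str] = os.environ
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a call to the GitHub twin's REST API."""
    base = env["POME_GITHUB_REST_URL"].rstrip("/")
    token = env["POME_AUTH_TOKEN"]
    # Assumption: the twin accepts the token as a Bearer credential.
    headers = {"Authorization": f"Bearer {token}"}
    return f"{base}/{path.lstrip('/')}", headers
```

The same call works whether the URL points at an in-process twin or a Fly Machine, since only the environment differs between the two.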