You can add Pome to GitHub Actions with one step: the action installs the CLI, boots a twin, runs your scenarios, scores the results, and uploads trace artifacts to the workflow run. Your pipeline gets a clear exit code: zero means all scenarios passed, non-zero means something failed or errored.
Using pome-sh/run-scenarios-action
The simplest usage points the action at a single scenario file.
Full workflow example
This workflow runs agent evaluation on every push and pull request. It passes your LLM provider key to the agent subprocess via `env` and your Pome API key (for hosted mode) via the `hosted-api-key` action input.
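A workflow matching that description might look like the sketch below. Only the action name and the two inputs (`scenario-path`, `hosted-api-key`) come from this page; the workflow file name, the `@v1` ref, the `scenarios/` directory, and the choice of `ANTHROPIC_API_KEY` as the provider key are assumptions made for illustration.

```yaml
# .github/workflows/pome.yml (hypothetical file name)
name: Agent evaluation

on:
  push:
  pull_request:

jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Pome scenarios
        # The @v1 ref is an assumption; pin to a tag or commit the action actually publishes.
        uses: pome-sh/run-scenarios-action@v1
        with:
          # Hypothetical directory of scenario .md files
          scenario-path: scenarios/
          # Hosted mode; omit this input to use a local twin instead
          hosted-api-key: ${{ secrets.POME_API_KEY }}
        env:
          # LLM provider key, passed to the agent subprocess through the environment
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```

Because the step fails on any non-zero exit, no extra gating logic is needed for the job to block a merge.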
Action inputs
| Input | Required | Description |
|---|---|---|
| `scenario-path` | Yes | Path to a scenario `.md` file or a directory of scenario files |
| `hosted-api-key` | No | API key for the hosted Pome control plane. Omit to use a local twin instead |
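As a sketch of the local-twin variant, the step below sets only the required input and leaves out `hosted-api-key`. The `@v1` ref and the scenario file name are hypothetical, and the runner is assumed to have Docker available, as GitHub-hosted Ubuntu runners do.

```yaml
steps:
  - uses: actions/checkout@v4

  - name: Run one scenario against a local twin
    # The @v1 ref is an assumption, not a documented version.
    uses: pome-sh/run-scenarios-action@v1
    with:
      # Hypothetical scenario file; a directory path is also accepted
      scenario-path: scenarios/refund-flow.md
    env:
      OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```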
What the action does
Installs the pome CLI
The action downloads and installs the `pome` binary appropriate for the runner OS.
Boots a twin
If `hosted-api-key` is provided, the action provisions a hosted twin session via the Pome API. Otherwise it starts a local twin on the runner using Docker.
Runs scenarios and scores them
The action calls `pome run` for each scenario file. For [P] criteria, the CLI runs the LLM judge locally using your provider key; the Pome cloud never sees your key or the agent's output.
Setting secrets
Add the following secrets to your repository under Settings → Secrets and variables → Actions:
- `POME_API_KEY`: your Pome team API key (required for hosted mode; omit for local twin)
- Your LLM provider key, one of:
  - `ANTHROPIC_API_KEY` for Claude models
  - `OPENAI_API_KEY` for OpenAI models
  - `POME_LLM_API_KEY` for any other OpenAI-compatible endpoint (set `POME_LLM_BASE_URL` and `POME_LLM_MODEL` alongside it)
If you use a provider other than Anthropic or OpenAI, set all three `POME_LLM_*` variables.
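For an OpenAI-compatible endpoint, the three `POME_LLM_*` variables could be set together on the action step, roughly as follows. The endpoint URL, model name, `@v1` ref, and scenario path are placeholders, not values taken from the Pome documentation.

```yaml
- name: Run Pome scenarios with a custom provider
  uses: pome-sh/run-scenarios-action@v1   # ref is an assumption
  with:
    scenario-path: scenarios/             # hypothetical directory
  env:
    POME_LLM_API_KEY: ${{ secrets.POME_LLM_API_KEY }}
    # Hypothetical endpoint and model; substitute your provider's values
    POME_LLM_BASE_URL: https://llm.example.com/v1
    POME_LLM_MODEL: example-model
```

Only the key needs to be a secret; the base URL and model name are ordinary configuration and can sit in the workflow file.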
Exit codes from `pome run`: 0 means all scenarios met the pass threshold, 1 means at least one scenario scored below the threshold, and 2 or higher indicates an infrastructure error (twin boot failure, auth error, quota exceeded). Your CI step will fail automatically on any non-zero exit.
Running locally before pushing
Before committing a new scenario, run it locally to check that it passes. Run results are saved to `runs/` so you can inspect them with `pome inspect latest` before pushing.
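If a workflow calls the CLI directly rather than through the action, the exit codes listed earlier can be turned into distinct annotations, so a scoring failure is easy to tell apart from an infrastructure error. The step below is a sketch under assumptions: `pome` is already on the runner's `PATH`, `pome run` accepts a scenario path as its argument, and the file name is hypothetical.

```yaml
- name: Run a scenario and classify the result
  shell: bash
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  run: |
    # Disable exit-on-error so the exit code can be inspected.
    set +e
    pome run scenarios/refund-flow.md   # hypothetical path; argument form is assumed
    code=$?
    set -e

    if [ "$code" -eq 0 ]; then
      echo "All scenarios met the pass threshold."
    elif [ "$code" -eq 1 ]; then
      echo "::error::At least one scenario scored below the threshold."
    else
      echo "::error::Infrastructure error (exit code $code)."
    fi

    # Propagate the original exit code so the step still fails on non-zero.
    exit "$code"
```

The `::error::` lines use GitHub's workflow command syntax, so each message appears as an annotation on the run summary.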