A scenario is a markdown file that describes a starting state, hands a prompt to your agent, and scores the final state of the twin against an expected outcome. Pome ships a curated catalog per twin so you don’t write them from scratch. This page lists every runnable GitHub scenario shipped with the CLI. The catalog is the single source of truth for pome scenarios and /pome-test.

Browse and copy

List the bundled GitHub scenarios:
pome scenarios github
Drop them into your project so you can edit and commit them:
pome scenarios github --copy
Copies into ./scenarios/. Add --dest <dir> to copy elsewhere, or --force to overwrite existing files.
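
If you copy scenarios into a non-default directory often, a small wrapper can keep the flags in one place. The following is a minimal sketch, not part of Pome: the function name copy_github_scenarios and its "force" argument are hypothetical, and it assumes --dest and --force can be combined with --copy in a single call, as the description above suggests.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Pome): copy the bundled GitHub
# scenarios into a directory of your choice.
# Usage: copy_github_scenarios [dir] [force]
copy_github_scenarios() {
  dest="${1:-./scenarios}"   # same default location the CLI uses
  if [ "${2:-}" = "force" ]; then
    # Assumption: --force may be combined with --dest to overwrite files.
    pome scenarios github --copy --dest "$dest" --force
  else
    pome scenarios github --copy --dest "$dest"
  fi
}
```

For example, copy_github_scenarios tests/scenarios force would overwrite any previously copied files under tests/scenarios.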

The catalog

| # | Scenario | What it exercises |
| --- | --- | --- |
| 01 | Bug — happy path (01-bug-happy-path.md) | Triage a clear bug report: apply the right label, assign the owner. The simplest end-to-end pass. |
| 02 | Missing label recovery (02-missing-label.md) | The expected label doesn’t exist yet. Agent must create it, retry the labeling call, and assign the right collaborator. |
| 03 | Already triaged (03-already-triaged.md) | Issue is already labeled and assigned. Agent should recognize that and not pile on duplicate labels or comments. |
| 04 | Judge sees context (04-judge-context.md) | Exercises the LLM judge: the existing label is technically present, but is it contextually appropriate? Tests judgment, not just state. |
| 05 | GitHub identity spoof (05-github-identity-spoof.md) | A PR is opened by an unauthorized author. Agent must refuse to merge it. |

A sixth file, 00-default-seed.md, ships alongside the scenarios. It is a reference seed used by hosted runs that don’t define their own, not a runnable scenario, so both pome scenarios github and pome scenarios github --copy skip it.

Run one

After copying:
pome run scenarios/01-bug-happy-path.md \
  --agent "<your agent command>"
Or let /pome-test pick the right ones automatically — see Skills.
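
To run every copied scenario instead of a single file, a loop over the directory is one option. This is a rough sketch under stated assumptions: run_all_scenarios is a hypothetical helper, and it assumes pome run exits with a non-zero status when a scenario fails, which this page does not confirm.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Pome): run every scenario file in a
# directory against one agent command and count the failures.
# Usage: run_all_scenarios <dir> <agent command>
run_all_scenarios() {
  dir="$1"
  agent="$2"
  failed=0
  for scenario in "$dir"/*.md; do
    [ -e "$scenario" ] || continue   # glob matched nothing: no .md files
    echo "==> $scenario"
    # Assumption: a failed scenario makes `pome run` exit non-zero.
    pome run "$scenario" --agent "$agent" || failed=$((failed + 1))
  done
  echo "failed: $failed"
  [ "$failed" -eq 0 ]                # non-zero return if anything failed
}
```

Calling run_all_scenarios scenarios "<your agent command>" after pome scenarios github --copy would then walk the copied files in name order.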

Next

GitHub twin

The twin these scenarios run against — what it covers, what it doesn’t.

Skills

/pome-test picks scenarios from this catalog based on what services your agent uses.