The Full Site Evaluation Loop
A comprehensive end-to-end testing loop that inventories every user-facing surface, finds bugs holistically, and verifies fixes across the full product.
Prompts capture what to ask. Playbooks capture repeatable methods. Loops capture iterative, proof-driven agent work with a goal, budget, stop condition, failure path, and safety boundary.
Run before major releases, after significant refactors, or on a scheduled cadence for critical sites.
Produce a complete evaluation report with all verified bugs fixed and regression coverage in place.
Stop when the full inventory passes with no new bugs, or when blocked by approval, access, or environment issues.
Action / Observe / Evaluate
Inventory surfaces → test realistically → log bugs with evidence → group by root cause → fix holistically → add regression tests → rerun full inventory.
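The "group by root cause" step can be illustrated with a minimal Python sketch. The `Bug` fields, the root-cause labels, and the evidence paths are hypothetical, chosen only for illustration. The loop itself does not prescribe a data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Bug:
    surface: str      # page, form, or endpoint where the bug was observed
    summary: str      # one-line description
    evidence: str     # path to a screenshot or test log (hypothetical layout)
    root_cause: str   # label assigned during triage (hypothetical)


def group_by_root_cause(bugs):
    """Map each root-cause label to the bugs that share it, in log order."""
    groups = defaultdict(list)
    for bug in bugs:
        groups[bug.root_cause].append(bug)
    return dict(groups)


# Illustrative data only; not taken from a real evaluation.
bugs = [
    Bug("/checkout", "Total not updated after coupon", "shots/checkout-1.png", "stale-cart-cache"),
    Bug("/cart", "Removed item still listed", "shots/cart-3.png", "stale-cart-cache"),
    Bug("/signup", "Submit button stays disabled", "shots/signup-2.png", "form-validation"),
]
groups = group_by_root_cause(bugs)
for cause, members in groups.items():
    print(f"{cause}: {len(members)} bug(s) on {sorted(b.surface for b in members)}")
```

Grouping before fixing means one fix per root cause can be planned and then checked against every surface in its group.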
Evidence Gate
The full inventory rerun must pass cleanly. Screenshots, test results, and bug reproduction evidence are required.
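One way to make the gate mechanical is a check that lists every missing piece of evidence. The sketch below is an assumption about how results might be recorded. `SurfaceResult`, `BugRecord`, and their fields are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class SurfaceResult:
    surface: str
    passed: bool                                      # outcome in the full rerun
    screenshots: list = field(default_factory=list)   # paths to captured images


@dataclass
class BugRecord:
    bug_id: str
    reproduction: str = ""                            # steps or log that reproduce the bug


def evidence_gate(results, bugs):
    """Return the reasons the gate fails; an empty list means the gate passes."""
    problems = []
    for result in results:
        if not result.passed:
            problems.append(f"{result.surface}: failed in rerun")
        if not result.screenshots:
            problems.append(f"{result.surface}: no screenshot")
    for bug in bugs:
        if not bug.reproduction.strip():
            problems.append(f"{bug.bug_id}: no reproduction evidence")
    return problems


# Illustrative data only.
results = [
    SurfaceResult("/home", True, ["home.png"]),
    SurfaceResult("/cart", True, []),
]
bugs = [BugRecord("BUG-1", "Add item, apply coupon, reload"), BugRecord("BUG-2")]
problems = evidence_gate(results, bugs)
print(problems)
```

Returning reasons rather than a bare boolean keeps the report explicit about what is still missing.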
Memory Contract
Read prior evaluation reports and bug logs. Write current findings, fixes, and verification results.
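The read and write halves of this contract could be sketched as two small functions over a directory of JSON files. The file naming scheme (`report-<run_id>.json`) and the payload keys are assumptions made for this example; the contract does not name a storage format.

```python
import json
import tempfile
from pathlib import Path


def read_prior_reports(memory_dir):
    """Load earlier reports, ordered by file name (oldest run id first)."""
    folder = Path(memory_dir)
    if not folder.is_dir():
        return []
    return [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(folder.glob("report-*.json"))
    ]


def write_report(memory_dir, run_id, findings, fixes, verification):
    """Persist the current run's findings, fixes, and verification result."""
    folder = Path(memory_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"report-{run_id}.json"
    payload = {
        "run_id": run_id,
        "findings": findings,
        "fixes": fixes,
        "verification": verification,
    }
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


# Demonstration in a throwaway directory, with illustrative data only.
with tempfile.TemporaryDirectory() as tmp:
    write_report(tmp, "001", ["BUG-1"], [], "FAIL")
    write_report(tmp, "002", [], ["BUG-1"], "PASS")
    history = read_prior_reports(tmp)
print([report["run_id"] for report in history])
```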
Not specified
Three full iterations or until a clean pass, whichever comes first.
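A minimal sketch of how this budget and the stop condition might combine. It assumes one iteration means a round of fixes followed by a full rerun, so every fix is verified before the loop ends. `evaluate`, `fix`, and `Blocked` are hypothetical stand-ins for the real work.

```python
class Blocked(Exception):
    """Raised when approval, access, or the environment blocks the work."""


def run_loop(evaluate, fix, max_iterations=3):
    """Fix and rerun until the inventory is clean or the budget is spent.

    evaluate() returns the list of open bugs from a full inventory pass.
    fix(bugs) attempts fixes. Either callable may raise Blocked.
    """
    iterations = 0
    try:
        open_bugs = evaluate()            # initial full inventory pass
        while open_bugs and iterations < max_iterations:
            fix(open_bugs)
            open_bugs = evaluate()        # full rerun closes the iteration
            iterations += 1
    except Blocked as exc:
        return {"status": "blocked", "iterations": iterations, "reason": str(exc)}
    status = "clean" if not open_bugs else "budget_exhausted"
    return {"status": status, "iterations": iterations, "open_bugs": list(open_bugs)}


# Scripted evaluation results for illustration: two bugs, then one, then none.
scripted = iter([["b1", "b2"], ["b2"], []])
outcome = run_loop(lambda: next(scripted), lambda bugs: None)
print(outcome)  # {'status': 'clean', 'iterations': 2, 'open_bugs': []}
```

Under this reading, a run that exhausts the budget still reports which bugs remain open, which gives the operator something concrete to act on.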
No-Progress And Unsafe States
Escalate to operator when blocked by access, when bugs exceed fix capacity, or when root cause analysis stalls.
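The three escalation triggers could be checked as below. This is a sketch under stated assumptions: "fix capacity" is modelled as a simple bug count, and root cause analysis is treated as stalled when the number of bugs without an assigned root cause has not dropped between the last two iterations. Both definitions are illustrative, not part of the loop.

```python
def escalation_reasons(access_blocked, open_bug_count, fix_capacity, unexplained_counts):
    """Return the reasons to escalate to the operator; empty means keep going.

    unexplained_counts holds, per iteration, how many bugs still lack a
    root cause (hypothetical metric for detecting a stall).
    """
    reasons = []
    if access_blocked:
        reasons.append("blocked by access")
    if open_bug_count > fix_capacity:
        reasons.append("bugs exceed fix capacity")
    stalled = (
        len(unexplained_counts) >= 2
        and unexplained_counts[-1] > 0
        and unexplained_counts[-1] >= unexplained_counts[-2]
    )
    if stalled:
        reasons.append("root cause analysis stalled")
    return reasons


# Illustrative numbers only.
print(escalation_reasons(False, 12, 10, [5, 4, 4]))
```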
Boundary Conditions
Never test against production with destructive actions. Never expose credentials, private data, or session tokens in reports.
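Both boundaries can be partly enforced in code. The sketch below assumes a hypothetical production host list and a small set of secret patterns. Real reports would likely need a broader pattern set and a manual review on top.

```python
import re
from urllib.parse import urlparse

# Hypothetical host list; a real setup would load this from configuration.
PRODUCTION_HOSTS = {"www.example.com"}

# Illustrative patterns only; they do not cover every credential format.
SECRET_PATTERNS = [
    re.compile(r"(?i)(authorization:\s*bearer\s+)\S+"),
    re.compile(r"(?i)((?:password|token|session(?:id)?|api_key)=)[^\s&;]+"),
]


def redact(text):
    """Replace secret values with a placeholder, keeping the key for context."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(r"\1[REDACTED]", text)
    return text


def allow_action(url, destructive):
    """Refuse destructive actions whose target host is in the production list."""
    host = urlparse(url).hostname
    return not (destructive and host in PRODUCTION_HOSTS)


print(redact("GET /cart?token=abc123&page=2"))
print(allow_action("https://www.example.com/account/delete", destructive=True))
```

Running `redact` over every log line and report field before it is written keeps the check in one place rather than relying on each author to remember it.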
Expected Public Result
Evaluation complete: 47 surfaces tested, 12 bugs found (3 root causes), 9 fixed, 3 deferred with rationale. Regression suite: 23 tests. Verification: PASS.
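The counts in this result should reconcile: 9 fixed plus 3 deferred accounts for all 12 bugs found. A small formatter can enforce that before the line is published. The `Summary` fields and the `FAIL` verdict are assumptions for this sketch; only the sentence shape comes from the example above.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    surfaces: int
    bugs_found: int
    root_causes: int
    fixed: int
    deferred: int
    regression_tests: int
    rerun_passed: bool


def format_result(s):
    """Build the public result line, refusing counts that do not reconcile."""
    if s.fixed + s.deferred != s.bugs_found:
        raise ValueError("fixed + deferred must equal bugs found")
    if s.root_causes > s.bugs_found:
        raise ValueError("root causes cannot outnumber bugs")
    verdict = "PASS" if s.rerun_passed else "FAIL"  # FAIL wording is assumed
    return (
        f"Evaluation complete: {s.surfaces} surfaces tested, "
        f"{s.bugs_found} bugs found ({s.root_causes} root causes), "
        f"{s.fixed} fixed, {s.deferred} deferred with rationale. "
        f"Regression suite: {s.regression_tests} tests. Verification: {verdict}."
    )


line = format_result(Summary(47, 12, 3, 9, 3, 23, True))
print(line)
```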
Loop Method
Published by FrankieBugs

