Shipping
Jun 22 · A11yReady · Global Assembly layer (Phase 22) online: a deterministic reconciliation pass that…
Jun 22 · UCR · The testing platform now sees the live UCR_app codebase as cloud-resident, not…
Jun 22 · UCR · The testing platform's self-healing loop reaches end-to-end against legacy UCR live…
Jun 22 · Testing Platform · ADR 0020 Phases 1+2+3 all shipped in one afternoon: cloud-resident UCR_app source spine…
Jun 22 · Testing Platform · ADR 0020 proposed: cloud-resident UCR_app source + Azure DevOps source-change webhook +…
Jun 22 · Testing Platform · First end-to-end MEDIC smoke vs UCR QAT: the pipeline works from GUARDIAN failure to…
Jun 22 · Testing Platform · MEDIC v0 Phase 4 shipped: apply patch as draft PR (manual approve, automated apply,…
Jun 22 · Testing Platform · MEDIC v0 Phase 3 backend route shipped: the round trip is reachable from the UI. POST…
Jun 22 · Testing Platform · MEDIC v0 Phase 3 persistence layer shipped: selector_patch lands in operator.proposals.…
Jun 22 · Testing Platform · MEDIC v0 Phase 2 LLM call layer shipped: the 7th agent gets brains.…

AI-Powered Testing Platform

6-Agent System for Legacy Modernization

Vertex AI · Claude Sonnet 4.6 · Playwright · FastAPI · Postgres + pgvector · React · Vite · Tailwind · Cloud Run · Cloud SQL · Identity-Aware Proxy · Workload Identity Federation
5/6
Agents Closed-Loop
9,420
Indexed Symbols
148
Specs / Run

Platform Screenshots

multnomah-county-accessibility.app
Platform Home — Operator Dashboard

Operator's chief-of-staff dashboard: 1 active tenant, 4 of 5 agents closed-loop, 223 items needing review, plus the OPERATOR agent's daily briefing on top findings and what to triage first

SME Review Queue — Knowledge Confirm

SME confirmation queue: 'Confirm what we extracted from SME conversations' — 45 confirmed sessions in pool, audit sample with per-session SME attribution (Rikki, Margretta) and confidence-banded promotion path

Knowledge Session Detail — UCR Medical Alert Workflow

Confirmed SME knowledge session expanded: full workflow context (UCR Medical Alert Service Request lifecycle), attendees (Rikki Thunstrom as primary SME, Loren as AI lead), and 13 extracted key facts that downstream TESTGEN will turn into Playwright specs

UCR Legacy Tenant — Tenant-Scoped View

Per-tenant scoping: UCR Legacy view with Overview, Workflows, Work items, Knowledge, SME review, CODEX explorer, Code drift, Open questions, TESTGEN, and GUARDIAN — every agent surfaces its tenant-relevant state in one place

Overview

A 6-agent system that keeps legacy systems safe to change:

CODEX (codebase intelligence, 9,420 indexed symbols, 8 drift detectors) reads the existing code.
KNOWLEDGE (typed extraction across 704 sessions, with plain-language search) captures tribal knowledge from retiring SMEs before they walk out the door.
TESTGEN (Playwright generation at v2, plus an auto-confirm policy moving 146 drafts/day) turns that knowledge into executable specs.
GUARDIAN (regression watch with a queued-runner pattern) runs the specs and flags drift.
OPERATOR (the orchestrator, Mounika's chief of staff) synthesizes cross-agent state into a daily briefing with actionable proposals.
INTEGRATION (Phase F) is designed but not yet built.

Privacy guardrails are verified on real data: a two-stage PHI classifier (heuristic short-circuit, then an LLM second pass) held 321 files for human review. After-the-fact audits found 11 leaks that the heuristic missed and the LLM caught; they were purged, and the routing config was updated so they SKIP on re-ingest.

The architecture is multi-tenant from day one: RLS on every table keyed by tenant_id, per-agent service accounts, IAP gating, and Workload Identity Federation (no service-account JSON keys). UCR is tenant 1; ACHP onboards in Phase G.
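
The two-stage PHI classifier mentioned in the overview could look roughly like the sketch below. Everything named here is an assumption for illustration: the regex patterns, the `route` function, the `Verdict` type, the `skip_paths` set standing in for the routing config, and `fake_llm` standing in for the real model call. The platform's actual rules and prompts are not shown in the source.

```python
import re
from dataclasses import dataclass
from typing import Callable, Set

# Hypothetical heuristic patterns; the platform's real rules are not published.
HEURISTIC_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                       # SSN-shaped number
    re.compile(r"\bMRN[:\s]*\d{6,}\b", re.I),                   # medical record number
    re.compile(r"\bDOB[:\s]*\d{1,2}/\d{1,2}/\d{2,4}\b", re.I),  # date of birth
]


@dataclass
class Verdict:
    action: str  # "SKIP", "HOLD" (human review), or "INGEST"
    stage: str   # which stage decided: "routing", "heuristic", or "llm"


def route(path: str, text: str,
          llm_flags_phi: Callable[[str], bool],
          skip_paths: Set[str]) -> Verdict:
    # Files already purged after an audit are skipped on re-ingest.
    if path in skip_paths:
        return Verdict("SKIP", "routing")
    # Stage 1: cheap heuristic. A hit short-circuits, so the LLM is never called.
    if any(p.search(text) for p in HEURISTIC_PATTERNS):
        return Verdict("HOLD", "heuristic")
    # Stage 2: LLM second pass, meant to catch what the patterns miss.
    if llm_flags_phi(text):
        return Verdict("HOLD", "llm")
    return Verdict("INGEST", "llm")


def fake_llm(text: str) -> bool:
    # Stand-in for the real model call (assumption, for illustration only).
    return "diagnosed with" in text.lower()
```

Running the heuristic first keeps the expensive model call off any file the patterns already flag, while the second pass covers free-text PHI that no pattern anticipates.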

Impact & Results

5/6
Agents Closed-Loop
CODEX · KNOWLEDGE · TESTGEN · GUARDIAN · OPERATOR shipped
9,420
Indexed Symbols
CODEX codebase intelligence with 8 drift detectors
146 / day
Auto-Confirm Rate
drafts moved out of SME backlog (38% inbox clearance)
148 specs
Latest GUARDIAN Run
144 passed in 22 minutes against UCR QAT
11
PHI Leak Catches
missed by heuristic, caught by LLM second-pass, purged
1 → N
Tenants
UCR is tenant 1; ACHP onboards in Phase G
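
The auto-confirm figure above comes from a policy the page describes only as pattern-match based and audited. A minimal sketch of that idea follows; the pattern names, the draft summaries, and the `Inbox` and `AuditEntry` types are hypothetical, not the platform's actual rules or schema.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List

# Hypothetical "safe to auto-confirm" patterns (assumption, for illustration).
SAFE_PATTERNS = {
    "nav_only": re.compile(r"^navigate to .+ and verify the page loads$", re.I),
    "label_check": re.compile(r"^verify the .+ label reads .+$", re.I),
}


@dataclass
class AuditEntry:
    draft_id: str
    rule: str         # which pattern justified the auto-confirm
    decided_at: str   # UTC timestamp, ISO 8601


@dataclass
class Inbox:
    pending: Dict[str, str]  # draft_id -> one-line draft summary
    confirmed: List[str] = field(default_factory=list)
    audit_log: List[AuditEntry] = field(default_factory=list)


def auto_confirm(inbox: Inbox) -> float:
    """Confirm drafts that match a safe pattern; return the fraction cleared."""
    total = len(inbox.pending)
    cleared = 0
    for draft_id, summary in list(inbox.pending.items()):
        for rule, pattern in SAFE_PATTERNS.items():
            if pattern.match(summary):
                inbox.confirmed.append(draft_id)
                # Every automatic decision leaves an audit record.
                inbox.audit_log.append(
                    AuditEntry(draft_id, rule, datetime.now(timezone.utc).isoformat())
                )
                del inbox.pending[draft_id]
                cleared += 1
                break
    return cleared / total if total else 0.0
```

Drafts that match no pattern stay in `pending` for SME review, so the policy can only shrink the backlog, never approve something outside its allow-list.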

Key Features

6 specialized agents: CODEX (codebase intelligence + 8 drift detectors), KNOWLEDGE (typed SME extraction), TESTGEN (Playwright generation), GUARDIAN (regression watch), OPERATOR (orchestrator), INTEGRATION (designed)
TESTGEN v2 prompt iteration after live QAT runs — 208 drafts at 98.1% compile-clean under sequential-workflow + UCR hover-menu nav discipline
Auto-confirm policy moves 146 drafts/day out of SME review backlog via pattern-match — 38% inbox clearance without SME involvement, fully audited
GUARDIAN queued-runner pattern — browser "Trigger run" button writes to Postgres work queue; polling runner claims jobs and executes Playwright (~35s end-to-end)
OPERATOR deterministic rule engine reads cross-agent state and produces a prioritized daily briefing plus Approve/Reject proposals for Mounika; this closes the "who orchestrates the orchestrator?" gap that breaks 5-specialist fleets at 20 tenants
Two-stage PHI privacy guardrails (heuristic short-circuit + LLM second-pass) validated on real data: 321 files held for review; 11 leaks the heuristic missed were caught and purged, and routing was updated
Multi-tenant from day one — RLS keyed by tenant_id, per-agent service accounts, IAP gating, Workload Identity Federation (no service-account JSON keys)
Stakeholder hub with open-questions surface — Rikki, Margretta, Antonio, Michelle answer tribal-knowledge questions async; 50 of 77 resolved
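
The GUARDIAN queued-runner pattern in the feature list (a button writes a job to a work queue, and a polling runner claims and executes it) can be sketched as follows. This is a simulation under stated assumptions: it uses an in-memory SQLite database in place of Postgres so it runs anywhere, and the `run_queue` table, its columns, and all function names are hypothetical. On Postgres, the claim step is commonly written with `SELECT ... FOR UPDATE SKIP LOCKED` instead of the conditional `UPDATE` used here.

```python
import sqlite3
from typing import Callable, Optional


def make_queue() -> sqlite3.Connection:
    # In-memory stand-in for the Postgres work queue (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE run_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " tenant_id TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'queued',"
        " claimed_by TEXT)"
    )
    return conn


def trigger_run(conn: sqlite3.Connection, tenant_id: str) -> int:
    # What a "Trigger run" button would do: enqueue a job and return at once.
    cur = conn.execute("INSERT INTO run_queue (tenant_id) VALUES (?)", (tenant_id,))
    conn.commit()
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection, runner_id: str) -> Optional[int]:
    row = conn.execute(
        "SELECT id FROM run_queue WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    # Conditional update: the claim succeeds only if the job is still queued.
    cur = conn.execute(
        "UPDATE run_queue SET status = 'running', claimed_by = ?"
        " WHERE id = ? AND status = 'queued'",
        (runner_id, row[0]),
    )
    conn.commit()
    return row[0] if cur.rowcount == 1 else None


def poll_once(conn: sqlite3.Connection, runner_id: str,
              run_specs: Callable[[int], bool]) -> Optional[int]:
    # One polling cycle: claim a job, run the specs, record the outcome.
    job_id = claim_next(conn, runner_id)
    if job_id is None:
        return None
    status = "passed" if run_specs(job_id) else "failed"
    conn.execute("UPDATE run_queue SET status = ? WHERE id = ?", (status, job_id))
    conn.commit()
    return job_id
```

Decoupling the button from the runner means the web request returns immediately, and the Playwright process can live wherever browsers are available. In this sketch `run_specs` is a placeholder for that Playwright invocation.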
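
The OPERATOR item above describes a deterministic rule engine that turns cross-agent state into a prioritized briefing. A minimal sketch of that shape is below. The state keys, thresholds, priorities, and proposal wording are all assumptions for illustration; the source does not publish the real rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Proposal:
    priority: int  # lower number = more urgent
    agent: str
    summary: str


State = Dict[str, int]
Rule = Callable[[State], Optional[Proposal]]


def guardian_failures(state: State) -> Optional[Proposal]:
    failed = state.get("guardian_failed_specs", 0)
    if failed > 0:
        return Proposal(1, "GUARDIAN", f"Triage {failed} specs that did not pass")
    return None


def review_backlog(state: State) -> Optional[Proposal]:
    pending = state.get("sme_review_pending", 0)
    if pending > 200:  # hypothetical threshold
        return Proposal(2, "KNOWLEDGE", f"SME review backlog at {pending} items")
    return None


def open_questions(state: State) -> Optional[Proposal]:
    unresolved = state.get("open_questions", 0)
    if unresolved > 0:
        return Proposal(3, "OPERATOR", f"{unresolved} open questions await stakeholders")
    return None


RULES: List[Rule] = [guardian_failures, review_backlog, open_questions]


def daily_briefing(state: State) -> List[Proposal]:
    # No randomness and no model call: the same state always yields the same list.
    proposals = [p for rule in RULES if (p := rule(state)) is not None]
    return sorted(proposals, key=lambda p: (p.priority, p.agent))
```

Because each rule is a pure function of the state snapshot, a briefing can be replayed later to explain why an item was ranked where it was, which suits an Approve/Reject review flow.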
