Audience: engineering, QA, and release/audit reviewers. This documents the formal test harness in tests/ that supersedes the ad-hoc test-api.sh smoke script. It validates normal/typical product operations end to end.
How the harness is organised
The harness separates what an operation is from how it is verified, in three layers:
Procedures
One API operation per file (e.g. createPolicy, registerDevice). Reusable, typed building blocks under src/procedures/.
Cases
Ordered, asserted scenarios composing procedures (e.g. Policy CRUD) under src/cases/*.case.test.ts. Each step is individually reported.
Plans
Grouping of cases by capability with an objective. Catalogued in src/catalogue.ts for traceability.
Pass / fail determination
The harness uses Vitest, giving rigorous, machine-gradeable results. This is a step up from the old script, which printed messages but never aggregated a verdict.
| Mechanism | Behaviour |
|---|---|
| Assertions | Each step asserts HTTP status and response/DB shape with expect. |
| Verdict | A case fails if any assertion throws; the run exits non-zero on any failure (CI-gateable). |
| Isolation | Every case provisions its own organisation and tears it down — no shared state between cases. |
| Reports | npm run test:ci emits reports/junit.xml and reports/results.json for dashboards. |
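The verdict rule in the table above can be illustrated without Vitest. The sketch below is not the harness code: `Step`, `runCase`, `assertEqual`, and the stubbed `createPolicy` response (including the 201 status) are hypothetical, and a hand-rolled check stands in for Vitest's `expect` so the block runs with no dependencies. It shows ordered steps being reported individually and the case failing when any assertion throws.

```typescript
// Illustrative only: in the harness, Vitest's `expect` and test runner play these roles.
interface StepResult { name: string; passed: boolean; error?: string }
interface Step { name: string; run: () => Promise<void> }

// Hand-rolled stand-in for `expect(actual).toBe(expected)`.
function assertEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

// Runs steps in order, records each result, and fails the case if any step throws.
async function runCase(steps: Step[]): Promise<{ passed: boolean; steps: StepResult[] }> {
  const results: StepResult[] = [];
  for (const step of steps) {
    try {
      await step.run();
      results.push({ name: step.name, passed: true });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ name: step.name, passed: false, error: message });
    }
  }
  return { passed: results.every((r) => r.passed), steps: results };
}

// Hypothetical procedure stub: a real procedure would call the API.
async function createPolicy(name: string): Promise<{ status: number; body: { name: string } }> {
  return { status: 201, body: { name } };
}

const policyCase: Step[] = [
  {
    name: 'create policy',
    run: async () => {
      const res = await createPolicy('night-lock');
      assertEqual(res.status, 201, 'HTTP status');
      assertEqual(res.body.name, 'night-lock', 'policy name');
    },
  },
  {
    name: 'deliberately failing step',
    run: async () => {
      const status: number = 200;
      assertEqual(status, 404, 'HTTP status');
    },
  },
];
```

Calling `runCase(policyCase)` resolves to a result whose `passed` flag is false because the second step throws, while the first step is still reported as passed. A CI wrapper could map that flag to the process exit code.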
Type awareness
Tests are validated against the same generated database types the application uses. tests/src/types/database.ts re-exports Database, Tables,
TablesInsert, TablesUpdate from the frontend’s database.types.ts, and the
Supabase clients are typed with Database. REST/RPC procedures
(.from('policies'), .update(...)) are therefore checked column-by-column at
compile time.
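As a rough illustration of the column-by-column checking described above, the sketch below uses a hand-written, heavily simplified stand-in for the generated types. The `policies` columns (`name`, `priority`, `is_active`) and the helper `renameAndReprioritise` are assumptions, and the real generated helper types are more elaborate than these one-line versions.

```typescript
// Simplified stand-in for the generated database.types.ts; column names are assumptions.
type Database = {
  public: {
    Tables: {
      policies: {
        Row: { id: string; name: string; priority: number; is_active: boolean };
        Insert: { id?: string; name: string; priority?: number; is_active?: boolean };
        Update: { id?: string; name?: string; priority?: number; is_active?: boolean };
      };
    };
  };
};

type TableName = keyof Database['public']['Tables'];
type Tables<T extends TableName> = Database['public']['Tables'][T]['Row'];
type TablesInsert<T extends TableName> = Database['public']['Tables'][T]['Insert'];
type TablesUpdate<T extends TableName> = Database['public']['Tables'][T]['Update'];

// A procedure input typed from the schema: only known columns with matching types compile.
function renameAndReprioritise(name: string, priority: number): TablesUpdate<'policies'> {
  return { name, priority };
}

// @ts-expect-error 'nmae' is not a column of policies, so the compiler rejects it.
const typo: TablesUpdate<'policies'> = { nmae: 'night-lock' };

const patch = renameAndReprioritise('night-lock', 10);
```

The `@ts-expect-error` line marks the kind of mistake that is caught before any request is sent: a misspelled column never reaches the API.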
Newly added tables not yet in the generated types (currently
organisation_deletion_log) use a clearly-marked untyped escape hatch until
supabase gen types is re-run.
API fidelity (no direct SQL)
A core principle of this harness is that tests exercise the product the way real clients do: through edge functions and PostgREST/RPC with a user JWT (RLS enforced), never via privileged SQL. The web and mobile apps talk to PostgREST with the user’s token; the harness does the same via ctx.asAdmin.
Every feature procedure uses a real API. The service-role client (which
bypasses RLS, a path no client can take) is confined to two non-product roles, test scaffolding and post-deletion verification, which together cover three usages:
| Usage | Where | Why it’s not an API call |
|---|---|---|
| Confirm the bootstrap admin’s email | test fixtures | The first admin of a brand-new org can’t be invited in-app. It registers via the public auth.signUp API; only the email-confirmation step uses admin access, standing in for the user clicking the confirmation link. |
| Cleanup leftovers | test fixtures | Safety-net teardown runs only after a mid-run failure; teardownOrg calls the offboard API first. |
| Verify deletion | lifecycle case | After offboard the user, session and rows are gone (and RLS-hidden); proving their absence and the audit record requires the privileged client. |
With USER_SEED_MODE=signup_confirm, the bootstrap admin goes through the real public signup, followed by an admin email confirmation that mimics clicking the link. Members are added through the real invite-user
function. The only irreducible admin access is therefore the single confirmation step,
cleanup, and post-deletion verification.
All operations under test are driven through edge functions or user-JWT PostgREST/RPC.
Service-role access appears only in scaffolding and post-deletion verification, each clearly commented in code.
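One way to keep that boundary visible in code is to make every service-role use name its purpose. The sketch below is an assumption about how this could look, not the harness implementation: `ServiceRolePurpose`, `Keys`, `userHeaders`, and `serviceRoleAccess` are hypothetical names, and the header layout assumes the usual Supabase convention of an `apikey` header plus a bearer token.

```typescript
// Hypothetical names throughout; the purposes mirror the three usages in the table above.
type ServiceRolePurpose = 'confirm-bootstrap-email' | 'cleanup-leftovers' | 'verify-deletion';

interface Keys { anonKey: string; serviceRoleKey: string }

// Product path: the anon key identifies the project and the user JWT carries identity,
// so row-level security applies exactly as it does for the web and mobile apps.
function userHeaders(keys: Keys, userJwt: string): Record<string, string> {
  return { apikey: keys.anonKey, Authorization: `Bearer ${userJwt}` };
}

// Scaffolding path: the caller must name one of the allowed purposes, so any new
// service-role use shows up as a change to the ServiceRolePurpose union.
function serviceRoleAccess(
  keys: Keys,
  purpose: ServiceRolePurpose,
): { purpose: ServiceRolePurpose; headers: Record<string, string> } {
  return {
    purpose,
    headers: { apikey: keys.serviceRoleKey, Authorization: `Bearer ${keys.serviceRoleKey}` },
  };
}
```

With this shape, a feature procedure has no way to obtain privileged headers without adding a new purpose, which a reviewer would see in the diff.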
Environment & safety
These tests perform destructive operations (they create and then delete organisations and users). User accounts are seeded directly via the Admin API with known passwords and no invitation email, which is how the harness avoids the email round-trip that made the production invite flow untestable.
Running
Test Plan Catalogue
Ten Test Plans cover the typical product operations ported from test-api.sh,
plus the onboard/offboard lifecycle.
TP-ORG — Organisation Lifecycle
Verify an organisation can be onboarded and fully erased (GDPR), with cascade deletion and an audit record.
| Case | Steps |
|---|---|
| TC-LIFECYCLE — onboard → offboard | Seed admin → onboard → default profile via trigger → offboard rejected without confirmation → cascade delete → auth PII erased → audit record persists |
TP-DEVICE — Device Management
Verify device registration, idempotent re-registration, capability sync, and validation.
| Case | Steps |
|---|---|
| TC-DEVICE | Register new (isNew=true) → re-register (isNew=false) → sync capabilities → reject unknown device → list device |
TP-TAG — NFC Tag Management
Verify NFC tag creation, listing, and duplicate rejection.
| Case | Steps |
|---|---|
| TC-TAG | Create tag → list tag → reject duplicate tag_uid (409) |
TP-SCAN — Scan Resolution
Verify NFC scan profile resolution, event logging, and error paths.
| Case | Steps |
|---|---|
| TC-SCAN | Resolve LOCK → resolve UNLOCK → events logged → reject unregistered device → reject unregistered tag |
TP-RPROFILE — Restriction Profiles
Verify CRUD for restriction profiles.
| Case | Steps |
|---|---|
| TC-RPROFILE-CRUD | Create → list (incl. default) → update → delete |
TP-POLICY — Policies
Verify the full policy lifecycle including the schedule variant.
| Case | Steps |
|---|---|
| TC-POLICY-CRUD | Create tag_scan → read (joined) → list → update name+priority → toggle off/on → create schedule policy → delete both |
TP-USER — User Management
Verify user invitation, listing, validation, deletion, and the self-deletion guard.
| Case | Steps |
|---|---|
| TC-USER | Invite auto-confirmed member → find in profiles → reject invalid role → delete member → block self-deletion |
TP-APPCAT — App Catalogue
Verify app search and category listing (external iTunes proxy) contracts.
| Case | Steps |
|---|---|
| TC-APPCAT | Search returns results array → reject missing query → return categories |
TP-LIMITS — Subscription Limits
Verify plan-limit enforcement on resource creation.
| Case | Steps |
|---|---|
| TC-LIMITS | Fill restriction profiles to max → 403 over limit → fill policies to max → 403 over limit |
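The fill-then-reject pattern in TC-LIMITS can be sketched as a small helper. Everything here is hypothetical: `Create`, `fillToLimit`, and `stubApi` are illustrative names, the 201 success status is an assumption (the source only states the 403), and the stub stands in for a real resource-creating procedure.

```typescript
// Hypothetical shape: `create` stands for any resource-creating procedure.
type Create = (index: number) => Promise<{ status: number }>;

// Creates resources until the org holds `max`, then attempts one more.
// `existing` covers rows already present in a fresh org, such as a default profile.
async function fillToLimit(
  create: Create,
  max: number,
  existing = 0,
): Promise<{ withinLimit: number[]; overLimit: number }> {
  const withinLimit: number[] = [];
  for (let i = existing; i < max; i++) {
    withinLimit.push((await create(i)).status);
  }
  const overLimit = (await create(max)).status;
  return { withinLimit, overLimit };
}

// Stub standing in for the API: accepts `remaining` creations, then answers 403.
function stubApi(remaining: number): Create {
  let created = 0;
  return async () => {
    if (created >= remaining) return { status: 403 };
    created += 1;
    return { status: 201 };
  };
}
```

A case would then assert that every status in `withinLimit` indicates success and that `overLimit` is 403. The `existing` parameter matters for restriction profiles, since a newly onboarded organisation already has its default profile.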
TP-ORGINFO — Organisation Info
Verify organisation read operations.
| Case | Steps |
|---|---|
| TC-ORGINFO | Return stats + limits → read org details → list Stripe products |
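For traceability, the plans above could be recorded in src/catalogue.ts along the following lines. This is a sketch under assumptions: the `TestPlan` field names and the `planForCase` helper are hypothetical, and only two of the ten plans are shown.

```typescript
// Hypothetical shape for entries in src/catalogue.ts; field names are assumptions.
interface TestPlan {
  id: string;
  title: string;
  objective: string;
  cases: string[];
}

const catalogue: TestPlan[] = [
  {
    id: 'TP-DEVICE',
    title: 'Device Management',
    objective: 'Verify device registration, idempotent re-registration, capability sync, and validation.',
    cases: ['TC-DEVICE'],
  },
  {
    id: 'TP-POLICY',
    title: 'Policies',
    objective: 'Verify the full policy lifecycle including the schedule variant.',
    cases: ['TC-POLICY-CRUD'],
  },
];

// Traceability lookup: which plan owns a given case?
function planForCase(caseId: string): TestPlan | undefined {
  return catalogue.find((plan) => plan.cases.includes(caseId));
}
```

A reviewer reading a failed case ID in a report could use a lookup like `planForCase('TC-POLICY-CRUD')` to find the plan and objective it belongs to.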
Coverage vs test-api.sh
Every operation in the original script is represented as a procedure and
exercised by a case. The key improvements:
Each operation is a reusable, typed procedure (not inline curl).
Cases run procedures in order with real pass/fail assertions.
Per-run organisation isolation replaces shared-org mutation.
Onboarding and offboarding are first-class tested lifecycles.
Limit enforcement runs against a fresh org filled to its plan maximum.
Roadmap
The directory layout is structured so the next layers slot in without rework:
Load testing (k6)
Reuse the seeded-user + edge-function patterns to script throughput/latency runs under load/.
E2E (Playwright)
Drive the web UI (e.g. the Settings → Danger Zone offboard flow) under src/e2e/, sharing .env.test.
