Test Plan Catalogue

Audience: engineering, QA, and release/audit reviewers. This document describes the formal test harness in tests/, which supersedes the ad-hoc test-api.sh smoke script. The harness validates typical product operations end to end.

How the harness is organised

The harness separates what an operation is from how it is verified, in three layers:

Procedures

One API operation per file (e.g. createPolicy, registerDevice). Reusable, typed building blocks under src/procedures/.

Cases

Ordered, asserted scenarios composing procedures (e.g. Policy CRUD) under src/cases/*.case.test.ts. Each step is individually reported.

Plans

Groupings of cases by capability, each with a stated objective. Catalogued in src/catalogue.ts for traceability.
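A minimal sketch of how the three layers might relate, using device registration as the example. The names `Procedure`, `FakeApi`, `Step`, `check`, `runSteps` and `CatalogueEntry` are hypothetical, and an in-memory stand-in replaces the real edge function; the actual signatures under src/procedures/ and src/catalogue.ts may differ.

```typescript
// Illustrative only: FakeApi is an in-memory stand-in for the real API.
type ApiResult<T> = { status: number; body: T };

class FakeApi {
  private devices = new Set<string>();

  async registerDevice(deviceId: string): Promise<ApiResult<{ isNew: boolean }>> {
    const isNew = !this.devices.has(deviceId);
    this.devices.add(deviceId);
    return { status: 200, body: { isNew } };
  }
}

// Layer 1: a procedure wraps exactly one API operation and is typed end to end.
type Procedure<In, Out> = (api: FakeApi, input: In) => Promise<ApiResult<Out>>;

const registerDevice: Procedure<{ deviceId: string }, { isNew: boolean }> =
  (api, input) => api.registerDevice(input.deviceId);

// Layer 2: a case is an ordered list of named steps, each with its own assertion.
type Step = { name: string; run: () => Promise<void> };

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

function deviceCase(api: FakeApi): Step[] {
  return [
    {
      name: "register new",
      run: async () => {
        const res = await registerDevice(api, { deviceId: "dev-1" });
        check(res.status === 200 && res.body.isNew, "expected isNew=true");
      },
    },
    {
      name: "re-register",
      run: async () => {
        const res = await registerDevice(api, { deviceId: "dev-1" });
        check(res.status === 200 && !res.body.isNew, "expected isNew=false");
      },
    },
  ];
}

// Runs steps in order and returns the names of the steps that completed.
async function runSteps(steps: Step[]): Promise<string[]> {
  const passed: string[] = [];
  for (const step of steps) {
    await step.run();
    passed.push(step.name);
  }
  return passed;
}

// Layer 3: a plan groups cases under an objective for traceability.
type CatalogueEntry = { id: string; objective: string; cases: string[] };

const catalogue: CatalogueEntry[] = [
  {
    id: "TP-DEVICE",
    objective: "Verify device registration, idempotent re-registration, capability sync, and validation.",
    cases: ["TC-DEVICE"],
  },
];
```

In the real harness Vitest plays the role of `runSteps` and `expect` the role of `check`; the sketch only shows where the boundaries between the layers fall.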

Pass / fail determination

The harness uses Vitest, so every run produces a machine-gradeable result: each step is asserted and the run ends in a single aggregated verdict. This is a step up from the old script, which printed messages but never aggregated a verdict.

Mechanism	Behaviour
Assertions	Each step asserts HTTP status and response/DB shape with `expect`.
Verdict	A case fails if any assertion throws; the run exits non-zero on any failure (CI-gateable).
Isolation	Every case provisions its own organisation and tears it down — no shared state between cases.
Reports	`npm run test:ci` emits `reports/junit.xml` and `reports/results.json` for dashboards.
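The verdict and isolation rules in the table can be sketched as follows. This illustrates the semantics only and is not Vitest's implementation; `runCase`, `exitCode`, `Org` and the injected `provisionOrg`/`teardownOrg` callbacks are hypothetical names.

```typescript
type StepResult = { step: string; ok: boolean; error?: string };
type CaseResult = { name: string; passed: boolean; steps: StepResult[] };
type Org = { id: string };
type CaseStep = { name: string; run: (org: Org) => Promise<void> };

// Provisions a fresh organisation, runs every step against it, and always
// tears the organisation down, so no state leaks into the next case.
async function runCase(
  name: string,
  provisionOrg: () => Promise<Org>,
  teardownOrg: (org: Org) => Promise<void>,
  steps: CaseStep[],
): Promise<CaseResult> {
  const org = await provisionOrg();
  const results: StepResult[] = [];
  try {
    for (const step of steps) {
      try {
        await step.run(org);
        results.push({ step: step.name, ok: true });
      } catch (err) {
        // A thrown assertion marks the step, and therefore the case, as failed.
        const message = err instanceof Error ? err.message : String(err);
        results.push({ step: step.name, ok: false, error: message });
      }
    }
  } finally {
    await teardownOrg(org);
  }
  return { name, passed: results.every((r) => r.ok), steps: results };
}

// The run exits non-zero if any case failed, which is what makes it CI-gateable.
function exitCode(results: CaseResult[]): number {
  return results.every((r) => r.passed) ? 0 : 1;
}
```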

Type awareness

Tests are validated against the same generated database types the application uses. tests/src/types/database.ts re-exports Database, Tables, TablesInsert, TablesUpdate from the frontend’s database.types.ts, and the Supabase clients are typed with Database. REST/RPC procedures (.from('policies'), .update(...)) are therefore checked column-by-column at compile time.

Newly added tables not yet in the generated types (currently organisation_deletion_log) use a clearly-marked untyped escape hatch until supabase gen types is re-run.
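As a rough illustration of what column-by-column checking means, the sketch below uses a simplified stand-in for the generated `Database` type. The `policies` columns shown are assumptions for the example, not the real schema, and the helper types are reduced versions of what `supabase gen types` typically emits.

```typescript
// Simplified stand-in for the generated types; the columns are hypothetical.
type Database = {
  public: {
    Tables: {
      policies: {
        Row: { id: string; name: string; priority: number; enabled: boolean };
        Insert: { id?: string; name: string; priority?: number; enabled?: boolean };
        Update: { id?: string; name?: string; priority?: number; enabled?: boolean };
      };
    };
  };
};

type TableName = keyof Database["public"]["Tables"];
type Tables<T extends TableName> = Database["public"]["Tables"][T]["Row"];
type TablesInsert<T extends TableName> = Database["public"]["Tables"][T]["Insert"];
type TablesUpdate<T extends TableName> = Database["public"]["Tables"][T]["Update"];

// The return type pins the payload to the table's Insert shape.
function buildPolicyInsert(name: string, priority: number): TablesInsert<"policies"> {
  return { name, priority };
}

// An update payload may touch any subset of columns, but only known ones.
function buildPriorityUpdate(priority: number): TablesUpdate<"policies"> {
  return { priority };
}

// A row read back from the API carries every column.
function describePolicy(row: Tables<"policies">): string {
  return `${row.name} (priority ${row.priority}, ${row.enabled ? "on" : "off"})`;
}

// The following would be rejected at compile time (unknown column "nme"):
// const bad: TablesInsert<"policies"> = { nme: "Night lock" };
```

Because the harness re-exports the application's own types, a renamed or dropped column should surface as a compile error in the affected procedure rather than as a runtime failure.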

API fidelity (no direct SQL)

A core principle of this harness is that tests exercise the product the way real clients do: through edge functions and PostgREST/RPC with a user JWT (RLS enforced), never via privileged SQL. The web and mobile apps talk to PostgREST with the user’s token; the harness does the same via ctx.asAdmin. Every feature procedure uses a real API. The service-role client (which bypasses RLS, a path no client can take) is confined to three non-product roles:

Usage	Where	Why it’s not an API call
Confirm the bootstrap admin’s email	test fixtures	The first admin of a brand-new org can’t be invited in-app. It registers via the public `auth.signUp` API; only the email-confirmation step uses admin access, standing in for the user clicking the confirmation link.
Cleanup leftovers	test fixtures	Safety-net teardown runs only after a mid-run failure; `teardownOrg` calls the `offboard` API first.
Verify deletion	lifecycle case	After offboard the user, session and rows are gone (and RLS-hidden); proving their absence and the audit record requires the privileged client.

With email confirmation enabled, the bootstrap admin is created in USER_SEED_MODE=signup_confirm: a real public signup plus an admin email confirmation that mimics clicking the link. Members are added through the real invite-user function. The only irreducible admin access is therefore the single confirmation step, cleanup, and post-deletion verification.

All operations under test are driven through edge functions or user-JWT PostgREST/RPC.

Service-role access appears only in scaffolding and post-deletion verification, each clearly commented in code.
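One way to picture the "API first, privileged client only as a safety net" rule for teardown is the sketch below. It assumes the safety net is triggered when the offboard call fails; `TeardownDeps`, `offboardViaApi` and `cleanupWithServiceRole` are hypothetical names, and the real `teardownOrg` may be structured differently.

```typescript
type TeardownDeps = {
  // Hypothetical: calls the product's offboard API with a user JWT.
  offboardViaApi: (orgId: string) => Promise<void>;
  // Hypothetical: privileged cleanup of leftovers (scaffolding, not a product path).
  cleanupWithServiceRole: (orgId: string) => Promise<void>;
};

type TeardownPath = "api" | "safety-net";

// Tries the real offboard API first; the service-role cleanup runs only if
// that call fails, so the privileged client never replaces the product path.
async function teardownOrg(orgId: string, deps: TeardownDeps): Promise<TeardownPath> {
  try {
    await deps.offboardViaApi(orgId);
    return "api";
  } catch {
    await deps.cleanupWithServiceRole(orgId);
    return "safety-net";
  }
}
```

Injecting the two clients as functions keeps the privileged path visible at the call site, which matches the requirement that each service-role use be clearly commented in code.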

Environment & safety

These tests perform destructive operations (they create and then delete organisations and users).

Running against any non-local target requires ALLOW_DESTRUCTIVE_TESTS=true. Test data is namespaced (e2e-test-*, qa+*) and the safety-net teardown can only remove resources the run itself created — never a pre-existing tenant.
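The guard described above might look roughly like this. Only the ALLOW_DESTRUCTIVE_TESTS variable and the e2e-test-* and qa+* prefixes come from this document; the function names, the definition of "local", and the idea of tracking created IDs in a set are assumptions made for the sketch.

```typescript
const TEST_NAME_PATTERNS: RegExp[] = [/^e2e-test-/, /^qa\+/];

// Assumption: "local" means a loopback host.
function isLocalTarget(targetUrl: string): boolean {
  const host = new URL(targetUrl).hostname;
  return host === "localhost" || host === "127.0.0.1";
}

// Throws unless the target is local or the operator has explicitly opted in.
function assertDestructiveAllowed(
  targetUrl: string,
  env: Record<string, string | undefined>,
): void {
  if (isLocalTarget(targetUrl)) return;
  if (env.ALLOW_DESTRUCTIVE_TESTS !== "true") {
    throw new Error(
      `Refusing to run destructive tests against ${targetUrl}: set ALLOW_DESTRUCTIVE_TESTS=true`,
    );
  }
}

// True only for names in the harness's own namespace.
function isTestResource(name: string): boolean {
  return TEST_NAME_PATTERNS.some((pattern) => pattern.test(name));
}

// Safety-net rule: a resource may be removed only if it is namespaced as test
// data AND was created by the current run.
function mayTeardown(name: string, id: string, createdThisRun: Set<string>): boolean {
  return isTestResource(name) && createdThisRun.has(id);
}
```

A call such as `assertDestructiveAllowed(targetUrl, process.env)` at the top of the global setup would stop a misconfigured run before any organisation is created.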

User accounts are seeded directly via the Admin API with known passwords and no invitation email, which is how the harness avoids the email round-trip that made the production invite flow untestable.

Running

```shell
cd tests
npm install
cp .env.test.example .env.test   # fill in keys + ALLOW_DESTRUCTIVE_TESTS
npm test            # all plans, non-zero exit on failure
npm run test:ci     # + JUnit/JSON reports
npm run test:cases  # just the case suites
```

Ten Test Plans cover the typical product operations ported from test-api.sh, plus the onboard/offboard lifecycle.

TP-ORG — Organisation Lifecycle

Verify an organisation can be onboarded and fully erased (GDPR), with cascade deletion and an audit record.

Case	Steps
TC-LIFECYCLE — onboard → offboard	Seed admin → onboard → default profile via trigger → offboard rejected without confirmation → cascade delete → auth PII erased → audit record persists

TP-DEVICE — Device Management

Verify device registration, idempotent re-registration, capability sync, and validation.

Case	Steps
TC-DEVICE	Register new (isNew=true) → re-register (isNew=false) → sync capabilities → reject unknown device → list device

TP-TAG — NFC Tag Management

Verify NFC tag creation, listing, and duplicate rejection.

Case	Steps
TC-TAG	Create tag → list tag → reject duplicate `tag_uid` (409)

TP-SCAN — Scan Resolution

Verify NFC scan profile resolution, event logging, and error paths.

Case	Steps
TC-SCAN	Resolve LOCK → resolve UNLOCK → events logged → reject unregistered device → reject unregistered tag

TP-RPROFILE — Restriction Profiles

Verify CRUD for restriction profiles.

Case	Steps
TC-RPROFILE-CRUD	Create → list (incl. default) → update → delete

TP-POLICY — Policies

Verify the full policy lifecycle including the schedule variant.

Case	Steps
TC-POLICY-CRUD	Create tag_scan → read (joined) → list → update name+priority → toggle off/on → create schedule policy → delete both

TP-USER — User Management

Verify user invitation, listing, validation, deletion, and the self-deletion guard.

Case	Steps
TC-USER	Invite auto-confirmed member → find in profiles → reject invalid role → delete member → block self-deletion

TP-APPCAT — App Catalogue

Verify app search and category listing (external iTunes proxy) contracts.

Case	Steps
TC-APPCAT	Search returns results array → reject missing query → return categories

TP-LIMITS — Subscription Limits

Verify plan-limit enforcement on resource creation.

Case	Steps
TC-LIMITS	Fill restriction profiles to max → 403 over limit → fill policies to max → 403 over limit
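The fill-then-overflow pattern in TC-LIMITS can be sketched against an in-memory stand-in. The 403 over the limit comes from the case above; the 201 on success, the `FakeProfileApi` class and the `fillAndOverflow` helper are assumptions for illustration. The `existing` argument models rows already present in a fresh org, such as the default profile created by trigger.

```typescript
// In-memory stand-in for the restriction-profile endpoint.
class FakeProfileApi {
  private count: number;
  private readonly max: number;

  constructor(max: number, existing: number) {
    this.max = max;
    this.count = existing;
  }

  async createProfile(): Promise<{ status: number }> {
    if (this.count >= this.max) return { status: 403 }; // over plan limit
    this.count += 1;
    return { status: 201 }; // assumed success status
  }
}

// Creates `remaining` profiles to reach the plan maximum, then attempts one
// more. Returns every status so the case can assert on each step.
async function fillAndOverflow(api: FakeProfileApi, remaining: number): Promise<number[]> {
  const statuses: number[] = [];
  for (let i = 0; i < remaining; i += 1) {
    statuses.push((await api.createProfile()).status);
  }
  statuses.push((await api.createProfile()).status);
  return statuses;
}
```

Running this against a fresh org per case is what makes the count predictable: the case knows exactly how many rows exist before it starts filling.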

TP-ORGINFO — Organisation Info

Verify organisation read operations.

Case	Steps
TC-ORGINFO	Return stats + limits → read org details → list Stripe products

Coverage vs `test-api.sh`

Every operation in the original script is represented as a procedure and exercised by a case. The key improvements:

Each operation is a reusable, typed procedure (not inline curl).

Cases run procedures in order with real pass/fail assertions.

Per-run organisation isolation replaces shared-org mutation.

Onboarding and offboarding are first-class tested lifecycles.

Limit enforcement runs against a fresh org filled to its plan maximum.

Roadmap

The directory layout is structured so the next layers slot in without rework:

Load testing (k6)

Reuse the seeded-user + edge-function patterns to script throughput/latency runs under load/.

E2E (Playwright)

Drive the web UI (e.g. the Settings → Danger Zone offboard flow) under src/e2e/, sharing .env.test.