🗂️ Devlog #6 – The Subtle Hell of Schema Design
You’d think by now I’d be used to it — the endless back-and-forth of schema design. But somehow it always manages to sneak up on me. The seduction of structure, followed by the slow grind of realising that your clever abstraction doesn’t quite match reality. Again.
I’ve been refactoring the data models for The Planner’s Assistant — not in some grand, sweeping overhaul, but in a deeply fiddly way. Field by field. Table by table. Cleaning up where AI enrichment lives, how to link vector embeddings back to chunked documents, whether policies should be keyed by `policy_ref` or just a foreign key to their parent document. All the things nobody sees, but everything depends on.
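To make that concrete, here is a rough sketch of the shape I’m converging on, written in a SQLAlchemy style. The class and table names are illustrative rather than the real models; the point is that `policy_ref` stays as a human-readable label while everything hangs off UUID primary keys and a foreign key back to the parent document, and the embedding table carries nothing beyond its link to the chunk.

```python
# Illustrative only: a minimal sketch of the document -> chunk -> embedding
# chain, with policies keyed by a surrogate UUID plus a parent-document FK
# rather than by policy_ref itself.
import uuid

from sqlalchemy import ForeignKey, String, Text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class Document(Base):
    __tablename__ = "documents"
    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)
    title: Mapped[str] = mapped_column(String(255))


class Chunk(Base):
    __tablename__ = "chunks"
    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)
    document_id: Mapped[uuid.UUID] = mapped_column(ForeignKey("documents.id"))
    text: Mapped[str] = mapped_column(Text)


class Embedding(Base):
    __tablename__ = "embeddings"
    # One row per chunk. The vector column itself is elided because it depends
    # on the store in use; no enrichment fields live at this level.
    chunk_id: Mapped[uuid.UUID] = mapped_column(
        ForeignKey("chunks.id"), primary_key=True
    )


class Policy(Base):
    __tablename__ = "policies"
    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)
    document_id: Mapped[uuid.UUID] = mapped_column(ForeignKey("documents.id"))
    # policy_ref stays as a human-readable label, not the primary key.
    policy_ref: Mapped[str] = mapped_column(String(50), index=True)
```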
Some recent updates:
- Harmonised IDs across documents, chunks, policies, and constraints (no more slug confusion).
- Made `cross_references` between policies more consistent so the reasoning graph can build cleanly in memory.
- Updated the embedding table to avoid unnecessary enrichment fields at the chunk level — only storing what I actually use.
- Clarified which fields are authored (by planners), parsed (by the system), or generated (by AI), so I can later filter provenance.
- Built in support for constraint sources that don’t link back to any one policy (e.g. from central datasets or statutory designations); both this and the provenance split are sketched after the list.
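To pin down those last two items, here is a hedged sketch in the same SQLAlchemy style as above. The `Provenance` values and the `Constraint` fields are my illustration of the idea, not the actual schema: each record carries a marker saying who or what produced it, and the policy link is nullable so constraints from central datasets or statutory designations can stand on their own.

```python
# Illustrative only: provenance as an enum, and a constraint whose policy
# link is optional. Assumes the Base and policies table from the earlier sketch.
import enum
import uuid
from typing import Optional

from sqlalchemy import Enum, ForeignKey, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):  # same role as Base in the earlier sketch
    pass


class Provenance(enum.Enum):
    AUTHORED = "authored"      # written by a planner
    PARSED = "parsed"          # extracted by the system
    GENERATED = "generated"    # produced by AI enrichment


class Constraint(Base):
    __tablename__ = "constraints"
    id: Mapped[uuid.UUID] = mapped_column(primary_key=True, default=uuid.uuid4)
    name: Mapped[str] = mapped_column(String(255))
    # Which process produced this record, so provenance can be filtered later.
    provenance: Mapped[Provenance] = mapped_column(Enum(Provenance))
    # Nullable on purpose: constraints from central datasets or statutory
    # designations don't trace back to any single policy.
    policy_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        ForeignKey("policies.id"), nullable=True
    )
```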
None of this feels glamorous. It’s the plumbing. But like all plumbing, it only matters when it leaks — or when it stops someone from flushing ideas through the system.
The frontend still lags slightly behind — mock data in Svelte uses counter-based string IDs (`pol-1` etc.), while the backend is all UUIDs. But that can be bridged. More urgently, the prompts that generate `officer_notes` often just regurgitate the policy summary — I need to decide what genuinely adds value, and whether that field is even needed.
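One low-effort way to bridge the ID gap, at least until the mock data is replaced, would be to derive stable UUIDs from the counter-based strings. This is just a sketch of that idea, not something wired into the app; the namespace constant is made up for the example.

```python
# Sketch: map counter-based mock IDs (e.g. "pol-1") to deterministic UUIDs so
# mock fixtures and backend records can be matched up until the frontend
# switches to real UUIDs.
import uuid

# Hypothetical namespace just for mock data; not part of the real schema.
MOCK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "planners-assistant/mock")


def mock_id_to_uuid(mock_id: str) -> uuid.UUID:
    """Return a stable UUID for a counter-based string ID like 'pol-1'."""
    return uuid.uuid5(MOCK_NAMESPACE, mock_id)


print(mock_id_to_uuid("pol-1"))  # same UUID every run, so fixtures stay stable
```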
There is a structured prompt library — at least for agentic retrieval — but it still needs curation and some clearer metadata. I’m writing schemas for the LLMs as much as for the DB, and the two need to stay in sync. Especially once I start running agentic traces or re-ranking based on policy weightings.
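As a sketch of what "clearer metadata" might look like, assuming something like Pydantic on the prompt side, each prompt entry could declare which DB fields it reads and which enriched fields it is allowed to write — that declaration is what would let the prompt schemas and the DB schemas be checked against each other. The field names below are illustrative, not the real ones.

```python
# Illustrative only: metadata attached to each entry in the prompt library.
from pydantic import BaseModel


class PromptMetadata(BaseModel):
    name: str                      # stable identifier for the prompt
    purpose: str                   # what the prompt is for (e.g. agentic retrieval)
    db_fields_read: list[str]      # DB columns the prompt consumes
    db_fields_written: list[str]   # enriched fields the prompt may populate
    version: int = 1               # bump when wording or expected output changes


example = PromptMetadata(
    name="officer_notes_draft",
    purpose="agentic retrieval follow-up",
    db_fields_read=["policy.summary", "chunk.text"],
    db_fields_written=["policy.officer_notes"],
)
```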
Maybe that’s the art of it: making the invisible solid. Designing schemas that don’t just "work," but make sense to the next person (or the next version of myself) who needs to query, mutate, or explain the data.
Still to come:
- Frontend type harmonisation
- Prompt library metadata and enrichment audit
- Fun static site experiment for policy summaries and "hypocrisy" dashboards
Not the most exciting week — but necessary. There’s a quiet satisfaction in knowing the foundation is strong, even if nobody claps for the database.