Material Considerations

Why Off-the-Shelf AI Tools Fall Short in Planning

When new “AI for planning” demos appear, it’s tempting to believe the problem is solved: upload a Local Plan, ask a question, get an instant answer. To a casual observer, that looks revolutionary. But the reality is more complicated. Planning is not just about retrieving text; it is about judgement, balance, and navigating a tangle of exceptions, caveats, and competing aims.

What makes planning challenging for AI

Planning documents are structured in ways that make sense to officers, inspectors, and practitioners, but not necessarily to a machine: policies cross-reference one another, come hedged with exceptions and caveats, pull towards competing aims, and carry different weight depending on context.

Generic language models can repeat text back, but they don’t instinctively understand these dynamics. Without careful structuring, they risk treating every paragraph as equally important, or offering confident but shallow answers.

The quirks of language models

Large language models are pattern recognisers. They can appear fluent, but left to themselves they will answer confidently whether or not the source material supports them, gloss over exceptions and caveats, and give every passage the same weight.

The raw model is an engine, but it needs a gearbox, steering, and brakes to function in planning. That supporting structure — sometimes called scaffolding or context engineering — is what separates a useful planning tool from a novelty demo.
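
To make that concrete, context engineering can be as simple as refusing to hand the model a wall of undifferentiated text. The sketch below is a minimal illustration, assuming a hypothetical PolicyExtract record and whatever retrieval step sits in front of it: each extract is labelled with its place in the policy hierarchy and the model is told to surface conflicts rather than smooth over them. It is a sketch of the idea, not a production design.

    from dataclasses import dataclass

    @dataclass
    class PolicyExtract:
        reference: str   # e.g. "Local Plan Policy H3"
        tier: str        # "national", "local plan" or "supplementary"
        text: str

    # Hierarchy matters: national policy, then the local plan, then guidance.
    TIER_ORDER = {"national": 0, "local plan": 1, "supplementary": 2}

    def build_context(question: str, extracts: list[PolicyExtract]) -> str:
        """Label each extract with its tier and ask the model to flag conflicts."""
        ordered = sorted(extracts, key=lambda e: TIER_ORDER.get(e.tier, 99))
        blocks = [f"[{e.tier.upper()}] {e.reference}\n{e.text}" for e in ordered]
        return (
            "You are assisting a planning officer. Answer only from the extracts "
            "below, cite every policy you rely on, and say explicitly where the "
            "policies pull in different directions.\n\n"
            + "\n\n".join(blocks)
            + f"\n\nQuestion: {question}"
        )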

Why wrappers aren’t enough

A number of emerging products are little more than wrappers: a search engine that fetches policy text, then feeds it into a model like ChatGPT for summarising. That may look slick in a demo, but it falls apart under scrutiny: the model only ever sees whatever the search step happens to return, and it brings no understanding of planning weight or judgement to what it is given.

In short: wrappers repackage a generic model, but they don’t change its limits. Without re-engineering the underlying model and surrounding it with proper scaffolding, the system cannot rise to the level of planning reasoning.
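
For contrast, a wrapper in the sense used here often amounts to little more than the glue code sketched below. The search and model calls are stand-ins for whatever off-the-shelf components a vendor has plugged together; nothing in the function encodes policy hierarchy, exceptions, or weight.

    from typing import Callable

    def wrapper_answer(
        question: str,
        search_policies: Callable[[str, int], list[str]],  # any off-the-shelf search
        llm: Callable[[str], str],                          # any generic chat model
    ) -> str:
        """Fetch some passages, ask for a summary. No planning judgement involved."""
        passages = search_policies(question, 5)
        prompt = (
            "Summarise the following policy text to answer the question.\n\n"
            + "\n\n".join(passages)
            + f"\n\nQuestion: {question}"
        )
        return llm(prompt)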

What planners and councils should be asking

Procurement teams and officers should cut through the marketing by asking how the underlying model has been adapted for planning, what it has been trained or fine-tuned on, how its answers are traced back to source policy, and what happens when the policy framework changes.

If the vendor can’t answer clearly, the product is probably just a wrapper — and you should treat its results with caution.

A better way forward

AI can support planning, but only if designed with the system’s complexity in mind: source documents structured so that hierarchy and exceptions survive, a model surrounded with the scaffolding described above, and fine-tuning on how planners actually reason.

Fine-tuning matters because it teaches a model the distinctive “rules of the game” in planning: what counts as significant, how to balance policy trade-offs, and how inspectors and officers frame their reasoning. Generic models may know a lot about language, but without this specialist grounding they will stumble over the very points that make planning decisions defensible.
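
In practice, teaching a model those rules means assembling worked examples of the reasoning you want it to reproduce. The record below is a sketch of what one training example might look like in a simple instruction-and-response format; the scenario and wording are illustrative, not drawn from a real decision, and a real dataset would be built carefully from officer reports and appeal decisions at far greater scale.

    import json

    # One illustrative training record: the pattern of reasoning, not legal advice.
    record = {
        "instruction": (
            "An application proposes extending a listed building within a "
            "conservation area. Set out how the harm and the benefits should "
            "be weighed."
        ),
        "response": (
            "Identify the policies engaged: heritage policy is restrictive, while "
            "design and regeneration policies may weigh in favour. Assess the harm "
            "to the significance of the heritage asset, weigh it against the public "
            "benefits, and reach a conclusion that cites the specific policies "
            "relied on."
        ),
    }

    with open("planning_examples.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")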

An example: policy conflicts in practice

Consider a housing application in an area of designated green belt. A generic model might simply quote back the restrictive wording of green belt policy, or, conversely, cite the local housing target. A model fine-tuned on planning practice would recognise the need to weigh these policies together: to test whether “very special circumstances” apply, to consider precedent from appeal decisions, and to show how inspectors typically reason through such conflicts. But crucially, it would also be designed to adapt when the policy framework changes, rather than locking in yesterday’s precedents. That shift — from static retrieval to dynamic, context-aware reasoning — is what makes the difference between a useful assistant and an unreliable shortcut.
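
One way to picture that shift: the fine-tuned model supplies the reasoning pattern, while the policies currently in force and any recent appeal decisions are passed in at the moment of the query, so nothing is frozen inside the model’s memory. The sketch below assumes a hypothetical model object with a complete method; it shows the shape of the design rather than a finished interface.

    def assess_green_belt_application(
        proposal: str,
        current_policies: list[str],   # live extracts: green belt, housing supply, etc.
        recent_appeals: list[str],     # short summaries of relevant appeal decisions
        model,                         # fine-tuned planning model exposing .complete(prompt)
    ) -> str:
        prompt = (
            "Proposal:\n" + proposal + "\n\n"
            "Policies currently in force:\n" + "\n".join(current_policies) + "\n\n"
            "Recent appeal decisions:\n" + "\n".join(recent_appeals) + "\n\n"
            "Weigh the restrictive and supportive policies together, test whether "
            "very special circumstances apply, and set out the planning balance, "
            "citing the extracts above rather than remembered policy."
        )
        return model.complete(prompt)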

Done right, AI can free planners from repetitive work and help surface key issues faster. Done wrong, it risks undermining trust before the technology has even had a chance to prove its worth.