A guide for people who need to understand the system, not build it
Every document that Digital Reef monitors - plan changes, resource consents, gazette notices, legislation - gets tagged across five independent dimensions. These tags describe WHAT a document IS, never what it MEANS. Two people from different backgrounds must both accept any tag as accurate, even if they draw opposite conclusions.
Where the document was detected. Every document enters the system through a specific channel - gazette notices, fast-track applications, council emails, RSS feeds, or direct uploads.
What kind of regulatory action the document represents. Plan changes, resource consents, hearings, submissions, designations - the procedural context that determines timelines and legal weight.
What subject matter the document covers. Freshwater quality, biodiversity, conservation land, coastal hazards, transport corridors - the substantive topics that determine who cares about it.
Which places the document relates to, drawn from 189,000 locations with precise geometry from LINZ and DOC. Every place tag carries a PostGIS geometry - point, line, or polygon - so documents can be found on a map, not just in a search box.
When the document matters. Time tags record temporal characteristics - deadline types and their states - such as submission windows, hearing dates, and decision deadlines. They make it possible to answer "what do I need to act on this week?"
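To make the "this week" question concrete, here is a minimal sketch of how time tags could be filtered. The `TimeTag` fields, the state values "open" and "closed", and the `due_within` helper are illustrative assumptions, not Digital Reef's actual schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class TimeTag:
    document_id: str
    deadline_type: str  # assumed values, e.g. "submission_window", "hearing", "decision"
    state: str          # assumed values: "open" or "closed"
    due: date


def due_within(tags, today, days=7):
    """Return open deadlines falling between today and today + days, soonest first."""
    cutoff = today + timedelta(days=days)
    hits = [t for t in tags if t.state == "open" and today <= t.due <= cutoff]
    return sorted(hits, key=lambda t: t.due)


# Made-up example data.
tags = [
    TimeTag("doc-1", "submission_window", "open", date(2025, 3, 5)),
    TimeTag("doc-2", "hearing", "open", date(2025, 3, 20)),
    TimeTag("doc-3", "decision", "closed", date(2025, 3, 4)),
]
this_week = due_within(tags, today=date(2025, 3, 3))
print([t.document_id for t in this_week])
```

Only the open deadline inside the seven-day window is returned. The closed decision and the later hearing are left out.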
Guilds are communities of interest - groups of people who share a relationship with a particular environment or concern. There are six guilds, each with its own colour and icon.
Guilds are derived from tags - they are a cosmetic grouping, not a separate classification. The same document might be relevant to Waterspace (it mentions a river), Conservation (it affects native fish habitat), and Landspace (it is near a walking track).
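One way to picture "derived from tags" is a lookup that computes guild relevance on demand instead of storing it. The sketch below is an assumption about how such a rule table might look. The trigger tags are invented for illustration, and only the three guilds named in this guide appear.

```python
# Hypothetical rules: each guild lists the tags that make a document relevant to it.
GUILD_RULES = {
    "Waterspace": {"river", "freshwater quality"},
    "Conservation": {"native fish habitat", "biodiversity"},
    "Landspace": {"walking track"},
}


def guilds_for(tags):
    """Derive guild relevance from a document's tags; nothing is stored per guild."""
    tag_set = set(tags)
    return sorted(g for g, triggers in GUILD_RULES.items() if tag_set & triggers)


doc_tags = ["river", "native fish habitat", "walking track"]
print(guilds_for(doc_tags))
```

Because the guilds are computed from tags, changing a rule changes which documents a guild sees without re-tagging any document.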
When a document arrives, it goes through a classification pipeline.
Tags are transparent and challengeable. When the system is uncertain, it says so.
Tags record facts, not opinions. A document about a river gets tagged with the river's name, not with whether the river is healthy or polluted. Interpretation is for humans.
Each dimension is independent. A document's source (where it came from) tells you nothing about its domain (what it covers). This means five simple lists, not one impossible tree.
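The "five lists, not one tree" point can be sketched with a few toy vocabularies. The dimension keys "process", "place", and "time" are assumed labels (the guide names only source and domain explicitly), and the list contents are shortened examples taken from the descriptions above plus placeholder place names.

```python
from math import prod

# Hypothetical, heavily shortened vocabularies: one flat list per dimension.
VOCABULARIES = {
    "source": ["gazette notice", "fast-track application", "council email",
               "RSS feed", "direct upload"],
    "process": ["plan change", "resource consent", "hearing", "submission",
                "designation"],
    "domain": ["freshwater quality", "biodiversity", "conservation land",
               "coastal hazards", "transport corridors"],
    "place": ["example river", "example track"],  # placeholders, not real gazetteer entries
    "time": ["submission window", "hearing date", "decision deadline"],
}


def invalid_tags(doc_tags):
    """Check each dimension against its own list only; return failing (dimension, tag) pairs."""
    return [
        (dim, tag)
        for dim, tags in doc_tags.items()
        for tag in tags
        if tag not in VOCABULARIES.get(dim, [])
    ]


doc = {
    "source": ["council email"],
    "process": ["plan change"],
    "domain": ["freshwater quality", "biodiversity"],
    "place": ["example river"],
    "time": ["submission window"],
}
print(invalid_tags(doc))  # each dimension is validated with no cross-dimension rules

sizes = [len(v) for v in VOCABULARIES.values()]
entries_as_lists = sum(sizes)  # entries to maintain as five flat lists
leaves_as_tree = prod(sizes)   # leaves if every combination were a path in one tree
print(entries_as_lists, leaves_as_tree)
```

Even with these toy lists, five flat vocabularies need 20 entries, while a single tree covering every combination would need 750 leaves. The gap widens quickly as the lists grow.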
Every tag shows its reasoning. When a tag is applied by an LLM, the prompt and confidence score are recorded. When a gazetteer matches a place name, the match type and distance are stored.
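A record along these lines could hold the reasoning described above. This is a sketch under assumptions: the `TagEvidence` name, the field names, the method labels "llm" and "gazetteer", and the use of metres for distance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TagEvidence:
    tag: str
    method: str                         # "llm" or "gazetteer" (assumed labels)
    prompt: Optional[str] = None        # recorded for LLM-applied tags
    confidence: Optional[float] = None  # recorded for LLM-applied tags
    match_type: Optional[str] = None    # recorded for gazetteer matches
    distance_m: Optional[float] = None  # recorded for gazetteer matches; metres is an assumption

    def explain(self):
        """Return a one-line, human-readable account of why the tag was applied."""
        if self.method == "llm":
            return f"{self.tag}: LLM, confidence {self.confidence:.2f}, prompt {self.prompt!r}"
        if self.method == "gazetteer":
            return f"{self.tag}: gazetteer {self.match_type} match, {self.distance_m:.0f} m away"
        return f"{self.tag}: no recorded reasoning"


# Made-up example records.
llm_tag = TagEvidence("freshwater quality", "llm",
                      prompt="Which domains does this document cover?",
                      confidence=0.82)
place_tag = TagEvidence("example river", "gazetteer",
                        match_type="exact", distance_m=0.0)
print(llm_tag.explain())
print(place_tag.explain())
```

Keeping the evidence next to the tag is what lets a reader challenge it: a low confidence score or a distant gazetteer match is visible rather than hidden.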