Five sentences.
The Standards Companion is a suite of advisory, draft-only tools deployed inside the BIES standards lifecycle to accelerate the pattern-recognition and structured-drafting work that currently consumes 1,500 to 2,800 analyst hours per year. The architecture is human-in-the-loop by design: the model never publishes, never decides, never acts, and every output is a draft that an analyst reviews and signs out. Risk is managed by classifying every tool use into one of four tiers, each carrying defined controls, decision rights, and audit obligations. The directorate retains full authority over IHS development, consultation, revision, and publication; the tool produces drafts faster, and the analyst remains accountable. Implementation is staged: a single tool on a single workstream, behind a 90-day evaluation gate, before any further deployment.
How the Standards Companion fits inside the BIES day.
The workflow below describes how an analyst, drafter, or notification author would use a Standards Companion tool in a normal working week, and how the resulting artefact moves through the directorate's existing approval chain.
1. Trigger
A defined event triggers tool use: consultation closes (Tool 1), drafting begins (Tool 2), guidance refresh required (Tool 3), amendment issued (Tool 4). Triggers are documented in the IHS workstream procedure. No tool is invoked outside a defined trigger.
2. Scope check
The analyst confirms the work is in scope for tool use: it falls within the four supported workflows and requires no scientific risk assessment, no legal interpretation, and no Te Tiriti or stakeholder authority decision. Out-of-scope work bypasses the tool entirely.
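A minimal sketch of how this scope gate could be encoded as a pre-flight check. The workflow names match the triggers in step 1; the function and field names are illustrative, not an existing BIES system.

```python
from dataclasses import dataclass

# The four supported workflows, matching the triggers in step 1.
SUPPORTED_WORKFLOWS = {
    "submission_synthesis",  # Tool 1
    "consistency_check",     # Tool 2
    "guidance_refresh",      # Tool 3
    "change_explainer",      # Tool 4
}

@dataclass
class WorkItem:
    workflow: str
    needs_risk_assessment: bool
    needs_legal_interpretation: bool
    needs_partnership_or_authority_decision: bool  # Te Tiriti / stakeholder authority

def in_scope(item: WorkItem) -> bool:
    """All scope rules must pass; out-of-scope work bypasses the tool entirely."""
    return (
        item.workflow in SUPPORTED_WORKFLOWS
        and not item.needs_risk_assessment
        and not item.needs_legal_interpretation
        and not item.needs_partnership_or_authority_decision
    )
```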
3. Tool run
The analyst uses the tool, supplies inputs, reviews the output. Every model call is logged automatically: prompt version, input, output, timestamp, user identifier. Logging is non-negotiable and not user-controlled.
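The log record implied by this step, sketched as a frozen (edit-proof) structure. Field names are assumptions; the substance is that every field is captured automatically and none is user-suppressible.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen: a run record cannot be edited after the call
class ToolRunLog:
    prompt_version: str    # versioned system prompt, e.g. "tool1-v0.3"
    input_text: str
    output_text: str
    user_id: str
    workstream_id: str     # ties the run to the IHS workstream record
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
```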
4. Analyst review
The analyst reads the output line by line. Every cited quote is verified against source. Every disposition is reviewed. SME-flagged items are routed to the named SME. Confidence-flagged items are addressed before any draft progresses. The analyst marks the output as accepted, modified, or rejected. Rejection generates a structured note for the next prompt iteration.
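One way the accepted/modified/rejected marking and the structured rejection note could be captured; names are illustrative. The constraint that a rejection must carry a note is the point of the sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class ReviewOutcome(Enum):
    ACCEPTED = "accepted"
    MODIFIED = "modified"
    REJECTED = "rejected"

@dataclass
class ReviewRecord:
    run_id: str                            # links back to the logged tool run
    outcome: ReviewOutcome
    quotes_verified: bool                  # every cited quote checked against source
    sme_items_routed: bool                 # flagged items sent to the named SME
    rejection_note: Optional[str] = None   # structured note for the next prompt iteration

    def __post_init__(self):
        # A rejection without a structured note cannot drive prompt iteration,
        # so the record refuses to exist without one.
        if self.outcome is ReviewOutcome.REJECTED and not self.rejection_note:
            raise ValueError("rejected output requires a structured rejection note")
```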
5. SME review (where flagged)
Items the model flags as needing scientific, legal, technical, or policy SME review go to the named SME with the model's draft as background only. The SME's response is the operative input, not the model's draft. The model does not see the SME response.
6. Director sign-out
Any artefact derived from a tool that goes outside BIES (Review of Submissions, IHS amendment, importer notification) carries a Director sign-out under the existing delegation. The sign-out form includes a tick confirming that an AI tool was used and that the analyst has verified the output. No artefact is published without sign-out.
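A sketch of the sign-out record this step describes, with the AI-use tick as an explicit field; a hypothetical structure, not the actual MPI form.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SignOutRecord:
    artefact_id: str
    director_id: str
    ai_tool_used: bool       # the tick: an AI tool contributed to this artefact
    analyst_verified: bool   # the tick: the analyst has verified the tool output
    delegation_ref: str      # the existing BIES delegation the sign-out is made under

def can_publish(record: Optional[SignOutRecord]) -> bool:
    """No artefact goes outside BIES without sign-out; AI-derived artefacts
    additionally require the verification tick."""
    if record is None:
        return False
    return not record.ai_tool_used or record.analyst_verified
```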
7. Audit and retention
Tool logs are retained under the Public Records Act 2005 alongside the IHS workstream record. The annual audit samples 5% of tool runs against the corresponding artefact for compliance with the operating model. Tool runs are discoverable under the Official Information Act as part of any IHS workstream OIA request.
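The 5% audit sample could be drawn reproducibly along these lines; the fixed seed is an assumption made so the draw itself is auditable.

```python
import random

def audit_sample(run_ids: list[str], rate: float = 0.05, seed: int = 2025) -> list[str]:
    """Draw the annual audit sample of tool runs (default 5%).

    Sorting plus a fixed seed makes the draw repeatable, so the audit
    record can show exactly which runs were selected and why.
    At least one run is sampled whenever any runs exist.
    """
    if not run_ids:
        return []
    k = max(1, round(len(run_ids) * rate))
    return random.Random(seed).sample(sorted(run_ids), k)
```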
Four tiers. Every tool use sits in exactly one.
The tier dictates the controls that apply, the decision rights, and the audit obligations. The same tool can be used at different tiers depending on what is being analysed.
| Tier | What it covers | Examples | Controls | Decision right |
|---|---|---|---|---|
| T1 | Internal drafting support, low-stakes | Drafting consistency check on a routine IHS refresh; plain-language draft of an existing published clause. | Standard logging, citation discipline, analyst review. | Analyst |
| T2 | Consultation analysis with public-facing output | Submission synthesis, plain-language guidance for publication, change explainer for industry notification. | T1 plus Senior Analyst sign-off, full SME routing where flagged. | Senior Analyst + SME (where flagged) |
| T3 | Contentious or high-impact IHS workstreams | Synthesis of a contentious IHS consultation (60+ submissions); change explainer for an amendment with material commercial impact. | T2 plus Standards Manager sign-off and second-analyst verification of cited quotes. | Standards Manager |
| T4 | Out of scope for tool use | Scientific risk assessment; legal interpretation of the Biosecurity Act; novel pathway analysis; Te Tiriti partnership decisions; stakeholder authority decisions. | Tool not used. Manual analyst and SME work only. | Director (under existing delegation) |
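The tier rules in the table above, made executable so a tool use resolves to exactly one tier before it proceeds. The predicates are paraphrases of the tier descriptions; the function is a sketch, not deployed logic.

```python
def classify_tier(
    out_of_scope: bool,                # risk assessment, legal, Te Tiriti, authority decisions
    contentious_or_high_impact: bool,  # e.g. 60+ submissions, material commercial impact
    public_facing: bool,               # output leaves BIES
) -> str:
    """Resolve a tool use to exactly one tier; T4 means the tool is not used at all."""
    if out_of_scope:
        return "T4"
    if contentious_or_high_impact:
        return "T3"
    if public_facing:
        return "T2"
    return "T1"
```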
Who can decide what when a tool is in the loop.
A short list. Anything not on it requires escalation.
| Decision | Decision-maker | Notes |
|---|---|---|
| Whether to use a tool on a given workstream | Analyst | Within the scope and tier rules above. |
| Accepting an individual model output theme or finding | Analyst | After verification of cited quote. |
| Modifying or rejecting a model output | Analyst | Modification logged for prompt iteration. |
| Routing a flagged theme to SME | Analyst | SME response is operative; model draft is background only. |
| Submitting an artefact for Director sign-out | Senior Analyst (T2/T3) | Senior Analyst confirms the operating model has been followed. |
| Publishing an IHS, RoS, or notification | Director | Existing BIES delegation. Unchanged. |
| Pausing the tool for a workstream | Standards Manager | If quality issues observed; documented and reported to the AI for Standards lead. |
| Withdrawing the tool from BIES use | Director | On recommendation from the AI for Standards lead, after the kill conditions trigger. |
| Changing a system prompt or output schema | AI for Standards lead | Versioned. Old version retained. Change recorded in the workstream record. |
| Approving a new tool for BIES use | Director | On recommendation from the AI for Standards lead, after pilot and evaluation. |
Roles across the tool lifecycle.
| Activity | Analyst | Senior Analyst | Standards Manager | SME | AI lead | Director |
|---|---|---|---|---|---|---|
| Run tool on workstream | A/R | I | I | · | I | · |
| Verify cited quotes | R | A (T2/T3) | · | · | · | · |
| Resolve SME-flagged items | R | A | · | R | · | · |
| Approve artefact for Director | R | A | C (T3) | C | · | I |
| Sign out artefact for publication | R | R | C | · | · | A |
| Iterate prompt or schema | C | C | · | · | A/R | I |
| Pause or withdraw tool | I | I | R | C | R | A |
| Quarterly evaluation report | C | C | R | · | A/R | I |
R = Responsible, A = Accountable, C = Consulted, I = Informed.
Twelve months in, the rollout has failed. Why?
A structured pre-mortem of the seven scenarios most likely to derail a BIES genAI deployment, with the controls in this operating model that address each.
| # | Failure mode | Likelihood | Impact | Mitigation in this operating model |
|---|---|---|---|---|
| F1 | Model fabricates a verbatim quote attributed to a real submitter, surfaces in a published Review of Submissions, complaint reaches the Ombudsman. | Low | High | Citation discipline; analyst verification of every quote against source; Senior Analyst and Director sign-out before publication. |
| F2 | Plain-language guidance omits a material requirement from the source IHS, importer relies on it, consignment fails compliance. | Medium | High | Tool 3 outputs the must-do checklist with source clause references; analyst reviews against the source; the published IHS remains the operative document. |
| F3 | Submitter requests their submission text under OIA; the model logs are not retained or are not searchable; OIA response is incomplete. | Medium | Medium | Tool logs retained under Public Records Act 2005 alongside the workstream record; logs indexed by submission ID for OIA retrieval. |
| F4 | Analyst over-relies on tool output, signs out a Review of Submissions without verifying the cited quotes; quality issue surfaces in a subsequent submission. | Medium | Medium | 5% audit sample; modification rate tracked; over-reliance pattern triggers AI lead review and additional verification step. |
| F5 | System prompt is changed without versioning; outputs from a different prompt version are mixed in the workstream record; auditability fails. | Low | Medium | System prompts versioned and immutable per release; AI lead controls change; old prompt versions retained. |
| F6 | The four tools become a workstream the directorate cannot afford to staff or maintain; tools fall out of date; usage drops. | Medium | Medium | One full-time AI for Standards lead; quarterly evaluation; explicit kill conditions trigger withdrawal rather than slow decline. |
| F7 | Te Tiriti partner is not engaged on a tool that affects how iwi/hapū submissions are analysed; engagement gap criticised at consultation review. | Medium | High | Tier 4 explicitly excludes partnership decisions; engagement plan with Te Uepū and other Te Tiriti partners is part of the AI for Standards lead's first-90-days workplan. |
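F5's mitigation (versioned, immutable system prompts with old versions retained) could be enforced with an append-only registry along these lines; class and method names are illustrative.

```python
class PromptRegistry:
    """Append-only prompt store: versions are released, never edited or deleted."""

    def __init__(self) -> None:
        self._versions: dict[str, str] = {}

    def release(self, version: str, prompt_text: str) -> None:
        if version in self._versions:
            raise ValueError(f"prompt version {version} is already released and immutable")
        self._versions[version] = prompt_text

    def get(self, version: str) -> str:
        # Old versions stay retrievable, so any logged run can be read against
        # the exact prompt text it was produced with.
        return self._versions[version]
```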
How we know it is working. And when we stop.
Six measures across quality, speed, and trust, reported quarterly against published baselines.
Quality
Modification rate: % of model output items the analyst materially changes. Target band: 10% to 60%. Below 10% signals under-correction risk; above 60% signals the tool is not adding value.
Citation accuracy: % of cited quotes verified accurate by analyst. Target: 100%.
Speed
Time saved per workstream: measured against pre-tool baseline by activity sampling. Target: 40% or more time saving on the supported activity.
Cycle time: total elapsed time from consultation close to RoS publication. Target: 25% reduction within 12 months.
Trust
Analyst confidence (anonymous quarterly survey). Target: ≥ 70% net positive.
External quality issues raised (in submissions, by media, by Ombudsman). Target: zero attributable to tool output that survived sign-out.
Kill conditions
The tool is withdrawn from BIES use if any of the following occurs:

- a single fabricated quote survives sign-out and reaches the public domain;
- the modification rate exceeds 70% across two consecutive quarters;
- cycle time worsens against the pre-tool baseline;
- a single OIA response is materially compromised by tool log gaps;
- the analyst confidence survey returns net negative for two consecutive quarters;
- the Director, on advice from the AI for Standards lead, determines the tool is no longer fit for purpose.

Withdrawal is not failure. It is the operating model working.
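The measurable kill conditions above can be checked mechanically each quarter. Thresholds come from this section; field names are assumptions, and the final condition, the Director's judgement call, is deliberately left out of code.

```python
from dataclasses import dataclass

@dataclass
class QuarterMetrics:
    modification_rate: float         # fraction of output items materially changed
    cycle_time_days: float
    baseline_cycle_time_days: float  # pre-tool baseline for the same workstream
    confidence_net: float            # net-positive score from the analyst survey
    fabricated_quote_published: bool
    oia_materially_compromised: bool

def kill_triggered(current: QuarterMetrics, previous: QuarterMetrics) -> bool:
    """Check the mechanical kill conditions; the Director's judgement is not encoded."""
    return (
        current.fabricated_quote_published
        or current.oia_materially_compromised
        or (current.modification_rate > 0.70 and previous.modification_rate > 0.70)
        or current.cycle_time_days > current.baseline_cycle_time_days
        or (current.confidence_net < 0 and previous.confidence_net < 0)
    )
```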
First 90 days. Then 6 months. Then 12.
First 90 days: single tool, single workstream, full evaluation
Tool 1 (Submission Synthesis), one IHS workstream, paired with one analyst. Time-and-motion baseline before tool use. Operating model in force. First quarterly evaluation at day 90.
To six months: two tools, two workstreams, formalised governance
Add Tool 2 (Consistency Check) on a separate workstream. Operating model formalised in BIES procedures. AI for Standards lead role established as the steady-state owner.
To 12 months: all four tools, BIES-wide
Tools 3 and 4 added. Use across the directorate as a standard part of the IHS workstream. Annual review against the evaluation framework. Review whether to extend to OMARs and Export Standards work.
Where this operating model draws from.
This operating model is consistent with the Public Service Commission's published guidance on the use of generative AI in government, the Government Chief Digital Officer's principles for safe AI use, the Privacy Act 2020 obligations for managing personal information at scale, and the Official Information Act 1982 obligations for record retention and discoverability.
It is also consistent with the existing BIES delegation structure under section 24A of the Biosecurity Act 1993 and with the published MPI consultation policy. Nothing in this operating model changes the Director's authority to issue an Import Health Standard or the Chief Technical Officer's authority under section 27.
Where this would land in a real BIES engagement: the operating model would be reviewed against the current MPI Risk Management Framework and the most recent Public Service Commission AI guidance in the first 30 days of the role, and amended where required. The aim of this artefact is to demonstrate the kind of thinking the role calls for, not to publish final policy.