Five sentences.
The Standards Companion is a suite of advisory, draft-only tools deployed inside the BIES standards lifecycle to accelerate the pattern-recognition and structured-drafting work that currently consumes 1,500 to 2,800 analyst hours per year. The architecture is human-in-the-loop by design: the model never publishes, never decides, never acts, and every output is a draft that an analyst reviews and signs out. Risk is managed by classifying every tool use into one of four tiers, each carrying defined controls, decision rights, and audit obligations. The directorate retains full authority over IHS development, consultation, revision, and publication; the tool produces drafts faster, and the analyst remains accountable. Implementation is staged: a single tool on a single workstream, behind a 90-day evaluation gate, before any further deployment.
How the Standards Companion fits inside the BIES day.
The workflow below describes how an analyst, drafter, or notification author would use a Standards Companion tool in a normal working week, and how the resulting artefact moves through the directorate's existing approval chain.
1. Trigger
A defined event triggers tool use: consultation closes (Tool 1), drafting begins (Tool 2), guidance refresh required (Tool 3), amendment issued (Tool 4). Triggers are documented in the IHS workstream procedure. No tool is invoked outside a defined trigger.
2. Scope check
The analyst confirms the work is in scope for tool use: it falls within the four supported workflows and requires no scientific risk assessment, no legal interpretation, and no Te Tiriti or stakeholder authority decision. Out-of-scope work bypasses the tool entirely.
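A minimal sketch of how this scope gate could be encoded as a pre-flight check. The workflow names match the triggers in step 1; the function and field names are illustrative, not an existing BIES system.

```python
from dataclasses import dataclass

# The four supported workflows, matching the triggers in step 1.
SUPPORTED_WORKFLOWS = {
    "submission_synthesis",  # Tool 1
    "consistency_check",     # Tool 2
    "guidance_refresh",      # Tool 3
    "change_explainer",      # Tool 4
}

@dataclass
class WorkItem:
    workflow: str
    needs_risk_assessment: bool
    needs_legal_interpretation: bool
    needs_partnership_or_authority_decision: bool  # Te Tiriti / stakeholder authority

def in_scope(item: WorkItem) -> bool:
    """All scope rules must pass; out-of-scope work bypasses the tool entirely."""
    return (
        item.workflow in SUPPORTED_WORKFLOWS
        and not item.needs_risk_assessment
        and not item.needs_legal_interpretation
        and not item.needs_partnership_or_authority_decision
    )
```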
3. Tool run
The analyst uses the tool, supplies inputs, reviews the output. Every model call is logged automatically: prompt version, input, output, timestamp, user identifier. Logging is non-negotiable and not user-controlled.
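The log record implied by this step, sketched as a frozen (edit-proof) structure. Field names are assumptions; the substance is that every field is captured automatically and none is user-suppressible.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen: a run record cannot be edited after the call
class ToolRunLog:
    prompt_version: str    # versioned system prompt, e.g. "tool1-v0.3"
    input_text: str
    output_text: str
    user_id: str
    workstream_id: str     # ties the run to the IHS workstream record
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
```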
4. Analyst review
The analyst reads the output line by line. Every cited quote is verified against source. Every disposition is reviewed. SME-flagged items are routed to the named SME. Confidence-flagged items are addressed before any draft progresses. The analyst marks the output as accepted, modified, or rejected. Rejection generates a structured note for the next prompt iteration.
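One way the accepted/modified/rejected marking and the structured rejection note could be captured; names are illustrative. The constraint that a rejection must carry a note is the point of the sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class ReviewOutcome(Enum):
    ACCEPTED = "accepted"
    MODIFIED = "modified"
    REJECTED = "rejected"

@dataclass
class ReviewRecord:
    run_id: str                            # links back to the logged tool run
    outcome: ReviewOutcome
    quotes_verified: bool                  # every cited quote checked against source
    sme_items_routed: bool                 # flagged items sent to the named SME
    rejection_note: Optional[str] = None   # structured note for the next prompt iteration

    def __post_init__(self):
        # A rejection without a structured note cannot drive prompt iteration,
        # so the record refuses to exist without one.
        if self.outcome is ReviewOutcome.REJECTED and not self.rejection_note:
            raise ValueError("rejected output requires a structured rejection note")
```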
5. SME review (where flagged)
Items the model flags as needing scientific, legal, technical, or policy SME review go to the named SME with the model's draft as background only. The SME's response is the operative input, not the model's draft. The model does not see the SME response.
6. Director sign-out
Any artefact derived from a tool that goes outside BIES (Review of Submissions, IHS amendment, importer notification) carries a Director sign-out under the existing delegation. The sign-out form includes a tick confirming that an AI tool was used and that the analyst has verified the output. No artefact is published without sign-out.
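A sketch of the sign-out record this step describes, with the AI-use tick as an explicit field; a hypothetical structure, not the actual MPI form.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SignOutRecord:
    artefact_id: str
    director_id: str
    ai_tool_used: bool       # the tick: an AI tool contributed to this artefact
    analyst_verified: bool   # the tick: the analyst has verified the tool output
    delegation_ref: str      # the existing BIES delegation the sign-out is made under

def can_publish(record: Optional[SignOutRecord]) -> bool:
    """No artefact goes outside BIES without sign-out; AI-derived artefacts
    additionally require the verification tick."""
    if record is None:
        return False
    return not record.ai_tool_used or record.analyst_verified
```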
7. Audit and retention
Tool logs are retained under the Public Records Act 2005 alongside the IHS workstream record. The annual audit samples 5% of tool runs against the corresponding artefact for compliance with the operating model. Tool runs are discoverable under the Official Information Act as part of any IHS workstream OIA request.
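The 5% audit sample could be drawn reproducibly along these lines; the fixed seed is an assumption made so the draw itself is auditable.

```python
import random

def audit_sample(run_ids: list[str], rate: float = 0.05, seed: int = 2025) -> list[str]:
    """Draw the annual audit sample of tool runs (default 5%).

    Sorting plus a fixed seed makes the draw repeatable, so the audit
    record can show exactly which runs were selected and why.
    At least one run is sampled whenever any runs exist.
    """
    if not run_ids:
        return []
    k = max(1, round(len(run_ids) * rate))
    return random.Random(seed).sample(sorted(run_ids), k)
```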
Four tiers. Every tool use sits in exactly one.
The tier dictates the controls that apply, the decision rights, and the audit obligations. The same tool can be used at different tiers depending on what is being analysed.
| Tier | What it covers | Examples | Controls | Decision right |
|---|---|---|---|---|
| T1 | Internal drafting support, low-stakes | Drafting consistency check on a routine IHS refresh; plain-language draft of an existing published clause. | Standard logging, citation discipline, analyst review. | Analyst |
| T2 | Consultation analysis with public-facing output | Submission synthesis, plain-language guidance for publication, change explainer for industry notification. | T1 plus Senior Analyst sign-off, full SME routing where flagged. | Senior Analyst + SME (where flagged) |
| T3 | Contentious or high-impact IHS workstreams | Synthesis of a contentious IHS consultation (60+ submissions); change explainer for an amendment with material commercial impact. | T2 plus Standards Manager sign-off and second-analyst verification of cited quotes. | Standards Manager |
| T4 | Out of scope for tool use | Scientific risk assessment; legal interpretation of the Biosecurity Act; novel pathway analysis; Te Tiriti partnership decisions; stakeholder authority decisions. | Tool not used. Manual analyst and SME work only. | Director (under existing delegation) |
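The tier rules in the table above, made executable so a tool use resolves to exactly one tier before it proceeds. The predicates are paraphrases of the tier descriptions; the function is a sketch, not deployed logic.

```python
def classify_tier(
    out_of_scope: bool,                # risk assessment, legal, Te Tiriti, authority decisions
    contentious_or_high_impact: bool,  # e.g. 60+ submissions, material commercial impact
    public_facing: bool,               # output leaves BIES
) -> str:
    """Resolve a tool use to exactly one tier; T4 means the tool is not used at all."""
    if out_of_scope:
        return "T4"
    if contentious_or_high_impact:
        return "T3"
    if public_facing:
        return "T2"
    return "T1"
```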
Who can decide what when a tool is in the loop.
A short list. Anything not on it requires escalation.
| Decision | Decision-maker | Notes |
|---|---|---|
| Whether to use a tool on a given workstream | Analyst | Within the scope and tier rules above. |
| Accepting an individual model output theme or finding | Analyst | After verification of cited quote. |
| Modifying or rejecting a model output | Analyst | Modification logged for prompt iteration. |
| Routing a flagged theme to SME | Analyst | SME response is operative; model draft is background only. |
| Submitting an artefact for Director sign-out | Senior Analyst (T2/T3) | Senior Analyst confirms the operating model has been followed. |
| Publishing an IHS, RoS, or notification | Director | Existing BIES delegation. Unchanged. |
| Pausing the tool for a workstream | Standards Manager | If quality issues observed; documented and reported to the AI for Standards lead. |
| Withdrawing the tool from BIES use | Director | On recommendation from the AI for Standards lead, after the kill conditions trigger. |
| Changing a system prompt or output schema | AI for Standards lead | Versioned. Old version retained. Change recorded in the workstream record. |
| Approving a new tool for BIES use | Director | On recommendation from the AI for Standards lead, after pilot and evaluation. |
Roles across the tool lifecycle.
| Activity | Analyst | Senior Analyst | Standards Manager | SME | AI lead | Director |
|---|---|---|---|---|---|---|
| Run tool on workstream | A/R | I | I | · | I | · |
| Verify cited quotes | R | A (T2/T3) | · | · | · | · |
| Resolve SME-flagged items | R | A | · | R | · | · |
| Approve artefact for Director | R | A | C (T3) | C | · | I |
| Sign out artefact for publication | R | R | C | · | · | A |
| Iterate prompt or schema | C | C | · | · | A/R | I |
| Pause or withdraw tool | I | I | R | C | R | A |
| Quarterly evaluation report | C | C | R | · | A/R | I |
R = Responsible, A = Accountable, C = Consulted, I = Informed.
Twelve months in, the rollout has failed. Why?
A structured pre-mortem of the seven scenarios most likely to derail a BIES genAI deployment, with the controls in this operating model that address each.
| # | Failure mode | Likelihood | Impact | Mitigation in this operating model |
|---|---|---|---|---|
| F1 | Model fabricates a verbatim quote attributed to a real submitter, surfaces in a published Review of Submissions, complaint reaches the Ombudsman. | Low | High | Citation discipline; analyst verification of every quote against source; Senior Analyst and Director sign-out before publication. |
| F2 | Plain-language guidance omits a material requirement from the source IHS, importer relies on it, consignment fails compliance. | Medium | High | Tool 3 outputs the must-do checklist with source clause references; analyst reviews against the source; the published IHS remains the operative document. |
| F3 | Submitter requests their submission text under OIA; the model logs are not retained or are not searchable; OIA response is incomplete. | Medium | Medium | Tool logs retained under Public Records Act 2005 alongside the workstream record; logs indexed by submission ID for OIA retrieval. |
| F4 | Analyst over-relies on tool output, signs out a Review of Submissions without verifying the cited quotes; quality issue surfaces in a subsequent submission. | Medium | Medium | 5% audit sample; modification rate tracked; over-reliance pattern triggers AI lead review and additional verification step. |
| F5 | System prompt is changed without versioning; outputs from a different prompt version are mixed in the workstream record; auditability fails. | Low | Medium | System prompts versioned and immutable per release; AI lead controls change; old prompt versions retained. |
| F6 | The four tools become a workstream the directorate cannot afford to staff or maintain; tools fall out of date; usage drops. | Medium | Medium | One full-time AI for Standards lead; quarterly evaluation; explicit kill conditions trigger withdrawal rather than slow decline. |
| F7 | Te Tiriti partner is not engaged on a tool that affects how iwi/hapū submissions are analysed; engagement gap criticised at consultation review. | Medium | High | Tier 4 explicitly excludes partnership decisions; engagement plan with Te Uepū and other Te Tiriti partners is part of the AI for Standards lead's first-90-days workplan. |
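F5's mitigation (versioned, immutable system prompts with old versions retained) could be enforced with an append-only registry along these lines; class and method names are illustrative.

```python
class PromptRegistry:
    """Append-only prompt store: versions are released, never edited or deleted."""

    def __init__(self) -> None:
        self._versions: dict[str, str] = {}

    def release(self, version: str, prompt_text: str) -> None:
        if version in self._versions:
            raise ValueError(f"prompt version {version} is already released and immutable")
        self._versions[version] = prompt_text

    def get(self, version: str) -> str:
        # Old versions stay retrievable, so any logged run can be read against
        # the exact prompt text it was produced with.
        return self._versions[version]
```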
How we know it is working. And when we stop.
Six measures across quality, speed, and trust, reported quarterly against published baselines.
Quality
Modification rate: % of model output items the analyst materially changes. Target band: 10% to 60%. Below 10% signals under-correction risk; above 60% signals the tool is not adding value.
Citation accuracy: % of cited quotes verified accurate by analyst. Target: 100%.
Speed
Time saved per workstream: measured against pre-tool baseline by activity sampling. Target: 40% or more time saving on the supported activity.
Cycle time: total elapsed time from consultation close to RoS publication. Target: 25% reduction within 12 months.
Trust
Analyst confidence (anonymous quarterly survey). Target: ≥ 70% net positive.
External quality issues raised (in submissions, by media, by Ombudsman). Target: zero attributable to tool output that survived sign-out.
Kill conditions
The tool is withdrawn from BIES use if any of the following occurs:

- a single fabricated quote survives sign-out and reaches the public domain;
- the modification rate exceeds 70% across two consecutive quarters;
- cycle time worsens against the pre-tool baseline;
- a single OIA response is materially compromised by tool log gaps;
- the analyst confidence survey returns net negative for two consecutive quarters;
- the Director, on advice from the AI for Standards lead, determines the tool is no longer fit for purpose.

Withdrawal is not failure. It is the operating model working.
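The measurable kill conditions above can be checked mechanically each quarter. Thresholds come from this section; field names are assumptions, and the final condition, the Director's judgement call, is deliberately left out of code.

```python
from dataclasses import dataclass

@dataclass
class QuarterMetrics:
    modification_rate: float         # fraction of output items materially changed
    cycle_time_days: float
    baseline_cycle_time_days: float  # pre-tool baseline for the same workstream
    confidence_net: float            # net-positive score from the analyst survey
    fabricated_quote_published: bool
    oia_materially_compromised: bool

def kill_triggered(current: QuarterMetrics, previous: QuarterMetrics) -> bool:
    """Check the mechanical kill conditions; the Director's judgement is not encoded."""
    return (
        current.fabricated_quote_published
        or current.oia_materially_compromised
        or (current.modification_rate > 0.70 and previous.modification_rate > 0.70)
        or current.cycle_time_days > current.baseline_cycle_time_days
        or (current.confidence_net < 0 and previous.confidence_net < 0)
    )
```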
First 90 days. Then 6 months. Then 12.
First 90 days: single tool, single workstream, full evaluation
Tool 1 (Submission Synthesis), one IHS workstream, paired with one analyst. Time-and-motion baseline before tool use. Operating model in force. First quarterly evaluation at day 90.
To six months: two tools, two workstreams, formalised governance
Add Tool 2 (Consistency Check) on a separate workstream. Operating model formalised in BIES procedures. AI for Standards lead role established as the steady-state owner.
To 12 months: all four tools, BIES-wide
Tools 3 and 4 added. Use across the directorate as a standard part of the IHS workstream. Annual review against the evaluation framework. Review whether to extend to OMARs and Export Standards work.
Where this operating model draws from.
This operating model is consistent with the Public Service Commission's published guidance on the use of generative AI in government, the Government Chief Digital Officer's principles for safe AI use, the Privacy Act 2020 obligations for managing personal information at scale, and the Official Information Act 1982 obligations for record retention and discoverability.
It is also consistent with the existing BIES delegation structure under section 24A of the Biosecurity Act 1993 and with the published MPI consultation policy. Nothing in this operating model changes the Director's authority to issue an Import Health Standard or the Chief Technical Officer's authority under section 27.
Where this would land in a real BIES engagement: the operating model would be reviewed against the current MPI Risk Management Framework and the most recent Public Service Commission AI guidance in the first 30 days of the role, and amended where required. The aim of this artefact is to demonstrate the kind of thinking the role calls for, not to publish final policy.