What MPI is using right now, and what the alternatives look like.
As of early 2026, around 32 New Zealand Government agencies have access to Microsoft 365 Copilot. The Government Chief Digital Officer's Responsible AI Guidance (February 2025) does not require any agency to lock in to a single vendor. It asks agencies to assess risks, apply procurement and security standards, and make informed choices, weighing cost, functionality, privacy, and security across providers. The "Copilot default" is a real risk to good practice, not a requirement.
Platform comparison for the BIES use case
A short read on the platforms in the JD plus the obvious adjacent options. The "verdict" column reflects the suitability for the four BIES tools described elsewhere on this site, not a generic rating.
M365 Copilot
Native to the M365 environment most BIES analysts already use. Strong for everyday document and email assistance, weaker for the structured-output workflows the four BIES tools require. Privacy and tenant residency are improving (local data processing rolled out in 2025), but agencies retain responsibility for IPP compliance.
For BIES tools: useful for the analyst-facing review and editing layer (analyst opens the model output in Word and refines), less useful for the structured JSON output and citation discipline the tools require.
AWS Bedrock with Agentic AI
Named in the JD. Gives access to multiple foundation models behind a single AWS-tenanted boundary, including Anthropic Claude, Meta Llama, and Amazon Titan models. Strong fit for the structured, prompt-versioned, audit-logged pattern in the Operating Model. A sovereignty story is available via the Sydney region with explicit data-residency configuration.
For BIES tools: the natural production landing place for the four tools. Holds the API key server-side, gives the audit trail the Operating Model requires, supports the tier-based access control the Governance page sets out.
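The server-side pattern this implies can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes a Python service holding the credentials (in production, a boto3 `bedrock-runtime` client), and the audit fields shown are our own illustrative choices, not a published schema.

```python
import hashlib
import json
from datetime import datetime, timezone

AUDIT_LOG = []  # illustrative; production would use an append-only store

def audit_record(model_id, prompt_version, prompt, tier, user):
    """Build the audit entry logged for every model call (fields are illustrative)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,
        "prompt_version": prompt_version,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "tier": tier,
        "user": user,
    }

def run_model(client, model_id, prompt, *, prompt_version, tier, user):
    """Server-side model call: credentials live with `client`, never in the browser.
    Every call is logged before it runs, so the audit trail cannot be skipped."""
    AUDIT_LOG.append(audit_record(model_id, prompt_version, prompt, tier, user))
    response = client.invoke_model(  # boto3 "bedrock-runtime" client in production
        modelId=model_id,
        contentType="application/json",
        body=json.dumps({"messages": [{"role": "user", "content": prompt}]}),
    )
    return response["body"].read().decode()
```

The point of the shape, rather than the specifics: the key never leaves the server, and the log entry is written unconditionally, which is what makes tier-based access control and OIA discoverability enforceable rather than aspirational.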
Anthropic Claude API
The model behind every working tool on this site. Strong on structured-output discipline, refusing to invent content, and following long, schema-strict instructions reliably. Available direct from Anthropic or through AWS Bedrock. Direct browser access (used in the prototypes here) is a development-only pattern; production usage runs server-side.
For BIES tools: the model layer. Whether procured directly or through Bedrock is a procurement and sovereignty choice, not a capability one.
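What "structured-output discipline" means in practice is that the tool validates every response against a schema and rejects anything that arrives without citations, rather than silently accepting it. A minimal sketch, with an illustrative schema (the real BIES tool schemas are not reproduced here):

```python
import json

# Illustrative output contract for a synthesis run: every record must carry citations.
REQUIRED_KEYS = {"theme", "summary", "citations"}

def parse_synthesis_output(raw):
    """Parse and validate model output. Fail loudly on any schema break or
    uncited content, so bad output never reaches an analyst unflagged."""
    record = json.loads(raw)  # raises on non-JSON output
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"schema violation, missing keys: {sorted(missing)}")
    if not record["citations"]:
        raise ValueError("no citations: output fails the citation discipline check")
    return record
```

The design choice is that the citation check lives in the tool, not the prompt: a model that follows schema-strict instructions well makes failures rare, but the gate is what makes them impossible to miss.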
ChatGPT Enterprise (and Azure OpenAI)
Used by some NZ Government agencies. Equivalent capability to Claude on most BIES-style synthesis and drafting tasks. Available in the Azure tenancy via the Azure OpenAI Service, which is the procurement path most agencies on M365 already understand.
For BIES tools: a credible alternative model layer. Decision is procurement, sovereignty, and cost, not capability.
Personal ChatGPT, Gemini, Copilot Chat (consumer)
Free or low-cost public tools. Analysts may already be using these on personal accounts to draft work content. This is the channel most likely to leak personal information into a third-party model that has not signed an enterprise data processing addendum.
For BIES tools: avoid for any work containing submission text, personal information, or pre-publication standards content. The Operating Model's tier system explicitly excludes consumer-channel use of identifiable submission data.
Verdicts assume the Operating Model on the Governance page is in force. Without those controls, even the recommended platforms carry meaningful residual risk.
The Privacy Act 2020, OPC AI guidance, and Te ao Māori considerations are not an afterthought.
The 13 Information Privacy Principles in the Privacy Act 2020 apply directly to the use of AI tools by New Zealand agencies. The Office of the Privacy Commissioner published targeted guidance on AI and the IPPs in September 2023. The Operating Model on the Governance page is built to meet these obligations from day one.
The OPC's six expectations for organisations using generative AI
- Senior leadership approval before use. AI use approved only after risks and mitigations are fully considered. Mapped to Director sign-out in the Operating Model.
- Necessity and proportionality test. Justify why a generative AI tool is the right tool given the privacy impact. Mapped to Step 2 (Scope check) in the Operating Model.
- Privacy impact assessment before use. Standard PIA process applied. Mapped to a first-90-days deliverable in the AI for Standards lead role.
- Communicate to affected people. How, when, and why the tool is being used. Mapped to published Operating Model and OIA discoverability.
- Consult Māori. Te ao Māori considerations on data, bias, and decisions affecting Māori. Mapped to Tier 4 carve-out and first-90-days engagement plan.
- Human review before action. Reduce inaccuracy and bias by requiring human review. Mapped to analyst review and Director sign-out for every artefact.
The 13 Information Privacy Principles, applied to the BIES tools
Every IPP is in force. The principles most active for the BIES tools are summarised below.
Purpose of collection
Personal information collected only for a lawful purpose connected to the agency's function.
Collection from individual
Tell people why their information is being collected, who will receive it, and how to access it.
Manner of collection
Personal information must not be collected by unlawful, unfair, or unreasonably intrusive means.
Storage and security
Reasonable safeguards against loss, unauthorised access, use, modification, or disclosure.
Accuracy before use
Take reasonable steps to ensure accuracy before using personal information.
Limits on use and disclosure
Personal information used or disclosed only for the purposes for which it was obtained, with narrow exceptions.
Disclosure outside NZ
Cross-border disclosures only where comparable safeguards exist.
Unique identifiers
Limits on assigning, using and disclosing unique identifiers.
Documented PIA
The full IPP set will be exercised through a privacy impact assessment in the first 30 days of any pilot.
The OPC guidance is explicit about consultation with Māori. So is the Operating Model.
The OPC AI guidance recognises te ao Māori perspectives on privacy and identifies three specific concerns. Each requires a deliberate response in the Operating Model.
Bias from systems that do not work accurately for Māori
Foundation models trained on global data may perform poorly on te reo Māori, on submissions raising tikanga concerns, or on Māori organisations.
Response: the synthesis tool flags submissions referencing te reo, tikanga, or Te Tiriti for explicit human review before disposition. Confidence is rated low by default for these submissions. SME review type "policy" includes te ao Māori expertise.
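The flagging rule is simple enough to sketch. This is an illustration only: the term list below is ours, not the tool's actual lexicon, and a production rule would handle macrons, word boundaries, and context far more carefully.

```python
# Illustrative review-flag rule: any match forces human review and low confidence.
REVIEW_TERMS = ("te reo", "tikanga", "te tiriti", "iwi", "hapū")

def triage(submission_text, model_confidence):
    """Flag submissions raising te ao Māori considerations for explicit human
    review, and override model confidence to 'low' for anything flagged."""
    text = submission_text.lower()
    flagged = any(term in text for term in REVIEW_TERMS)
    return {
        "human_review_required": flagged,
        "confidence": "low" if flagged else model_confidence,
    }
```

The point is the override direction: a flag never raises confidence, it only lowers it, so the tool cannot talk an analyst out of a review the guidance requires.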
Collection of Māori information without trust relationships
Submissions from iwi, hapū, hapori, and Māori organisations carry expectations of mana enhancement and reciprocity that a generic AI summarisation does not satisfy.
Response: Tier 4 in the Governance model excludes any tool use where the partnership relationship is the operative concern. Engagement with Te Uepū and other partners is a first-90-days workplan item.
Exclusion from processes affecting Māori
A tool that quietly summarises Māori submissions without engagement is the kind of process the guidance is built to prevent.
Response: the published Operating Model and the OIA discoverability of every model run keep the use of AI visible to Māori submitters. The Operating Model is itself a public artefact.
The Department of Corrections Copilot Chat incident, late 2025.
A short worked example of why the controls in the Operating Model are not theoretical. In late 2025, the Department of Corrections tightened AI use after a Copilot Chat misuse incident triggered a privacy review. The detail matters less than the pattern.
The pattern
- Default access without controls. A capable tool was made available to staff without a tier-based use policy in place.
- Use case drift. Staff used it for content the original procurement decision had not anticipated.
- Privacy issue surfaces externally, not internally. Once it surfaces, the agency tightens after the fact, at reputational cost.
What the BIES Operating Model does differently: trigger-based use only (Step 1), explicit scope check before any tool run (Step 2), tier-based decision rights (T1 to T4), automatic logging of every model call (Step 3), and explicit kill conditions on the Governance page. The pattern above does not survive that operating model.
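Steps 1 and 2 of that chain are mechanical enough to enforce in code. A sketch under stated assumptions: the tier-to-data-classification mapping below is our illustration of the pattern, not the published Governance tier definitions.

```python
# Illustrative tier mapping: which data classifications each tier may touch.
# T4 work (partnership-sensitive) is excluded from tool use entirely, so it
# deliberately has no entry.
TIER_ALLOWS = {
    "T1": {"public"},
    "T2": {"public", "internal"},
    "T3": {"public", "internal", "submission"},
}

def scope_check(trigger, tier, data_class):
    """Gate every run: no named trigger means no run (Step 1); the tier must
    permit the data classification (Step 2)."""
    if not trigger:
        return False
    allowed = TIER_ALLOWS.get(tier)
    return allowed is not None and data_class in allowed
```

Because the gate runs before the model call, "use case drift" of the Corrections kind fails closed: an unanticipated use is one with no trigger or no permitted tier, and it simply does not execute.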
An honest read on the pace of change.
The same workstream evaluated against AI capability six months ago, today, and twelve months from now produces three substantively different answers. Building for the wrong horizon locks the directorate into yesterday's tool. Below is a working view of the trajectory, with concrete BIES implications at each stage.
Capability index is illustrative, scored against synthesis quality on a fixed BIES-style task. Cost figures are indicative public list prices for frontier-class inference, projected forward at observed rates of decline (roughly 4× per year on cost, 1.5 to 2× per year on capability since 2023). Forward projections are subject to substantial uncertainty.
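Those rates compound quickly, which is the whole argument. A two-line illustration using the document's indicative figures (a 4× annual cost decline is a 0.25× annual factor); these are projections of observed trends, not forecasts:

```python
def project(value, annual_factor, years):
    """Compound an annual change factor over a horizon (illustrative only)."""
    return value * (annual_factor ** years)

# At a 4x/year decline, a $10.00 per-million-token price is $2.50 after one
# year and $0.625 after two; capability at 1.75x/year roughly triples in two.
one_year_cost = project(10.0, 0.25, 1)   # 2.5
two_year_cost = project(10.0, 0.25, 2)   # 0.625
two_year_capability = project(100, 1.75, 2)  # 306.25
```

Even if the true rates are half these, the economics of a procurement decision made today look materially different inside a single budget cycle.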
Five things BIES needs in place to absorb that pace of change without rework.
If the velocity above is broadly right, the directorate cannot afford to hard-code itself to today's vendor, today's model, or today's capability. The Operating Model and the four tools on this site are designed for that constraint. Five specific moves keep the directorate agile.
| # | Move | Why |
|---|---|---|
| 1 | Model-portable architecture. Every tool calls the model behind a thin wrapper, with the prompt as the operational unit, not the vendor. | Today the wrapper points at Anthropic Claude. In 12 months it might point at the same model on AWS Bedrock with a different residency boundary, or at a smaller, cheaper model fine-tuned for the BIES corpus. None of that should require rebuilding the tool. |
| 2 | Versioned, reviewed prompts. System prompts are first-class artefacts under change control. | The capability frontier moves; the prompt that worked last quarter may be sub-optimal next quarter. Versioning enables A/B testing and roll-back, both of which are decision-rights items in the RACI. |
| 3 | Quarterly evaluation gate. Every tool reassessed against the evaluation framework on a fixed cadence. | Catches both improvement (the new model is better, switch) and regression (the new model is worse on this task, do not switch). Set out in the Governance page. |
| 4 | Sovereignty optionality. The architecture supports the residency boundary changing without changing the user-facing tools. | M365 Copilot data sovereignty rolled out locally in 2025. The same kind of move is likely on AWS Bedrock and on direct API providers. The directorate should be in a position to take advantage on day one rather than after a six-month re-procurement. |
| 5 | Explicit kill conditions. A tool is withdrawn before it becomes a legacy liability. | Most fast-moving capability deployments in government become legacy not because the technology stops working but because no one defined when to stop. The kill conditions on the Governance page are designed to retire a tool the moment a better answer exists. |
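Moves 1 and 2 in the table can be sketched together. A minimal illustration, assuming a Python service; the prompt registry entry and the `ModelAdapter` interface are our hypothetical names, standing in for whatever change-controlled store and vendor adapter the directorate actually adopts:

```python
# Versioned prompts are the operational unit (move 2): the tool asks for an
# (id, version) pair, never a hard-coded string.
PROMPTS = {
    ("synthesis", "1.2.0"): "You are a submissions-synthesis assistant. {body}",
}

class ModelAdapter:
    """Thin wrapper (move 1): swap this class to change vendor, model, or
    residency boundary without touching any user-facing tool."""
    def complete(self, prompt):
        raise NotImplementedError

def run_tool(adapter, prompt_id, version, body):
    template = PROMPTS[(prompt_id, version)]  # versioned prompt under change control
    return adapter.complete(template.format(body=body))
```

Pointing the adapter at Claude direct, Claude on Bedrock, or a fine-tuned smaller model is then a one-class change, and an A/B test or roll-back is a one-key change in the prompt registry.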
The bottom line
The four BIES tools on this site are not the point. The pattern is. A directorate that has run one disciplined genAI workstream end to end (build, evaluate, govern, retire when needed) is in a fundamentally different position to take advantage of the next eighteen months of capability change than a directorate that has not. The Standards Companion is one way to start that learning, on a workstream where the controls genuinely fit the work.