Key takeaways in 3 minutes
Most human-in-the-loop AI systems produce logs, not decision records.
A log can tell you a button was clicked. It cannot prove a decision was made.
A real decision record captures what the human knew, what they decided, why they decided it, what was executed and what happened afterwards.
The key field is the brief as presented: an immutable snapshot of exactly what the approver saw at the time.
For AI accountability, event logs are not enough. Organisations need records that can explain decisions months later.
Read alongside: The Oversight Illusion and The Context Problem. The full product architecture is written up in the AI Agent Accountability Framework case study.
Six months after an AI agent made a consequential decision, someone asks what happened.
Not in vague terms. Properly.
What did the agent propose? What did the human know? What alternatives were considered? What risk factors applied? Who approved it? What authority did they have? Was anything modified? What was executed? Did it work?
You open the log and find:
{ event: "approved", user_id: "u_0042", timestamp: "2026-05-23T14:32:07Z", proposal_id: "p_0187", confidence: 0.83 }
That record is accurate.
It is also useless for proving meaningful oversight.
It tells you a button was clicked. It does not tell you a decision was made.
A Log Is Not A Decision Record
Most systems capture events because events are easy.
Approved. Dismissed. Executed. Failed.
That may be enough for debugging. It is not enough for accountability.
A decision record has to answer three human questions:
- What did the approver know?
- What did they decide, and why?
- What happened as a result?
If the record cannot answer those questions without someone reconstructing the story from five systems and a mild sense of dread, it is not really a decision record.
Layer One: What The Approver Knew
The most important field is the one many systems never capture: the brief as presented.
This is an immutable snapshot of the proposal exactly as it appeared to the human at the time of decision.
Not the current proposal object. Not the cleaned-up version after data was corrected. Not a plausible reconstruction.
Exactly what the approver saw.
That matters because accountability is temporal. The audit question is not "what do we know now?" It is "what did the human know then?"
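One way to make the brief as presented tamper-evident is to store the rendered text alongside a content hash taken at presentation time. The sketch below is illustrative only: `BriefSnapshot`, `snapshot_brief` and `is_untampered` are hypothetical names, and it assumes the brief can be captured as the exact text shown to the approver.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class BriefSnapshot:
    rendered_text: str   # exactly what the approver saw
    sha256: str          # digest of rendered_text, taken at presentation time
    presented_at: str    # ISO 8601 timestamp


def snapshot_brief(rendered_text: str, presented_at: datetime) -> BriefSnapshot:
    """Freeze the brief at the moment it is shown to the approver."""
    digest = hashlib.sha256(rendered_text.encode("utf-8")).hexdigest()
    return BriefSnapshot(rendered_text, digest, presented_at.isoformat())


def is_untampered(snapshot: BriefSnapshot) -> bool:
    """Recompute the digest to confirm the stored text has not changed."""
    current = hashlib.sha256(snapshot.rendered_text.encode("utf-8")).hexdigest()
    return current == snapshot.sha256


# Illustrative brief text, not taken from a real system.
snap = snapshot_brief(
    "Proposal p_0187: switch supplier. Confidence 0.83.",
    datetime(2026, 5, 23, 14, 30, tzinfo=timezone.utc),
)
print(is_untampered(snap))
```

The snapshot is stored by value, not as a reference to the live proposal object, so later corrections to the proposal cannot change what the record says the approver saw.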
This first layer, the reasoning layer, should also capture the agent's reasoning, the alternatives considered, the risk factors triggered, the governance manifest version and the time spent reviewing the brief.
Time on brief should not be used to shame approvers. It is a governance health signal. A consequential decision approved in four seconds is worth noticing.
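A minimal sketch of time on brief as a health signal follows. The 30-second threshold, the `ReviewTiming` type and the `consequential` flag are assumptions made for illustration, not values from any real governance manifest. The function returns proposal IDs, not approver names, in keeping with the point that this is not a tool for shaming individuals.

```python
from dataclasses import dataclass

MIN_REVIEW_SECONDS = 30.0  # hypothetical threshold; tune per decision type


@dataclass(frozen=True)
class ReviewTiming:
    proposal_id: str
    consequential: bool      # assumed to be set by the governance manifest
    seconds_on_brief: float  # time between brief shown and decision made


def needs_attention(timings, min_seconds=MIN_REVIEW_SECONDS):
    """Return IDs of consequential proposals decided faster than the threshold."""
    return [
        t.proposal_id
        for t in timings
        if t.consequential and t.seconds_on_brief < min_seconds
    ]


# Illustrative data: p_0188 and p_0189 are made-up IDs.
timings = [
    ReviewTiming("p_0187", consequential=True, seconds_on_brief=4.0),
    ReviewTiming("p_0188", consequential=True, seconds_on_brief=95.0),
    ReviewTiming("p_0189", consequential=False, seconds_on_brief=2.0),
]
print(needs_attention(timings))  # ['p_0187']
```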
Layer Two: What They Decided
The action layer captures the human decision.
Approved, overridden or dismissed.
It should include the approver's name, role and authority tier, not only a user ID. A compliance lead preparing for an audit should not need to cross-reference a technical account table to explain who was responsible.
Overrides deserve special care.
If the human modified the agent's recommendation, that is not a failure. It is a legitimate governance outcome and often the most valuable signal in the system. The record should capture the original proposal, the modified action and the rationale.
That is how the system learns what the business actually meant.
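The action layer could be sketched as a small validated type. Everything here is an assumption for illustration: the `HumanDecision` name, the field names and the rule that an override without a modified action and a rationale is rejected at write time.

```python
from dataclasses import dataclass
from typing import Optional

VALID_OUTCOMES = {"approved", "overridden", "dismissed"}


@dataclass(frozen=True)
class HumanDecision:
    outcome: str                 # "approved", "overridden" or "dismissed"
    approver_name: str           # a readable name, not only a user ID
    approver_role: str
    authority_tier: str
    original_proposal: str       # what the agent recommended
    modified_action: Optional[str] = None  # required for overrides
    rationale: Optional[str] = None        # required for overrides

    def __post_init__(self):
        if self.outcome not in VALID_OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome!r}")
        if self.outcome == "overridden" and not (self.modified_action and self.rationale):
            # An override is the most valuable signal, so refuse to store it half-recorded.
            raise ValueError("an override needs both a modified action and a rationale")


# Illustrative values only.
decision = HumanDecision(
    outcome="overridden",
    approver_name="A. Example",
    approver_role="Procurement lead",
    authority_tier="tier_2",
    original_proposal="Switch all volume to supplier B",
    modified_action="Split volume between suppliers A and B",
    rationale="Supplier B has not yet passed the quality audit",
)
```

Validating at write time means a missing rationale surfaces while the approver still remembers it, not six months later during an audit.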
Layer Three: What Happened
The outcome layer closes the loop.
Did the action execute? Did it achieve the intended result? Did the supplier switch work? Did the rerouted shipment arrive? Was the cost saving realised?
Not every decision has a clean outcome. Strategic decisions may unfold over months. In those cases, outcome_trackable: false is better than a fake metric.
But where outcomes are trackable, they matter.
The record is never complete at the moment of decision. It completes when the outcome is known.
If similar proposals keep getting approved and similar outcomes keep underperforming, the problem may be the agent's model of success. Or the manifest. Or the approval process. You cannot see that pattern if the record ends at the button click.
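The pattern described above only becomes visible when outcomes are stored next to decisions. A minimal sketch, assuming each closed record carries a hypothetical `proposal_type` grouping key, the `outcome_trackable` flag and a result that stays empty until the outcome is known:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClosedDecision:
    proposal_type: str      # hypothetical grouping key, e.g. "supplier_switch"
    decision: str           # "approved", "overridden" or "dismissed"
    outcome_trackable: bool
    achieved_intended_result: Optional[bool] = None  # None until the outcome is known


def underperformance_by_type(records):
    """Share of approved, trackable, resolved decisions that missed their intended result."""
    totals = defaultdict(int)
    misses = defaultdict(int)
    for r in records:
        if r.decision != "approved" or not r.outcome_trackable:
            continue  # untrackable outcomes are excluded, not scored with a fake metric
        if r.achieved_intended_result is None:
            continue  # outcome not known yet; the record is still open
        totals[r.proposal_type] += 1
        if not r.achieved_intended_result:
            misses[r.proposal_type] += 1
    return {t: misses[t] / totals[t] for t in totals}


# Illustrative records only.
records = [
    ClosedDecision("supplier_switch", "approved", True, True),
    ClosedDecision("supplier_switch", "approved", True, False),
    ClosedDecision("supplier_switch", "approved", True, False),
    ClosedDecision("supplier_switch", "approved", True, None),   # still open
    ClosedDecision("strategy_review", "approved", False, None),  # not trackable
    ClosedDecision("reroute_shipment", "approved", True, True),
]
rates = underperformance_by_type(records)
```

A high rate for one proposal type does not say where the fault lies. It only shows where to look: the agent's model of success, the manifest or the approval process.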
Narrative Mode
The record should exist both as structured data and as a human-readable narrative.
Structured data is for systems.
Narrative mode is for compliance leads, auditors and business stakeholders who need to read what happened without spelunking through JSON.
The narrative should be generated automatically from the record fields. Same structure every time. Same evidence every time. No extra work for the approver.
It does not need to be literary. It needs to be complete, readable and defensible.
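Narrative mode can be as simple as a fixed template filled from the record fields. The sketch below assumes a flat dictionary with hypothetical keys such as `decided_at`, `approver_name` and `brief_as_presented`. The sample values are illustrative.

```python
def render_narrative(record: dict) -> str:
    """Render a fixed-structure, plain-English account from record fields."""
    lines = [
        (
            f"On {record['decided_at']}, {record['approver_name']} "
            f"({record['approver_role']}, authority tier {record['authority_tier']}) "
            f"recorded the decision '{record['decision']}' on proposal {record['proposal_id']}."
        ),
        f"Brief as presented: {record['brief_as_presented']}",
        f"Rationale: {record['rationale']}",
    ]
    # The outcome line always appears, so every narrative has the same shape.
    if not record.get("outcome_trackable", False):
        lines.append("Outcome: not trackable for this decision.")
    elif record.get("outcome") is None:
        lines.append("Outcome: not yet known.")
    else:
        lines.append(f"Outcome: {record['outcome']}")
    return "\n".join(lines)


example = {
    "decided_at": "2026-05-23T14:32:07Z",
    "proposal_id": "p_0187",
    "approver_name": "A. Example",
    "approver_role": "Procurement lead",
    "authority_tier": "tier_2",
    "decision": "approved",
    "brief_as_presented": "Switch supplier for component X. Confidence 0.83.",
    "rationale": "Current supplier has missed three delivery windows.",
    "outcome_trackable": True,
    "outcome": None,
}
print(render_narrative(example))
```

Because the text is derived from the record and never typed by hand, the narrative and the structured data cannot drift apart.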
The Practical Move
What A Decision Record Must Capture
- 01 The brief exactly as presented
- 02 The agent's reasoning and alternatives
- 03 Applicable risk factors
- 04 The governance context version
- 05 The approver's name, role and authority
- 06 The decision and rationale
- 07 Any override detail
- 08 The executed action
- 09 The outcome, where trackable
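The checklist above can be turned into a simple completeness check. This is a sketch under stated assumptions: the record is a flat dictionary, every key name is hypothetical, and a dismissed proposal is expected to record its executed action explicitly (for example as "none") and not leave it absent.

```python
ALWAYS_REQUIRED = {
    "The brief exactly as presented": ("brief_as_presented",),
    "The agent's reasoning and alternatives": ("agent_reasoning", "alternatives"),
    "Applicable risk factors": ("risk_factors",),
    "The governance context version": ("governance_version",),
    "The approver's name, role and authority": (
        "approver_name", "approver_role", "authority_tier",
    ),
    "The decision and rationale": ("decision", "rationale"),
    "The executed action": ("executed_action",),
}


def _present(record: dict, keys) -> bool:
    # An empty list is allowed (e.g. no risk factors triggered); None or absent is not.
    return all(record.get(k) is not None for k in keys)


def missing_items(record: dict) -> list:
    """Return the checklist items this record cannot yet evidence."""
    missing = [label for label, keys in ALWAYS_REQUIRED.items() if not _present(record, keys)]
    if record.get("decision") == "overridden" and not _present(record, ("override_detail",)):
        missing.append("Any override detail")
    if record.get("outcome_trackable") and not _present(record, ("outcome",)):
        missing.append("The outcome, where trackable")
    return missing


# The event log from the opening example, checked against the list.
log_event = {
    "event": "approved",
    "user_id": "u_0042",
    "timestamp": "2026-05-23T14:32:07Z",
    "proposal_id": "p_0187",
    "confidence": 0.83,
}
print(len(missing_items(log_event)))  # 7
```

Run against the opening log entry, the check reports every always-required item as missing. That gap is the difference between an event and a decision record.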
Logs help engineers debug systems. Decision records help organisations defend decisions.


