DEC-0021: Pre-Flight Gate Enforcement for Non-Trivial Work
Decision
Promote the pre-flight gate to the top of the highest-weight governance surfaces so it fires before the first tool call on non-trivial work.
The fix is structural, not additive:
- add a short first-action classifier at the top of the accountability and Cursor operational rules
- add an explicit stop gate to the prompt relay rule
- mirror the classifier in
AGENTS.md - require agents to treat ambiguous requests as non-trivial by default
The first-action sequence for non-trivial work is now:
- Check
make prompt-queue - Read
AGENTS.mdand the relevant rules - Write or claim the prompt relay
- Read the required context
- Only then execute tools, todos, parallel agents, or edits
Incident
On 2026-04-03, Cursor received a request to audit the codebase for a rename from "Mystery Schools" to "Mystery Schools AI." The task was clearly non-trivial: multi-file, cross-system, and planning-heavy.
Instead of routing through the prompt relay first, Cursor immediately spawned
parallel agents. The first pass failed, the human intervened, and the correct first
action (PR-0144) was written only after the incident had already escalated.
The human's complaint was not about one malformed call. It was about a governance failure: a newly engineered prompt and agent system was bypassed at the exact moment it was supposed to govern behavior.
Root Cause
The repo already had the right rules. The failure mode was enforcement weight.
- The critical rule existed, but it appeared deep inside long-form governance text instead of at the top of the first rule surfaces Cursor sees.
- RLHF pressure favored immediate helpfulness: "user asked for action, begin acting now."
- The rules described what should happen, but they did not interrupt the very first action strongly enough.
- Recovery pressure worsened the incident: once stopped, Cursor rushed to compensate and produced another malformed call instead of slowing down.
This was not a content gap. It was a salience gap.
Rationale
Adding more policy would make the problem worse by burying the important rule even further. The fix is to compress and elevate the gate:
- put the classifier first
- make the stop condition explicit
- define what counts as non-trivial
- define what is forbidden before pre-flight is complete
- give ambiguity a default route: if unsure, treat it as non-trivial
This mirrors high-reliability handoff design: the critical interrupt must be encountered before the operator can drift into unsafe execution.
Consequences
Positive
- The first-action decision is now much harder to skip accidentally.
- Multi-file audits, bug investigations, rename work, and architectural work are explicitly routed into the prompt relay before execution.
- The rules now distinguish "classify first" from the longer downstream protocol details.
Tradeoffs
- Some borderline-trivial tasks will be routed through the non-trivial path. This is acceptable. False positives are cheaper than another full bypass of the governance system.
- The system now relies more heavily on a small number of top-loaded rules, which means those rules must stay concise and high-signal.
Implementation
The change lands in:
.cursor/rules/agent-accountability.mdc.cursor/rules/cursor-operational-discipline.mdc.cursor/rules/prompts.mdcAGENTS.md
No new rule file is introduced. The existing rules remain the source of truth; their ordering and prominence change.
Follow-Up
- Monitor whether Cursor now routes non-trivial work through prompts more consistently in transcript review.
- If the problem recurs, the next intervention should be reduction and consolidation of overlapping rule text, not another layer of policy.