DEC-0021: Pre-Flight Gate Enforcement for Non-Trivial Work

Decision

Promote the pre-flight gate to the top of the highest-weight governance surfaces so it fires before the first tool call on non-trivial work.

The fix is structural, not additive:

add a short first-action classifier at the top of the accountability and Cursor operational rules
add an explicit stop gate to the prompt relay rule
mirror the classifier in AGENTS.md
require agents to treat ambiguous requests as non-trivial by default

The first-action sequence for non-trivial work is now:

Check make prompt-queue
Read AGENTS.md and the relevant rules
Write or claim the prompt relay
Read the required context
Only then execute tools, todos, parallel agents, or edits

Incident

On 2026-04-03, Cursor received a request to audit the codebase for a rename from "Mystery Schools" to "Mystery Schools AI." The task was clearly non-trivial: multi-file, cross-system, and planning-heavy.

Instead of routing through the prompt relay first, Cursor immediately spawned parallel agents. The first pass failed, the human intervened, and the correct first action (PR-0144) was written only after the incident had already escalated.

The human's complaint was not about one malformed call. It was about a governance failure: a newly engineered prompt and agent system was bypassed at the exact moment it was supposed to govern behavior.

Root Cause

The repo already had the right rules. The failure mode was enforcement weight.

The critical rule existed, but it appeared deep inside long-form governance text instead of at the top of the first rule surfaces Cursor sees.
RLHF pressure favored immediate helpfulness: "user asked for action, begin acting now."
The rules described what should happen, but they did not interrupt the very first action strongly enough.
Recovery pressure worsened the incident: once stopped, Cursor rushed to compensate and produced another malformed call instead of slowing down.

This was not a content gap. It was a salience gap.

Rationale

Adding more policy would make the problem worse by burying the important rule even further. The fix is to compress and elevate the gate:

put the classifier first
make the stop condition explicit
define what counts as non-trivial
define what is forbidden before pre-flight is complete
give ambiguity a default route: if unsure, treat it as non-trivial

This mirrors high-reliability handoff design: the critical interrupt must be encountered before the operator can drift into unsafe execution.

Consequences

Positive

The first-action decision is now much harder to skip accidentally.
Multi-file audits, bug investigations, rename work, and architectural work are explicitly routed into the prompt relay before execution.
The rules now distinguish "classify first" from the longer downstream protocol details.

Tradeoffs

Some borderline-trivial tasks will be routed through the non-trivial path. This is acceptable. False positives are cheaper than another full bypass of the governance system.
The system now relies more heavily on a small number of top-loaded rules, which means those rules must stay concise and high-signal.

Implementation

The change lands in:

.cursor/rules/agent-accountability.mdc
.cursor/rules/cursor-operational-discipline.mdc
.cursor/rules/prompts.mdc
AGENTS.md

No new rule file is introduced. The existing rules remain the source of truth; their ordering and prominence change.

Follow-Up

Monitor whether Cursor now routes non-trivial work through prompts more consistently in transcript review.
If the problem recurs, the next intervention should be reduction and consolidation of overlapping rule text, not another layer of policy.