Why Traditional Tickets Fail for AI Agents
Jira tickets, GitHub Issues, and Linear cards were designed for humans who can infer context, navigate codebases, and ask clarifying questions. AI agents can do none of these things. They need a different format — one where every field exists because its absence caused a real failure.
The SpecForge Format was distilled from 400+ builds across 14M+ lines of production TypeScript. Every field you see in the schema earned its place by solving a problem that no amount of prompt engineering could fix.
Tickets designed for agents, not humans.
A human developer reading "Add JWT auth" knows to check the existing codebase for auth patterns, look at the project structure, find related types, and ask a colleague if something is unclear. An AI agent reading the same ticket will hallucinate file paths, invent type signatures, and produce code that compiles but integrates with nothing. The gap between what a human infers and what an agent needs explicitly stated is the entire problem SpecForge Format solves.
Human Ticket vs. Agent Ticket
A human ticket might read, in its entirety, "Implement JWT auth middleware." The agent version of the same ticket spells out everything an agent would otherwise guess. Every field in the SpecForge Format exists because its absence caused a real failure.
{
  "title": "Implement JWT auth middleware",
  "implementation": {
    "filesToCreate": [
      "src/middleware/auth.middleware.ts",
      "src/types/auth.types.ts"
    ],
    "filesToModify": [
      "src/routes/index.ts",
      "src/types/express.d.ts"
    ],
    "steps": [
      "Create AuthPayload and AuthRequest types",
      "Implement verifyToken middleware",
      "Add middleware to protected route group",
      "Extend Express Request type declaration"
    ]
  },
  "codeReferences": [{
    "name": "JwtConfig",
    "code": "src/config/jwt.config.ts::JwtConfig",
    "language": "typescript"
  }],
  "typeReferences": [{
    "name": "UserDocument",
    "definition": "src/models/user.model.ts::UserDocument"
  }],
  "acceptanceCriteria": [
    "Returns 401 for missing token",
    "Returns 403 for expired token",
    "req.user contains decoded UserDocument",
    "Existing routes remain unaffected"
  ],
  "testSpecification": {
    "gates": ["unit", "integration"],
    "commands": ["npm run test:auth"]
  },
  "dependencies": [{
    "dependsOnId": "uuid-of-user-model-ticket",
    "type": "requires"
  }],
  "estimatedHours": 2
}

A dependency of type requires means the scheduler can run this ticket in parallel with non-dependent work; a dependency of type blocks would gate the entire wave.

Eight Lessons from 400+ Builds
Each lesson below describes a real failure mode encountered while building production software with AI agents. The schema field that fixed it is listed at the end. Nothing in the format is theoretical — every field is scar tissue.
The Jira Ticket Hallucination
An agent received a human-style user story: "Add JWT authentication to the API." No file paths. No type signatures. No reference to existing code.
The agent hallucinated file paths that didn't exist, invented API signatures incompatible with the codebase, and created types that duplicated ones already defined three directories away. The code compiled. It was wrong in every way that mattered.
Every ticket now carries explicit implementation instructions: filesToCreate lists exact paths for new files. filesToModify names every existing file the agent will touch. codeReferences provides the actual signatures and patterns the agent must follow. typeReferences defines the TypeScript types the agent must use — not invent.
Fields: implementation.filesToCreate, implementation.filesToModify, codeReferences, typeReferences

The Merge Conflict Massacre
Multiple agents working in parallel modified the same file. Agent A added an import at line 3. Agent B rewrote the function starting at line 5. Agent C added a new export at the end.
Three-way merge conflicts on every file touched by more than one agent. Manual resolution wiped out the productivity gains of parallel execution. The more agents ran simultaneously, the worse it got.
With filesToCreate and filesToModify declared in the schema, a scheduler can compute which tickets touch which files before execution starts. Tickets that share files go into the same wave — sequential within the wave, parallel across waves. Zero file collisions by construction, not by luck.
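As a sketch of the scheduling idea, the file-collision check can be treated as a union-find over the file-overlap graph: any two tickets that touch a common file collapse into the same wave. This is a minimal illustration under assumed types; `computeWaves` and the trimmed-down `Ticket` shape are hypothetical helpers, not part of the SpecForge spec.

```typescript
interface Ticket {
  id: string;
  implementation: { filesToCreate: string[]; filesToModify: string[] };
}

// Group tickets so that tickets sharing any file land in the same wave.
// Waves run in parallel with each other; tickets inside a wave run sequentially.
function computeWaves(tickets: Ticket[]): Ticket[][] {
  const parent = tickets.map((_, i) => i);
  const find = (i: number): number =>
    parent[i] === i ? i : (parent[i] = find(parent[i]));

  const fileOwner = new Map<string, number>(); // file path -> first ticket seen
  tickets.forEach((t, i) => {
    const files = [...t.implementation.filesToCreate, ...t.implementation.filesToModify];
    for (const f of files) {
      const owner = fileOwner.get(f);
      if (owner === undefined) fileOwner.set(f, i);
      else parent[find(i)] = find(owner); // shared file: merge into one wave
    }
  });

  const waves = new Map<number, Ticket[]>();
  tickets.forEach((t, i) => {
    const root = find(i);
    if (!waves.has(root)) waves.set(root, []);
    waves.get(root)!.push(t);
  });
  return [...waves.values()];
}
```

Because the grouping is computed from declared paths before any agent runs, collisions are impossible by construction rather than avoided by luck.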
Fields: implementation.filesToCreate, implementation.filesToModify

The Status That Shouldn't Exist
The ticket lifecycle had an explicit "blocked" status. When a dependency wasn't met, the ticket was marked blocked. When the dependency resolved, someone (or something) had to remember to flip it back to "ready."
Agents got stuck in blocked state because nothing triggered the transition back. Tickets that were theoretically unblocked stayed frozen. The blocked status created a dead state that required external intervention to escape — the opposite of autonomous execution.
Blocked was eliminated as a status. The lifecycle became four clean states: pending → ready → active → done. Blocked is now inferred from the dependency graph: if a ticket's dependencies (type: blocks) aren't done, it's pending. When they resolve, it becomes ready automatically. No status transition needed. The dependencies.type field distinguishes blocks (hard gate — cannot start) from requires (needs output, but can be sequenced flexibly).
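The inference rule can be sketched in a few lines. This is an illustrative helper under assumed shapes; `effectiveStatus` and the trimmed `DepTicket` interface are not part of the spec, only the four status values and the two dependency types are.

```typescript
type Status = "pending" | "ready" | "active" | "done";
interface Dep { dependsOnId: string; type: "blocks" | "requires"; }
interface DepTicket { id: string; status: Status; dependencies: Dep[]; }

// "Blocked" is never stored. A pending ticket is ready exactly when every
// blocks-type dependency is done; requires-type deps do not hard-gate it.
function effectiveStatus(t: DepTicket, byId: Map<string, DepTicket>): Status {
  if (t.status !== "pending") return t.status;
  const hardGates = t.dependencies.filter((d) => d.type === "blocks");
  const cleared = hardGates.every((d) => byId.get(d.dependsOnId)?.status === "done");
  return cleared ? "ready" : "pending";
}
```

Because readiness is recomputed from the graph on every query, there is no dead state to escape: the moment the last blocking dependency finishes, the ticket is ready with no transition to trigger.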
Fields: ticket.status (pending | ready | active | done), dependencies[].type (blocks | requires)

The Style Drift at Scale
With dozens of agents executing tickets in parallel, each one produced code in its own style. Different naming conventions. Different error handling patterns. Different import orderings. Different return type shapes.
The codebase compiled but looked like it was written by 50 different people — because it was. Consistency reviews became the bottleneck. The more agents you added, the worse the Frankenstein effect got.
Patterns became a first-class entity in the schema, defined at the Specification level and inherited by every Epic and Ticket beneath it. codeStandards encodes naming conventions, error handling strategy, and language-specific rules. commonImports lists the import statements every ticket should use. returnTypes defines the standard response shapes (e.g., Result<T, E> patterns). Define once at the top. Every agent reads the same patterns.
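A Patterns block declared once at the Specification level might look like the sketch below. The field names follow the schema described above; the concrete values are hypothetical examples, not prescribed by the spec.

```typescript
// Illustrative Patterns block, inherited by every Epic and Ticket beneath
// the Specification. Values are example conventions, not spec requirements.
const patterns = {
  codeStandards: {
    naming: "camelCase functions, PascalCase types, kebab-case filenames",
    errorHandling: "never throw across module boundaries; return Result<T, E>",
  },
  commonImports: ['import { Result } from "../types/result.types";'],
  returnTypes: {
    service: "Result<T, ServiceError>",
    handler: "Promise<void>",
  },
};
```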
Fields: patterns.codeStandards, patterns.commonImports, patterns.returnTypes

The Orphan Design Document
Architecture diagrams, ADRs (Architecture Decision Records), and mockups lived in separate documents — Notion pages, Confluence wikis, Figma files, scattered markdown. No agent knew they existed.
Agents reimplemented things that had already been decided differently. ERD diagrams showed a normalized schema; the agent created a denormalized one. An ADR specified event-driven communication; the agent built synchronous REST calls. Design decisions existed but had no path to execution.
Blueprints became a first-class entity in the schema, directly linked to a Specification. Each blueprint has a typed category (flowchart, architecture, sequence, erd, adr, mockup, component, deployment, api), a format (mermaid, markdown, ascii, mixed), a lifecycle status (draft, review, approved, deprecated), and the actual content inline. The agent doesn't need to search for design documents — they're in the same payload as the tickets.
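A single blueprint entry might look like this sketch. The category, format, and status values come from the enums listed above; the Mermaid content itself is a hypothetical example.

```typescript
// Illustrative blueprint entry: an approved sequence diagram shipped inline
// with the spec, so agents receive it in the same payload as the tickets.
const blueprint = {
  category: "sequence",
  format: "mermaid",
  status: "approved",
  content: [
    "sequenceDiagram",
    "  Client->>API: POST /login",
    "  API->>Auth: verify credentials",
    "  Auth-->>API: signed JWT",
    "  API-->>Client: 200 { token }",
  ].join("\n"),
};
```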
Fields: blueprints[].category, blueprints[].format, blueprints[].status, blueprints[].content

The Acceptance Criteria Gap
Tickets without explicit acceptance criteria produced code that "worked" in isolation but didn't satisfy the actual requirement. The agent declared the ticket done because it had no way to verify otherwise.
Reviews became the only quality gate. But by the time a human reviewed the output, the agent had already moved on, building subsequent tickets on top of incorrect foundations. Errors compounded geometrically.
acceptanceCriteria became a required array of verifiable strings on every ticket. Combined with testSpecification.gates (unit, integration, e2e), it creates a binary contract: either the ticket passes all criteria and all gates, or it's not done. No ambiguity. No "it mostly works." The agent has a checklist it can verify before declaring completion.
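The binary contract reduces to a conjunction: every criterion verified and every gate passed. In this sketch, `criteriaPassed` and `gatesPassed` are assumed bookkeeping fields an engine might record; they are illustrative, not part of the published schema.

```typescript
interface TicketResult {
  acceptanceCriteria: string[];
  criteriaPassed: string[];              // criteria the agent verified (assumed field)
  testSpecification: { gates: string[] };
  gatesPassed: string[];                 // gates whose commands exited 0 (assumed field)
}

// Done is all-or-nothing: every criterion and every gate, or not done at all.
function isDone(r: TicketResult): boolean {
  const allCriteria = r.acceptanceCriteria.every((c) => r.criteriaPassed.includes(c));
  const allGates = r.testSpecification.gates.every((g) => r.gatesPassed.includes(g));
  return allCriteria && allGates; // no partial credit for "mostly works"
}
```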
Fields: ticket.acceptanceCriteria, testSpecification.gates, testSpecification.commands

The Feedback Loop That Teaches Decomposition
Without tracking estimated vs. actual time, there was no signal to calibrate decomposition quality. Tickets estimated at 2 hours that took 20 were indistinguishable from tickets that took 2.
Decomposition quality stagnated. The system produced tickets that were either too granular (wasting orchestration overhead) or too broad (exceeding context windows and agent capabilities). There was no feedback mechanism to improve.
Both estimatedHours and actualHours live on every ticket. The delta between them is the decomposition quality signal. If actual consistently exceeds estimated, the tickets aren't atomic enough — the decomposer needs to produce smaller units. If actual is consistently a fraction of estimated, tickets are over-decomposed. The schema carries the evidence that the next decomposition cycle needs to improve.
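The calibration signal can be sketched as the aggregate ratio of actual to estimated hours. The function name and the 1.5x/0.5x thresholds are illustrative choices, not values defined by the format.

```typescript
interface TimedTicket { estimatedHours: number; actualHours: number; }

// Aggregate actual/estimated ratio across completed tickets. A high ratio
// suggests tickets are too broad; a low one suggests over-decomposition.
function calibration(tickets: TimedTicket[]): "too-broad" | "too-granular" | "calibrated" {
  const actual = tickets.reduce((s, t) => s + t.actualHours, 0);
  const estimated = tickets.reduce((s, t) => s + t.estimatedHours, 0);
  const ratio = actual / estimated;
  if (ratio > 1.5) return "too-broad";    // decomposer should emit smaller units
  if (ratio < 0.5) return "too-granular"; // decomposer is wasting orchestration overhead
  return "calibrated";
}
```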
Fields: ticket.estimatedHours, ticket.actualHours

The Context Window Graveyard
In long conversations, the model compacts context. Critical information — which file to create, which type to reference, which pattern to follow — silently disappears from the window. The agent keeps executing, but it's now operating on incomplete information.
The agent produces plausible but wrong output. It invents what it can't remember. And because the output looks correct — syntactically valid, structurally reasonable — the error isn't caught until integration, when the damage is already done.
Tickets are designed to be atomic and self-contained. Each ticket carries its complete implementation context: filesToCreate, filesToModify, codeReferences, typeReferences, inherited Patterns, resolved dependencies. The agent doesn't need to remember anything from previous tickets or earlier conversation turns. The ticket is the entire context. The schema is designed for intentional amnesia.
Fields: ticket.implementation (complete), ticket.codeReferences (inline), ticket.typeReferences (inline), patterns (inherited from specification)

Design Principles
These principles guide every RFC and schema change. They emerge directly from the eight lessons above.
Explicit over inferred
If the agent would need to guess, make it a field. Implicit context is where hallucinations begin.
Atomic and self-contained
Every ticket carries its complete execution context. Assume the agent remembers nothing from previous work.
Graph over sequence
Execution order comes from the dependency graph, not from array position. Derived state is computed, not stored.
Consistency at specification time
Patterns, standards, and conventions are declared once and inherited. Consistency is a specification problem, not a review problem.
Verifiable exit contracts
Every ticket has acceptance criteria and test gates. If the agent can't verify completion, it isn't complete.
Engine-agnostic
The format describes work, not workflow. Any compliant engine can consume it. The schema belongs to the ecosystem.
The SpecForge Format is an open specification. Any tool can generate, validate, and consume .specforge.json files. SpecForge (the product) is a reference implementation — the most advanced engine for executing specs — but the format belongs to the ecosystem.