Dev Context Methodology + Wardstone
A structured project lifecycle methodology for AI-assisted development, paired with a Tauri 2 desktop dashboard for real-time project management.
Problem
AI-assisted development moves fast, but it doesn't come with structure by default. Without a clear lifecycle, projects drift through undefined stages with no definition of done and no way to tell whether your process is getting better or just getting faster at producing a mess. The tools that do exist — Jira, Linear, Notion — were designed for teams. They assume handoffs, standups, and a separation between the person who plans and the person who builds. None of them are built for a solo developer working in deep collaboration with an AI agent.
I needed a system that could impose structure without slowing me down: one that was readable by both me and Claude, that lived in the repo itself, and that produced measurable signals about workflow quality over time.
How It Works
The Methodology
DCM is a 5-layer context model built around an 8-stage project lifecycle. The five layers — workspace, project, feature, session, and task — each carry a distinct scope of state, from long-lived architectural decisions down to the work happening in a single conversation. Every project gets a CONTEXT.md that tracks its current stage, exit criteria, and open to-dos. The 8 stages run from Scaffold through Maintained, with explicit gate contracts defining what must be true before a project advances. MoSCoW-prioritized backlogs live in plain markdown and get updated in-place as work progresses.
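A CONTEXT.md under this model might look like the following. The field names, status markers, and backlog entries here are illustrative, not the canonical DCM schema:

```markdown
---
project: wardstone
stage: scaffold        # one of the 8 lifecycle stages
sprint: 12
---

## Exit Criteria (gate to next stage)

- [x] Repo scaffolded with .tracker/ directory
- [ ] First sprint opened with goal and scope

## Backlog (MoSCoW)

- [M] Atomic write path for sprint files
- [S] Command palette fuzzy matching
- [C] Theme import/export
```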
The goal was to make the lifecycle machine-readable without making it feel bureaucratic. Claude Code reads these files at session start, which means the agent walks into every conversation with full project context — no re-explaining what stage things are at or what the current sprint goal is.
The Plugin
The DCM Claude Code plugin ships 10 slash commands, 4 skills, and 8 hooks. Commands cover the full project management surface: /track-new to scaffold a project into the tracker, /track-advance to gate-check and promote a stage, /track-sprint to open a sprint with goal and scope, /track-retro to close it with a structured retrospective. The hooks fire automatically — before a session starts, after a commit, when a task contract is opened or closed — keeping state synchronized without requiring manual discipline.
Task contracts are the unit of execution. Each feature, bugfix, refactor, or release gets a contract with a stated scope, acceptance criteria, and audit checklist. The agent is expected to self-report when it takes a wrong approach, gets corrected, or fails an exit gate. Those incidents go into the tracker and feed the benchmark system.
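A task contract, similarly, might read as follows (a hypothetical layout, not the plugin's exact template):

```markdown
---
type: feature
status: open
---

## Scope

Pause the file watcher while an undo rollback is in flight.

## Acceptance Criteria

- [ ] Undo never triggers an external-change prompt
- [ ] Watcher resumes after the rollback completes

## Audit Checklist

- [ ] Wrong approaches and corrections self-reported as incidents
- [ ] Exit gate re-checked before close
```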
The Benchmark System
The benchmark system uses git history to derive cycle time, sprint completion rate, and incident density — then compares them against industry baselines sourced from published research. Each project gets a baseline snapshot at v1.0.0 and a controlled experiment protocol for testing whether specific process changes improve measurable outcomes.
This was the part that required the most design discipline. The metrics are only meaningful if the definitions are stable. "Cycle time" has to mean the same thing in sprint 3 as it did in sprint 1. "Incident density" needs a consistent denominator. Getting that right meant specifying the measurement protocol before writing a single line of tracking code, and resisting the temptation to add proxy metrics that felt interesting but couldn't be consistently collected.
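A minimal Rust sketch of what stable definitions buy you once contracts have been parsed out of git history. The `Contract` struct and both functions are hypothetical, not the benchmark system's actual code:

```rust
/// A completed task contract, as it might be reconstructed from git
/// history. Hypothetical struct: the real tracker schema may differ.
struct Contract {
    opened_secs: u64, // unix timestamp of the commit that opened the contract
    closed_secs: u64, // unix timestamp of the commit that closed it
    incidents: u32,   // self-reported incidents logged against this contract
}

/// Median cycle time across completed contracts, in seconds. Fixing the
/// definition here is what keeps "cycle time" stable across sprints.
fn median_cycle_time(contracts: &[Contract]) -> Option<u64> {
    if contracts.is_empty() {
        return None;
    }
    let mut times: Vec<u64> = contracts
        .iter()
        .map(|c| c.closed_secs - c.opened_secs)
        .collect();
    times.sort_unstable();
    Some(times[times.len() / 2])
}

/// Incidents per completed contract. The fixed denominator keeps the
/// metric comparable from one sprint to the next.
fn incident_density(contracts: &[Contract]) -> f64 {
    if contracts.is_empty() {
        return 0.0;
    }
    let total: u32 = contracts.iter().map(|c| c.incidents).sum();
    f64::from(total) / contracts.len() as f64
}

fn main() {
    let contracts = vec![
        Contract { opened_secs: 0, closed_secs: 3_600, incidents: 1 },
        Contract { opened_secs: 0, closed_secs: 7_200, incidents: 0 },
        Contract { opened_secs: 0, closed_secs: 1_800, incidents: 2 },
    ];
    println!("median cycle time: {:?}", median_cycle_time(&contracts)); // Some(3600)
    println!("incident density: {:.2}", incident_density(&contracts)); // 1.00
}
```

The point of the sketch is the shape, not the numbers: both metrics are pure functions of the contract set, so re-running them over any sprint's history always applies the same definition.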
The Dashboard
Wardstone is a Tauri 2 desktop application — Rust backend, React 19 frontend — that provides a bidirectional editing interface for all DCM markdown state. It reads the .tracker/ directory structure, renders projects, sprints, backlogs, and incidents in a clean dashboard UI, and writes changes back to disk atomically.
The file-system interaction required more care than expected. Writes use a temp-file-then-rename pattern to prevent partial states on crash. A file watcher detects external changes (when the CLI modifies a file while the dashboard is open) and prompts the user rather than silently overwriting. The deferred undo system pauses the watcher during undo operations so it doesn't re-read its own rollback as an external change. Wardstone also ships with 9 themes, a command palette, and keyboard-first navigation.
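The temp-file-then-rename pattern can be sketched in a few lines of Rust; `atomic_write` below is a simplified illustration of the idea, not Wardstone's actual implementation:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Write `contents` to `path` via a sibling temp file plus rename.
/// A sketch of the pattern, not Wardstone's exact code.
fn atomic_write(path: &Path, contents: &str) -> std::io::Result<()> {
    let tmp = path.with_extension("md.tmp");
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents.as_bytes())?;
        f.sync_all()?; // flush to disk before the rename makes it visible
    }
    // Rename within the same directory is atomic on POSIX filesystems:
    // readers see either the old file or the new one, never a partial write.
    fs::rename(&tmp, path)
}

fn main() -> std::io::Result<()> {
    let target = std::env::temp_dir().join("sprint-12.md");
    atomic_write(&target, "---\nstatus: active\n---\n")?;
    print!("{}", fs::read_to_string(&target)?);
    Ok(())
}
```

Because the rename either fully succeeds or fully fails, a crash mid-write leaves the previous version of the file intact rather than a truncated one.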
File-System-as-Database
All state in both DCM and Wardstone is plain markdown with YAML frontmatter. There is no database, no SQLite file, no binary format. A project's full lifecycle history is readable with any text editor, diffable in git, and portable across machines without a migration step.
This was a deliberate constraint. The tradeoff is that you give up query flexibility and rely on convention rather than enforcement to keep the schema consistent. The payoff is that the system never has a "the database is corrupted" failure mode, and the AI agent can read and write state using the same tools it uses for everything else. When the agent updates a sprint's status or appends an incident entry, it's just editing a markdown file — no ORM, no connection pooling, no schema migration.
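A sketch of the kind of edit this implies, assuming flat `key: value` frontmatter. `set_frontmatter_field` is hypothetical and deliberately naive; a production version would use a real YAML parser:

```rust
/// Rewrite a single top-level `key: value` line inside the YAML
/// frontmatter of a markdown document, leaving the body untouched.
/// Hypothetical and deliberately naive: nested YAML needs a real parser.
fn set_frontmatter_field(doc: &str, key: &str, value: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut seen_open = false;
    let mut in_frontmatter = false;
    for line in doc.lines() {
        if line.trim_end() == "---" {
            if !seen_open {
                seen_open = true;
                in_frontmatter = true; // opening fence
            } else {
                in_frontmatter = false; // closing fence (or a body rule)
            }
            out.push(line.to_string());
            continue;
        }
        if in_frontmatter && line.split(':').next().map(str::trim) == Some(key) {
            out.push(format!("{key}: {value}")); // replace just this field
        } else {
            out.push(line.to_string());
        }
    }
    out.join("\n")
}

fn main() {
    let doc = "---\nstatus: active\nsprint: 12\n---\n# Sprint notes\n";
    println!("{}", set_frontmatter_field(doc, "status", "closed"));
}
```

Nothing about the operation requires a database: the diff it produces is exactly what shows up in `git log`, which is what makes the state auditable.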
What I Learned
Designing the methodology before any tooling existed forced a kind of clarity that's hard to achieve when you're building and designing simultaneously. Every decision about what state to track had to be justified on paper, not rationalized after the fact by "well, it was easy to store." The 5-layer context model went through several revisions before the boundaries felt right — the key insight was that "session" and "task" needed to be separate layers because a session can contain multiple tasks, and the agent needs to distinguish between work-in-progress and completed contracts.
The hardest technical problem in Wardstone was making file-system state feel real-time without introducing race conditions. The combination of atomic writes, mtime-based change detection, and watcher pause-during-undo took three iterations to get right. The first version used optimistic UI updates that occasionally desynchronized from disk state. The second version was too conservative and felt sluggish. The third — watching for external changes at the file level and diffing mtime before applying — hit the balance.
The gap between "methodology as document" and "methodology as enforced system" turned out to be enormous. Writing down an 8-stage lifecycle is easy. Building the hooks and contracts that make skipping stages feel wrong — without making them so onerous that you bypass them deliberately — required living inside the system while building it. The gates had to create just enough friction to prompt intentional decisions, not enough to feel like paperwork.
One non-obvious outcome: the incident self-reporting system changed how I think about agent errors. Treating a wrong approach as a loggable event rather than something to quietly fix shifts the relationship from "the AI made a mistake" to "here is a data point about where the agent needs better guidance." That reframe made the methodology feel less like a constraint and more like a continuous feedback loop.
Outcome
DCM shipped at v1.0.0 with 50 commits. Wardstone shipped at v1.0.0 concurrently. Both are open source. The methodology is actively used across every project in this portfolio — the tracker driving this site's development is a live instance of DCM. The benchmark system is instrumented and collecting baseline data, with the first controlled experiment protocol drafted for sprint cycle time improvement.