a73x

7a63914e

Merge branch '010-email-patch-import'

a73x   2026-03-21 10:05


diff --git a/.claude/commands/speckit.analyze.md b/.claude/commands/speckit.analyze.md
new file mode 100644
index 0000000..98b04b0
--- /dev/null
+++ b/.claude/commands/speckit.analyze.md
@@ -0,0 +1,184 @@
---
description: Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation.
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Goal

Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.

## Operating Constraints

**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. You may offer an optional remediation plan, but the user must explicitly approve it before any follow-up editing commands are invoked.

**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this analysis scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, or tasks—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.analyze`.

## Execution Steps

### 1. Initialize Analysis Context

Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` once from the repo root and parse the JSON output for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:

- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md

Abort with an error message if any required file is missing (instruct the user to run the missing prerequisite command).
For single quotes in arguments such as "I'm Groot", use the escape syntax 'I'\''m Groot' (or double-quote the argument if possible: "I'm Groot").
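For illustration, the path derivation and missing-file check from this step might be sketched as follows (the payload shape is assumed from the field names above; the actual script output may carry more fields):

```python
from pathlib import Path

def derive_artifact_paths(payload: dict) -> dict:
    """Given the parsed JSON payload from check-prerequisites.sh,
    derive absolute paths to the three core artifacts."""
    feature_dir = Path(payload["FEATURE_DIR"])
    return {
        "SPEC": feature_dir / "spec.md",
        "PLAN": feature_dir / "plan.md",
        "TASKS": feature_dir / "tasks.md",
    }

def missing_artifacts(paths: dict) -> list:
    """Names of required artifacts that do not exist on disk;
    a non-empty result means the analysis should abort."""
    return [name for name, p in paths.items() if not p.exists()]
```

A caller would run the script once via `subprocess`, feed the parsed JSON to `derive_artifact_paths`, and abort if `missing_artifacts` returns anything.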

### 2. Load Artifacts (Progressive Disclosure)

Load only the minimal necessary context from each artifact:

**From spec.md:**

- Overview/Context
- Functional Requirements
- Non-Functional Requirements
- User Stories
- Edge Cases (if present)

**From plan.md:**

- Architecture/stack choices
- Data Model references
- Phases
- Technical constraints

**From tasks.md:**

- Task IDs
- Descriptions
- Phase grouping
- Parallel markers [P]
- Referenced file paths

**From constitution:**

- Load `.specify/memory/constitution.md` for principle validation

### 3. Build Semantic Models

Create internal representations (do not include raw artifacts in output):

- **Requirements inventory**: Each functional + non-functional requirement with a stable key (derive a slug from the imperative phrase; e.g., "User can upload file" → `user-can-upload-file`)
- **User story/action inventory**: Discrete user actions with acceptance criteria
- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
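The slug derivation and inventory above can be sketched as:

```python
import re

def requirement_key(text: str) -> str:
    """Stable slug from an imperative requirement phrase,
    e.g. "User can upload file" -> "user-can-upload-file"."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def build_requirements_inventory(requirements: list) -> dict:
    """Map stable keys to requirement text; later passes
    (duplication, coverage) key off this inventory."""
    return {requirement_key(r): r for r in requirements}
```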

### 4. Detection Passes (Token-Efficient Analysis)

Focus on high-signal findings. Limit to 50 findings total; aggregate the remainder in an overflow summary.

#### A. Duplication Detection

- Identify near-duplicate requirements
- Mark lower-quality phrasing for consolidation
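Near-duplicate detection is not prescribed here; one minimal sketch uses token-set Jaccard similarity (both the measure and the 0.8 threshold are illustrative assumptions, not part of this command):

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Token-set similarity between two requirement phrasings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def near_duplicates(reqs: list, threshold: float = 0.8) -> list:
    """Index pairs whose phrasing overlaps enough to suggest
    consolidating into the clearer version."""
    return [(i, j) for (i, a), (j, b) in combinations(enumerate(reqs), 2)
            if jaccard(a, b) >= threshold]
```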

#### B. Ambiguity Detection

- Flag vague adjectives (fast, scalable, secure, intuitive, robust) lacking measurable criteria
- Flag unresolved placeholders (TODO, TKTK, ???, `<placeholder>`, etc.)
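The vague-adjective and placeholder flags above can be sketched as (the term list mirrors the bullets; the regex is illustrative):

```python
import re

VAGUE_TERMS = {"fast", "scalable", "secure", "intuitive", "robust"}
PLACEHOLDER_RE = re.compile(r"TODO|TKTK|\?\?\?|<placeholder>")

def ambiguity_flags(line: str) -> list:
    """Flags raised for a single requirement line: any vague
    adjective lacking measurable criteria, plus unresolved markers."""
    flags = sorted(t for t in VAGUE_TERMS
                   if re.search(rf"\b{t}\b", line, re.IGNORECASE))
    if PLACEHOLDER_RE.search(line):
        flags.append("placeholder")
    return flags
```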

#### C. Underspecification

- Requirements with verbs but missing object or measurable outcome
- User stories missing acceptance criteria alignment
- Tasks referencing files or components not defined in spec/plan

#### D. Constitution Alignment

- Any requirement or plan element conflicting with a MUST principle
- Missing mandated sections or quality gates from constitution

#### E. Coverage Gaps

- Requirements with zero associated tasks
- Tasks with no mapped requirement/story
- Non-functional requirements not reflected in tasks (e.g., performance, security)
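Given the task coverage mapping from step 3, the first two gap checks can be sketched as (the mapping shape is an assumption):

```python
def coverage_gaps(requirement_keys: list, task_map: dict) -> tuple:
    """task_map maps a task ID to the set of requirement keys it
    covers (an empty set means unmapped). Returns requirements with
    zero tasks, and tasks with no mapped requirement."""
    covered = set().union(*task_map.values())
    uncovered = [k for k in requirement_keys if k not in covered]
    unmapped = [t for t, keys in task_map.items() if not keys]
    return uncovered, unmapped
```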

#### F. Inconsistency

- Terminology drift (same concept named differently across files)
- Data entities referenced in plan but absent in spec (or vice versa)
- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without dependency note)
- Conflicting requirements (e.g., one requires Next.js while another specifies Vue)

### 5. Severity Assignment

Use this heuristic to prioritize findings:

- **CRITICAL**: Violates constitution MUST, missing core spec artifact, or requirement with zero coverage that blocks baseline functionality
- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
- **LOW**: Style/wording improvements, minor redundancy not affecting execution order
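A minimal sketch of the heuristic, assuming illustrative finding fields such as `constitution_violation` and `untestable` that are not defined elsewhere in this document:

```python
def assign_severity(finding: dict) -> str:
    """Map a finding to a severity level per the heuristic above."""
    if finding.get("constitution_violation") or finding.get("blocks_baseline"):
        return "CRITICAL"
    if finding.get("category") in {"Duplication", "Conflict"} or finding.get("untestable"):
        return "HIGH"
    if finding.get("category") in {"Terminology", "Underspecification"}:
        return "MEDIUM"
    return "LOW"
```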

### 6. Produce Compact Analysis Report

Output a Markdown report (no file writes) with the following structure:

## Specification Analysis Report

| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Duplication | HIGH | spec.md:L120-134 | Two similar requirements ... | Merge phrasing; keep clearer version |

(Add one row per finding; generate stable IDs prefixed by the category initial.)
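Stable ID assignment can be sketched as follows; sorting by category and location before numbering is one way to keep reruns deterministic (the field names are assumptions):

```python
from collections import defaultdict

def assign_finding_ids(findings: list) -> list:
    """Assign IDs like A1, D2 by category initial, in a
    deterministic order so reruns produce identical IDs."""
    counters = defaultdict(int)
    rows = []
    for f in sorted(findings, key=lambda f: (f["category"], f["location"])):
        initial = f["category"][0].upper()
        counters[initial] += 1
        rows.append({**f, "id": f"{initial}{counters[initial]}"})
    return rows
```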

**Coverage Summary Table:**

| Requirement Key | Has Task? | Task IDs | Notes |
|-----------------|-----------|----------|-------|

**Constitution Alignment Issues:** (if any)

**Unmapped Tasks:** (if any)

**Metrics:**

- Total Requirements
- Total Tasks
- Coverage % (requirements with >=1 task)
- Ambiguity Count
- Duplication Count
- Critical Issues Count
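The metrics can be computed along these lines (the input shapes and category names are illustrative):

```python
def report_metrics(requirements: list, task_map: dict, findings: list) -> dict:
    """Summary metrics for the report; counts mirror the bullets above."""
    covered = {k for keys in task_map.values() for k in keys}
    with_tasks = [r for r in requirements if r in covered]
    total = len(requirements)
    return {
        "total_requirements": total,
        "total_tasks": len(task_map),
        "coverage_pct": round(100 * len(with_tasks) / total, 1) if total else 100.0,
        "ambiguity_count": sum(f["category"] == "Ambiguity" for f in findings),
        "duplication_count": sum(f["category"] == "Duplication" for f in findings),
        "critical_count": sum(f.get("severity") == "CRITICAL" for f in findings),
    }
```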

### 7. Provide Next Actions

At end of report, output a concise Next Actions block:

- If CRITICAL issues exist: Recommend resolving before `/speckit.implement`
- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
- Provide explicit command suggestions: e.g., "Run /speckit.specify with refinement", "Run /speckit.plan to adjust architecture", "Manually edit tasks.md to add coverage for 'performance-metrics'"

### 8. Offer Remediation

Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)

## Operating Principles

### Context Efficiency

- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
- **Progressive disclosure**: Load artifacts incrementally; don't dump all content into analysis
- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts

### Analysis Guidelines

- **NEVER modify files** (this is read-only analysis)
- **NEVER hallucinate missing sections** (if absent, report them accurately)
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)

## Context

$ARGUMENTS
diff --git a/.claude/commands/speckit.checklist.md b/.claude/commands/speckit.checklist.md
new file mode 100644
index 0000000..b7624e2
--- /dev/null
+++ b/.claude/commands/speckit.checklist.md
@@ -0,0 +1,295 @@
---
description: Generate a custom checklist for the current feature based on user requirements.
---

## Checklist Purpose: "Unit Tests for English"

**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.

**NOT for verification/testing**:

- ❌ NOT "Verify the button clicks correctly"
- ❌ NOT "Test error handling works"
- ❌ NOT "Confirm the API returns 200"
- ❌ NOT checking if code/implementation matches the spec

**FOR requirements quality validation**:

- ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
- ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
- ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)

**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Execution Steps

1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
   - All file paths must be absolute.
   - For single quotes in arguments such as "I'm Groot", use the escape syntax 'I'\''m Groot' (or double-quote the argument if possible: "I'm Groot").

2. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
   - Be generated from the user's phrasing + extracted signals from spec/plan/tasks
   - Only ask about information that materially changes checklist content
   - Be skipped individually if already unambiguous in `$ARGUMENTS`
   - Prefer precision over breadth

   Generation algorithm:
   1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
   2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
   3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
   4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
   5. Formulate questions chosen from these archetypes:
      - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
      - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
      - Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
      - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
      - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
      - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")

   Question formatting rules:
   - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
   - Limit to A–E options maximum; omit table if a free-form answer is clearer
   - Never ask the user to restate what they already said
   - Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."

   Defaults when interaction impossible:
   - Depth: Standard
   - Audience: Reviewer (PR) if code-related; Author otherwise
   - Focus: Top 2 relevance clusters

   Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow-ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if the user explicitly declines more questions.

3. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers:
   - Derive checklist theme (e.g., security, review, deploy, ux)
   - Consolidate explicit must-have items mentioned by user
   - Map focus selections to category scaffolding
   - Infer any missing context from spec/plan/tasks (do NOT hallucinate)

4. **Load feature context**: Read from FEATURE_DIR:
   - spec.md: Feature requirements and scope
   - plan.md (if exists): Technical details, dependencies
   - tasks.md (if exists): Implementation tasks

   **Context Loading Strategy**:
   - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
   - Prefer summarizing long sections into concise scenario/requirement bullets
   - Use progressive disclosure: add follow-on retrieval only if gaps detected
   - If source docs are large, generate interim summary items instead of embedding raw text

5. **Generate checklist** - Create "Unit Tests for Requirements":
   - Create `FEATURE_DIR/checklists/` directory if it doesn't exist
   - Generate unique checklist filename:
     - Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
     - Format: `[domain].md`
   - File handling behavior:
     - If file does NOT exist: Create new file and number items starting from CHK001
     - If file exists: Append new items to existing file, continuing from the last CHK ID (e.g., if last item is CHK015, start new items at CHK016)
   - Never delete or replace existing checklist content - always preserve and append

   **CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
   Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
   - **Completeness**: Are all necessary requirements present?
   - **Clarity**: Are requirements unambiguous and specific?
   - **Consistency**: Do requirements align with each other?
   - **Measurability**: Can requirements be objectively verified?
   - **Coverage**: Are all scenarios/edge cases addressed?

   **Category Structure** - Group items by requirement quality dimensions:
   - **Requirement Completeness** (Are all necessary requirements documented?)
   - **Requirement Clarity** (Are requirements specific and unambiguous?)
   - **Requirement Consistency** (Do requirements align without conflicts?)
   - **Acceptance Criteria Quality** (Are success criteria measurable?)
   - **Scenario Coverage** (Are all flows/cases addressed?)
   - **Edge Case Coverage** (Are boundary conditions defined?)
   - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
   - **Dependencies & Assumptions** (Are they documented and validated?)
   - **Ambiguities & Conflicts** (What needs clarification?)

   **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:

   ❌ **WRONG** (Testing implementation):
   - "Verify landing page displays 3 episode cards"
   - "Test hover states work on desktop"
   - "Confirm logo click navigates home"

   ✅ **CORRECT** (Testing requirements quality):
   - "Are the exact number and layout of featured episodes specified?" [Completeness]
   - "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
   - "Are hover state requirements consistent across all interactive elements?" [Consistency]
   - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
   - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
   - "Are loading states defined for asynchronous episode data?" [Completeness]
   - "Does the spec define visual hierarchy for competing UI elements?" [Clarity]

   **ITEM STRUCTURE**:
   Each item should follow this pattern:
   - Question format asking about requirement quality
   - Focus on what's WRITTEN (or not written) in the spec/plan
   - Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
   - Reference spec section `[Spec §X.Y]` when checking existing requirements
   - Use `[Gap]` marker when checking for missing requirements

   **EXAMPLES BY QUALITY DIMENSION**:

   Completeness:
   - "Are error handling requirements defined for all API failure modes? [Gap]"
   - "Are accessibility requirements specified for all interactive elements? [Completeness]"
   - "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"

   Clarity:
   - "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
   - "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
   - "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"

   Consistency:
   - "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
   - "Are card component requirements consistent between landing and detail pages? [Consistency]"

   Coverage:
   - "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
   - "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
   - "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"

   Measurability:
   - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
   - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"

   **Scenario Classification & Coverage** (Requirements Quality Focus):
   - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
   - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
   - If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
   - Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"

   **Traceability Requirements**:
   - MINIMUM: ≥80% of items MUST include at least one traceability reference
   - Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
   - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"

   **Surface & Resolve Issues** (Requirements Quality Problems):
   Ask questions about the requirements themselves:
   - Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
   - Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
   - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
   - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
   - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"

   **Content Consolidation**:
   - Soft cap: If raw candidate items > 40, prioritize by risk/impact
   - Merge near-duplicates checking the same requirement aspect
   - If there are >5 low-impact edge cases, consolidate them into one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"

   **🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
   - ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
   - ❌ References to code execution, user actions, system behavior
   - ❌ "Displays correctly", "works properly", "functions as expected"
   - ❌ "Click", "navigate", "render", "load", "execute"
   - ❌ Test cases, test plans, QA procedures
   - ❌ Implementation details (frameworks, APIs, algorithms)

   **✅ REQUIRED PATTERNS** - These test requirements quality:
   - ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
   - ✅ "Is [vague term] quantified/clarified with specific criteria?"
   - ✅ "Are requirements consistent between [section A] and [section B]?"
   - ✅ "Can [requirement] be objectively measured/verified?"
   - ✅ "Are [edge cases/scenarios] addressed in requirements?"
   - ✅ "Does the spec define [missing aspect]?"

6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for the title, meta section, category headings, and ID formatting. If the template is unavailable, fall back to: an H1 title, purpose/created meta lines, and `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.

7. **Report**: Output the full path to the checklist file, the item count, and whether the run created a new file or appended to an existing one. Also summarize:
   - Focus areas selected
   - Depth level
   - Actor/timing
   - Any explicit user-specified must-have items incorporated

**Important**: Each `/speckit.checklist` command invocation uses a short, descriptive checklist filename and either creates a new file or appends to an existing one. This allows:

- Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`)
- Simple, memorable filenames that indicate checklist purpose
- Easy identification and navigation in the `checklists/` folder

To avoid clutter, use descriptive types and clean up obsolete checklists when done.
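The create-or-append numbering behavior described in step 5 can be sketched as (a minimal sketch; the real command also honors the template structure):

```python
import re
from pathlib import Path

def next_chk_id(checklist: Path) -> int:
    """Number for the next item: continue from the highest existing
    CHK ID, or start at 1 for a new file."""
    if not checklist.exists():
        return 1
    ids = [int(m) for m in re.findall(r"CHK(\d{3})", checklist.read_text())]
    return max(ids, default=0) + 1

def append_items(checklist: Path, category: str, items: list) -> None:
    """Append new items under a category heading, never deleting or
    replacing existing checklist content."""
    start = next_chk_id(checklist)
    lines = [f"## {category}"] + [
        f"- [ ] CHK{start + i:03d} {item}" for i, item in enumerate(items)]
    with checklist.open("a") as fh:
        fh.write("\n" + "\n".join(lines) + "\n")
```

So if `ux.md` ends at CHK015, the next invocation appends starting at CHK016.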

## Example Checklist Types & Sample Items

**UX Requirements Quality:** `ux.md`

Sample items (testing the requirements, NOT the implementation):

- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
- "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
- "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"

**API Requirements Quality:** `api.md`

Sample items:

- "Are error response formats specified for all failure scenarios? [Completeness]"
- "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
- "Are authentication requirements consistent across all endpoints? [Consistency]"
- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
- "Is versioning strategy documented in requirements? [Gap]"

**Performance Requirements Quality:** `performance.md`

Sample items:

- "Are performance requirements quantified with specific metrics? [Clarity]"
- "Are performance targets defined for all critical user journeys? [Coverage]"
- "Are performance requirements under different load conditions specified? [Completeness]"
- "Can performance requirements be objectively measured? [Measurability]"
- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"

**Security Requirements Quality:** `security.md`

Sample items:

- "Are authentication requirements specified for all protected resources? [Coverage]"
- "Are data protection requirements defined for sensitive information? [Completeness]"
- "Is the threat model documented and requirements aligned to it? [Traceability]"
- "Are security requirements consistent with compliance obligations? [Consistency]"
- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"

## Anti-Examples: What NOT To Do

**❌ WRONG - These test implementation, not requirements:**

```markdown
- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
```

**✅ CORRECT - These test requirements quality:**

```markdown
- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
- [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
```

**Key Differences:**

- Wrong: Tests if the system works correctly
- Correct: Tests if the requirements are written correctly
- Wrong: Verification of behavior
- Correct: Validation of requirement quality
- Wrong: "Does it do X?"
- Correct: "Is X clearly specified?"
diff --git a/.claude/commands/speckit.clarify.md b/.claude/commands/speckit.clarify.md
new file mode 100644
index 0000000..657439f
--- /dev/null
+++ b/.claude/commands/speckit.clarify.md
@@ -0,0 +1,181 @@
---
description: Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec.
handoffs: 
  - label: Build Technical Plan
    agent: speckit.plan
    prompt: Create a plan for the spec. I am building with...
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Outline

Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.

Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit.plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.

Execution steps:

1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
   - `FEATURE_DIR`
   - `FEATURE_SPEC`
   - (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
   - If JSON parsing fails, abort and instruct user to re-run `/speckit.specify` or verify feature branch environment.
   - For single quotes in arguments such as "I'm Groot", use the escape syntax 'I'\''m Groot' (or double-quote the argument if possible: "I'm Groot").

2. Load the current spec file. Perform a structured ambiguity & coverage scan using the taxonomy below. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output the raw map unless no questions will be asked).

   Functional Scope & Behavior:
   - Core user goals & success criteria
   - Explicit out-of-scope declarations
   - User roles / personas differentiation

   Domain & Data Model:
   - Entities, attributes, relationships
   - Identity & uniqueness rules
   - Lifecycle/state transitions
   - Data volume / scale assumptions

   Interaction & UX Flow:
   - Critical user journeys / sequences
   - Error/empty/loading states
   - Accessibility or localization notes

   Non-Functional Quality Attributes:
   - Performance (latency, throughput targets)
   - Scalability (horizontal/vertical, limits)
   - Reliability & availability (uptime, recovery expectations)
   - Observability (logging, metrics, tracing signals)
   - Security & privacy (authN/Z, data protection, threat assumptions)
   - Compliance / regulatory constraints (if any)

   Integration & External Dependencies:
   - External services/APIs and failure modes
   - Data import/export formats
   - Protocol/versioning assumptions

   Edge Cases & Failure Handling:
   - Negative scenarios
   - Rate limiting / throttling
   - Conflict resolution (e.g., concurrent edits)

   Constraints & Tradeoffs:
   - Technical constraints (language, storage, hosting)
   - Explicit tradeoffs or rejected alternatives

   Terminology & Consistency:
   - Canonical glossary terms
   - Avoided synonyms / deprecated terms

   Completion Signals:
   - Acceptance criteria testability
   - Measurable Definition of Done style indicators

   Misc / Placeholders:
   - TODO markers / unresolved decisions
   - Ambiguous adjectives ("robust", "intuitive") lacking quantification

   For each category with Partial or Missing status, add a candidate question opportunity unless:
   - Clarification would not materially change implementation or validation strategy
   - Information is better deferred to planning phase (note internally)

3. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints:
    - Maximum of 5 total questions across the whole session.
    - Each question must be answerable with EITHER:
       - A short multiple‑choice selection (2–5 distinct, mutually exclusive options), OR
       - A one-word / short‑phrase answer (explicitly constrain: "Answer in <=5 words").
    - Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation.
    - Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved.
    - Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness).
    - Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests.
    - If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic.

4. Sequential questioning loop (interactive):
    - Present EXACTLY ONE question at a time.
    - For multiple‑choice questions:
       - **Analyze all options** and determine the **most suitable option** based on:
          - Best practices for the project type
          - Common patterns in similar implementations
          - Risk reduction (security, performance, maintainability)
          - Alignment with any explicit project goals or constraints visible in the spec
       - Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice).
       - Format as: `**Recommended:** Option [X] - <reasoning>`
       - Then render all options as a Markdown table:

       | Option | Description |
       |--------|-------------|
       | A | <Option A description> |
       | B | <Option B description> |
       | C | <Option C description> (add D/E as needed up to 5) |
       | Short | Provide a different short answer (<=5 words) (Include only if free-form alternative is appropriate) |

       - After the table, add: `You can reply with the option letter (e.g., "A"), accept the recommendation by saying "yes" or "recommended", or provide your own short answer.`
    - For short‑answer style (no meaningful discrete options):
       - Provide your **suggested answer** based on best practices and context.
       - Format as: `**Suggested:** <your proposed answer> - <brief reasoning>`
       - Then output: `Format: Short answer (<=5 words). You can accept the suggestion by saying "yes" or "suggested", or provide your own answer.`
    - After the user answers:
       - If the user replies with "yes", "recommended", or "suggested", use your previously stated recommendation/suggestion as the answer.
       - Otherwise, validate the answer maps to one option or fits the <=5 word constraint.
       - If ambiguous, ask for a quick disambiguation (count still belongs to same question; do not advance).
       - Once satisfactory, record it in working memory (do not yet write to disk) and move to the next queued question.
    - Stop asking further questions when:
       - All critical ambiguities resolved early (remaining queued items become unnecessary), OR
       - User signals completion ("done", "good", "no more"), OR
       - You reach 5 asked questions.
    - Never reveal future queued questions in advance.
    - If no valid questions exist at start, immediately report no critical ambiguities.

5. Integration after EACH accepted answer (incremental update approach):
    - Maintain in-memory representation of the spec (loaded once at start) plus the raw file contents.
    - For the first integrated answer in this session:
       - Ensure a `## Clarifications` section exists (if missing, create it just after the highest-level contextual/overview section per the spec template).
       - Under it, create (if not present) a `### Session YYYY-MM-DD` subheading for today.
    - Append a bullet line immediately after acceptance: `- Q: <question> → A: <final answer>`.
    - Then immediately apply the clarification to the most appropriate section(s):
       - Functional ambiguity → Update or add a bullet in Functional Requirements.
       - User interaction / actor distinction → Update User Stories or Actors subsection (if present) with clarified role, constraint, or scenario.
       - Data shape / entities → Update Data Model (add fields, types, relationships) preserving ordering; note added constraints succinctly.
       - Non-functional constraint → Add/modify measurable criteria in Non-Functional / Quality Attributes section (convert vague adjective to metric or explicit target).
       - Edge case / negative flow → Add a new bullet under Edge Cases / Error Handling (or create such subsection if template provides placeholder for it).
       - Terminology conflict → Normalize term across spec; retain original only if necessary by adding `(formerly referred to as "X")` once.
    - If the clarification invalidates an earlier ambiguous statement, replace that statement instead of duplicating; leave no obsolete contradictory text.
    - Save the spec file AFTER each integration to minimize risk of context loss (atomic overwrite).
    - Preserve formatting: do not reorder unrelated sections; keep heading hierarchy intact.
    - Keep each inserted clarification minimal and testable (avoid narrative drift).
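
The incremental update in step 5 can be sketched as a small helper (illustrative only; the real command places `## Clarifications` just after the overview section of the spec template, while this sketch simply appends it at the end when missing):

```python
from datetime import date

def record_clarification(spec_text: str, question: str, answer: str) -> str:
    """Append a Q/A bullet under today's session in ## Clarifications,
    creating the section and session subheading if they are missing."""
    session = f"### Session {date.today().isoformat()}"
    bullet = f"- Q: {question} → A: {answer}"
    lines = spec_text.splitlines()
    if "## Clarifications" not in lines:
        # Simplification: append the section at the end of the file.
        lines += ["", "## Clarifications", "", session]
    elif session not in lines:
        idx = lines.index("## Clarifications")
        lines.insert(idx + 1, session)
        lines.insert(idx + 1, "")
    # Insert the bullet right after the session heading (newest first here).
    lines.insert(lines.index(session) + 1, bullet)
    return "\n".join(lines)
```

The spec file would be rewritten atomically after each call, matching the "save after each integration" rule.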

6. Validation (performed after EACH write plus final pass):
   - Clarifications session contains exactly one bullet per accepted answer (no duplicates).
   - Total asked (accepted) questions ≤ 5.
   - Updated sections contain no lingering vague placeholders the new answer was meant to resolve.
   - No contradictory earlier statement remains (scan for now-invalid alternative choices removed).
   - Markdown structure valid; only allowed new headings: `## Clarifications`, `### Session YYYY-MM-DD`.
   - Terminology consistency: same canonical term used across all updated sections.
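
A hedged sketch of these structural checks (the regexes and the exact bullet shape are assumptions based on the `- Q: … → A: …` convention above):

```python
import re

def validate_spec(spec_text: str, accepted_answers: int) -> list[str]:
    """Return a list of validation failures (empty list means the spec passes)."""
    errors = []
    # Exactly one bullet per accepted answer, no duplicates.
    bullets = re.findall(r"^- Q: .+ → A: .+$", spec_text, flags=re.M)
    if len(bullets) != accepted_answers:
        errors.append(f"expected {accepted_answers} Q/A bullets, found {len(bullets)}")
    if len(bullets) != len(set(bullets)):
        errors.append("duplicate Q/A bullets in Clarifications")
    if accepted_answers > 5:
        errors.append("more than 5 questions asked")
    # Session headings must follow the allowed YYYY-MM-DD form.
    for h in re.findall(r"^#{2,3} .+$", spec_text, flags=re.M):
        if h.startswith("### Session") and not re.fullmatch(
            r"### Session \d{4}-\d{2}-\d{2}", h
        ):
            errors.append(f"malformed session heading: {h}")
    return errors
```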

7. Write the updated spec back to `FEATURE_SPEC`.

8. Report completion (after questioning loop ends or early termination):
   - Number of questions asked & answered.
   - Path to updated spec.
   - Sections touched (list names).
   - Coverage summary table listing each taxonomy category with Status:
     - Resolved (was Partial/Missing and addressed)
     - Deferred (exceeds question quota or better suited for planning)
     - Clear (already sufficient)
     - Outstanding (still Partial/Missing but low impact)
   - If any Outstanding or Deferred remain, recommend whether to proceed to `/speckit.plan` or run `/speckit.clarify` again later post-plan.
   - Suggested next command.

Behavior rules:

- If no meaningful ambiguities found (or all potential questions would be low-impact), respond: "No critical ambiguities detected worth formal clarification." and suggest proceeding.
- If spec file missing, instruct user to run `/speckit.specify` first (do not create a new spec here).
- Never exceed 5 total asked questions (clarification retries for a single question do not count as new questions).
- Avoid speculative tech stack questions unless the absence blocks functional clarity.
- Respect user early termination signals ("stop", "done", "proceed").
- If no questions asked due to full coverage, output a compact coverage summary (all categories Clear) then suggest advancing.
- If quota reached with unresolved high-impact categories remaining, explicitly flag them under Deferred with rationale.

Context for prioritization: $ARGUMENTS
diff --git a/.claude/commands/speckit.constitution.md b/.claude/commands/speckit.constitution.md
new file mode 100644
index 0000000..63d4f66
--- /dev/null
+++ b/.claude/commands/speckit.constitution.md
@@ -0,0 +1,84 @@
---
description: Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync.
handoffs:
  - label: Build Specification
    agent: speckit.specify
    prompt: Implement the feature specification based on the updated constitution. I want to build...
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Outline

You are updating the project constitution at `.specify/memory/constitution.md`. This file is a TEMPLATE containing placeholder tokens in square brackets (e.g. `[PROJECT_NAME]`, `[PRINCIPLE_1_NAME]`). Your job is to (a) collect/derive concrete values, (b) fill the template precisely, and (c) propagate any amendments across dependent artifacts.

**Note**: If `.specify/memory/constitution.md` does not exist yet, it should have been initialized from `.specify/templates/constitution-template.md` during project setup. If it's missing, copy the template first.

Follow this execution flow:

1. Load the existing constitution at `.specify/memory/constitution.md`.
   - Identify every placeholder token of the form `[ALL_CAPS_IDENTIFIER]`.
   **IMPORTANT**: The user might require fewer or more principles than the template provides. If a number is specified, respect it while following the general template structure, and update the document accordingly.

2. Collect/derive values for placeholders:
   - If user input (conversation) supplies a value, use it.
   - Otherwise infer from existing repo context (README, docs, prior constitution versions if embedded).
   - For governance dates: `RATIFICATION_DATE` is the original adoption date (if unknown ask or mark TODO), `LAST_AMENDED_DATE` is today if changes are made, otherwise keep previous.
   - `CONSTITUTION_VERSION` must increment according to semantic versioning rules:
     - MAJOR: Backward incompatible governance/principle removals or redefinitions.
     - MINOR: New principle/section added or materially expanded guidance.
     - PATCH: Clarifications, wording, typo fixes, non-semantic refinements.
   - If the version bump type is ambiguous, propose your reasoning before finalizing.
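
The bump rules can be sketched as a small decision helper (the change-type labels are illustrative, not part of the command's vocabulary):

```python
def bump_version(current: str, change: str) -> str:
    """Apply the constitution's semantic-versioning rules.
    change: 'removal'/'redefinition' -> MAJOR,
            'new-principle'/'expanded' -> MINOR,
            anything else (wording, typos) -> PATCH."""
    major, minor, patch = map(int, current.split("."))
    if change in ("removal", "redefinition"):
        return f"{major + 1}.0.0"
    if change in ("new-principle", "expanded"):
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, adding a new observability principle to a `2.1.1` constitution would yield `2.2.0`.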

3. Draft the updated constitution content:
   - Replace every placeholder with concrete text (no bracketed tokens left except intentionally retained template slots that the project has chosen not to define yet—explicitly justify any left).
   - Preserve the heading hierarchy; comments can be removed once their placeholders are replaced, unless they still add clarifying guidance.
   - Ensure each Principle section: succinct name line, paragraph (or bullet list) capturing non‑negotiable rules, explicit rationale if not obvious.
   - Ensure Governance section lists amendment procedure, versioning policy, and compliance review expectations.

4. Consistency propagation checklist (convert prior checklist into active validations):
   - Read `.specify/templates/plan-template.md` and ensure any "Constitution Check" or rules align with updated principles.
   - Read `.specify/templates/spec-template.md` for scope/requirements alignment—update if constitution adds/removes mandatory sections or constraints.
   - Read `.specify/templates/tasks-template.md` and ensure task categorization reflects new or removed principle-driven task types (e.g., observability, versioning, testing discipline).
   - Read each command file in `.specify/templates/commands/*.md` (including this one) to verify no outdated references (e.g., agent-specific names like CLAUDE) remain where generic guidance is required.
   - Read any runtime guidance docs (e.g., `README.md`, `docs/quickstart.md`, or agent-specific guidance files if present). Update references to principles changed.

5. Produce a Sync Impact Report (prepend as an HTML comment at top of the constitution file after update):
   - Version change: old → new
   - List of modified principles (old title → new title if renamed)
   - Added sections
   - Removed sections
   - Templates requiring updates (✅ updated / ⚠ pending) with file paths
   - Follow-up TODOs if any placeholders intentionally deferred.

6. Validation before final output:
   - No remaining unexplained bracket tokens.
   - Version line matches report.
   - Dates ISO format YYYY-MM-DD.
   - Principles are declarative, testable, and free of vague language ("should" → replace with MUST/SHOULD rationale where appropriate).
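
A minimal sketch of the pre-output validation, assuming the template marks dates with `**Ratified**:` / `**Last Amended**:` labels (an assumption; adjust the patterns to the actual constitution template):

```python
import re

def constitution_issues(text: str) -> list[str]:
    """Flag unfilled placeholder tokens and non-ISO dates before final output."""
    issues = []
    # Unexplained bracket tokens like [PROJECT_NAME] must all be filled.
    for tok in re.findall(r"\[[A-Z_]+\]", text):
        issues.append(f"unfilled placeholder {tok}")
    # Governance dates must be ISO YYYY-MM-DD.
    for d in re.findall(r"\*\*(?:Ratified|Last Amended)\*\*: *(\S+)", text):
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", d):
            issues.append(f"non-ISO date {d}")
    return issues
```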

7. Write the completed constitution back to `.specify/memory/constitution.md` (overwrite).

8. Output a final summary to the user with:
   - New version and bump rationale.
   - Any files flagged for manual follow-up.
   - Suggested commit message (e.g., `docs: amend constitution to vX.Y.Z (principle additions + governance update)`).

Formatting & Style Requirements:

- Use Markdown headings exactly as in the template (do not demote/promote levels).
- Wrap long rationale lines to keep readability (<100 chars ideally) but do not hard enforce with awkward breaks.
- Keep a single blank line between sections.
- Avoid trailing whitespace.

If the user supplies partial updates (e.g., only one principle revision), still perform validation and version decision steps.

If critical info missing (e.g., ratification date truly unknown), insert `TODO(<FIELD_NAME>): explanation` and include in the Sync Impact Report under deferred items.

Do not create a new template; always operate on the existing `.specify/memory/constitution.md` file.
diff --git a/.claude/commands/speckit.fleet.md b/.claude/commands/speckit.fleet.md
new file mode 100644
index 0000000..eca5d6b
--- /dev/null
+++ b/.claude/commands/speckit.fleet.md
@@ -0,0 +1,505 @@
---
description: 'Orchestrate a full feature lifecycle through all SpecKit phases with
  human-in-the-loop checkpoints: specify -> clarify -> plan -> checklist -> tasks
  -> analyze -> cross-model review -> implement -> verify -> tests. Detects partially
  complete features and resumes from the right phase.'
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
agents:
- speckit.specify
- speckit.clarify
- speckit.plan
- speckit.checklist
- speckit.tasks
- speckit.analyze
- speckit.fleet.review
- speckit.implement
- speckit.verify
user-invocable: true
disable-model-invocation: true
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty). Classify the input:

1. **Feature description** (e.g., "Build a capability browser that lets users..."): Store as `FEATURE_DESCRIPTION`. This will be passed verbatim to `speckit.specify` in Phase 1. Skip artifact detection if no `FEATURE_DIR` is found -- go straight to Phase 1.
2. **Phase override** (e.g., "resume at Phase 5" or "start from plan"): Override the auto-detected resume point.
3. **Empty**: Run artifact detection and resume from the detected phase.

---

You are the **SpecKit Fleet Orchestrator** -- a workflow conductor that drives a feature from idea to implementation by delegating to specialized SpecKit agents in order, with human approval at every checkpoint.

## Workflow Phases

| Phase | Agent | Artifact Signal | Gate |
|-------|-------|-----------------|------|
| 1. Specify | `speckit.specify` | `spec.md` exists in FEATURE_DIR | User approves spec |
| 2. Clarify | `speckit.clarify` | `spec.md` contains a `## Clarifications` section | User says "done" or requests another round |
| 3. Plan | `speckit.plan` | `plan.md` exists in FEATURE_DIR | User approves plan |
| 4. Checklist | `speckit.checklist` | `checklists/` directory exists and contains at least one file | User approves checklist |
| 5. Tasks | `speckit.tasks` | `tasks.md` exists in FEATURE_DIR | User approves tasks |
| 6. Analyze | `speckit.analyze` | `.analyze-done` marker exists in FEATURE_DIR | User acknowledges analysis |
| 7. Review | `speckit.fleet.review` | `review.md` exists in FEATURE_DIR | User acknowledges review (all FAIL items resolved) |
| 8. Implement | `speckit.implement` | ALL task checkboxes in tasks.md are `[x]` (none `[ ]`) | Implementation complete |
| 9. Verify | `speckit.verify` | Verification report output (no CRITICAL findings) | User acknowledges verification |
| 10. Tests | Terminal | Tests pass | Tests pass |

## Operating Rules

1. **One phase at a time.** Never skip ahead or run phases in parallel.
2. **Human gate after every phase.** After each agent completes, summarize the outcome and ask the user to:
   - **Approve** -> proceed to the next phase
   - **Revise** -> re-run the same phase with user feedback
   - **Skip** -> mark phase as skipped and move on (user must confirm)
   - **Abort** -> stop the workflow entirely
   - **Rollback** -> jump back to an earlier phase (see Phase Rollback below)
3. **Clarify is repeatable.** After Phase 2, ask: *"Run another clarification round, or move on to planning?"* Loop until the user says done.
4. **Track progress.** Use the todo tool to create and update a checklist of all 10 phases so the user always sees where they are.
5. **Pass context forward.** When delegating, include the feature description and any user-provided refinements so each agent has full context.
6. **Suppress sub-agent handoffs.** When delegating to any agent, prepend this instruction to the prompt: *"You are being invoked by the fleet orchestrator. Do NOT follow handoffs or auto-forward to other agents. Return your output to the orchestrator and stop."* This prevents `send: true` handoff chains (e.g., plan -> tasks -> analyze -> implement) from bypassing fleet's human gates.
7. **Verify phase.** After implementation, run `speckit.verify` to validate code against spec artifacts. Requires the verify extension (see Phase 9).
8. **Test phase.** After verification, detect the project's test runner(s) and run tests. See Phase 10 for detection logic.
9. **Git checkpoint commits.** After these phases complete, offer to create a WIP commit to safeguard progress:
   - After Phase 5 (Tasks) -- all design artifacts are finalized
   - After Phase 8 (Implement) -- all code is written
   - After Phase 9 (Verify) -- code is validated
   Commit message format: `wip: fleet phase {N} -- {phase name} complete`
   Always ask before committing -- never auto-commit. If the user declines, continue without committing.
10. **Context budget awareness.** Long-running fleet sessions can exhaust the model's context window. Monitor for these signs:
    - Responses becoming shorter or losing earlier context
    - Reaching Phase 8+ in a session that started from Phase 1
    At natural checkpoints (after git commits or between phases), if context pressure seems high, suggest: *"This is getting long. We can continue in a new chat -- the fleet will auto-detect progress and resume at Phase {N}."*

## Parallel Subagent Execution (Plan & Implement Phases)

During **Phase 3 (Plan)** and **Phase 8 (Implement)**, the orchestrator may dispatch **up to 3 subagents in parallel** when work items are independent. This is governed by the `[P]` (parallelizable) marker system already used in tasks.md.

### How Parallelism Works

1. **Tasks agent embeds the plan.** During Phase 5 (Tasks), the tasks agent marks tasks with `[P]` when they touch different files and have no dependency on incomplete tasks. Tasks within the same phase that share `[P]` markers form a **parallel group**.

2. **Fleet orchestrator fans out.** When executing Plan or Implement, the orchestrator:
   - Reads the current phase's task list from tasks.md
   - Identifies `[P]`-marked tasks that form an independent group (no shared files, no ordering dependency)
   - Dispatches up to **3 subagents simultaneously** for the group
   - Waits for all dispatched agents to complete before moving to the next group or sequential task
   - If any parallel task fails, halts the batch and reports the failure before continuing

3. **Parallelism constraints:**
   - **Max concurrency: 3** -- never dispatch more than 3 subagents at once
   - **Same-file exclusion** -- tasks touching the same file MUST run sequentially even if both are `[P]`
   - **Phase boundaries are serial** -- all tasks in Phase N must complete before Phase N+1 begins
   - **Human gate still applies** -- after each implementation phase completes (all groups done), summarize and checkpoint with the user before the next phase

### Parallel Groups in tasks.md

The tasks agent should organize `[P]` tasks into explicit parallel groups using comments in tasks.md:

```markdown
### Phase 1: Setup

<!-- parallel-group: 1 (max 3 concurrent) -->
- [ ] T002 [P] Create CapabilityManifest.cs in Models/Generation/
- [ ] T003 [P] Create DocumentIndex.cs in Models/Generation/
- [ ] T004 [P] Create ResolvedContext.cs in Models/Generation/

<!-- parallel-group: 2 (max 3 concurrent) -->
- [ ] T005 [P] Create GenerationResult.cs in Models/Generation/
- [ ] T006 [P] Create BatchGenerationJob.cs in Models/Generation/
- [ ] T007 [P] Create SchemaExport.cs in Models/Generation/

<!-- sequential -->
- [ ] T013 Create generation.ts with all TypeScript interfaces
```
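
A sketch of how an orchestrator might parse these comments into dispatch batches (illustrative; the real orchestrator reads tasks.md with its tools rather than running code). Each parallel group becomes one batch of up to 3 concurrent tasks; sequential tasks become single-task batches:

```python
import re

def parse_groups(tasks_md: str) -> list[list[str]]:
    """Split unchecked tasks into dispatch batches from parallel-group comments."""
    batches, current, parallel = [], [], False
    for line in tasks_md.splitlines():
        stripped = line.strip()
        if stripped.startswith("<!-- parallel-group:"):
            if current:
                batches.append(current)
            current, parallel = [], True
        elif stripped == "<!-- sequential -->":
            if current:
                batches.append(current)
            current, parallel = [], False
        elif m := re.match(r"- \[ \] (T\d+)", stripped):
            if parallel:
                current.append(m.group(1))  # joins the open parallel batch
            else:
                batches.append([m.group(1)])  # sequential: batch of one
    if current:
        batches.append(current)
    return batches
```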

### Plan Phase Parallelism

During Phase 3 (Plan), the plan agent's Phase 0 (Research) can dispatch up to 3 research sub-tasks in parallel:
- Each `NEEDS CLARIFICATION` item or technology best-practice lookup is an independent research task
- Fan out up to 3 at a time, consolidate results into research.md
- Phase 1 (Design) artifacts -- data-model.md, contracts/, quickstart.md -- can be generated in parallel if they don't depend on each other's output

### Implement Phase Parallelism

During Phase 8 (Implement), for each implementation phase in tasks.md:
1. Read the phase and identify parallel groups (marked with `<!-- parallel-group: N -->` comments)
2. For each group, dispatch up to 3 `speckit.implement` subagents simultaneously, each given a specific subset of tasks
3. When all tasks in a group complete, move to the next group or sequential task
4. After the entire phase completes, checkpoint with the user before proceeding to the next phase

### Instructions for Tasks Agent

When the fleet orchestrator delegates to `speckit.tasks`, append this instruction:

> "Organize [P]-marked tasks into explicit parallel groups using `<!-- parallel-group: N -->` HTML comments. Each group should contain up to 3 tasks that can execute concurrently (different files, no dependencies). Add `<!-- sequential -->` before tasks that must run in order. This enables the fleet orchestrator to fan out up to 3 subagents per group during implementation."

## First-Turn Behavior -- Artifact Detection & Resume

On **every** invocation, before doing anything else, run artifact detection to determine where the workflow stands. This allows the orchestrator to resume mid-flight even in a fresh conversation.

### Step 0: Branch safety pre-flight

Before anything else, run basic git health checks:

1. **Uncommitted changes**: Run `git status --porcelain`. If there are uncommitted changes, warn the user:
   > WARNING: You have uncommitted changes. Starting the fleet may create conflicts. Commit or stash first?
   > - **Continue** -- proceed with uncommitted changes (risky)
   > - **Stash** -- run `git stash` and continue
   > - **Abort** -- stop and let the user handle it

2. **Detached HEAD**: Run `git branch --show-current`. If empty (detached HEAD), abort:
   > Cannot run fleet on a detached HEAD. Please check out a feature branch first.

3. **Branch freshness** (advisory): Run `git log --oneline HEAD..origin/main 2>/dev/null | wc -l`. If the main branch has commits not in the current branch, advise:
   > Your branch is {N} commits behind main. Consider rebasing before starting implementation to avoid merge conflicts later.

This check runs only once on first invocation. It does NOT block the workflow (except for detached HEAD).
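
A sketch of the three checks, assuming plain `git` on PATH (the commands are exactly those listed above):

```python
import subprocess

def git_preflight() -> dict:
    """Run the branch-safety pre-flight; only a detached HEAD is blocking."""
    def run(*args):
        return subprocess.run(
            ["git", *args], capture_output=True, text=True
        ).stdout.strip()

    dirty = bool(run("status", "--porcelain"))
    branch = run("branch", "--show-current")
    behind_raw = run("log", "--oneline", "HEAD..origin/main")
    behind = len(behind_raw.splitlines()) if behind_raw else 0
    return {
        "dirty": dirty,            # warn: offer continue / stash / abort
        "detached": branch == "",  # abort: fleet needs a feature branch
        "behind_main": behind,     # advise a rebase if > 0
    }
```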

### Step 1: Discover the feature directory

Run `{SCRIPT}` from the repo root to get the feature directory paths as JSON. Parse the output to get `FEATURE_DIR`.

If the script fails (e.g., not on a feature branch):
- If `FEATURE_DESCRIPTION` was provided in `$ARGUMENTS`, proceed directly to Phase 1 -- pass the description to `speckit.specify` and it will create the feature directory.
- If `$ARGUMENTS` is empty, ask the user for the feature description, then start Phase 1.

### Step 2: Check model configuration

Check if `{FEATURE_DIR}/../../../.specify/extensions/fleet/fleet-config.yml` (or the project's config location) has model settings. If the config file doesn't exist or models are set to defaults:

1. **Detect the platform**: Identify which IDE/agent platform you're running in (VS Code Copilot, Claude Code, Cursor, etc.) based on available context.

2. **Primary model**: If `models.primary` is `"auto"`, use whatever model you are currently running as. No action needed -- you ARE the primary model.

3. **Review model**: If `models.review` is `"ask"`, prompt the user:
   > **Model setup (one-time):** The cross-model review (Phase 7) works best with a *different* model than the one running the fleet, to catch blind spots.
   >
   > What model should I use for the review phase? Suggestions:
   > - A different model family (e.g., if you're on Claude, use GPT or Gemini)
   > - A different tier (e.g., if you're on Opus, use Sonnet)
   > - "skip" to skip Phase 7 entirely
   >
   > You can also set this permanently in your fleet config.

4. **Store the choice**: Remember the user's model selection for the duration of this conversation. If they want to persist it, suggest editing the config file.

### Step 3: Probe artifacts in FEATURE_DIR

Check these paths **in order** using the `read` tool. Each check is a file/directory existence AND basic integrity test:

| Check | Path | Existence | Integrity |
|-------|------|-----------|-----------|
| spec.md | `{FEATURE_DIR}/spec.md` | File exists? | Has `## User Stories` or `## Requirements` section? File > 100 bytes? |
| Clarifications | `{FEATURE_DIR}/spec.md` | Contains `## Clarifications` heading? | At least one Q&A pair present? |
| plan.md | `{FEATURE_DIR}/plan.md` | File exists? | Has `## Architecture` or `## Tech Stack` section? File > 200 bytes? |
| checklists/ | `{FEATURE_DIR}/checklists/` | Directory exists and has >=1 file? | Each file > 50 bytes? |
| tasks.md | `{FEATURE_DIR}/tasks.md` | File exists? | Contains at least one `- [ ]` or `- [x]` item? Has `### Phase` heading? |
| .analyze-done | `{FEATURE_DIR}/.analyze-done` | Marker file exists? | -- |
| review.md | `{FEATURE_DIR}/review.md` | File exists? | Contains `## Summary` and verdict table? |
| Implementation | `{FEATURE_DIR}/tasks.md` | All `- [x]`, zero `- [ ]` remaining? | -- |
| Verify extension | `.specify/extensions/verify/extension.yml` | File exists? | -- |
| Verification | `{FEATURE_DIR}/.verify-done` | Marker file exists? | -- |

**Integrity failures are advisory, not blocking.** If a file exists but fails integrity checks, warn the user:
> WARNING: `plan.md` exists but appears incomplete (missing expected sections). It may have been partially generated. Re-run Phase 3 (Plan), or continue with the current file?
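
The probe table can be sketched as a single function returning advisory booleans (key names are illustrative; size and section thresholds mirror the table):

```python
from pathlib import Path

def probe(feature_dir: str) -> dict:
    """Existence + basic integrity checks for the Step 3 table (advisory only)."""
    d = Path(feature_dir)
    spec = d / "spec.md"
    spec_text = spec.read_text() if spec.is_file() else ""
    tasks = d / "tasks.md"
    tasks_text = tasks.read_text() if tasks.is_file() else ""
    checklists = d / "checklists"
    return {
        "spec": spec.is_file() and spec.stat().st_size > 100,
        "clarified": "## Clarifications" in spec_text,
        "plan": (d / "plan.md").is_file(),
        "checklists": checklists.is_dir() and any(checklists.glob("*")),
        "tasks": "- [ ]" in tasks_text or "- [x]" in tasks_text,
        "analyzed": (d / ".analyze-done").is_file(),
        "reviewed": (d / "review.md").is_file(),
        "implemented": bool(tasks_text) and "- [ ]" not in tasks_text,
        "verified": (d / ".verify-done").is_file(),
    }
```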

### Step 4: Determine the resume phase

Walk the artifact signals **top-down**. The first phase whose artifact is **missing** is where work resumes:

```
if spec.md missing           -> resume at Phase 1 (Specify)
if no ## Clarifications      -> resume at Phase 2 (Clarify)
if plan.md missing           -> resume at Phase 3 (Plan)
if checklists/ empty/missing -> resume at Phase 4 (Checklist)
if tasks.md missing          -> resume at Phase 5 (Tasks)
if .analyze-done missing     -> resume at Phase 6 (Analyze)
if review.md missing         -> resume at Phase 7 (Review)
if tasks.md has `- [ ]`      -> resume at Phase 8 (Implement)
if .verify-done missing      -> resume at Phase 9 (Verify)
if all done                  -> resume at Phase 10 (Tests)
```
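
The same walk as a function over artifact signals (keys are illustrative booleans; the first missing signal names the resume phase):

```python
def resume_phase(p: dict) -> int:
    """Walk the artifact signals top-down; return the phase to resume at."""
    order = [
        (1, "spec"), (2, "clarified"), (3, "plan"), (4, "checklists"),
        (5, "tasks"), (6, "analyzed"), (7, "reviewed"),
        (8, "implemented"), (9, "verified"),
    ]
    for phase, signal in order:
        if not p.get(signal, False):
            return phase
    return 10  # everything present: run the tests
```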

### Step 5: Present status and confirm

Show the user a status table and the detected resume point:

```
Feature: {branch name}
Directory: {FEATURE_DIR}

Phase 1 Specify      [x] spec.md found
Phase 2 Clarify      [x] ## Clarifications present
Phase 3 Plan         [x] plan.md found
Phase 4 Checklist    [x] checklists/ has 2 files
Phase 5 Tasks        [x] tasks.md found
Phase 6 Analyze      [ ] .analyze-done not found
Phase 7 Review       [ ] --
Phase 8 Implement    [ ] --
Phase 9 Verify       [ ] --
Phase 10 Tests       [ ] --

> Resuming at Phase 6: Analyze
```

Then ask: *"Detected progress above. Resume at Phase {N} ({name}), or override to a different phase?"*

- If user confirms -> create the todo list with completed phases marked as `completed` and resume from Phase N.
- If user provides a phase number or name -> start from that phase instead.
- If FEATURE_DIR doesn't exist -> start from Phase 1, ask for the feature description.

### Edge Cases

- **Implementation partially complete**: If `tasks.md` exists and has a mix of `[x]` and `[ ]`, resume at Phase 8 (Implement). Tell the user how many tasks remain: *"tasks.md: {done}/{total} tasks complete. {remaining} tasks remaining."*
- **Analyze completion marker**: After Phase 6 (Analyze) completes -- whether it produces `remediation.md` or not -- create a marker file `{FEATURE_DIR}/.analyze-done` containing the timestamp. This distinguishes "analyze ran clean" from "analyze never ran." The `.analyze-done` file is the artifact signal for Phase 6, not `remediation.md`.
- **Review can be skipped**: If user opts to skip cross-model review, treat Phase 7 as skipped and proceed to Phase 8.
- **Review found NO failures**: If `review.md` exists and overall verdict is "READY", Phase 7 is complete -- proceed to Phase 8.
- **Review found FAIL items**: If `review.md` has FAIL verdicts, present them and ask user whether to (a) fix the issues by re-running the relevant earlier phase, (b) proceed anyway, or (c) abort.
- **Verify extension not installed**: If `.specify/extensions/verify/extension.yml` doesn't exist, prompt to install. If user declines, skip Phase 9.
- **Verify completion marker**: After Phase 9 (Verify) completes, create `{FEATURE_DIR}/.verify-done` with timestamp. This distinguishes "verify ran" from "verify never ran."
- **Checklists may be skipped**: Some features don't use checklists. If `tasks.md` exists but `checklists/` doesn't, treat Phase 4 as skipped.
- **Fresh branch, no specs dir**: Start from Phase 1. Use `FEATURE_DESCRIPTION` from `$ARGUMENTS` if provided; otherwise ask the user.
- **User says "start over"**: Re-run from Phase 1 regardless of existing artifacts. Warn that this will overwrite existing artifacts and get confirmation.

### Stale Artifact Detection

After determining the resume phase, check for **stale downstream artifacts** -- files generated by an earlier phase that may be outdated because an upstream artifact was modified later.

Compare file modification timestamps in this dependency chain:

```
spec.md -> plan.md -> tasks.md -> .analyze-done -> review.md -> [implementation] -> .verify-done
```

If a file is **newer** than a downstream file that depends on it (e.g., `spec.md` was modified after `plan.md`), warn the user:

> WARNING: **Stale artifact detected**: `plan.md` (modified {date}) was generated before the latest `spec.md` change ({date}). Plan may not reflect current requirements. Re-run Phase 3 (Plan) to update, or proceed with the current plan?

This is advisory only -- the user decides whether to rerun. Do not block the workflow.
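
A sketch of the timestamp comparison over the chain, skipping files that don't exist yet:

```python
from pathlib import Path

# Dependency chain: each file should be newer than the one before it.
CHAIN = ["spec.md", "plan.md", "tasks.md", ".analyze-done", "review.md", ".verify-done"]

def stale_artifacts(feature_dir: str) -> list[str]:
    """Return downstream files that are older than an upstream dependency."""
    d = Path(feature_dir)
    present = [(n, (d / n).stat().st_mtime) for n in CHAIN if (d / n).exists()]
    stale = []
    for (up, up_t), (down, down_t) in zip(present, present[1:]):
        if up_t > down_t:
            stale.append(f"{down} is older than {up}")
    return stale
```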

## Phase Execution Template

For each phase:
```
1. Mark the phase as in-progress in the todo list
2. Announce: "**Phase N: {Name}** -- delegating to {agent}..."
3. Delegate to the agent with relevant arguments:
   - Phase 1 (Specify): pass FEATURE_DESCRIPTION from $ARGUMENTS as the argument
   - Phase 2 (Clarify): pass the feature description and any user feedback
   - All other phases: pass the feature description and any user-provided refinements
4. Summarize the agent's output concisely
5. Ask: "Ready to proceed to Phase N+1 ({next name}), or would you like to revise?"
6. Wait for user response
7. Mark phase as completed when approved
```

## Phase 7: Cross-Model Review

This phase uses a **different model** than the one that generated plan.md and tasks.md, providing a fresh perspective to catch blind spots.

1. Delegate to `speckit.fleet.review` -- it runs on the **review model** configured in Step 2 (a different model than the primary) and is **read-only**
2. The review agent reads spec.md, plan.md, tasks.md, checklists/, and remediation.md
3. It evaluates 7 dimensions: spec-plan alignment, plan-tasks completeness, dependency ordering, parallelization correctness, feasibility & risk, standards compliance, implementation readiness
4. It outputs a structured review report with PASS/WARN/FAIL verdicts per dimension
5. **Save the review output** to `{FEATURE_DIR}/review.md`
6. Present the summary table to the user:
   - **All PASS / READY**: *"Cross-model review passed. Ready to implement?"*
   - **WARN items**: *"Review found {N} warnings. Proceed to implementation, or address them first?"*
   - **FAIL items**: *"Review found {N} critical issues that should be fixed before implementing."* -- list them and ask which earlier phase to re-run (plan, tasks, or analyze)
7. If user chooses to fix: loop back to the appropriate phase, then re-run review after fixes
8. If user approves: mark Phase 7 complete and proceed to Phase 8 (Implement)

**Note**: Phase 7 (Review) validates design artifacts *before* implementation. Phase 9 (Verify) validates actual code *after* implementation. Both are read-only.

## Phase 9: Post-Implementation Verification

This phase validates that the implemented code matches the specification artifacts. It requires the **verify extension**.

### Extension Installation Check

Before delegating to `speckit.verify`, check if the extension is installed:

1. Check if `.specify/extensions/verify/extension.yml` exists using the `read` tool
2. If **missing**, ask the user:
   > The verify extension is not installed. Install it now?
   > ```
   > specify extension add verify --from https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip
   > ```
3. If user approves, run the install command in the terminal
4. If user declines, skip Phase 9 and proceed to Phase 10 (Tests)

### Verification Execution

1. Delegate to `speckit.verify` -- it reads spec.md, plan.md, tasks.md, constitution.md and the implemented source files
2. It runs 7 verification checks: task completion, file existence, requirement coverage, scenario & test coverage, spec intent alignment, constitution alignment, design & structure consistency
3. It outputs a verification report with findings, metrics, and next actions
4. Present the summary to the user:
   - **No findings**: *"Verification passed. Ready to run the tests?"* -- proceed to Phase 10
   - **Findings exist**: Show the findings grouped by severity (CRITICAL, WARNING, INFO) and enter the **Implement-Verify loop** below

### Implement-Verify Loop

When verification produces findings, run a remediation loop:

```
repeat:
  1. Present findings to user
  2. Ask: "Re-run implementation to address these findings? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with findings as context, then re-run speckit.verify
     - skip  -> exit loop, proceed to Phase 10 with current state
     - abort -> stop the workflow entirely
  3. After re-verify, check findings again
until: no findings remain OR user says skip/abort
```

Rules for the loop:
- **Pass findings as context**: When delegating to `speckit.implement`, include the verification findings so it knows exactly what to fix. Prepend: *"Address the following verification findings: {findings list}"*
- **Suppress sub-agent handoffs** (Operating Rule 6 still applies)
- **Track iterations**: Show the loop count each time -- *"Implement-Verify iteration {N}: {findings_count} findings remaining"*
- **Cap at 3 iterations**: After 3 rounds, if findings persist, warn the user: *"3 remediation iterations completed with {N} findings still remaining. These may require manual intervention. Proceed to CI, or continue?"*
- **Human gate every iteration**: Never auto-loop -- always ask before re-implementing
- **Delta reporting**: After each re-verify, show what changed -- *"Fixed: {N}, New: {N}, Remaining: {N}"*

After the loop exits (no findings or user skips):
1. Create a marker file `{FEATURE_DIR}/.verify-done` containing the timestamp and final findings count
2. Mark Phase 9 complete and proceed to Phase 10 (Tests)
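The marker write in step 1 can be sketched in shell. The key/value layout below is an assumption -- the workflow only requires that the file carry a timestamp and the final findings count; `FEATURE_DIR` and `FINDINGS_COUNT` are placeholders the orchestrator would supply.

```shell
# Minimal sketch of the .verify-done marker write. The file layout is
# an assumption; only "timestamp + findings count" is required.
FEATURE_DIR="${FEATURE_DIR:-.}"        # placeholder: set by the orchestrator
FINDINGS_COUNT="${FINDINGS_COUNT:-0}"  # placeholder: final findings count
printf 'timestamp: %s\nfindings: %s\n' \
  "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  "$FINDINGS_COUNT" > "$FEATURE_DIR/.verify-done"
```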

## Phase 10: Tests

After verification, detect and run the project's test suite.

### Test Runner Detection

Detect test runner(s) by checking for these files at the repo root, in order:

| Check | Runner | Command |
|-------|--------|---------|
| `package.json` with `"test"` script | npm/yarn/pnpm | `npm test` (or `yarn test` / `pnpm test` based on lockfile) |
| `*.sln` or `*.slnx` or `*.csproj` | dotnet | `dotnet test` |
| `Makefile` with `test` target | make | `make test` |
| `pytest.ini` or `pyproject.toml` with `[tool.pytest.ini_options]` | pytest | `pytest` |
| `Cargo.toml` | cargo | `cargo test` |
| `go.mod` | go | `go test ./...` |

If **multiple** runners are detected (e.g., a monorepo with both `package.json` and `*.slnx`), run all of them and report results per runner.

If **no** runner is detected, ask the user: *"No test runner detected. What command runs your tests?"*
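A minimal shell sketch of the detection order above (illustrative only -- a real implementation would also inspect lockfiles to pick npm vs. yarn vs. pnpm, and handle monorepo sub-packages):

```shell
# Emit one test command per detected runner; returns 1 if none found.
# Run from the repo root, mirroring the detection table above.
detect_runners() {
  local found=0
  if [ -f package.json ] && grep -q '"test"' package.json; then
    echo "npm test"; found=1
  fi
  if compgen -G '*.sln' >/dev/null || compgen -G '*.slnx' >/dev/null \
     || compgen -G '*.csproj' >/dev/null; then
    echo "dotnet test"; found=1
  fi
  if [ -f Makefile ] && grep -qE '^test:' Makefile; then
    echo "make test"; found=1
  fi
  if [ -f pytest.ini ] || { [ -f pyproject.toml ] && grep -q '\[tool.pytest' pyproject.toml; }; then
    echo "pytest"; found=1
  fi
  if [ -f Cargo.toml ]; then echo "cargo test"; found=1; fi
  if [ -f go.mod ]; then echo "go test ./..."; found=1; fi
  [ "$found" -eq 1 ]
}
```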

### Test Execution

1. Run the detected test command(s) from the repo root
2. Report pass/fail summary with failure details

### CI Remediation Loop

If CI fails, run a remediation loop (same pattern as the Implement-Verify loop):

```
repeat:
  1. Parse test failures -- group by type (compile error, test failure, lint error)
  2. Present failures to user with file locations and error messages
  3. Ask: "Fix these CI failures? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with failure details as context, then re-run CI
     - skip  -> exit loop, leave failures for manual fixing
     - abort -> stop the workflow entirely
  4. After re-run, check CI result again
until: CI passes OR user says skip/abort
```

Rules:
- **Pass failure context**: Include exact error messages, file paths, and test names when delegating to implement
- **Cap at 3 iterations**: After 3 rounds, warn: *"3 CI fix iterations completed, {N} failures remain. These likely need manual debugging."*
- **Human gate every iteration**: Never auto-loop
- **Delta reporting**: *"Fixed: {N} failures, New: {N}, Remaining: {N}"*
- **Distinguish failure types**: Compile errors should be fixed before test failures (they may cause cascading test failures)

### Tests Pass

When all tests pass, proceed to the Completion Summary.

## Error Recovery

### Parallel Task Failure

When a task within a parallel group fails during Phase 8 (Implement):
1. **Let the other in-flight tasks finish** -- don't abort tasks that are already running
2. Report which task(s) failed with error details
3. Offer three options:
   - **Retry failed only** -- re-dispatch only the failed task(s), skip completed ones
   - **Retry entire group** -- re-run all tasks in the parallel group (useful if failure cascaded)
   - **Skip and continue** -- mark the failed task(s) and move on (user can fix manually later)
4. Never auto-retry -- always ask the user

### Sub-Agent Timeout or Crash

If a delegated sub-agent doesn't return (timeout) or returns an error:
1. Report the phase and agent that failed
2. Offer to retry the same phase or skip it
3. If the same agent fails twice in a row, suggest the user run it manually (`/speckit.{agent}`) and then resume the fleet

## Phase Rollback

At any human gate, the user may say "go back to Phase N" or "rollback to plan." The fleet supports this:

1. **Identify the target phase**: Parse the user's request to determine which phase to roll back to.
2. **Warn about downstream invalidation**: All artifacts generated by phases *after* the target phase are now potentially stale. Show:
   > Rolling back to Phase {N} ({name}). The following artifacts may be invalidated:
   > - plan.md (Phase 3)
   > - tasks.md (Phase 5)
   > - Implementation (Phase 8)
   >
   > These will be regenerated as the workflow proceeds. Continue?
3. **Delete marker files only**: Remove `.analyze-done`, `.verify-done`, and `review.md` for invalidated phases. Do NOT delete spec.md, plan.md, or tasks.md -- they'll be overwritten when the phase re-runs.
4. **Update the todo list**: Reset all phases from the target phase onward to `not-started`.
5. **Resume from the target phase**: Follow the normal phase execution flow from that point.
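Step 3 can be sketched as follows (for simplicity this removes all three markers; a real rollback would scope deletion to phases at or after the target):

```shell
# Clear derived phase markers only -- spec.md, plan.md, and tasks.md
# stay in place and are overwritten when their phase re-runs.
rollback_markers() {
  local dir="$1"
  rm -f "$dir/.analyze-done" "$dir/.verify-done" "$dir/review.md"
}
```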

**Constraints**:
- Cannot rollback during an active sub-agent delegation -- wait for it to complete first
- Rollback to Phase 1 (Specify) with "start over" requires explicit confirmation since it regenerates everything

## Completion Summary

After Phase 10 completes (tests pass or the user skips them), present a structured summary:

```
## Fleet Complete

Feature: {feature name}
Branch: {branch name}
Progress: Phases 1-10 ({phases completed}/{phases total} completed, {phases skipped} skipped)

### Artifacts Generated
- spec.md -- feature specification ({word count} words, {user stories count} user stories)
- plan.md -- technical plan ({components count} components)
- tasks.md -- {total tasks} tasks ({completed} completed, {remaining} remaining)
- review.md -- cross-model review (verdict: {verdict})

### Implementation
- Files created: {count}
- Files modified: {count}
- Tests added: {count}

### Quality Gates
- Analyze: {pass/findings count}
- Cross-model review: {verdict}
- Verify: {pass/findings count} ({iterations} iterations)
- CI: {pass/fail}

### Git
- Commits: {list of WIP commits if any}
- Ready to push: {yes/no}
```

After the summary, offer:
1. *"Push to remote and create a PR?"* (if the user wants)
2. *"View any artifact? (spec, plan, tasks, review)"*
\ No newline at end of file
diff --git a/.claude/commands/speckit.fleet.review.md b/.claude/commands/speckit.fleet.review.md
new file mode 100644
index 0000000..7b840b7
--- /dev/null
+++ b/.claude/commands/speckit.fleet.review.md
@@ -0,0 +1,117 @@
---
description: Cross-model evaluation of plan.md and tasks.md before implementation.
  Reviews feasibility, completeness, dependency ordering, risk, and parallelization
  correctness using a different model than was used to generate the artifacts.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
user-invocable: false
agents: []
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

---

You are a **Pre-Implementation Reviewer** -- a critical evaluator who reviews the design artifacts (plan.md, tasks.md, spec.md) produced by earlier workflow phases. Your purpose is to catch issues that the generating model may have been blind to, before implementation begins.

**STRICTLY READ-ONLY**: Do NOT modify any files. Output a structured review report only.

## What You Review

Run `{SCRIPT}` from the repo root to discover `FEATURE_DIR`. Then read these artifacts:

- `spec.md` -- the feature specification (requirements, user stories)
- `plan.md` -- the technical plan (architecture, tech stack, file structure)
- `tasks.md` -- the task breakdown (phased, dependency-ordered, with [P] markers)
- `checklists/` -- any requirement quality checklists (if present)
- `remediation.md` -- analyze output (if present)

## Review Dimensions

Evaluate across these 7 dimensions. For each, assign a verdict: **PASS**, **WARN**, or **FAIL**.

### 1. Spec-Plan Alignment
- Does plan.md address every user story in spec.md?
- Are there plan decisions that contradict spec requirements?
- Are non-functional requirements (performance, security, accessibility) covered in the plan?

### 2. Plan-Tasks Completeness
- Does every architectural component in plan.md have corresponding tasks in tasks.md?
- Are there tasks that reference files/patterns not described in plan.md?
- Are test tasks present for critical paths?

### 3. Dependency Ordering
- Are task phases ordered correctly? (setup -> foundational -> stories -> polish)
- Do any tasks reference files/interfaces that haven't been created by an earlier task?
- Are foundational tasks truly blocking, or could some be parallelized?

### 4. Parallelization Correctness
- Are `[P]` markers accurate? (Do tasks marked parallel truly touch different files with no dependency?)
- Are there tasks NOT marked `[P]` that could be parallelized?
- Do `<!-- parallel-group: N -->` groupings respect the max-3 constraint?
- Are there same-file conflicts hidden within a parallel group?

### 5. Feasibility & Risk
- Are there tasks that seem too large? (If a single task touches >3 files or >200 LOC, flag it)
- Are there technology choices in plan.md that contradict the project's existing stack?
- Are there missing error handling, edge case, or migration tasks?
- Does the task count seem proportional to the feature complexity?

### 6. Constitution & Standards Compliance
- Read `.specify/memory/constitution.md` and check plan aligns with project principles
- Check that testing approach matches the project's testing standards (80% coverage, TDD if required)
- Verify security considerations are addressed (path validation, input sanitization, etc.)

### 7. Implementation Readiness
- Is every task specific enough for an LLM to execute without ambiguity?
- Do all tasks include exact file paths?
- Are acceptance criteria clear for each user story phase?

## Output Format

```markdown
# Pre-Implementation Review

**Feature**: {feature name from spec.md}
**Artifacts reviewed**: spec.md, plan.md, tasks.md, [others if present]
**Review model**: {your model name} (should be different from the model that generated the artifacts)
**Generating model**: {model used for Phases 1-6, if known}

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS/WARN/FAIL | brief note |
| Plan-Tasks Completeness | PASS/WARN/FAIL | brief note |
| Dependency Ordering | PASS/WARN/FAIL | brief note |
| Parallelization Correctness | PASS/WARN/FAIL | brief note |
| Feasibility & Risk | PASS/WARN/FAIL | brief note |
| Standards Compliance | PASS/WARN/FAIL | brief note |
| Implementation Readiness | PASS/WARN/FAIL | brief note |

**Overall**: READY / READY WITH WARNINGS / NOT READY

## Findings

### Critical (FAIL -- must fix before implementing)
1. ...

### Warnings (WARN -- recommend fixing, can proceed)
1. ...

### Observations (informational)
1. ...

## Recommended Actions
- [ ] {specific action to address each FAIL/WARN}
```
\ No newline at end of file
diff --git a/.claude/commands/speckit.fleet.run.md b/.claude/commands/speckit.fleet.run.md
new file mode 100644
index 0000000..eca5d6b
--- /dev/null
+++ b/.claude/commands/speckit.fleet.run.md
@@ -0,0 +1,505 @@
---
description: 'Orchestrate a full feature lifecycle through all SpecKit phases with
  human-in-the-loop checkpoints: specify -> clarify -> plan -> checklist -> tasks
  -> analyze -> cross-model review -> implement -> verify -> CI. Detects partially
  complete features and resumes from the right phase.'
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
agents:
- speckit.specify
- speckit.clarify
- speckit.plan
- speckit.checklist
- speckit.tasks
- speckit.analyze
- speckit.fleet.review
- speckit.implement
- speckit.verify
user-invocable: true
disable-model-invocation: true
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty). Classify the input:

1. **Feature description** (e.g., "Build a capability browser that lets users..."): Store as `FEATURE_DESCRIPTION`. This will be passed verbatim to `speckit.specify` in Phase 1. Skip artifact detection if no `FEATURE_DIR` is found -- go straight to Phase 1.
2. **Phase override** (e.g., "resume at Phase 5" or "start from plan"): Override the auto-detected resume point.
3. **Empty**: Run artifact detection and resume from the detected phase.

---

You are the **SpecKit Fleet Orchestrator** -- a workflow conductor that drives a feature from idea to implementation by delegating to specialized SpecKit agents in order, with human approval at every checkpoint.

## Workflow Phases

| Phase | Agent | Artifact Signal | Gate |
|-------|-------|-----------------|------|
| 1. Specify | `speckit.specify` | `spec.md` exists in FEATURE_DIR | User approves spec |
| 2. Clarify | `speckit.clarify` | `spec.md` contains a `## Clarifications` section | User says "done" or requests another round |
| 3. Plan | `speckit.plan` | `plan.md` exists in FEATURE_DIR | User approves plan |
| 4. Checklist | `speckit.checklist` | `checklists/` directory exists and contains at least one file | User approves checklist |
| 5. Tasks | `speckit.tasks` | `tasks.md` exists in FEATURE_DIR | User approves tasks |
| 6. Analyze | `speckit.analyze` | `.analyze-done` marker exists in FEATURE_DIR | User acknowledges analysis |
| 7. Review | `speckit.fleet.review` | `review.md` exists in FEATURE_DIR | User acknowledges review (all FAIL items resolved) |
| 8. Implement | `speckit.implement` | ALL task checkboxes in tasks.md are `[x]` (none `[ ]`) | Implementation complete |
| 9. Verify | `speckit.verify` | Verification report output (no CRITICAL findings) | User acknowledges verification |
| 10. Tests | Terminal | Tests pass | Tests pass |

## Operating Rules

1. **One phase at a time.** Never skip ahead or run phases in parallel.
2. **Human gate after every phase.** After each agent completes, summarize the outcome and ask the user to:
   - **Approve** -> proceed to the next phase
   - **Revise** -> re-run the same phase with user feedback
   - **Skip** -> mark phase as skipped and move on (user must confirm)
   - **Abort** -> stop the workflow entirely
   - **Rollback** -> jump back to an earlier phase (see Phase Rollback below)
3. **Clarify is repeatable.** After Phase 2, ask: *"Run another clarification round, or move on to planning?"* Loop until the user says done.
4. **Track progress.** Use the todo tool to create and update a checklist of all 10 phases so the user always sees where they are.
5. **Pass context forward.** When delegating, include the feature description and any user-provided refinements so each agent has full context.
6. **Suppress sub-agent handoffs.** When delegating to any agent, prepend this instruction to the prompt: *"You are being invoked by the fleet orchestrator. Do NOT follow handoffs or auto-forward to other agents. Return your output to the orchestrator and stop."* This prevents `send: true` handoff chains (e.g., plan -> tasks -> analyze -> implement) from bypassing fleet's human gates.
7. **Verify phase.** After implementation, run `speckit.verify` to validate code against spec artifacts. Requires the verify extension (see Phase 9).
8. **Test phase.** After verification, detect the project's test runner(s) and run tests. See Phase 10 for detection logic.
9. **Git checkpoint commits.** After these phases complete, offer to create a WIP commit to safeguard progress:
   - After Phase 5 (Tasks) -- all design artifacts are finalized
   - After Phase 8 (Implement) -- all code is written
   - After Phase 9 (Verify) -- code is validated
   Commit message format: `wip: fleet phase {N} -- {phase name} complete`
   Always ask before committing -- never auto-commit. If the user declines, continue without committing.
10. **Context budget awareness.** Long-running fleet sessions can exhaust the model's context window. Monitor for these signs:
    - Responses becoming shorter or losing earlier context
    - Reaching Phase 8+ in a session that started from Phase 1
    At natural checkpoints (after git commits or between phases), if context pressure seems high, suggest: *"This is getting long. We can continue in a new chat -- the fleet will auto-detect progress and resume at Phase {N}."*

## Parallel Subagent Execution (Plan & Implement Phases)

During **Phase 3 (Plan)** and **Phase 8 (Implement)**, the orchestrator may dispatch **up to 3 subagents in parallel** when work items are independent. This is governed by the `[P]` (parallelizable) marker system already used in tasks.md.

### How Parallelism Works

1. **Tasks agent embeds the plan.** During Phase 5 (Tasks), the tasks agent marks tasks with `[P]` when they touch different files and have no dependency on incomplete tasks. Tasks within the same phase that share `[P]` markers form a **parallel group**.

2. **Fleet orchestrator fans out.** When executing Plan or Implement, the orchestrator:
   - Reads the current phase's task list from tasks.md
   - Identifies `[P]`-marked tasks that form an independent group (no shared files, no ordering dependency)
   - Dispatches up to **3 subagents simultaneously** for the group
   - Waits for all dispatched agents to complete before moving to the next group or sequential task
   - If any parallel task fails, halts the batch and reports the failure before continuing

3. **Parallelism constraints:**
   - **Max concurrency: 3** -- never dispatch more than 3 subagents at once
   - **Same-file exclusion** -- tasks touching the same file MUST run sequentially even if both are `[P]`
   - **Phase boundaries are serial** -- all tasks in Phase N must complete before Phase N+1 begins
   - **Human gate still applies** -- after each implementation phase completes (all groups done), summarize and checkpoint with the user before the next phase

### Parallel Groups in tasks.md

The tasks agent should organize `[P]` tasks into explicit parallel groups using comments in tasks.md:

```markdown
### Phase 1: Setup

<!-- parallel-group: 1 (max 3 concurrent) -->
- [ ] T002 [P] Create CapabilityManifest.cs in Models/Generation/
- [ ] T003 [P] Create DocumentIndex.cs in Models/Generation/
- [ ] T004 [P] Create ResolvedContext.cs in Models/Generation/

<!-- parallel-group: 2 (max 3 concurrent) -->
- [ ] T005 [P] Create GenerationResult.cs in Models/Generation/
- [ ] T006 [P] Create BatchGenerationJob.cs in Models/Generation/
- [ ] T007 [P] Create SchemaExport.cs in Models/Generation/

<!-- sequential -->
- [ ] T013 Create generation.ts with all TypeScript interfaces
```

### Plan Phase Parallelism

During Phase 3 (Plan), the plan agent's Phase 0 (Research) can dispatch up to 3 research sub-tasks in parallel:
- Each `NEEDS CLARIFICATION` item or technology best-practice lookup is an independent research task
- Fan out up to 3 at a time, consolidate results into research.md
- Phase 1 (Design) artifacts -- data-model.md, contracts/, quickstart.md -- can be generated in parallel if they don't depend on each other's output

### Implement Phase Parallelism

During Phase 8 (Implement), for each implementation phase in tasks.md:
1. Read the phase and identify parallel groups (marked with `<!-- parallel-group: N -->` comments)
2. For each group, dispatch up to 3 `speckit.implement` subagents simultaneously, each given a specific subset of tasks
3. When all tasks in a group complete, move to the next group or sequential task
4. After the entire phase completes, checkpoint with the user before proceeding to the next phase

### Instructions for Tasks Agent

When the fleet orchestrator delegates to `speckit.tasks`, append this instruction:

> "Organize [P]-marked tasks into explicit parallel groups using `<!-- parallel-group: N -->` HTML comments. Each group should contain up to 3 tasks that can execute concurrently (different files, no dependencies). Add `<!-- sequential -->` before tasks that must run in order. This enables the fleet orchestrator to fan out up to 3 subagents per group during implementation."

## First-Turn Behavior -- Artifact Detection & Resume

On **every** invocation, before doing anything else, run artifact detection to determine where the workflow stands. This allows the orchestrator to resume mid-flight even in a fresh conversation.

### Step 0: Branch safety pre-flight

Before anything else, run basic git health checks:

1. **Uncommitted changes**: Run `git status --porcelain`. If there are uncommitted changes, warn the user:
   > WARNING: You have uncommitted changes. Starting the fleet may create conflicts. Commit or stash first?
   > - **Continue** -- proceed with uncommitted changes (risky)
   > - **Stash** -- run `git stash` and continue
   > - **Abort** -- stop and let the user handle it

2. **Detached HEAD**: Run `git branch --show-current`. If empty (detached HEAD), abort:
   > Cannot run fleet on a detached HEAD. Please check out a feature branch first.

3. **Branch freshness** (advisory): Run `git log --oneline HEAD..origin/main 2>/dev/null | wc -l`. If the main branch has commits not in the current branch, advise:
   > Your branch is {N} commits behind main. Consider rebasing before starting implementation to avoid merge conflicts later.

This check runs only once on first invocation. It does NOT block the workflow (except for detached HEAD).
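The three checks above can be sketched as a single shell pre-flight (assuming `origin/main` is the mainline -- adjust for repositories that use `master` or another default branch):

```shell
# Pre-flight: warn on dirty tree, hard-stop on detached HEAD,
# advise when the branch is behind origin/main.
preflight() {
  if [ -n "$(git status --porcelain)" ]; then
    echo "WARNING: uncommitted changes -- commit or stash first?"
  fi
  local branch
  branch="$(git branch --show-current)"
  if [ -z "$branch" ]; then
    echo "ERROR: detached HEAD -- check out a feature branch first" >&2
    return 1
  fi
  local behind
  behind="$(git log --oneline HEAD..origin/main 2>/dev/null | wc -l | tr -d ' ')"
  if [ "${behind:-0}" -gt 0 ]; then
    echo "ADVISORY: $behind commits behind origin/main -- consider rebasing"
  fi
}
```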

### Step 1: Discover the feature directory

Run `{SCRIPT}` from the repo root to get the feature directory paths as JSON. Parse the output to get `FEATURE_DIR`.

If the script fails (e.g., not on a feature branch):
- If `FEATURE_DESCRIPTION` was provided in `$ARGUMENTS`, proceed directly to Phase 1 -- pass the description to `speckit.specify` and it will create the feature directory.
- If `$ARGUMENTS` is empty, ask the user for the feature description, then start Phase 1.

### Step 2: Check model configuration

Check if `.specify/extensions/fleet/fleet-config.yml` (resolved from the repo root, or the project's config location) has model settings. If the config file doesn't exist or models are set to defaults:

1. **Detect the platform**: Identify which IDE/agent platform you're running in (VS Code Copilot, Claude Code, Cursor, etc.) based on available context.

2. **Primary model**: If `models.primary` is `"auto"`, use whatever model you are currently running as. No action needed -- you ARE the primary model.

3. **Review model**: If `models.review` is `"ask"`, prompt the user:
   > **Model setup (one-time):** The cross-model review (Phase 7) works best with a *different* model than the one running the fleet, to catch blind spots.
   >
   > What model should I use for the review phase? Suggestions:
   > - A different model family (e.g., if you're on Claude, use GPT or Gemini)
   > - A different tier (e.g., if you're on Opus, use Sonnet)
   > - "skip" to skip Phase 7 entirely
   >
   > You can also set this permanently in your fleet config.

4. **Store the choice**: Remember the user's model selection for the duration of this conversation. If they want to persist it, suggest editing the config file.

### Step 3: Probe artifacts in FEATURE_DIR

Check these paths **in order** using the `read` tool. Each check is a file/directory existence AND basic integrity test:

| Check | Path | Existence | Integrity |
|-------|------|-----------|-----------|
| spec.md | `{FEATURE_DIR}/spec.md` | File exists? | Has `## User Stories` or `## Requirements` section? File > 100 bytes? |
| Clarifications | `{FEATURE_DIR}/spec.md` | Contains `## Clarifications` heading? | At least one Q&A pair present? |
| plan.md | `{FEATURE_DIR}/plan.md` | File exists? | Has `## Architecture` or `## Tech Stack` section? File > 200 bytes? |
| checklists/ | `{FEATURE_DIR}/checklists/` | Directory exists and has >=1 file? | Each file > 50 bytes? |
| tasks.md | `{FEATURE_DIR}/tasks.md` | File exists? | Contains at least one `- [ ]` or `- [x]` item? Has `### Phase` heading? |
| .analyze-done | `{FEATURE_DIR}/.analyze-done` | Marker file exists? | -- |
| review.md | `{FEATURE_DIR}/review.md` | File exists? | Contains `## Summary` and verdict table? |
| Implementation | `{FEATURE_DIR}/tasks.md` | All `- [x]`, zero `- [ ]` remaining? | -- |
| Verify extension | `.specify/extensions/verify/extension.yml` | File exists? | -- |
| Verification | `{FEATURE_DIR}/.verify-done` | Marker file exists? | -- |

**Integrity failures are advisory, not blocking.** If a file exists but fails integrity checks, warn the user:
> WARNING: `plan.md` exists but appears incomplete (missing expected sections). It may have been partially generated. Re-run Phase 3 (Plan), or continue with the current file?
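As one concrete example, the plan.md probe row above could be sketched as follows (the `## Architecture` / `## Tech Stack` headings and the 200-byte floor come straight from the table; the function shape is illustrative):

```shell
# Returns "missing", "incomplete" (advisory), or "ok" for plan.md.
probe_plan() {
  local plan="$1/plan.md"
  if [ ! -f "$plan" ]; then echo "missing"; return; fi
  local size
  size="$(wc -c < "$plan")"
  if [ "$size" -gt 200 ] && grep -qE '^## (Architecture|Tech Stack)' "$plan"; then
    echo "ok"
  else
    echo "incomplete"   # exists but fails integrity -- warn, don't block
  fi
}
```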

### Step 4: Determine the resume phase

Walk the artifact signals **top-down**. The first phase whose artifact is **missing** is where work resumes:

```
if spec.md missing            -> resume at Phase 1 (Specify)
if no ## Clarifications       -> resume at Phase 2 (Clarify)
if plan.md missing            -> resume at Phase 3 (Plan)
if checklists/ empty/missing  -> resume at Phase 4 (Checklist)
if tasks.md missing           -> resume at Phase 5 (Tasks)
if .analyze-done missing      -> resume at Phase 6 (Analyze)
if review.md missing          -> resume at Phase 7 (Review)
if tasks.md has `- [ ]`       -> resume at Phase 8 (Implement)
if .verify-done missing       -> resume at Phase 9 (Verify)
if all done                   -> resume at Phase 10 (Tests)
```

### Step 5: Present status and confirm

Show the user a status table and the detected resume point:

```
Feature: {branch name}
Directory: {FEATURE_DIR}

Phase 1 Specify      [x] spec.md found
Phase 2 Clarify      [x] ## Clarifications present
Phase 3 Plan         [x] plan.md found
Phase 4 Checklist    [x] checklists/ has 2 files
Phase 5 Tasks        [x] tasks.md found
Phase 6 Analyze      [ ] .analyze-done not found
Phase 7 Review       [ ] --
Phase 8 Implement    [ ] --
Phase 9 Verify       [ ] --
Phase 10 Tests       [ ] --

> Resuming at Phase 6: Analyze
```

Then ask: *"Detected progress above. Resume at Phase {N} ({name}), or override to a different phase?"*

- If user confirms -> create the todo list with completed phases marked as `completed` and resume from Phase N.
- If user provides a phase number or name -> start from that phase instead.
- If FEATURE_DIR doesn't exist -> start from Phase 1, ask for the feature description.

### Edge Cases

- **Implementation partially complete**: If `tasks.md` exists and has a mix of `[x]` and `[ ]`, resume at Phase 8 (Implement). Tell the user how many tasks remain: *"tasks.md: {done}/{total} tasks complete. {remaining} tasks remaining."*
- **Analyze completion marker**: After Phase 6 (Analyze) completes -- whether it produces `remediation.md` or not -- create a marker file `{FEATURE_DIR}/.analyze-done` containing the timestamp. This distinguishes "analyze ran clean" from "analyze never ran." The `.analyze-done` file is the artifact signal for Phase 6, not `remediation.md`.
- **Review can be skipped**: If user opts to skip cross-model review, treat Phase 7 as skipped and proceed to Phase 8.
- **Review found NO failures**: If `review.md` exists and overall verdict is "READY", Phase 7 is complete -- proceed to Phase 8.
- **Review found FAIL items**: If `review.md` has FAIL verdicts, present them and ask user whether to (a) fix the issues by re-running the relevant earlier phase, (b) proceed anyway, or (c) abort.
- **Verify extension not installed**: If `.specify/extensions/verify/extension.yml` doesn't exist, prompt to install. If user declines, skip Phase 9.
- **Verify completion marker**: After Phase 9 (Verify) completes, create `{FEATURE_DIR}/.verify-done` with timestamp. This distinguishes "verify ran" from "verify never ran."
- **Checklists may be skipped**: Some features don't use checklists. If `tasks.md` exists but `checklists/` doesn't, treat Phase 4 as skipped.
- **Fresh branch, no specs dir**: Start from Phase 1. Use `FEATURE_DESCRIPTION` from `$ARGUMENTS` if provided; otherwise ask the user.
- **User says "start over"**: Re-run from Phase 1 regardless of existing artifacts. Warn that this will overwrite existing artifacts and get confirmation.

### Stale Artifact Detection

After determining the resume phase, check for **stale downstream artifacts** -- files generated by an earlier phase that may be outdated because an upstream artifact was modified later.

Compare file modification timestamps in this dependency chain:

```
spec.md -> plan.md -> tasks.md -> .analyze-done -> review.md -> [implementation] -> .verify-done
```

If a file is **newer** than a downstream file that depends on it (e.g., `spec.md` was modified after `plan.md`), warn the user:

> WARNING: **Stale artifact detected**: `plan.md` (modified {date}) was generated before the latest `spec.md` change ({date}). Plan may not reflect current requirements. Re-run Phase 3 (Plan) to update, or proceed with the current plan?

This is advisory only -- the user decides whether to rerun. Do not block the workflow.
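The timestamp walk can be sketched with the shell's `-nt` (newer-than) test. The `[implementation]` step has no single file to compare, so this sketch skips it; the check stays advisory:

```shell
# Warn when any upstream artifact is newer than its downstream artifact.
check_stale() {
  local dir="$1"
  local chain=(spec.md plan.md tasks.md .analyze-done review.md .verify-done)
  local i up down
  for ((i = 0; i + 1 < ${#chain[@]}; i++)); do
    up="$dir/${chain[i]}"
    down="$dir/${chain[i + 1]}"
    if [ -f "$up" ] && [ -f "$down" ] && [ "$up" -nt "$down" ]; then
      echo "WARNING: stale artifact: ${chain[i + 1]} predates ${chain[i]}"
    fi
  done
}
```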

## Phase Execution Template

For each phase:
```
1. Mark the phase as in-progress in the todo list
2. Announce: "**Phase N: {Name}** -- delegating to {agent}..."
3. Delegate to the agent with relevant arguments:
   - Phase 1 (Specify): pass FEATURE_DESCRIPTION from $ARGUMENTS as the argument
   - Phase 2 (Clarify): pass the feature description and any user feedback
   - All other phases: pass the feature description and any user-provided refinements
4. Summarize the agent's output concisely
5. Ask: "Ready to proceed to Phase N+1 ({next name}), or would you like to revise?"
6. Wait for user response
7. Mark phase as completed when approved
```

## Phase 7: Cross-Model Review

This phase uses a **different model** than the one that generated plan.md and tasks.md, providing a fresh perspective to catch blind spots.

1. Delegate to `speckit.fleet.review` -- it runs on the **review model** configured in Step 2 (a different model than the primary) and is **read-only**
2. The review agent reads spec.md, plan.md, tasks.md, checklists/, and remediation.md
3. It evaluates 7 dimensions: spec-plan alignment, plan-tasks completeness, dependency ordering, parallelization correctness, feasibility & risk, standards compliance, implementation readiness
4. It outputs a structured review report with PASS/WARN/FAIL verdicts per dimension
5. **Save the review output** to `{FEATURE_DIR}/review.md`
6. Present the summary table to the user:
   - **All PASS / READY**: *"Cross-model review passed. Ready to implement?"*
   - **WARN items**: *"Review found {N} warnings. Proceed to implementation, or address them first?"*
   - **FAIL items**: *"Review found {N} critical issues that should be fixed before implementing."* -- list them and ask which earlier phase to re-run (plan, tasks, or analyze)
7. If user chooses to fix: loop back to the appropriate phase, then re-run review after fixes
8. If user approves: mark Phase 7 complete and proceed to Phase 8 (Implement)

**Note**: Phase 7 (Review) validates design artifacts *before* implementation. Phase 9 (Verify) validates actual code *after* implementation. Both are read-only.

## Phase 9: Post-Implementation Verification

This phase validates that the implemented code matches the specification artifacts. It requires the **verify extension**.

### Extension Installation Check

Before delegating to `speckit.verify`, check if the extension is installed:

1. Check if `.specify/extensions/verify/extension.yml` exists using the `read` tool
2. If **missing**, ask the user:
   > The verify extension is not installed. Install it now?
   > ```
   > specify extension add verify --from https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip
   > ```
3. If user approves, run the install command in the terminal
4. If user declines, skip Phase 9 and proceed to Phase 10 (Tests)

### Verification Execution

1. Delegate to `speckit.verify` -- it reads spec.md, plan.md, tasks.md, constitution.md and the implemented source files
2. It runs 7 verification checks: task completion, file existence, requirement coverage, scenario & test coverage, spec intent alignment, constitution alignment, design & structure consistency
3. It outputs a verification report with findings, metrics, and next actions
4. Present the summary to the user:
   - **No findings**: *"Verification passed. Ready to run the tests?"* -- proceed to Phase 10
   - **Findings exist**: Show the findings grouped by severity (CRITICAL, WARNING, INFO) and enter the **Implement-Verify loop** below

### Implement-Verify Loop

When verification produces findings, run a remediation loop:

```
repeat:
  1. Present findings to user
  2. Ask: "Re-run implementation to address these findings? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with findings as context, then re-run speckit.verify
     - skip  -> exit loop, proceed to Phase 10 with current state
     - abort -> stop the workflow entirely
  3. After re-verify, check findings again
until: no findings remain OR user says skip/abort
```

Rules for the loop:
- **Pass findings as context**: When delegating to `speckit.implement`, include the verification findings so it knows exactly what to fix. Prepend: *"Address the following verification findings: {findings list}"*
- **Suppress sub-agent handoffs** (Operating Rule 6 still applies)
- **Track iterations**: Show the loop count each time -- *"Implement-Verify iteration {N}: {findings_count} findings remaining"*
- **Cap at 3 iterations**: After 3 rounds, if findings persist, warn the user: *"3 remediation iterations completed with {N} findings still remaining. These may require manual intervention. Proceed to CI, or continue?"*
- **Human gate every iteration**: Never auto-loop -- always ask before re-implementing
- **Delta reporting**: After each re-verify, show what changed -- *"Fixed: {N}, New: {N}, Remaining: {N}"*
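
The delta report in the last rule reduces to a set comparison over finding identifiers. A minimal sketch (the function name and the idea of stable finding keys are ours, not part of the workflow definition):

```python
def finding_delta(previous, current):
    """Compare two verification runs; findings are identified by stable keys."""
    prev, curr = set(previous), set(current)
    return {
        "fixed": len(prev - curr),    # present last run, gone now
        "new": len(curr - prev),      # introduced by the re-implementation
        "remaining": len(curr),       # still open after this run
    }
```

This assumes each finding can be given a stable identifier across runs (e.g. check name plus file path); without that, "fixed" vs. "new" cannot be distinguished reliably.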

After the loop exits (no findings or user skips):
1. Create a marker file `{FEATURE_DIR}/.verify-done` containing the timestamp and final findings count
2. Mark Phase 9 complete and proceed to Phase 10 (Tests)

## Phase 10: Tests

After verification, detect and run the project's test suite.

### Test Runner Detection

Detect test runner(s) by checking for these files at the repo root, in order:

| Check | Runner | Command |
|-------|--------|---------|
| `package.json` with `"test"` script | npm/yarn/pnpm | `npm test` (or `yarn test` / `pnpm test` based on lockfile) |
| `*.sln` or `*.slnx` or `*.csproj` | dotnet | `dotnet test` |
| `Makefile` with `test` target | make | `make test` |
| `pytest.ini` or `pyproject.toml` with `[tool.pytest]` | pytest | `pytest` |
| `Cargo.toml` | cargo | `cargo test` |
| `go.mod` | go | `go test ./...` |

If **multiple** runners are detected (e.g., a monorepo with both `package.json` and `*.slnx`), run all of them and report results per runner.

If **no** runner is detected, ask the user: *"No test runner detected. What command runs your tests?"*
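
The detection table can be sketched roughly as follows. This is an illustrative Python sketch only: the function name is ours, the Makefile check is a naive substring match, and a real implementation would also choose npm vs. yarn vs. pnpm from the lockfile:

```python
import json
from pathlib import Path

def detect_runners(root):
    """Return (runner, command) pairs for every runner detected at repo root."""
    root = Path(root)
    runners = []
    pkg = root / "package.json"
    if pkg.exists() and "test" in json.loads(pkg.read_text()).get("scripts", {}):
        runners.append(("npm", "npm test"))
    if any(root.glob("*.sln")) or any(root.glob("*.slnx")) or any(root.glob("*.csproj")):
        runners.append(("dotnet", "dotnet test"))
    mk = root / "Makefile"
    if mk.exists() and "test:" in mk.read_text():  # crude target check
        runners.append(("make", "make test"))
    pyproject = root / "pyproject.toml"
    if (root / "pytest.ini").exists() or (
        pyproject.exists() and "[tool.pytest" in pyproject.read_text()
    ):
        runners.append(("pytest", "pytest"))
    if (root / "Cargo.toml").exists():
        runners.append(("cargo", "cargo test"))
    if (root / "go.mod").exists():
        runners.append(("go", "go test ./..."))
    return runners
```

Returning all matches (rather than the first) supports the monorepo case described below.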

### Test Execution

1. Run the detected test command(s) from the repo root
2. Report pass/fail summary with failure details

### CI Remediation Loop

If CI fails, run a remediation loop (same pattern as the Implement-Verify loop):

```
repeat:
  1. Parse test failures -- group by type (compile error, test failure, lint error)
  2. Present failures to user with file locations and error messages
  3. Ask: "Fix these CI failures? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with failure details as context, then re-run CI
     - skip  -> exit loop, leave failures for manual fixing
     - abort -> stop the workflow entirely
  4. After re-run, check CI result again
until: CI passes OR user says skip/abort
```

Rules:
- **Pass failure context**: Include exact error messages, file paths, and test names when delegating to implement
- **Cap at 3 iterations**: After 3 rounds, warn: *"3 CI fix iterations completed, {N} failures remain. These likely need manual debugging."*
- **Human gate every iteration**: Never auto-loop
- **Delta reporting**: *"Fixed: {N} failures, New: {N}, Remaining: {N}"*
- **Distinguish failure types**: Compile errors should be fixed before test failures (they may cause cascading test failures)

### Tests Pass

When all tests pass, proceed to the Completion Summary.

## Error Recovery

### Parallel Task Failure

When a task within a parallel group fails during Phase 8 (Implement):
1. **Let the other in-flight tasks finish** -- don't abort tasks that are already running
2. Report which task(s) failed with error details
3. Offer three options:
   - **Retry failed only** -- re-dispatch only the failed task(s), skip completed ones
   - **Retry entire group** -- re-run all tasks in the parallel group (useful if failure cascaded)
   - **Skip and continue** -- mark the failed task(s) and move on (user can fix manually later)
4. Never auto-retry -- always ask the user

### Sub-Agent Timeout or Crash

If a delegated sub-agent doesn't return (timeout) or returns an error:
1. Report the phase and agent that failed
2. Offer to retry the same phase or skip it
3. If the same agent fails twice in a row, suggest the user run it manually (`/speckit.{agent}`) and then resume the fleet

## Phase Rollback

At any human gate, the user may say "go back to Phase N" or "rollback to plan." The fleet supports this:

1. **Identify the target phase**: Parse the user's request to determine which phase to roll back to.
2. **Warn about downstream invalidation**: All artifacts generated by phases *after* the target phase are now potentially stale. Show:
   > Rolling back to Phase {N} ({name}). The following artifacts may be invalidated:
   > - plan.md (Phase 3)
   > - tasks.md (Phase 5)
   > - Implementation (Phase 8)
   >
   > These will be regenerated as the workflow proceeds. Continue?
3. **Delete derived state only**: Remove the marker files (`.analyze-done`, `.verify-done`) and `review.md` for invalidated phases. Do NOT delete spec.md, plan.md, or tasks.md -- they'll be overwritten when the phase re-runs.
4. **Update the todo list**: Reset all phases from the target phase onward to `not-started`.
5. **Resume from the target phase**: Follow the normal phase execution flow from that point.

**Constraints**:
- Cannot rollback during an active sub-agent delegation -- wait for it to complete first
- Rollback to Phase 1 (Specify) with "start over" requires explicit confirmation since it regenerates everything

## Completion Summary

After Phase 10 completes (CI passes or user skips CI), present a structured summary:

```
## Fleet Complete

Feature: {feature name}
Branch: {branch name}
Progress: Phases 1-10 ({phases completed}/{phases total} completed, {phases skipped} skipped)

### Artifacts Generated
- spec.md -- feature specification ({word count} words, {user stories count} user stories)
- plan.md -- technical plan ({components count} components)
- tasks.md -- {total tasks} tasks ({completed} completed, {remaining} remaining)
- review.md -- cross-model review (verdict: {verdict})

### Implementation
- Files created: {count}
- Files modified: {count}
- Tests added: {count}

### Quality Gates
- Analyze: {pass/findings count}
- Cross-model review: {verdict}
- Verify: {pass/findings count} ({iterations} iterations)
- CI: {pass/fail}

### Git
- Commits: {list of WIP commits if any}
- Ready to push: {yes/no}
```

After the summary, offer:
1. *"Push to remote and create a PR?"* (if the user wants)
2. *"View any artifact? (spec, plan, tasks, review)"*
\ No newline at end of file
diff --git a/.claude/commands/speckit.implement.md b/.claude/commands/speckit.implement.md
new file mode 100644
index 0000000..5847f61
--- /dev/null
+++ b/.claude/commands/speckit.implement.md
@@ -0,0 +1,198 @@
---
description: Execute the implementation plan by processing and executing all tasks defined in tasks.md
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Pre-Execution Checks

**Check for extension hooks (before implementation)**:
- Check if `.specify/extensions.yml` exists in the project root.
- If it exists, read it and look for entries under the `hooks.before_implement` key
- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
- For each executable hook, output the following based on its `optional` flag:
  - **Optional hook** (`optional: true`):
    ```
    ## Extension Hooks

    **Optional Pre-Hook**: {extension}
    Command: `/{command}`
    Description: {description}

    Prompt: {prompt}
    To execute: `/{command}`
    ```
  - **Mandatory hook** (`optional: false`):
    ```
    ## Extension Hooks

    **Automatic Pre-Hook**: {extension}
    Executing: `/{command}`
    EXECUTE_COMMAND: {command}
    
    Wait for the result of the hook command before proceeding to the Outline.
    ```
- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
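
The enabled/condition filtering above can be sketched as follows (illustrative Python; `executable_hooks` is our name for a helper operating on the parsed `hooks.before_implement` list):

```python
def executable_hooks(hooks):
    """Keep hooks that are enabled and have no condition to evaluate."""
    runnable = []
    for hook in hooks:
        if hook.get("enabled") is False:
            continue  # only an explicit false disables a hook
        if hook.get("condition"):
            continue  # non-empty condition: defer to the HookExecutor
        runnable.append(hook)
    return runnable
```

Note that a missing, null, or empty-string `condition` all fall through to "executable", matching the rules above.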

## Outline

1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

2. **Check checklists status** (if FEATURE_DIR/checklists/ exists):
   - Scan all checklist files in the checklists/ directory
   - For each checklist, count:
     - Total items: All lines matching `- [ ]` or `- [X]` or `- [x]`
     - Completed items: Lines matching `- [X]` or `- [x]`
     - Incomplete items: Lines matching `- [ ]`
   - Create a status table:

     ```text
     | Checklist | Total | Completed | Incomplete | Status |
     |-----------|-------|-----------|------------|--------|
     | ux.md     | 12    | 12        | 0          | ✓ PASS |
     | test.md   | 8     | 5         | 3          | ✗ FAIL |
     | security.md | 6   | 6         | 0          | ✓ PASS |
     ```

   - Calculate overall status:
     - **PASS**: All checklists have 0 incomplete items
     - **FAIL**: One or more checklists have incomplete items

   - **If any checklist is incomplete**:
     - Display the table with incomplete item counts
     - **STOP** and ask: "Some checklists are incomplete. Do you want to proceed with implementation anyway? (yes/no)"
     - Wait for user response before continuing
     - If user says "no" or "wait" or "stop", halt execution
     - If user says "yes" or "proceed" or "continue", proceed to step 3

   - **If all checklists are complete**:
     - Display the table showing all checklists passed
     - Automatically proceed to step 3
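
The counting logic above can be sketched as (illustrative Python; the function name is ours):

```python
import re

def checklist_status(markdown):
    """Count checkbox items in a checklist file and derive PASS/FAIL."""
    total = len(re.findall(r"^\s*- \[[ xX]\]", markdown, flags=re.M))
    completed = len(re.findall(r"^\s*- \[[xX]\]", markdown, flags=re.M))
    incomplete = total - completed
    return {
        "total": total,
        "completed": completed,
        "incomplete": incomplete,
        "status": "PASS" if incomplete == 0 else "FAIL",
    }
```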

3. Load and analyze the implementation context:
   - **REQUIRED**: Read tasks.md for the complete task list and execution plan
   - **REQUIRED**: Read plan.md for tech stack, architecture, and file structure
   - **IF EXISTS**: Read data-model.md for entities and relationships
   - **IF EXISTS**: Read contracts/ for API specifications and test requirements
   - **IF EXISTS**: Read research.md for technical decisions and constraints
   - **IF EXISTS**: Read quickstart.md for integration scenarios

4. **Project Setup Verification**:
   - **REQUIRED**: Create/verify ignore files based on actual project setup:

   **Detection & Creation Logic**:
   - Check if the following command succeeds to determine if the repository is a git repo (create/verify .gitignore if so):

     ```sh
     git rev-parse --git-dir 2>/dev/null
     ```

   - Check if Dockerfile* exists or Docker is referenced in plan.md → create/verify .dockerignore
   - Check if .eslintrc* exists → create/verify .eslintignore
   - Check if eslint.config.* exists → ensure the config's `ignores` entries cover required patterns
   - Check if .prettierrc* exists → create/verify .prettierignore
   - Check if .npmrc or package.json exists → create/verify .npmignore (if publishing)
   - Check if terraform files (*.tf) exist → create/verify .terraformignore
   - Check if .helmignore needed (helm charts present) → create/verify .helmignore

   **If ignore file already exists**: Verify it contains essential patterns, append missing critical patterns only
   **If ignore file missing**: Create with full pattern set for detected technology

   **Common Patterns by Technology** (from plan.md tech stack):
   - **Node.js/JavaScript/TypeScript**: `node_modules/`, `dist/`, `build/`, `*.log`, `.env*`
   - **Python**: `__pycache__/`, `*.pyc`, `.venv/`, `venv/`, `dist/`, `*.egg-info/`
   - **Java**: `target/`, `*.class`, `*.jar`, `.gradle/`, `build/`
   - **C#/.NET**: `bin/`, `obj/`, `*.user`, `*.suo`, `packages/`
   - **Go**: `*.exe`, `*.test`, `vendor/`, `*.out`
   - **Ruby**: `.bundle/`, `log/`, `tmp/`, `*.gem`, `vendor/bundle/`
   - **PHP**: `vendor/`, `*.log`, `*.cache`, `*.env`
   - **Rust**: `target/`, `debug/`, `release/`, `*.rs.bk`, `*.rlib`, `*.prof*`, `.idea/`, `*.log`, `.env*`
   - **Kotlin**: `build/`, `out/`, `.gradle/`, `.idea/`, `*.class`, `*.jar`, `*.iml`, `*.log`, `.env*`
   - **C++**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.so`, `*.a`, `*.exe`, `*.dll`, `.idea/`, `*.log`, `.env*`
   - **C**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.a`, `*.so`, `*.exe`, `*.dll`, `autom4te.cache/`, `config.status`, `config.log`, `.idea/`, `*.log`, `.env*`
   - **Swift**: `.build/`, `DerivedData/`, `*.swiftpm/`, `Packages/`
   - **R**: `.Rproj.user/`, `.Rhistory`, `.RData`, `.Ruserdata`, `*.Rproj`, `packrat/`, `renv/`
   - **Universal**: `.DS_Store`, `Thumbs.db`, `*.tmp`, `*.swp`, `.vscode/`, `.idea/`

   **Tool-Specific Patterns**:
   - **Docker**: `node_modules/`, `.git/`, `Dockerfile*`, `.dockerignore`, `*.log*`, `.env*`, `coverage/`
   - **ESLint**: `node_modules/`, `dist/`, `build/`, `coverage/`, `*.min.js`
   - **Prettier**: `node_modules/`, `dist/`, `build/`, `coverage/`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`
   - **Terraform**: `.terraform/`, `*.tfstate*`, `*.tfvars`, `.terraform.lock.hcl`
   - **Kubernetes/k8s**: `*.secret.yaml`, `secrets/`, `.kube/`, `kubeconfig*`, `*.key`, `*.crt`
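
The append-missing-patterns-only rule can be sketched as (illustrative Python; the helper name is ours, and a real implementation might also preserve comments and ordering more carefully):

```python
from pathlib import Path

def ensure_ignore_file(path, required_patterns):
    """Create the ignore file if missing; otherwise append only absent patterns."""
    p = Path(path)
    existing = p.read_text().splitlines() if p.exists() else []
    missing = [pat for pat in required_patterns if pat not in existing]
    if missing or not p.exists():
        p.write_text("\n".join(existing + missing) + "\n")
    return missing
```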

5. Parse tasks.md structure and extract:
   - **Task phases**: Setup, Tests, Core, Integration, Polish
   - **Task dependencies**: Sequential vs parallel execution rules
   - **Task details**: ID, description, file paths, parallel markers [P]
   - **Execution flow**: Order and dependency requirements

6. Execute implementation following the task plan:
   - **Phase-by-phase execution**: Complete each phase before moving to the next
   - **Respect dependencies**: Run sequential tasks in order, parallel tasks [P] can run together  
   - **Follow TDD approach**: Execute test tasks before their corresponding implementation tasks
   - **File-based coordination**: Tasks affecting the same files must run sequentially
   - **Validation checkpoints**: Verify each phase completion before proceeding
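
The [P] marker and file-based coordination rules amount to a batching pass over the ordered task list. A minimal sketch (names and the tuple shape are ours):

```python
def schedule(tasks):
    """tasks: ordered list of (task_id, is_parallel, files_touched).

    Consecutive [P] tasks share a batch only if their file sets are disjoint;
    sequential tasks and same-file conflicts each start a new batch.
    """
    batches = []
    for tid, is_parallel, files in tasks:
        files = set(files)
        if is_parallel and batches and batches[-1]["parallel"]:
            batch_files = set().union(*(f for _, f in batches[-1]["tasks"]))
            if files.isdisjoint(batch_files):
                batches[-1]["tasks"].append((tid, files))
                continue
        batches.append({"parallel": is_parallel, "tasks": [(tid, files)]})
    return batches
```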

7. Implementation execution rules:
   - **Setup first**: Initialize project structure, dependencies, configuration
   - **Tests before code**: Where the plan calls for them, write tests for contracts, entities, and integration scenarios before the code they exercise
   - **Core development**: Implement models, services, CLI commands, endpoints
   - **Integration work**: Database connections, middleware, logging, external services
   - **Polish and validation**: Unit tests, performance optimization, documentation

8. Progress tracking and error handling:
   - Report progress after each completed task
   - Halt execution if any non-parallel task fails
   - For parallel tasks [P], continue with successful tasks, report failed ones
   - Provide clear error messages with context for debugging
   - Suggest next steps if implementation cannot proceed
   - **IMPORTANT** For completed tasks, make sure to mark the task off as [X] in the tasks file.

9. Completion validation:
   - Verify all required tasks are completed
   - Check that implemented features match the original specification
   - Validate that tests pass and coverage meets requirements
   - Confirm the implementation follows the technical plan
   - Report final status with summary of completed work

Note: This command assumes a complete task breakdown exists in tasks.md. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.

10. **Check for extension hooks**: After completion validation, check if `.specify/extensions.yml` exists in the project root.
    - If it exists, read it and look for entries under the `hooks.after_implement` key
    - If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
    - Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
    - For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
      - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
      - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
    - For each executable hook, output the following based on its `optional` flag:
      - **Optional hook** (`optional: true`):
        ```
        ## Extension Hooks

        **Optional Hook**: {extension}
        Command: `/{command}`
        Description: {description}

        Prompt: {prompt}
        To execute: `/{command}`
        ```
      - **Mandatory hook** (`optional: false`):
        ```
        ## Extension Hooks

        **Automatic Hook**: {extension}
        Executing: `/{command}`
        EXECUTE_COMMAND: {command}
        ```
    - If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently
diff --git a/.claude/commands/speckit.plan.md b/.claude/commands/speckit.plan.md
new file mode 100644
index 0000000..e3c0764
--- /dev/null
+++ b/.claude/commands/speckit.plan.md
@@ -0,0 +1,153 @@
---
description: Execute the implementation planning workflow using the plan template to generate design artifacts.
handoffs: 
  - label: Create Tasks
    agent: speckit.tasks
    prompt: Break the plan into tasks
    send: true
  - label: Create Checklist
    agent: speckit.checklist
    prompt: Create a checklist for the following domain...
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Pre-Execution Checks

**Check for extension hooks (before planning)**:
- Check if `.specify/extensions.yml` exists in the project root.
- If it exists, read it and look for entries under the `hooks.before_plan` key
- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
- For each executable hook, output the following based on its `optional` flag:
  - **Optional hook** (`optional: true`):
    ```
    ## Extension Hooks

    **Optional Pre-Hook**: {extension}
    Command: `/{command}`
    Description: {description}

    Prompt: {prompt}
    To execute: `/{command}`
    ```
  - **Mandatory hook** (`optional: false`):
    ```
    ## Extension Hooks

    **Automatic Pre-Hook**: {extension}
    Executing: `/{command}`
    EXECUTE_COMMAND: {command}

    Wait for the result of the hook command before proceeding to the Outline.
    ```
- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently

## Outline

1. **Setup**: Run `.specify/scripts/bash/setup-plan.sh --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

2. **Load context**: Read FEATURE_SPEC and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).

3. **Execute plan workflow**: Follow the structure in IMPL_PLAN template to:
   - Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
   - Fill Constitution Check section from constitution
   - Evaluate gates (ERROR if violations unjustified)
   - Phase 0: Generate research.md (resolve all NEEDS CLARIFICATION)
   - Phase 1: Generate data-model.md, contracts/, quickstart.md
   - Phase 1: Update agent context by running the agent script
   - Re-evaluate Constitution Check post-design

4. **Stop and report**: Command ends after Phase 1 design. Report branch, IMPL_PLAN path, and generated artifacts.

5. **Check for extension hooks**: After reporting, check if `.specify/extensions.yml` exists in the project root.
   - If it exists, read it and look for entries under the `hooks.after_plan` key
   - If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
   - Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
   - For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
     - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
     - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
   - For each executable hook, output the following based on its `optional` flag:
     - **Optional hook** (`optional: true`):
       ```
       ## Extension Hooks

       **Optional Hook**: {extension}
       Command: `/{command}`
       Description: {description}

       Prompt: {prompt}
       To execute: `/{command}`
       ```
     - **Mandatory hook** (`optional: false`):
       ```
       ## Extension Hooks

       **Automatic Hook**: {extension}
       Executing: `/{command}`
       EXECUTE_COMMAND: {command}
       ```
   - If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently

## Phases

### Phase 0: Outline & Research

1. **Extract unknowns from Technical Context** above:
   - For each NEEDS CLARIFICATION → research task
   - For each dependency → best practices task
   - For each integration → patterns task

2. **Generate and dispatch research agents**:

   ```text
   For each unknown in Technical Context:
     Task: "Research {unknown} for {feature context}"
   For each technology choice:
     Task: "Find best practices for {tech} in {domain}"
   ```

3. **Consolidate findings** in `research.md` using format:
   - Decision: [what was chosen]
   - Rationale: [why chosen]
   - Alternatives considered: [what else evaluated]

**Output**: research.md with all NEEDS CLARIFICATION resolved

### Phase 1: Design & Contracts

**Prerequisites:** `research.md` complete

1. **Extract entities from feature spec** → `data-model.md`:
   - Entity name, fields, relationships
   - Validation rules from requirements
   - State transitions if applicable

2. **Define interface contracts** (if project has external interfaces) → `/contracts/`:
   - Identify what interfaces the project exposes to users or other systems
   - Document the contract format appropriate for the project type
   - Examples: public APIs for libraries, command schemas for CLI tools, endpoints for web services, grammars for parsers, UI contracts for applications
   - Skip if project is purely internal (build scripts, one-off tools, etc.)

3. **Agent context update**:
   - Run `.specify/scripts/bash/update-agent-context.sh claude`
   - The script detects which AI agent is in use
   - Update the appropriate agent-specific context file
   - Add only new technology from current plan
   - Preserve manual additions between markers

**Output**: data-model.md, /contracts/*, quickstart.md, agent-specific file

## Key rules

- Use absolute paths
- ERROR on gate failures or unresolved clarifications
diff --git a/.claude/commands/speckit.review.md b/.claude/commands/speckit.review.md
new file mode 100644
index 0000000..7b840b7
--- /dev/null
+++ b/.claude/commands/speckit.review.md
@@ -0,0 +1,117 @@
---
description: Cross-model evaluation of plan.md and tasks.md before implementation.
  Reviews feasibility, completeness, dependency ordering, risk, and parallelization
  correctness using a different model than was used to generate the artifacts.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
user-invocable: false
agents: []
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

---

You are a **Pre-Implementation Reviewer** -- a critical evaluator who reviews the design artifacts (plan.md, tasks.md, spec.md) produced by earlier workflow phases. Your purpose is to catch issues that the generating model may have been blind to, before implementation begins.

**STRICTLY READ-ONLY**: Do NOT modify any files. Output a structured review report only.

## What You Review

Run `{SCRIPT}` from the repo root to discover `FEATURE_DIR`. Then read these artifacts:

- `spec.md` -- the feature specification (requirements, user stories)
- `plan.md` -- the technical plan (architecture, tech stack, file structure)
- `tasks.md` -- the task breakdown (phased, dependency-ordered, with [P] markers)
- `checklists/` -- any requirement quality checklists (if present)
- `remediation.md` -- analyze output (if present)

## Review Dimensions

Evaluate across these 7 dimensions. For each, assign a verdict: **PASS**, **WARN**, or **FAIL**.

### 1. Spec-Plan Alignment
- Does plan.md address every user story in spec.md?
- Are there plan decisions that contradict spec requirements?
- Are non-functional requirements (performance, security, accessibility) covered in the plan?

### 2. Plan-Tasks Completeness
- Does every architectural component in plan.md have corresponding tasks in tasks.md?
- Are there tasks that reference files/patterns not described in plan.md?
- Are test tasks present for critical paths?

### 3. Dependency Ordering
- Are task phases ordered correctly? (setup -> foundational -> stories -> polish)
- Do any tasks reference files/interfaces that haven't been created by an earlier task?
- Are foundational tasks truly blocking, or could some be parallelized?

### 4. Parallelization Correctness
- Are `[P]` markers accurate? (Do tasks marked parallel truly touch different files with no dependency?)
- Are there tasks NOT marked `[P]` that could be parallelized?
- Do `<!-- parallel-group: N -->` groupings respect the max-3 constraint?
- Are there same-file conflicts hidden within a parallel group?

### 5. Feasibility & Risk
- Are there tasks that seem too large? (If a single task touches >3 files or >200 LOC, flag it)
- Are there technology choices in plan.md that contradict the project's existing stack?
- Are there missing error handling, edge case, or migration tasks?
- Does the task count seem proportional to the feature complexity?

### 6. Constitution & Standards Compliance
- Read `.specify/memory/constitution.md` and check plan aligns with project principles
- Check that testing approach matches the project's testing standards (80% coverage, TDD if required)
- Verify security considerations are addressed (path validation, input sanitization, etc.)

### 7. Implementation Readiness
- Is every task specific enough for an LLM to execute without ambiguity?
- Do all tasks include exact file paths?
- Are acceptance criteria clear for each user story phase?

## Output Format

```markdown
# Pre-Implementation Review

**Feature**: {feature name from spec.md}
**Artifacts reviewed**: spec.md, plan.md, tasks.md, [others if present]
**Review model**: {your model name} (should be different from the model that generated the artifacts)
**Generating model**: {model used for Phases 1-6, if known}

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS/WARN/FAIL | brief note |
| Plan-Tasks Completeness | PASS/WARN/FAIL | brief note |
| Dependency Ordering | PASS/WARN/FAIL | brief note |
| Parallelization Correctness | PASS/WARN/FAIL | brief note |
| Feasibility & Risk | PASS/WARN/FAIL | brief note |
| Standards Compliance | PASS/WARN/FAIL | brief note |
| Implementation Readiness | PASS/WARN/FAIL | brief note |

**Overall**: READY / READY WITH WARNINGS / NOT READY

## Findings

### Critical (FAIL -- must fix before implementing)
1. ...

### Warnings (WARN -- recommend fixing, can proceed)
1. ...

### Observations (informational)
1. ...

## Recommended Actions
- [ ] {specific action to address each FAIL/WARN}
```
\ No newline at end of file
diff --git a/.claude/commands/speckit.specify.md b/.claude/commands/speckit.specify.md
new file mode 100644
index 0000000..8ee3d15
--- /dev/null
+++ b/.claude/commands/speckit.specify.md
@@ -0,0 +1,300 @@
---
description: Create or update the feature specification from a natural language feature description.
handoffs: 
  - label: Build Technical Plan
    agent: speckit.plan
    prompt: Create a plan for the spec. I am building with...
  - label: Clarify Spec Requirements
    agent: speckit.clarify
    prompt: Clarify specification requirements
    send: true
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Pre-Execution Checks

**Check for extension hooks (before specification)**:
- Check if `.specify/extensions.yml` exists in the project root.
- If it exists, read it and look for entries under the `hooks.before_specify` key
- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
- For each executable hook, output the following based on its `optional` flag:
  - **Optional hook** (`optional: true`):
    ```
    ## Extension Hooks

    **Optional Pre-Hook**: {extension}
    Command: `/{command}`
    Description: {description}

    Prompt: {prompt}
    To execute: `/{command}`
    ```
  - **Mandatory hook** (`optional: false`):
    ```
    ## Extension Hooks

    **Automatic Pre-Hook**: {extension}
    Executing: `/{command}`
    EXECUTE_COMMAND: {command}

    Wait for the result of the hook command before proceeding to the Outline.
    ```
- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently

## Outline

The text the user typed after `/speckit.specify` in the triggering message **is** the feature description. Assume you always have it available in this conversation even if `$ARGUMENTS` appears literally below. Do not ask the user to repeat it unless they provided an empty command.

Given that feature description, do this:

1. **Generate a concise short name** (2-4 words) for the branch:
   - Analyze the feature description and extract the most meaningful keywords
   - Create a 2-4 word short name that captures the essence of the feature
   - Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
   - Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
   - Keep it concise but descriptive enough to understand the feature at a glance
   - Examples:
     - "I want to add user authentication" → "user-auth"
     - "Implement OAuth2 integration for the API" → "oauth2-api-integration"
     - "Create a dashboard for analytics" → "analytics-dashboard"
     - "Fix payment processing timeout bug" → "fix-payment-timeout"

2. **Create the feature branch** by running the script with `--short-name` (and `--json`), and do NOT pass `--number` (the script auto-detects the next globally available number across all branches and spec directories):

   - Bash example: `.specify/scripts/bash/create-new-feature.sh --json --short-name "user-auth" "Add user authentication"`
   - PowerShell example: `.specify/scripts/powershell/create-new-feature.ps1 -Json -ShortName "user-auth" "Add user authentication"`

   **IMPORTANT**:
   - Do NOT pass `--number` — the script determines the correct next number automatically
   - Always include the JSON flag (`--json` for Bash, `-Json` for PowerShell) so the output can be parsed reliably
   - Run this script only once per feature
   - The JSON is printed to the terminal as output; always refer to it for the actual values you need
   - The JSON output will contain BRANCH_NAME and SPEC_FILE paths
   - For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot")
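A sketch of invoking the script once and parsing its JSON output. The key names `BRANCH_NAME` and `SPEC_FILE` are taken from the description above; everything else (argument order, working directory at repo root) is an assumption.

```python
import json
import subprocess

def parse_feature_json(stdout):
    """Extract branch and spec path from the script's JSON output."""
    data = json.loads(stdout)
    return data["BRANCH_NAME"], data["SPEC_FILE"]

def create_feature(description, short_name):
    """Invoke create-new-feature.sh exactly once; never pass --number."""
    result = subprocess.run(
        [".specify/scripts/bash/create-new-feature.sh",
         "--json", "--short-name", short_name, description],
        capture_output=True, text=True, check=True,
    )
    return parse_feature_json(result.stdout)
```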

3. Load `.specify/templates/spec-template.md` to understand required sections.

4. Follow this execution flow:

    1. Parse user description from Input
       If empty: ERROR "No feature description provided"
    2. Extract key concepts from description
       Identify: actors, actions, data, constraints
    3. For unclear aspects:
       - Make informed guesses based on context and industry standards
       - Only mark with [NEEDS CLARIFICATION: specific question] if:
         - The choice significantly impacts feature scope or user experience
         - Multiple reasonable interpretations exist with different implications
         - No reasonable default exists
       - **LIMIT: Maximum 3 [NEEDS CLARIFICATION] markers total**
       - Prioritize clarifications by impact: scope > security/privacy > user experience > technical details
    4. Fill User Scenarios & Testing section
       If no clear user flow: ERROR "Cannot determine user scenarios"
    5. Generate Functional Requirements
       Each requirement must be testable
       Use reasonable defaults for unspecified details (document assumptions in Assumptions section)
    6. Define Success Criteria
       Create measurable, technology-agnostic outcomes
       Include both quantitative metrics (time, performance, volume) and qualitative measures (user satisfaction, task completion)
       Each criterion must be verifiable without implementation details
    7. Identify Key Entities (if data involved)
    8. Return: SUCCESS (spec ready for planning)

5. Write the specification to SPEC_FILE using the template structure, replacing placeholders with concrete details derived from the feature description (arguments) while preserving section order and headings.

6. **Specification Quality Validation**: After writing the initial spec, validate it against quality criteria:

   a. **Create Spec Quality Checklist**: Generate a checklist file at `FEATURE_DIR/checklists/requirements.md` using the checklist template structure with these validation items:

      ```markdown
      # Specification Quality Checklist: [FEATURE NAME]
      
      **Purpose**: Validate specification completeness and quality before proceeding to planning
      **Created**: [DATE]
      **Feature**: [Link to spec.md]
      
      ## Content Quality
      
      - [ ] No implementation details (languages, frameworks, APIs)
      - [ ] Focused on user value and business needs
      - [ ] Written for non-technical stakeholders
      - [ ] All mandatory sections completed
      
      ## Requirement Completeness
      
      - [ ] No [NEEDS CLARIFICATION] markers remain
      - [ ] Requirements are testable and unambiguous
      - [ ] Success criteria are measurable
      - [ ] Success criteria are technology-agnostic (no implementation details)
      - [ ] All acceptance scenarios are defined
      - [ ] Edge cases are identified
      - [ ] Scope is clearly bounded
      - [ ] Dependencies and assumptions identified
      
      ## Feature Readiness
      
      - [ ] All functional requirements have clear acceptance criteria
      - [ ] User scenarios cover primary flows
      - [ ] Feature meets measurable outcomes defined in Success Criteria
      - [ ] No implementation details leak into specification
      
      ## Notes
      
      - Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`
      ```

   b. **Run Validation Check**: Review the spec against each checklist item:
      - For each item, determine if it passes or fails
      - Document specific issues found (quote relevant spec sections)

   c. **Handle Validation Results**:

      - **If all items pass**: Mark checklist complete and proceed to step 7

      - **If items fail (excluding [NEEDS CLARIFICATION])**:
        1. List the failing items and specific issues
        2. Update the spec to address each issue
        3. Re-run validation until all items pass (max 3 iterations)
        4. If still failing after 3 iterations, document remaining issues in checklist notes and warn user

      - **If [NEEDS CLARIFICATION] markers remain**:
        1. Extract all [NEEDS CLARIFICATION: ...] markers from the spec
        2. **LIMIT CHECK**: If more than 3 markers exist, keep only the 3 most critical (by scope/security/UX impact) and make informed guesses for the rest
        3. For each clarification needed (max 3), present options to user in this format:

           ```markdown
           ## Question [N]: [Topic]
           
           **Context**: [Quote relevant spec section]
           
           **What we need to know**: [Specific question from NEEDS CLARIFICATION marker]
           
           **Suggested Answers**:
           
           | Option | Answer | Implications |
           |--------|--------|--------------|
           | A      | [First suggested answer] | [What this means for the feature] |
           | B      | [Second suggested answer] | [What this means for the feature] |
           | C      | [Third suggested answer] | [What this means for the feature] |
           | Custom | Provide your own answer | [Explain how to provide custom input] |
           
           **Your choice**: _[Wait for user response]_
           ```

        4. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
           - Use consistent spacing with pipes aligned
           - Each cell should have spaces around content: `| Content |` not `|Content|`
           - Header separator must have at least 3 dashes: `|--------|`
           - Test that the table renders correctly in markdown preview
        5. Number questions sequentially (Q1, Q2, Q3 - max 3 total)
        6. Present all questions together before waiting for responses
        7. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
        8. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
        9. Re-run validation after all clarifications are resolved

   d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status

7. Report completion with branch name, spec file path, checklist results, and readiness for the next phase (`/speckit.clarify` or `/speckit.plan`).

8. **Check for extension hooks**: After reporting completion, check if `.specify/extensions.yml` exists in the project root.
   - If it exists, read it and look for entries under the `hooks.after_specify` key
   - If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
   - Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
   - For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
     - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
     - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
   - For each executable hook, output the following based on its `optional` flag:
     - **Optional hook** (`optional: true`):
       ```
       ## Extension Hooks

       **Optional Hook**: {extension}
       Command: `/{command}`
       Description: {description}

       Prompt: {prompt}
       To execute: `/{command}`
       ```
     - **Mandatory hook** (`optional: false`):
       ```
       ## Extension Hooks

       **Automatic Hook**: {extension}
       Executing: `/{command}`
       EXECUTE_COMMAND: {command}
       ```
   - If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently

**NOTE:** The script creates and checks out the new branch and initializes the spec file before writing.

## Quick Guidelines

- Focus on **WHAT** users need and **WHY**.
- Avoid HOW to implement (no tech stack, APIs, code structure).
- Written for business stakeholders, not developers.
- DO NOT create any checklists that are embedded in the spec. That will be a separate command.

### Section Requirements

- **Mandatory sections**: Must be completed for every feature
- **Optional sections**: Include only when relevant to the feature
- When a section doesn't apply, remove it entirely (don't leave as "N/A")

### For AI Generation

When creating this spec from a user prompt:

1. **Make informed guesses**: Use context, industry standards, and common patterns to fill gaps
2. **Document assumptions**: Record reasonable defaults in the Assumptions section
3. **Limit clarifications**: Maximum 3 [NEEDS CLARIFICATION] markers - use only for critical decisions that:
   - Significantly impact feature scope or user experience
   - Have multiple reasonable interpretations with different implications
   - Lack any reasonable default
4. **Prioritize clarifications**: scope > security/privacy > user experience > technical details
5. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
6. **Common areas needing clarification** (only if no reasonable default exists):
   - Feature scope and boundaries (include/exclude specific use cases)
   - User types and permissions (if multiple conflicting interpretations possible)
   - Security/compliance requirements (when legally/financially significant)

**Examples of reasonable defaults** (don't ask about these):

- Data retention: Industry-standard practices for the domain
- Performance targets: Standard web/mobile app expectations unless specified
- Error handling: User-friendly messages with appropriate fallbacks
- Authentication method: Standard session-based or OAuth2 for web apps
- Integration patterns: Use project-appropriate patterns (REST/GraphQL for web services, function calls for libraries, CLI args for tools, etc.)

### Success Criteria Guidelines

Success criteria must be:

1. **Measurable**: Include specific metrics (time, percentage, count, rate)
2. **Technology-agnostic**: No mention of frameworks, languages, databases, or tools
3. **User-focused**: Describe outcomes from user/business perspective, not system internals
4. **Verifiable**: Can be tested/validated without knowing implementation details

**Good examples**:

- "Users can complete checkout in under 3 minutes"
- "System supports 10,000 concurrent users"
- "95% of searches return results in under 1 second"
- "Task completion rate improves by 40%"

**Bad examples** (implementation-focused):

- "API response time is under 200ms" (too technical, use "Users see results instantly")
- "Database can handle 1000 TPS" (implementation detail, use user-facing metric)
- "React components render efficiently" (framework-specific)
- "Redis cache hit rate above 80%" (technology-specific)
diff --git a/.claude/commands/speckit.tasks.md b/.claude/commands/speckit.tasks.md
new file mode 100644
index 0000000..d021892
--- /dev/null
+++ b/.claude/commands/speckit.tasks.md
@@ -0,0 +1,200 @@
---
description: Generate an actionable, dependency-ordered tasks.md for the feature based on available design artifacts.
handoffs: 
  - label: Analyze For Consistency
    agent: speckit.analyze
    prompt: Run a project analysis for consistency
    send: true
  - label: Implement Project
    agent: speckit.implement
    prompt: Start the implementation in phases
    send: true
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Pre-Execution Checks

**Check for extension hooks (before tasks generation)**:
- Check if `.specify/extensions.yml` exists in the project root.
- If it exists, read it and look for entries under the `hooks.before_tasks` key
- If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
- Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
- For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
  - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
  - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
- For each executable hook, output the following based on its `optional` flag:
  - **Optional hook** (`optional: true`):
    ```
    ## Extension Hooks

    **Optional Pre-Hook**: {extension}
    Command: `/{command}`
    Description: {description}

    Prompt: {prompt}
    To execute: `/{command}`
    ```
  - **Mandatory hook** (`optional: false`):
    ```
    ## Extension Hooks

    **Automatic Pre-Hook**: {extension}
    Executing: `/{command}`
    EXECUTE_COMMAND: {command}
    
    Wait for the result of the hook command before proceeding to the Outline.
    ```
- If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently

## Outline

1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

2. **Load design documents**: Read from FEATURE_DIR:
   - **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities)
   - **Optional**: data-model.md (entities), contracts/ (interface contracts), research.md (decisions), quickstart.md (test scenarios)
   - Note: Not all projects have all documents. Generate tasks based on what's available.

3. **Execute task generation workflow**:
   - Load plan.md and extract tech stack, libraries, project structure
   - Load spec.md and extract user stories with their priorities (P1, P2, P3, etc.)
   - If data-model.md exists: Extract entities and map to user stories
   - If contracts/ exists: Map interface contracts to user stories
   - If research.md exists: Extract decisions for setup tasks
   - Generate tasks organized by user story (see Task Generation Rules below)
   - Generate dependency graph showing user story completion order
   - Create parallel execution examples per user story
   - Validate task completeness (each user story has all needed tasks, independently testable)

4. **Generate tasks.md**: Use `.specify/templates/tasks-template.md` as structure, fill with:
   - Correct feature name from plan.md
   - Phase 1: Setup tasks (project initialization)
   - Phase 2: Foundational tasks (blocking prerequisites for all user stories)
   - Phase 3+: One phase per user story (in priority order from spec.md)
   - Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
   - Final Phase: Polish & cross-cutting concerns
   - All tasks must follow the strict checklist format (see Task Generation Rules below)
   - Clear file paths for each task
   - Dependencies section showing story completion order
   - Parallel execution examples per story
   - Implementation strategy section (MVP first, incremental delivery)

5. **Report**: Output path to generated tasks.md and summary:
   - Total task count
   - Task count per user story
   - Parallel opportunities identified
   - Independent test criteria for each story
   - Suggested MVP scope (typically just User Story 1)
   - Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)

6. **Check for extension hooks**: After tasks.md is generated, check if `.specify/extensions.yml` exists in the project root.
   - If it exists, read it and look for entries under the `hooks.after_tasks` key
   - If the YAML cannot be parsed or is invalid, skip hook checking silently and continue normally
   - Filter out hooks where `enabled` is explicitly `false`. Treat hooks without an `enabled` field as enabled by default.
   - For each remaining hook, do **not** attempt to interpret or evaluate hook `condition` expressions:
     - If the hook has no `condition` field, or it is null/empty, treat the hook as executable
     - If the hook defines a non-empty `condition`, skip the hook and leave condition evaluation to the HookExecutor implementation
   - For each executable hook, output the following based on its `optional` flag:
     - **Optional hook** (`optional: true`):
       ```
       ## Extension Hooks

       **Optional Hook**: {extension}
       Command: `/{command}`
       Description: {description}

       Prompt: {prompt}
       To execute: `/{command}`
       ```
     - **Mandatory hook** (`optional: false`):
       ```
       ## Extension Hooks

       **Automatic Hook**: {extension}
       Executing: `/{command}`
       EXECUTE_COMMAND: {command}
       ```
   - If no hooks are registered or `.specify/extensions.yml` does not exist, skip silently

Context for task generation: $ARGUMENTS

The tasks.md should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.

## Task Generation Rules

**CRITICAL**: Tasks MUST be organized by user story to enable independent implementation and testing.

**Tests are OPTIONAL**: Only generate test tasks if explicitly requested in the feature specification or if user requests TDD approach.

### Checklist Format (REQUIRED)

Every task MUST strictly follow this format:

```text
- [ ] [TaskID] [P?] [Story?] Description with file path
```

**Format Components**:

1. **Checkbox**: ALWAYS start with `- [ ]` (markdown checkbox)
2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
4. **[Story] label**: REQUIRED for user story phase tasks only
   - Format: [US1], [US2], [US3], etc. (maps to user stories from spec.md)
   - Setup phase: NO story label
   - Foundational phase: NO story label  
   - User Story phases: MUST have story label
   - Polish phase: NO story label
5. **Description**: Clear action with exact file path

**Examples**:

- ✅ CORRECT: `- [ ] T001 Create project structure per implementation plan`
- ✅ CORRECT: `- [ ] T005 [P] Implement authentication middleware in src/middleware/auth.py`
- ✅ CORRECT: `- [ ] T012 [P] [US1] Create User model in src/models/user.py`
- ✅ CORRECT: `- [ ] T014 [US1] Implement UserService in src/services/user_service.py`
- ❌ WRONG: `- [ ] Create User model` (missing ID and Story label)
- ❌ WRONG: `T001 [US1] Create model` (missing checkbox)
- ❌ WRONG: `- [ ] [US1] Create User model` (missing Task ID)
- ❌ WRONG: `- [ ] T001 [US1] Create model` (missing file path)
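The checklist format above can be checked mechanically. A minimal regex sketch (the exact pattern is an illustration, not a spec — in particular, it does not verify that the description actually contains a file path):

```python
import re

TASK_RE = re.compile(
    r"^- \[ \] "        # 1. checkbox
    r"(T\d{3}) "        # 2. sequential task ID
    r"(?:\[P\] )?"      # 3. optional parallel marker
    r"(?:\[US\d+\] )?"  # 4. optional story label
    r"(.+)$"            # 5. description (file path not enforced here)
)

def check_task_line(line):
    """Return (task_id, description) if the line matches, else None."""
    m = TASK_RE.match(line)
    return (m.group(1), m.group(2)) if m else None
```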

### Task Organization

1. **From User Stories (spec.md)** - PRIMARY ORGANIZATION:
   - Each user story (P1, P2, P3...) gets its own phase
   - Map all related components to their story:
     - Models needed for that story
     - Services needed for that story
     - Interfaces/UI needed for that story
     - If tests requested: Tests specific to that story
   - Mark story dependencies (most stories should be independent)

2. **From Contracts**:
   - Map each interface contract → to the user story it serves
   - If tests requested: Each interface contract → contract test task [P] before implementation in that story's phase

3. **From Data Model**:
   - Map each entity to the user story(ies) that need it
   - If entity serves multiple stories: Put in earliest story or Setup phase
   - Relationships → service layer tasks in appropriate story phase

4. **From Setup/Infrastructure**:
   - Shared infrastructure → Setup phase (Phase 1)
   - Foundational/blocking tasks → Foundational phase (Phase 2)
   - Story-specific setup → within that story's phase

### Phase Structure

- **Phase 1**: Setup (project initialization)
- **Phase 2**: Foundational (blocking prerequisites - MUST complete before user stories)
- **Phase 3+**: User Stories in priority order (P1, P2, P3...)
  - Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
  - Each phase should be a complete, independently testable increment
- **Final Phase**: Polish & Cross-Cutting Concerns
diff --git a/.claude/commands/speckit.taskstoissues.md b/.claude/commands/speckit.taskstoissues.md
new file mode 100644
index 0000000..0799191
--- /dev/null
+++ b/.claude/commands/speckit.taskstoissues.md
@@ -0,0 +1,30 @@
---
description: Convert existing tasks into actionable, dependency-ordered GitHub issues for the feature based on available design artifacts.
tools: ['github/github-mcp-server/issue_write']
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Outline

1. Run `.specify/scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
1. From the executed script, extract the path to **tasks**.
1. Get the Git remote by running:

```bash
git config --get remote.origin.url
```

> [!CAUTION]
> ONLY PROCEED TO NEXT STEPS IF THE REMOTE IS A GITHUB URL
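A hedged sketch of the GitHub-only guard above: it accepts the two common GitHub remote forms (HTTPS and SSH) and returns the owner/repo pair only for GitHub URLs. The regex covers the standard clone-URL shapes; unusual remote configurations are out of scope.

```python
import re
import subprocess

GITHUB_RE = re.compile(
    r"^(?:https://github\.com/|git@github\.com:)"
    r"([^/]+)/([^/]+?)(?:\.git)?$"
)

def github_owner_repo(remote_url):
    """Return (owner, repo) only when the remote is a GitHub URL, else None."""
    m = GITHUB_RE.match(remote_url.strip())
    return (m.group(1), m.group(2)) if m else None

def get_remote_owner_repo():
    """Read remote.origin.url and refuse non-GitHub remotes."""
    url = subprocess.run(
        ["git", "config", "--get", "remote.origin.url"],
        capture_output=True, text=True, check=True,
    ).stdout
    return github_owner_repo(url)
```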

1. For each task in the list, use the GitHub MCP server to create a new issue in the repository that the Git remote points to.

> [!CAUTION]
> UNDER NO CIRCUMSTANCES EVER CREATE ISSUES IN REPOSITORIES THAT DO NOT MATCH THE REMOTE URL
diff --git a/.claude/commands/speckit.verify.md b/.claude/commands/speckit.verify.md
new file mode 100644
index 0000000..df33b08
--- /dev/null
+++ b/.claude/commands/speckit.verify.md
@@ -0,0 +1,214 @@
---
description: Perform a non-destructive post-implementation verification gate validating
  the implementation against spec.md, plan.md, tasks.md, and constitution.md.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks
  ps: scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks
handoffs:
- label: Address findings and re-implement
  agent: speckit.implement
  prompt: Address the verification findings and re-run implementation to resolve issues
- label: Re-analyze specification consistency
  agent: speckit.analyze
  prompt: Re-analyze specification consistency based on verification findings
---


<!-- Extension: verify -->
<!-- Config: .specify/extensions/verify/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Goal

Validate the implementation against its specification artifacts (`spec.md`, `plan.md`, `tasks.md`, `constitution.md`). This command MUST run only after `/speckit.implement` has completed.

## Operating Constraints

**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).

**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this verification scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, tasks or implementation—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.verify`.

## Execution Steps

### 1. Initialize Verification Context

Run `{SCRIPT}` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:

- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md

Abort if SPEC or TASKS is missing (instruct the user to run the missing prerequisite command). PLAN and constitution are optional — checks that depend on them are skipped gracefully.
Abort if TASKS has no completed tasks.
For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

### 2. Load Artifacts (Progressive Disclosure)

Load only the minimal necessary context from each artifact:

**From spec.md:**

- Functional Requirements
- User Stories and Acceptance Criteria
- Scenarios
- Edge Cases (if present)

**From plan.md (optional):**

- Architecture/stack choices
- Data Model references
- Project structure (directory layout)

**From tasks.md:**

- Task IDs
- Completion status
- Descriptions
- Phase grouping
- Referenced file paths
- Count total tasks and completed tasks

**From constitution (optional):**

- Load `.specify/memory/constitution.md` for principle validation
- If missing or placeholder: skip constitution checks, emit Info finding

### 3. Identify Implementation Scope

Build the set of files to verify from tasks.md.

- Parse all tasks in tasks.md — both completed (`[x]`/`[X]`) and incomplete (`[ ]`)
- Extract file paths referenced in each task description
- Build **REVIEW_FILES** set from completed task file paths
- Track **INCOMPLETE_TASK_FILES** from incomplete tasks (used by check C)
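The scope-building step above can be sketched as follows. The task-line and file-path regexes are assumptions for illustration (a real implementation would handle more path shapes and file extensions):

```python
import re

TASK_LINE = re.compile(r"^- \[([ xX])\] (T\d{3}) (.*)$")
# Hypothetical path matcher; the extension list is an assumption.
PATH_RE = re.compile(r"\b[\w./-]+\.(?:py|ts|js|md|go|rs|java|sh)\b")

def scope_from_tasks(tasks_md):
    """Split task-referenced paths into REVIEW_FILES (completed tasks)
    and INCOMPLETE_TASK_FILES (incomplete tasks)."""
    review, incomplete = set(), set()
    for line in tasks_md.splitlines():
        m = TASK_LINE.match(line.strip())
        if not m:
            continue
        done = m.group(1) in "xX"
        paths = PATH_RE.findall(m.group(3))
        (review if done else incomplete).update(paths)
    return review, incomplete
```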

### 4. Build Semantic Models

Create internal representations (do not include raw artifacts in output):

- **Task inventory**: Each task with ID, completion status, referenced file paths, and phase grouping
- **Implementation mapping**: Map each completed task to its referenced file paths
- **File inventory**: All REVIEW_FILES with existence verification — flag any task-referenced file that does not exist on disk
- **Requirements inventory**: Each functional requirement with a stable key — map to tasks and REVIEW_FILES for implementation evidence (evidence = file in REVIEW_FILES containing keyword/ID match, function signatures, or code paths that address the requirement)
- **Spec intent references**: User stories, acceptance criteria, and scenarios from spec.md
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements

### 5. Verification Checks (Token-Efficient Analysis)

Focus on high-signal findings. Limit to 50 findings total; aggregate remainder in overflow summary.

#### A. Task Completion

- Compare completed (`[x]`/`[X]`) vs total tasks
- Flag majority incomplete vs minority incomplete

#### B. File Existence

- Task-referenced files that do not exist on disk
- Tasks referencing ambiguous or unresolvable paths

#### C. Requirement Coverage

- Requirements with no implementation evidence in REVIEW_FILES
- Requirements whose tasks are all incomplete

#### D. Scenario & Test Coverage

- Spec scenarios with no corresponding test or code path
- No test files detected at all in REVIEW_FILES

#### E. Spec Intent Alignment

- Implementation diverging from spec intent (minor vs fundamental divergence)
- Compare acceptance criteria against actual behaviour in REVIEW_FILES

#### F. Constitution Alignment

- Any implementation element conflicting with a constitution MUST principle
- Missing mandated sections or quality gates from constitution

#### G. Design & Structure Consistency

- Architectural decisions or design patterns from plan.md not reflected in code
- Planned directory/file layout deviating from actual structure
- New code deviating from existing project conventions (naming, module structure, error handling patterns)
- Public APIs/exports/endpoints not described in plan.md

### 6. Severity Assignment

Use this heuristic to prioritize findings:

- **CRITICAL**: Violates constitution MUST, majority of tasks incomplete, task-referenced files missing from disk, requirement with zero implementation
- **HIGH**: Spec intent divergence, fundamental implementation mismatch with acceptance criteria, missing scenario/test coverage
- **MEDIUM**: Design pattern drift, minor spec intent deviation
- **LOW**: Structure deviations, naming inconsistencies, minor observations not affecting functionality
- **INFO**: Positive confirmations (all tasks complete, all requirements covered, no issues found). Use sparingly — only in summary metrics, not as individual finding rows.

### 7. Produce Compact Verification Report

Output a Markdown report (no file writes) with the following structure:

## Verification Report

| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Task Completion | CRITICAL | tasks.md | 3 of 12 tasks incomplete | Complete tasks T05, T08, T11 |
| B1 | File Existence | CRITICAL | src/auth.ts | Task-referenced file missing | Create file or update task reference |
| C1 | Requirement Coverage | CRITICAL | spec.md:FR-003 | No implementation evidence | Implement FR-003 |

(Add one row per finding; generate stable IDs prefixed by check letter: A1, B1, C1... Reference specific files and line numbers in Location(s) where applicable.)

**Task Summary Table:**

| Task ID | Status | Referenced Files | Notes |
|---------|--------|-----------------|-------|

**Constitution Alignment Issues:** (if any)

**Metrics:**

- Total Tasks (completed / total)
- Requirement Coverage % (requirements with implementation evidence / total)
- Files Verified
- Critical Issues Count

### 8. Provide Next Actions

At end of report, output a concise Next Actions block:

- If CRITICAL issues exist: Recommend resolving before proceeding
- If HIGH issues exist: Recommend addressing before merge; user may proceed at own risk
- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
- Provide explicit command suggestions: e.g., "Run `/speckit.implement` to address findings and re-run verification", "Implementation verified — ready for review or merge"

### 9. Offer Remediation

Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)

## Operating Principles

### Context Efficiency

- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
- **Progressive disclosure**: Load artifacts and source files incrementally; don't dump all content into analysis
- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts

### Analysis Guidelines

- **NEVER modify files** (this is read-only analysis)
- **NEVER hallucinate missing sections** (if absent, report them accurately)
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)
- **Every finding must trace back** to a specification artifact (spec.md requirement, user story, scenario, edge case), a structural reference (plan.md, constitution.md), or a task in tasks.md

### Idempotency by Design

The command produces deterministic output — running verification twice on the same state yields the same report. No counters, timestamp-dependent logic, or accumulated state affects findings. The report is fully regenerated on each run.
\ No newline at end of file
diff --git a/.claude/commands/speckit.verify.run.md b/.claude/commands/speckit.verify.run.md
new file mode 100644
index 0000000..df33b08
--- /dev/null
+++ b/.claude/commands/speckit.verify.run.md
@@ -0,0 +1,214 @@
---
description: Perform a non-destructive post-implementation verification gate validating
  the implementation against spec.md, plan.md, tasks.md, and constitution.md.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks
  ps: scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks
handoffs:
- label: Address findings and re-implement
  agent: speckit.implement
  prompt: Address the verification findings and re-run implementation to resolve issues
- label: Re-analyze specification consistency
  agent: speckit.analyze
  prompt: Re-analyze specification consistency based on verification findings
---


<!-- Extension: verify -->
<!-- Config: .specify/extensions/verify/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Goal

Validate the implementation against its specification artifacts (`spec.md`, `plan.md`, `tasks.md`, `constitution.md`). This command MUST run only after `/speckit.implement` has completed.

## Operating Constraints

**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).

**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this verification scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, tasks, or implementation—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.verify`.

## Execution Steps

### 1. Initialize Verification Context

Run `{SCRIPT}` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:

- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md

Abort if SPEC or TASKS is missing (instruct the user to run the missing prerequisite command). PLAN and constitution are optional — checks that depend on them are skipped gracefully.
Abort if TASKS has no completed tasks.
For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

### 2. Load Artifacts (Progressive Disclosure)

Load only the minimal necessary context from each artifact:

**From spec.md:**

- Functional Requirements
- User Stories and Acceptance Criteria
- Scenarios
- Edge Cases (if present)

**From plan.md (optional):**

- Architecture/stack choices
- Data Model references
- Project structure (directory layout)

**From tasks.md:**

- Task IDs
- Completion status
- Descriptions
- Phase grouping
- Referenced file paths
- Count total tasks and completed tasks

**From constitution (optional):**

- Load `.specify/memory/constitution.md` for principle validation
- If missing or placeholder: skip constitution checks, emit Info finding

### 3. Identify Implementation Scope

Build the set of files to verify from tasks.md.

- Parse all tasks in tasks.md — both completed (`[x]`/`[X]`) and incomplete (`[ ]`)
- Extract file paths referenced in each task description
- Build **REVIEW_FILES** set from completed task file paths
- Track **INCOMPLETE_TASK_FILES** from incomplete tasks (used by check C)
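A minimal, self-contained sketch of this scope-building step. The sample `tasks.md` content and the path-extraction regex are illustrative assumptions; real task lines may embed paths differently.

```shell
# Sketch: build REVIEW_FILES from completed tasks in a sample tasks.md.
dir=$(mktemp -d)
printf -- '- [x] T001 Create src/auth.ts\n- [ ] T002 Create src/session.ts\n- [X] T003 Harden src/auth.ts\n' > "$dir/tasks.md"
# Completed tasks ([x]/[X]) contribute their referenced file paths to REVIEW_FILES
grep -E '^- \[[xX]\]' "$dir/tasks.md" \
  | grep -oE '[A-Za-z0-9_./-]+\.[A-Za-z]+' \
  | sort -u
```

Incomplete tasks (`- [ ]`) would be collected the same way into INCOMPLETE_TASK_FILES for check C.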

### 4. Build Semantic Models

Create internal representations (do not include raw artifacts in output):

- **Task inventory**: Each task with ID, completion status, referenced file paths, and phase grouping
- **Implementation mapping**: Map each completed task to its referenced file paths
- **File inventory**: All REVIEW_FILES with existence verification — flag any task-referenced file that does not exist on disk
- **Requirements inventory**: Each functional requirement with a stable key — map to tasks and REVIEW_FILES for implementation evidence (evidence = file in REVIEW_FILES containing keyword/ID match, function signatures, or code paths that address the requirement)
- **Spec intent references**: User stories, acceptance criteria, and scenarios from spec.md
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements

### 5. Verification Checks (Token-Efficient Analysis)

Focus on high-signal findings. Limit to 50 findings total; aggregate remainder in overflow summary.

#### A. Task Completion

- Compare completed (`[x]`/`[X]`) vs total tasks
- Flag majority incomplete vs minority incomplete
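The count and the majority/minority flag can be sketched as follows (sample task lines are illustrative):

```shell
# Sketch: completed-vs-total counts and the majority/minority flag (check A).
dir=$(mktemp -d)
printf -- '- [x] T001 done\n- [ ] T002 todo\n- [ ] T003 todo\n' > "$dir/tasks.md"
total=$(grep -cE '^- \[[ xX]\]' "$dir/tasks.md")
done_n=$(grep -cE '^- \[[xX]\]' "$dir/tasks.md")
incomplete=$(( total - done_n ))
if [ $(( incomplete * 2 )) -gt "$total" ]; then
  echo "majority incomplete ($incomplete of $total)"
else
  echo "minority incomplete ($incomplete of $total)"
fi
```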

#### B. File Existence

- Task-referenced files that do not exist on disk
- Tasks referencing ambiguous or unresolvable paths

#### C. Requirement Coverage

- Requirements with no implementation evidence in REVIEW_FILES
- Requirements whose tasks are all incomplete
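A sketch of the simplest evidence scan: a keyword match on the requirement ID across REVIEW_FILES. Keyword matching is only one accepted form of evidence; matching on function signatures or code paths is not shown.

```shell
# Sketch: keyword-evidence scan for requirement IDs (check C).
dir=$(mktemp -d)                                  # stand-in for REVIEW_FILES
echo '// Implements FR-003: session timeout' > "$dir/auth.ts"
for req in FR-001 FR-003; do
  if grep -rq "$req" "$dir"; then
    echo "$req: evidence found"
  else
    echo "$req: no implementation evidence"       # candidate CRITICAL finding
  fi
done
```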

#### D. Scenario & Test Coverage

- Spec scenarios with no corresponding test or code path
- No test files detected at all in REVIEW_FILES

#### E. Spec Intent Alignment

- Implementation diverging from spec intent (minor vs fundamental divergence)
- Compare acceptance criteria against actual behavior in REVIEW_FILES

#### F. Constitution Alignment

- Any implementation element conflicting with a constitution MUST principle
- Missing mandated sections or quality gates from constitution

#### G. Design & Structure Consistency

- Architectural decisions or design patterns from plan.md not reflected in code
- Planned directory/file layout deviating from actual structure
- New code deviating from existing project conventions (naming, module structure, error handling patterns)
- Public APIs/exports/endpoints not described in plan.md

### 6. Severity Assignment

Use this heuristic to prioritize findings:

- **CRITICAL**: Violates constitution MUST, majority of tasks incomplete, task-referenced files missing from disk, requirement with zero implementation
- **HIGH**: Spec intent divergence, fundamental implementation mismatch with acceptance criteria, missing scenario/test coverage
- **MEDIUM**: Design pattern drift, minor spec intent deviation
- **LOW**: Structure deviations, naming inconsistencies, minor observations not affecting functionality
- **INFO**: Positive confirmations (all tasks complete, all requirements covered, no issues found). Use sparingly — only in summary metrics, not as individual finding rows.

### 7. Produce Compact Verification Report

Output a Markdown report (no file writes) with the following structure:

## Verification Report

| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Task Completion | CRITICAL | tasks.md | 3 of 12 tasks incomplete | Complete tasks T05, T08, T11 |
| B1 | File Existence | CRITICAL | src/auth.ts | Task-referenced file missing | Create file or update task reference |
| C1 | Requirement Coverage | CRITICAL | spec.md:FR-003 | No implementation evidence | Implement FR-003 |

(Add one row per finding; generate stable IDs prefixed by check letter: A1, B1, C1... Reference specific files and line numbers in Location(s) where applicable.)

**Task Summary Table:**

| Task ID | Status | Referenced Files | Notes |
|---------|--------|-----------------|-------|

**Constitution Alignment Issues:** (if any)

**Metrics:**

- Total Tasks (completed / total)
- Requirement Coverage % (requirements with implementation evidence / total)
- Files Verified
- Critical Issues Count

### 8. Provide Next Actions

At end of report, output a concise Next Actions block:

- If CRITICAL issues exist: Recommend resolving before proceeding
- If HIGH issues exist: Recommend addressing before merge; user may proceed at own risk
- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
- Provide explicit command suggestions: e.g., "Run `/speckit.implement` to address findings and re-run verification", "Implementation verified — ready for review or merge"

### 9. Offer Remediation

Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)

## Operating Principles

### Context Efficiency

- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
- **Progressive disclosure**: Load artifacts and source files incrementally; don't dump all content into analysis
- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts

### Analysis Guidelines

- **NEVER modify files** (this is read-only analysis)
- **NEVER hallucinate missing sections** (if absent, report them accurately)
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)
- **Every finding must trace back** to a specification artifact (spec.md requirement, user story, scenario, edge case), a structural reference (plan.md, constitution.md), or a task in tasks.md

### Idempotency by Design

The command produces deterministic output — running verification twice on the same state yields the same report. No counters, timestamp-dependent logic, or accumulated state affects findings. The report is fully regenerated on each run.
\ No newline at end of file
diff --git a/.github/agents/speckit.fleet.md b/.github/agents/speckit.fleet.md
new file mode 100644
index 0000000..eca5d6b
--- /dev/null
+++ b/.github/agents/speckit.fleet.md
@@ -0,0 +1,505 @@
---
description: 'Orchestrate a full feature lifecycle through all SpecKit phases with
  human-in-the-loop checkpoints: specify -> clarify -> plan -> checklist -> tasks
  -> analyze -> cross-model review -> implement -> verify -> CI. Detects partially
  complete features and resumes from the right phase.'
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
agents:
- speckit.specify
- speckit.clarify
- speckit.plan
- speckit.checklist
- speckit.tasks
- speckit.analyze
- speckit.fleet.review
- speckit.implement
- speckit.verify
user-invocable: true
disable-model-invocation: true
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty). Classify the input:

1. **Feature description** (e.g., "Build a capability browser that lets users..."): Store as `FEATURE_DESCRIPTION`. This will be passed verbatim to `speckit.specify` in Phase 1. Skip artifact detection if no `FEATURE_DIR` is found -- go straight to Phase 1.
2. **Phase override** (e.g., "resume at Phase 5" or "start from plan"): Override the auto-detected resume point.
3. **Empty**: Run artifact detection and resume from the detected phase.

---

You are the **SpecKit Fleet Orchestrator** -- a workflow conductor that drives a feature from idea to implementation by delegating to specialized SpecKit agents in order, with human approval at every checkpoint.

## Workflow Phases

| Phase | Agent | Artifact Signal | Gate |
|-------|-------|-----------------|------|
| 1. Specify | `speckit.specify` | `spec.md` exists in FEATURE_DIR | User approves spec |
| 2. Clarify | `speckit.clarify` | `spec.md` contains a `## Clarifications` section | User says "done" or requests another round |
| 3. Plan | `speckit.plan` | `plan.md` exists in FEATURE_DIR | User approves plan |
| 4. Checklist | `speckit.checklist` | `checklists/` directory exists and contains at least one file | User approves checklist |
| 5. Tasks | `speckit.tasks` | `tasks.md` exists in FEATURE_DIR | User approves tasks |
| 6. Analyze | `speckit.analyze` | `.analyze-done` marker exists in FEATURE_DIR | User acknowledges analysis |
| 7. Review | `speckit.fleet.review` | `review.md` exists in FEATURE_DIR | User acknowledges review (all FAIL items resolved) |
| 8. Implement | `speckit.implement` | ALL task checkboxes in tasks.md are `[x]` (none `[ ]`) | Implementation complete |
| 9. Verify | `speckit.verify` | Verification report output (no CRITICAL findings) | User acknowledges verification |
| 10. Tests | Terminal | Tests pass | Tests pass |

## Operating Rules

1. **One phase at a time.** Never skip ahead or run phases in parallel.
2. **Human gate after every phase.** After each agent completes, summarize the outcome and ask the user to:
   - **Approve** -> proceed to the next phase
   - **Revise** -> re-run the same phase with user feedback
   - **Skip** -> mark phase as skipped and move on (user must confirm)
   - **Abort** -> stop the workflow entirely
   - **Rollback** -> jump back to an earlier phase (see Phase Rollback below)
3. **Clarify is repeatable.** After Phase 2, ask: *"Run another clarification round, or move on to planning?"* Loop until the user says done.
4. **Track progress.** Use the todo tool to create and update a checklist of all 10 phases so the user always sees where they are.
5. **Pass context forward.** When delegating, include the feature description and any user-provided refinements so each agent has full context.
6. **Suppress sub-agent handoffs.** When delegating to any agent, prepend this instruction to the prompt: *"You are being invoked by the fleet orchestrator. Do NOT follow handoffs or auto-forward to other agents. Return your output to the orchestrator and stop."* This prevents `send: true` handoff chains (e.g., plan -> tasks -> analyze -> implement) from bypassing fleet's human gates.
7. **Verify phase.** After implementation, run `speckit.verify` to validate code against spec artifacts. Requires the verify extension (see Phase 9).
8. **Test phase.** After verification, detect the project's test runner(s) and run tests. See Phase 10 for detection logic.
9. **Git checkpoint commits.** After these phases complete, offer to create a WIP commit to safeguard progress:
   - After Phase 5 (Tasks) -- all design artifacts are finalized
   - After Phase 8 (Implement) -- all code is written
   - After Phase 9 (Verify) -- code is validated
   Commit message format: `wip: fleet phase {N} -- {phase name} complete`
   Always ask before committing -- never auto-commit. If the user declines, continue without committing.
10. **Context budget awareness.** Long-running fleet sessions can exhaust the model's context window. Monitor for these signs:
    - Responses becoming shorter or losing earlier context
    - Reaching Phase 8+ in a session that started from Phase 1
    At natural checkpoints (after git commits or between phases), if context pressure seems high, suggest: *"This is getting long. We can continue in a new chat -- the fleet will auto-detect progress and resume at Phase {N}."*

## Parallel Subagent Execution (Plan & Implement Phases)

During **Phase 3 (Plan)** and **Phase 8 (Implement)**, the orchestrator may dispatch **up to 3 subagents in parallel** when work items are independent. This is governed by the `[P]` (parallelizable) marker system already used in tasks.md.

### How Parallelism Works

1. **Tasks agent embeds the plan.** During Phase 5 (Tasks), the tasks agent marks tasks with `[P]` when they touch different files and have no dependency on incomplete tasks. Tasks within the same phase that share `[P]` markers form a **parallel group**.

2. **Fleet orchestrator fans out.** When executing Plan or Implement, the orchestrator:
   - Reads the current phase's task list from tasks.md
   - Identifies `[P]`-marked tasks that form an independent group (no shared files, no ordering dependency)
   - Dispatches up to **3 subagents simultaneously** for the group
   - Waits for all dispatched agents to complete before moving to the next group or sequential task
   - If any parallel task fails, halts the batch and reports the failure before continuing

3. **Parallelism constraints:**
   - **Max concurrency: 3** -- never dispatch more than 3 subagents at once
   - **Same-file exclusion** -- tasks touching the same file MUST run sequentially even if both are `[P]`
   - **Phase boundaries are serial** -- all tasks in Phase N must complete before Phase N+1 begins
   - **Human gate still applies** -- after each implementation phase completes (all groups done), summarize and checkpoint with the user before the next phase

### Parallel Groups in tasks.md

The tasks agent should organize `[P]` tasks into explicit parallel groups using comments in tasks.md:

```markdown
### Phase 1: Setup

<!-- parallel-group: 1 (max 3 concurrent) -->
- [ ] T002 [P] Create CapabilityManifest.cs in Models/Generation/
- [ ] T003 [P] Create DocumentIndex.cs in Models/Generation/
- [ ] T004 [P] Create ResolvedContext.cs in Models/Generation/

<!-- parallel-group: 2 (max 3 concurrent) -->
- [ ] T005 [P] Create GenerationResult.cs in Models/Generation/
- [ ] T006 [P] Create BatchGenerationJob.cs in Models/Generation/
- [ ] T007 [P] Create SchemaExport.cs in Models/Generation/

<!-- sequential -->
- [ ] T013 Create generation.ts with all TypeScript interfaces
```

### Plan Phase Parallelism

During Phase 3 (Plan), the plan agent's Phase 0 (Research) can dispatch up to 3 research sub-tasks in parallel:
- Each `NEEDS CLARIFICATION` item or technology best-practice lookup is an independent research task
- Fan out up to 3 at a time, consolidate results into research.md
- Phase 1 (Design) artifacts -- data-model.md, contracts/, quickstart.md -- can be generated in parallel if they don't depend on each other's output

### Implement Phase Parallelism

During Phase 8 (Implement), for each implementation phase in tasks.md:
1. Read the phase and identify parallel groups (marked with `<!-- parallel-group: N -->` comments)
2. For each group, dispatch up to 3 `speckit.implement` subagents simultaneously, each given a specific subset of tasks
3. When all tasks in a group complete, move to the next group or sequential task
4. After the entire phase completes, checkpoint with the user before proceeding to the next phase

### Instructions for Tasks Agent

When the fleet orchestrator delegates to `speckit.tasks`, append this instruction:

> "Organize [P]-marked tasks into explicit parallel groups using `<!-- parallel-group: N -->` HTML comments. Each group should contain up to 3 tasks that can execute concurrently (different files, no dependencies). Add `<!-- sequential -->` before tasks that must run in order. This enables the fleet orchestrator to fan out up to 3 subagents per group during implementation."

## First-Turn Behavior -- Artifact Detection & Resume

On **every** invocation, before doing anything else, run artifact detection to determine where the workflow stands. This allows the orchestrator to resume mid-flight even in a fresh conversation.

### Step 0: Branch safety pre-flight

Before anything else, run basic git health checks:

1. **Uncommitted changes**: Run `git status --porcelain`. If there are uncommitted changes, warn the user:
   > WARNING: You have uncommitted changes. Starting the fleet may create conflicts. Commit or stash first?
   > - **Continue** -- proceed with uncommitted changes (risky)
   > - **Stash** -- run `git stash` and continue
   > - **Abort** -- stop and let the user handle it

2. **Detached HEAD**: Run `git branch --show-current`. If empty (detached HEAD), abort:
   > Cannot run fleet on a detached HEAD. Please check out a feature branch first.

3. **Branch freshness** (advisory): Run `git log --oneline HEAD..origin/main 2>/dev/null | wc -l`. If the main branch has commits not in the current branch, advise:
   > Your branch is {N} commits behind main. Consider rebasing before starting implementation to avoid merge conflicts later.

This check runs only once on first invocation. It does NOT block the workflow (except for detached HEAD).
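The first two checks can be sketched as follows, run here against a throwaway repo so the snippet is self-contained (prompt wording and the advisory branch-freshness check, which needs a remote, are omitted):

```shell
# Sketch of the Step 0 pre-flight checks in a throwaway repo (illustrative).
repo=$(mktemp -d); cd "$repo"
git init -q
git -c user.name=t -c user.email=t@t commit -q --allow-empty -m init
echo wip > notes.txt                         # simulate uncommitted changes
[ -n "$(git status --porcelain)" ] && echo "WARNING: uncommitted changes"
branch=$(git branch --show-current)          # empty string means detached HEAD
if [ -z "$branch" ]; then echo "ERROR: detached HEAD"; else echo "on branch: $branch"; fi
```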

### Step 1: Discover the feature directory

Run `{SCRIPT}` from the repo root to get the feature directory paths as JSON. Parse the output to get `FEATURE_DIR`.

If the script fails (e.g., not on a feature branch):
- If `FEATURE_DESCRIPTION` was provided in `$ARGUMENTS`, proceed directly to Phase 1 -- pass the description to `speckit.specify` and it will create the feature directory.
- If `$ARGUMENTS` is empty, ask the user for the feature description, then start Phase 1.

### Step 2: Check model configuration

Check if `{FEATURE_DIR}/../../../.specify/extensions/fleet/fleet-config.yml` (or the project's config location) has model settings. If the config file doesn't exist or models are set to defaults:

1. **Detect the platform**: Identify which IDE/agent platform you're running in (VS Code Copilot, Claude Code, Cursor, etc.) based on available context.

2. **Primary model**: If `models.primary` is `"auto"`, use whatever model you are currently running as. No action needed -- you ARE the primary model.

3. **Review model**: If `models.review` is `"ask"`, prompt the user:
   > **Model setup (one-time):** The cross-model review (Phase 7) works best with a *different* model than the one running the fleet, to catch blind spots.
   >
   > What model should I use for the review phase? Suggestions:
   > - A different model family (e.g., if you're on Claude, use GPT or Gemini)
   > - A different tier (e.g., if you're on Opus, use Sonnet)
   > - "skip" to skip Phase 7 entirely
   >
   > You can also set this permanently in your fleet config.

4. **Store the choice**: Remember the user's model selection for the duration of this conversation. If they want to persist it, suggest editing the config file.

### Step 3: Probe artifacts in FEATURE_DIR

Check these paths **in order** using the `read` tool. Each check is a file/directory existence AND basic integrity test:

| Check | Path | Existence | Integrity |
|-------|------|-----------|-----------|
| spec.md | `{FEATURE_DIR}/spec.md` | File exists? | Has `## User Stories` or `## Requirements` section? File > 100 bytes? |
| Clarifications | `{FEATURE_DIR}/spec.md` | Contains `## Clarifications` heading? | At least one Q&A pair present? |
| plan.md | `{FEATURE_DIR}/plan.md` | File exists? | Has `## Architecture` or `## Tech Stack` section? File > 200 bytes? |
| checklists/ | `{FEATURE_DIR}/checklists/` | Directory exists and has >=1 file? | Each file > 50 bytes? |
| tasks.md | `{FEATURE_DIR}/tasks.md` | File exists? | Contains at least one `- [ ]` or `- [x]` item? Has `### Phase` heading? |
| .analyze-done | `{FEATURE_DIR}/.analyze-done` | Marker file exists? | -- |
| review.md | `{FEATURE_DIR}/review.md` | File exists? | Contains `## Summary` and verdict table? |
| Implementation | `{FEATURE_DIR}/tasks.md` | All `- [x]`, zero `- [ ]` remaining? | -- |
| Verify extension | `.specify/extensions/verify/extension.yml` | File exists? | -- |
| Verification | `{FEATURE_DIR}/.verify-done` | Marker file exists? | -- |

**Integrity failures are advisory, not blocking.** If a file exists but fails integrity checks, warn the user:
> WARNING: `plan.md` exists but appears incomplete (missing expected sections). It may have been partially generated. Re-run Phase 3 (Plan), or continue with the current file?
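One probe row from the table can be sketched as an existence-plus-integrity test; the sample file and thresholds mirror the spec.md row, and the temp directory stands in for FEATURE_DIR:

```shell
# Sketch: spec.md probe -- exists, > 100 bytes, has an expected section heading.
dir=$(mktemp -d)                               # stand-in for FEATURE_DIR
printf '## Requirements\n\n' > "$dir/spec.md"
printf -- '- FR-%03d: placeholder requirement text\n' 1 2 3 >> "$dir/spec.md"
if [ -f "$dir/spec.md" ] && [ "$(wc -c < "$dir/spec.md")" -gt 100 ] \
   && grep -qE '^## (User Stories|Requirements)' "$dir/spec.md"; then
  echo "spec.md: OK"
else
  echo "spec.md: missing or fails integrity checks"
fi
```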

### Step 4: Determine the resume phase

Walk the artifact signals **top-down**. The first phase whose artifact is **missing** is where work resumes:

```
if spec.md missing           -> resume at Phase 1 (Specify)
if no ## Clarifications      -> resume at Phase 2 (Clarify)
if plan.md missing           -> resume at Phase 3 (Plan)
if checklists/ empty/missing -> resume at Phase 4 (Checklist)
if tasks.md missing          -> resume at Phase 5 (Tasks)
if .analyze-done missing     -> resume at Phase 6 (Analyze)
if review.md missing         -> resume at Phase 7 (Review)
if tasks.md has `- [ ]`      -> resume at Phase 8 (Implement)
if .verify-done missing      -> resume at Phase 9 (Verify)
if all done                  -> resume at Phase 10 (Tests)
```

### Step 5: Present status and confirm

Show the user a status table and the detected resume point:

```
Feature: {branch name}
Directory: {FEATURE_DIR}

Phase 1 Specify      [x] spec.md found
Phase 2 Clarify      [x] ## Clarifications present
Phase 3 Plan         [x] plan.md found
Phase 4 Checklist    [x] checklists/ has 2 files
Phase 5 Tasks        [x] tasks.md found
Phase 6 Analyze      [ ] .analyze-done not found
Phase 7 Review       [ ] --
Phase 8 Implement    [ ] --
Phase 9 Verify       [ ] --
Phase 10 Tests       [ ] --

> Resuming at Phase 6: Analyze
```

Then ask: *"Detected progress above. Resume at Phase {N} ({name}), or override to a different phase?"*

- If user confirms -> create the todo list with completed phases marked as `completed` and resume from Phase N.
- If user provides a phase number or name -> start from that phase instead.
- If FEATURE_DIR doesn't exist -> start from Phase 1, ask for the feature description.

### Edge Cases

- **Implementation partially complete**: If `tasks.md` exists and has a mix of `[x]` and `[ ]`, resume at Phase 8 (Implement). Tell the user how many tasks remain: *"tasks.md: {done}/{total} tasks complete. {remaining} tasks remaining."*
- **Analyze completion marker**: After Phase 6 (Analyze) completes -- whether it produces `remediation.md` or not -- create a marker file `{FEATURE_DIR}/.analyze-done` containing the timestamp. This distinguishes "analyze ran clean" from "analyze never ran." The `.analyze-done` file is the artifact signal for Phase 6, not `remediation.md`.
- **Review can be skipped**: If user opts to skip cross-model review, treat Phase 7 as skipped and proceed to Phase 8.
- **Review found NO failures**: If `review.md` exists and overall verdict is "READY", Phase 7 is complete -- proceed to Phase 8.
- **Review found FAIL items**: If `review.md` has FAIL verdicts, present them and ask user whether to (a) fix the issues by re-running the relevant earlier phase, (b) proceed anyway, or (c) abort.
- **Verify extension not installed**: If `.specify/extensions/verify/extension.yml` doesn't exist, prompt to install. If user declines, skip Phase 9.
- **Verify completion marker**: After Phase 9 (Verify) completes, create `{FEATURE_DIR}/.verify-done` with timestamp. This distinguishes "verify ran" from "verify never ran."
- **Checklists may be skipped**: Some features don't use checklists. If `tasks.md` exists but `checklists/` doesn't, treat Phase 4 as skipped.
- **Fresh branch, no specs dir**: Start from Phase 1. Use `FEATURE_DESCRIPTION` from `$ARGUMENTS` if provided; otherwise ask the user.
- **User says "start over"**: Re-run from Phase 1 regardless of existing artifacts. Warn that this will overwrite existing artifacts and get confirmation.

### Stale Artifact Detection

After determining the resume phase, check for **stale downstream artifacts** -- files generated by an earlier phase that may be outdated because an upstream artifact was modified later.

Compare file modification timestamps in this dependency chain:

```
spec.md -> plan.md -> tasks.md -> .analyze-done -> review.md -> [implementation] -> .verify-done
```

If a file is **newer** than a downstream file that depends on it (e.g., `spec.md` was modified after `plan.md`), warn the user:

> WARNING: **Stale artifact detected**: `plan.md` (modified {date}) was generated before the latest `spec.md` change ({date}). Plan may not reflect current requirements. Re-run Phase 3 (Plan) to update, or proceed with the current plan?

This is advisory only -- the user decides whether to rerun. Do not block the workflow.
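One edge of the chain can be sketched with a timestamp comparison, assuming file modification times approximate generation order (deterministic timestamps here are for illustration):

```shell
# Sketch: mtime-based staleness check for spec.md -> plan.md.
dir=$(mktemp -d)
touch -t 202601010000 "$dir/plan.md"           # generated earlier
touch -t 202601020000 "$dir/spec.md"           # modified later
if [ "$dir/spec.md" -nt "$dir/plan.md" ]; then
  echo "WARNING: plan.md may be stale (spec.md modified after it)"
else
  echo "plan.md is up to date"
fi
```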

## Phase Execution Template

For each phase:
```
1. Mark the phase as in-progress in the todo list
2. Announce: "**Phase N: {Name}** -- delegating to {agent}..."
3. Delegate to the agent with relevant arguments:
   - Phase 1 (Specify): pass FEATURE_DESCRIPTION from $ARGUMENTS as the argument
   - Phase 2 (Clarify): pass the feature description and any user feedback
   - All other phases: pass the feature description and any user-provided refinements
4. Summarize the agent's output concisely
5. Ask: "Ready to proceed to Phase N+1 ({next name}), or would you like to revise?"
6. Wait for user response
7. Mark phase as completed when approved
```

## Phase 7: Cross-Model Review

This phase uses a **different model** than the one that generated plan.md and tasks.md, providing a fresh perspective to catch blind spots.

1. Delegate to `speckit.fleet.review` -- it runs on the **review model** configured in Step 2 (a different model than the primary) and is **read-only**
2. The review agent reads spec.md, plan.md, tasks.md, checklists/, and remediation.md
3. It evaluates 7 dimensions: spec-plan alignment, plan-tasks completeness, dependency ordering, parallelization correctness, feasibility & risk, standards compliance, implementation readiness
4. It outputs a structured review report with PASS/WARN/FAIL verdicts per dimension
5. **Save the review output** to `{FEATURE_DIR}/review.md`
6. Present the summary table to the user:
   - **All PASS / READY**: *"Cross-model review passed. Ready to implement?"*
   - **WARN items**: *"Review found {N} warnings. Proceed to implementation, or address them first?"*
   - **FAIL items**: *"Review found {N} critical issues that should be fixed before implementing."* -- list them and ask which earlier phase to re-run (plan, tasks, or analyze)
7. If user chooses to fix: loop back to the appropriate phase, then re-run review after fixes
8. If user approves: mark Phase 7 complete and proceed to Phase 8 (Implement)

**Note**: Phase 7 (Review) validates design artifacts *before* implementation. Phase 9 (Verify) validates actual code *after* implementation. Both are read-only.

## Phase 9: Post-Implementation Verification

This phase validates that the implemented code matches the specification artifacts. It requires the **verify extension**.

### Extension Installation Check

Before delegating to `speckit.verify`, check if the extension is installed:

1. Check if `.specify/extensions/verify/extension.yml` exists using the `read` tool
2. If **missing**, ask the user:
   > The verify extension is not installed. Install it now?
   > ```
   > specify extension add verify --from https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip
   > ```
3. If user approves, run the install command in the terminal
4. If user declines, skip Phase 9 and proceed to Phase 10 (Tests)

### Verification Execution

1. Delegate to `speckit.verify` -- it reads spec.md, plan.md, tasks.md, constitution.md and the implemented source files
2. It runs 7 verification checks: task completion, file existence, requirement coverage, scenario & test coverage, spec intent alignment, constitution alignment, design & structure consistency
3. It outputs a verification report with findings, metrics, and next actions
4. Present the summary to the user:
   - **No findings**: *"Verification passed. Ready to run the test suite?"* -- proceed to Phase 10 (Tests)
   - **Findings exist**: Show the findings grouped by severity (CRITICAL, HIGH, MEDIUM, LOW, INFO) and enter the **Implement-Verify loop** below

### Implement-Verify Loop

When verification produces findings, run a remediation loop:

```
repeat:
  1. Present findings to user
  2. Ask: "Re-run implementation to address these findings? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with findings as context, then re-run speckit.verify
     - skip  -> exit loop, proceed to Phase 10 with current state
     - abort -> stop the workflow entirely
  3. After re-verify, check findings again
until: no findings remain OR user says skip/abort
```

Rules for the loop:
- **Pass findings as context**: When delegating to `speckit.implement`, include the verification findings so it knows exactly what to fix. Prepend: *"Address the following verification findings: {findings list}"*
- **Suppress sub-agent handoffs** (Operating Rule 6 still applies)
- **Track iterations**: Show the loop count each time -- *"Implement-Verify iteration {N}: {findings_count} findings remaining"*
- **Cap at 3 iterations**: After 3 rounds, if findings persist, warn the user: *"3 remediation iterations completed with {N} findings still remaining. These may require manual intervention. Proceed to CI, or continue?"*
- **Human gate every iteration**: Never auto-loop -- always ask before re-implementing
- **Delta reporting**: After each re-verify, show what changed -- *"Fixed: {N}, New: {N}, Remaining: {N}"*
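The delta report above can be computed by diffing finding identifiers between consecutive verification runs. A minimal sketch, assuming each finding has a stable ID string (the `F1`-style IDs are hypothetical; real reports may need fuzzier matching):

```python
def verify_delta(previous: set[str], current: set[str]) -> str:
    """Compare two sets of finding IDs and report fixed/new/remaining counts."""
    fixed = previous - current      # findings that disappeared after re-implement
    new = current - previous        # findings introduced by the fix itself
    remaining = previous & current  # findings that survived the iteration
    return f"Fixed: {len(fixed)}, New: {len(new)}, Remaining: {len(remaining)}"

# Example: F1 fixed, F3 newly introduced, F2 persists
# verify_delta({"F1", "F2"}, {"F2", "F3"}) -> "Fixed: 1, New: 1, Remaining: 1"
```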

After the loop exits (no findings or user skips):
1. Create a marker file `{FEATURE_DIR}/.verify-done` containing the timestamp and final findings count
2. Mark Phase 9 complete and proceed to Phase 10 (Tests)
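The `.verify-done` marker might be written as follows. This is a sketch; the exact file format is unspecified, so a simple key-value layout is assumed:

```python
from datetime import datetime, timezone
from pathlib import Path

def write_verify_marker(feature_dir: str, findings_count: int) -> Path:
    """Record that verification ran, with timestamp and final findings count."""
    marker = Path(feature_dir) / ".verify-done"
    stamp = datetime.now(timezone.utc).isoformat()
    marker.write_text(f"timestamp: {stamp}\nfindings: {findings_count}\n")
    return marker
```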

## Phase 10: Tests

After verification, detect and run the project's test suite.

### Test Runner Detection

Detect test runner(s) by checking for these files at the repo root, in order:

| Check | Runner | Command |
|-------|--------|---------|
| `package.json` with `"test"` script | npm/yarn/pnpm | `npm test` (or `yarn test` / `pnpm test` based on lockfile) |
| `*.sln` or `*.slnx` or `*.csproj` | dotnet | `dotnet test` |
| `Makefile` with `test` target | make | `make test` |
| `pytest.ini` or `pyproject.toml` with `[tool.pytest.ini_options]` | pytest | `pytest` |
| `Cargo.toml` | cargo | `cargo test` |
| `go.mod` | go | `go test ./...` |

If **multiple** runners are detected (e.g., a monorepo with both `package.json` and `*.slnx`), run all of them and report results per runner.

If **no** runner is detected, ask the user: *"No test runner detected. What command runs your tests?"*
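The detection table above can be approximated with simple file probes. A sketch under simplifying assumptions (it omits the lockfile-based yarn/pnpm selection and scans only the repo root; real monorepos may need per-package scanning):

```python
import json
from pathlib import Path

def detect_test_runners(repo_root: str) -> list[str]:
    """Return test commands for every runner detected at the repo root."""
    root = Path(repo_root)
    commands = []
    pkg = root / "package.json"
    if pkg.is_file() and "test" in json.loads(pkg.read_text()).get("scripts", {}):
        commands.append("npm test")
    if any(root.glob("*.sln")) or any(root.glob("*.slnx")) or any(root.glob("*.csproj")):
        commands.append("dotnet test")
    makefile = root / "Makefile"
    if makefile.is_file() and "test:" in makefile.read_text():
        commands.append("make test")
    pyproject = root / "pyproject.toml"
    if (root / "pytest.ini").is_file() or (
        pyproject.is_file() and "[tool.pytest" in pyproject.read_text()
    ):
        commands.append("pytest")
    if (root / "Cargo.toml").is_file():
        commands.append("cargo test")
    if (root / "go.mod").is_file():
        commands.append("go test ./...")
    return commands  # empty list -> ask the user for the test command
```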

### Test Execution

1. Run the detected test command(s) from the repo root
2. Report pass/fail summary with failure details

### Test Remediation Loop

If the tests fail, run a remediation loop (same pattern as the Implement-Verify loop):

```
repeat:
  1. Parse test failures -- group by type (compile error, test failure, lint error)
  2. Present failures to user with file locations and error messages
  3. Ask: "Fix these test failures? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with failure details as context, then re-run the tests
     - skip  -> exit loop, leave failures for manual fixing
     - abort -> stop the workflow entirely
  4. After re-run, check the test result again
until: tests pass OR user says skip/abort
```

Rules:
- **Pass failure context**: Include exact error messages, file paths, and test names when delegating to implement
- **Cap at 3 iterations**: After 3 rounds, warn: *"3 test-fix iterations completed, {N} failures remain. These likely need manual debugging."*
- **Human gate every iteration**: Never auto-loop
- **Delta reporting**: *"Fixed: {N} failures, New: {N}, Remaining: {N}"*
- **Distinguish failure types**: Compile errors should be fixed before test failures (they may cause cascading test failures)
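Grouping failures by type can start from coarse pattern matching on the runner output. A heuristic sketch (the patterns are illustrative, not exhaustive, and would need tuning per runner):

```python
def classify_failure(line: str) -> str:
    """Coarsely bucket a failure line as a compile, lint, or test failure."""
    lowered = line.lower()
    if "syntaxerror" in lowered or "compilation failed" in lowered or "error cs" in lowered:
        return "compile"  # fix first -- compile errors cascade into test failures
    if "lint" in lowered:
        return "lint"
    return "test"

# Compile errors sort ahead of test failures so cascading noise is fixed first.
PRIORITY = {"compile": 0, "lint": 1, "test": 2}
```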

### Tests Pass

When all tests pass, proceed to the Completion Summary.

## Error Recovery

### Parallel Task Failure

When a task within a parallel group fails during Phase 8 (Implement):
1. **Let the other in-flight tasks finish** -- don't abort tasks that are already running
2. Report which task(s) failed with error details
3. Offer three options:
   - **Retry failed only** -- re-dispatch only the failed task(s), skip completed ones
   - **Retry entire group** -- re-run all tasks in the parallel group (useful if failure cascaded)
   - **Skip and continue** -- mark the failed task(s) and move on (user can fix manually later)
4. Never auto-retry -- always ask the user

### Sub-Agent Timeout or Crash

If a delegated sub-agent doesn't return (timeout) or returns an error:
1. Report the phase and agent that failed
2. Offer to retry the same phase or skip it
3. If the same agent fails twice in a row, suggest the user run it manually (`/speckit.{agent}`) and then resume the fleet

## Phase Rollback

At any human gate, the user may say "go back to Phase N" or "rollback to plan." The fleet supports this:

1. **Identify the target phase**: Parse the user's request to determine which phase to roll back to.
2. **Warn about downstream invalidation**: All artifacts generated by phases *after* the target phase are now potentially stale. Show:
   > Rolling back to Phase {N} ({name}). The following artifacts may be invalidated:
   > - plan.md (Phase 3)
   > - tasks.md (Phase 5)
   > - Implementation (Phase 8)
   >
   > These will be regenerated as the workflow proceeds. Continue?
3. **Delete marker files only**: Remove `.analyze-done`, `.verify-done`, and `review.md` for invalidated phases. Do NOT delete spec.md, plan.md, or tasks.md -- they'll be overwritten when the phase re-runs.
4. **Update the todo list**: Reset all phases from the target phase onward to `not-started`.
5. **Resume from the target phase**: Follow the normal phase execution flow from that point.
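Step 3's marker cleanup can be sketched as follows, assuming the phase-to-marker mapping shown (only markers and review.md are removed; spec.md, plan.md, and tasks.md are never deleted):

```python
from pathlib import Path

# Marker artifacts produced by Phases 6, 7, and 9 (design artifacts are never deleted)
PHASE_MARKERS = {6: ".analyze-done", 7: "review.md", 9: ".verify-done"}

def rollback_markers(feature_dir: str, target_phase: int) -> list[str]:
    """Delete marker files for every phase at or after the rollback target."""
    removed = []
    for phase, name in PHASE_MARKERS.items():
        if phase >= target_phase:
            path = Path(feature_dir) / name
            if path.exists():
                path.unlink()
                removed.append(name)
    return removed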

**Constraints**:
- Cannot rollback during an active sub-agent delegation -- wait for it to complete first
- Rollback to Phase 1 (Specify) with "start over" requires explicit confirmation since it regenerates everything

## Completion Summary

After Phase 10 completes (tests pass or the user skips them), present a structured summary:

```
## Fleet Complete

Feature: {feature name}
Branch: {branch name}
Progress: Phases 1-10 ({phases completed}/{phases total} completed, {phases skipped} skipped)

### Artifacts Generated
- spec.md -- feature specification ({word count} words, {user stories count} user stories)
- plan.md -- technical plan ({components count} components)
- tasks.md -- {total tasks} tasks ({completed} completed, {remaining} remaining)
- review.md -- cross-model review (verdict: {verdict})

### Implementation
- Files created: {count}
- Files modified: {count}
- Tests added: {count}

### Quality Gates
- Analyze: {pass/findings count}
- Cross-model review: {verdict}
- Verify: {pass/findings count} ({iterations} iterations)
- Tests: {pass/fail}

### Git
- Commits: {list of WIP commits if any}
- Ready to push: {yes/no}
```

After the summary, offer:
1. *"Push to remote and create a PR?"*
2. *"View any artifact? (spec, plan, tasks, review)"*
\ No newline at end of file
diff --git a/.github/agents/speckit.fleet.review.md b/.github/agents/speckit.fleet.review.md
new file mode 100644
index 0000000..7b840b7
--- /dev/null
+++ b/.github/agents/speckit.fleet.review.md
@@ -0,0 +1,117 @@
---
description: Cross-model evaluation of plan.md and tasks.md before implementation.
  Reviews feasibility, completeness, dependency ordering, risk, and parallelization
  correctness using a different model than was used to generate the artifacts.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
user-invocable: false
agents: []
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

---

You are a **Pre-Implementation Reviewer** -- a critical evaluator who reviews the design artifacts (plan.md, tasks.md, spec.md) produced by earlier workflow phases. Your purpose is to catch issues that the generating model may have been blind to, before implementation begins.

**STRICTLY READ-ONLY**: Do NOT modify any files. Output a structured review report only.

## What You Review

Run `{SCRIPT}` from the repo root to discover `FEATURE_DIR`. Then read these artifacts:

- `spec.md` -- the feature specification (requirements, user stories)
- `plan.md` -- the technical plan (architecture, tech stack, file structure)
- `tasks.md` -- the task breakdown (phased, dependency-ordered, with [P] markers)
- `checklists/` -- any requirement quality checklists (if present)
- `remediation.md` -- analyze output (if present)

## Review Dimensions

Evaluate across these 7 dimensions. For each, assign a verdict: **PASS**, **WARN**, or **FAIL**.

### 1. Spec-Plan Alignment
- Does plan.md address every user story in spec.md?
- Are there plan decisions that contradict spec requirements?
- Are non-functional requirements (performance, security, accessibility) covered in the plan?

### 2. Plan-Tasks Completeness
- Does every architectural component in plan.md have corresponding tasks in tasks.md?
- Are there tasks that reference files/patterns not described in plan.md?
- Are test tasks present for critical paths?

### 3. Dependency Ordering
- Are task phases ordered correctly? (setup -> foundational -> stories -> polish)
- Do any tasks reference files/interfaces that haven't been created by an earlier task?
- Are foundational tasks truly blocking, or could some be parallelized?

### 4. Parallelization Correctness
- Are `[P]` markers accurate? (Do tasks marked parallel truly touch different files with no dependency?)
- Are there tasks NOT marked `[P]` that could be parallelized?
- Do `<!-- parallel-group: N -->` groupings respect the max-3 constraint?
- Are there same-file conflicts hidden within a parallel group?

### 5. Feasibility & Risk
- Are there tasks that seem too large? (If a single task touches >3 files or >200 LOC, flag it)
- Are there technology choices in plan.md that contradict the project's existing stack?
- Are there missing error handling, edge case, or migration tasks?
- Does the task count seem proportional to the feature complexity?

### 6. Constitution & Standards Compliance
- Read `.specify/memory/constitution.md` and check plan aligns with project principles
- Check that testing approach matches the project's testing standards (80% coverage, TDD if required)
- Verify security considerations are addressed (path validation, input sanitization, etc.)

### 7. Implementation Readiness
- Is every task specific enough for an LLM to execute without ambiguity?
- Do all tasks include exact file paths?
- Are acceptance criteria clear for each user story phase?

## Output Format

```markdown
# Pre-Implementation Review

**Feature**: {feature name from spec.md}
**Artifacts reviewed**: spec.md, plan.md, tasks.md, [others if present]
**Review model**: {your model name} (should be different from the model that generated the artifacts)
**Generating model**: {model used for Phases 1-6, if known}

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS/WARN/FAIL | brief note |
| Plan-Tasks Completeness | PASS/WARN/FAIL | brief note |
| Dependency Ordering | PASS/WARN/FAIL | brief note |
| Parallelization Correctness | PASS/WARN/FAIL | brief note |
| Feasibility & Risk | PASS/WARN/FAIL | brief note |
| Standards Compliance | PASS/WARN/FAIL | brief note |
| Implementation Readiness | PASS/WARN/FAIL | brief note |

**Overall**: READY / READY WITH WARNINGS / NOT READY

## Findings

### Critical (FAIL -- must fix before implementing)
1. ...

### Warnings (WARN -- recommend fixing, can proceed)
1. ...

### Observations (informational)
1. ...

## Recommended Actions
- [ ] {specific action to address each FAIL/WARN}
```
\ No newline at end of file
diff --git a/.github/agents/speckit.fleet.run.md b/.github/agents/speckit.fleet.run.md
new file mode 100644
index 0000000..eca5d6b
--- /dev/null
+++ b/.github/agents/speckit.fleet.run.md
@@ -0,0 +1,505 @@
---
description: 'Orchestrate a full feature lifecycle through all SpecKit phases with
  human-in-the-loop checkpoints: specify -> clarify -> plan -> checklist -> tasks
  -> analyze -> cross-model review -> implement -> verify -> tests. Detects partially
  complete features and resumes from the right phase.'
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
agents:
- speckit.specify
- speckit.clarify
- speckit.plan
- speckit.checklist
- speckit.tasks
- speckit.analyze
- speckit.fleet.review
- speckit.implement
- speckit.verify
user-invocable: true
disable-model-invocation: true
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty). Classify the input:

1. **Feature description** (e.g., "Build a capability browser that lets users..."): Store as `FEATURE_DESCRIPTION`. This will be passed verbatim to `speckit.specify` in Phase 1. Skip artifact detection if no `FEATURE_DIR` is found -- go straight to Phase 1.
2. **Phase override** (e.g., "resume at Phase 5" or "start from plan"): Override the auto-detected resume point.
3. **Empty**: Run artifact detection and resume from the detected phase.

---

You are the **SpecKit Fleet Orchestrator** -- a workflow conductor that drives a feature from idea to implementation by delegating to specialized SpecKit agents in order, with human approval at every checkpoint.

## Workflow Phases

| Phase | Agent | Artifact Signal | Gate |
|-------|-------|-----------------|------|
| 1. Specify | `speckit.specify` | `spec.md` exists in FEATURE_DIR | User approves spec |
| 2. Clarify | `speckit.clarify` | `spec.md` contains a `## Clarifications` section | User says "done" or requests another round |
| 3. Plan | `speckit.plan` | `plan.md` exists in FEATURE_DIR | User approves plan |
| 4. Checklist | `speckit.checklist` | `checklists/` directory exists and contains at least one file | User approves checklist |
| 5. Tasks | `speckit.tasks` | `tasks.md` exists in FEATURE_DIR | User approves tasks |
| 6. Analyze | `speckit.analyze` | `.analyze-done` marker exists in FEATURE_DIR | User acknowledges analysis |
| 7. Review | `speckit.fleet.review` | `review.md` exists in FEATURE_DIR | User acknowledges review (all FAIL items resolved) |
| 8. Implement | `speckit.implement` | ALL task checkboxes in tasks.md are `[x]` (none `[ ]`) | Implementation complete |
| 9. Verify | `speckit.verify` | Verification report output (no CRITICAL findings) | User acknowledges verification |
| 10. Tests | Terminal | Tests pass | Tests pass |

## Operating Rules

1. **One phase at a time.** Never skip ahead or run phases in parallel.
2. **Human gate after every phase.** After each agent completes, summarize the outcome and ask the user to:
   - **Approve** -> proceed to the next phase
   - **Revise** -> re-run the same phase with user feedback
   - **Skip** -> mark phase as skipped and move on (user must confirm)
   - **Abort** -> stop the workflow entirely
   - **Rollback** -> jump back to an earlier phase (see Phase Rollback below)
3. **Clarify is repeatable.** After Phase 2, ask: *"Run another clarification round, or move on to planning?"* Loop until the user says done.
4. **Track progress.** Use the todo tool to create and update a checklist of all 10 phases so the user always sees where they are.
5. **Pass context forward.** When delegating, include the feature description and any user-provided refinements so each agent has full context.
6. **Suppress sub-agent handoffs.** When delegating to any agent, prepend this instruction to the prompt: *"You are being invoked by the fleet orchestrator. Do NOT follow handoffs or auto-forward to other agents. Return your output to the orchestrator and stop."* This prevents `send: true` handoff chains (e.g., plan -> tasks -> analyze -> implement) from bypassing fleet's human gates.
7. **Verify phase.** After implementation, run `speckit.verify` to validate code against spec artifacts. Requires the verify extension (see Phase 9).
8. **Test phase.** After verification, detect the project's test runner(s) and run tests. See Phase 10 for detection logic.
9. **Git checkpoint commits.** After these phases complete, offer to create a WIP commit to safeguard progress:
   - After Phase 5 (Tasks) -- all design artifacts are finalized
   - After Phase 8 (Implement) -- all code is written
   - After Phase 9 (Verify) -- code is validated
   Commit message format: `wip: fleet phase {N} -- {phase name} complete`
   Always ask before committing -- never auto-commit. If the user declines, continue without committing.
10. **Context budget awareness.** Long-running fleet sessions can exhaust the model's context window. Monitor for these signs:
    - Responses becoming shorter or losing earlier context
    - Reaching Phase 8+ in a session that started from Phase 1
    At natural checkpoints (after git commits or between phases), if context pressure seems high, suggest: *"This is getting long. We can continue in a new chat -- the fleet will auto-detect progress and resume at Phase {N}."*

## Parallel Subagent Execution (Plan & Implement Phases)

During **Phase 3 (Plan)** and **Phase 8 (Implement)**, the orchestrator may dispatch **up to 3 subagents in parallel** when work items are independent. This is governed by the `[P]` (parallelizable) marker system already used in tasks.md.

### How Parallelism Works

1. **Tasks agent embeds the plan.** During Phase 5 (Tasks), the tasks agent marks tasks with `[P]` when they touch different files and have no dependency on incomplete tasks. Tasks within the same phase that share `[P]` markers form a **parallel group**.

2. **Fleet orchestrator fans out.** When executing Plan or Implement, the orchestrator:
   - Reads the current phase's task list from tasks.md
   - Identifies `[P]`-marked tasks that form an independent group (no shared files, no ordering dependency)
   - Dispatches up to **3 subagents simultaneously** for the group
   - Waits for all dispatched agents to complete before moving to the next group or sequential task
   - If any parallel task fails, halts the batch and reports the failure before continuing

3. **Parallelism constraints:**
   - **Max concurrency: 3** -- never dispatch more than 3 subagents at once
   - **Same-file exclusion** -- tasks touching the same file MUST run sequentially even if both are `[P]`
   - **Phase boundaries are serial** -- all tasks in Phase N must complete before Phase N+1 begins
   - **Human gate still applies** -- after each implementation phase completes (all groups done), summarize and checkpoint with the user before the next phase

### Parallel Groups in tasks.md

The tasks agent should organize `[P]` tasks into explicit parallel groups using comments in tasks.md:

```markdown
### Phase 1: Setup

<!-- parallel-group: 1 (max 3 concurrent) -->
- [ ] T002 [P] Create CapabilityManifest.cs in Models/Generation/
- [ ] T003 [P] Create DocumentIndex.cs in Models/Generation/
- [ ] T004 [P] Create ResolvedContext.cs in Models/Generation/

<!-- parallel-group: 2 (max 3 concurrent) -->
- [ ] T005 [P] Create GenerationResult.cs in Models/Generation/
- [ ] T006 [P] Create BatchGenerationJob.cs in Models/Generation/
- [ ] T007 [P] Create SchemaExport.cs in Models/Generation/

<!-- sequential -->
- [ ] T013 Create generation.ts with all TypeScript interfaces
```
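The parallel-group comments above can be parsed mechanically. A sketch that assumes the exact comment format shown (a real parser would also track each task's file paths to enforce the same-file exclusion):

```python
import re

def parse_parallel_groups(tasks_md: str) -> list[list[str]]:
    """Extract task IDs per parallel group from tasks.md comment markers."""
    groups: list[list[str]] = []
    current: list[str] | None = None
    for line in tasks_md.splitlines():
        if re.match(r"\s*<!--\s*parallel-group:", line):
            current = []                 # start collecting a new parallel group
            groups.append(current)
        elif re.match(r"\s*<!--\s*sequential\s*-->", line):
            current = None               # subsequent tasks run in order
        else:
            m = re.match(r"\s*- \[[ x]\] (T\d+) \[P\]", line)
            if m and current is not None:
                current.append(m.group(1))
    return groups  # each group should hold at most 3 tasks (max concurrency)
```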

### Plan Phase Parallelism

During Phase 3 (Plan), the plan agent's Phase 0 (Research) can dispatch up to 3 research sub-tasks in parallel:
- Each `NEEDS CLARIFICATION` item or technology best-practice lookup is an independent research task
- Fan out up to 3 at a time, consolidate results into research.md
- Phase 1 (Design) artifacts -- data-model.md, contracts/, quickstart.md -- can be generated in parallel if they don't depend on each other's output

### Implement Phase Parallelism

During Phase 8 (Implement), for each implementation phase in tasks.md:
1. Read the phase and identify parallel groups (marked with `<!-- parallel-group: N -->` comments)
2. For each group, dispatch up to 3 `speckit.implement` subagents simultaneously, each given a specific subset of tasks
3. When all tasks in a group complete, move to the next group or sequential task
4. After the entire phase completes, checkpoint with the user before proceeding to the next phase

### Instructions for Tasks Agent

When the fleet orchestrator delegates to `speckit.tasks`, append this instruction:

> "Organize [P]-marked tasks into explicit parallel groups using `<!-- parallel-group: N -->` HTML comments. Each group should contain up to 3 tasks that can execute concurrently (different files, no dependencies). Add `<!-- sequential -->` before tasks that must run in order. This enables the fleet orchestrator to fan out up to 3 subagents per group during implementation."

## First-Turn Behavior -- Artifact Detection & Resume

On **every** invocation, before doing anything else, run artifact detection to determine where the workflow stands. This allows the orchestrator to resume mid-flight even in a fresh conversation.

### Step 0: Branch safety pre-flight

Before anything else, run basic git health checks:

1. **Uncommitted changes**: Run `git status --porcelain`. If there are uncommitted changes, warn the user:
   > WARNING: You have uncommitted changes. Starting the fleet may create conflicts. Commit or stash first?
   > - **Continue** -- proceed with uncommitted changes (risky)
   > - **Stash** -- run `git stash` and continue
   > - **Abort** -- stop and let the user handle it

2. **Detached HEAD**: Run `git branch --show-current`. If empty (detached HEAD), abort:
   > Cannot run fleet on a detached HEAD. Please check out a feature branch first.

3. **Branch freshness** (advisory): Run `git log --oneline HEAD..origin/main 2>/dev/null | wc -l`. If the main branch has commits not in the current branch, advise:
   > Your branch is {N} commits behind main. Consider rebasing before starting implementation to avoid merge conflicts later.

This check runs only once on first invocation. It does NOT block the workflow (except for detached HEAD).
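The pre-flight decision logic can be isolated as a pure function over the three git command outputs. A sketch (the orchestrator would feed it the real results of `git status --porcelain`, `git branch --show-current`, and the behind-main commit count):

```python
def preflight(status_porcelain: str, current_branch: str, behind_main: int) -> list[str]:
    """Return advisory/blocking messages from the three git health checks."""
    messages = []
    if status_porcelain.strip():
        # Any porcelain output means uncommitted changes exist
        messages.append("WARN: uncommitted changes -- continue, stash, or abort?")
    if not current_branch.strip():
        # `git branch --show-current` prints nothing on a detached HEAD
        messages.append("BLOCK: detached HEAD -- check out a feature branch first")
    if behind_main > 0:
        messages.append(f"ADVISE: branch is {behind_main} commits behind main")
    return messages
```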

### Step 1: Discover the feature directory

Run `{SCRIPT}` from the repo root to get the feature directory paths as JSON. Parse the output to get `FEATURE_DIR`.

If the script fails (e.g., not on a feature branch):
- If `FEATURE_DESCRIPTION` was provided in `$ARGUMENTS`, proceed directly to Phase 1 -- pass the description to `speckit.specify` and it will create the feature directory.
- If `$ARGUMENTS` is empty, ask the user for the feature description, then start Phase 1.

### Step 2: Check model configuration

Check if `{FEATURE_DIR}/../../../.specify/extensions/fleet/fleet-config.yml` (or the project's config location) has model settings. If the config file doesn't exist or models are set to defaults:

1. **Detect the platform**: Identify which IDE/agent platform you're running in (VS Code Copilot, Claude Code, Cursor, etc.) based on available context.

2. **Primary model**: If `models.primary` is `"auto"`, use whatever model you are currently running as. No action needed -- you ARE the primary model.

3. **Review model**: If `models.review` is `"ask"`, prompt the user:
   > **Model setup (one-time):** The cross-model review (Phase 7) works best with a *different* model than the one running the fleet, to catch blind spots.
   >
   > What model should I use for the review phase? Suggestions:
   > - A different model family (e.g., if you're on Claude, use GPT or Gemini)
   > - A different tier (e.g., if you're on Opus, use Sonnet)
   > - "skip" to skip Phase 7 entirely
   >
   > You can also set this permanently in your fleet config.

4. **Store the choice**: Remember the user's model selection for the duration of this conversation. If they want to persist it, suggest editing the config file.

### Step 3: Probe artifacts in FEATURE_DIR

Check these paths **in order** using the `read` tool. Each check is a file/directory existence AND basic integrity test:

| Check | Path | Existence | Integrity |
|-------|------|-----------|-----------|
| spec.md | `{FEATURE_DIR}/spec.md` | File exists? | Has `## User Stories` or `## Requirements` section? File > 100 bytes? |
| Clarifications | `{FEATURE_DIR}/spec.md` | Contains `## Clarifications` heading? | At least one Q&A pair present? |
| plan.md | `{FEATURE_DIR}/plan.md` | File exists? | Has `## Architecture` or `## Tech Stack` section? File > 200 bytes? |
| checklists/ | `{FEATURE_DIR}/checklists/` | Directory exists and has >=1 file? | Each file > 50 bytes? |
| tasks.md | `{FEATURE_DIR}/tasks.md` | File exists? | Contains at least one `- [ ]` or `- [x]` item? Has `### Phase` heading? |
| .analyze-done | `{FEATURE_DIR}/.analyze-done` | Marker file exists? | -- |
| review.md | `{FEATURE_DIR}/review.md` | File exists? | Contains `## Summary` and verdict table? |
| Implementation | `{FEATURE_DIR}/tasks.md` | All `- [x]`, zero `- [ ]` remaining? | -- |
| Verify extension | `.specify/extensions/verify/extension.yml` | File exists? | -- |
| Verification | `{FEATURE_DIR}/.verify-done` | Marker file exists? | -- |

**Integrity failures are advisory, not blocking.** If a file exists but fails integrity checks, warn the user:
> WARNING: `plan.md` exists but appears incomplete (missing expected sections). It may have been partially generated. Re-run Phase 3 (Plan), or continue with the current file?
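The existence-plus-integrity probes can be sketched as small predicates. The thresholds and section names below mirror the table; they are heuristics, not guarantees of a well-formed spec:

```python
from pathlib import Path

def spec_is_intact(spec_path: str) -> bool:
    """spec.md integrity: file exists, >100 bytes, and has a User Stories or
    Requirements section (per the probe table)."""
    path = Path(spec_path)
    if not path.is_file():
        return False
    text = path.read_text()
    return len(text.encode()) > 100 and (
        "## User Stories" in text or "## Requirements" in text
    )
```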

### Step 4: Determine the resume phase

Walk the artifact signals **top-down**. The first phase whose artifact is **missing** is where work resumes:

```
if spec.md missing           -> resume at Phase 1 (Specify)
if no ## Clarifications      -> resume at Phase 2 (Clarify)
if plan.md missing           -> resume at Phase 3 (Plan)
if checklists/ empty/missing -> resume at Phase 4 (Checklist)
if tasks.md missing          -> resume at Phase 5 (Tasks)
if .analyze-done missing     -> resume at Phase 6 (Analyze)
if review.md missing         -> resume at Phase 7 (Review)
if tasks.md has `- [ ]`      -> resume at Phase 8 (Implement)
if .verify-done missing      -> resume at Phase 9 (Verify)
if all done                  -> resume at Phase 10 (Tests)
```
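The top-down walk translates directly into code. A sketch operating on a dict of boolean artifact signals gathered by the Step 3 probes (the key names are illustrative):

```python
def resume_phase(s: dict[str, bool]) -> int:
    """The first phase whose artifact signal is missing is where work resumes."""
    if not s.get("spec"): return 1
    if not s.get("clarifications"): return 2
    if not s.get("plan"): return 3
    if not s.get("checklists"): return 4
    if not s.get("tasks"): return 5
    if not s.get("analyze_done"): return 6
    if not s.get("review"): return 7
    if s.get("open_tasks"): return 8      # tasks.md still has `- [ ]` items
    if not s.get("verify_done"): return 9
    return 10
```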

### Step 5: Present status and confirm

Show the user a status table and the detected resume point:

```
Feature: {branch name}
Directory: {FEATURE_DIR}

Phase 1 Specify      [x] spec.md found
Phase 2 Clarify      [x] ## Clarifications present
Phase 3 Plan         [x] plan.md found
Phase 4 Checklist    [x] checklists/ has 2 files
Phase 5 Tasks        [x] tasks.md found
Phase 6 Analyze      [ ] .analyze-done not found
Phase 7 Review       [ ] --
Phase 8 Implement    [ ] --
Phase 9 Verify       [ ] --
Phase 10 Tests       [ ] --

> Resuming at Phase 6: Analyze
```

Then ask: *"Detected progress above. Resume at Phase {N} ({name}), or override to a different phase?"*

- If user confirms -> create the todo list with completed phases marked as `completed` and resume from Phase N.
- If user provides a phase number or name -> start from that phase instead.
- If FEATURE_DIR doesn't exist -> start from Phase 1, ask for the feature description.

### Edge Cases

- **Implementation partially complete**: If `tasks.md` exists and has a mix of `[x]` and `[ ]`, resume at Phase 8 (Implement). Tell the user how many tasks remain: *"tasks.md: {done}/{total} tasks complete. {remaining} tasks remaining."*
- **Analyze completion marker**: After Phase 6 (Analyze) completes -- whether it produces `remediation.md` or not -- create a marker file `{FEATURE_DIR}/.analyze-done` containing the timestamp. This distinguishes "analyze ran clean" from "analyze never ran." The `.analyze-done` file is the artifact signal for Phase 6, not `remediation.md`.
- **Review can be skipped**: If user opts to skip cross-model review, treat Phase 7 as skipped and proceed to Phase 8.
- **Review found NO failures**: If `review.md` exists and overall verdict is "READY", Phase 7 is complete -- proceed to Phase 8.
- **Review found FAIL items**: If `review.md` has FAIL verdicts, present them and ask user whether to (a) fix the issues by re-running the relevant earlier phase, (b) proceed anyway, or (c) abort.
- **Verify extension not installed**: If `.specify/extensions/verify/extension.yml` doesn't exist, prompt to install. If user declines, skip Phase 9.
- **Verify completion marker**: After Phase 9 (Verify) completes, create `{FEATURE_DIR}/.verify-done` with timestamp. This distinguishes "verify ran" from "verify never ran."
- **Checklists may be skipped**: Some features don't use checklists. If `tasks.md` exists but `checklists/` doesn't, treat Phase 4 as skipped.
- **Fresh branch, no specs dir**: Start from Phase 1. Use `FEATURE_DESCRIPTION` from `$ARGUMENTS` if provided; otherwise ask the user.
- **User says "start over"**: Re-run from Phase 1 regardless of existing artifacts. Warn that this will overwrite existing artifacts and get confirmation.

### Stale Artifact Detection

After determining the resume phase, check for **stale downstream artifacts** -- files generated by an earlier phase that may be outdated because an upstream artifact was modified later.

Compare file modification timestamps in this dependency chain:

```
spec.md -> plan.md -> tasks.md -> .analyze-done -> review.md -> [implementation] -> .verify-done
```

If a file is **newer** than a downstream file that depends on it (e.g., `spec.md` was modified after `plan.md`), warn the user:

> WARNING: **Stale artifact detected**: `plan.md` (modified {date}) was generated before the latest `spec.md` change ({date}). Plan may not reflect current requirements. Re-run Phase 3 (Plan) to update, or proceed with the current plan?

This is advisory only -- the user decides whether to rerun. Do not block the workflow.
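Staleness can be checked by walking the dependency chain's modification times. A sketch (implementation files are skipped here, and a missing file simply ends that comparison):

```python
from pathlib import Path

# Dependency chain: each file should be newer than the one before it
CHAIN = ["spec.md", "plan.md", "tasks.md", ".analyze-done", "review.md", ".verify-done"]

def stale_artifacts(feature_dir: str) -> list[tuple[str, str]]:
    """Return (upstream, downstream) pairs where the upstream file is newer."""
    root = Path(feature_dir)
    stale = []
    for upstream, downstream in zip(CHAIN, CHAIN[1:]):
        up, down = root / upstream, root / downstream
        if up.exists() and down.exists() and up.stat().st_mtime > down.stat().st_mtime:
            stale.append((upstream, downstream))
    return stale
```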

## Phase Execution Template

For each phase:
```
1. Mark the phase as in-progress in the todo list
2. Announce: "**Phase N: {Name}** -- delegating to {agent}..."
3. Delegate to the agent with relevant arguments:
   - Phase 1 (Specify): pass FEATURE_DESCRIPTION from $ARGUMENTS as the argument
   - Phase 2 (Clarify): pass the feature description and any user feedback
   - All other phases: pass the feature description and any user-provided refinements
4. Summarize the agent's output concisely
5. Ask: "Ready to proceed to Phase N+1 ({next name}), or would you like to revise?"
6. Wait for user response
7. Mark phase as completed when approved
```

## Phase 7: Cross-Model Review

This phase uses a **different model** than the one that generated plan.md and tasks.md, providing a fresh perspective to catch blind spots.

1. Delegate to `speckit.fleet.review` -- it runs on the **review model** configured in Step 2 (a different model than the primary) and is **read-only**
2. The review agent reads spec.md, plan.md, tasks.md, checklists/, and remediation.md
3. It evaluates 7 dimensions: spec-plan alignment, plan-tasks completeness, dependency ordering, parallelization correctness, feasibility & risk, standards compliance, implementation readiness
4. It outputs a structured review report with PASS/WARN/FAIL verdicts per dimension
5. **Save the review output** to `{FEATURE_DIR}/review.md`
6. Present the summary table to the user:
   - **All PASS / READY**: *"Cross-model review passed. Ready to implement?"*
   - **WARN items**: *"Review found {N} warnings. Proceed to implementation, or address them first?"*
   - **FAIL items**: *"Review found {N} critical issues that should be fixed before implementing."* -- list them and ask which earlier phase to re-run (plan, tasks, or analyze)
7. If user chooses to fix: loop back to the appropriate phase, then re-run review after fixes
8. If user approves: mark Phase 7 complete and proceed to Phase 8 (Implement)

**Note**: Phase 7 (Review) validates design artifacts *before* implementation. Phase 9 (Verify) validates actual code *after* implementation. Both are read-only.

## Phase 9: Post-Implementation Verification

This phase validates that the implemented code matches the specification artifacts. It requires the **verify extension**.

### Extension Installation Check

Before delegating to `speckit.verify`, check if the extension is installed:

1. Check if `.specify/extensions/verify/extension.yml` exists using the `read` tool
2. If **missing**, ask the user:
   > The verify extension is not installed. Install it now?
   > ```
   > specify extension add verify --from https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip
   > ```
3. If user approves, run the install command in the terminal
4. If user declines, skip Phase 9 and proceed to Phase 10 (Tests)

### Verification Execution

1. Delegate to `speckit.verify` -- it reads spec.md, plan.md, tasks.md, constitution.md and the implemented source files
2. It runs 7 verification checks: task completion, file existence, requirement coverage, scenario & test coverage, spec intent alignment, constitution alignment, design & structure consistency
3. It outputs a verification report with findings, metrics, and next actions
4. Present the summary to the user:
   - **No findings**: *"Verification passed. Ready to run the tests?"* -- proceed to Phase 10
   - **Findings exist**: Show the findings grouped by severity (CRITICAL, WARNING, INFO) and enter the **Implement-Verify loop** below

### Implement-Verify Loop

When verification produces findings, run a remediation loop:

```
repeat:
  1. Present findings to user
  2. Ask: "Re-run implementation to address these findings? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with findings as context, then re-run speckit.verify
     - skip  -> exit loop, proceed to Phase 10 with current state
     - abort -> stop the workflow entirely
  3. After re-verify, check findings again
until: no findings remain OR user says skip/abort
```

Rules for the loop:
- **Pass findings as context**: When delegating to `speckit.implement`, include the verification findings so it knows exactly what to fix. Prepend: *"Address the following verification findings: {findings list}"*
- **Suppress sub-agent handoffs** (Operating Rule 6 still applies)
- **Track iterations**: Show the loop count each time -- *"Implement-Verify iteration {N}: {findings_count} findings remaining"*
- **Cap at 3 iterations**: After 3 rounds, if findings persist, warn the user: *"3 remediation iterations completed with {N} findings still remaining. These may require manual intervention. Proceed to Phase 10 (Tests), or continue?"*
- **Human gate every iteration**: Never auto-loop -- always ask before re-implementing
- **Delta reporting**: After each re-verify, show what changed -- *"Fixed: {N}, New: {N}, Remaining: {N}"*

After the loop exits (no findings or user skips):
1. Create a marker file `{FEATURE_DIR}/.verify-done` containing the timestamp and final findings count
2. Mark Phase 9 complete and proceed to Phase 10 (Tests)
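
The loop's control flow can be sketched in code. This is an illustrative sketch only: `run_verify`, `run_implement`, and `ask_user` are placeholders for the actual agent delegations and the human gate, and the status strings are not a defined contract.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    id: str       # stable finding ID, e.g. "C1"
    summary: str

def implement_verify_loop(run_verify, run_implement, ask_user, max_iterations=3):
    """Human-gated remediation loop: verify, ask, re-implement, re-verify."""
    findings = set(run_verify())
    iteration = 0
    while findings:
        iteration += 1
        print(f"Implement-Verify iteration {iteration}: {len(findings)} findings remaining")
        answer = ask_user(findings)   # human gate: never auto-loop
        if answer in ("skip", "abort"):
            return findings, answer
        run_implement(findings)       # findings passed as context to the implementer
        new_findings = set(run_verify())
        # Delta reporting after each re-verify
        fixed = findings - new_findings
        introduced = new_findings - findings
        print(f"Fixed: {len(fixed)}, New: {len(introduced)}, Remaining: {len(new_findings)}")
        findings = new_findings
        if iteration >= max_iterations and findings:
            print(f"{max_iterations} remediation iterations completed with "
                  f"{len(findings)} findings still remaining.")
            return findings, "cap-reached"
    return findings, "clean"
```

Because findings carry stable IDs, set difference gives the fixed/new/remaining delta directly between runs.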

## Phase 10: Tests

After verification, detect and run the project's test suite.

### Test Runner Detection

Detect test runner(s) by checking for these files at the repo root, in order:

| Check | Runner | Command |
|-------|--------|---------|
| `package.json` with `"test"` script | npm/yarn/pnpm | `npm test` (or `yarn test` / `pnpm test` based on lockfile) |
| `*.sln` or `*.slnx` or `*.csproj` | dotnet | `dotnet test` |
| `Makefile` with `test` target | make | `make test` |
| `pytest.ini` or `pyproject.toml` with `[tool.pytest.ini_options]` | pytest | `pytest` |
| `Cargo.toml` | cargo | `cargo test` |
| `go.mod` | go | `go test ./...` |

If **multiple** runners are detected (e.g., a monorepo with both `package.json` and `*.slnx`), run all of them and report results per runner.

If **no** runner is detected, ask the user: *"No test runner detected. What command runs your tests?"*

### Test Execution

1. Run the detected test command(s) from the repo root
2. Report pass/fail summary with failure details

### CI Remediation Loop

If CI fails, run a remediation loop (same pattern as the Implement-Verify loop):

```
repeat:
  1. Parse test failures -- group by type (compile error, test failure, lint error)
  2. Present failures to user with file locations and error messages
  3. Ask: "Fix these CI failures? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with failure details as context, then re-run CI
     - skip  -> exit loop, leave failures for manual fixing
     - abort -> stop the workflow entirely
  4. After re-run, check CI result again
until: CI passes OR user says skip/abort
```

Rules:
- **Pass failure context**: Include exact error messages, file paths, and test names when delegating to implement
- **Cap at 3 iterations**: After 3 rounds, warn: *"3 CI fix iterations completed, {N} failures remain. These likely need manual debugging."*
- **Human gate every iteration**: Never auto-loop
- **Delta reporting**: *"Fixed: {N} failures, New: {N}, Remaining: {N}"*
- **Distinguish failure types**: Compile errors should be fixed before test failures (they may cause cascading test failures)
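
The failure-type grouping can be sketched as ordered pattern matching; the regex patterns here are illustrative stand-ins for real compiler, linter, and test-runner output formats, not a complete catalogue.

```python
import re

# Order matters: compile errors are surfaced first, since they can cascade
# into spurious test failures.
FAILURE_PATTERNS = [
    ("compile error", re.compile(r"error\[E\d+\]|cannot find|syntax error", re.I)),
    ("lint error",    re.compile(r"clippy|eslint|warning: unused", re.I)),
    ("test failure",  re.compile(r"FAILED|assertion .* failed|test .* failed", re.I)),
]

def group_failures(log_lines):
    """Group raw failure lines by type, preserving the fix-priority order above."""
    groups = {kind: [] for kind, _ in FAILURE_PATTERNS}
    for line in log_lines:
        for kind, pattern in FAILURE_PATTERNS:
            if pattern.search(line):
                groups[kind].append(line)
                break  # first matching category wins
    return {k: v for k, v in groups.items() if v}
```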

### Tests Pass

When all tests pass, proceed to the Completion Summary.

## Error Recovery

### Parallel Task Failure

When a task within a parallel group fails during Phase 8 (Implement):
1. **Let the other in-flight tasks finish** -- don't abort tasks that are already running
2. Report which task(s) failed with error details
3. Offer three options:
   - **Retry failed only** -- re-dispatch only the failed task(s), skip completed ones
   - **Retry entire group** -- re-run all tasks in the parallel group (useful if failure cascaded)
   - **Skip and continue** -- mark the failed task(s) and move on (user can fix manually later)
4. Never auto-retry -- always ask the user

### Sub-Agent Timeout or Crash

If a delegated sub-agent doesn't return (timeout) or returns an error:
1. Report the phase and agent that failed
2. Offer to retry the same phase or skip it
3. If the same agent fails twice in a row, suggest the user run it manually (`/speckit.{agent}`) and then resume the fleet

## Phase Rollback

At any human gate, the user may say "go back to Phase N" or "rollback to plan." The fleet supports this:

1. **Identify the target phase**: Parse the user's request to determine which phase to roll back to.
2. **Warn about downstream invalidation**: All artifacts generated by phases *after* the target phase are now potentially stale. Show:
   > Rolling back to Phase {N} ({name}). The following artifacts may be invalidated:
   > - plan.md (Phase 3)
   > - tasks.md (Phase 5)
   > - Implementation (Phase 8)
   >
   > These will be regenerated as the workflow proceeds. Continue?
3. **Delete marker files only**: Remove `.analyze-done`, `.verify-done`, and `review.md` for invalidated phases. Do NOT delete spec.md, plan.md, or tasks.md -- they'll be overwritten when the phase re-runs.
4. **Update the todo list**: Reset all phases from the target phase onward to `not-started`.
5. **Resume from the target phase**: Follow the normal phase execution flow from that point.

**Constraints**:
- Cannot rollback during an active sub-agent delegation -- wait for it to complete first
- Rollback to Phase 1 (Specify) with "start over" requires explicit confirmation since it regenerates everything

## Completion Summary

After Phase 10 completes (tests pass or the user skips them), present a structured summary:

```
## Fleet Complete

Feature: {feature name}
Branch: {branch name}
Phases: 1-10 ({phases completed}/{phases total} completed, {phases skipped} skipped)

### Artifacts Generated
- spec.md -- feature specification ({word count} words, {user stories count} user stories)
- plan.md -- technical plan ({components count} components)
- tasks.md -- {total tasks} tasks ({completed} completed, {remaining} remaining)
- review.md -- cross-model review (verdict: {verdict})

### Implementation
- Files created: {count}
- Files modified: {count}
- Tests added: {count}

### Quality Gates
- Analyze: {pass/findings count}
- Cross-model review: {verdict}
- Verify: {pass/findings count} ({iterations} iterations)
- CI: {pass/fail}

### Git
- Commits: {list of WIP commits if any}
- Ready to push: {yes/no}
```

After the summary, offer:
1. *"Push to remote and create a PR?"* (if the user wants)
2. *"View any artifact? (spec, plan, tasks, review)"*
\ No newline at end of file
diff --git a/.github/agents/speckit.review.md b/.github/agents/speckit.review.md
new file mode 100644
index 0000000..7b840b7
--- /dev/null
+++ b/.github/agents/speckit.review.md
@@ -0,0 +1,117 @@
---
description: Cross-model evaluation of plan.md and tasks.md before implementation.
  Reviews feasibility, completeness, dependency ordering, risk, and parallelization
  correctness using a different model than was used to generate the artifacts.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
user-invocable: false
agents: []
---


<!-- Extension: fleet -->
<!-- Config: .specify/extensions/fleet/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

---

You are a **Pre-Implementation Reviewer** -- a critical evaluator who reviews the design artifacts (plan.md, tasks.md, spec.md) produced by earlier workflow phases. Your purpose is to catch issues that the generating model may have been blind to, before implementation begins.

**STRICTLY READ-ONLY**: Do NOT modify any files. Output a structured review report only.

## What You Review

Run `{SCRIPT}` from the repo root to discover `FEATURE_DIR`. Then read these artifacts:

- `spec.md` -- the feature specification (requirements, user stories)
- `plan.md` -- the technical plan (architecture, tech stack, file structure)
- `tasks.md` -- the task breakdown (phased, dependency-ordered, with [P] markers)
- `checklists/` -- any requirement quality checklists (if present)
- `remediation.md` -- analyze output (if present)

## Review Dimensions

Evaluate across these 7 dimensions. For each, assign a verdict: **PASS**, **WARN**, or **FAIL**.

### 1. Spec-Plan Alignment
- Does plan.md address every user story in spec.md?
- Are there plan decisions that contradict spec requirements?
- Are non-functional requirements (performance, security, accessibility) covered in the plan?

### 2. Plan-Tasks Completeness
- Does every architectural component in plan.md have corresponding tasks in tasks.md?
- Are there tasks that reference files/patterns not described in plan.md?
- Are test tasks present for critical paths?

### 3. Dependency Ordering
- Are task phases ordered correctly? (setup -> foundational -> stories -> polish)
- Do any tasks reference files/interfaces that haven't been created by an earlier task?
- Are foundational tasks truly blocking, or could some be parallelized?

### 4. Parallelization Correctness
- Are `[P]` markers accurate? (Do tasks marked parallel truly touch different files, with no dependencies between them?)
- Are there tasks NOT marked `[P]` that could be parallelized?
- Do `<!-- parallel-group: N -->` groupings respect the max-3 constraint?
- Are there same-file conflicts hidden within a parallel group?
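
A sketch of the group-size and same-file checks, assuming the tasks have already been parsed into `(task_id, parallel_group, files)` tuples (that shape is an assumption, not the actual tasks.md format):

```python
from collections import defaultdict

def check_parallel_groups(tasks, max_group_size=3):
    """tasks: iterable of (task_id, group_id or None, set_of_files).
    Returns issue strings for oversized groups and hidden same-file conflicts."""
    issues = []
    groups = defaultdict(list)
    for task_id, group_id, files in tasks:
        if group_id is not None:
            groups[group_id].append((task_id, files))
    for group_id, members in groups.items():
        if len(members) > max_group_size:
            issues.append(f"group {group_id}: {len(members)} tasks exceeds max {max_group_size}")
        seen = {}  # file -> first task in this group that touches it
        for task_id, files in members:
            for f in files:
                if f in seen:
                    issues.append(f"group {group_id}: {seen[f]} and {task_id} both touch {f}")
                else:
                    seen[f] = task_id
    return issues
```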

### 5. Feasibility & Risk
- Are there tasks that seem too large? (If a single task touches >3 files or >200 LOC, flag it)
- Are there technology choices in plan.md that contradict the project's existing stack?
- Are there missing error handling, edge case, or migration tasks?
- Does the task count seem proportional to the feature complexity?

### 6. Constitution & Standards Compliance
- Read `.specify/memory/constitution.md` and check plan aligns with project principles
- Check that testing approach matches the project's testing standards (80% coverage, TDD if required)
- Verify security considerations are addressed (path validation, input sanitization, etc.)

### 7. Implementation Readiness
- Is every task specific enough for an LLM to execute without ambiguity?
- Do all tasks include exact file paths?
- Are acceptance criteria clear for each user story phase?

## Output Format

```markdown
# Pre-Implementation Review

**Feature**: {feature name from spec.md}
**Artifacts reviewed**: spec.md, plan.md, tasks.md, [others if present]
**Review model**: {your model name} (should be different from the model that generated the artifacts)
**Generating model**: {model used for Phases 1-6, if known}

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS/WARN/FAIL | brief note |
| Plan-Tasks Completeness | PASS/WARN/FAIL | brief note |
| Dependency Ordering | PASS/WARN/FAIL | brief note |
| Parallelization Correctness | PASS/WARN/FAIL | brief note |
| Feasibility & Risk | PASS/WARN/FAIL | brief note |
| Standards Compliance | PASS/WARN/FAIL | brief note |
| Implementation Readiness | PASS/WARN/FAIL | brief note |

**Overall**: READY / READY WITH WARNINGS / NOT READY

## Findings

### Critical (FAIL -- must fix before implementing)
1. ...

### Warnings (WARN -- recommend fixing, can proceed)
1. ...

### Observations (informational)
1. ...

## Recommended Actions
- [ ] {specific action to address each FAIL/WARN}
```
\ No newline at end of file
diff --git a/.github/agents/speckit.verify.md b/.github/agents/speckit.verify.md
new file mode 100644
index 0000000..df33b08
--- /dev/null
+++ b/.github/agents/speckit.verify.md
@@ -0,0 +1,214 @@
---
description: Perform a non-destructive post-implementation verification gate validating
  the implementation against spec.md, plan.md, tasks.md, and constitution.md.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks
  ps: scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks
handoffs:
- label: Address findings and re-implement
  agent: speckit.implement
  prompt: Address the verification findings and re-run implementation to resolve issues
- label: Re-analyze specification consistency
  agent: speckit.analyze
  prompt: Re-analyze specification consistency based on verification findings
---


<!-- Extension: verify -->
<!-- Config: .specify/extensions/verify/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Goal

Validate the implementation against its specification artifacts (`spec.md`, `plan.md`, `tasks.md`, `constitution.md`). This command MUST run only after `/speckit.implement` has completed.

## Operating Constraints

**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).

**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this verification scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, tasks or implementation—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.verify`.

## Execution Steps

### 1. Initialize Verification Context

Run `{SCRIPT}` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:

- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md

Abort if SPEC or TASKS is missing (instruct the user to run the missing prerequisite command). PLAN and the constitution are optional — checks that depend on them are skipped gracefully.
Abort if TASKS has no completed tasks.
For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

### 2. Load Artifacts (Progressive Disclosure)

Load only the minimal necessary context from each artifact:

**From spec.md:**

- Functional Requirements
- User Stories and Acceptance Criteria
- Scenarios
- Edge Cases (if present)

**From plan.md (optional):**

- Architecture/stack choices
- Data Model references
- Project structure (directory layout)

**From tasks.md:**

- Task IDs
- Completion status
- Descriptions
- Phase grouping
- Referenced file paths
- Count total tasks and completed tasks

**From constitution (optional):**

- Load `.specify/memory/constitution.md` for principle validation
- If missing or placeholder: skip constitution checks, emit Info finding

### 3. Identify Implementation Scope

Build the set of files to verify from tasks.md.

- Parse all tasks in tasks.md — both completed (`[x]`/`[X]`) and incomplete (`[ ]`)
- Extract file paths referenced in each task description
- Build **REVIEW_FILES** set from completed task file paths
- Track **INCOMPLETE_TASK_FILES** from incomplete tasks (used by check C)
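
A sketch of this scope-building step; the task-line shape and the path regex are assumptions about how tasks.md is formatted, not a fixed grammar.

```python
import re

# Assumed task-line shape (illustrative): "- [x] T001 Create parser in src/parser.rs"
TASK_RE = re.compile(r"^- \[([ xX])\]\s+(T\d+)(.*)$")
PATH_RE = re.compile(r"[\w-]+(?:/[\w.-]+)+")  # crude: only slash-containing paths

def scope_from_tasks(tasks_md: str):
    """Return (REVIEW_FILES, INCOMPLETE_TASK_FILES) per the rules above."""
    review_files, incomplete_files = set(), set()
    for raw in tasks_md.splitlines():
        m = TASK_RE.match(raw.strip())
        if not m:
            continue  # prose, headings, etc.
        paths = set(PATH_RE.findall(m.group(3)))
        if m.group(1) in "xX":       # completed task
            review_files |= paths
        else:                        # incomplete task (used by check C)
            incomplete_files |= paths
    return review_files, incomplete_files
```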

### 4. Build Semantic Models

Create internal representations (do not include raw artifacts in output):

- **Task inventory**: Each task with ID, completion status, referenced file paths, and phase grouping
- **Implementation mapping**: Map each completed task to its referenced file paths
- **File inventory**: All REVIEW_FILES with existence verification — flag any task-referenced file that does not exist on disk
- **Requirements inventory**: Each functional requirement with a stable key — map to tasks and REVIEW_FILES for implementation evidence (evidence = file in REVIEW_FILES containing keyword/ID match, function signatures, or code paths that address the requirement)
- **Spec intent references**: User stories, acceptance criteria, and scenarios from spec.md
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
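
The evidence mapping for the requirements inventory can be sketched as a keyword/ID scan; this is deliberately weak evidence (a mention, not proof of correct behaviour), and the function signature is an assumption for illustration.

```python
import re
from pathlib import Path

def requirement_evidence(req_id, keywords, review_files, root):
    """Return the REVIEW_FILES members that mention the requirement ID or one
    of its keywords -- weak implementation evidence; absence feeds check C."""
    terms = [re.escape(req_id)] + [re.escape(k) for k in keywords]
    pattern = re.compile("|".join(terms), re.IGNORECASE)
    evidence = []
    for rel in sorted(review_files):          # sorted for deterministic output
        path = Path(root) / rel
        if path.is_file() and pattern.search(path.read_text(errors="ignore")):
            evidence.append(rel)
    return evidence
```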

### 5. Verification Checks (Token-Efficient Analysis)

Focus on high-signal findings. Limit to 50 findings total; aggregate the remainder in an overflow summary.

#### A. Task Completion

- Compare completed (`[x]`/`[X]`) vs total tasks
- Flag incomplete work, distinguishing majority-incomplete from minority-incomplete (the former is CRITICAL; see step 6)

#### B. File Existence

- Task-referenced files that do not exist on disk
- Tasks referencing ambiguous or unresolvable paths

#### C. Requirement Coverage

- Requirements with no implementation evidence in REVIEW_FILES
- Requirements whose tasks are all incomplete

#### D. Scenario & Test Coverage

- Spec scenarios with no corresponding test or code path
- No test files detected at all in REVIEW_FILES

#### E. Spec Intent Alignment

- Implementation diverging from spec intent (minor vs fundamental divergence)
- Compare acceptance criteria against actual behaviour in REVIEW_FILES

#### F. Constitution Alignment

- Any implementation element conflicting with a constitution MUST principle
- Missing mandated sections or quality gates from constitution

#### G. Design & Structure Consistency

- Architectural decisions or design patterns from plan.md not reflected in code
- Planned directory/file layout deviating from actual structure
- New code deviating from existing project conventions (naming, module structure, error handling patterns)
- Public APIs/exports/endpoints not described in plan.md

### 6. Severity Assignment

Use this heuristic to prioritize findings:

- **CRITICAL**: Violates constitution MUST, majority of tasks incomplete, task-referenced files missing from disk, requirement with zero implementation
- **HIGH**: Spec intent divergence, fundamental implementation mismatch with acceptance criteria, missing scenario/test coverage
- **MEDIUM**: Design pattern drift, minor spec intent deviation
- **LOW**: Structure deviations, naming inconsistencies, minor observations not affecting functionality
- **INFO**: Positive confirmations (all tasks complete, all requirements covered, no issues found). Use sparingly — only in summary metrics, not as individual finding rows.
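
The heuristic can be sketched as ordered first-match rules; the tag names below are illustrative labels for finding conditions, not a fixed schema.

```python
# Ordered rules: the first severity whose trigger set intersects a finding's tags wins.
SEVERITY_RULES = [
    ("CRITICAL", {"constitution-must", "majority-incomplete", "missing-file", "zero-implementation"}),
    ("HIGH",     {"intent-divergence", "criteria-mismatch", "missing-test-coverage"}),
    ("MEDIUM",   {"pattern-drift", "minor-intent-deviation"}),
    ("LOW",      {"structure-deviation", "naming"}),
]

def assign_severity(tags):
    """Map a finding's condition tags to a severity; unmatched findings are INFO."""
    for severity, triggers in SEVERITY_RULES:
        if tags & triggers:
            return severity
    return "INFO"
```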

### 7. Produce Compact Verification Report

Output a Markdown report (no file writes) with the following structure:

## Verification Report

| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Task Completion | CRITICAL | tasks.md | 3 of 12 tasks incomplete | Complete tasks T05, T08, T11 |
| B1 | File Existence | CRITICAL | src/auth.ts | Task-referenced file missing | Create file or update task reference |
| C1 | Requirement Coverage | CRITICAL | spec.md:FR-003 | No implementation evidence | Implement FR-003 |

(Add one row per finding; generate stable IDs prefixed by check letter: A1, B1, C1... Reference specific files and line numbers in Location(s) where applicable.)

**Task Summary Table:**

| Task ID | Status | Referenced Files | Notes |
|---------|--------|-----------------|-------|

**Constitution Alignment Issues:** (if any)

**Metrics:**

- Total Tasks (completed / total)
- Requirement Coverage % (requirements with implementation evidence / total)
- Files Verified
- Critical Issues Count

### 8. Provide Next Actions

At end of report, output a concise Next Actions block:

- If CRITICAL issues exist: Recommend resolving before proceeding
- If HIGH issues exist: Recommend addressing before merge; user may proceed at own risk
- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
- Provide explicit command suggestions: e.g., "Run `/speckit.implement` to address findings and re-run verification", "Implementation verified — ready for review or merge"

### 9. Offer Remediation

Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)

## Operating Principles

### Context Efficiency

- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
- **Progressive disclosure**: Load artifacts and source files incrementally; don't dump all content into analysis
- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts

### Analysis Guidelines

- **NEVER modify files** (this is read-only analysis)
- **NEVER hallucinate missing sections** (if absent, report them accurately)
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)
- **Every finding must trace back** to a specification artifact (spec.md requirement, user story, scenario, edge case), a structural reference (plan.md, constitution.md), or a task in tasks.md

### Idempotency by Design

The command produces deterministic output — running verification twice on the same state yields the same report. No counters, timestamp-dependent logic, or accumulated state affects findings. The report is fully regenerated on each run.
\ No newline at end of file
diff --git a/.github/agents/speckit.verify.run.md b/.github/agents/speckit.verify.run.md
new file mode 100644
index 0000000..df33b08
--- /dev/null
+++ b/.github/agents/speckit.verify.run.md
@@ -0,0 +1,214 @@
---
description: Perform a non-destructive post-implementation verification gate validating
  the implementation against spec.md, plan.md, tasks.md, and constitution.md.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks
  ps: scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks
handoffs:
- label: Address findings and re-implement
  agent: speckit.implement
  prompt: Address the verification findings and re-run implementation to resolve issues
- label: Re-analyze specification consistency
  agent: speckit.analyze
  prompt: Re-analyze specification consistency based on verification findings
---


<!-- Extension: verify -->
<!-- Config: .specify/extensions/verify/ -->
## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Goal

Validate the implementation against its specification artifacts (`spec.md`, `plan.md`, `tasks.md`, `constitution.md`). This command MUST run only after `/speckit.implement` has completed.

## Operating Constraints

**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).

**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this verification scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, tasks or implementation—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.verify`.

## Execution Steps

### 1. Initialize Verification Context

Run `{SCRIPT}` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:

- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md

Abort if SPEC or TASKS is missing (instruct the user to run the missing prerequisite command). PLAN and the constitution are optional — checks that depend on them are skipped gracefully.
Abort if TASKS has no completed tasks.
For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

### 2. Load Artifacts (Progressive Disclosure)

Load only the minimal necessary context from each artifact:

**From spec.md:**

- Functional Requirements
- User Stories and Acceptance Criteria
- Scenarios
- Edge Cases (if present)

**From plan.md (optional):**

- Architecture/stack choices
- Data Model references
- Project structure (directory layout)

**From tasks.md:**

- Task IDs
- Completion status
- Descriptions
- Phase grouping
- Referenced file paths
- Count total tasks and completed tasks

**From constitution (optional):**

- Load `.specify/memory/constitution.md` for principle validation
- If missing or placeholder: skip constitution checks, emit Info finding

### 3. Identify Implementation Scope

Build the set of files to verify from tasks.md.

- Parse all tasks in tasks.md — both completed (`[x]`/`[X]`) and incomplete (`[ ]`)
- Extract file paths referenced in each task description
- Build **REVIEW_FILES** set from completed task file paths
- Track **INCOMPLETE_TASK_FILES** from incomplete tasks (used by check C)

### 4. Build Semantic Models

Create internal representations (do not include raw artifacts in output):

- **Task inventory**: Each task with ID, completion status, referenced file paths, and phase grouping
- **Implementation mapping**: Map each completed task to its referenced file paths
- **File inventory**: All REVIEW_FILES with existence verification — flag any task-referenced file that does not exist on disk
- **Requirements inventory**: Each functional requirement with a stable key — map to tasks and REVIEW_FILES for implementation evidence (evidence = file in REVIEW_FILES containing keyword/ID match, function signatures, or code paths that address the requirement)
- **Spec intent references**: User stories, acceptance criteria, and scenarios from spec.md
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements

### 5. Verification Checks (Token-Efficient Analysis)

Focus on high-signal findings. Limit to 50 findings total; aggregate the remainder in an overflow summary.

#### A. Task Completion

- Compare completed (`[x]`/`[X]`) vs total tasks
- Flag incomplete work, distinguishing majority-incomplete from minority-incomplete (the former is CRITICAL; see step 6)

#### B. File Existence

- Task-referenced files that do not exist on disk
- Tasks referencing ambiguous or unresolvable paths

#### C. Requirement Coverage

- Requirements with no implementation evidence in REVIEW_FILES
- Requirements whose tasks are all incomplete

#### D. Scenario & Test Coverage

- Spec scenarios with no corresponding test or code path
- No test files detected at all in REVIEW_FILES

#### E. Spec Intent Alignment

- Implementation diverging from spec intent (minor vs fundamental divergence)
- Compare acceptance criteria against actual behaviour in REVIEW_FILES

#### F. Constitution Alignment

- Any implementation element conflicting with a constitution MUST principle
- Missing mandated sections or quality gates from constitution

#### G. Design & Structure Consistency

- Architectural decisions or design patterns from plan.md not reflected in code
- Planned directory/file layout deviating from actual structure
- New code deviating from existing project conventions (naming, module structure, error handling patterns)
- Public APIs/exports/endpoints not described in plan.md

### 6. Severity Assignment

Use this heuristic to prioritize findings:

- **CRITICAL**: Violates constitution MUST, majority of tasks incomplete, task-referenced files missing from disk, requirement with zero implementation
- **HIGH**: Spec intent divergence, fundamental implementation mismatch with acceptance criteria, missing scenario/test coverage
- **MEDIUM**: Design pattern drift, minor spec intent deviation
- **LOW**: Structure deviations, naming inconsistencies, minor observations not affecting functionality
- **INFO**: Positive confirmations (all tasks complete, all requirements covered, no issues found). Use sparingly — only in summary metrics, not as individual finding rows.

### 7. Produce Compact Verification Report

Output a Markdown report (no file writes) with the following structure:

## Verification Report

| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Task Completion | CRITICAL | tasks.md | 3 of 12 tasks incomplete | Complete tasks T05, T08, T11 |
| B1 | File Existence | CRITICAL | src/auth.ts | Task-referenced file missing | Create file or update task reference |
| C1 | Requirement Coverage | CRITICAL | spec.md:FR-003 | No implementation evidence | Implement FR-003 |

(Add one row per finding; generate stable IDs prefixed by check letter: A1, B1, C1... Reference specific files and line numbers in Location(s) where applicable.)

**Task Summary Table:**

| Task ID | Status | Referenced Files | Notes |
|---------|--------|-----------------|-------|

**Constitution Alignment Issues:** (if any)

**Metrics:**

- Total Tasks (completed / total)
- Requirement Coverage % (requirements with implementation evidence / total)
- Files Verified
- Critical Issues Count

### 8. Provide Next Actions

At end of report, output a concise Next Actions block:

- If CRITICAL issues exist: Recommend resolving before proceeding
- If HIGH issues exist: Recommend addressing before merge; user may proceed at own risk
- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
- Provide explicit command suggestions: e.g., "Run `/speckit.implement` to address findings and re-run verification", "Implementation verified — ready for review or merge"

### 9. Offer Remediation

Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)

## Operating Principles

### Context Efficiency

- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
- **Progressive disclosure**: Load artifacts and source files incrementally; don't dump all content into analysis
- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts

### Analysis Guidelines

- **NEVER modify files** (this is read-only analysis)
- **NEVER hallucinate missing sections** (if absent, report them accurately)
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)
- **Every finding must trace back** to a specification artifact (spec.md requirement, user story, scenario, edge case), a structural reference (plan.md, constitution.md), or a task in tasks.md

### Idempotency by Design

The command produces deterministic output — running verification twice on the same state yields the same report. No counters, timestamp-dependent logic, or accumulated state affects findings. The report is fully regenerated on each run.
\ No newline at end of file
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
new file mode 100644
index 0000000..92425f1
--- /dev/null
+++ b/.github/workflows/ci.yml
@@ -0,0 +1,21 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  CARGO_TERM_COLOR: always

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: clippy, rustfmt
      - uses: Swatinem/rust-cache@v2
      - run: make ci
diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
new file mode 100644
index 0000000..99174ed
--- /dev/null
+++ b/.github/workflows/release.yml
@@ -0,0 +1,58 @@
name: Release

on:
  push:
    tags: ["v*"]

permissions:
  contents: write

env:
  CARGO_TERM_COLOR: always

jobs:
  build:
    strategy:
      matrix:
        include:
          - target: x86_64-unknown-linux-gnu
            os: ubuntu-latest
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-latest
          - target: x86_64-apple-darwin
            os: macos-latest
          - target: aarch64-apple-darwin
            os: macos-latest
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          targets: ${{ matrix.target }}
      - name: Install cross-compilation tools
        if: matrix.target == 'aarch64-unknown-linux-gnu'
        run: |
          sudo apt-get update
          sudo apt-get install -y gcc-aarch64-linux-gnu
          echo "CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc" >> $GITHUB_ENV
      - run: cargo build --release --target ${{ matrix.target }}
      - name: Package binary
        run: |
          cd target/${{ matrix.target }}/release
          tar czf ../../../git-collab-${{ matrix.target }}.tar.gz git-collab
      - uses: actions/upload-artifact@v4
        with:
          name: git-collab-${{ matrix.target }}
          path: git-collab-${{ matrix.target }}.tar.gz

  release:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          merge-multiple: true
      - uses: softprops/action-gh-release@v2
        with:
          files: git-collab-*.tar.gz
          generate_release_notes: true
diff --git a/.specify/extensions.yml b/.specify/extensions.yml
new file mode 100644
index 0000000..b788c36
--- /dev/null
+++ b/.specify/extensions.yml
@@ -0,0 +1,20 @@
installed: []
settings:
  auto_execute_hooks: true
hooks:
  after_tasks:
  - extension: fleet
    command: speckit.fleet.review
    enabled: true
    optional: true
    prompt: Run cross-model review to evaluate plan and tasks before implementation?
    description: Pre-implementation review gate
    condition: null
  after_implement:
  - extension: verify
    command: speckit.verify.run
    enabled: true
    optional: true
    prompt: Run verify to validate implementation against specification?
    description: Post-implementation verification gate
    condition: null
diff --git a/.specify/extensions/.cache/catalog-metadata.json b/.specify/extensions/.cache/catalog-metadata.json
new file mode 100644
index 0000000..7be61ff
--- /dev/null
+++ b/.specify/extensions/.cache/catalog-metadata.json
@@ -0,0 +1,4 @@
{
  "cached_at": "2026-03-21T05:53:55.135454+00:00",
  "catalog_url": "https://raw.githubusercontent.com/github/spec-kit/main/extensions/catalog.json"
}
\ No newline at end of file
diff --git a/.specify/extensions/.cache/catalog.json b/.specify/extensions/.cache/catalog.json
new file mode 100644
index 0000000..f06cfe5
--- /dev/null
+++ b/.specify/extensions/.cache/catalog.json
@@ -0,0 +1,21 @@
{
  "schema_version": "1.0",
  "updated_at": "2026-03-10T00:00:00Z",
  "catalog_url": "https://raw.githubusercontent.com/github/spec-kit/main/extensions/catalog.json",
  "extensions": {
    "selftest": {
      "name": "Spec Kit Self-Test Utility",
      "id": "selftest",
      "version": "1.0.0",
      "description": "Verifies catalog extensions by programmatically walking through the discovery, installation, and registration lifecycle.",
      "author": "spec-kit-core",
      "repository": "https://github.com/github/spec-kit",
      "download_url": "https://github.com/github/spec-kit/releases/download/selftest-v1.0.0/selftest.zip",
      "tags": [
        "testing",
        "core",
        "utility"
      ]
    }
  }
}
\ No newline at end of file
diff --git a/.specify/extensions/.registry b/.specify/extensions/.registry
new file mode 100644
index 0000000..57be14d
--- /dev/null
+++ b/.specify/extensions/.registry
@@ -0,0 +1,43 @@
{
  "schema_version": "1.0",
  "extensions": {
    "fleet": {
      "version": "1.0.0",
      "source": "local",
      "manifest_hash": "sha256:0397699f61f925ec5745bb32af24420328b1ad27b45770b2c9b241fe84eafdb3",
      "enabled": true,
      "registered_commands": {
        "claude": [
          "speckit.fleet.run",
          "speckit.fleet",
          "speckit.fleet.review",
          "speckit.review"
        ],
        "copilot": [
          "speckit.fleet.run",
          "speckit.fleet",
          "speckit.fleet.review",
          "speckit.review"
        ]
      },
      "installed_at": "2026-03-21T05:56:13.647745+00:00"
    },
    "verify": {
      "version": "1.0.0",
      "source": "local",
      "manifest_hash": "sha256:4baa329349fffb5598a7bd41f958cf193aac77dd624c1c64799de052995ca29a",
      "enabled": true,
      "registered_commands": {
        "claude": [
          "speckit.verify.run",
          "speckit.verify"
        ],
        "copilot": [
          "speckit.verify.run",
          "speckit.verify"
        ]
      },
      "installed_at": "2026-03-21T06:52:20.304190+00:00"
    }
  }
}
\ No newline at end of file
diff --git a/.specify/extensions/fleet/.gitignore b/.specify/extensions/fleet/.gitignore
new file mode 100644
index 0000000..5027b2c
--- /dev/null
+++ b/.specify/extensions/fleet/.gitignore
@@ -0,0 +1,5 @@
# Spec-Kit Extension: Fleet Orchestrator
*.log
*.tmp
.DS_Store
Thumbs.db
diff --git a/.specify/extensions/fleet/CHANGELOG.md b/.specify/extensions/fleet/CHANGELOG.md
new file mode 100644
index 0000000..018fb9a
--- /dev/null
+++ b/.specify/extensions/fleet/CHANGELOG.md
@@ -0,0 +1,13 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.0] - 2026-03-06

### Added
- Fleet orchestrator command (`speckit.fleet.run`) -- 10-phase lifecycle from spec to CI with human-in-the-loop gates, artifact-based resume, and parallel subagent execution
- Cross-model review command (`speckit.fleet.review`) -- 7-dimension pre-implementation evaluation using a different model for blind-spot detection
- Configurable models, parallelism, and verify settings via `fleet-config.yml`
diff --git a/.specify/extensions/fleet/LICENSE b/.specify/extensions/fleet/LICENSE
new file mode 100644
index 0000000..af3ba97
--- /dev/null
+++ b/.specify/extensions/fleet/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 sharathsatish

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
diff --git a/.specify/extensions/fleet/README.md b/.specify/extensions/fleet/README.md
new file mode 100644
index 0000000..7c513e1
--- /dev/null
+++ b/.specify/extensions/fleet/README.md
@@ -0,0 +1,218 @@
# Fleet Orchestrator -- Spec-Kit Extension

Orchestrate a full feature lifecycle with human-in-the-loop gates across all SpecKit phases.

```
specify -> clarify -> plan -> checklist -> tasks -> analyze -> review -> implement -> verify -> CI
```

The fleet orchestrator chains 10 phases into a single command, detecting partially complete work and resuming from the right phase. It dispatches up to 3 parallel subagents during Plan and Implement, and uses a **cross-model review** before implementation to catch blind spots.

## Features

- **10-phase workflow** -- end-to-end from idea to CI-passing code
- **Human gates** -- approve, revise, skip, abort, or rollback after every phase
- **Mid-workflow resume** -- detects existing artifacts and picks up where you left off
- **Artifact integrity checks** -- validates file sizes and expected sections during probe
- **Stale artifact detection** -- warns when upstream changes invalidate downstream files
- **Parallel execution** -- up to 3 concurrent subagents for `[P]`-marked tasks
- **Cross-model review** -- Phase 7 uses a different model to evaluate plan + tasks
- **Implement-verify loop** -- Phase 9 auto-remediates findings (up to 3 iterations)
- **CI remediation loop** -- Phase 10 auto-fixes test/build failures (up to 3 iterations)
- **Phase rollback** -- jump back to any earlier phase with downstream invalidation
- **Branch safety** -- pre-flight checks for uncommitted changes, detached HEAD, branch freshness
- **Git checkpoints** -- optional WIP commits after design, implementation, and verification
- **Context budget awareness** -- detects long sessions and suggests fresh-chat resume
- **Completion summary** -- structured report with artifact stats, quality gates, and git status
- **IDE-agnostic** -- works with VS Code Copilot, Claude Code, Cursor, and other platforms
- **Verify integration** -- auto-prompts to install the [verify extension](https://github.com/ismaelJimenez/spec-kit-verify) if missing

## Prerequisites

- [Spec-Kit](https://github.com/github/spec-kit) >= 0.1.0
- These core SpecKit commands must be available:
  - `speckit.specify`, `speckit.clarify`, `speckit.plan`, `speckit.checklist`
  - `speckit.tasks`, `speckit.analyze`, `speckit.implement`
- (Optional) [Verify extension](https://github.com/ismaelJimenez/spec-kit-verify) for Phase 9

## Installation

### From GitHub Release

```bash
specify extension add fleet --from https://github.com/sharathsatish/spec-kit-fleet/archive/refs/tags/v1.0.0.zip
```

### Local Development

```bash
specify extension add --dev /path/to/spec-kit-fleet
```

### Verify Installation

```bash
specify extension list
# Should show: fleet (1.0.0) -- Fleet Orchestrator
```

After installation, the following commands are registered:

| Command | Alias | Description |
|---------|-------|-------------|
| `speckit.fleet.run` | `speckit.fleet` | Full lifecycle orchestrator |
| `speckit.fleet.review` | `speckit.review` | Cross-model pre-implementation review |

## Usage

### Start the Fleet

In VS Code Copilot Chat:

```
/speckit.fleet Build a capability browser that lets users search and filter available capabilities
```

Or with no arguments to auto-detect progress on the current feature branch:

```
/speckit.fleet
```

### Resume Mid-Workflow

The fleet automatically detects existing artifacts on every invocation. If you're on a feature branch with `spec.md` and `plan.md` already created, it will show:

```
Phase 1 Specify      [x] spec.md found
Phase 2 Clarify      [x] ## Clarifications present
Phase 3 Plan         [x] plan.md found
Phase 4 Checklist    [ ] --
...
> Resuming at Phase 4: Checklist
```

You can confirm or override to any phase.

### Run Review Standalone

```
/speckit.review
```

This runs the cross-model evaluation independently (read-only, no file modifications).

## Configuration

After installation, optionally customize settings by editing `.specify/extensions/fleet/fleet-config.yml`:

```yaml
# Parallel execution (1-3 concurrent subagents)
parallel:
  max_concurrency: 3

# Model preferences -- set to exact model names for your IDE, or use defaults
# Examples:
#   VS Code Copilot:  "Claude Opus 4.6 (copilot)", "Claude Sonnet 4.6 (copilot)"
#   Claude Code:      "claude-sonnet-4-20250514", "claude-opus-4-20250514"
#   Cursor:           "claude-sonnet-4", "gpt-4o"
models:
  primary: "auto"     # Uses whatever model is running the fleet
  review: "ask"       # Prompts you on first run to pick a different model

# Verify extension auto-install prompt
verify:
  auto_prompt_install: true
  install_url: "https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip"
```

| Setting | Default | Description |
|---------|---------|-------------|
| `parallel.max_concurrency` | `3` | Max subagents dispatched simultaneously |
| `models.primary` | `"auto"` | Uses the current model; set an explicit name to override |
| `models.review` | `"ask"` | Prompts on first run; set a model name (or list) to skip the prompt |
| `verify.auto_prompt_install` | `true` | Prompt to install verify extension if missing |
| `verify.install_url` | GitHub archive URL | Verify extension download URL |

### Model Setup

On first run, the fleet detects your platform and asks which model to use for the cross-model review (Phase 7). The review should use a **different** model than the primary to catch blind spots -- e.g., if you're on Claude Opus, use GPT or Sonnet for review.

You can skip the prompt by setting explicit model names in the config file.

## Workflow Phases

| # | Phase | Agent | What It Does |
|---|-------|-------|--------------|
| 1 | Specify | `speckit.specify` | Generate feature specification |
| 2 | Clarify | `speckit.clarify` | Ask targeted clarification questions (repeatable) |
| 3 | Plan | `speckit.plan` | Create technical implementation plan |
| 4 | Checklist | `speckit.checklist` | Generate quality checklists |
| 5 | Tasks | `speckit.tasks` | Break plan into dependency-ordered tasks |
| 6 | Analyze | `speckit.analyze` | Cross-artifact consistency analysis |
| 7 | Review | `speckit.fleet.review` | Cross-model evaluation (different model) |
| 8 | Implement | `speckit.implement` | Execute tasks (parallel groups) |
| 9 | Verify | `speckit.verify` | Validate code against spec artifacts |
| 10 | Tests | Terminal | Auto-detect test runner and run tests |

### Human Gates

After every phase, you're asked to:
- **Approve** -- proceed to the next phase
- **Revise** -- re-run with feedback
- **Skip** -- skip this phase
- **Abort** -- stop the workflow
- **Rollback** -- jump back to an earlier phase

### Parallel Execution

During Plan (Phase 3) and Implement (Phase 8), tasks marked with `[P]` are grouped into parallel batches:

```markdown
<!-- parallel-group: 1 (max 3 concurrent) -->
- [ ] T002 [P] Create ModelA.cs
- [ ] T003 [P] Create ModelB.cs
- [ ] T004 [P] Create ModelC.cs

<!-- sequential -->
- [ ] T005 Create service that depends on all models
```

Constraints:
- Max 3 concurrent subagents per group
- Tasks touching the same file always run sequentially
- Human gate applies after each implementation phase completes
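
The same-file exclusion amounts to a greedy packing of `[P]` tasks into batches. A sketch under the assumption that each task's touched files are known up front (`parallel_batches` is a hypothetical helper, not shipped with the extension):

```python
def parallel_batches(tasks, max_concurrency=3):
    """Greedily pack [P] tasks into batches of up to max_concurrency,
    never placing two tasks that touch the same file in one batch."""
    batches = []
    for task_id, files in tasks:
        placed = False
        for batch in batches:
            batch_files = {f for _, fs in batch for f in fs}
            if len(batch) < max_concurrency and not batch_files & set(files):
                batch.append((task_id, files))
                placed = True
                break
        if not placed:
            batches.append([(task_id, files)])
    return batches
```

A task that shares a file with every open batch simply starts a new batch, which the orchestrator then runs after the earlier ones complete.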

## Troubleshooting

### Fleet doesn't detect my feature directory

The fleet runs `check-prerequisites.sh --json --paths-only` (or `check-prerequisites.ps1 -Json -PathsOnly` on PowerShell) to discover `FEATURE_DIR`. This requires a feature branch with a matching specs directory. Ensure your branch follows the naming convention and a specs folder exists.

### Review phase uses the wrong model

On first run, the fleet asks which model to use for review. To change it permanently, edit `models.review` in `.specify/extensions/fleet/fleet-config.yml` with your IDE's model identifier. Set to `"ask"` to be prompted again next time.

### Verify extension not found

If Phase 9 reports the verify extension isn't installed, run:

```bash
specify extension add verify --from https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip
```

Or set `verify.auto_prompt_install: false` in config to suppress the install prompt; Phase 9 is then skipped whenever the extension is missing.

### Stale artifact warning

This means an upstream file (e.g., `spec.md`) was modified after a downstream file (e.g., `plan.md`) was generated. Re-run the affected phase or proceed if the change was cosmetic.

## Contributing

1. Fork this repository
2. Create a feature branch
3. Make changes and test with `specify extension add --dev /path/to/your-fork`
4. Submit a pull request

## License

[MIT](LICENSE)
diff --git a/.specify/extensions/fleet/commands/fleet.md b/.specify/extensions/fleet/commands/fleet.md
new file mode 100644
index 0000000..7ea550e
--- /dev/null
+++ b/.specify/extensions/fleet/commands/fleet.md
@@ -0,0 +1,499 @@
---
description: "Orchestrate a full feature lifecycle through all SpecKit phases with human-in-the-loop checkpoints: specify -> clarify -> plan -> checklist -> tasks -> analyze -> cross-model review -> implement -> verify -> CI. Detects partially complete features and resumes from the right phase."
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
agents:
  - speckit.specify
  - speckit.clarify
  - speckit.plan
  - speckit.checklist
  - speckit.tasks
  - speckit.analyze
  - speckit.fleet.review
  - speckit.implement
  - speckit.verify
user-invocable: true
disable-model-invocation: true
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty). Classify the input:

1. **Feature description** (e.g., "Build a capability browser that lets users..."): Store as `FEATURE_DESCRIPTION`. This will be passed verbatim to `speckit.specify` in Phase 1. Skip artifact detection if no `FEATURE_DIR` is found -- go straight to Phase 1.
2. **Phase override** (e.g., "resume at Phase 5" or "start from plan"): Override the auto-detected resume point.
3. **Empty**: Run artifact detection and resume from the detected phase.

---

You are the **SpecKit Fleet Orchestrator** -- a workflow conductor that drives a feature from idea to implementation by delegating to specialized SpecKit agents in order, with human approval at every checkpoint.

## Workflow Phases

| Phase | Agent | Artifact Signal | Gate |
|-------|-------|-----------------|------|
| 1. Specify | `speckit.specify` | `spec.md` exists in FEATURE_DIR | User approves spec |
| 2. Clarify | `speckit.clarify` | `spec.md` contains a `## Clarifications` section | User says "done" or requests another round |
| 3. Plan | `speckit.plan` | `plan.md` exists in FEATURE_DIR | User approves plan |
| 4. Checklist | `speckit.checklist` | `checklists/` directory exists and contains at least one file | User approves checklist |
| 5. Tasks | `speckit.tasks` | `tasks.md` exists in FEATURE_DIR | User approves tasks |
| 6. Analyze | `speckit.analyze` | `.analyze-done` marker exists in FEATURE_DIR | User acknowledges analysis |
| 7. Review | `speckit.fleet.review` | `review.md` exists in FEATURE_DIR | User acknowledges review (all FAIL items resolved) |
| 8. Implement | `speckit.implement` | ALL task checkboxes in tasks.md are `[x]` (none `[ ]`) | Implementation complete |
| 9. Verify | `speckit.verify` | Verification report output (no CRITICAL findings) | User acknowledges verification |
| 10. Tests | Terminal | Tests pass | Tests pass |

## Operating Rules

1. **One phase at a time.** Never skip ahead or run phases in parallel.
2. **Human gate after every phase.** After each agent completes, summarize the outcome and ask the user to:
   - **Approve** -> proceed to the next phase
   - **Revise** -> re-run the same phase with user feedback
   - **Skip** -> mark phase as skipped and move on (user must confirm)
   - **Abort** -> stop the workflow entirely
   - **Rollback** -> jump back to an earlier phase (see Phase Rollback below)
3. **Clarify is repeatable.** After Phase 2, ask: *"Run another clarification round, or move on to planning?"* Loop until the user says done.
4. **Track progress.** Use the todo tool to create and update a checklist of all 10 phases so the user always sees where they are.
5. **Pass context forward.** When delegating, include the feature description and any user-provided refinements so each agent has full context.
6. **Suppress sub-agent handoffs.** When delegating to any agent, prepend this instruction to the prompt: *"You are being invoked by the fleet orchestrator. Do NOT follow handoffs or auto-forward to other agents. Return your output to the orchestrator and stop."* This prevents `send: true` handoff chains (e.g., plan -> tasks -> analyze -> implement) from bypassing fleet's human gates.
7. **Verify phase.** After implementation, run `speckit.verify` to validate code against spec artifacts. Requires the verify extension (see Phase 9).
8. **Test phase.** After verification, detect the project's test runner(s) and run tests. See Phase 10 for detection logic.
9. **Git checkpoint commits.** After these phases complete, offer to create a WIP commit to safeguard progress:
   - After Phase 5 (Tasks) -- all design artifacts are finalized
   - After Phase 8 (Implement) -- all code is written
   - After Phase 9 (Verify) -- code is validated
   Commit message format: `wip: fleet phase {N} -- {phase name} complete`
   Always ask before committing -- never auto-commit. If the user declines, continue without committing.
10. **Context budget awareness.** Long-running fleet sessions can exhaust the model's context window. Monitor for these signs:
    - Responses becoming shorter or losing earlier context
    - Reaching Phase 8+ in a session that started from Phase 1
    At natural checkpoints (after git commits or between phases), if context pressure seems high, suggest: *"This is getting long. We can continue in a new chat -- the fleet will auto-detect progress and resume at Phase {N}."*

## Parallel Subagent Execution (Plan & Implement Phases)

During **Phase 3 (Plan)** and **Phase 8 (Implement)**, the orchestrator may dispatch **up to 3 subagents in parallel** when work items are independent. This is governed by the `[P]` (parallelizable) marker system already used in tasks.md.

### How Parallelism Works

1. **Tasks agent embeds the plan.** During Phase 5 (Tasks), the tasks agent marks tasks with `[P]` when they touch different files and have no dependency on incomplete tasks. Tasks within the same phase that share `[P]` markers form a **parallel group**.

2. **Fleet orchestrator fans out.** When executing Plan or Implement, the orchestrator:
   - Reads the current phase's task list from tasks.md
   - Identifies `[P]`-marked tasks that form an independent group (no shared files, no ordering dependency)
   - Dispatches up to **3 subagents simultaneously** for the group
   - Waits for all dispatched agents to complete before moving to the next group or sequential task
   - If any parallel task fails, halts the batch and reports the failure before continuing

3. **Parallelism constraints:**
   - **Max concurrency: 3** -- never dispatch more than 3 subagents at once
   - **Same-file exclusion** -- tasks touching the same file MUST run sequentially even if both are `[P]`
   - **Phase boundaries are serial** -- all tasks in Phase N must complete before Phase N+1 begins
   - **Human gate still applies** -- after each implementation phase completes (all groups done), summarize and checkpoint with the user before the next phase

### Parallel Groups in tasks.md

The tasks agent should organize `[P]` tasks into explicit parallel groups using comments in tasks.md:

```markdown
### Phase 1: Setup

<!-- parallel-group: 1 (max 3 concurrent) -->
- [ ] T002 [P] Create CapabilityManifest.cs in Models/Generation/
- [ ] T003 [P] Create DocumentIndex.cs in Models/Generation/
- [ ] T004 [P] Create ResolvedContext.cs in Models/Generation/

<!-- parallel-group: 2 (max 3 concurrent) -->
- [ ] T005 [P] Create GenerationResult.cs in Models/Generation/
- [ ] T006 [P] Create BatchGenerationJob.cs in Models/Generation/
- [ ] T007 [P] Create SchemaExport.cs in Models/Generation/

<!-- sequential -->
- [ ] T013 Create generation.ts with all TypeScript interfaces
```

### Plan Phase Parallelism

During Phase 3 (Plan), the plan agent's Phase 0 (Research) can dispatch up to 3 research sub-tasks in parallel:
- Each `NEEDS CLARIFICATION` item or technology best-practice lookup is an independent research task
- Fan out up to 3 at a time, consolidate results into research.md
- Phase 1 (Design) artifacts -- data-model.md, contracts/, quickstart.md -- can be generated in parallel if they don't depend on each other's output

### Implement Phase Parallelism

During Phase 8 (Implement), for each implementation phase in tasks.md:
1. Read the phase and identify parallel groups (marked with `<!-- parallel-group: N -->` comments)
2. For each group, dispatch up to 3 `speckit.implement` subagents simultaneously, each given a specific subset of tasks
3. When all tasks in a group complete, move to the next group or sequential task
4. After the entire phase completes, checkpoint with the user before proceeding to the next phase

### Instructions for Tasks Agent

When the fleet orchestrator delegates to `speckit.tasks`, append this instruction:

> "Organize [P]-marked tasks into explicit parallel groups using `<!-- parallel-group: N -->` HTML comments. Each group should contain up to 3 tasks that can execute concurrently (different files, no dependencies). Add `<!-- sequential -->` before tasks that must run in order. This enables the fleet orchestrator to fan out up to 3 subagents per group during implementation."

## First-Turn Behavior -- Artifact Detection & Resume

On **every** invocation, before doing anything else, run artifact detection to determine where the workflow stands. This allows the orchestrator to resume mid-flight even in a fresh conversation.

### Step 0: Branch safety pre-flight

Before anything else, run basic git health checks:

1. **Uncommitted changes**: Run `git status --porcelain`. If there are uncommitted changes, warn the user:
   > WARNING: You have uncommitted changes. Starting the fleet may create conflicts. Commit or stash first?
   > - **Continue** -- proceed with uncommitted changes (risky)
   > - **Stash** -- run `git stash` and continue
   > - **Abort** -- stop and let the user handle it

2. **Detached HEAD**: Run `git branch --show-current`. If empty (detached HEAD), abort:
   > Cannot run fleet on a detached HEAD. Please check out a feature branch first.

3. **Branch freshness** (advisory): Run `git log --oneline HEAD..origin/main 2>/dev/null | wc -l`. If the main branch has commits not in the current branch, advise:
   > Your branch is {N} commits behind main. Consider rebasing before starting implementation to avoid merge conflicts later.

This check runs only once on first invocation. It does NOT block the workflow (except for detached HEAD).
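
The decision logic reduces to three string checks on git output. A sketch that separates the evaluation from the shell calls (the function name and return shape are assumptions, not part of the command):

```python
def preflight(status: str, branch: str, behind: str):
    """Evaluate the three branch-safety signals.
    status: output of `git status --porcelain`
    branch: output of `git branch --show-current`
    behind: output of `git log --oneline HEAD..origin/main`
    Returns (ok, warnings); detached HEAD is the only hard blocker."""
    if not branch:
        return False, ["detached HEAD"]
    warnings = []
    if status:
        warnings.append("uncommitted changes")
    if behind:
        warnings.append(f"{len(behind.splitlines())} commits behind main")
    return True, warnings
```

Detached HEAD aborts; everything else is surfaced as a warning and the workflow continues.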

### Step 1: Discover the feature directory

Run `{SCRIPT}` from the repo root to get the feature directory paths as JSON. Parse the output to get `FEATURE_DIR`.

If the script fails (e.g., not on a feature branch):
- If `FEATURE_DESCRIPTION` was provided in `$ARGUMENTS`, proceed directly to Phase 1 -- pass the description to `speckit.specify` and it will create the feature directory.
- If `$ARGUMENTS` is empty, ask the user for the feature description, then start Phase 1.

### Step 2: Check model configuration

Check whether the project's fleet config -- `.specify/extensions/fleet/fleet-config.yml`, relative to the repo root (or the project's config location) -- has model settings. If the config file doesn't exist or models are set to defaults:

1. **Detect the platform**: Identify which IDE/agent platform you're running in (VS Code Copilot, Claude Code, Cursor, etc.) based on available context.

2. **Primary model**: If `models.primary` is `"auto"`, use whatever model you are currently running as. No action needed -- you ARE the primary model.

3. **Review model**: If `models.review` is `"ask"`, prompt the user:
   > **Model setup (one-time):** The cross-model review (Phase 7) works best with a *different* model than the one running the fleet, to catch blind spots.
   >
   > What model should I use for the review phase? Suggestions:
   > - A different model family (e.g., if you're on Claude, use GPT or Gemini)
   > - A different tier (e.g., if you're on Opus, use Sonnet)
   > - "skip" to skip Phase 7 entirely
   >
   > You can also set this permanently in your fleet config.

4. **Store the choice**: Remember the user's model selection for the duration of this conversation. If they want to persist it, suggest editing the config file.

### Step 3: Probe artifacts in FEATURE_DIR

Check these paths **in order** using the `read` tool. Each check is a file/directory existence AND basic integrity test:

| Check | Path | Existence | Integrity |
|-------|------|-----------|-----------|
| spec.md | `{FEATURE_DIR}/spec.md` | File exists? | Has `## User Stories` or `## Requirements` section? File > 100 bytes? |
| Clarifications | `{FEATURE_DIR}/spec.md` | Contains `## Clarifications` heading? | At least one Q&A pair present? |
| plan.md | `{FEATURE_DIR}/plan.md` | File exists? | Has `## Architecture` or `## Tech Stack` section? File > 200 bytes? |
| checklists/ | `{FEATURE_DIR}/checklists/` | Directory exists and has >=1 file? | Each file > 50 bytes? |
| tasks.md | `{FEATURE_DIR}/tasks.md` | File exists? | Contains at least one `- [ ]` or `- [x]` item? Has `### Phase` heading? |
| .analyze-done | `{FEATURE_DIR}/.analyze-done` | Marker file exists? | -- |
| review.md | `{FEATURE_DIR}/review.md` | File exists? | Contains `## Summary` and verdict table? |
| Implementation | `{FEATURE_DIR}/tasks.md` | All `- [x]`, zero `- [ ]` remaining? | -- |
| Verify extension | `.specify/extensions/verify/extension.yml` | File exists? | -- |
| Verification | `{FEATURE_DIR}/.verify-done` | Marker file exists? | -- |

**Integrity failures are advisory, not blocking.** If a file exists but fails integrity checks, warn the user:
> WARNING: `plan.md` exists but appears incomplete (missing expected sections). It may have been partially generated. Re-run Phase 3 (Plan), or continue with the current file?

### Step 4: Determine the resume phase

Walk the artifact signals **top-down**. The first phase whose artifact is **missing** is where work resumes:

```
if spec.md missing            -> resume at Phase 1 (Specify)
if no ## Clarifications       -> resume at Phase 2 (Clarify)
if plan.md missing            -> resume at Phase 3 (Plan)
if checklists/ empty/missing  -> resume at Phase 4 (Checklist)
if tasks.md missing           -> resume at Phase 5 (Tasks)
if .analyze-done missing      -> resume at Phase 6 (Analyze)
if review.md missing          -> resume at Phase 7 (Review)
if tasks.md has `- [ ]`       -> resume at Phase 8 (Implement)
if .verify-done missing       -> resume at Phase 9 (Verify)
if all done                   -> resume at Phase 10 (Tests)
```
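
Reduced to booleans, the walk is a first-miss scan over the same table. A sketch (the signal keys are invented for illustration):

```python
PHASES = [
    ("spec", 1, "Specify"),
    ("clarifications", 2, "Clarify"),
    ("plan", 3, "Plan"),
    ("checklists", 4, "Checklist"),
    ("tasks", 5, "Tasks"),
    ("analyze_done", 6, "Analyze"),
    ("review", 7, "Review"),
    ("tasks_all_checked", 8, "Implement"),
    ("verify_done", 9, "Verify"),
]

def resume_phase(signals: dict) -> tuple:
    """Return the first phase whose artifact signal is missing;
    fall through to Phase 10 (Tests) when everything is present."""
    for key, number, name in PHASES:
        if not signals.get(key):
            return number, name
    return 10, "Tests"
```

Because the scan stops at the first missing signal, downstream artifacts left over from an earlier run never cause phases to be skipped.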

### Step 5: Present status and confirm

Show the user a status table and the detected resume point:

```
Feature: {branch name}
Directory: {FEATURE_DIR}

Phase 1 Specify      [x] spec.md found
Phase 2 Clarify      [x] ## Clarifications present
Phase 3 Plan         [x] plan.md found
Phase 4 Checklist    [x] checklists/ has 2 files
Phase 5 Tasks        [x] tasks.md found
Phase 6 Analyze      [ ] .analyze-done not found
Phase 7 Review       [ ] --
Phase 8 Implement    [ ] --
Phase 9 Verify       [ ] --
Phase 10 Tests       [ ] --

> Resuming at Phase 6: Analyze
```

Then ask: *"Detected progress above. Resume at Phase {N} ({name}), or start from a different phase?"*

- If user confirms -> create the todo list with completed phases marked as `completed` and resume from Phase N.
- If user provides a phase number or name -> start from that phase instead.
- If FEATURE_DIR doesn't exist -> start from Phase 1, ask for the feature description.

### Edge Cases

- **Implementation partially complete**: If `tasks.md` exists and has a mix of `[x]` and `[ ]`, resume at Phase 8 (Implement). Tell the user how many tasks remain: *"tasks.md: {done}/{total} tasks complete. {remaining} tasks remaining."*
- **Analyze completion marker**: After Phase 6 (Analyze) completes -- whether it produces `remediation.md` or not -- create a marker file `{FEATURE_DIR}/.analyze-done` containing the timestamp. This distinguishes "analyze ran clean" from "analyze never ran." The `.analyze-done` file is the artifact signal for Phase 6, not `remediation.md`.
- **Review can be skipped**: If user opts to skip cross-model review, treat Phase 7 as skipped and proceed to Phase 8.
- **Review found NO failures**: If `review.md` exists and overall verdict is "READY", Phase 7 is complete -- proceed to Phase 8.
- **Review found FAIL items**: If `review.md` has FAIL verdicts, present them and ask user whether to (a) fix the issues by re-running the relevant earlier phase, (b) proceed anyway, or (c) abort.
- **Verify extension not installed**: If `.specify/extensions/verify/extension.yml` doesn't exist, prompt to install. If user declines, skip Phase 9.
- **Verify completion marker**: After Phase 9 (Verify) completes, create `{FEATURE_DIR}/.verify-done` with timestamp. This distinguishes "verify ran" from "verify never ran."
- **Checklists may be skipped**: Some features don't use checklists. If `tasks.md` exists but `checklists/` doesn't, treat Phase 4 as skipped.
- **Fresh branch, no specs dir**: Start from Phase 1. Use `FEATURE_DESCRIPTION` from `$ARGUMENTS` if provided; otherwise ask the user.
- **User says "start over"**: Re-run from Phase 1 regardless of existing artifacts. Warn that this will overwrite existing artifacts and get confirmation.
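The `{done}/{total}` counts in the partial-implementation edge case can be derived with a small helper; `task_progress` is a hypothetical name, and the regex assumes the `- [x]` / `- [ ]` checkbox style used throughout `tasks.md`.

```python
import re

def task_progress(tasks_md: str) -> tuple[int, int]:
    """Return (completed, total) checkbox tasks found in tasks.md content."""
    done = len(re.findall(r"^\s*- \[[xX]\]", tasks_md, flags=re.M))
    todo = len(re.findall(r"^\s*- \[ \]", tasks_md, flags=re.M))
    return done, done + todo
```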

### Stale Artifact Detection

After determining the resume phase, check for **stale downstream artifacts** -- files generated by an earlier phase that may be outdated because an upstream artifact was modified later.

Compare file modification timestamps in this dependency chain:

```
spec.md -> plan.md -> tasks.md -> .analyze-done -> review.md -> [implementation] -> .verify-done
```

If a file is **newer** than a downstream file that depends on it (e.g., `spec.md` was modified after `plan.md`), warn the user:

> WARNING: **Stale artifact detected**: `plan.md` (modified {date}) was generated before the latest `spec.md` change ({date}). Plan may not reflect current requirements. Re-run Phase 3 (Plan) to update, or proceed with the current plan?

This is advisory only -- the user decides whether to rerun. Do not block the workflow.
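A minimal sketch of the mtime comparison, assuming the chain above maps to files in `FEATURE_DIR`. `stale_artifacts` is an illustrative helper; the implementation step has no single artifact file, so it is omitted, and missing files are simply skipped.

```python
from pathlib import Path

# Dependency chain from the diagram above, upstream to downstream.
CHAIN = ["spec.md", "plan.md", "tasks.md", ".analyze-done", "review.md", ".verify-done"]

def stale_artifacts(feature_dir: Path) -> list[tuple[str, str]]:
    """Return (upstream, downstream) pairs where the upstream file is newer."""
    present = [(name, (feature_dir / name).stat().st_mtime)
               for name in CHAIN if (feature_dir / name).exists()]
    # Compare each adjacent pair of files that actually exist on disk.
    return [(up, down)
            for (up, up_mtime), (down, down_mtime) in zip(present, present[1:])
            if up_mtime > down_mtime]
```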

## Phase Execution Template

For each phase:
```
1. Mark the phase as in-progress in the todo list
2. Announce: "**Phase N: {Name}** -- delegating to {agent}..."
3. Delegate to the agent with relevant arguments:
   - Phase 1 (Specify): pass FEATURE_DESCRIPTION from $ARGUMENTS as the argument
   - Phase 2 (Clarify): pass the feature description and any user feedback
   - All other phases: pass the feature description and any user-provided refinements
4. Summarize the agent's output concisely
5. Ask: "Ready to proceed to Phase N+1 ({next name}), or would you like to revise?"
6. Wait for user response
7. Mark phase as completed when approved
```

## Phase 7: Cross-Model Review

This phase uses a **different model** than the one that generated plan.md and tasks.md, providing a fresh perspective to catch blind spots.

1. Delegate to `speckit.fleet.review` -- it runs on the **review model** configured in Step 2 (a different model than the primary) and is **read-only**
2. The review agent reads spec.md, plan.md, tasks.md, checklists/, and remediation.md
3. It evaluates 7 dimensions: spec-plan alignment, plan-tasks completeness, dependency ordering, parallelization correctness, feasibility & risk, standards compliance, implementation readiness
4. It outputs a structured review report with PASS/WARN/FAIL verdicts per dimension
5. **Save the review output** to `{FEATURE_DIR}/review.md`
6. Present the summary table to the user:
   - **All PASS / READY**: *"Cross-model review passed. Ready to implement?"*
   - **WARN items**: *"Review found {N} warnings. Proceed to implementation, or address them first?"*
   - **FAIL items**: *"Review found {N} critical issues that should be fixed before implementing."* -- list them and ask which earlier phase to re-run (plan, tasks, or analyze)
7. If user chooses to fix: loop back to the appropriate phase, then re-run review after fixes
8. If user approves: mark Phase 7 complete and proceed to Phase 8 (Implement)

**Note**: Phase 7 (Review) validates design artifacts *before* implementation. Phase 9 (Verify) validates actual code *after* implementation. Both are read-only.

## Phase 9: Post-Implementation Verification

This phase validates that the implemented code matches the specification artifacts. It requires the **verify extension**.

### Extension Installation Check

Before delegating to `speckit.verify`, check if the extension is installed:

1. Check if `.specify/extensions/verify/extension.yml` exists using the `read` tool
2. If **missing**, ask the user:
   > The verify extension is not installed. Install it now?
   > ```
   > specify extension add verify --from https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip
   > ```
3. If user approves, run the install command in the terminal
4. If user declines, skip Phase 9 and proceed to Phase 10 (Tests)

### Verification Execution

1. Delegate to `speckit.verify` -- it reads spec.md, plan.md, tasks.md, constitution.md and the implemented source files
2. It runs 7 verification checks: task completion, file existence, requirement coverage, scenario & test coverage, spec intent alignment, constitution alignment, design & structure consistency
3. It outputs a verification report with findings, metrics, and next actions
4. Present the summary to the user:
   - **No findings**: *"Verification passed. Ready to run the test suite?"* -- proceed to Phase 10
   - **Findings exist**: Show the findings grouped by severity (CRITICAL, WARNING, INFO) and enter the **Implement-Verify loop** below

### Implement-Verify Loop

When verification produces findings, run a remediation loop:

```
repeat:
  1. Present findings to user
  2. Ask: "Re-run implementation to address these findings? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with findings as context, then re-run speckit.verify
     - skip  -> exit loop, proceed to Phase 10 with current state
     - abort -> stop the workflow entirely
  3. After re-verify, check findings again
until: no findings remain OR user says skip/abort
```

Rules for the loop:
- **Pass findings as context**: When delegating to `speckit.implement`, include the verification findings so it knows exactly what to fix. Prepend: *"Address the following verification findings: {findings list}"*
- **Suppress sub-agent handoffs** (Operating Rule 6 still applies)
- **Track iterations**: Show the loop count each time -- *"Implement-Verify iteration {N}: {findings_count} findings remaining"*
- **Cap at 3 iterations**: After 3 rounds, if findings persist, warn the user: *"3 remediation iterations completed with {N} findings still remaining. These may require manual intervention. Proceed to Phase 10 (Tests), or continue iterating?"*
- **Human gate every iteration**: Never auto-loop -- always ask before re-implementing
- **Delta reporting**: After each re-verify, show what changed -- *"Fixed: {N}, New: {N}, Remaining: {N}"*
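The delta report can be computed from the finding IDs of two consecutive verify runs; `verify_delta` is an illustrative helper, not part of the extension.

```python
def verify_delta(previous: set[str], current: set[str]) -> str:
    """Summarize finding churn between two verify runs, keyed by finding ID."""
    fixed = previous - current      # findings that disappeared after re-implement
    new = current - previous        # findings introduced by the latest changes
    return f"Fixed: {len(fixed)}, New: {len(new)}, Remaining: {len(current)}"
```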

After the loop exits (no findings or user skips):
1. Create a marker file `{FEATURE_DIR}/.verify-done` containing the timestamp and final findings count
2. Mark Phase 9 complete and proceed to Phase 10 (Tests)

## Phase 10: Tests

After verification, detect and run the project's test suite.

### Test Runner Detection

Detect test runner(s) by checking for these files at the repo root, in order:

| Check | Runner | Command |
|-------|--------|---------|
| `package.json` with `"test"` script | npm/yarn/pnpm | `npm test` (or `yarn test` / `pnpm test` based on lockfile) |
| `*.sln` or `*.slnx` or `*.csproj` | dotnet | `dotnet test` |
| `Makefile` with `test` target | make | `make test` |
| `pytest.ini` or `pyproject.toml` with `[tool.pytest]` | pytest | `pytest` |
| `Cargo.toml` | cargo | `cargo test` |
| `go.mod` | go | `go test ./...` |

If **multiple** runners are detected (e.g., a monorepo with both `package.json` and `*.slnx`), run all of them and report results per runner.

If **no** runner is detected, ask the user: *"No test runner detected. What command runs your tests?"*
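The detection table could be applied roughly as below. This is a sketch under stated assumptions: `detect_test_runners` is a hypothetical helper, lockfile names (`pnpm-lock.yaml`, `yarn.lock`) drive package-manager selection, and the `[tool.pytest` substring check covers both `[tool.pytest]` and `[tool.pytest.ini_options]`.

```python
import json
import re
from pathlib import Path

def detect_test_runners(root: Path) -> list[str]:
    """Apply the detection table in order; a monorepo may match several runners."""
    cmds: list[str] = []
    pkg = root / "package.json"
    if pkg.exists() and "test" in json.loads(pkg.read_text()).get("scripts", {}):
        if (root / "pnpm-lock.yaml").exists():
            cmds.append("pnpm test")
        elif (root / "yarn.lock").exists():
            cmds.append("yarn test")
        else:
            cmds.append("npm test")
    if any(any(root.glob(pat)) for pat in ("*.sln", "*.slnx", "*.csproj")):
        cmds.append("dotnet test")
    makefile = root / "Makefile"
    if makefile.exists() and re.search(r"^test:", makefile.read_text(), re.M):
        cmds.append("make test")
    pyproject = root / "pyproject.toml"
    if (root / "pytest.ini").exists() or (
            pyproject.exists() and "[tool.pytest" in pyproject.read_text()):
        cmds.append("pytest")
    if (root / "Cargo.toml").exists():
        cmds.append("cargo test")
    if (root / "go.mod").exists():
        cmds.append("go test ./...")
    return cmds
```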

### Test Execution

1. Run the detected test command(s) from the repo root
2. Report pass/fail summary with failure details

### CI Remediation Loop

If CI fails, run a remediation loop (same pattern as the Implement-Verify loop):

```
repeat:
  1. Parse test failures -- group by type (compile error, test failure, lint error)
  2. Present failures to user with file locations and error messages
  3. Ask: "Fix these CI failures? (yes / skip / abort)"
     - yes   -> delegate to speckit.implement with failure details as context, then re-run CI
     - skip  -> exit loop, leave failures for manual fixing
     - abort -> stop the workflow entirely
  4. After re-run, check CI result again
until: CI passes OR user says skip/abort
```

Rules:
- **Pass failure context**: Include exact error messages, file paths, and test names when delegating to implement
- **Cap at 3 iterations**: After 3 rounds, warn: *"3 CI fix iterations completed, {N} failures remain. These likely need manual debugging."*
- **Human gate every iteration**: Never auto-loop
- **Delta reporting**: *"Fixed: {N} failures, New: {N}, Remaining: {N}"*
- **Distinguish failure types**: Fix compile errors before test failures -- a single compile error can cascade into many spurious test failures

### Tests Pass

When all tests pass, proceed to the Completion Summary.

## Error Recovery

### Parallel Task Failure

When a task within a parallel group fails during Phase 8 (Implement):
1. **Let the other in-flight tasks finish** -- don't abort tasks that are already running
2. Report which task(s) failed with error details
3. Offer three options:
   - **Retry failed only** -- re-dispatch only the failed task(s), skip completed ones
   - **Retry entire group** -- re-run all tasks in the parallel group (useful if failure cascaded)
   - **Skip and continue** -- mark the failed task(s) and move on (user can fix manually later)
4. Never auto-retry -- always ask the user

### Sub-Agent Timeout or Crash

If a delegated sub-agent doesn't return (timeout) or returns an error:
1. Report the phase and agent that failed
2. Offer to retry the same phase or skip it
3. If the same agent fails twice in a row, suggest the user run it manually (`/speckit.{agent}`) and then resume the fleet

## Phase Rollback

At any human gate, the user may say "go back to Phase N" or "rollback to plan." The fleet supports this:

1. **Identify the target phase**: Parse the user's request to determine which phase to roll back to.
2. **Warn about downstream invalidation**: All artifacts generated by phases *after* the target phase are now potentially stale. Show:
   > Rolling back to Phase {N} ({name}). The following artifacts may be invalidated:
   > - plan.md (Phase 3)
   > - tasks.md (Phase 5)
   > - Implementation (Phase 8)
   >
   > These will be regenerated as the workflow proceeds. Continue?
3. **Delete marker files only**: Remove `.analyze-done`, `.verify-done`, and `review.md` for invalidated phases. Do NOT delete spec.md, plan.md, or tasks.md -- they'll be overwritten when the phase re-runs.
4. **Update the todo list**: Reset all phases from the target phase onward to `not-started`.
5. **Resume from the target phase**: Follow the normal phase execution flow from that point.

**Constraints**:
- Cannot rollback during an active sub-agent delegation -- wait for it to complete first
- Rollback to Phase 1 (Specify) with "start over" requires explicit confirmation since it regenerates everything

## Completion Summary

After Phase 10 completes (tests pass or the user skips them), present a structured summary:

```
## Fleet Complete

Feature: {feature name}
Branch: {branch name}
Progress: {phases completed}/{phases total} phases completed, {phases skipped} skipped

### Artifacts Generated
- spec.md -- feature specification ({word count} words, {user stories count} user stories)
- plan.md -- technical plan ({components count} components)
- tasks.md -- {total tasks} tasks ({completed} completed, {remaining} remaining)
- review.md -- cross-model review (verdict: {verdict})

### Implementation
- Files created: {count}
- Files modified: {count}
- Tests added: {count}

### Quality Gates
- Analyze: {pass/findings count}
- Cross-model review: {verdict}
- Verify: {pass/findings count} ({iterations} iterations)
- CI: {pass/fail}

### Git
- Commits: {list of WIP commits if any}
- Ready to push: {yes/no}
```

After the summary, offer:
1. *"Push to remote and create a PR?"* (if the user wants)
2. *"View any artifact? (spec, plan, tasks, review)"*
diff --git a/.specify/extensions/fleet/commands/review.md b/.specify/extensions/fleet/commands/review.md
new file mode 100644
index 0000000..56a0bd9
--- /dev/null
+++ b/.specify/extensions/fleet/commands/review.md
@@ -0,0 +1,112 @@
---
description: "Cross-model evaluation of plan.md and tasks.md before implementation. Reviews feasibility, completeness, dependency ordering, risk, and parallelization correctness using a different model than was used to generate the artifacts."
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --paths-only
  ps: scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly
user-invocable: false
agents: []
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

---

You are a **Pre-Implementation Reviewer** -- a critical evaluator who reviews the design artifacts (plan.md, tasks.md, spec.md) produced by earlier workflow phases. Your purpose is to catch issues that the generating model may have been blind to, before implementation begins.

**STRICTLY READ-ONLY**: Do NOT modify any files. Output a structured review report only.

## What You Review

Run `{SCRIPT}` from the repo root to discover `FEATURE_DIR`. Then read these artifacts:

- `spec.md` -- the feature specification (requirements, user stories)
- `plan.md` -- the technical plan (architecture, tech stack, file structure)
- `tasks.md` -- the task breakdown (phased, dependency-ordered, with [P] markers)
- `checklists/` -- any requirement quality checklists (if present)
- `remediation.md` -- analyze output (if present)

## Review Dimensions

Evaluate across these 7 dimensions. For each, assign a verdict: **PASS**, **WARN**, or **FAIL**.

### 1. Spec-Plan Alignment
- Does plan.md address every user story in spec.md?
- Are there plan decisions that contradict spec requirements?
- Are non-functional requirements (performance, security, accessibility) covered in the plan?

### 2. Plan-Tasks Completeness
- Does every architectural component in plan.md have corresponding tasks in tasks.md?
- Are there tasks that reference files/patterns not described in plan.md?
- Are test tasks present for critical paths?

### 3. Dependency Ordering
- Are task phases ordered correctly? (setup -> foundational -> stories -> polish)
- Do any tasks reference files/interfaces that haven't been created by an earlier task?
- Are foundational tasks truly blocking, or could some be parallelized?

### 4. Parallelization Correctness
- Are `[P]` markers accurate? (Do tasks marked parallel truly touch different files with no dependency?)
- Are there tasks NOT marked `[P]` that could be parallelized?
- Do `<!-- parallel-group: N -->` groupings respect the max-3 constraint?
- Are there same-file conflicts hidden within a parallel group?
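The same-file conflict check can be expressed as a small helper, assuming each task's touched files are known. `same_file_conflicts` and its input shape (task ID mapped to file paths) are illustrative assumptions, not part of the review command.

```python
from collections import defaultdict

def same_file_conflicts(group: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each file touched by more than one task in a [P] group to the task IDs."""
    touched: dict[str, list[str]] = defaultdict(list)
    for task_id, files in group.items():
        for path in files:
            touched[path].append(task_id)
    # Any file claimed by two or more tasks invalidates the parallel grouping.
    return {path: ids for path, ids in touched.items() if len(ids) > 1}
```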

### 5. Feasibility & Risk
- Are there tasks that seem too large? (If a single task touches >3 files or >200 LOC, flag it)
- Are there technology choices in plan.md that contradict the project's existing stack?
- Are there missing error handling, edge case, or migration tasks?
- Does the task count seem proportional to the feature complexity?

### 6. Constitution & Standards Compliance
- Read `.specify/memory/constitution.md` and check plan aligns with project principles
- Check that testing approach matches the project's testing standards (80% coverage, TDD if required)
- Verify security considerations are addressed (path validation, input sanitization, etc.)

### 7. Implementation Readiness
- Is every task specific enough for an LLM to execute without ambiguity?
- Do all tasks include exact file paths?
- Are acceptance criteria clear for each user story phase?

## Output Format

```markdown
# Pre-Implementation Review

**Feature**: {feature name from spec.md}
**Artifacts reviewed**: spec.md, plan.md, tasks.md, [others if present]
**Review model**: {your model name} (should be different from the model that generated the artifacts)
**Generating model**: {model used for Phases 1-6, if known}

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS/WARN/FAIL | brief note |
| Plan-Tasks Completeness | PASS/WARN/FAIL | brief note |
| Dependency Ordering | PASS/WARN/FAIL | brief note |
| Parallelization Correctness | PASS/WARN/FAIL | brief note |
| Feasibility & Risk | PASS/WARN/FAIL | brief note |
| Standards Compliance | PASS/WARN/FAIL | brief note |
| Implementation Readiness | PASS/WARN/FAIL | brief note |

**Overall**: READY / READY WITH WARNINGS / NOT READY

## Findings

### Critical (FAIL -- must fix before implementing)
1. ...

### Warnings (WARN -- recommend fixing, can proceed)
1. ...

### Observations (informational)
1. ...

## Recommended Actions
- [ ] {specific action to address each FAIL/WARN}
```
diff --git a/.specify/extensions/fleet/config-template.yml b/.specify/extensions/fleet/config-template.yml
new file mode 100644
index 0000000..df12908
--- /dev/null
+++ b/.specify/extensions/fleet/config-template.yml
@@ -0,0 +1,37 @@
# Fleet Orchestrator Configuration
# Copy this file to .specify/extensions/fleet/fleet-config.yml and customize.

# Parallel execution settings
parallel:
  # Maximum number of subagents dispatched simultaneously during Plan & Implement phases.
  # Tasks marked [P] in tasks.md that form independent groups will be fanned out
  # up to this limit. Range: 1-3.
  max_concurrency: 3

# Model preferences
# On first run, the fleet will ask you to configure these.
# Set "auto" to use the IDE default, "ask" to be prompted, or specify exact model names for your IDE.
#
# Examples by platform:
#   VS Code Copilot:  "Claude Opus 4.6 (copilot)", "Claude Sonnet 4.6 (copilot)"
#   Claude Code:      "claude-sonnet-4-20250514", "claude-opus-4-20250514"
#   Cursor:           "claude-sonnet-4", "gpt-4o"
#   Other:            Use whatever model identifier your IDE supports
models:
  # Primary model for the fleet orchestrator (Phase 1-6, 8-10)
  # Set to "auto" to use whatever model the IDE defaults to.
  primary: "auto"

  # Review model (Phase 7) -- should be DIFFERENT from primary for blind-spot detection.
  # Set to "ask" to be prompted on first run.
  # Can be a single string or a list (tries in order until one is available).
  review: "ask"

# Verify extension settings
verify:
  # If the verify extension is not installed, prompt the user to install it.
  # Set to false to always skip Phase 9 (Verify).
  auto_prompt_install: true

  # URL for the verify extension archive (used in the install prompt).
  install_url: "https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip"
diff --git a/.specify/extensions/fleet/docs/catalog-submission.md b/.specify/extensions/fleet/docs/catalog-submission.md
new file mode 100644
index 0000000..8703efe
--- /dev/null
+++ b/.specify/extensions/fleet/docs/catalog-submission.md
@@ -0,0 +1,99 @@
# Catalog & PR Templates

These are pre-built snippets for submitting the fleet extension to the
spec-kit community catalog. Copy-paste when ready.

---

## catalog.community.json Entry

Add this under `"extensions"` in `extensions/catalog.community.json` in the
[spec-kit repo](https://github.com/github/spec-kit):

```json
"fleet": {
  "name": "Fleet Orchestrator",
  "id": "fleet",
  "description": "Orchestrate a full feature lifecycle with human-in-the-loop gates across all SpecKit phases.",
  "author": "sharathsatish",
  "version": "1.0.0",
  "download_url": "https://github.com/sharathsatish/spec-kit-fleet/archive/refs/tags/v1.0.0.zip",
  "repository": "https://github.com/sharathsatish/spec-kit-fleet",
  "homepage": "https://github.com/sharathsatish/spec-kit-fleet",
  "documentation": "https://github.com/sharathsatish/spec-kit-fleet/blob/main/README.md",
  "changelog": "https://github.com/sharathsatish/spec-kit-fleet/blob/main/CHANGELOG.md",
  "license": "MIT",
  "requires": {
    "speckit_version": ">=0.1.0",
    "tools": []
  },
  "provides": {
    "commands": 2,
    "hooks": 1
  },
  "tags": [
    "orchestration",
    "workflow",
    "human-in-the-loop",
    "parallel"
  ],
  "verified": false,
  "downloads": 0,
  "stars": 0,
  "created_at": "2026-03-06T00:00:00Z",
  "updated_at": "2026-03-06T00:00:00Z"
}
```

---

## extensions/README.md Table Row

Insert alphabetically in the Available Extensions table:

```markdown
| Fleet Orchestrator | Orchestrate a full feature lifecycle with human-in-the-loop gates | [spec-kit-fleet](https://github.com/sharathsatish/spec-kit-fleet) |
```

---

## Pull Request Description

```markdown
## Extension Submission

**Extension Name**: Fleet Orchestrator
**Extension ID**: fleet
**Version**: 1.0.0
**Author**: sharathsatish
**Repository**: https://github.com/sharathsatish/spec-kit-fleet

### Description
Orchestrate a full feature lifecycle with human-in-the-loop gates across all
SpecKit phases. Chains 10 phases (specify -> clarify -> plan -> checklist ->
tasks -> analyze -> review -> implement -> verify -> CI) into a single command
with artifact detection, mid-workflow resume, parallel execution (up to 3
concurrent subagents), and cross-model review.

### Checklist
- [x] Valid extension.yml manifest
- [x] README.md with installation and usage docs
- [x] LICENSE file included
- [ ] GitHub release created (v1.0.0)
- [x] Extension tested on real project
- [x] All commands working
- [x] No security vulnerabilities
- [x] Added to extensions/catalog.community.json
- [x] Added to extensions/README.md Available Extensions table

### Testing
Tested on:
- Windows 11 with spec-kit 0.1.0 (specify-cli)

### Additional Notes
- Provides 2 commands: `speckit.fleet.run` (alias `speckit.fleet`) and
  `speckit.fleet.review` (alias `speckit.review`)
- Includes `after_tasks` hook for automatic cross-model review
- Optional dependency on the verify extension (auto-prompts to install)
- All files are pure ASCII for Windows cp1252 compatibility
```
diff --git a/.specify/extensions/fleet/extension.yml b/.specify/extensions/fleet/extension.yml
new file mode 100644
index 0000000..a8924e6
--- /dev/null
+++ b/.specify/extensions/fleet/extension.yml
@@ -0,0 +1,58 @@
schema_version: "1.0"

extension:
  id: "fleet"
  name: "Fleet Orchestrator"
  version: "1.0.0"
  description: "Orchestrate a full feature lifecycle with human-in-the-loop gates across all SpecKit phases."
  author: "sharathsatish"
  repository: "https://github.com/sharathsatish/spec-kit-fleet"
  license: "MIT"
  homepage: "https://github.com/sharathsatish/spec-kit-fleet"

requires:
  speckit_version: ">=0.1.0"

  commands:
    - "speckit.specify"
    - "speckit.clarify"
    - "speckit.plan"
    - "speckit.checklist"
    - "speckit.tasks"
    - "speckit.analyze"
    - "speckit.implement"

provides:
  commands:
    - name: "speckit.fleet.run"
      file: "commands/fleet.md"
      description: "Orchestrate a full feature lifecycle with human-in-the-loop checkpoints through all SpecKit phases"
      aliases: ["speckit.fleet"]

    - name: "speckit.fleet.review"
      file: "commands/review.md"
      description: "Cross-model evaluation of plan.md and tasks.md before implementation"
      aliases: ["speckit.review"]

  config:
    - name: "fleet-config.yml"
      template: "config-template.yml"
      description: "Fleet orchestrator configuration (parallelism, models, verify settings)"
      required: false

hooks:
  after_tasks:
    command: "speckit.fleet.review"
    optional: true
    prompt: "Run cross-model review to evaluate plan and tasks before implementation?"
    description: "Pre-implementation review gate"

tags:
  - orchestration
  - workflow
  - human-in-the-loop
  - parallel

defaults:
  parallel:
    max_concurrency: 3
diff --git a/.specify/extensions/verify/.gitignore b/.specify/extensions/verify/.gitignore
new file mode 100644
index 0000000..28a52e7
--- /dev/null
+++ b/.specify/extensions/verify/.gitignore
@@ -0,0 +1,39 @@
# Local configuration overrides
*-config.local.yml

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/

# Testing
.pytest_cache/
.coverage
htmlcov/

# IDEs
.vscode/
.idea/
*.swp
*.swo
*~

# OS
.DS_Store
Thumbs.db

# Logs
*.log

# Build artifacts
dist/
build/
*.egg-info/

# Temporary files
*.tmp
.cache/
\ No newline at end of file
diff --git a/.specify/extensions/verify/CHANGELOG.md b/.specify/extensions/verify/CHANGELOG.md
new file mode 100644
index 0000000..4ea6fbe
--- /dev/null
+++ b/.specify/extensions/verify/CHANGELOG.md
@@ -0,0 +1,20 @@
# Changelog

All notable changes to this extension will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.0.0] - 2026-02-28

### Added

- Initial release of the Verify extension
- Command: `/speckit.verify.run` (alias: `/speckit.verify`) — post-implementation verification
- Checks implemented code against spec, plan, tasks, and constitution to catch gaps before review
- Produces a verification report with findings, metrics, and next actions
- `after_implement` hook for automatic verification prompting

### Requirements

- Spec Kit: >=0.1.0
diff --git a/.specify/extensions/verify/LICENSE b/.specify/extensions/verify/LICENSE
new file mode 100644
index 0000000..8e4e888
--- /dev/null
+++ b/.specify/extensions/verify/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Ismael Jimenez

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
diff --git a/.specify/extensions/verify/README.md b/.specify/extensions/verify/README.md
new file mode 100644
index 0000000..dacdaa0
--- /dev/null
+++ b/.specify/extensions/verify/README.md
@@ -0,0 +1,202 @@
# Spec-Kit Verify Extension

Post-implementation quality gate that validates implemented code against specification artifacts.

## Features

- **Implementation verification**: Checks implemented code against spec, plan, tasks, and constitution to catch gaps before review
- **Actionable report**: Produces a verification report with findings, metrics, and next actions
- **Configurable**: Adjust report size limits
- **Automatic hook**: Optional post-implementation prompt after `/speckit.implement`
- **Read-only & idempotent**: Never modifies source files or artifacts; repeated runs produce the same report

## Installation

```bash
specify extension add verify
```

Or install from repository directly:

```bash
specify extension add verify --from https://github.com/ismaelJimenez/spec-kit-verify/archive/refs/tags/v1.0.0.zip
```

For local development:

```bash
specify extension add --dev /path/to/spec-kit-verify
```

## Configuration

1. Create configuration file:

   ```bash
   cp .specify/extensions/verify/config-template.yml \
     .specify/extensions/verify/verify-config.yml
   ```

2. Edit configuration:

   ```bash
   vim .specify/extensions/verify/verify-config.yml
   ```

3. Customize as needed:

   ```yaml
   # Limit report size
   report:
     max_findings: 30
   ```

## Usage

### Command: verify

Validate implemented code against specification artifacts.

```text
# In Claude Code
> /speckit.verify
```

**Prerequisites:**

- Spec Kit >= 0.1.0
- Completed `/speckit.implement` run
- `spec.md` and `tasks.md` present in the feature directory
- At least one completed task in `tasks.md`

**Output:**

- Verification report with findings, metrics, and next actions
- Optional remediation suggestions on request

### Automatic Hook

If the `after_implement` hook is enabled, you'll be prompted automatically after `/speckit.implement` completes:

> Run verify to validate implementation against specification?

## Configuration Reference

### Report Settings

| Setting | Type | Required | Description |
|---------|------|----------|-------------|
| `report.max_findings` | integer | No | Maximum findings in the report (default: `50`) |

## Environment Variables

This extension does not currently support environment variable overrides. All configuration is managed through `verify-config.yml`.

## Examples

### Example 1: Basic Verification

```text
# Step 1: Create specification
> /speckit.specify

# Step 2: Plan and generate tasks
> /speckit.plan
> /speckit.tasks

# Step 3: Implement
> /speckit.implement

# Step 4: Verify implementation
> /speckit.verify
```

The verify command produces a report like:

```markdown
## Verification Report

| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Task Completion | LOW | tasks.md | 1 of 12 tasks incomplete | Complete task T08 |
| C1 | Requirement Coverage | CRITICAL | spec.md:FR-003 | No implementation evidence | Implement FR-003 |
| D1 | Scenario & Test Coverage | HIGH | spec.md:SC-02 | No test for login failure | Add test for scenario SC-02 |

Metrics: Tasks 11/12 · Requirement Coverage 92% · Files Verified 8 · Critical Issues 1
```

## What It Does

The verify command analyzes implemented code against specification artifacts:

1. Loads feature artifacts (spec.md, plan.md, tasks.md, constitution.md)
2. Identifies implementation scope from completed tasks
3. Runs verification checks across seven categories
4. Produces a report with findings, metrics, and next actions

### Verification Checks

| Check | What it verifies |
|-------|------------------|
| Task Completion | All tasks marked complete |
| File Existence | Task-referenced files exist on disk |
| Requirement Coverage | Every requirement has implementation evidence |
| Scenario & Test Coverage | Spec scenarios covered by tests or code paths |
| Spec Intent Alignment | Implementation matches spec intent and acceptance criteria |
| Constitution Alignment | Constitution principles are respected |
| Design & Structure Consistency | Architecture and conventions match plan.md |

## Workflow Integration

```text
/speckit.specify → /speckit.plan → /speckit.tasks → /speckit.implement → /speckit.verify
```

## Operating Principles

- **Read-only**: Never modifies source files, tasks, or spec artifacts
- **Spec-driven**: All findings trace back to specification artifacts
- **Constitution authority**: Constitution violations are always CRITICAL
- **Idempotent**: Multiple runs on the same state produce the same report

## Troubleshooting

### Issue: Configuration not found

**Solution:** Create config from template (see [Configuration](#configuration) section):

```bash
cp .specify/extensions/verify/config-template.yml \
  .specify/extensions/verify/verify-config.yml
```

### Issue: Command not available

**Solutions:**

1. Check extension is installed: `specify extension list`
2. Restart AI agent
3. Reinstall extension: `specify extension add verify`

### Issue: "No completed tasks" error

**Solution:** Run `/speckit.implement` first. The verify command requires at least one completed task (`[x]`) in `tasks.md`.

### Issue: "Missing spec.md" error

**Solution:** Run `/speckit.specify` to create the specification before verifying. Both `spec.md` and `tasks.md` must exist in the feature directory.

## License

MIT License - see [LICENSE](LICENSE) file

## Support

- Issues: [https://github.com/ismaelJimenez/spec-kit-verify/issues](https://github.com/ismaelJimenez/spec-kit-verify/issues)
- Spec Kit Docs: [https://github.com/github/spec-kit](https://github.com/github/spec-kit)

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for version history.

Extension Version: 1.0.0 · Spec Kit: >=0.1.0
diff --git a/.specify/extensions/verify/commands/verify.md b/.specify/extensions/verify/commands/verify.md
new file mode 100644
index 0000000..e1f0ca3
--- /dev/null
+++ b/.specify/extensions/verify/commands/verify.md
@@ -0,0 +1,211 @@
---
description: Perform a non-destructive post-implementation verification gate validating the implementation against spec.md, plan.md, tasks.md, and constitution.md.
scripts:
  sh: scripts/bash/check-prerequisites.sh --json --require-tasks --include-tasks
  ps: scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks
handoffs:
  - label: Address findings and re-implement
    agent: speckit.implement
    prompt: Address the verification findings and re-run implementation to resolve issues
  - label: Re-analyze specification consistency
    agent: speckit.analyze
    prompt: Re-analyze specification consistency based on verification findings
---

## User Input

```text
$ARGUMENTS
```

You **MUST** consider the user input before proceeding (if not empty).

## Goal

Validate the implementation against its specification artifacts (`spec.md`, `plan.md`, `tasks.md`, `constitution.md`). This command MUST run only after `/speckit.implement` has completed.

## Operating Constraints

**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).

**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this verification scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, tasks, or implementation—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.verify`.

## Execution Steps

### 1. Initialize Verification Context

Run `{SCRIPT}` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:

- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md

Abort if SPEC or TASKS is missing (instruct the user to run the missing prerequisite command). PLAN and constitution are optional — checks that depend on them are skipped gracefully.
Abort if TASKS has no completed tasks.
For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").

### 2. Load Artifacts (Progressive Disclosure)

Load only the minimal necessary context from each artifact:

**From spec.md:**

- Functional Requirements
- User Stories and Acceptance Criteria
- Scenarios
- Edge Cases (if present)

**From plan.md (optional):**

- Architecture/stack choices
- Data Model references
- Project structure (directory layout)

**From tasks.md:**

- Task IDs
- Completion status
- Descriptions
- Phase grouping
- Referenced file paths
- Count total tasks and completed tasks

**From constitution (optional):**

- Load `.specify/memory/constitution.md` for principle validation
- If missing or placeholder: skip constitution checks, emit Info finding

### 3. Identify Implementation Scope

Build the set of files to verify from tasks.md.

- Parse all tasks in tasks.md — both completed (`[x]`/`[X]`) and incomplete (`[ ]`)
- Extract file paths referenced in each task description
- Build **REVIEW_FILES** set from completed task file paths
- Track **INCOMPLETE_TASK_FILES** from incomplete tasks (used by check C)
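The scoping pass above can be sketched as follows. This is illustrative only and assumes tasks use the common `- [x] T01 ... path/to/file` checkbox form; the exact task and path formats in a real `tasks.md` may differ:

```python
import re

def scope_tasks(tasks_md: str):
    """Split checkbox task lines into completed vs incomplete and
    collect the file paths each group references."""
    # Match lines like "- [x] T01 Create src/auth.ts"
    task_re = re.compile(r"^- \[([ xX])\]\s+(T\d+)\s+(.*)$", re.MULTILINE)
    # Heuristic: treat any slash-containing token as a file path
    path_re = re.compile(r"[\w./-]+/[\w./-]+")
    review_files, incomplete_task_files = set(), set()
    for mark, task_id, desc in task_re.findall(tasks_md):
        paths = set(path_re.findall(desc))
        if mark in "xX":
            review_files |= paths          # completed → REVIEW_FILES
        else:
            incomplete_task_files |= paths  # incomplete → check C input
    return review_files, incomplete_task_files
```

Completed-task paths feed **REVIEW_FILES**; incomplete-task paths feed **INCOMPLETE_TASK_FILES** for check C.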

### 4. Build Semantic Models

Create internal representations (do not include raw artifacts in output):

- **Task inventory**: Each task with ID, completion status, referenced file paths, and phase grouping
- **Implementation mapping**: Map each completed task to its referenced file paths
- **File inventory**: All REVIEW_FILES with existence verification — flag any task-referenced file that does not exist on disk
- **Requirements inventory**: Each functional requirement with a stable key — map to tasks and REVIEW_FILES for implementation evidence (evidence = file in REVIEW_FILES containing keyword/ID match, function signatures, or code paths that address the requirement)
- **Spec intent references**: User stories, acceptance criteria, and scenarios from spec.md
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
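The simplest form of the requirements-inventory mapping is a keyword/ID scan: a requirement has evidence if some file in REVIEW_FILES mentions its stable key. A minimal sketch (real evidence gathering would also consider function signatures and code paths):

```python
def requirement_evidence(requirement_ids, file_contents):
    """Map each requirement key (e.g. 'FR-003') to the REVIEW_FILES
    whose content mentions it -- the weakest acceptable evidence."""
    evidence = {rid: [] for rid in requirement_ids}
    for path, text in file_contents.items():
        for rid in requirement_ids:
            if rid in text:
                evidence[rid].append(path)
    return evidence  # requirements with an empty list have no evidence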

### 5. Verification Checks (Token-Efficient Analysis)

Focus on high-signal findings. Limit the report to 50 findings total; aggregate the remainder in an overflow summary.

#### A. Task Completion

- Compare completed (`[x]`/`[X]`) vs total tasks
- Flag majority incomplete vs minority incomplete

#### B. File Existence

- Task-referenced files that do not exist on disk
- Tasks referencing ambiguous or unresolvable paths

#### C. Requirement Coverage

- Requirements with no implementation evidence in REVIEW_FILES
- Requirements whose tasks are all incomplete

#### D. Scenario & Test Coverage

- Spec scenarios with no corresponding test or code path
- No test files detected at all in REVIEW_FILES
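Detecting whether REVIEW_FILES contains any tests at all can be done with naming conventions. A sketch, assuming the common `test_*`, `*.test.*`, and `*.spec.*` patterns (projects may use other layouts):

```python
import re

def detect_test_files(review_files):
    """Return REVIEW_FILES entries that look like tests by naming convention."""
    test_pat = re.compile(r"(^|/)(test_[^/]+|[^/]+[._](test|spec)\.[^/.]+)$")
    return sorted(f for f in review_files if test_pat.search(f))
```

An empty result triggers the "no test files detected" finding.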

#### E. Spec Intent Alignment

- Implementation diverging from spec intent (minor vs fundamental divergence)
- Compare acceptance criteria against actual behaviour in REVIEW_FILES

#### F. Constitution Alignment

- Any implementation element conflicting with a constitution MUST principle
- Missing mandated sections or quality gates from constitution

#### G. Design & Structure Consistency

- Architectural decisions or design patterns from plan.md not reflected in code
- Planned directory/file layout deviating from actual structure
- New code deviating from existing project conventions (naming, module structure, error handling patterns)
- Public APIs/exports/endpoints not described in plan.md

### 6. Severity Assignment

Use this heuristic to prioritize findings:

- **CRITICAL**: Violates constitution MUST, majority of tasks incomplete, task-referenced files missing from disk, requirement with zero implementation
- **HIGH**: Spec intent divergence, fundamental implementation mismatch with acceptance criteria, missing scenario/test coverage
- **MEDIUM**: Design pattern drift, minor spec intent deviation
- **LOW**: Structure deviations, naming inconsistencies, minor observations not affecting functionality
- **INFO**: Positive confirmations (all tasks complete, all requirements covered, no issues found). Use sparingly — only in summary metrics, not as individual finding rows.
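Because the Next Actions block (step 8) keys off the worst severity present, it helps to make severities ordered values rather than bare strings. A minimal sketch of that ordering:

```python
from enum import IntEnum

class Severity(IntEnum):
    """Ordered so that comparisons and max() reflect priority."""
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4

def max_severity(findings):
    """Overall gate level is the worst finding severity (INFO when empty)."""
    return max((f["severity"] for f in findings), default=Severity.INFO)
```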

### 7. Produce Compact Verification Report

Output a Markdown report (no file writes) with the following structure:

## Verification Report

| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Task Completion | CRITICAL | tasks.md | 3 of 12 tasks incomplete | Complete tasks T05, T08, T11 |
| B1 | File Existence | CRITICAL | src/auth.ts | Task-referenced file missing | Create file or update task reference |
| C1 | Requirement Coverage | CRITICAL | spec.md:FR-003 | No implementation evidence | Implement FR-003 |

(Add one row per finding; generate stable IDs prefixed by check letter: A1, B1, C1... Reference specific files and line numbers in Location(s) where applicable.)

**Task Summary Table:**

| Task ID | Status | Referenced Files | Notes |
|---------|--------|-----------------|-------|

**Constitution Alignment Issues:** (if any)

**Metrics:**

- Total Tasks (completed / total)
- Requirement Coverage % (requirements with implementation evidence / total)
- Files Verified
- Critical Issues Count
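The metrics line can be assembled directly from the counts above. A sketch that reproduces the one-line format shown in the example report (rounding coverage to the nearest percent; an empty requirements set is treated as fully covered):

```python
def format_metrics(total, completed, requirements, covered,
                   files_verified, critical):
    """Render the compact metrics line for the verification report."""
    coverage = round(100 * covered / requirements) if requirements else 100
    return (f"Metrics: Tasks {completed}/{total} · "
            f"Requirement Coverage {coverage}% · "
            f"Files Verified {files_verified} · "
            f"Critical Issues {critical}")
```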

### 8. Provide Next Actions

At the end of the report, output a concise Next Actions block:

- If CRITICAL issues exist: Recommend resolving before proceeding
- If HIGH issues exist: Recommend addressing before merge; user may proceed at own risk
- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
- Provide explicit command suggestions: e.g., "Run `/speckit.implement` to address findings and re-run verification", "Implementation verified — ready for review or merge"

### 9. Offer Remediation

Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)

## Operating Principles

### Context Efficiency

- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
- **Progressive disclosure**: Load artifacts and source files incrementally; don't dump all content into analysis
- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts

### Analysis Guidelines

- **NEVER modify files** (this is read-only analysis)
- **NEVER hallucinate missing sections** (if absent, report them accurately)
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)
- **Every finding must trace back** to a specification artifact (spec.md requirement, user story, scenario, edge case), a structural reference (plan.md, constitution.md), or a task in tasks.md

### Idempotency by Design

The command produces deterministic output — running verification twice on the same state yields the same report. No counters, timestamp-dependent logic, or accumulated state affects findings. The report is fully regenerated on each run.

diff --git a/.specify/extensions/verify/config-template.yml b/.specify/extensions/verify/config-template.yml
new file mode 100644
index 0000000..3c459b3
--- /dev/null
+++ b/.specify/extensions/verify/config-template.yml
@@ -0,0 +1,7 @@
# Verify Extension Configuration
# Copy this file to verify-config.yml and customize as needed

# Report formatting
report:
  # Maximum number of findings in the report table (overflow is summarized)
  max_findings: 50
diff --git a/.specify/extensions/verify/extension.yml b/.specify/extensions/verify/extension.yml
new file mode 100644
index 0000000..2aba8d9
--- /dev/null
+++ b/.specify/extensions/verify/extension.yml
@@ -0,0 +1,50 @@
schema_version: "1.0"

extension:
  id: "verify"
  name: "Verify Extension"
  version: "1.0.0"
  description: "Post-implementation quality gate that validates implementation against specification artifacts."
  author: "ismaelJimenez"
  repository: "https://github.com/ismaelJimenez/spec-kit-verify"
  license: "MIT"
  homepage: "https://github.com/ismaelJimenez/spec-kit-verify"

requires:
  speckit_version: ">=0.1.0"

  commands:
    - "speckit.implement"
    - "speckit.tasks"
    - "speckit.analyze"

provides:
  commands:
    - name: "speckit.verify.run"
      file: "commands/verify.md"
      description: "Validate implementation matches specification artifacts"
      aliases: ["speckit.verify"]

  config:
    - name: "verify-config.yml"
      template: "config-template.yml"
      description: "Verify extension configuration"
      required: false

hooks:
  after_implement:
    command: "speckit.verify.run"
    optional: true
    prompt: "Run verify to validate implementation against specification?"
    description: "Post-implementation verification gate"

tags:
  - "quality"
  - "verification"
  - "review"
  - "compliance"
  - "spec-alignment"

defaults:
  report:
    max_findings: 50
diff --git a/.specify/memory/constitution.md b/.specify/memory/constitution.md
new file mode 100644
index 0000000..a4670ff
--- /dev/null
+++ b/.specify/memory/constitution.md
@@ -0,0 +1,50 @@
# [PROJECT_NAME] Constitution
<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->

## Core Principles

### [PRINCIPLE_1_NAME]
<!-- Example: I. Library-First -->
[PRINCIPLE_1_DESCRIPTION]
<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->

### [PRINCIPLE_2_NAME]
<!-- Example: II. CLI Interface -->
[PRINCIPLE_2_DESCRIPTION]
<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->

### [PRINCIPLE_3_NAME]
<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
[PRINCIPLE_3_DESCRIPTION]
<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->

### [PRINCIPLE_4_NAME]
<!-- Example: IV. Integration Testing -->
[PRINCIPLE_4_DESCRIPTION]
<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->

### [PRINCIPLE_5_NAME]
<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
[PRINCIPLE_5_DESCRIPTION]
<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->

## [SECTION_2_NAME]
<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->

[SECTION_2_CONTENT]
<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->

## [SECTION_3_NAME]
<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->

[SECTION_3_CONTENT]
<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->

## Governance
<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->

[GOVERNANCE_RULES]
<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->

**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->
diff --git a/.specify/scripts/bash/check-prerequisites.sh b/.specify/scripts/bash/check-prerequisites.sh
new file mode 100755
index 0000000..88a5559
--- /dev/null
+++ b/.specify/scripts/bash/check-prerequisites.sh
@@ -0,0 +1,190 @@
#!/usr/bin/env bash

# Consolidated prerequisite checking script
#
# This script provides unified prerequisite checking for Spec-Driven Development workflow.
# It replaces the functionality previously spread across multiple scripts.
#
# Usage: ./check-prerequisites.sh [OPTIONS]
#
# OPTIONS:
#   --json              Output in JSON format
#   --require-tasks     Require tasks.md to exist (for implementation phase)
#   --include-tasks     Include tasks.md in AVAILABLE_DOCS list
#   --paths-only        Only output path variables (no validation)
#   --help, -h          Show help message
#
# OUTPUTS:
#   JSON mode: {"FEATURE_DIR":"...", "AVAILABLE_DOCS":["..."]}
#   Text mode: FEATURE_DIR:... \n AVAILABLE_DOCS: \n ✓/✗ file.md
#   Paths only: REPO_ROOT: ... \n BRANCH: ... \n FEATURE_DIR: ... etc.

set -e

# Parse command line arguments
JSON_MODE=false
REQUIRE_TASKS=false
INCLUDE_TASKS=false
PATHS_ONLY=false

for arg in "$@"; do
    case "$arg" in
        --json)
            JSON_MODE=true
            ;;
        --require-tasks)
            REQUIRE_TASKS=true
            ;;
        --include-tasks)
            INCLUDE_TASKS=true
            ;;
        --paths-only)
            PATHS_ONLY=true
            ;;
        --help|-h)
            cat << 'EOF'
Usage: check-prerequisites.sh [OPTIONS]

Consolidated prerequisite checking for Spec-Driven Development workflow.

OPTIONS:
  --json              Output in JSON format
  --require-tasks     Require tasks.md to exist (for implementation phase)
  --include-tasks     Include tasks.md in AVAILABLE_DOCS list
  --paths-only        Only output path variables (no prerequisite validation)
  --help, -h          Show this help message

EXAMPLES:
  # Check task prerequisites (plan.md required)
  ./check-prerequisites.sh --json
  
  # Check implementation prerequisites (plan.md + tasks.md required)
  ./check-prerequisites.sh --json --require-tasks --include-tasks
  
  # Get feature paths only (no validation)
  ./check-prerequisites.sh --paths-only
  
EOF
            exit 0
            ;;
        *)
            echo "ERROR: Unknown option '$arg'. Use --help for usage information." >&2
            exit 1
            ;;
    esac
done

# Source common functions
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"

# Get feature paths and validate branch
_paths_output=$(get_feature_paths) || { echo "ERROR: Failed to resolve feature paths" >&2; exit 1; }
eval "$_paths_output"
unset _paths_output
check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1

# If paths-only mode, output paths and exit (support JSON + paths-only combined)
if $PATHS_ONLY; then
    if $JSON_MODE; then
        # Minimal JSON paths payload (no validation performed)
        if has_jq; then
            jq -cn \
                --arg repo_root "$REPO_ROOT" \
                --arg branch "$CURRENT_BRANCH" \
                --arg feature_dir "$FEATURE_DIR" \
                --arg feature_spec "$FEATURE_SPEC" \
                --arg impl_plan "$IMPL_PLAN" \
                --arg tasks "$TASKS" \
                '{REPO_ROOT:$repo_root,BRANCH:$branch,FEATURE_DIR:$feature_dir,FEATURE_SPEC:$feature_spec,IMPL_PLAN:$impl_plan,TASKS:$tasks}'
        else
            printf '{"REPO_ROOT":"%s","BRANCH":"%s","FEATURE_DIR":"%s","FEATURE_SPEC":"%s","IMPL_PLAN":"%s","TASKS":"%s"}\n' \
                "$(json_escape "$REPO_ROOT")" "$(json_escape "$CURRENT_BRANCH")" "$(json_escape "$FEATURE_DIR")" "$(json_escape "$FEATURE_SPEC")" "$(json_escape "$IMPL_PLAN")" "$(json_escape "$TASKS")"
        fi
    else
        echo "REPO_ROOT: $REPO_ROOT"
        echo "BRANCH: $CURRENT_BRANCH"
        echo "FEATURE_DIR: $FEATURE_DIR"
        echo "FEATURE_SPEC: $FEATURE_SPEC"
        echo "IMPL_PLAN: $IMPL_PLAN"
        echo "TASKS: $TASKS"
    fi
    exit 0
fi

# Validate required directories and files
if [[ ! -d "$FEATURE_DIR" ]]; then
    echo "ERROR: Feature directory not found: $FEATURE_DIR" >&2
    echo "Run /speckit.specify first to create the feature structure." >&2
    exit 1
fi

if [[ ! -f "$IMPL_PLAN" ]]; then
    echo "ERROR: plan.md not found in $FEATURE_DIR" >&2
    echo "Run /speckit.plan first to create the implementation plan." >&2
    exit 1
fi

# Check for tasks.md if required
if $REQUIRE_TASKS && [[ ! -f "$TASKS" ]]; then
    echo "ERROR: tasks.md not found in $FEATURE_DIR" >&2
    echo "Run /speckit.tasks first to create the task list." >&2
    exit 1
fi

# Build list of available documents
docs=()

# Always check these optional docs
[[ -f "$RESEARCH" ]] && docs+=("research.md")
[[ -f "$DATA_MODEL" ]] && docs+=("data-model.md")

# Check contracts directory (only if it exists and has files)
if [[ -d "$CONTRACTS_DIR" ]] && [[ -n "$(ls -A "$CONTRACTS_DIR" 2>/dev/null)" ]]; then
    docs+=("contracts/")
fi

[[ -f "$QUICKSTART" ]] && docs+=("quickstart.md")

# Include tasks.md if requested and it exists
if $INCLUDE_TASKS && [[ -f "$TASKS" ]]; then
    docs+=("tasks.md")
fi

# Output results
if $JSON_MODE; then
    # Build JSON array of documents
    if has_jq; then
        if [[ ${#docs[@]} -eq 0 ]]; then
            json_docs="[]"
        else
            json_docs=$(printf '%s\n' "${docs[@]}" | jq -R . | jq -s .)
        fi
        jq -cn \
            --arg feature_dir "$FEATURE_DIR" \
            --argjson docs "$json_docs" \
            '{FEATURE_DIR:$feature_dir,AVAILABLE_DOCS:$docs}'
    else
        if [[ ${#docs[@]} -eq 0 ]]; then
            json_docs="[]"
        else
            json_docs=$(for d in "${docs[@]}"; do printf '"%s",' "$(json_escape "$d")"; done)
            json_docs="[${json_docs%,}]"
        fi
        printf '{"FEATURE_DIR":"%s","AVAILABLE_DOCS":%s}\n' "$(json_escape "$FEATURE_DIR")" "$json_docs"
    fi
else
    # Text output
    echo "FEATURE_DIR:$FEATURE_DIR"
    echo "AVAILABLE_DOCS:"
    
    # Show status of each potential document
    check_file "$RESEARCH" "research.md"
    check_file "$DATA_MODEL" "data-model.md"
    check_dir "$CONTRACTS_DIR" "contracts/"
    check_file "$QUICKSTART" "quickstart.md"
    
    if $INCLUDE_TASKS; then
        check_file "$TASKS" "tasks.md"
    fi
fi
diff --git a/.specify/scripts/bash/common.sh b/.specify/scripts/bash/common.sh
new file mode 100755
index 0000000..40f1c96
--- /dev/null
+++ b/.specify/scripts/bash/common.sh
@@ -0,0 +1,275 @@
#!/usr/bin/env bash
# Common functions and variables for all scripts

# Get repository root, with fallback for non-git repositories
get_repo_root() {
    if git rev-parse --show-toplevel >/dev/null 2>&1; then
        git rev-parse --show-toplevel
    else
        # Fall back to script location for non-git repos
        local script_dir="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
        (cd "$script_dir/../../.." && pwd)
    fi
}

# Get current branch, with fallback for non-git repositories
get_current_branch() {
    # First check if SPECIFY_FEATURE environment variable is set
    if [[ -n "${SPECIFY_FEATURE:-}" ]]; then
        echo "$SPECIFY_FEATURE"
        return
    fi

    # Then check git if available
    if git rev-parse --abbrev-ref HEAD >/dev/null 2>&1; then
        git rev-parse --abbrev-ref HEAD
        return
    fi

    # For non-git repos, try to find the latest feature directory
    local repo_root=$(get_repo_root)
    local specs_dir="$repo_root/specs"

    if [[ -d "$specs_dir" ]]; then
        local latest_feature=""
        local highest=0

        for dir in "$specs_dir"/*; do
            if [[ -d "$dir" ]]; then
                local dirname=$(basename "$dir")
                if [[ "$dirname" =~ ^([0-9]{3})- ]]; then
                    local number=${BASH_REMATCH[1]}
                    number=$((10#$number))
                    if [[ "$number" -gt "$highest" ]]; then
                        highest=$number
                        latest_feature=$dirname
                    fi
                fi
            fi
        done

        if [[ -n "$latest_feature" ]]; then
            echo "$latest_feature"
            return
        fi
    fi

    echo "main"  # Final fallback
}

# Check if we have git available
has_git() {
    git rev-parse --show-toplevel >/dev/null 2>&1
}

check_feature_branch() {
    local branch="$1"
    local has_git_repo="$2"

    # For non-git repos, we can't enforce branch naming but still provide output
    if [[ "$has_git_repo" != "true" ]]; then
        echo "[specify] Warning: Git repository not detected; skipped branch validation" >&2
        return 0
    fi

    if [[ ! "$branch" =~ ^[0-9]{3}- ]]; then
        echo "ERROR: Not on a feature branch. Current branch: $branch" >&2
        echo "Feature branches should be named like: 001-feature-name" >&2
        return 1
    fi

    return 0
}

get_feature_dir() { echo "$1/specs/$2"; }

# Find feature directory by numeric prefix instead of exact branch match
# This allows multiple branches to work on the same spec (e.g., 004-fix-bug, 004-add-feature)
find_feature_dir_by_prefix() {
    local repo_root="$1"
    local branch_name="$2"
    local specs_dir="$repo_root/specs"

    # Extract numeric prefix from branch (e.g., "004" from "004-whatever")
    if [[ ! "$branch_name" =~ ^([0-9]{3})- ]]; then
        # If branch doesn't have numeric prefix, fall back to exact match
        echo "$specs_dir/$branch_name"
        return
    fi

    local prefix="${BASH_REMATCH[1]}"

    # Search for directories in specs/ that start with this prefix
    local matches=()
    if [[ -d "$specs_dir" ]]; then
        for dir in "$specs_dir"/"$prefix"-*; do
            if [[ -d "$dir" ]]; then
                matches+=("$(basename "$dir")")
            fi
        done
    fi

    # Handle results
    if [[ ${#matches[@]} -eq 0 ]]; then
        # No match found - return the branch name path (will fail later with clear error)
        echo "$specs_dir/$branch_name"
    elif [[ ${#matches[@]} -eq 1 ]]; then
        # Exactly one match - perfect!
        echo "$specs_dir/${matches[0]}"
    else
        # Multiple matches - this shouldn't happen with proper naming convention
        echo "ERROR: Multiple spec directories found with prefix '$prefix': ${matches[*]}" >&2
        echo "Please ensure only one spec directory exists per numeric prefix." >&2
        return 1
    fi
}

get_feature_paths() {
    local repo_root=$(get_repo_root)
    local current_branch=$(get_current_branch)
    local has_git_repo="false"

    if has_git; then
        has_git_repo="true"
    fi

    # Use prefix-based lookup to support multiple branches per spec
    local feature_dir
    if ! feature_dir=$(find_feature_dir_by_prefix "$repo_root" "$current_branch"); then
        echo "ERROR: Failed to resolve feature directory" >&2
        return 1
    fi

    # Use printf '%q' to safely quote values, preventing shell injection
    # via crafted branch names or paths containing special characters
    printf 'REPO_ROOT=%q\n' "$repo_root"
    printf 'CURRENT_BRANCH=%q\n' "$current_branch"
    printf 'HAS_GIT=%q\n' "$has_git_repo"
    printf 'FEATURE_DIR=%q\n' "$feature_dir"
    printf 'FEATURE_SPEC=%q\n' "$feature_dir/spec.md"
    printf 'IMPL_PLAN=%q\n' "$feature_dir/plan.md"
    printf 'TASKS=%q\n' "$feature_dir/tasks.md"
    printf 'RESEARCH=%q\n' "$feature_dir/research.md"
    printf 'DATA_MODEL=%q\n' "$feature_dir/data-model.md"
    printf 'QUICKSTART=%q\n' "$feature_dir/quickstart.md"
    printf 'CONTRACTS_DIR=%q\n' "$feature_dir/contracts"
}

# Check if jq is available for safe JSON construction
has_jq() {
    command -v jq >/dev/null 2>&1
}

# Escape a string for safe embedding in a JSON value (fallback when jq is unavailable).
# Handles backslash, double-quote, and JSON-required control character escapes (RFC 8259).
json_escape() {
    local s="$1"
    s="${s//\\/\\\\}"
    s="${s//\"/\\\"}"
    s="${s//$'\n'/\\n}"
    s="${s//$'\t'/\\t}"
    s="${s//$'\r'/\\r}"
    s="${s//$'\b'/\\b}"
    s="${s//$'\f'/\\f}"
    # Escape any remaining U+0001-U+001F control characters as \uXXXX.
    # (U+0000/NUL cannot appear in bash strings and is excluded.)
    # LC_ALL=C ensures ${#s} counts bytes and ${s:$i:1} yields single bytes,
    # so multi-byte UTF-8 sequences (first byte >= 0xC0) pass through intact.
    local LC_ALL=C
    local i char code
    for (( i=0; i<${#s}; i++ )); do
        char="${s:$i:1}"
        printf -v code '%d' "'$char" 2>/dev/null || code=256
        if (( code >= 1 && code <= 31 )); then
            printf '\\u%04x' "$code"
        else
            printf '%s' "$char"
        fi
    done
}

check_file() { [[ -f "$1" ]] && echo "  ✓ $2" || echo "  ✗ $2"; }
check_dir() { [[ -d "$1" && -n $(ls -A "$1" 2>/dev/null) ]] && echo "  ✓ $2" || echo "  ✗ $2"; }

# Resolve a template name to a file path using the priority stack:
#   1. .specify/templates/overrides/
#   2. .specify/presets/<preset-id>/templates/ (sorted by priority from .registry)
#   3. .specify/extensions/<ext-id>/templates/
#   4. .specify/templates/ (core)
resolve_template() {
    local template_name="$1"
    local repo_root="$2"
    local base="$repo_root/.specify/templates"

    # Priority 1: Project overrides
    local override="$base/overrides/${template_name}.md"
    [ -f "$override" ] && echo "$override" && return 0

    # Priority 2: Installed presets (sorted by priority from .registry)
    local presets_dir="$repo_root/.specify/presets"
    if [ -d "$presets_dir" ]; then
        local registry_file="$presets_dir/.registry"
        if [ -f "$registry_file" ] && command -v python3 >/dev/null 2>&1; then
            # Read preset IDs sorted by priority (lower number = higher precedence).
            # The python3 call is wrapped in an if-condition so that set -e does not
            # abort the function when python3 exits non-zero (e.g. invalid JSON).
            local sorted_presets=""
            if sorted_presets=$(SPECKIT_REGISTRY="$registry_file" python3 -c "
import json, sys, os
try:
    with open(os.environ['SPECKIT_REGISTRY']) as f:
        data = json.load(f)
    presets = data.get('presets', {})
    for pid, meta in sorted(presets.items(), key=lambda x: x[1].get('priority', 10)):
        print(pid)
except Exception:
    sys.exit(1)
" 2>/dev/null); then
                if [ -n "$sorted_presets" ]; then
                    # python3 succeeded and returned preset IDs — search in priority order
                    while IFS= read -r preset_id; do
                        local candidate="$presets_dir/$preset_id/templates/${template_name}.md"
                        [ -f "$candidate" ] && echo "$candidate" && return 0
                    done <<< "$sorted_presets"
                fi
                # python3 succeeded but registry has no presets — nothing to search
            else
                # python3 failed (missing, or registry parse error) — fall back to unordered directory scan
                for preset in "$presets_dir"/*/; do
                    [ -d "$preset" ] || continue
                    local candidate="$preset/templates/${template_name}.md"
                    [ -f "$candidate" ] && echo "$candidate" && return 0
                done
            fi
        else
            # Fallback: alphabetical directory order (no python3 available)
            for preset in "$presets_dir"/*/; do
                [ -d "$preset" ] || continue
                local candidate="$preset/templates/${template_name}.md"
                [ -f "$candidate" ] && echo "$candidate" && return 0
            done
        fi
    fi

    # Priority 3: Extension-provided templates
    local ext_dir="$repo_root/.specify/extensions"
    if [ -d "$ext_dir" ]; then
        for ext in "$ext_dir"/*/; do
            [ -d "$ext" ] || continue
            # Skip hidden directories (e.g. .backup, .cache)
            case "$(basename "$ext")" in .*) continue;; esac
            local candidate="$ext/templates/${template_name}.md"
            [ -f "$candidate" ] && echo "$candidate" && return 0
        done
    fi

    # Priority 4: Core templates
    local core="$base/${template_name}.md"
    [ -f "$core" ] && echo "$core" && return 0

    # Template not found in any location.
    # Return 1 so callers can distinguish "not found" from "found".
    # Callers running under set -e should use: TEMPLATE=$(resolve_template ...) || true
    return 1
}
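The `|| true` caller pattern noted in the comment above matters under `set -e`: a failing command substitution in an assignment aborts the whole script. A minimal, self-contained sketch of that pattern (with a hypothetical `lookup_template` stub standing in for `resolve_template`):

```shell
set -e
# Stub standing in for resolve_template's "not found" case (exit 1).
lookup_template() { return 1; }

# Without `|| true`, this assignment would abort the script under set -e.
TEMPLATE=$(lookup_template) || true

if [ -n "$TEMPLATE" ]; then
    echo "using template: $TEMPLATE"
else
    echo "no template found; falling back to an empty file"
fi
```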

diff --git a/.specify/scripts/bash/create-new-feature.sh b/.specify/scripts/bash/create-new-feature.sh
new file mode 100755
index 0000000..58c5c86
--- /dev/null
+++ b/.specify/scripts/bash/create-new-feature.sh
@@ -0,0 +1,327 @@
#!/usr/bin/env bash

set -e

JSON_MODE=false
SHORT_NAME=""
BRANCH_NUMBER=""
ARGS=()
i=1
while [ $i -le $# ]; do
    arg="${!i}"
    case "$arg" in
        --json) 
            JSON_MODE=true 
            ;;
        --short-name)
            if [ $((i + 1)) -gt $# ]; then
                echo 'Error: --short-name requires a value' >&2
                exit 1
            fi
            i=$((i + 1))
            next_arg="${!i}"
            # Check if the next argument is another option (starts with --)
            if [[ "$next_arg" == --* ]]; then
                echo 'Error: --short-name requires a value' >&2
                exit 1
            fi
            SHORT_NAME="$next_arg"
            ;;
        --number)
            if [ $((i + 1)) -gt $# ]; then
                echo 'Error: --number requires a value' >&2
                exit 1
            fi
            i=$((i + 1))
            next_arg="${!i}"
            if [[ "$next_arg" == --* ]]; then
                echo 'Error: --number requires a value' >&2
                exit 1
            fi
            BRANCH_NUMBER="$next_arg"
            ;;
        --help|-h) 
            echo "Usage: $0 [--json] [--short-name <name>] [--number N] <feature_description>"
            echo ""
            echo "Options:"
            echo "  --json              Output in JSON format"
            echo "  --short-name <name> Provide a custom short name (2-4 words) for the branch"
            echo "  --number N          Specify branch number manually (overrides auto-detection)"
            echo "  --help, -h          Show this help message"
            echo ""
            echo "Examples:"
            echo "  $0 'Add user authentication system' --short-name 'user-auth'"
            echo "  $0 'Implement OAuth2 integration for API' --number 5"
            exit 0
            ;;
        *) 
            ARGS+=("$arg") 
            ;;
    esac
    i=$((i + 1))
done

FEATURE_DESCRIPTION="${ARGS[*]}"
if [ -z "$FEATURE_DESCRIPTION" ]; then
    echo "Usage: $0 [--json] [--short-name <name>] [--number N] <feature_description>" >&2
    exit 1
fi

# Trim whitespace and validate description is not empty (e.g., user passed only whitespace)
# Use sed rather than xargs: xargs fails on unmatched quotes (e.g. a description containing "I'm")
FEATURE_DESCRIPTION=$(printf '%s' "$FEATURE_DESCRIPTION" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')

if [ -z "$FEATURE_DESCRIPTION" ]; then
    echo "Error: Feature description cannot be empty or contain only whitespace" >&2
    exit 1
fi

# Function to find the repository root by searching for existing project markers
find_repo_root() {
    local dir="$1"
    while [ "$dir" != "/" ]; do
        if [ -d "$dir/.git" ] || [ -d "$dir/.specify" ]; then
            echo "$dir"
            return 0
        fi
        dir="$(dirname "$dir")"
    done
    return 1
}

# Function to get highest number from specs directory
get_highest_from_specs() {
    local specs_dir="$1"
    local highest=0
    
    if [ -d "$specs_dir" ]; then
        for dir in "$specs_dir"/*; do
            [ -d "$dir" ] || continue
            dirname=$(basename "$dir")
            number=$(echo "$dirname" | grep -o '^[0-9]\+' || echo "0")
            number=$((10#$number))
            if [ "$number" -gt "$highest" ]; then
                highest=$number
            fi
        done
    fi
    
    echo "$highest"
}
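As a sanity check, the scan above can be exercised against a throwaway directory tree; this sketch simply repeats the function body so it runs standalone:

```shell
# Standalone copy of the scan above, exercised against a throwaway tree.
get_highest_from_specs() {
    local specs_dir="$1" highest=0 dir name number
    if [ -d "$specs_dir" ]; then
        for dir in "$specs_dir"/*; do
            [ -d "$dir" ] || continue
            name=$(basename "$dir")
            number=$(echo "$name" | grep -o '^[0-9]\+' || echo "0")
            number=$((10#$number))
            if [ "$number" -gt "$highest" ]; then
                highest=$number
            fi
        done
    fi
    echo "$highest"
}

tmp=$(mktemp -d)
mkdir -p "$tmp/001-auth" "$tmp/010-email-patch-import" "$tmp/notes"
highest_found=$(get_highest_from_specs "$tmp")
echo "$highest_found"    # 010 is read as decimal 10, not octal 8
rm -rf "$tmp"
```

Note how the unnumbered `notes` directory falls through to `0` via the `|| echo "0"` branch rather than breaking the scan.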

# Function to get highest number from git branches
get_highest_from_branches() {
    local highest=0
    
    # Get all branches (local and remote)
    branches=$(git branch -a 2>/dev/null || echo "")
    
    if [ -n "$branches" ]; then
        while IFS= read -r branch; do
            # Clean branch name: remove leading markers and remote prefixes
            clean_branch=$(echo "$branch" | sed 's/^[* ]*//; s|^remotes/[^/]*/||')
            
            # Extract feature number if branch matches pattern ###-*
            if echo "$clean_branch" | grep -q '^[0-9]\{3\}-'; then
                number=$(echo "$clean_branch" | grep -o '^[0-9]\{3\}' || echo "0")
                number=$((10#$number))
                if [ "$number" -gt "$highest" ]; then
                    highest=$number
                fi
            fi
        done <<< "$branches"
    fi
    
    echo "$highest"
}

# Function to check existing branches (local and remote) and return next available number
check_existing_branches() {
    local specs_dir="$1"

    # Fetch all remotes to get latest branch info (suppress errors if no remotes)
    git fetch --all --prune >/dev/null 2>&1 || true

    # Get highest number from ALL branches (not just matching short name)
    local highest_branch=$(get_highest_from_branches)

    # Get highest number from ALL specs (not just matching short name)
    local highest_spec=$(get_highest_from_specs "$specs_dir")

    # Take the maximum of both
    local max_num=$highest_branch
    if [ "$highest_spec" -gt "$max_num" ]; then
        max_num=$highest_spec
    fi

    # Return next number
    echo $((max_num + 1))
}

# Function to clean and format a branch name
clean_branch_name() {
    local name="$1"
    echo "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/-\+/-/g' | sed 's/^-//' | sed 's/-$//'
}
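For illustration, the sanitiser above maps arbitrary text to a lowercase hyphenated slug (standalone copy; note that `sed 's/-\+/-/g'` relies on GNU sed's `\+`):

```shell
# Standalone copy of the sanitiser above.
clean_branch_name() {
    echo "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/-\+/-/g' | sed 's/^-//' | sed 's/-$//'
}

slug=$(clean_branch_name "OAuth2  Integration!")
echo "$slug"    # oauth2-integration
```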

# Resolve repository root. Prefer git information when available, but fall back
# to searching for repository markers so the workflow still functions in repositories that
# were initialised with --no-git.
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"

if git rev-parse --show-toplevel >/dev/null 2>&1; then
    REPO_ROOT=$(git rev-parse --show-toplevel)
    HAS_GIT=true
else
    REPO_ROOT="$(find_repo_root "$SCRIPT_DIR")"
    if [ -z "$REPO_ROOT" ]; then
        echo "Error: Could not determine repository root. Please run this script from within the repository." >&2
        exit 1
    fi
    HAS_GIT=false
fi

cd "$REPO_ROOT"

SPECS_DIR="$REPO_ROOT/specs"
mkdir -p "$SPECS_DIR"

# Function to generate branch name with stop word filtering and length filtering
generate_branch_name() {
    local description="$1"
    
    # Common stop words to filter out
    local stop_words="^(i|a|an|the|to|for|of|in|on|at|by|with|from|is|are|was|were|be|been|being|have|has|had|do|does|did|will|would|should|could|can|may|might|must|shall|this|that|these|those|my|your|our|their|want|need|add|get|set)$"
    
    # Convert to lowercase and split into words
    local clean_name=$(echo "$description" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/ /g')
    
    # Filter words: remove stop words and words shorter than 3 chars (unless they're uppercase acronyms in original)
    local meaningful_words=()
    for word in $clean_name; do
        # Skip empty words
        [ -z "$word" ] && continue
        
        # Keep words that are NOT stop words AND (length >= 3 OR are potential acronyms)
        if ! echo "$word" | grep -qiE "$stop_words"; then
            if [ ${#word} -ge 3 ]; then
                meaningful_words+=("$word")
            elif echo "$description" | grep -q "\b${word^^}\b"; then
                # Keep short words if they appear as uppercase in original (likely acronyms)
                meaningful_words+=("$word")
            fi
        fi
    done
    
    # If we have meaningful words, use first 3-4 of them
    if [ ${#meaningful_words[@]} -gt 0 ]; then
        local max_words=3
        if [ ${#meaningful_words[@]} -eq 4 ]; then max_words=4; fi
        
        local result=""
        local count=0
        for word in "${meaningful_words[@]}"; do
            if [ $count -ge $max_words ]; then break; fi
            if [ -n "$result" ]; then result="$result-"; fi
            result="$result$word"
            count=$((count + 1))
        done
        echo "$result"
    else
        # Fallback to original logic if no meaningful words found
        local cleaned=$(clean_branch_name "$description")
        echo "$cleaned" | tr '-' '\n' | grep -v '^$' | head -3 | tr '\n' '-' | sed 's/-$//'
    fi
}
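The filtering above can be condensed into a short sketch to show the intended behaviour. This is an abridged version: the stop-word list is shortened and the acronym exception is omitted.

```shell
# Condensed sketch of the filtering above: abridged stop-word list,
# no acronym exception, first three surviving words joined with hyphens.
stop_words='^(i|a|an|the|to|for|of|in|on|this|that|my|want|need|add|get|set)$'

pick_words() {
    local out="" count=0 word
    for word in $(echo "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/ /g'); do
        echo "$word" | grep -qE "$stop_words" && continue   # drop stop words
        [ ${#word} -ge 3 ] || continue                      # drop short words
        out="${out:+$out-}$word"
        count=$((count + 1))
        [ "$count" -ge 3 ] && break
    done
    echo "$out"
}

suffix=$(pick_words "Add user authentication system")
echo "$suffix"    # user-authentication-system
```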

# Generate branch name
if [ -n "$SHORT_NAME" ]; then
    # Use provided short name, just clean it up
    BRANCH_SUFFIX=$(clean_branch_name "$SHORT_NAME")
else
    # Generate from description with smart filtering
    BRANCH_SUFFIX=$(generate_branch_name "$FEATURE_DESCRIPTION")
fi

# Determine branch number
if [ -z "$BRANCH_NUMBER" ]; then
    if [ "$HAS_GIT" = true ]; then
        # Check existing branches on remotes
        BRANCH_NUMBER=$(check_existing_branches "$SPECS_DIR")
    else
        # Fall back to local directory check
        HIGHEST=$(get_highest_from_specs "$SPECS_DIR")
        BRANCH_NUMBER=$((HIGHEST + 1))
    fi
fi

# Force base-10 interpretation to prevent octal conversion (e.g., 010 → 8 in octal, but should be 10 in decimal)
FEATURE_NUM=$(printf "%03d" "$((10#$BRANCH_NUMBER))")
BRANCH_NAME="${FEATURE_NUM}-${BRANCH_SUFFIX}"
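The `10#` prefix is what keeps renumbering correct past feature 007: bash arithmetic otherwise treats a leading zero as octal. A quick demonstration (run through `bash -c` since `base#n` is a bash-ism):

```shell
# bash arithmetic: a leading zero means octal unless base-10 is forced with 10#
octal=$(bash -c 'echo $((010))')       # 8
decimal=$(bash -c 'echo $((10#010))')  # 10
printf '%03d\n' "$decimal"
```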

# GitHub enforces a 244-byte limit on branch names
# Validate and truncate if necessary
MAX_BRANCH_LENGTH=244
if [ ${#BRANCH_NAME} -gt $MAX_BRANCH_LENGTH ]; then
    # Calculate how much we need to trim from suffix
    # Account for: feature number (3) + hyphen (1) = 4 chars
    MAX_SUFFIX_LENGTH=$((MAX_BRANCH_LENGTH - 4))
    
    # Truncate suffix at word boundary if possible
    TRUNCATED_SUFFIX=$(echo "$BRANCH_SUFFIX" | cut -c1-$MAX_SUFFIX_LENGTH)
    # Remove trailing hyphen if truncation created one
    TRUNCATED_SUFFIX=$(echo "$TRUNCATED_SUFFIX" | sed 's/-$//')
    
    ORIGINAL_BRANCH_NAME="$BRANCH_NAME"
    BRANCH_NAME="${FEATURE_NUM}-${TRUNCATED_SUFFIX}"
    
    >&2 echo "[specify] Warning: Branch name exceeded GitHub's 244-byte limit"
    >&2 echo "[specify] Original: $ORIGINAL_BRANCH_NAME (${#ORIGINAL_BRANCH_NAME} bytes)"
    >&2 echo "[specify] Truncated to: $BRANCH_NAME (${#BRANCH_NAME} bytes)"
fi

if [ "$HAS_GIT" = true ]; then
    if ! git checkout -b "$BRANCH_NAME" 2>/dev/null; then
        # Check if branch already exists
        if git branch --list "$BRANCH_NAME" | grep -q .; then
            >&2 echo "Error: Branch '$BRANCH_NAME' already exists. Please use a different feature name or specify a different number with --number."
            exit 1
        else
            >&2 echo "Error: Failed to create git branch '$BRANCH_NAME'. Please check your git configuration and try again."
            exit 1
        fi
    fi
else
    >&2 echo "[specify] Warning: Git repository not detected; skipped branch creation for $BRANCH_NAME"
fi

FEATURE_DIR="$SPECS_DIR/$BRANCH_NAME"
mkdir -p "$FEATURE_DIR"

TEMPLATE=$(resolve_template "spec-template" "$REPO_ROOT") || true
SPEC_FILE="$FEATURE_DIR/spec.md"
if [ -n "$TEMPLATE" ] && [ -f "$TEMPLATE" ]; then
    cp "$TEMPLATE" "$SPEC_FILE"
else
    echo "Warning: Spec template not found; created empty spec file" >&2
    touch "$SPEC_FILE"
fi

# Inform the user how to persist the feature variable in their own shell
printf '# To persist: export SPECIFY_FEATURE=%q\n' "$BRANCH_NAME" >&2

if $JSON_MODE; then
    if command -v jq >/dev/null 2>&1; then
        jq -cn \
            --arg branch_name "$BRANCH_NAME" \
            --arg spec_file "$SPEC_FILE" \
            --arg feature_num "$FEATURE_NUM" \
            '{BRANCH_NAME:$branch_name,SPEC_FILE:$spec_file,FEATURE_NUM:$feature_num}'
    else
        printf '{"BRANCH_NAME":"%s","SPEC_FILE":"%s","FEATURE_NUM":"%s"}\n' "$(json_escape "$BRANCH_NAME")" "$(json_escape "$SPEC_FILE")" "$(json_escape "$FEATURE_NUM")"
    fi
else
    echo "BRANCH_NAME: $BRANCH_NAME"
    echo "SPEC_FILE: $SPEC_FILE"
    echo "FEATURE_NUM: $FEATURE_NUM"
fi
diff --git a/.specify/scripts/bash/setup-plan.sh b/.specify/scripts/bash/setup-plan.sh
new file mode 100755
index 0000000..9f55231
--- /dev/null
+++ b/.specify/scripts/bash/setup-plan.sh
@@ -0,0 +1,73 @@
#!/usr/bin/env bash

set -e

# Parse command line arguments
JSON_MODE=false
ARGS=()

for arg in "$@"; do
    case "$arg" in
        --json) 
            JSON_MODE=true 
            ;;
        --help|-h) 
            echo "Usage: $0 [--json]"
            echo "  --json    Output results in JSON format"
            echo "  --help    Show this help message"
            exit 0 
            ;;
        *) 
            ARGS+=("$arg") 
            ;;
    esac
done

# Get script directory and load common functions
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"

# Get all paths and variables from common functions
_paths_output=$(get_feature_paths) || { echo "ERROR: Failed to resolve feature paths" >&2; exit 1; }
eval "$_paths_output"
unset _paths_output

# Check if we're on a proper feature branch (only for git repos)
check_feature_branch "$CURRENT_BRANCH" "$HAS_GIT" || exit 1

# Ensure the feature directory exists
mkdir -p "$FEATURE_DIR"

# Copy plan template if it exists
TEMPLATE=$(resolve_template "plan-template" "$REPO_ROOT") || true
if [[ -n "$TEMPLATE" ]] && [[ -f "$TEMPLATE" ]]; then
    cp "$TEMPLATE" "$IMPL_PLAN"
    echo "Copied plan template to $IMPL_PLAN"
else
    echo "Warning: Plan template not found"
    # Create a basic plan file if template doesn't exist
    touch "$IMPL_PLAN"
fi

# Output results
if $JSON_MODE; then
    if has_jq; then
        jq -cn \
            --arg feature_spec "$FEATURE_SPEC" \
            --arg impl_plan "$IMPL_PLAN" \
            --arg specs_dir "$FEATURE_DIR" \
            --arg branch "$CURRENT_BRANCH" \
            --arg has_git "$HAS_GIT" \
            '{FEATURE_SPEC:$feature_spec,IMPL_PLAN:$impl_plan,SPECS_DIR:$specs_dir,BRANCH:$branch,HAS_GIT:$has_git}'
    else
        printf '{"FEATURE_SPEC":"%s","IMPL_PLAN":"%s","SPECS_DIR":"%s","BRANCH":"%s","HAS_GIT":"%s"}\n' \
            "$(json_escape "$FEATURE_SPEC")" "$(json_escape "$IMPL_PLAN")" "$(json_escape "$FEATURE_DIR")" "$(json_escape "$CURRENT_BRANCH")" "$(json_escape "$HAS_GIT")"
    fi
else
    echo "FEATURE_SPEC: $FEATURE_SPEC"
    echo "IMPL_PLAN: $IMPL_PLAN" 
    echo "SPECS_DIR: $FEATURE_DIR"
    echo "BRANCH: $CURRENT_BRANCH"
    echo "HAS_GIT: $HAS_GIT"
fi

diff --git a/.specify/scripts/bash/update-agent-context.sh b/.specify/scripts/bash/update-agent-context.sh
new file mode 100755
index 0000000..74a9866
--- /dev/null
+++ b/.specify/scripts/bash/update-agent-context.sh
@@ -0,0 +1,832 @@
#!/usr/bin/env bash

# Update agent context files with information from plan.md
#
# This script maintains AI agent context files by parsing feature specifications 
# and updating agent-specific configuration files with project information.
#
# MAIN FUNCTIONS:
# 1. Environment Validation
#    - Verifies git repository structure and branch information
#    - Checks for required plan.md files and templates
#    - Validates file permissions and accessibility
#
# 2. Plan Data Extraction
#    - Parses plan.md files to extract project metadata
#    - Identifies language/version, frameworks, databases, and project types
#    - Handles missing or incomplete specification data gracefully
#
# 3. Agent File Management
#    - Creates new agent context files from templates when needed
#    - Updates existing agent files with new project information
#    - Preserves manual additions and custom configurations
#    - Supports multiple AI agent formats and directory structures
#
# 4. Content Generation
#    - Generates language-specific build/test commands
#    - Creates appropriate project directory structures
#    - Updates technology stacks and recent changes sections
#    - Maintains consistent formatting and timestamps
#
# 5. Multi-Agent Support
#    - Handles agent-specific file paths and naming conventions
#    - Supports: Claude, Gemini, Copilot, Cursor, Qwen, opencode, Codex, Windsurf, Kilo Code, Auggie CLI, Roo Code, CodeBuddy CLI, Qoder CLI, Amp, SHAI, Tabnine CLI, Kiro CLI, Mistral Vibe, Kimi Code, Pi Coding Agent, iFlow CLI, Antigravity or Generic
#    - Can update single agents or all existing agent files
#    - Creates default Claude file if no agent files exist
#
# Usage: ./update-agent-context.sh [agent_type]
# Agent types: claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|roo|codebuddy|amp|shai|tabnine|kiro-cli|agy|bob|vibe|qodercli|kimi|trae|pi|iflow|generic
# Leave empty to update all existing agent files

set -e

# Enable strict error handling
set -u
set -o pipefail

#==============================================================================
# Configuration and Global Variables
#==============================================================================

# Get script directory and load common functions
SCRIPT_DIR="$(CDPATH="" cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "$SCRIPT_DIR/common.sh"

# Get all paths and variables from common functions
_paths_output=$(get_feature_paths) || { echo "ERROR: Failed to resolve feature paths" >&2; exit 1; }
eval "$_paths_output"
unset _paths_output

NEW_PLAN="$IMPL_PLAN"  # Alias for compatibility with existing code
AGENT_TYPE="${1:-}"

# Agent-specific file paths  
CLAUDE_FILE="$REPO_ROOT/CLAUDE.md"
GEMINI_FILE="$REPO_ROOT/GEMINI.md"
COPILOT_FILE="$REPO_ROOT/.github/agents/copilot-instructions.md"
CURSOR_FILE="$REPO_ROOT/.cursor/rules/specify-rules.mdc"
QWEN_FILE="$REPO_ROOT/QWEN.md"
AGENTS_FILE="$REPO_ROOT/AGENTS.md"
WINDSURF_FILE="$REPO_ROOT/.windsurf/rules/specify-rules.md"
KILOCODE_FILE="$REPO_ROOT/.kilocode/rules/specify-rules.md"
AUGGIE_FILE="$REPO_ROOT/.augment/rules/specify-rules.md"
ROO_FILE="$REPO_ROOT/.roo/rules/specify-rules.md"
CODEBUDDY_FILE="$REPO_ROOT/CODEBUDDY.md"
QODER_FILE="$REPO_ROOT/QODER.md"
# Amp, Kiro CLI, IBM Bob, and Pi all share AGENTS.md — use AGENTS_FILE to avoid
# updating the same file multiple times.
AMP_FILE="$AGENTS_FILE"
SHAI_FILE="$REPO_ROOT/SHAI.md"
TABNINE_FILE="$REPO_ROOT/TABNINE.md"
KIRO_FILE="$AGENTS_FILE"
AGY_FILE="$REPO_ROOT/.agent/rules/specify-rules.md"
BOB_FILE="$AGENTS_FILE"
VIBE_FILE="$REPO_ROOT/.vibe/agents/specify-agents.md"
KIMI_FILE="$REPO_ROOT/KIMI.md"
TRAE_FILE="$REPO_ROOT/.trae/rules/AGENTS.md"
IFLOW_FILE="$REPO_ROOT/IFLOW.md"

# Template file
TEMPLATE_FILE="$REPO_ROOT/.specify/templates/agent-file-template.md"

# Global variables for parsed plan data
NEW_LANG=""
NEW_FRAMEWORK=""
NEW_DB=""
NEW_PROJECT_TYPE=""

#==============================================================================
# Utility Functions
#==============================================================================

log_info() {
    echo "INFO: $1"
}

log_success() {
    echo "✓ $1"
}

log_error() {
    echo "ERROR: $1" >&2
}

log_warning() {
    echo "WARNING: $1" >&2
}

# Cleanup function for temporary files
cleanup() {
    local exit_code=$?
    # Disarm traps to prevent re-entrant loop
    trap - EXIT INT TERM
    rm -f /tmp/agent_update_*_$$
    rm -f /tmp/manual_additions_$$
    exit $exit_code
}

# Set up cleanup trap
trap cleanup EXIT INT TERM

#==============================================================================
# Validation Functions
#==============================================================================

validate_environment() {
    # Check if we have a current branch/feature (git or non-git)
    if [[ -z "$CURRENT_BRANCH" ]]; then
        log_error "Unable to determine current feature"
        if [[ "$HAS_GIT" == "true" ]]; then
            log_info "Make sure you're on a feature branch"
        else
            log_info "Set SPECIFY_FEATURE environment variable or create a feature first"
        fi
        exit 1
    fi
    
    # Check if plan.md exists
    if [[ ! -f "$NEW_PLAN" ]]; then
        log_error "No plan.md found at $NEW_PLAN"
        log_info "Make sure you're working on a feature with a corresponding spec directory"
        if [[ "$HAS_GIT" != "true" ]]; then
            log_info "Use: export SPECIFY_FEATURE=your-feature-name or create a new feature first"
        fi
        exit 1
    fi
    
    # Check if template exists (needed for new files)
    if [[ ! -f "$TEMPLATE_FILE" ]]; then
        log_warning "Template file not found at $TEMPLATE_FILE"
        log_warning "Creating new agent files will fail"
    fi
}

#==============================================================================
# Plan Parsing Functions
#==============================================================================

extract_plan_field() {
    local field_pattern="$1"
    local plan_file="$2"
    
    grep "^\*\*${field_pattern}\*\*: " "$plan_file" 2>/dev/null | \
        head -1 | \
        sed "s|^\*\*${field_pattern}\*\*: ||" | \
        sed 's/^[ \t]*//;s/[ \t]*$//' | \
        grep -v "NEEDS CLARIFICATION" | \
        grep -v "^N/A$" || echo ""
}
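A quick standalone check of the extractor above, run against a hypothetical two-line sample plan, shows both the happy path and the placeholder filtering:

```shell
# Standalone copy of the extractor, run against a tiny sample plan.
extract_plan_field() {
    local field_pattern="$1" plan_file="$2"
    grep "^\*\*${field_pattern}\*\*: " "$plan_file" 2>/dev/null | \
        head -1 | \
        sed "s|^\*\*${field_pattern}\*\*: ||" | \
        sed 's/^[ \t]*//;s/[ \t]*$//' | \
        grep -v "NEEDS CLARIFICATION" | \
        grep -v "^N/A$" || echo ""
}

plan=$(mktemp)
cat > "$plan" <<'EOF'
**Language/Version**: Python 3.12
**Storage**: NEEDS CLARIFICATION
EOF

lang_field=$(extract_plan_field "Language/Version" "$plan")   # Python 3.12
storage_field=$(extract_plan_field "Storage" "$plan")         # empty: placeholder filtered out
rm -f "$plan"
echo "lang=$lang_field storage=$storage_field"
```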

parse_plan_data() {
    local plan_file="$1"
    
    if [[ ! -f "$plan_file" ]]; then
        log_error "Plan file not found: $plan_file"
        return 1
    fi
    
    if [[ ! -r "$plan_file" ]]; then
        log_error "Plan file is not readable: $plan_file"
        return 1
    fi
    
    log_info "Parsing plan data from $plan_file"
    
    NEW_LANG=$(extract_plan_field "Language/Version" "$plan_file")
    NEW_FRAMEWORK=$(extract_plan_field "Primary Dependencies" "$plan_file")
    NEW_DB=$(extract_plan_field "Storage" "$plan_file")
    NEW_PROJECT_TYPE=$(extract_plan_field "Project Type" "$plan_file")
    
    # Log what we found
    if [[ -n "$NEW_LANG" ]]; then
        log_info "Found language: $NEW_LANG"
    else
        log_warning "No language information found in plan"
    fi
    
    if [[ -n "$NEW_FRAMEWORK" ]]; then
        log_info "Found framework: $NEW_FRAMEWORK"
    fi
    
    if [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]]; then
        log_info "Found database: $NEW_DB"
    fi
    
    if [[ -n "$NEW_PROJECT_TYPE" ]]; then
        log_info "Found project type: $NEW_PROJECT_TYPE"
    fi
}

format_technology_stack() {
    local lang="$1"
    local framework="$2"
    local parts=()
    
    # Add non-empty parts
    [[ -n "$lang" && "$lang" != "NEEDS CLARIFICATION" ]] && parts+=("$lang")
    [[ -n "$framework" && "$framework" != "NEEDS CLARIFICATION" && "$framework" != "N/A" ]] && parts+=("$framework")
    
    # Join with proper formatting
    if [[ ${#parts[@]} -eq 0 ]]; then
        echo ""
    elif [[ ${#parts[@]} -eq 1 ]]; then
        echo "${parts[0]}"
    else
        # Join multiple parts with " + "
        local result="${parts[0]}"
        for ((i=1; i<${#parts[@]}; i++)); do
            result="$result + ${parts[i]}"
        done
        echo "$result"
    fi
}
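The joining behaviour above amounts to: drop empty and placeholder parts, then join the survivors with " + ". A condensed equivalent (this sketch applies both placeholder filters to every part, whereas the function above checks them per-field):

```shell
# Condensed equivalent of the joiner above: skip empty and placeholder
# parts, join the survivors with " + ".
join_stack() {
    local result="" p
    for p in "$@"; do
        case "$p" in ""|"N/A"|"NEEDS CLARIFICATION") continue ;; esac
        result="${result:+$result + }$p"
    done
    echo "$result"
}

both=$(join_stack "Python 3.12" "FastAPI")   # Python 3.12 + FastAPI
one=$(join_stack "Rust 1.75" "N/A")          # Rust 1.75
echo "$both / $one"
```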

#==============================================================================
# Template and Content Generation Functions
#==============================================================================

get_project_structure() {
    local project_type="$1"
    
    if [[ "$project_type" == *"web"* ]]; then
        echo "backend/\\nfrontend/\\ntests/"
    else
        echo "src/\\ntests/"
    fi
}

get_commands_for_language() {
    local lang="$1"
    
    case "$lang" in
        *"Python"*)
            # Escape && for the sed replacement side (an unescaped & expands to the matched text)
            echo "cd src \\&\\& pytest \\&\\& ruff check ."
            ;;
        *"Rust"*)
            echo "cargo test \\&\\& cargo clippy"
            ;;
        *"JavaScript"*|*"TypeScript"*)
            echo "npm test \\&\\& npm run lint"
            ;;
        *)
            echo "# Add commands for $lang"
            ;;
    esac
}

get_language_conventions() {
    local lang="$1"
    echo "$lang: Follow standard conventions"
}

create_new_agent_file() {
    local target_file="$1"
    local temp_file="$2"
    local project_name="$3"
    local current_date="$4"
    
    if [[ ! -f "$TEMPLATE_FILE" ]]; then
        log_error "Template not found at $TEMPLATE_FILE"
        return 1
    fi
    
    if [[ ! -r "$TEMPLATE_FILE" ]]; then
        log_error "Template file is not readable: $TEMPLATE_FILE"
        return 1
    fi
    
    log_info "Creating new agent context file from template..."
    
    if ! cp "$TEMPLATE_FILE" "$temp_file"; then
        log_error "Failed to copy template file"
        return 1
    fi
    
    # Replace template placeholders
    local project_structure
    project_structure=$(get_project_structure "$NEW_PROJECT_TYPE")
    
    local commands
    commands=$(get_commands_for_language "$NEW_LANG")
    
    local language_conventions
    language_conventions=$(get_language_conventions "$NEW_LANG")
    
    # Perform substitutions with error checking using safer approach
    # Escape special characters for sed by using a different delimiter or escaping
    local escaped_lang=$(printf '%s\n' "$NEW_LANG" | sed 's/[\[\.*^$()+{}|]/\\&/g')
    local escaped_framework=$(printf '%s\n' "$NEW_FRAMEWORK" | sed 's/[\[\.*^$()+{}|]/\\&/g')
    local escaped_branch=$(printf '%s\n' "$CURRENT_BRANCH" | sed 's/[\[\.*^$()+{}|]/\\&/g')
    
    # Build technology stack and recent change strings conditionally
    local tech_stack
    if [[ -n "$escaped_lang" && -n "$escaped_framework" ]]; then
        tech_stack="- $escaped_lang + $escaped_framework ($escaped_branch)"
    elif [[ -n "$escaped_lang" ]]; then
        tech_stack="- $escaped_lang ($escaped_branch)"
    elif [[ -n "$escaped_framework" ]]; then
        tech_stack="- $escaped_framework ($escaped_branch)"
    else
        tech_stack="- ($escaped_branch)"
    fi

    local recent_change
    if [[ -n "$escaped_lang" && -n "$escaped_framework" ]]; then
        recent_change="- $escaped_branch: Added $escaped_lang + $escaped_framework"
    elif [[ -n "$escaped_lang" ]]; then
        recent_change="- $escaped_branch: Added $escaped_lang"
    elif [[ -n "$escaped_framework" ]]; then
        recent_change="- $escaped_branch: Added $escaped_framework"
    else
        recent_change="- $escaped_branch: Added"
    fi

    local substitutions=(
        "s|\[PROJECT NAME\]|$project_name|"
        "s|\[DATE\]|$current_date|"
        "s|\[EXTRACTED FROM ALL PLAN.MD FILES\]|$tech_stack|"
        "s|\[ACTUAL STRUCTURE FROM PLANS\]|$project_structure|g"
        "s|\[ONLY COMMANDS FOR ACTIVE TECHNOLOGIES\]|$commands|"
        "s|\[LANGUAGE-SPECIFIC, ONLY FOR LANGUAGES IN USE\]|$language_conventions|"
        "s|\[LAST 3 FEATURES AND WHAT THEY ADDED\]|$recent_change|"
    )
    
    for substitution in "${substitutions[@]}"; do
        if ! sed -i.bak -e "$substitution" "$temp_file"; then
            log_error "Failed to perform substitution: $substitution"
            rm -f "$temp_file" "$temp_file.bak"
            return 1
        fi
    done
    
    # Convert literal \n sequences to actual newlines.
    # Note: $'\n' is required here; $(printf '\n') strips the trailing newline and yields "".
    newline=$'\n'
    sed -i.bak2 "s/\\\\n/\\${newline}/g" "$temp_file"

    # Clean up backup files
    rm -f "$temp_file.bak" "$temp_file.bak2"

    # Prepend Cursor frontmatter for .mdc files so rules are auto-included
    if [[ "$target_file" == *.mdc ]]; then
        local frontmatter_file
        frontmatter_file=$(mktemp) || return 1
        printf '%s\n' "---" "description: Project Development Guidelines" "globs: [\"**/*\"]" "alwaysApply: true" "---" "" > "$frontmatter_file"
        cat "$temp_file" >> "$frontmatter_file"
        mv "$frontmatter_file" "$temp_file"
    fi

    return 0
}
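The escaping step above protects sed metacharacters in values interpolated into the substitution expressions; a standalone check with a hypothetical language string:

```shell
# Standalone check of the escaping used above for sed substitution text.
lang='C++ (g++ 13)'
escaped=$(printf '%s\n' "$lang" | sed 's/[\[\.*^$()+{}|]/\\&/g')
echo "$escaped"    # C\+\+ \(g\+\+ 13\)
```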




update_existing_agent_file() {
    local target_file="$1"
    local current_date="$2"
    
    log_info "Updating existing agent context file..."
    
    # Use a single temporary file for atomic update
    local temp_file
    temp_file=$(mktemp) || {
        log_error "Failed to create temporary file"
        return 1
    }
    
    # Process the file in one pass
    local tech_stack=$(format_technology_stack "$NEW_LANG" "$NEW_FRAMEWORK")
    local new_tech_entries=()
    local new_change_entry=""
    
    # Prepare new technology entries (grep -F: the values are literal strings, not regexes)
    if [[ -n "$tech_stack" ]] && ! grep -qF "$tech_stack" "$target_file"; then
        new_tech_entries+=("- $tech_stack ($CURRENT_BRANCH)")
    fi
    
    if [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]] && [[ "$NEW_DB" != "NEEDS CLARIFICATION" ]] && ! grep -qF "$NEW_DB" "$target_file"; then
        new_tech_entries+=("- $NEW_DB ($CURRENT_BRANCH)")
    fi
    
    # Prepare new change entry
    if [[ -n "$tech_stack" ]]; then
        new_change_entry="- $CURRENT_BRANCH: Added $tech_stack"
    elif [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]] && [[ "$NEW_DB" != "NEEDS CLARIFICATION" ]]; then
        new_change_entry="- $CURRENT_BRANCH: Added $NEW_DB"
    fi
    
    # Check if sections exist in the file
    local has_active_technologies=0
    local has_recent_changes=0
    
    if grep -q "^## Active Technologies" "$target_file" 2>/dev/null; then
        has_active_technologies=1
    fi
    
    if grep -q "^## Recent Changes" "$target_file" 2>/dev/null; then
        has_recent_changes=1
    fi
    
    # Process file line by line
    local in_tech_section=false
    local in_changes_section=false
    local tech_entries_added=false
    local changes_entries_added=false
    local existing_changes_count=0
    local file_ended=false
    
    while IFS= read -r line || [[ -n "$line" ]]; do
        # Handle Active Technologies section
        if [[ "$line" == "## Active Technologies" ]]; then
            echo "$line" >> "$temp_file"
            in_tech_section=true
            continue
        elif [[ $in_tech_section == true ]] && [[ "$line" =~ ^##[[:space:]] ]]; then
            # Add new tech entries before closing the section
            if [[ $tech_entries_added == false ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
                printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
                tech_entries_added=true
            fi
            echo "$line" >> "$temp_file"
            in_tech_section=false
            continue
        elif [[ $in_tech_section == true ]] && [[ -z "$line" ]]; then
            # Add new tech entries before empty line in tech section
            if [[ $tech_entries_added == false ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
                printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
                tech_entries_added=true
            fi
            echo "$line" >> "$temp_file"
            continue
        fi
        
        # Handle Recent Changes section
        if [[ "$line" == "## Recent Changes" ]]; then
            echo "$line" >> "$temp_file"
            # Add new change entry right after the heading
            if [[ -n "$new_change_entry" ]]; then
                echo "$new_change_entry" >> "$temp_file"
            fi
            in_changes_section=true
            changes_entries_added=true
            continue
        elif [[ $in_changes_section == true ]] && [[ "$line" =~ ^##[[:space:]] ]]; then
            echo "$line" >> "$temp_file"
            in_changes_section=false
            continue
        elif [[ $in_changes_section == true ]] && [[ "$line" == "- "* ]]; then
            # Keep only the first 2 existing changes
            if [[ $existing_changes_count -lt 2 ]]; then
                echo "$line" >> "$temp_file"
                # Avoid ((var++)): it returns nonzero when var is 0, which trips set -e
                existing_changes_count=$((existing_changes_count + 1))
            fi
            continue
        fi
        
        # Update timestamp
        if [[ "$line" =~ (\*\*)?Last\ updated(\*\*)?:.*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] ]]; then
            echo "$line" | sed "s/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/$current_date/" >> "$temp_file"
        else
            echo "$line" >> "$temp_file"
        fi
    done < "$target_file"
    
    # Post-loop check: if we're still in the Active Technologies section and haven't added new entries
    if [[ $in_tech_section == true ]] && [[ $tech_entries_added == false ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
        printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
        tech_entries_added=true
    fi
    
    # If sections don't exist, add them at the end of the file
    if [[ $has_active_technologies -eq 0 ]] && [[ ${#new_tech_entries[@]} -gt 0 ]]; then
        echo "" >> "$temp_file"
        echo "## Active Technologies" >> "$temp_file"
        printf '%s\n' "${new_tech_entries[@]}" >> "$temp_file"
        tech_entries_added=true
    fi
    
    if [[ $has_recent_changes -eq 0 ]] && [[ -n "$new_change_entry" ]]; then
        echo "" >> "$temp_file"
        echo "## Recent Changes" >> "$temp_file"
        echo "$new_change_entry" >> "$temp_file"
        changes_entries_added=true
    fi
    
    # Ensure Cursor .mdc files have YAML frontmatter for auto-inclusion
    if [[ "$target_file" == *.mdc ]]; then
        if ! head -1 "$temp_file" | grep -q '^---'; then
            local frontmatter_file
            frontmatter_file=$(mktemp) || { rm -f "$temp_file"; return 1; }
            printf '%s\n' "---" "description: Project Development Guidelines" "globs: [\"**/*\"]" "alwaysApply: true" "---" "" > "$frontmatter_file"
            cat "$temp_file" >> "$frontmatter_file"
            mv "$frontmatter_file" "$temp_file"
        fi
    fi

    # Move temp file to target atomically
    if ! mv "$temp_file" "$target_file"; then
        log_error "Failed to update target file: $target_file"
        rm -f "$temp_file"
        return 1
    fi

    return 0
}
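The `## Recent Changes` handling above keeps the new entry plus at most the first two existing bullets, so the section never grows past three items. A minimal standalone sketch of that trimming (the sample entries and file names here are hypothetical; the real function streams the agent context file through the same loop):

```shell
# Demonstrate the "new entry + first two existing bullets" trimming
# used for the "## Recent Changes" section.
target=$(mktemp)
cat > "$target" <<'EOF'
## Recent Changes
- 003-old: Added old thing
- 002-older: Added older thing
- 001-oldest: Added oldest thing
EOF

new_change_entry="- 004-new: Added new thing"
out=$(mktemp)
count=0
while IFS= read -r line; do
    if [[ "$line" == "## Recent Changes" ]]; then
        # The new entry goes directly under the heading
        printf '%s\n%s\n' "$line" "$new_change_entry" >> "$out"
    elif [[ "$line" == "- "* ]]; then
        # Keep only the first 2 existing changes (3 total with the new one)
        if [[ $count -lt 2 ]]; then
            echo "$line" >> "$out"
            count=$((count + 1))
        fi
    else
        echo "$line" >> "$out"
    fi
done < "$target"

result=$(cat "$out")
echo "$result"
rm -f "$target" "$out"
```

The oldest entry (`001-oldest`) is dropped, which is what keeps the agent file's change log bounded across repeated runs.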

#==============================================================================
# Main Agent File Update Function
#==============================================================================

update_agent_file() {
    local target_file="$1"
    local agent_name="$2"
    
    if [[ -z "$target_file" ]] || [[ -z "$agent_name" ]]; then
        log_error "update_agent_file requires target_file and agent_name parameters"
        return 1
    fi
    
    log_info "Updating $agent_name context file: $target_file"
    
    local project_name
    project_name=$(basename "$REPO_ROOT")
    local current_date
    current_date=$(date +%Y-%m-%d)
    
    # Create directory if it doesn't exist
    local target_dir
    target_dir=$(dirname "$target_file")
    if [[ ! -d "$target_dir" ]]; then
        if ! mkdir -p "$target_dir"; then
            log_error "Failed to create directory: $target_dir"
            return 1
        fi
    fi
    
    if [[ ! -f "$target_file" ]]; then
        # Create new file from template
        local temp_file
        temp_file=$(mktemp) || {
            log_error "Failed to create temporary file"
            return 1
        }
        
        if create_new_agent_file "$target_file" "$temp_file" "$project_name" "$current_date"; then
            if mv "$temp_file" "$target_file"; then
                log_success "Created new $agent_name context file"
            else
                log_error "Failed to move temporary file to $target_file"
                rm -f "$temp_file"
                return 1
            fi
        else
            log_error "Failed to create new agent file"
            rm -f "$temp_file"
            return 1
        fi
    else
        # Update existing file
        if [[ ! -r "$target_file" ]]; then
            log_error "Cannot read existing file: $target_file"
            return 1
        fi
        
        if [[ ! -w "$target_file" ]]; then
            log_error "Cannot write to existing file: $target_file"
            return 1
        fi
        
        if update_existing_agent_file "$target_file" "$current_date"; then
            log_success "Updated existing $agent_name context file"
        else
            log_error "Failed to update existing agent file"
            return 1
        fi
    fi
    
    return 0
}

#==============================================================================
# Agent Selection and Processing
#==============================================================================

update_specific_agent() {
    local agent_type="$1"
    
    case "$agent_type" in
        claude)
            update_agent_file "$CLAUDE_FILE" "Claude Code" || return 1
            ;;
        gemini)
            update_agent_file "$GEMINI_FILE" "Gemini CLI" || return 1
            ;;
        copilot)
            update_agent_file "$COPILOT_FILE" "GitHub Copilot" || return 1
            ;;
        cursor-agent)
            update_agent_file "$CURSOR_FILE" "Cursor IDE" || return 1
            ;;
        qwen)
            update_agent_file "$QWEN_FILE" "Qwen Code" || return 1
            ;;
        opencode)
            update_agent_file "$AGENTS_FILE" "opencode" || return 1
            ;;
        codex)
            update_agent_file "$AGENTS_FILE" "Codex CLI" || return 1
            ;;
        windsurf)
            update_agent_file "$WINDSURF_FILE" "Windsurf" || return 1
            ;;
        kilocode)
            update_agent_file "$KILOCODE_FILE" "Kilo Code" || return 1
            ;;
        auggie)
            update_agent_file "$AUGGIE_FILE" "Auggie CLI" || return 1
            ;;
        roo)
            update_agent_file "$ROO_FILE" "Roo Code" || return 1
            ;;
        codebuddy)
            update_agent_file "$CODEBUDDY_FILE" "CodeBuddy CLI" || return 1
            ;;
        qodercli)
            update_agent_file "$QODER_FILE" "Qoder CLI" || return 1
            ;;
        amp)
            update_agent_file "$AMP_FILE" "Amp" || return 1
            ;;
        shai)
            update_agent_file "$SHAI_FILE" "SHAI" || return 1
            ;;
        tabnine)
            update_agent_file "$TABNINE_FILE" "Tabnine CLI" || return 1
            ;;
        kiro-cli)
            update_agent_file "$KIRO_FILE" "Kiro CLI" || return 1
            ;;
        agy)
            update_agent_file "$AGY_FILE" "Antigravity" || return 1
            ;;
        bob)
            update_agent_file "$BOB_FILE" "IBM Bob" || return 1
            ;;
        vibe)
            update_agent_file "$VIBE_FILE" "Mistral Vibe" || return 1
            ;;
        kimi)
            update_agent_file "$KIMI_FILE" "Kimi Code" || return 1
            ;;
        trae)
            update_agent_file "$TRAE_FILE" "Trae" || return 1
            ;;
        pi)
            update_agent_file "$AGENTS_FILE" "Pi Coding Agent" || return 1
            ;;
        iflow)
            update_agent_file "$IFLOW_FILE" "iFlow CLI" || return 1
            ;;
        generic)
            log_info "Generic agent: no predefined context file. Use the agent-specific update script for your agent."
            ;;
        *)
            log_error "Unknown agent type '$agent_type'"
            log_error "Expected: claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|roo|codebuddy|amp|shai|tabnine|kiro-cli|agy|bob|vibe|qodercli|kimi|trae|pi|iflow|generic"
            exit 1
            ;;
    esac
}

# Helper: skip non-existent files and files already updated (dedup by
# realpath so that variables pointing to the same file — e.g. AMP_FILE,
# KIRO_FILE, BOB_FILE all resolving to AGENTS_FILE — are only written once).
# Uses a linear array instead of associative array for bash 3.2 compatibility.
# Note: defined at top level because bash 3.2 does not support true
# nested/local functions. _updated_paths, _found_agent, and _all_ok are
# initialised exclusively inside update_all_existing_agents so that
# sourcing this script has no side effects on the caller's environment.

_update_if_new() {
    local file="$1" name="$2"
    [[ -f "$file" ]] || return 0
    local real_path
    real_path=$(realpath "$file" 2>/dev/null || echo "$file")
    local p
    if [[ ${#_updated_paths[@]} -gt 0 ]]; then
        for p in "${_updated_paths[@]}"; do
            [[ "$p" == "$real_path" ]] && return 0
        done
    fi
    # Record the file as seen before attempting the update so that:
    # (a) aliases pointing to the same path are not retried on failure
    # (b) _found_agent reflects file existence, not update success
    _updated_paths+=("$real_path")
    _found_agent=true
    update_agent_file "$file" "$name"
}
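The realpath comparison above is what makes alias variables safe: a symlink and its target resolve to the same canonical path, so only the first name triggers an update. A toy sketch of that check under throwaway temp paths (the file names are illustrative, not the real agent context files):

```shell
# Show how realpath-based dedup treats a symlink alias and its target
# as one file, so the second "update" attempt is skipped.
dir=$(mktemp -d)
echo "shared context" > "$dir/AGENTS.md"
ln -s "$dir/AGENTS.md" "$dir/CLAUDE.md"   # alias resolving to the same file

updated_paths=()
update_if_new() {
    local real p
    real=$(realpath "$1" 2>/dev/null || echo "$1")
    for p in "${updated_paths[@]}"; do
        # Already handled under another name: skip it
        [[ "$p" == "$real" ]] && { echo "skip $(basename "$1")"; return 0; }
    done
    updated_paths+=("$real")
    echo "update $(basename "$1")"
}

update_if_new "$dir/AGENTS.md" >> "$dir/log"
update_if_new "$dir/CLAUDE.md" >> "$dir/log"
actions=$(cat "$dir/log")
echo "$actions"
rm -rf "$dir"
```

Note the function is called directly rather than via command substitution, since a `$( )` subshell would discard the `updated_paths` mutation; the real `_update_if_new` relies on the same in-shell behavior.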

update_all_existing_agents() {
    _found_agent=false
    _updated_paths=()
    local _all_ok=true

    _update_if_new "$CLAUDE_FILE" "Claude Code"           || _all_ok=false
    _update_if_new "$GEMINI_FILE" "Gemini CLI"             || _all_ok=false
    _update_if_new "$COPILOT_FILE" "GitHub Copilot"        || _all_ok=false
    _update_if_new "$CURSOR_FILE" "Cursor IDE"             || _all_ok=false
    _update_if_new "$QWEN_FILE" "Qwen Code"                || _all_ok=false
    _update_if_new "$AGENTS_FILE" "Codex/opencode"         || _all_ok=false
    _update_if_new "$AMP_FILE" "Amp"                       || _all_ok=false
    _update_if_new "$KIRO_FILE" "Kiro CLI"                 || _all_ok=false
    _update_if_new "$BOB_FILE" "IBM Bob"                   || _all_ok=false
    _update_if_new "$WINDSURF_FILE" "Windsurf"             || _all_ok=false
    _update_if_new "$KILOCODE_FILE" "Kilo Code"            || _all_ok=false
    _update_if_new "$AUGGIE_FILE" "Auggie CLI"             || _all_ok=false
    _update_if_new "$ROO_FILE" "Roo Code"                  || _all_ok=false
    _update_if_new "$CODEBUDDY_FILE" "CodeBuddy CLI"       || _all_ok=false
    _update_if_new "$SHAI_FILE" "SHAI"                     || _all_ok=false
    _update_if_new "$TABNINE_FILE" "Tabnine CLI"           || _all_ok=false
    _update_if_new "$QODER_FILE" "Qoder CLI"               || _all_ok=false
    _update_if_new "$AGY_FILE" "Antigravity"               || _all_ok=false
    _update_if_new "$VIBE_FILE" "Mistral Vibe"             || _all_ok=false
    _update_if_new "$KIMI_FILE" "Kimi Code"                || _all_ok=false
    _update_if_new "$TRAE_FILE" "Trae"                     || _all_ok=false
    _update_if_new "$IFLOW_FILE" "iFlow CLI"               || _all_ok=false

    # If no agent files exist, create a default Claude file
    if [[ "$_found_agent" == false ]]; then
        log_info "No existing agent files found, creating default Claude file..."
        update_agent_file "$CLAUDE_FILE" "Claude Code" || return 1
    fi

    [[ "$_all_ok" == true ]]
}

print_summary() {
    echo
    log_info "Summary of changes:"
    
    if [[ -n "$NEW_LANG" ]]; then
        echo "  - Added language: $NEW_LANG"
    fi
    
    if [[ -n "$NEW_FRAMEWORK" ]]; then
        echo "  - Added framework: $NEW_FRAMEWORK"
    fi
    
    if [[ -n "$NEW_DB" ]] && [[ "$NEW_DB" != "N/A" ]]; then
        echo "  - Added database: $NEW_DB"
    fi
    
    echo
    log_info "Usage: $0 [claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|roo|codebuddy|amp|shai|tabnine|kiro-cli|agy|bob|vibe|qodercli|kimi|trae|pi|iflow|generic]"
}

#==============================================================================
# Main Execution
#==============================================================================

main() {
    # Validate environment before proceeding
    validate_environment
    
    log_info "=== Updating agent context files for feature $CURRENT_BRANCH ==="
    
    # Parse the plan file to extract project information
    if ! parse_plan_data "$NEW_PLAN"; then
        log_error "Failed to parse plan data"
        exit 1
    fi
    
    # Process based on agent type argument
    local success=true
    
    if [[ -z "$AGENT_TYPE" ]]; then
        # No specific agent provided - update all existing agent files
        log_info "No agent specified, updating all existing agent files..."
        if ! update_all_existing_agents; then
            success=false
        fi
    else
        # Specific agent provided - update only that agent
        log_info "Updating specific agent: $AGENT_TYPE"
        if ! update_specific_agent "$AGENT_TYPE"; then
            success=false
        fi
    fi
    
    # Print summary
    print_summary
    
    if [[ "$success" == true ]]; then
        log_success "Agent context update completed successfully"
        exit 0
    else
        log_error "Agent context update completed with errors"
        exit 1
    fi
}

# Execute main function if script is run directly
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
    main "$@"
fi
diff --git a/.specify/templates/agent-file-template.md b/.specify/templates/agent-file-template.md
new file mode 100644
index 0000000..4cc7fd6
--- /dev/null
+++ b/.specify/templates/agent-file-template.md
@@ -0,0 +1,28 @@
# [PROJECT NAME] Development Guidelines

Auto-generated from all feature plans. Last updated: [DATE]

## Active Technologies

[EXTRACTED FROM ALL PLAN.MD FILES]

## Project Structure

```text
[ACTUAL STRUCTURE FROM PLANS]
```

## Commands

[ONLY COMMANDS FOR ACTIVE TECHNOLOGIES]

## Code Style

[LANGUAGE-SPECIFIC, ONLY FOR LANGUAGES IN USE]

## Recent Changes

[LAST 3 FEATURES AND WHAT THEY ADDED]

<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->
diff --git a/.specify/templates/checklist-template.md b/.specify/templates/checklist-template.md
new file mode 100644
index 0000000..806657d
--- /dev/null
+++ b/.specify/templates/checklist-template.md
@@ -0,0 +1,40 @@
# [CHECKLIST TYPE] Checklist: [FEATURE NAME]

**Purpose**: [Brief description of what this checklist covers]
**Created**: [DATE]
**Feature**: [Link to spec.md or relevant documentation]

**Note**: This checklist is generated by the `/speckit.checklist` command based on feature context and requirements.

<!-- 
  ============================================================================
  IMPORTANT: The checklist items below are SAMPLE ITEMS for illustration only.
  
  The /speckit.checklist command MUST replace these with actual items based on:
  - User's specific checklist request
  - Feature requirements from spec.md
  - Technical context from plan.md
  - Implementation details from tasks.md
  
  DO NOT keep these sample items in the generated checklist file.
  ============================================================================
-->

## [Category 1]

- [ ] CHK001 First checklist item with clear action
- [ ] CHK002 Second checklist item
- [ ] CHK003 Third checklist item

## [Category 2]

- [ ] CHK004 Another category item
- [ ] CHK005 Item with specific criteria
- [ ] CHK006 Final item in this category

## Notes

- Check items off as completed: `[x]`
- Add comments or findings inline
- Link to relevant resources or documentation
- Items are numbered sequentially for easy reference
diff --git a/.specify/templates/constitution-template.md b/.specify/templates/constitution-template.md
new file mode 100644
index 0000000..a4670ff
--- /dev/null
+++ b/.specify/templates/constitution-template.md
@@ -0,0 +1,50 @@
# [PROJECT_NAME] Constitution
<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->

## Core Principles

### [PRINCIPLE_1_NAME]
<!-- Example: I. Library-First -->
[PRINCIPLE_1_DESCRIPTION]
<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->

### [PRINCIPLE_2_NAME]
<!-- Example: II. CLI Interface -->
[PRINCIPLE_2_DESCRIPTION]
<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->

### [PRINCIPLE_3_NAME]
<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
[PRINCIPLE_3_DESCRIPTION]
<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->

### [PRINCIPLE_4_NAME]
<!-- Example: IV. Integration Testing -->
[PRINCIPLE_4_DESCRIPTION]
<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->

### [PRINCIPLE_5_NAME]
<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
[PRINCIPLE_5_DESCRIPTION]
<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->

## [SECTION_2_NAME]
<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->

[SECTION_2_CONTENT]
<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->

## [SECTION_3_NAME]
<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->

[SECTION_3_CONTENT]
<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->

## Governance
<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->

[GOVERNANCE_RULES]
<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->

**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->
diff --git a/.specify/templates/plan-template.md b/.specify/templates/plan-template.md
new file mode 100644
index 0000000..5a2fafe
--- /dev/null
+++ b/.specify/templates/plan-template.md
@@ -0,0 +1,104 @@
# Implementation Plan: [FEATURE]

**Branch**: `[###-feature-name]` | **Date**: [DATE] | **Spec**: [link]
**Input**: Feature specification from `/specs/[###-feature-name]/spec.md`

**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/plan-template.md` for the execution workflow.

## Summary

[Extract from feature spec: primary requirement + technical approach from research]

## Technical Context

<!--
  ACTION REQUIRED: Replace the content in this section with the technical details
  for the project. The structure here is presented in advisory capacity to guide
  the iteration process.
-->

**Language/Version**: [e.g., Python 3.11, Swift 5.9, Rust 1.75 or NEEDS CLARIFICATION]  
**Primary Dependencies**: [e.g., FastAPI, UIKit, LLVM or NEEDS CLARIFICATION]  
**Storage**: [if applicable, e.g., PostgreSQL, CoreData, files or N/A]  
**Testing**: [e.g., pytest, XCTest, cargo test or NEEDS CLARIFICATION]  
**Target Platform**: [e.g., Linux server, iOS 15+, WASM or NEEDS CLARIFICATION]  
**Project Type**: [e.g., library/cli/web-service/mobile-app/compiler/desktop-app or NEEDS CLARIFICATION]  
**Performance Goals**: [domain-specific, e.g., 1000 req/s, 10k lines/sec, 60 fps or NEEDS CLARIFICATION]  
**Constraints**: [domain-specific, e.g., <200ms p95, <100MB memory, offline-capable or NEEDS CLARIFICATION]  
**Scale/Scope**: [domain-specific, e.g., 10k users, 1M LOC, 50 screens or NEEDS CLARIFICATION]

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

[Gates determined based on constitution file]

## Project Structure

### Documentation (this feature)

```text
specs/[###-feature]/
├── plan.md              # This file (/speckit.plan command output)
├── research.md          # Phase 0 output (/speckit.plan command)
├── data-model.md        # Phase 1 output (/speckit.plan command)
├── quickstart.md        # Phase 1 output (/speckit.plan command)
├── contracts/           # Phase 1 output (/speckit.plan command)
└── tasks.md             # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```

### Source Code (repository root)
<!--
  ACTION REQUIRED: Replace the placeholder tree below with the concrete layout
  for this feature. Delete unused options and expand the chosen structure with
  real paths (e.g., apps/admin, packages/something). The delivered plan must
  not include Option labels.
-->

```text
# [REMOVE IF UNUSED] Option 1: Single project (DEFAULT)
src/
├── models/
├── services/
├── cli/
└── lib/

tests/
├── contract/
├── integration/
└── unit/

# [REMOVE IF UNUSED] Option 2: Web application (when "frontend" + "backend" detected)
backend/
├── src/
│   ├── models/
│   ├── services/
│   └── api/
└── tests/

frontend/
├── src/
│   ├── components/
│   ├── pages/
│   └── services/
└── tests/

# [REMOVE IF UNUSED] Option 3: Mobile + API (when "iOS/Android" detected)
api/
└── [same as backend above]

ios/ or android/
└── [platform-specific structure: feature modules, UI flows, platform tests]
```

**Structure Decision**: [Document the selected structure and reference the real
directories captured above]

## Complexity Tracking

> **Fill ONLY if Constitution Check has violations that must be justified**

| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |
diff --git a/.specify/templates/spec-template.md b/.specify/templates/spec-template.md
new file mode 100644
index 0000000..c67d914
--- /dev/null
+++ b/.specify/templates/spec-template.md
@@ -0,0 +1,115 @@
# Feature Specification: [FEATURE NAME]

**Feature Branch**: `[###-feature-name]`  
**Created**: [DATE]  
**Status**: Draft  
**Input**: User description: "$ARGUMENTS"

## User Scenarios & Testing *(mandatory)*

<!--
  IMPORTANT: User stories should be PRIORITIZED as user journeys ordered by importance.
  Each user story/journey must be INDEPENDENTLY TESTABLE - meaning if you implement just ONE of them,
  you should still have a viable MVP (Minimum Viable Product) that delivers value.
  
  Assign priorities (P1, P2, P3, etc.) to each story, where P1 is the most critical.
  Think of each story as a standalone slice of functionality that can be:
  - Developed independently
  - Tested independently
  - Deployed independently
  - Demonstrated to users independently
-->

### User Story 1 - [Brief Title] (Priority: P1)

[Describe this user journey in plain language]

**Why this priority**: [Explain the value and why it has this priority level]

**Independent Test**: [Describe how this can be tested independently - e.g., "Can be fully tested by [specific action] and delivers [specific value]"]

**Acceptance Scenarios**:

1. **Given** [initial state], **When** [action], **Then** [expected outcome]
2. **Given** [initial state], **When** [action], **Then** [expected outcome]

---

### User Story 2 - [Brief Title] (Priority: P2)

[Describe this user journey in plain language]

**Why this priority**: [Explain the value and why it has this priority level]

**Independent Test**: [Describe how this can be tested independently]

**Acceptance Scenarios**:

1. **Given** [initial state], **When** [action], **Then** [expected outcome]

---

### User Story 3 - [Brief Title] (Priority: P3)

[Describe this user journey in plain language]

**Why this priority**: [Explain the value and why it has this priority level]

**Independent Test**: [Describe how this can be tested independently]

**Acceptance Scenarios**:

1. **Given** [initial state], **When** [action], **Then** [expected outcome]

---

[Add more user stories as needed, each with an assigned priority]

### Edge Cases

<!--
  ACTION REQUIRED: The content in this section represents placeholders.
  Fill them out with the right edge cases.
-->

- What happens when [boundary condition]?
- How does the system handle [error scenario]?

## Requirements *(mandatory)*

<!--
  ACTION REQUIRED: The content in this section represents placeholders.
  Fill them out with the right functional requirements.
-->

### Functional Requirements

- **FR-001**: System MUST [specific capability, e.g., "allow users to create accounts"]
- **FR-002**: System MUST [specific capability, e.g., "validate email addresses"]  
- **FR-003**: Users MUST be able to [key interaction, e.g., "reset their password"]
- **FR-004**: System MUST [data requirement, e.g., "persist user preferences"]
- **FR-005**: System MUST [behavior, e.g., "log all security events"]

*Example of marking unclear requirements:*

- **FR-006**: System MUST authenticate users via [NEEDS CLARIFICATION: auth method not specified - email/password, SSO, OAuth?]
- **FR-007**: System MUST retain user data for [NEEDS CLARIFICATION: retention period not specified]

### Key Entities *(include if feature involves data)*

- **[Entity 1]**: [What it represents, key attributes without implementation]
- **[Entity 2]**: [What it represents, relationships to other entities]

## Success Criteria *(mandatory)*

<!--
  ACTION REQUIRED: Define measurable success criteria.
  These must be technology-agnostic and measurable.
-->

### Measurable Outcomes

- **SC-001**: [Measurable metric, e.g., "Users can complete account creation in under 2 minutes"]
- **SC-002**: [Measurable metric, e.g., "System handles 1000 concurrent users without degradation"]
- **SC-003**: [User satisfaction metric, e.g., "90% of users successfully complete primary task on first attempt"]
- **SC-004**: [Business metric, e.g., "Reduce support tickets related to [X] by 50%"]
diff --git a/.specify/templates/tasks-template.md b/.specify/templates/tasks-template.md
new file mode 100644
index 0000000..60f9be4
--- /dev/null
+++ b/.specify/templates/tasks-template.md
@@ -0,0 +1,251 @@
---
description: "Task list template for feature implementation"
---

# Tasks: [FEATURE NAME]

**Input**: Design documents from `/specs/[###-feature-name]/`
**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/

**Tests**: The examples below include test tasks. Tests are OPTIONAL - only include them if explicitly requested in the feature specification.

**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

## Path Conventions

- **Single project**: `src/`, `tests/` at repository root
- **Web app**: `backend/src/`, `frontend/src/`
- **Mobile**: `api/src/`, `ios/src/` or `android/src/`
- Paths shown below assume single project - adjust based on plan.md structure

<!-- 
  ============================================================================
  IMPORTANT: The tasks below are SAMPLE TASKS for illustration purposes only.
  
  The /speckit.tasks command MUST replace these with actual tasks based on:
  - User stories from spec.md (with their priorities P1, P2, P3...)
  - Feature requirements from plan.md
  - Entities from data-model.md
  - Endpoints from contracts/
  
  Tasks MUST be organized by user story so each story can be:
  - Implemented independently
  - Tested independently
  - Delivered as an MVP increment
  
  DO NOT keep these sample tasks in the generated tasks.md file.
  ============================================================================
-->

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Project initialization and basic structure

- [ ] T001 Create project structure per implementation plan
- [ ] T002 Initialize [language] project with [framework] dependencies
- [ ] T003 [P] Configure linting and formatting tools

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented

**⚠️ CRITICAL**: No user story work can begin until this phase is complete

Examples of foundational tasks (adjust based on your project):

- [ ] T004 Setup database schema and migrations framework
- [ ] T005 [P] Implement authentication/authorization framework
- [ ] T006 [P] Setup API routing and middleware structure
- [ ] T007 Create base models/entities that all stories depend on
- [ ] T008 Configure error handling and logging infrastructure
- [ ] T009 Setup environment configuration management

**Checkpoint**: Foundation ready - user story implementation can now begin in parallel

---

## Phase 3: User Story 1 - [Title] (Priority: P1) 🎯 MVP

**Goal**: [Brief description of what this story delivers]

**Independent Test**: [How to verify this story works on its own]

### Tests for User Story 1 (OPTIONAL - only if tests requested) ⚠️

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**

- [ ] T010 [P] [US1] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T011 [P] [US1] Integration test for [user journey] in tests/integration/test_[name].py

### Implementation for User Story 1

- [ ] T012 [P] [US1] Create [Entity1] model in src/models/[entity1].py
- [ ] T013 [P] [US1] Create [Entity2] model in src/models/[entity2].py
- [ ] T014 [US1] Implement [Service] in src/services/[service].py (depends on T012, T013)
- [ ] T015 [US1] Implement [endpoint/feature] in src/[location]/[file].py
- [ ] T016 [US1] Add validation and error handling
- [ ] T017 [US1] Add logging for user story 1 operations

**Checkpoint**: At this point, User Story 1 should be fully functional and testable independently

---

## Phase 4: User Story 2 - [Title] (Priority: P2)

**Goal**: [Brief description of what this story delivers]

**Independent Test**: [How to verify this story works on its own]

### Tests for User Story 2 (OPTIONAL - only if tests requested) ⚠️

- [ ] T018 [P] [US2] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T019 [P] [US2] Integration test for [user journey] in tests/integration/test_[name].py

### Implementation for User Story 2

- [ ] T020 [P] [US2] Create [Entity] model in src/models/[entity].py
- [ ] T021 [US2] Implement [Service] in src/services/[service].py
- [ ] T022 [US2] Implement [endpoint/feature] in src/[location]/[file].py
- [ ] T023 [US2] Integrate with User Story 1 components (if needed)

**Checkpoint**: At this point, User Stories 1 AND 2 should both work independently

---

## Phase 5: User Story 3 - [Title] (Priority: P3)

**Goal**: [Brief description of what this story delivers]

**Independent Test**: [How to verify this story works on its own]

### Tests for User Story 3 (OPTIONAL - only if tests requested) ⚠️

- [ ] T024 [P] [US3] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T025 [P] [US3] Integration test for [user journey] in tests/integration/test_[name].py

### Implementation for User Story 3

- [ ] T026 [P] [US3] Create [Entity] model in src/models/[entity].py
- [ ] T027 [US3] Implement [Service] in src/services/[service].py
- [ ] T028 [US3] Implement [endpoint/feature] in src/[location]/[file].py

**Checkpoint**: All user stories should now be independently functional

---

[Add more user story phases as needed, following the same pattern]

---

## Phase N: Polish & Cross-Cutting Concerns

**Purpose**: Improvements that affect multiple user stories

- [ ] TXXX [P] Documentation updates in docs/
- [ ] TXXX Code cleanup and refactoring
- [ ] TXXX Performance optimization across all stories
- [ ] TXXX [P] Additional unit tests (if requested) in tests/unit/
- [ ] TXXX Security hardening
- [ ] TXXX Run quickstart.md validation

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
- **User Stories (Phase 3+)**: All depend on Foundational phase completion
  - User stories can then proceed in parallel (if staffed)
  - Or sequentially in priority order (P1 → P2 → P3)
- **Polish (Final Phase)**: Depends on all desired user stories being complete

### User Story Dependencies

- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
- **User Story 2 (P2)**: Can start after Foundational (Phase 2) - May integrate with US1 but should be independently testable
- **User Story 3 (P3)**: Can start after Foundational (Phase 2) - May integrate with US1/US2 but should be independently testable

### Within Each User Story

- Tests (if included) MUST be written and FAIL before implementation
- Models before services
- Services before endpoints
- Core implementation before integration
- Story complete before moving to next priority

### Parallel Opportunities

- All Setup tasks marked [P] can run in parallel
- All Foundational tasks marked [P] can run in parallel (within Phase 2)
- Once Foundational phase completes, all user stories can start in parallel (if team capacity allows)
- All tests for a user story marked [P] can run in parallel
- Models within a story marked [P] can run in parallel
- Different user stories can be worked on in parallel by different team members

---

## Parallel Example: User Story 1

```bash
# Launch all tests for User Story 1 together (if tests requested):
Task: "Contract test for [endpoint] in tests/contract/test_[name].py"
Task: "Integration test for [user journey] in tests/integration/test_[name].py"

# Launch all models for User Story 1 together:
Task: "Create [Entity1] model in src/models/[entity1].py"
Task: "Create [Entity2] model in src/models/[entity2].py"
```

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup
2. Complete Phase 2: Foundational (CRITICAL - blocks all stories)
3. Complete Phase 3: User Story 1
4. **STOP and VALIDATE**: Test User Story 1 independently
5. Deploy/demo if ready

### Incremental Delivery

1. Complete Setup + Foundational → Foundation ready
2. Add User Story 1 → Test independently → Deploy/Demo (MVP!)
3. Add User Story 2 → Test independently → Deploy/Demo
4. Add User Story 3 → Test independently → Deploy/Demo
5. Each story adds value without breaking previous stories

### Parallel Team Strategy

With multiple developers:

1. Team completes Setup + Foundational together
2. Once Foundational is done:
   - Developer A: User Story 1
   - Developer B: User Story 2
   - Developer C: User Story 3
3. Stories complete and integrate independently

---

## Notes

- [P] tasks = different files, no dependencies
- [Story] label maps task to specific user story for traceability
- Each user story should be independently completable and testable
- Verify tests fail before implementing
- Commit after each task or logical group
- Stop at any checkpoint to validate story independently
- Avoid: vague tasks, same file conflicts, cross-story dependencies that break independence
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..98a95ff
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,35 @@
# git-collab Development Guidelines

Auto-generated from all feature plans. Last updated: 2026-03-21

## Active Technologies
- Rust 2021 edition + git2 0.19, clap 4 (derive), ed25519-dalek 2, base64 0.22, serde/serde_json 1, dirs 5, thiserror 2 (003-key-trust-allowlist)
- Git refs under `.git/refs/collab/`, trusted keys file at `.git/collab/trusted-keys` (plain text, not a git object) (003-key-trust-allowlist)
- Rust 2021 edition + ratatui 0.30, crossterm 0.29, git2 0.19 (004-dashboard-filtering)
- N/A (ephemeral filter state, no persistence) (004-dashboard-filtering)
- Rust 2021 edition + git2 0.19, clap 4, serde/serde_json 1, chrono 0.4, thiserror 2. New: `ed25519-dalek`, `rand`, `base64` (001-gpg-event-signing)

## Project Structure

```text
src/
tests/
```

## Commands

`cargo test`, `cargo clippy`

## Code Style

Rust 2021 edition: Follow standard conventions

## Recent Changes
- 004-dashboard-filtering: Added Rust 2021 edition + ratatui 0.30, crossterm 0.29, git2 0.19
- 003-key-trust-allowlist: Added Rust 2021 edition + git2 0.19, clap 4 (derive), ed25519-dalek 2, base64 0.22, serde/serde_json 1, dirs 5, thiserror 2
- 001-gpg-event-signing: Added Rust 2021 edition + git2 0.19, clap 4, serde/serde_json 1, chrono 0.4, thiserror 2. New: `ed25519-dalek`, `rand`, `base64`

<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->
diff --git a/specs/001-gpg-event-signing/.analyze-done b/specs/001-gpg-event-signing/.analyze-done
new file mode 100644
index 0000000..4f23f1c
--- /dev/null
+++ b/specs/001-gpg-event-signing/.analyze-done
@@ -0,0 +1,3 @@
Analysis completed: 2026-03-21
Findings: 11 (0 CRITICAL, 1 HIGH, 5 MEDIUM, 5 LOW)
Coverage: 91% (10/11 requirements fully covered)
diff --git a/specs/001-gpg-event-signing/.verify-done b/specs/001-gpg-event-signing/.verify-done
new file mode 100644
index 0000000..c7da7ec
--- /dev/null
+++ b/specs/001-gpg-event-signing/.verify-done
@@ -0,0 +1,5 @@
Verification completed: 2026-03-21
Findings: 0 CRITICAL, 0 HIGH, 1 MEDIUM, 3 LOW
Tasks: 35/35 (100%)
Requirements: 11/11 (100%)
Tests: 93 passing
diff --git a/specs/001-gpg-event-signing/checklists/requirements.md b/specs/001-gpg-event-signing/checklists/requirements.md
new file mode 100644
index 0000000..eab0c59
--- /dev/null
+++ b/specs/001-gpg-event-signing/checklists/requirements.md
@@ -0,0 +1,66 @@
# Requirements Quality Checklist: Ed25519 Signing for Event Commits

**Purpose**: Validate specification completeness, clarity, and consistency for Ed25519 event signing
**Created**: 2026-03-21
**Feature**: [spec.md](../spec.md)

## Requirement Completeness

- [ ] CHK001 Are signing requirements defined for all event action types (IssueOpen, IssueComment, IssueClose, IssueEdit, IssueLabel, IssueUnlabel, IssueAssign, IssueUnassign, IssueReopen, PatchCreate, PatchRevise, PatchReview, PatchComment, PatchInlineComment, PatchClose, PatchMerge)? [Completeness, Spec §FR-001]
- [ ] CHK002 Are key storage path requirements specified with platform-specific fallbacks (XDG_CONFIG_HOME vs hardcoded ~/.config)? [Completeness, Spec §FR-006]
- [ ] CHK003 Are file permission requirements documented for both private key (0o600) and public key files? [Completeness, Gap]
- [ ] CHK004 Are requirements defined for what `collab init-key` outputs to the user? [Completeness, Spec §FR-006]
- [ ] CHK005 Are requirements specified for the `--force` flag behavior on `collab init-key` when a key already exists? [Completeness, Gap]

## Requirement Clarity

- [ ] CHK006 Is "canonical serialization" precisely defined — does it specify sorted keys, compact format, and encoding? [Clarity, Spec §FR-010]
- [ ] CHK007 Is the base64 encoding variant specified (standard vs URL-safe, with/without padding)? [Clarity, Spec §FR-009]
- [ ] CHK008 Is "clear, actionable error message" quantified with specific content requirements for missing-key errors? [Clarity, Spec §FR-005]
- [ ] CHK009 Is "unrecognized key" precisely defined — does it mean not in a local known-keys file, or just cryptographically unverifiable? [Clarity, Spec §FR-004]
- [ ] CHK010 Are the exact field names (`signature`, `pubkey`) and their position in the JSON structure specified? [Clarity, Spec §FR-009]

## Requirement Consistency

- [ ] CHK011 Is the feature branch name (`001-gpg-event-signing`) consistent with the Ed25519 direction, or does the GPG naming create confusion? [Consistency]
- [ ] CHK012 Are verification failure reasons consistent between FR-004 (reject categories) and FR-007 (reporting requirements)? [Consistency, Spec §FR-004/FR-007]
- [ ] CHK013 Are the Merge action signing requirements in FR-002 consistent with the Edge Cases section on unavailable secret keys? [Consistency, Spec §FR-002]

## Acceptance Criteria Quality

- [ ] CHK014 Can SC-004 ("no more than 1 second per 100 synced commits") be objectively measured with a specific benchmark methodology? [Measurability, Spec §SC-004]
- [ ] CHK015 Are acceptance scenarios defined for the `collab init-key` command? [Gap, Spec §FR-006]
- [ ] CHK016 Are acceptance scenarios defined for the `--force` overwrite case? [Gap]

## Scenario Coverage

- [ ] CHK017 Are requirements defined for what happens when a user runs `collab init-key` twice without `--force`? [Coverage, Exception Flow]
- [ ] CHK018 Are requirements specified for sync behavior when *some* refs pass verification and others fail? [Coverage, Spec §FR-008]
- [ ] CHK019 Are requirements defined for how existing unsigned events in a repo are handled on the *first* sync after adoption? [Coverage, Spec §FR-004a]
- [ ] CHK020 Are requirements specified for the interaction between signing and the TUI dashboard display? [Coverage, Gap]

## Edge Case Coverage

- [ ] CHK021 Are requirements defined for behavior when the private key file exists but is corrupted or has wrong format? [Edge Case, Gap]
- [ ] CHK022 Are requirements defined for behavior when the private key file has incorrect permissions (e.g., world-readable)? [Edge Case, Gap]
- [ ] CHK023 Are requirements defined for concurrent event creation (race condition on signing)? [Edge Case, Gap]
- [ ] CHK024 Are requirements defined for event.json files that contain extra unknown fields during verification? [Edge Case, Gap]

## Non-Functional Requirements

- [ ] CHK025 Are key generation entropy requirements specified (CSPRNG source)? [Security, Gap]
- [ ] CHK026 Are requirements defined for private key memory handling (zeroing after use)? [Security, Gap]
- [ ] CHK027 Are observability requirements defined for signing/verification operations (logging, metrics)? [Non-Functional, Gap]

## Dependencies & Assumptions

- [ ] CHK028 Is the assumption that "collaborators exchange public keys out-of-band" sufficient, or should a known-keys mechanism be specified? [Assumption, Spec §Assumptions]
- [ ] CHK029 Is the ed25519-dalek crate dependency assumption validated against the project's Rust edition and platform targets? [Dependency, Spec §Assumptions]

## Notes

- Focus: Security and cryptographic requirements quality
- Depth: Standard
- Audience: Reviewer (PR)
- 29 items total across 8 quality dimensions
- Items reference spec sections where applicable; [Gap] marks missing requirements
diff --git a/specs/001-gpg-event-signing/contracts/cli-commands.md b/specs/001-gpg-event-signing/contracts/cli-commands.md
new file mode 100644
index 0000000..e78f888
--- /dev/null
+++ b/specs/001-gpg-event-signing/contracts/cli-commands.md
@@ -0,0 +1,77 @@
# CLI Contract: Ed25519 Signing Commands

## New Command: `collab init-key`

**Synopsis**: `collab init-key [--force]`

**Description**: Generates an Ed25519 keypair for signing events.

**Arguments**: None

**Output (stdout)**:
```
Ed25519 keypair generated.
  Private key: ~/.config/git-collab/signing-key
  Public key:  ~/.config/git-collab/signing-key.pub
  Your public key: <base64-encoded-32-byte-pubkey>
```

**Errors (stderr)**:
- Key already exists: `Error: Signing key already exists at ~/.config/git-collab/signing-key. Use --force to overwrite.`

**Exit codes**: 0 success, 1 error

**Options**:
- `--force`: Overwrite existing keypair

---

## Modified Behavior: All Event Commands

All commands that create events (`issue open`, `issue comment`, `issue close`, `issue edit`, `issue label`, `issue unlabel`, `issue assign`, `issue unassign`, `issue reopen`, `patch create`, `patch revise`, `patch review`, `patch comment`, `patch close`, `patch merge`) now:

1. Load the signing key from `~/.config/git-collab/signing-key`
2. Sign the event data
3. Embed signature and pubkey in event.json

**Error if no key**: `Error: No signing key found. Run 'collab init-key' to generate one.` (exit code 1)

---

## Modified Behavior: `collab sync`

**New behavior**: Before reconciliation, verifies Ed25519 signatures on all incoming event commits.

**Verification failure output (stderr)**:
```
Error: Signature verification failed for refs/collab/issues/<id>:
  Commit <short-oid>: <reason>
  Commit <short-oid>: <reason>
Ref not updated. <N> commit(s) failed verification.
```

**Reasons**: `missing signature`, `invalid signature`, `unknown signing key`

**Behavior**: Valid refs are synced normally. Invalid refs are skipped with warnings. Push still proceeds for locally-valid refs.

---

## event.json Schema (v2 — with signing)

```json
{
  "timestamp": "string (RFC3339)",
  "author": {
    "name": "string",
    "email": "string"
  },
  "action": {
    "type": "string (action discriminator)",
    "...": "action-specific fields"
  },
  "signature": "string (base64-encoded Ed25519 signature, 64 bytes)",
  "pubkey": "string (base64-encoded Ed25519 public key, 32 bytes)"
}
```

The signed payload is the canonical JSON serialization of the object *without* the `signature` and `pubkey` fields, with keys sorted alphabetically.
diff --git a/specs/001-gpg-event-signing/data-model.md b/specs/001-gpg-event-signing/data-model.md
new file mode 100644
index 0000000..25a1137
--- /dev/null
+++ b/specs/001-gpg-event-signing/data-model.md
@@ -0,0 +1,93 @@
# Data Model: Ed25519 Signing for Event Commits

## Entities

### Event (existing — unchanged)

| Field | Type | Description |
|-------|------|-------------|
| timestamp | String (RFC3339) | UTC timestamp of event creation |
| author | Author | Name and email of event creator |
| action | Action (tagged enum) | The event action with type-specific fields |

### Author (existing — unchanged)

| Field | Type | Description |
|-------|------|-------------|
| name | String | Display name |
| email | String | Email address |

### SignedEvent (new — serialized as event.json)

| Field | Type | Description |
|-------|------|-------------|
| *(all Event fields)* | *(flattened)* | Event content via serde flatten |
| signature | String (base64) | Ed25519 signature over canonical Event JSON |
| pubkey | String (base64) | Ed25519 public key of signer (32 bytes, base64) |

### SigningKey (new — stored on disk)

| Attribute | Value |
|-----------|-------|
| Algorithm | Ed25519 |
| Private key size | 32 bytes (expanded to 64 internally by ed25519-dalek) |
| Public key size | 32 bytes |
| Storage format | Base64-encoded raw bytes |
| Private key path | `~/.config/git-collab/signing-key` |
| Public key path | `~/.config/git-collab/signing-key.pub` |
| File permissions | 0o600 (private), 0o644 (public) |

### SignatureVerificationResult (new — runtime only, not persisted)

| Field | Type | Description |
|-------|------|-------------|
| commit_id | Oid | Git commit OID being verified |
| status | VerifyStatus | Valid, Invalid, Missing, UnknownKey |
| pubkey | Option\<String\> | Public key that signed (if present) |
| error | Option\<String\> | Human-readable error detail |

### VerifyStatus (enum)

| Variant | Description |
|---------|-------------|
| Valid | Signature verified successfully against the embedded public key |
| Invalid | Signature present but verification failed (tampered or wrong key) |
| Missing | No signature or pubkey field in event.json |
| UnknownKey | Signature parses but the signing key is not recognized (semantics pending a known-keys design) |

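As a sketch, the two runtime types above might look like this in `signing.rs` (field and variant names mirror the tables; this is illustrative, not a committed API):

```rust
// Runtime-only verification result types mirroring the tables above (sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum VerifyStatus {
    Valid,
    Invalid,
    Missing,
    UnknownKey, // listed in SignatureVerificationResult; semantics pending a known-keys design
}

#[derive(Debug)]
pub struct SignatureVerificationResult {
    pub commit_id: String, // git2::Oid in the real code; String keeps the sketch dependency-free
    pub status: VerifyStatus,
    pub pubkey: Option<String>,
    pub error: Option<String>,
}
```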
## Relationships

```
Event 1──1 SignedEvent (wrapper adds signature fields)
SignedEvent ──▶ event.json blob in git commit
SigningKey 1──* SignedEvent (one key signs many events)
SignedEvent ──▶ SignatureVerificationResult (on verification)
```

## State Transitions

### Key Lifecycle

```
[No Key] ──collab init-key──▶ [Key Exists]
[Key Exists] ──sign event──▶ [Key Exists] (no state change)
```

### Event Signing Lifecycle

```
[Event Created] ──serialize──▶ [Canonical JSON]
[Canonical JSON] ──sign──▶ [SignedEvent]
[SignedEvent] ──serialize──▶ [event.json blob]
[event.json blob] ──commit──▶ [Git Commit]
```

### Verification Lifecycle (per ref during sync)

```
[Fetch Remote Ref]
  ──walk DAG──▶ [For Each Commit]
    ──read event.json──▶ [Parse SignedEvent]
      ──extract sig+key──▶ [Verify]
        ──Valid──▶ [Accept Commit]
        ──Invalid/Missing/UnknownKey──▶ [Reject Entire Ref]
```
diff --git a/specs/001-gpg-event-signing/plan.md b/specs/001-gpg-event-signing/plan.md
new file mode 100644
index 0000000..f36182e
--- /dev/null
+++ b/specs/001-gpg-event-signing/plan.md
@@ -0,0 +1,121 @@
# Implementation Plan: Ed25519 Signing for Event Commits

**Branch**: `001-gpg-event-signing` | **Date**: 2026-03-21 | **Spec**: `specs/001-gpg-event-signing/spec.md`
**Input**: Feature specification from `/specs/001-gpg-event-signing/spec.md`

## Summary

Add Ed25519 signing and verification to all event commits in git-collab. Every event (issues, comments, patches, reviews, etc.) will embed a cryptographic signature and public key in `event.json`. Sync verification rejects unsigned or invalidly-signed events. Uses pure-Rust `ed25519-dalek` crate — no external GPG/SSH binaries. Keypair provisioned via explicit `collab init-key` command.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: git2 0.19, clap 4, serde/serde_json 1, chrono 0.4, thiserror 2. New: `ed25519-dalek`, `rand`, `base64`
**Storage**: Git-native (event DAG stored as commits with `event.json` blobs under `refs/collab/`)
**Testing**: `cargo test` with `tempfile` for integration tests
**Target Platform**: Linux (CLI tool)
**Project Type**: CLI tool / library
**Performance Goals**: Signature verification < 1 second per 100 commits (SC-004)
**Constraints**: Pure-Rust crypto, no external binary dependencies, in-process signing
**Scale/Scope**: Single-user local operation with multi-user sync via git remotes

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

Constitution is unconfigured (template only). No gates to enforce. Proceeding.

## Architecture

### Signing Flow

```
User creates event (e.g., issue open)
  → Event struct built (without signature fields)
  → Serialize to canonical JSON (deterministic key ordering)
  → Load Ed25519 private key from ~/.config/git-collab/signing-key
  → Sign canonical JSON bytes
  → Add "signature" (base64) and "pubkey" (base64) fields to Event
  → Serialize final event.json with signature
  → Create git commit as before
```

### Verification Flow (Sync)

```
Fetch remote refs
  → For each incoming commit, read event.json
  → Extract and remove "signature" and "pubkey" fields
  → Reserialize remaining fields to canonical JSON
  → Verify signature against pubkey and canonical bytes
  → If valid: accept commit for reconciliation
  → If invalid/missing: reject entire ref update, report details
```

### Key Management

```
collab init-key
  → Generate Ed25519 keypair using ed25519-dalek + rand
  → Store private key at ~/.config/git-collab/signing-key (base64)
  → Store public key at ~/.config/git-collab/signing-key.pub (base64)
  → Print public key for sharing with collaborators
```

## Project Structure

### Documentation (this feature)

```text
specs/001-gpg-event-signing/
├── plan.md              # This file
├── spec.md              # Feature specification
├── research.md          # Phase 0 output
├── data-model.md        # Phase 1 output
├── quickstart.md        # Phase 1 output
├── contracts/           # Phase 1 output
└── tasks.md             # Phase 2 output (speckit.tasks)
```

### Source Code (repository root)

```text
src/
├── main.rs              # CLI entry point
├── lib.rs               # Library entry, run() dispatcher
├── cli.rs               # Clap command definitions (add init-key)
├── event.rs             # Event/Action structs (add signature, pubkey fields)
├── dag.rs               # DAG operations (integrate signing into commit creation)
├── signing.rs           # NEW: Ed25519 key management, sign, verify functions
├── identity.rs          # Author/signature (unchanged)
├── state.rs             # State materialization (unchanged)
├── issue.rs             # Issue operations (unchanged — signing injected at dag layer)
├── patch.rs             # Patch operations (unchanged — signing injected at dag layer)
├── sync.rs              # Sync logic (add verification before reconciliation)
├── tui.rs               # TUI dashboard (unchanged)
└── error.rs             # Error types (add signing/verification errors)

tests/
├── common/mod.rs        # Test helpers (add key generation helpers)
├── collab_test.rs       # Existing tests (update to use signing)
├── sync_test.rs         # Sync tests (add verification tests)
└── cli_test.rs          # CLI tests (add init-key tests)
```

**Structure Decision**: Single-crate CLI project. New `signing.rs` module contains all Ed25519 logic. Signing is injected at the DAG layer (`dag.rs`) so all event creation paths automatically sign. Verification is added in `sync.rs` before reconciliation.

## Key Design Decisions

1. **Signing at DAG layer, not domain layer**: `dag::create_root_event()` and `dag::append_event()` handle signing so that issue.rs, patch.rs, etc. don't need modification. The signing key is passed as a parameter.

2. **Canonical serialization**: Use `serde_json::to_string()` (not pretty-printed) with sorted keys for the signable payload. The `signature` and `pubkey` fields are added *after* signing, so they're naturally excluded from the signed content.

3. **Two-step serialization**: First serialize Event without signature fields → sign → then serialize SignedEvent (Event + signature + pubkey) as pretty-printed JSON for the blob. This means Event struct stays clean; a wrapper handles the signature fields.

4. **Key storage path**: `~/.config/git-collab/` following XDG conventions. Private key file permissions set to 0o600.

5. **Verification is all-or-nothing per ref**: If any commit in a ref's history fails verification, the entire ref is rejected (FR-008). This is checked by walking the DAG during sync.

## Complexity Tracking

No constitution violations to justify.
diff --git a/specs/001-gpg-event-signing/quickstart.md b/specs/001-gpg-event-signing/quickstart.md
new file mode 100644
index 0000000..615a0fd
--- /dev/null
+++ b/specs/001-gpg-event-signing/quickstart.md
@@ -0,0 +1,77 @@
# Quickstart: Ed25519 Signing for Event Commits

## Prerequisites

- Rust toolchain (2021 edition)
- git-collab built from source (`cargo build`)

## Setup

### 1. Generate a signing key

```bash
collab init-key
# Output: Ed25519 keypair generated.
#   Private key: ~/.config/git-collab/signing-key
#   Public key:  ~/.config/git-collab/signing-key.pub
#   Share your public key with collaborators: <base64-encoded-pubkey>
```

### 2. Use git-collab as before

All event operations now automatically sign with your key:

```bash
collab issue open -t "Bug: foo is broken"
# Creates a signed event commit — signature embedded in event.json

collab patch create -t "Fix foo" --head abc123
# Also signed automatically
```

### 3. Sync with verification

```bash
collab sync origin
# Fetches remote events, verifies all signatures
# Rejects unsigned or tamper-detected events
# Reports which commits failed and why
```

## What changes in event.json

Before:
```json
{
  "timestamp": "2026-03-21T10:00:00Z",
  "author": { "name": "Alice", "email": "alice@example.com" },
  "action": { "type": "IssueOpen", "title": "Bug", "body": "Details" }
}
```

After:
```json
{
  "timestamp": "2026-03-21T10:00:00Z",
  "author": { "name": "Alice", "email": "alice@example.com" },
  "action": { "type": "IssueOpen", "title": "Bug", "body": "Details" },
  "signature": "base64-encoded-ed25519-signature...",
  "pubkey": "base64-encoded-ed25519-public-key..."
}
```

## Key files

| File | Purpose |
|------|---------|
| `~/.config/git-collab/signing-key` | Ed25519 private key (base64) |
| `~/.config/git-collab/signing-key.pub` | Ed25519 public key (base64) |

## Error scenarios

| Scenario | Behavior |
|----------|----------|
| No signing key, try to create event | Error: "No signing key found. Run `collab init-key` first." |
| Sync receives unsigned commit | Ref rejected: "Commit abc123: missing signature" |
| Sync receives invalid signature | Ref rejected: "Commit abc123: signature verification failed" |
| Sync receives unknown pubkey | Ref rejected: "Commit abc123: unknown signing key" |
diff --git a/specs/001-gpg-event-signing/research.md b/specs/001-gpg-event-signing/research.md
new file mode 100644
index 0000000..15eaabb
--- /dev/null
+++ b/specs/001-gpg-event-signing/research.md
@@ -0,0 +1,74 @@
# Research: Ed25519 Signing for Event Commits

## R1: Ed25519 Crate Selection

**Decision**: Use `ed25519-dalek` crate for Ed25519 signing and verification.

**Rationale**: Most widely used pure-Rust Ed25519 implementation. Well-audited, maintained by the RustCrypto community. Provides `SigningKey`, `VerifyingKey`, and `Signature` types with straightforward API. Compatible with standard Ed25519 signatures.

**Alternatives considered**:
- `ring`: Faster but less ergonomic API, heavier dependency, not pure-Rust (uses C/ASM)
- `ed25519-compact`: Smaller crate used by Radicle, but less ecosystem support
- `ed25519-zebra`: Zcash-focused, batch verification oriented — overkill for this use case

**Crate versions**:
- `ed25519-dalek = "2"` (with `rand_core` feature for key generation)
- `rand = "0.8"` (for `OsRng` entropy source)
- `base64 = "0.22"` (for encoding keys and signatures in JSON)
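Expressed as a Cargo.toml fragment (a sketch matching the versions above; the `rand_core` feature is what enables `SigningKey::generate`):

```toml
[dependencies]
ed25519-dalek = { version = "2", features = ["rand_core"] }
rand = "0.8"      # provides OsRng for key generation
base64 = "0.22"   # keys and signatures in JSON
```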

## R2: Canonical Serialization Strategy

**Decision**: Serialize Event struct (without signature/pubkey fields) using `serde_json::to_string()` with a custom serializer that sorts map keys.

**Rationale**: serde_json does not provide a canonical key ordering by default — struct fields serialize in declaration order. For signature verification to work across different serialization contexts, key ordering must be deterministic. Round-tripping through `serde_json::Value` (via `to_value()`, then serializing the `Value`) yields alphabetically sorted keys because `serde_json::Map` is `BTreeMap`-backed when the `preserve_order` feature is disabled; there is no `sort_keys` feature, so this assumption should be pinned with a byte-exact test.

**Approach**:
1. Serialize `Event` to `serde_json::Value`
2. The `Value` type serializes object keys in sorted order (its map is `BTreeMap`-backed with default features)
3. Use `serde_json::to_string(&value)` for compact, sorted JSON
4. This byte string is what gets signed

**Alternatives considered**:
- JCS (JSON Canonicalization Scheme, RFC 8785): More rigorous but requires a separate crate and may over-complicate for this use case
- Hash the git tree instead: Ties signature to git internals rather than event content; harder to verify independently

## R3: Event Struct Design for Signatures

**Decision**: Use a wrapper `SignedEvent` struct that contains `Event` fields plus `signature` and `pubkey`, rather than adding optional fields to `Event`.

**Rationale**: Keeps the `Event` struct clean for internal use (construction, business logic). The `SignedEvent` is what gets serialized to `event.json`. On deserialization, the signature fields are extracted for verification, and the remaining fields reconstitute the `Event`.

**Implementation**:
```rust
// In event.rs — Event stays unchanged

// In signing.rs
#[derive(Serialize, Deserialize)]
struct SignedEvent {
    #[serde(flatten)]
    event: Event,
    signature: String,  // base64-encoded Ed25519 signature
    pubkey: String,     // base64-encoded Ed25519 public key
}
```

## R4: Key Storage Location and Format

**Decision**: Store keypair at `~/.config/git-collab/signing-key` (private) and `~/.config/git-collab/signing-key.pub` (public), both base64-encoded.

**Rationale**: Follows XDG Base Directory conventions. Separating public and private keys into distinct files is standard practice. Base64 encoding keeps files text-friendly and easy to copy/share (public key).

**Security**:
- Private key file: permissions 0o600 (owner read/write only)
- Directory: permissions 0o700
- No encryption of private key at rest (consistent with SSH key defaults; passphrase support deferred)
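The permission handling can be sketched with std only (the `dir` argument stands in for `~/.config/git-collab`, which the real code would resolve via the `dirs` crate; the function name is illustrative):

```rust
use std::fs;
use std::io::Write;
use std::os::unix::fs::{DirBuilderExt, OpenOptionsExt};
use std::path::Path;

// Write the base64-encoded private key with owner-only permissions.
// `create_new` makes the call fail if a key already exists (no --force here).
fn write_private_key(dir: &Path, encoded: &str) -> std::io::Result<()> {
    fs::DirBuilder::new()
        .recursive(true)
        .mode(0o700) // directory: owner-only
        .create(dir)?;
    let mut file = fs::OpenOptions::new()
        .write(true)
        .create_new(true)
        .mode(0o600) // private key: owner read/write only
        .open(dir.join("signing-key"))?;
    writeln!(file, "{encoded}")
}
```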

## R5: Verification During Sync

**Decision**: Verify all commits in a ref's DAG during sync, before reconciliation. Walk from root to tip, verify each commit's event.json signature.

**Rationale**: FR-008 requires atomic rejection — if any commit fails, the entire ref is rejected. Walking the full DAG ensures no unsigned commits slip through, including historical ones (FR-004a: no grandfathering).

**Performance**: Ed25519 verification is ~15,000 ops/second on modern hardware. Even 1000 commits per ref would complete in <100ms, well within SC-004 (1s per 100 commits).

**Edge case**: Merge commits (Action::Merge) created during reconciliation also need signatures. The syncing user's key signs these.
diff --git a/specs/001-gpg-event-signing/review.md b/specs/001-gpg-event-signing/review.md
new file mode 100644
index 0000000..8fa2145
--- /dev/null
+++ b/specs/001-gpg-event-signing/review.md
@@ -0,0 +1,37 @@
# Review Report: Ed25519 Signing for Event Commits

**Branch**: `001-gpg-event-signing` | **Reviewed**: 2026-03-21 | **Model**: Claude Sonnet

## Summary Table

| # | Dimension | Verdict | Notes |
|---|-----------|---------|-------|
| 1 | Spec-Plan Alignment | WARN | `UnknownKey` VerifyStatus not fully resolved — no known-keys registry designed |
| 2 | Plan-Tasks Completeness | WARN | Missing `dirs` crate in deps task; `error.rs` absent from plan file table |
| 3 | Dependency Ordering | PASS | Phase ordering correct; US2→US3 dependency sound |
| 4 | Parallelization Correctness | WARN | T004/T005/T006 all target same file (`signing_test.rs`); T032 must land before T031/T033 |
| 5 | Feasibility & Risk | WARN | Critical: `serde_json::Value` key sorting is version-dependent; `serde(flatten)` + `serde(tag)` is a known serde footgun |
| 6 | Standards Compliance | WARN | `rand` v0.8 wrong dep (should be `rand_core`); base64 variant unspecified |
| 7 | Implementation Readiness | WARN | `UnknownKey` semantics ambiguous; T009 canonical JSON assumption fragile |

## Overall Verdict: READY WITH WARNINGS

## Must Fix (blockers before implementation)

1. **Canonical JSON correctness** (T009): Explicitly verify `serde_json::Map` is `BTreeMap`-backed (no `preserve_order` feature). Add byte-exact test. Do not rely on "Value sorts keys automatically."

2. **`serde(flatten)` + internally-tagged enum compatibility** (T002): Write a prototype test confirming round-trip ser/de works with `Event`'s `#[serde(tag = "type")]` Action. If it fails, pivot to manual JSON construction.

3. **`UnknownKey` semantics** (T011, T025): Decide if in-scope. If out-of-scope, remove variant and reduce `VerifyStatus` to `{Valid, Invalid, Missing}`. If in-scope, add known-keys file design.

## Should Fix (warnings)

4. **`rand` dependency** (T001): Replace `rand = "0.8"` with `rand_core = { version = "0.6", features = ["getrandom"] }` or use ed25519-dalek's re-exported `OsRng`.

5. **Missing `dirs` crate** (T001): Add `dirs = "5"` to Cargo.toml dependencies task.

6. **Base64 encoding variant**: Specify standard base64 with padding. Add to contracts and reference in signing tasks.

7. **Parallel file conflict** (T004/T005/T006): Sequence these or have first task create skeleton.

8. **T032 ordering**: Must land before T031 and T033. Adjust parallelism annotation.
diff --git a/specs/001-gpg-event-signing/spec.md b/specs/001-gpg-event-signing/spec.md
new file mode 100644
index 0000000..3ef88aa
--- /dev/null
+++ b/specs/001-gpg-event-signing/spec.md
@@ -0,0 +1,109 @@
# Feature Specification: Ed25519 Signing for Event Commits

**Feature Branch**: `001-gpg-event-signing`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "implement signing. signing is mandatory"

## User Scenarios & Testing

### User Story 1 - Sign Event Commits (Priority: P1)

As a contributor, I want every event commit I create (issues, comments, patches, reviews) to be signed automatically with my Ed25519 key, so that other collaborators can verify the event actually came from me.

**Why this priority**: Without signing, anyone with push access can forge events as another user. Signing is the core purpose of this feature and provides the authenticity guarantee that everything else builds on.

**Independent Test**: Can be fully tested by creating an issue or comment and verifying the resulting commit carries a valid Ed25519 signature matching the author's key.

**Acceptance Scenarios**:

1. **Given** a user has an Ed25519 signing key configured, **When** they create any event (issue, comment, patch, review, label, assign, close, reopen), **Then** the resulting commit is signed with their Ed25519 key.
2. **Given** a user has no signing key configured, **When** they attempt to create any event, **Then** the operation fails with a clear error message explaining that a signing key is required.
3. **Given** a user creates an event, **When** another user inspects the event commit, **Then** the signature is visible and attributable to the author.

---

### User Story 2 - Verify Signatures on Sync (Priority: P2)

As a collaborator syncing from a remote, I want all incoming event commits verified for valid Ed25519 signatures, so that forged or tampered events are rejected before they enter my local state.

**Why this priority**: Signing without verification is incomplete — verification closes the trust loop and ensures the integrity of the collaboration history.

**Independent Test**: Can be tested by syncing from a remote that contains both validly-signed and unsigned/forged commits and confirming only valid ones are accepted.

**Acceptance Scenarios**:

1. **Given** a remote contains event commits with valid Ed25519 signatures, **When** I sync, **Then** the events are fetched and reconciled normally.
2. **Given** a remote contains event commits without signatures, **When** I sync, **Then** those commits are rejected and the user is warned about unsigned events.
3. **Given** a remote contains event commits with invalid or unrecognized signatures, **When** I sync, **Then** those commits are rejected and the user is warned with details about the verification failure.
4. **Given** a sync encounters some valid and some invalid commits on the same ref, **When** verification fails, **Then** the entire ref update is rejected (no partial application) and the user is informed which commits failed verification.

---

### User Story 3 - Merge Commit Signing During Reconciliation (Priority: P3)

As a collaborator, I want merge commits produced during sync reconciliation (concurrent edits from multiple users) to be signed with my Ed25519 key as well, so that the signing chain is maintained.

**Why this priority**: Merge commits created during reconciliation are new commits — if left unsigned, they break the trust chain even when all source commits were properly signed.

**Independent Test**: Can be tested by creating a fork scenario (two users edit the same issue concurrently), syncing to trigger reconciliation, and verifying the resulting merge commit is signed.

**Acceptance Scenarios**:

1. **Given** two users have divergent histories on the same ref, **When** sync reconciliation creates a merge commit, **Then** the merge commit is signed with the syncing user's Ed25519 key.

---

### Edge Cases

- What happens when syncing commits signed by a key not in the known-keys list? Verification fails and the commits are rejected, with a message indicating the signing key is unknown.
- What happens during sync if the signing user's secret key is unavailable (e.g., on a CI machine)? The sync fails at the reconciliation step if a merge commit is needed, since it cannot be signed. Fast-forward syncs that require no new commits succeed since no signing is needed.

## Clarifications

### Session 2026-03-21

- Q: How should existing unsigned event commits (created before this feature) be handled during sync verification? → A: Reject them — all historical unsigned commits fail verification. No grandfathering.
- Q: Which signing approach should the implementation use? → A: Ed25519 with pure-Rust crypto (aligned with Radicle's approach), not GPG. Use the `ed25519-dalek` or similar pure-Rust crate for in-process signing and verification.
- Q: Where should the Ed25519 signature be stored within the event commit? → A: As fields in `event.json` (e.g., `"signature": "base64..."`, `"pubkey": "base64..."`).
- Q: How should the Ed25519 keypair be provisioned for a user? → A: Require explicit `collab init-key` command that generates and stores the keypair. No auto-generation.
- Q: What data should be signed within the event? → A: Sign the entire serialized `event.json` content (excluding `signature` and `pubkey` fields themselves).

## Requirements

### Functional Requirements

- **FR-001**: System MUST sign every event commit with the author's Ed25519 key when creating issues, comments, patches, reviews, labels, assignments, closures, and reopens.
- **FR-002**: System MUST sign merge commits created during sync reconciliation with the syncing user's Ed25519 key.
- **FR-003**: System MUST verify Ed25519 signatures on all incoming event commits during sync before applying them to local state.
- **FR-004**: System MUST reject incoming event commits that have no signature, an invalid signature, or a signature from an unrecognized key.
- **FR-004a**: System MUST reject existing unsigned event commits during sync — there is no grandfathering for pre-feature historical commits.
- **FR-005**: System MUST fail with a clear, actionable error message when a user attempts to create an event without a signing key configured.
- **FR-006**: System MUST provide a `collab init-key` command that generates an Ed25519 keypair and stores it in a user-local configuration path. The system MUST NOT auto-generate keys.
- **FR-007**: System MUST report which specific commits failed signature verification during sync, including the reason for failure.
- **FR-008**: System MUST reject ref updates atomically — if any commit in a ref's history fails verification, the entire ref is not updated.
- **FR-009**: System MUST embed the Ed25519 signature and public key as fields in `event.json` (`"signature"` and `"pubkey"`, both base64-encoded) so they are transportable across remotes.
- **FR-010**: System MUST sign the entire serialized `event.json` content (excluding the `signature` and `pubkey` fields themselves). Verification reproduces this canonical serialization and checks against the embedded signature.
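
Illustrative shape of a signed `event.json` per FR-009/FR-010 — only `signature` and `pubkey` are mandated by this spec; the other field names are placeholders for whatever the event schema already carries, and the signed payload is the canonical serialization of everything *except* those two fields:

```json
{
  "type": "issue_open",
  "author": "alice",
  "timestamp": "2026-03-21T10:00:00Z",
  "signature": "<base64 Ed25519 signature over the canonical event bytes>",
  "pubkey": "<base64 Ed25519 public key of the author>"
}
```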

### Key Entities

- **Event Commit**: A git commit containing an `event.json` blob, now required to carry an Ed25519 signature from the event author.
- **Ed25519 Signing Key**: The user's Ed25519 private key used to sign event data, stored locally by the system.
- **Ed25519 Public Key**: The user's public key, shared with collaborators to verify signatures.
- **Signature Verification Result**: The outcome of verifying a commit's signature — valid, invalid, missing, or unknown key — used to accept or reject synced events.

## Success Criteria

### Measurable Outcomes

- **SC-001**: 100% of event commits created by the system carry a valid Ed25519 signature.
- **SC-002**: 100% of unsigned or invalidly-signed incoming event commits are rejected during sync.
- **SC-003**: Users without a configured signing key receive an error on their first attempted action, before any unsigned commit is created.
- **SC-004**: Signature verification adds no more than 1 second of overhead per 100 synced commits.

## Assumptions

- The system uses pure-Rust Ed25519 cryptography (e.g., `ed25519-dalek` crate) for signing and verification — no external GPG or SSH binaries required.
- Signing and verification happen in-process, not via subprocess calls.
- Collaborators exchange public keys out-of-band (e.g., direct exchange, included in repo metadata) — key distribution is outside the scope of this feature.
- The Ed25519 keypair is generated and managed by the system, not imported from external GPG/SSH keyrings.
diff --git a/specs/001-gpg-event-signing/tasks.md b/specs/001-gpg-event-signing/tasks.md
new file mode 100644
index 0000000..c01d7cd
--- /dev/null
+++ b/specs/001-gpg-event-signing/tasks.md
@@ -0,0 +1,202 @@
# Tasks: Ed25519 Signing for Event Commits

**Input**: Design documents from `/specs/001-gpg-event-signing/`
**Prerequisites**: plan.md, spec.md, research.md, data-model.md, contracts/

**Tests**: Included (TDD approach — user preference). Write tests first, ensure they fail, then implement.

**Organization**: Tasks grouped by user story for independent implementation and testing.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

---

## Phase 1: Setup

**Purpose**: Add dependencies and create the new signing module skeleton

<!-- sequential -->
- [x] T001 Add `ed25519-dalek` (v2, features: `rand_core`), `rand_core` (v0.6, features: `getrandom`), `base64` (v0.22), and `dirs` (v5) dependencies to Cargo.toml. Do NOT enable serde_json's `preserve_order` feature (BTreeMap-backed Map required for canonical key sorting)
- [x] T002 Create `src/signing.rs` module skeleton with `SignedEvent` struct (using `#[serde(flatten)]` on `Event`), `VerifyStatus` enum (`Valid`, `Invalid`, `Missing` — no `UnknownKey`, key trust is out of scope), and `SignatureVerificationResult` struct per data-model.md. Add `pub mod signing;` to `src/lib.rs`. IMPORTANT: Test that `serde(flatten)` round-trips correctly with Event's internally-tagged Action enum — if it doesn't work, pivot to manual JSON Value construction
- [x] T003 Add signing-related error variants to `src/error.rs`: `Signing(String)` for key/sign errors, `Verification(String)` for verify errors, and `KeyNotFound` for missing key file

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Core signing/verification functions that ALL user stories depend on

**CRITICAL**: No user story work can begin until this phase is complete

### Tests for Foundation

<!-- sequential (all target same file: tests/signing_test.rs) -->
- [x] T004 Write tests for key generation and storage in `tests/signing_test.rs`: create the test file, test `generate_keypair()` creates valid key files at expected paths with correct permissions (0o600 for private), test loading keypair from disk, test error when key file missing
- [x] T005 Write tests for sign/verify round-trip in `tests/signing_test.rs`: test `sign_event()` produces a `SignedEvent` with non-empty base64 `signature` and `pubkey` fields, test `verify_signed_event()` returns `Valid` for correctly signed event, test tampered event returns `Invalid`, test missing signature returns `Missing`
- [x] T006 Write tests for canonical serialization in `tests/signing_test.rs`: test that serializing the same `Event` twice produces byte-exact identical output, test that `SignedEvent` JSON contains `signature` and `pubkey` fields alongside flattened event fields, test `serde(flatten)` round-trip with internally-tagged Action enum

### Implementation for Foundation

<!-- sequential -->
- [x] T007 Implement `generate_keypair(config_dir: &Path) -> Result<VerifyingKey>` in `src/signing.rs`: generate Ed25519 keypair using `SigningKey::generate(&mut OsRng)`, write private key (base64) to `{config_dir}/signing-key` with 0o600 permissions, write public key (base64) to `{config_dir}/signing-key.pub`, create config dir with 0o700 if needed
- [x] T008 Implement `load_signing_key(config_dir: &Path) -> Result<SigningKey>` and `load_verifying_key(config_dir: &Path) -> Result<VerifyingKey>` in `src/signing.rs`: read base64-encoded key files, decode, construct key types, return `KeyNotFound` error if files don't exist
- [x] T009 Implement `canonical_json(event: &Event) -> Result<Vec<u8>>` in `src/signing.rs`: serialize Event to `serde_json::Value`, then to compact string with `to_string()`. CRITICAL: Confirm serde_json uses BTreeMap-backed Map (no `preserve_order` feature) which sorts keys alphabetically. Add a dedicated test that serializes the same Event twice and asserts byte-exact equality. Use `base64::engine::general_purpose::STANDARD` (with padding) for all base64 encoding throughout the module
- [x] T010 Implement `sign_event(event: &Event, signing_key: &SigningKey) -> Result<SignedEvent>` in `src/signing.rs`: call `canonical_json()`, sign bytes with `signing_key.sign()`, construct `SignedEvent` with base64-encoded signature and pubkey
- [x] T011 Implement `verify_signed_event(signed: &SignedEvent) -> Result<VerifyStatus>` in `src/signing.rs`: extract signature and pubkey, decode base64 (STANDARD with padding), reconstruct Event from SignedEvent, compute canonical JSON, verify signature against embedded pubkey, return `Valid` if signature checks out or `Invalid` if not. Note: key trust (known-keys registry) is out of scope — any cryptographically valid signature is accepted

**Checkpoint**: Foundation ready — signing primitives work, tests pass

---

## Phase 3: User Story 1 — Sign Event Commits (Priority: P1) MVP

**Goal**: Every event commit created by the system carries a valid Ed25519 signature embedded in event.json

**Independent Test**: Create an issue, read the event.json blob from the commit, verify it contains valid `signature` and `pubkey` fields

### Tests for User Story 1

<!-- parallel-group: 2 (max 3 concurrent) -->
- [x] T012 [P] [US1] Write integration test in `tests/collab_test.rs`: create a temp repo with a signing key, call `issue::open()`, walk the DAG, deserialize event.json as `SignedEvent`, assert `signature` and `pubkey` are present and `verify_signed_event()` returns `Valid`
- [x] T013 [P] [US1] Write integration test in `tests/collab_test.rs`: attempt `issue::open()` without a signing key present, assert it returns a `KeyNotFound` error
- [x] T014 [P] [US1] Write CLI test in `tests/cli_test.rs`: run `collab init-key`, assert success and key files created; run `collab init-key` again without `--force`, assert error; run with `--force`, assert success

### Implementation for User Story 1

<!-- sequential -->
- [x] T015 [US1] Update `dag::create_root_event()` and `dag::append_event()` in `src/dag.rs` to accept an `Option<&SigningKey>` parameter. When `Some`, call `sign_event()` and serialize `SignedEvent` as the blob instead of plain `Event`. When `None`, return `KeyNotFound` error
- [x] T016 [US1] Update `dag::reconcile()` in `src/dag.rs` to accept `Option<&SigningKey>` and sign the merge event when creating merge commits
- [x] T017 [US1] Update all callers in `src/issue.rs` to load the signing key via `load_signing_key()` and pass it to `dag::create_root_event()` / `dag::append_event()`. Use `dirs::config_dir()` or `$HOME/.config/git-collab` for key path
- [x] T018 [US1] Update all callers in `src/patch.rs` to load the signing key and pass it to DAG functions, same pattern as issue.rs
- [x] T019 [US1] Add `InitKey { force: bool }` variant to `Commands` enum in `src/cli.rs` with clap attributes for the `init-key` subcommand with `--force` flag
- [x] T020 [US1] Add `init-key` handler in `src/lib.rs` `run()` function: call `signing::generate_keypair()`, handle key-already-exists error (suggest `--force`), print pubkey on success per CLI contract
- [x] T021 [US1] Update `dag::walk_events()` in `src/dag.rs` to deserialize `SignedEvent` and extract the `Event` from it, so existing state materialization continues to work

**Checkpoint**: User Story 1 complete — all locally-created events are signed, `collab init-key` works

---

## Phase 4: User Story 2 — Verify Signatures on Sync (Priority: P2)

**Goal**: All incoming event commits are verified for valid Ed25519 signatures during sync, with unsigned/invalid commits rejected

**Independent Test**: Sync from a remote that contains unsigned commits, verify they are rejected with clear error messages

### Tests for User Story 2

<!-- parallel-group: 3 (max 3 concurrent) -->
- [x] T022 [P] [US2] Write test in `tests/sync_test.rs`: set up two repos (Alice and Bob) both with signing keys, Alice creates a signed issue, Bob syncs — verify sync succeeds and issue is present
- [x] T023 [P] [US2] Write test in `tests/sync_test.rs`: set up repo with unsigned event commits (created by directly writing event.json without signature), sync from it — verify the ref is rejected and error message includes commit OID and "missing signature"
- [x] T024 [P] [US2] Write test in `tests/sync_test.rs`: set up repo with tampered event (valid signature but modified event content), sync — verify ref rejected with "invalid signature"

### Implementation for User Story 2

<!-- sequential -->
- [x] T025 [US2] Implement `verify_ref(repo: &Repository, ref_name: &str) -> Result<Vec<SignatureVerificationResult>>` in `src/signing.rs`: walk DAG for the ref, for each commit read event.json, deserialize as `SignedEvent`, call `verify_signed_event()`, collect results. Return error if any commit lacks event.json
- [x] T026 [US2] Add verification step in `sync::reconcile_refs()` in `src/sync.rs`: before calling `dag::reconcile()`, call `verify_ref()` on the remote sync ref. If any result is not `Valid`, skip reconciliation for that ref, print error to stderr with commit OIDs and failure reasons per CLI contract, continue to next ref
- [x] T027 [US2] Handle the "new from remote" case in `sync::reconcile_refs()`: when a local ref doesn't exist and we're importing from remote, verify the remote ref first. Reject if verification fails

**Checkpoint**: User Story 2 complete — sync rejects unsigned/invalid events

---

## Phase 5: User Story 3 — Merge Commit Signing During Reconciliation (Priority: P3)

**Goal**: Merge commits created during sync reconciliation are signed with the syncing user's key

**Independent Test**: Create divergent histories between two repos, sync to trigger reconciliation, verify the merge commit's event.json contains a valid signature

### Tests for User Story 3

<!-- sequential -->
- [x] T028 [US3] Write test in `tests/sync_test.rs`: set up two repos (Alice and Bob) with signing keys, both create events on the same issue (divergent history), sync from one to the other — verify the reconciliation merge commit has a valid Ed25519 signature from the syncing user

### Implementation for User Story 3

<!-- sequential -->
- [x] T029 [US3] Update `sync::sync()` in `src/sync.rs` to load the syncing user's signing key at the start and pass it through to `reconcile_refs()` and then to `dag::reconcile()`
- [x] T030 [US3] Ensure `dag::reconcile()` uses the signing key (from T016) to sign the merge event — verify the merge commit's event.json blob is a properly signed `SignedEvent`

**Checkpoint**: All user stories complete — signing chain is maintained through reconciliation

---

## Phase 6: Polish & Cross-Cutting Concerns

**Purpose**: Update existing tests, ensure backward compatibility handling

<!-- sequential (T032 must land before T031/T033 since they depend on the helper) -->
- [x] T032 Update test helpers in `tests/common/mod.rs` to include a `setup_signing_key(config_dir: &Path)` helper that generates a test keypair, and update `init_repo()` / `TestRepo::new()` to call it

<!-- parallel-group: 4 (max 2 concurrent, after T032) -->
- [x] T031 [P] Update existing tests in `tests/collab_test.rs` to set up signing keys in test fixtures so all event creation tests pass with mandatory signing
- [x] T033 [P] Update existing tests in `tests/sync_test.rs` to use signing keys in both repos for sync tests

<!-- sequential -->
- [x] T034 Run `cargo test` and fix any remaining compilation errors or test failures across the entire test suite
- [x] T035 Run quickstart.md scenarios manually: `collab init-key`, create signed issue, sync with verification — verify all flows match documented behavior

---

## Dependencies & Execution Order

### Phase Dependencies

- **Phase 1 (Setup)**: No dependencies — start immediately
- **Phase 2 (Foundation)**: Depends on Phase 1 — BLOCKS all user stories
- **Phase 3 (US1)**: Depends on Phase 2 — MVP
- **Phase 4 (US2)**: Depends on Phase 3 (needs signed events to exist for verification)
- **Phase 5 (US3)**: Depends on Phase 4 (needs verification in sync before testing merge signing)
- **Phase 6 (Polish)**: Depends on Phase 5

### User Story Dependencies

- **US1 (Sign Events)**: Foundational — must complete first since US2 and US3 depend on events being signed
- **US2 (Verify on Sync)**: Depends on US1 — needs signed events to verify against
- **US3 (Merge Signing)**: Depends on US2 — needs verification in sync to validate merge signatures

### Within Each User Story

- Tests written FIRST, verified to FAIL
- Implementation follows test structure
- Story complete before next priority

### Parallel Opportunities

- T004, T005, T006 (foundation tests) can run in parallel
- T012, T013, T014 (US1 tests) can run in parallel
- T022, T023, T024 (US2 tests) can run in parallel
- T031 and T033 (polish test updates) can run in parallel once T032 lands (both depend on its `setup_signing_key` helper)

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup (T001-T003)
2. Complete Phase 2: Foundation (T004-T011)
3. Complete Phase 3: User Story 1 (T012-T021)
4. **STOP and VALIDATE**: All locally-created events are signed

### Incremental Delivery

1. Setup + Foundation → Signing primitives ready
2. US1 → Events are signed → MVP
3. US2 → Sync verifies signatures → Trust model complete
4. US3 → Merge commits signed → Full signing chain
5. Polish → All tests updated → Production ready

---

## Notes

- [P] tasks = different files, no dependencies
- [Story] label maps task to specific user story
- TDD approach: tests first, verify they fail, then implement
- Total: 35 tasks (3 setup, 8 foundation, 10 US1, 6 US2, 3 US3, 5 polish)
- Key architectural insight: signing is injected at the DAG layer so issue.rs/patch.rs callers just need to pass the signing key through
diff --git a/specs/003-key-trust-allowlist/.analyze-done b/specs/003-key-trust-allowlist/.analyze-done
new file mode 100644
index 0000000..55f596e
--- /dev/null
+++ b/specs/003-key-trust-allowlist/.analyze-done
@@ -0,0 +1,54 @@
# Cross-Artifact Consistency Analysis
## Date: 2026-03-21

## Findings

### MEDIUM: Plan references SignatureVerificationResult in check_trust but tasks describe different signature
- Plan (Architecture section): "For each VerifyStatus::Valid result, checks whether result.pubkey is in the trusted set"
- Task T014: "takes &[SignatureVerificationResult] and &TrustPolicy"
- Actual code: SignatureVerificationResult has `pubkey: Option<String>` — this matches.
- However, the plan's VerifyStatus enum snippet omits the existing `Untrusted` variant description fields. Minor, no real inconsistency since the struct is separate from the enum.
- **Verdict**: Consistent. No action needed.

### MEDIUM: Task T009 implements remove_trusted_key() but it's labeled as US1
- T009 is in Phase 2 (US1 - Add Trusted Keys) but implements `remove_trusted_key()` which is a US3 concern.
- This is intentional for code locality (both save and remove operate on the same file) but creates a traceability gap: US3 remove tests (T017) depend on implementation done in US1's phase.
- **Verdict**: Acceptable pragmatic grouping but slightly misleading labeling.

### LOW: FR-002a numbering is non-standard
- Spec uses FR-002, FR-002a, FR-002b which breaks sequential numbering. FR-002a and FR-002b are sub-requirements of FR-002 but are really independent requirements about sync behavior (not about `key add`).
- **Verdict**: Cosmetic. No functional impact.

### LOW: Spec says "reject the entire ref" (FR-002a) and tasks confirm whole-ref rejection, but current code already rejects on Invalid/Missing
- The current reconcile_refs filters on `r.status != VerifyStatus::Valid`. Adding `Untrusted` variant means it would automatically be caught by this existing filter.
- Plan says trust check happens AFTER verify_ref in reconcile_refs, but the existing rejection logic only checks VerifyStatus. The plan's approach of modifying the results vector before the existing check would work, but T015 describes adding trust checking "after verify_ref succeeds" — the integration point needs care.
- **Verdict**: The plan's architecture (post-verification filter that mutates results) is sound but T015 description could be more explicit about WHERE in reconcile_refs the check is inserted (before the existing failure-filter loop).

### LOW: No task for updating existing match arms when adding Untrusted variant (T001)
- T001 says "update any existing match arms that need a wildcard or explicit handling." The current code in sync.rs filters on `r.status != VerifyStatus::Valid` using PartialEq, not pattern matching, so no match arms need updating there. But verify_ref itself has match arms on VerifyStatus (lines 245-264) that would need an `Untrusted` arm or wildcard.
- **Verdict**: T001's description covers this ("update any existing match arms") but implementers should note verify_ref's internal match.

### PASS: All functional requirements have corresponding tasks
- FR-001 (project-local file): T003, T008
- FR-002 (key add + --self): T010, T011
- FR-002a (whole-ref rejection): T014, T015
- FR-002b (unsigned commits): T014 (passthrough of Missing status)
- FR-003 (key list): T018
- FR-004 (key remove): T009, T019
- FR-005 (reject untrusted during sync): T014, T015
- FR-006 (accept trusted during sync): T014, T015
- FR-007 (fallback when no file): T008 (Unconfigured variant), T015
- FR-008 (validate key format): T007
- FR-009 (no duplicates): T009
- FR-010 (optional label): T010, T011

### PASS: Dependency ordering is correct
- Phase 1 (types) -> Phase 2 (add) -> Phase 3 (sync) / Phase 4 (list/remove) -> Phase 5 (polish)
- Phase 3 and 4 can correctly parallelize (different files: sync.rs vs lib.rs)

### PASS: Test-first approach is consistently applied
- Every phase has tests written before implementation
- Tests are in parallel groups, implementations are sequential where needed

## Summary
No CRITICAL or HIGH findings. The artifacts are well-aligned with minor labeling and description clarity issues. All spec requirements have task coverage. Dependency ordering is sound. The plan's architecture integrates cleanly with the existing codebase structure.
diff --git a/specs/003-key-trust-allowlist/checklists/requirements.md b/specs/003-key-trust-allowlist/checklists/requirements.md
new file mode 100644
index 0000000..e77bc6b
--- /dev/null
+++ b/specs/003-key-trust-allowlist/checklists/requirements.md
@@ -0,0 +1,36 @@
# Specification Quality Checklist: Key Trust Allowlist

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-21
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All items pass validation.
- Storage location (`.git/collab/trusted-keys`) is mentioned in Assumptions as a scoping constraint, not an implementation prescription.
- The fallback behavior (US2 scenario 4) ensures backward compatibility with repos that haven't configured trust yet.
diff --git a/specs/003-key-trust-allowlist/contracts/cli-commands.md b/specs/003-key-trust-allowlist/contracts/cli-commands.md
new file mode 100644
index 0000000..fe6bb23
--- /dev/null
+++ b/specs/003-key-trust-allowlist/contracts/cli-commands.md
@@ -0,0 +1,161 @@
# CLI Command Contracts: Key Trust Allowlist

**Date**: 2026-03-21 | **Feature**: 003-key-trust-allowlist

## `collab key add`

### Synopsis

```
collab key add [OPTIONS] [PUBKEY]
```

### Arguments

| Argument | Required | Description |
|----------|----------|-------------|
| `PUBKEY` | No (required unless `--self`) | Base64-encoded Ed25519 public key |

### Options

| Option | Description |
|--------|-------------|
| `--self` | Read public key from `~/.config/git-collab/signing-key.pub` |
| `--label <LABEL>` | Human-readable label for the key |

### Behavior

1. If `--self` is set, read pubkey from `~/.config/git-collab/signing-key.pub`. If `PUBKEY` is also provided, error.
2. Validate the key: base64 decode, check 32 bytes, check valid Ed25519 point.
3. Load existing trusted keys file (create `.git/collab/` directory and file if needed).
4. Check for duplicate (key already in file). If duplicate, print message and exit 0.
5. Append key (with optional label) to file.
6. Print confirmation.

### Exit Codes

| Code | Condition |
|------|-----------|
| 0 | Key added successfully, or key already trusted |
| 1 | Invalid key, missing argument, or I/O error |

### Output Examples

```
# Success
Trusted key added: dGhpcyBp...NDU= (Alice)

# Already trusted
Key dGhpcyBp...NDU= is already trusted.

# Invalid key
error: invalid public key: base64 decode failed

# --self without signing key
error: no signing key found — run 'collab init-key' to generate one

# --self and PUBKEY both provided
error: cannot specify both --self and a public key argument
```

---

## `collab key list`

### Synopsis

```
collab key list
```

### Behavior

1. Load and parse trusted keys file.
2. Print each entry, one per line.
3. If no trusted keys file exists or the file is empty, print a message.

### Exit Codes

| Code | Condition |
|------|-----------|
| 0 | Always (even if no keys) |

### Output Examples

```
# With keys
dGhpcyBpcyBhIHRlc3Qga2V5IGJ5dGVzMTIzNDU=  Alice
YW5vdGhlciB0ZXN0IGtleSBieXRlczEyMzQ1Njc=  Bob (laptop)
c29tZSBvdGhlciBrZXkgYnl0ZXMxMjM0NTY3ODA=

# No keys
No trusted keys configured.
```

---

## `collab key remove`

### Synopsis

```
collab key remove <PUBKEY>
```

### Arguments

| Argument | Required | Description |
|----------|----------|-------------|
| `PUBKEY` | Yes | Base64-encoded public key to remove |

### Behavior

1. Load trusted keys file.
2. Find matching entry by pubkey string.
3. Rewrite file without the entry.
4. Print the removed key and its label (so the user can verify what was removed).

### Exit Codes

| Code | Condition |
|------|-----------|
| 0 | Key removed successfully |
| 1 | Key not found in trusted list, or I/O error |

### Output Examples

```
# Success
Removed trusted key: dGhpcyBp...NDU= (Alice)

# Not found
error: key dGhpcyBp...NDU= is not in the trusted keys list
```

---

## Sync Behavior Changes

### When Trusted Keys Are Configured

During `collab sync`, after cryptographic signature verification passes, each commit's signing pubkey is checked against the trusted keys set.

If any commit on a ref is signed by an untrusted key, the entire ref is rejected:

```
  Rejecting issues abc12345: commit def67890 — untrusted key: dGhpcyBp...NDU=
```

The untrusted key's full base64 value is printed so the user can copy-paste it into `collab key add` if desired.

### When No Trusted Keys File Exists

Sync proceeds as before (any valid signature accepted). A warning is printed once:

```
warning: no trusted keys configured — all valid signatures accepted. Run 'collab key add --self' to start.
```

### Interaction with Unsigned Events

Unsigned events (no `pubkey` field) are already rejected by `verify_ref()` with `VerifyStatus::Missing`. The trust layer does not change this behavior. Per FR-002b, trust policy only applies to signed commits.
diff --git a/specs/003-key-trust-allowlist/data-model.md b/specs/003-key-trust-allowlist/data-model.md
new file mode 100644
index 0000000..c268774
--- /dev/null
+++ b/specs/003-key-trust-allowlist/data-model.md
@@ -0,0 +1,111 @@
# Data Model: Key Trust Allowlist

**Date**: 2026-03-21 | **Feature**: 003-key-trust-allowlist

## Trusted Keys File

### Location

```
<repo>/.git/collab/trusted-keys
```

Project-local, not committed to any branch, not synced between collaborators.

### Format

SSH authorized_keys style. One entry per line.

```
# Trusted keys for project X
# Added 2026-03-21
dGhpcyBpcyBhIHRlc3Qga2V5IGJ5dGVzMTIzNDU= Alice
YW5vdGhlciB0ZXN0IGtleSBieXRlczEyMzQ1Njc= Bob (laptop)
c29tZSBvdGhlciBrZXkgYnl0ZXMxMjM0NTY3ODA=
```

### Parsing Rules

| Rule | Detail |
|------|--------|
| Delimiter | First space character separates key from label |
| Key field | Base64-encoded 32-byte Ed25519 public key (44 characters with padding) |
| Label field | Optional, free-form text (may contain spaces), everything after first space |
| Comments | Lines starting with `#` (after trimming) are skipped |
| Blank lines | Skipped |
| Malformed lines | Skipped with a warning to stderr (invalid base64, wrong byte length, invalid Ed25519 point) |
| Duplicates | Prevented on `add`; if present in file, deduplicated on load (last wins for label) |
| Encoding | UTF-8 |
| Line endings | LF or CRLF accepted |
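
Under these rules, the line parser can be sketched in stdlib-only Rust. This is a minimal sketch, not the implementation: key-field validation and the stderr warning for malformed lines are elided, and the type name simply mirrors the data model below.

```rust
/// A single trusted key entry (mirrors the in-memory representation below).
#[derive(Debug, Clone, PartialEq)]
pub struct TrustedKey {
    pub pubkey: String,
    pub label: Option<String>,
}

/// Parse trusted-keys file contents per the rules above: skip comments and
/// blank lines, split key from label at the first space, tolerate CRLF,
/// and deduplicate by pubkey with last-wins label semantics.
pub fn parse_trusted_keys(contents: &str) -> Vec<TrustedKey> {
    let mut entries: Vec<TrustedKey> = Vec::new();
    for raw in contents.lines() {
        // `lines()` strips `\n`; `trim()` also drops a stray `\r` (CRLF).
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // blank line or comment
        }
        let (key, label) = match line.split_once(' ') {
            Some((k, l)) => (
                k.to_string(),
                Some(l.trim().to_string()).filter(|s| !s.is_empty()),
            ),
            None => (line.to_string(), None),
        };
        // Deduplicate on load: the last occurrence wins for the label.
        if let Some(i) = entries.iter().position(|e| e.pubkey == key) {
            entries[i].label = label;
        } else {
            entries.push(TrustedKey { pubkey: key, label });
        }
    }
    entries
}
```

A real implementation would additionally validate each key field and emit the malformed-line warning before accepting an entry.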

### Validation on Add

A key is valid if all of the following hold:
1. Base64-decodable (standard alphabet, with or without padding)
2. Decoded bytes are exactly 32 bytes
3. Bytes represent a valid Ed25519 public key point (`VerifyingKey::from_bytes()` succeeds)
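
A stdlib-only sketch of checks 1 and 2 follows. The tiny decoder is a stand-in for the `base64` crate, and check 3 (Ed25519 point validity) is only indicated in a comment because it requires ed25519-dalek; the function name is illustrative.

```rust
/// Minimal standard-alphabet base64 decoder (stand-in for the `base64` crate).
/// Padding characters are tolerated anywhere, per "with or without padding".
fn b64_decode(s: &str) -> Option<Vec<u8>> {
    let mut bits: u32 = 0;
    let mut nbits = 0;
    let mut out = Vec::new();
    for c in s.bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => continue,      // tolerate padding
            _ => return None,      // invalid character
        };
        bits = (bits << 6) | v as u32;
        nbits += 6;
        if nbits >= 8 {
            nbits -= 8;
            out.push((bits >> nbits) as u8); // truncates to low 8 bits
        }
    }
    Some(out)
}

/// Syntactic validation (checks 1 and 2 above).
pub fn validate_pubkey_syntax(pubkey_b64: &str) -> Result<(), String> {
    let bytes = b64_decode(pubkey_b64).ok_or("invalid base64")?;
    if bytes.len() != 32 {
        return Err(format!("expected 32 bytes, got {}", bytes.len()));
    }
    // Check 3 in the real implementation:
    // VerifyingKey::from_bytes(&bytes.try_into().unwrap())?;
    Ok(())
}
```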

## In-Memory Representation

```rust
/// A single trusted key entry.
pub struct TrustedKey {
    /// Base64-encoded Ed25519 public key (as stored in file and in SignedEvent.pubkey).
    pub pubkey: String,
    /// Optional human-readable label.
    pub label: Option<String>,
}

/// Loaded trust policy.
pub enum TrustPolicy {
    /// No trusted keys file exists. Fall back to accepting any valid signature.
    Unconfigured,
    /// Trusted keys file exists (possibly empty). Enforce allowlist.
    Configured(Vec<TrustedKey>),
}
```

For fast lookup during sync, the `Configured` variant's keys are also collected into a `HashSet<String>` of base64 pubkey strings.
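
A minimal sketch of that lookup-set construction (the struct mirrors the definition above):

```rust
use std::collections::HashSet;

pub struct TrustedKey {
    pub pubkey: String,
    pub label: Option<String>,
}

/// Collect the base64 pubkey strings of a `Configured` policy into a set
/// so membership checks during sync are O(1) per commit.
pub fn to_lookup_set(keys: &[TrustedKey]) -> HashSet<String> {
    keys.iter().map(|k| k.pubkey.clone()).collect()
}
```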

## VerifyStatus Enum (Modified)

```rust
pub enum VerifyStatus {
    Valid,      // Signature cryptographically valid
    Invalid,    // Signature present but verification failed
    Missing,    // No signature/pubkey fields
    Untrusted,  // NEW: valid signature but key not in trusted set
}
```

## Error Variants (New)

```rust
pub enum Error {
    // ... existing variants ...
    #[error("untrusted key: {0}")]
    UntrustedKey(String),
}
```

## CLI Data Flow

### `collab key add <pubkey> [--label <label>] [--self]`

1. If `--self`: read pubkey from `~/.config/git-collab/signing-key.pub`
2. Validate key (base64 + 32 bytes + valid point)
3. Load existing file (or create)
4. Check for duplicates
5. Append `<pubkey> <label>\n` (or just `<pubkey>\n`)

### `collab key list`

1. Load and parse file
2. Print each entry: `<pubkey>  <label>` (or `<pubkey>  (no label)`)

### `collab key remove <pubkey>`

1. Load file
2. Find matching entry
3. Rewrite file without the entry
4. Print removed key and its label
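
Steps 2-3 of `remove` can be sketched at the string level. This is a hypothetical helper, not the planned API: the real command wraps it with reads and writes of `.git/collab/trusted-keys`.

```rust
/// Rewrite trusted-keys contents without the matching entry. Returns the new
/// contents plus the removed entry's label; errors if the key is not present.
pub fn remove_key_from_contents(
    contents: &str,
    pubkey: &str,
) -> Result<(String, Option<String>), String> {
    let mut removed: Option<Option<String>> = None;
    let mut kept: Vec<&str> = Vec::new();
    for raw in contents.lines() {
        let line = raw.trim();
        let (key, label) = match line.split_once(' ') {
            Some((k, l)) => (k, Some(l.trim())),
            None => (line, None),
        };
        if !line.is_empty() && !line.starts_with('#') && key == pubkey {
            removed = Some(label.map(str::to_string)); // drop this entry
        } else {
            kept.push(raw); // keep comments, blanks, other keys verbatim
        }
    }
    match removed {
        Some(label) => Ok((kept.join("\n") + "\n", label)),
        None => Err(format!("key not found: {pubkey}")),
    }
}
```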
diff --git a/specs/003-key-trust-allowlist/plan.md b/specs/003-key-trust-allowlist/plan.md
new file mode 100644
index 0000000..fbc30fe
--- /dev/null
+++ b/specs/003-key-trust-allowlist/plan.md
@@ -0,0 +1,146 @@
# Implementation Plan: Key Trust Allowlist

**Branch**: `003-key-trust-allowlist` | **Date**: 2026-03-21 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/003-key-trust-allowlist/spec.md`

## Summary

Add a trusted-keys allowlist that gates sync verification. Currently `verify_ref()` accepts any valid Ed25519 signature. This feature adds a project-local trusted keys file (`.git/collab/trusted-keys`) and CLI commands (`collab key add/list/remove`) so that only events signed by explicitly trusted public keys are accepted during sync. When no trusted keys file exists, the system falls back to current behavior with a warning.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: git2 0.19, clap 4 (derive), ed25519-dalek 2, base64 0.22, serde/serde_json 1, dirs 5, thiserror 2
**Storage**: Git refs under `.git/refs/collab/`, trusted keys file at `.git/collab/trusted-keys` (plain text, not a git object)
**Testing**: `cargo test`, integration tests in `tests/`, unit tests inline (`#[cfg(test)]`), tempfile 3 for test repos
**Target Platform**: Linux/macOS CLI
**Project Type**: CLI tool (binary crate)
**Performance Goals**: Key management operations complete in under 1 second (SC-004)
**Constraints**: No new dependencies required. All functionality covered by existing crates.
**Scale/Scope**: Trusted keys files expected to have <100 entries per project.

## Constitution Check

Constitution is unconfigured (template placeholders). No gates to enforce.

## Architecture

### Integration with Existing verify_ref Flow

The key trust check is a **post-verification filter** layered on top of the existing `verify_ref()` function. The flow becomes:

1. `sync::reconcile_refs()` calls `signing::verify_ref()` (unchanged -- still checks cryptographic validity)
2. After `verify_ref()` returns, `reconcile_refs()` calls a new `trust::check_trust()` function that:
   - Loads the trusted keys file (if it exists)
   - For each `VerifyStatus::Valid` result, checks whether `result.pubkey` is in the trusted set
   - Returns a new `VerifyStatus::Untrusted` variant (or an `UntrustedKey` error) for keys not in the set
3. If no trusted keys file exists, all valid signatures pass through with a warning

This design keeps `verify_ref()` purely about cryptographic correctness and separates policy (trust) into its own module.
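
The post-verification filter in step 2 can be sketched as follows. Type names follow the plan but are local stand-ins here; `None` for the trusted set models the unconfigured fallback of step 3.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
pub enum VerifyStatus { Valid, Invalid, Missing, Untrusted }

#[derive(Debug, Clone)]
pub struct SignatureVerificationResult {
    pub status: VerifyStatus,
    pub pubkey: Option<String>,
}

/// Downgrade `Valid` results whose signing key is not in the trusted set.
/// `None` means no trusted-keys file exists: pass everything through.
pub fn check_trust(
    trusted: Option<&HashSet<String>>,
    mut results: Vec<SignatureVerificationResult>,
) -> Vec<SignatureVerificationResult> {
    let Some(set) = trusted else { return results };
    for r in &mut results {
        if r.status == VerifyStatus::Valid {
            let is_trusted = r.pubkey.as_deref().map_or(false, |k| set.contains(k));
            if !is_trusted {
                r.status = VerifyStatus::Untrusted;
            }
        }
    }
    results
}
```

Note that `Invalid` and `Missing` results are left untouched, matching the rule that trust policy applies only to cryptographically valid, signed commits.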

### New VerifyStatus Variant

Add `Untrusted` to the `VerifyStatus` enum:

```rust
pub enum VerifyStatus {
    Valid,
    Invalid,
    Missing,
    Untrusted,  // new: signature valid but key not in trusted set
}
```

### Trusted Keys File Format

Location: `.git/collab/trusted-keys`

```
# Comments start with #
<base64-pubkey> <optional label with spaces>
<base64-pubkey>
```

First space delimits key from label. Lines starting with `#` are comments. Empty lines skipped. Malformed lines produce a warning and are skipped.

### CLI Integration

New top-level `Key` subcommand group in `cli.rs`:

```
collab key add <pubkey> [--label <label>] [--self]
collab key list
collab key remove <pubkey>
```

`--self` reads from `~/.config/git-collab/signing-key.pub` and adds it.

## Project Structure

### Documentation (this feature)

```text
specs/003-key-trust-allowlist/
├── plan.md              # This file
├── spec.md              # Feature specification
├── research.md          # Codebase analysis and design research
├── data-model.md        # Trusted keys file format specification
├── quickstart.md        # Developer quickstart for implementation
├── contracts/
│   └── cli-commands.md  # CLI command contracts
└── checklists/          # Acceptance checklists
```

### Source Code (repository root)

```text
src/
├── trust.rs             # NEW: trusted keys file I/O, trust checking logic
├── signing.rs           # MODIFIED: add Untrusted variant to VerifyStatus
├── sync.rs              # MODIFIED: integrate trust check into reconcile_refs
├── cli.rs               # MODIFIED: add Key subcommand group
├── lib.rs               # MODIFIED: add trust module, wire Key commands
└── error.rs             # MODIFIED: add UntrustedKey error variant

tests/
├── trust_test.rs        # NEW: unit/integration tests for trust module
└── common/              # Shared test helpers (existing)
```

**Structure Decision**: Single flat module (`src/trust.rs`) following the existing pattern of one module per domain concept (signing, sync, issue, patch). No new directories needed.

## Key Design Decisions

### 1. Separate trust module vs. extending signing.rs

**Decision**: New `src/trust.rs` module.
**Rationale**: `signing.rs` handles cryptographic operations (sign, verify). Trust policy (which keys to accept) is a distinct concern. Separation keeps both modules focused and testable independently.

### 2. Untrusted as a VerifyStatus variant vs. separate error

**Decision**: Add `Untrusted` variant to `VerifyStatus`.
**Rationale**: The sync reconciliation loop already pattern-matches on `VerifyStatus`. Adding a variant keeps the rejection logic uniform rather than requiring a second error-handling path.

### 3. Trust check location: verify_ref vs. reconcile_refs

**Decision**: Check trust in `reconcile_refs()` after `verify_ref()` returns, not inside `verify_ref()`.
**Rationale**: `verify_ref()` is a pure cryptographic function that takes only `(repo, ref_name)`. Adding trust would require passing the trusted keys set and coupling it to file I/O. Keeping trust checking in `reconcile_refs()` (which already has the `Repository` handle needed to derive the `.git/collab/` path) is cleaner.

### 4. Fallback behavior when no trusted keys file exists

**Decision**: Accept all valid signatures, print a warning.
**Rationale**: Per FR-007 and spec edge cases. This ensures backward compatibility and a smooth adoption path.

### 5. Key validation on add

**Decision**: Validate base64 decoding AND 32-byte length AND Ed25519 point validity on `collab key add`.
**Rationale**: Per FR-008. Catching invalid keys early prevents confusing errors during sync. Reuse existing `VerifyingKey::from_bytes()` from ed25519-dalek.

### 6. File locking

**Decision**: No file locking for trusted-keys file.
**Rationale**: The file is local-only, single-user. Concurrent CLI invocations on the same file are unlikely and the worst case is a benign race. Consistent with how git itself handles similar local config files.

## Complexity Tracking

No constitution violations to justify.
diff --git a/specs/003-key-trust-allowlist/quickstart.md b/specs/003-key-trust-allowlist/quickstart.md
new file mode 100644
index 0000000..a35624f
--- /dev/null
+++ b/specs/003-key-trust-allowlist/quickstart.md
@@ -0,0 +1,90 @@
# Quickstart: Key Trust Allowlist

**Date**: 2026-03-21 | **Feature**: 003-key-trust-allowlist

## Prerequisites

- Rust toolchain (edition 2021)
- Existing git-collab build: `cargo build`
- A signing keypair: `collab init-key`

## Implementation Order

### Step 1: Add `Untrusted` variant to `VerifyStatus`

File: `src/signing.rs`

Add `Untrusted` to the `VerifyStatus` enum. Update any exhaustive match arms in `src/sync.rs` and `src/signing.rs` that pattern-match on `VerifyStatus` (the compiler will find them).

### Step 2: Add error variant

File: `src/error.rs`

Add `UntrustedKey(String)` variant to the `Error` enum.

### Step 3: Create `src/trust.rs`

New file with these public functions:

```rust
pub fn trusted_keys_path(repo: &Repository) -> PathBuf
pub fn load_trust_policy(repo: &Repository) -> Result<TrustPolicy, Error>
pub fn add_key(repo: &Repository, pubkey: &str, label: Option<&str>) -> Result<(), Error>
pub fn remove_key(repo: &Repository, pubkey: &str) -> Result<(String, Option<String>), Error>
pub fn list_keys(repo: &Repository) -> Result<Vec<TrustedKey>, Error>
pub fn validate_pubkey(pubkey_b64: &str) -> Result<(), Error>
pub fn check_trust(policy: &TrustPolicy, results: &[SignatureVerificationResult]) -> Vec<SignatureVerificationResult>
```

### Step 4: Add CLI commands

File: `src/cli.rs`

Add `KeyCmd` enum and `Key(KeyCmd)` variant to `Commands`.

### Step 5: Wire CLI to trust module

File: `src/lib.rs`

Add `pub mod trust;` and handle `Commands::Key(cmd)` in `run()`.

### Step 6: Integrate trust into sync

File: `src/sync.rs`

In `reconcile_refs()`, after `verify_ref()` succeeds, load `TrustPolicy` and run `check_trust()`. Reject refs with untrusted keys.

### Step 7: Tests

- Unit tests in `src/trust.rs` (`#[cfg(test)]` module)
- Integration test in `tests/trust_test.rs`

## Build and Test

```bash
cargo build
cargo test
cargo test --test trust_test
```

## Manual Verification

```bash
# Generate key if needed
collab init-key

# Add your own key
collab key add --self --label "me"

# List keys
collab key list

# Add a teammate's key
collab key add dGhpcyBpcyBhIHRlc3Qga2V5IGJ5dGVzMTIzNDU= --label "Alice"

# Remove a key
collab key remove dGhpcyBpcyBhIHRlc3Qga2V5IGJ5dGVzMTIzNDU=

# Sync (should warn if no trusted keys configured)
collab sync
```
diff --git a/specs/003-key-trust-allowlist/research.md b/specs/003-key-trust-allowlist/research.md
new file mode 100644
index 0000000..959e246
--- /dev/null
+++ b/specs/003-key-trust-allowlist/research.md
@@ -0,0 +1,77 @@
# Research: Key Trust Allowlist

**Date**: 2026-03-21 | **Feature**: 003-key-trust-allowlist

## Codebase Analysis

### Current Signing Architecture

The signing system lives in `src/signing.rs` and provides:

- **Key management**: `generate_keypair()`, `load_signing_key()`, `load_verifying_key()` — keys stored at `~/.config/git-collab/signing-key{,.pub}` as base64-encoded Ed25519 bytes.
- **Event signing**: `sign_event()` produces a `SignedEvent` (event + base64 signature + base64 pubkey). The pubkey is embedded per-event, not referenced from a keyring.
- **Verification**: `verify_signed_event()` checks the signature against the embedded pubkey. `verify_ref()` walks the DAG for a ref and returns `Vec<SignatureVerificationResult>` with `status: VerifyStatus` and optional `pubkey: Option<String>`.
- **VerifyStatus**: `Valid`, `Invalid`, `Missing` — no trust/authorization concept.

### Current Sync Flow

`src/sync.rs::sync()`:
1. `git fetch` collab refs into `refs/collab/sync/{issues,patches}/*`
2. `reconcile_refs()` for issues and patches
3. `git push` local collab refs
4. Clean up sync refs

`reconcile_refs()`:
1. Iterates sync refs
2. Calls `signing::verify_ref(repo, remote_ref)` for each
3. Rejects the entire ref if **any** commit has `status != Valid`
4. If all valid: reconcile (merge DAGs) or adopt new ref

Key observation: the rejection logic at step 3 is where trust checking naturally fits. The `results` vector already contains `pubkey: Option<String>` for each verified commit.

### CLI Structure

`src/cli.rs` uses clap derive with a top-level `Commands` enum. Subcommand groups use nested enums (`IssueCmd`, `PatchCmd`). Adding `Key(KeyCmd)` follows the established pattern.

`src/lib.rs::run()` dispatches commands via pattern matching. Adding a `Commands::Key(cmd)` arm is straightforward.

### Error Handling

`src/error.rs` uses `thiserror` with variants for Git, JSON, IO, Signing, Verification, Cmd, and KeyNotFound. New trust-related errors fit naturally as additional variants.

### Storage Patterns

The project uses `.git/collab/` for local state (not git objects). The trusted keys file at `.git/collab/trusted-keys` follows this convention. The directory may not exist yet for a given repo; `collab key add` should create it, mirroring any existing `.git/collab/` creation logic.

### Dependencies Already Available

- `base64` 0.22 — for key encoding/decoding
- `ed25519-dalek` 2 — for `VerifyingKey::from_bytes()` validation
- `dirs` 5 — for locating `~/.config/git-collab/signing-key.pub` (used by `--self`)
- `std::fs` — for file I/O (no serde needed; plain text format)

No new crate dependencies are required.

## Design Considerations

### Where to Load Trusted Keys

The trusted keys file path depends on `repo.path()` (the `.git/` directory). In `reconcile_refs()`, the repo is already available. The trust module should expose:

```rust
pub fn trusted_keys_path(repo: &Repository) -> PathBuf
pub fn load_trusted_keys(repo: &Repository) -> Result<Option<HashSet<String>>, Error>
```

Returning `Option<HashSet>` — `None` means no file exists (fallback mode), `Some(set)` means enforce trust.

### Backward Compatibility

- Unsigned events (`VerifyStatus::Missing`) are already rejected by `reconcile_refs()`. The spec says trust policy only applies to signed commits (FR-002b), but the current code already rejects Missing. No change needed — unsigned commits are rejected regardless.
- When no trusted keys file exists, behavior is identical to current (all valid sigs accepted). A warning is printed once per sync.

### Test Strategy

- **Unit tests** in `src/trust.rs`: parse trusted keys file, validate keys, check membership, handle edge cases (comments, blank lines, duplicates, malformed).
- **Integration tests** in `tests/trust_test.rs`: end-to-end flow using temp repos — add key, sync with trusted/untrusted events, verify acceptance/rejection.
- Existing tests in `tests/collab_test.rs` and `tests/sync_test.rs` should continue to pass (no trusted keys file = fallback mode).
diff --git a/specs/003-key-trust-allowlist/review.md b/specs/003-key-trust-allowlist/review.md
new file mode 100644
index 0000000..4663e80
--- /dev/null
+++ b/specs/003-key-trust-allowlist/review.md
@@ -0,0 +1,63 @@
# Review: 003-key-trust-allowlist

**Date**: 2026-03-21
**Reviewer**: Automated cross-artifact analysis
**Artifacts**: spec.md, plan.md, tasks.md
**Codebase verified**: src/signing.rs, src/error.rs, src/sync.rs

---

## Dimension Evaluation

### 1. Spec-Plan Alignment: PASS
The plan faithfully implements all 12 functional requirements (FR-001 through FR-010, including FR-002a and FR-002b). The architecture decisions (separate trust module, post-verification filter, Untrusted variant) directly serve the spec's requirements. The fallback behavior (FR-007), key validation (FR-008), duplicate prevention (FR-009), and label support (FR-010) are all addressed. No spec requirements are missed or contradicted.

### 2. Plan-Tasks Completeness: PASS
Every architectural element in the plan maps to at least one task. All files listed in the plan's Project Structure section appear in the tasks' File Ownership table. The TDD approach from the plan is reflected in test tasks preceding implementation tasks in every phase. One minor note: T009 bundles `remove_trusted_key()` implementation into Phase 2 (US1) rather than Phase 4 (US3), but this is pragmatic grouping for code locality and does not create a gap.

### 3. Dependency Ordering: PASS
Phase dependencies are correct and verified against the codebase:
- Phase 1 creates types that Phase 2+ needs (VerifyStatus::Untrusted, Error::UntrustedKey, trust module)
- Phase 2 creates trust.rs I/O functions that Phase 3 (sync integration) and Phase 4 (CLI list/remove) need
- Phase 3 and Phase 4 touch different files (sync.rs vs lib.rs) and can correctly parallelize
- Phase 5 depends on all prior phases
- Within-phase sequential ordering is correct (tests before implementation)

### 4. Parallelization Correctness: PASS
Parallel groups are correctly identified:
- Group 1 (T004/T005/T006): Different test locations (unit vs integration), no file conflicts
- Group 2 (T008/T009): Different functions in the same file, with no overlap in implementation
- Group 3 (T012/T013): Unit tests vs integration tests, different files
- Group 4 (T016/T017): Different test functions in same file, independent
- Group 5 (T020/T021): Different test files
- Phase 3 || Phase 4: Confirmed different files (sync.rs vs lib.rs)

Minor concern: Group 2 (T008/T009) both write to `src/trust.rs`. While they implement different functions, parallel editing of the same file can cause merge conflicts. This is a WARN-level risk but manageable with coordination.

### 5. Feasibility & Risk: PASS
- All referenced crate APIs exist and are correct (ed25519-dalek VerifyingKey::from_bytes, base64 decode, etc.)
- The existing codebase structure (SignatureVerificationResult with pubkey: Option<String>) supports the plan's approach
- The existing rejection logic in reconcile_refs (`r.status != VerifyStatus::Valid`) means adding Untrusted variant will be caught automatically, simplifying T015
- No new dependencies needed -- confirmed against the existing Cargo.toml
- File format (authorized_keys style) is simple and well-understood
- Scale assumption (<100 keys) makes O(n) lookup acceptable

### 6. Implementation Readiness: PASS
- File paths are exact and verified against the repository
- Function signatures are specified with input/output types
- Test scenarios map to acceptance criteria in the spec
- Checkpoints after each phase enable incremental validation
- The TDD workflow is explicit and matches the user's stated preferences
- 22 tasks at reasonable granularity -- neither too coarse nor too fine

---

## Overall Verdict: PASS

The spec, plan, and tasks are well-aligned, complete, and ready for implementation. No critical or high-severity issues found. The architecture integrates cleanly with the existing codebase. The only minor concerns are:

1. T009 labeling (remove function grouped under US1 phase) -- pragmatic but slightly misleading
2. T008/T009 parallel editing of the same file -- manageable risk
3. T015's integration point description could be more explicit about insertion location within reconcile_refs

These are all LOW severity and do not block implementation.
diff --git a/specs/003-key-trust-allowlist/spec.md b/specs/003-key-trust-allowlist/spec.md
new file mode 100644
index 0000000..d6292ed
--- /dev/null
+++ b/specs/003-key-trust-allowlist/spec.md
@@ -0,0 +1,113 @@
# Feature Specification: Key Trust Allowlist

**Feature Branch**: `003-key-trust-allowlist`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "Add a key trust/allowlist mechanism for sync verification. Currently any valid Ed25519 key is accepted during sync — an attacker with remote write access can forge events with their own key. Add a known-keys file that lists trusted public keys, and reject events signed by unknown keys during sync. Users should be able to add/remove trusted keys via CLI commands."

## User Scenarios & Testing

### User Story 1 - Add Trusted Keys (Priority: P1)

As a collaborator, I can add a teammate's public key to a trusted keys list so that events signed by their key are accepted during sync.

**Why this priority**: Without the ability to register trusted keys, there is no allowlist to verify against. This is the foundational action that enables the entire trust model.

**Independent Test**: Can be fully tested by running `collab key add <pubkey>`, then verifying the key appears in the trusted keys list via `collab key list`.

**Acceptance Scenarios**:

1. **Given** a collaborator has another user's base64-encoded public key, **When** they run `collab key add <pubkey>`, **Then** the key is stored in the project's trusted keys file and a confirmation message is shown.
2. **Given** a collaborator has generated a local signing key, **When** they run `collab key add --self`, **Then** their own public key is read from the local signing key file and added to the trusted list.
3. **Given** a collaborator has an invalid or malformed key, **When** they run `collab key add` with it, **Then** the command fails with a clear error message.
4. **Given** a key is already trusted, **When** a collaborator runs `collab key add` with that key, **Then** the command reports that the key is already in the list (no duplicate is added).

---

### User Story 2 - Reject Untrusted Keys During Sync (Priority: P2)

As a collaborator, when I sync from a remote, events signed by keys not in my trusted keys list are rejected, preventing impersonation by attackers with remote write access.

**Why this priority**: This is the core security gate. Without rejection of untrusted keys, the allowlist has no enforcement and the trust model is incomplete.

**Independent Test**: Can be tested by syncing from a remote that contains events signed by an untrusted key and verifying they are rejected with a clear message identifying the unknown key.

**Acceptance Scenarios**:

1. **Given** a remote contains events signed by a trusted key, **When** I sync, **Then** the events are accepted normally.
2. **Given** a remote contains events signed by a key not in the trusted keys list, **When** I sync, **Then** the ref is rejected and the user is warned with the unknown public key value so they can decide whether to trust it.
3. **Given** a remote contains events from multiple authors (some trusted, some not), **When** I sync, **Then** only refs containing untrusted signatures are rejected; refs with all-trusted signatures sync normally.
4. **Given** the trusted keys file does not exist (no keys added yet), **When** I sync, **Then** the system falls back to the current behavior (accept any valid signature) and warns the user that no trusted keys are configured.

---

### User Story 3 - Manage Trusted Keys (Priority: P3)

As a collaborator, I can list and remove trusted keys to maintain the allowlist over time — for example, removing a key when a teammate leaves the project.

**Why this priority**: Key lifecycle management is important for long-term security but is not required for the initial trust model to function.

**Independent Test**: Can be tested by adding keys, listing them, removing one, and verifying the list reflects the removal.

**Acceptance Scenarios**:

1. **Given** trusted keys have been added, **When** I run `collab key list`, **Then** all trusted public keys are listed with any associated labels.
2. **Given** a trusted key exists, **When** I run `collab key remove <pubkey>`, **Then** the key is removed from the trusted list and a confirmation is shown.
3. **Given** a key is not in the trusted list, **When** I run `collab key remove` with it, **Then** the command fails with a clear error message.

---

### Edge Cases

- What happens if the trusted keys file is corrupted or has invalid entries? The system skips invalid lines with a warning and processes valid entries.
- What happens if a user's key is rotated (old key replaced with new)? The old key must be explicitly removed and the new key added. Events signed with the old key remain valid if the old key is still trusted.
- What happens during the first sync on a fresh clone with no trusted keys file? The system falls back to accepting any valid signature and warns that no trust policy is configured.
- Can trusted keys be shared across collaborators? The trusted keys file is stored locally per-repository, not synced. Each collaborator maintains their own list.

## Clarifications

### Session 2026-03-21

- Q: Should the entire ref be rejected if any commit is from an untrusted key? → A: Yes, reject the entire ref (consistent with current verify_ref behavior and DAG integrity).
- Q: Should `collab key add --self` auto-add the user's own pubkey? → A: Yes, add a `--self` flag that reads from `~/.config/git-collab/signing-key.pub`.
- Q: What is the trusted keys file format? → A: SSH authorized_keys style — `<base64-key> <label>`, first space delimits. Lines starting with `#` are comments, empty lines skipped.
- Q: How to handle unsigned legacy commits when trust file exists? → A: Only enforce trust on signed commits (those with a `pubkey` field). Unsigned commits are outside the trust policy scope.
- Q: Should key removal support labels or require confirmation? → A: Remove by pubkey only, no confirmation. Print the removed key and label so user can verify.

## Requirements

### Functional Requirements

- **FR-001**: System MUST store trusted public keys in a project-local file (not user-global), so different projects can have different trust policies.
- **FR-002**: System MUST provide a `collab key add <pubkey>` command that appends a base64-encoded Ed25519 public key to the trusted keys file. A `--self` flag MUST read the user's own public key from the local signing key file and add it.
- **FR-002a**: System MUST reject the entire ref during sync if any commit on it is signed by an untrusted key (whole-ref rejection for DAG integrity).
- **FR-002b**: System MUST only enforce trust policy on signed commits (those with a `pubkey` field). Unsigned legacy commits are outside the trust policy scope and are not subject to key trust checks.
- **FR-003**: System MUST provide a `collab key list` command that displays all currently trusted public keys.
- **FR-004**: System MUST provide a `collab key remove <pubkey>` command that removes a public key from the trusted keys file.
- **FR-005**: System MUST reject event commits during sync whose signature was made by a key not present in the trusted keys file, reporting the unknown key's base64 value.
- **FR-006**: System MUST accept event commits during sync whose signature was made by a key present in the trusted keys file, provided the signature is cryptographically valid.
- **FR-007**: System MUST fall back to current behavior (accept any valid signature) when no trusted keys file exists, and print a warning advising the user to configure trusted keys.
- **FR-008**: System MUST validate that a key being added is a syntactically valid base64-encoded 32-byte Ed25519 public key, rejecting malformed input.
- **FR-009**: System MUST prevent duplicate keys in the trusted keys file.
- **FR-010**: System MUST allow the `collab key add` command to accept an optional label for the key (e.g., `collab key add <pubkey> --label "Alice"`), displayed in `collab key list` output.

### Key Entities

- **Trusted Keys File**: A project-local file listing base64-encoded Ed25519 public keys that are authorized to sign events. SSH authorized_keys style format: one key per line, first space delimits key from label. Lines starting with `#` are comments, empty lines are skipped.
- **Trusted Key Entry**: A single line in the trusted keys file consisting of a base64-encoded public key, optionally followed by a space and a human-readable label (label may contain spaces).

## Success Criteria

### Measurable Outcomes

- **SC-001**: 100% of events signed by untrusted keys are rejected during sync when a trusted keys file is configured.
- **SC-002**: 0% of events signed by trusted keys are incorrectly rejected.
- **SC-003**: Users receive the unknown key's public value in rejection messages, enabling a single copy-paste to add trust if desired.
- **SC-004**: Key management operations (add, list, remove) complete in under 1 second.

## Assumptions

- The trusted keys file is stored locally in the repository's git directory (e.g., under `.git/collab/trusted-keys`), not committed to the main branch.
- Each collaborator maintains their own trusted keys list independently.
- Public keys are exchanged out-of-band (same assumption as the signing feature).
- The user's own public key is not automatically added to the trusted list — it must be explicitly added like any other key.
diff --git a/specs/003-key-trust-allowlist/tasks.md b/specs/003-key-trust-allowlist/tasks.md
new file mode 100644
index 0000000..1bb807f
--- /dev/null
+++ b/specs/003-key-trust-allowlist/tasks.md
@@ -0,0 +1,179 @@
# Tasks: Key Trust Allowlist

**Input**: Design documents from `/specs/003-key-trust-allowlist/`
**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/

**Tests**: TDD approach -- write tests first, verify they fail, then implement.

**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

---

## Phase 1: Foundational (Blocking Prerequisites)

**Purpose**: Shared types, error variants, and the trust module skeleton that all user stories depend on.

<!-- sequential -->
- [x] T001 [US1] Add `Untrusted` variant to `VerifyStatus` enum in `src/signing.rs` -- add `Untrusted` to the enum and update any existing match arms that need a wildcard or an explicit new arm (e.g., a `Display` impl; the `PartialEq` derive needs no change). No behavioral change yet.

<!-- sequential -->
- [x] T002 [US1] Add `UntrustedKey(String)` variant to `Error` enum in `src/error.rs`.

<!-- sequential -->
- [x] T003 [US1] Create `src/trust.rs` with `TrustedKey` and `TrustPolicy` structs, `trusted_keys_path()`, `load_trust_policy()`, `is_key_trusted()`, and `validate_pubkey()` function stubs (return `todo!()`). Register `pub mod trust;` in `src/lib.rs`.

**Checkpoint**: Foundation types compiled. All user stories can now proceed.

---

## Phase 2: User Story 1 -- Add Trusted Keys (Priority: P1)

**Goal**: Users can add a teammate's public key to a trusted keys list via `collab key add`.

**Independent Test**: Run `collab key add <pubkey>`, then verify the key appears via `collab key list`.

### Tests for User Story 1

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**

<!-- parallel-group: 1 -->
- [x] T004 [P] [US1] Unit tests for `validate_pubkey()` in `src/trust.rs` (`#[cfg(test)]` module) -- test valid base64 32-byte Ed25519 key accepted, invalid base64 rejected, wrong byte length rejected, invalid Ed25519 point rejected.
- [x] T005 [P] [US1] Unit tests for `load_trust_policy()` and `save_trusted_key()` in `src/trust.rs` (`#[cfg(test)]` module) -- test: file not found returns `TrustPolicy::Unconfigured`, empty file returns `Configured(empty)`, file with keys+comments+blanks parses correctly, malformed lines skipped with warning, duplicates deduplicated.
- [x] T006 [P] [US1] Integration test for `collab key add` in `tests/trust_test.rs` -- test: add valid key writes to `.git/collab/trusted-keys`, add duplicate key prints already-trusted message, add invalid key returns error, `--self` reads from signing-key.pub, `--self` with PUBKEY arg errors, add with `--label` stores label.

### Implementation for User Story 1

<!-- sequential -->
- [x] T007 [US1] Implement `validate_pubkey()` in `src/trust.rs` -- base64 decode, check 32 bytes, `VerifyingKey::from_bytes()`. Return `Result<(), Error>` with descriptive error messages.

<!-- parallel-group: 2 -->
- [x] T008 [P] [US1] Implement `trusted_keys_path()`, `load_trust_policy()`, and file parsing in `src/trust.rs` -- parse `<base64-key> <optional label>` format, skip `#` comments and blank lines, warn on malformed lines, deduplicate on load. Return `TrustPolicy::Unconfigured` when file missing, `TrustPolicy::Configured(Vec<TrustedKey>)` when present.
- [x] T009 [P] [US1] Implement `save_trusted_key()` and `remove_trusted_key()` in `src/trust.rs` -- append key+label to file (create `.git/collab/` dir if needed), check for duplicates before adding. `remove_trusted_key()` rewrites file without the matching entry.

<!-- sequential -->
- [x] T010 [US1] Add `KeyCmd` subcommand enum to `src/cli.rs` -- add `Key(KeyCmd)` variant to `Commands` enum. `KeyCmd` has `Add { pubkey: Option<String>, self_key: bool, label: Option<String> }`, `List`, and `Remove { pubkey: String }` following the existing `IssueCmd`/`PatchCmd` pattern.

<!-- sequential -->
- [x] T011 [US1] Wire `Commands::Key(KeyCmd::Add { .. })` handler in `src/lib.rs` -- implement the `key add` logic: handle `--self` (load from `signing_key_dir()/signing-key.pub`), validate key, check duplicate, save, print confirmation. Wire only `Add` for now; `List` and `Remove` can return `todo!()`.

**Checkpoint**: `collab key add` works end-to-end. Tests from T004-T006 pass.
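
T007's validation can be sketched as below. The real implementation would use the `base64` crate and `ed25519_dalek::VerifyingKey::from_bytes()` for the point check; this self-contained sketch hand-rolls the decode and elides the curve check:

```rust
// Sketch of validate_pubkey(): base64 decode, 32-byte length check.
// Hand-rolled decode for illustration only; the Ed25519 point check
// (VerifyingKey::from_bytes) is noted but not performed here.
fn validate_pubkey(s: &str) -> Result<(), String> {
    let mut bits: u32 = 0;
    let mut nbits = 0;
    let mut bytes = Vec::new();
    for c in s.trim_end_matches('=').chars() {
        let v = match c {
            'A'..='Z' => c as u32 - 'A' as u32,
            'a'..='z' => c as u32 - 'a' as u32 + 26,
            '0'..='9' => c as u32 - '0' as u32 + 52,
            '+' => 62,
            '/' => 63,
            _ => return Err(format!("invalid base64 character: {}", c)),
        };
        bits = (bits << 6) | v;
        nbits += 6;
        if nbits >= 8 {
            nbits -= 8;
            bytes.push((bits >> nbits) as u8);
        }
    }
    if bytes.len() != 32 {
        return Err(format!("expected 32-byte key, got {} bytes", bytes.len()));
    }
    // Real code: VerifyingKey::from_bytes(&bytes) to reject values that
    // are not valid Ed25519 points.
    Ok(())
}
```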

---

## Phase 3: User Story 2 -- Reject Untrusted Keys During Sync (Priority: P2)

**Goal**: Events signed by keys not in the trusted keys list are rejected during sync.

**Independent Test**: Sync from a remote with events signed by an untrusted key and verify rejection.

### Tests for User Story 2

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**

<!-- parallel-group: 3 -->
- [x] T012 [P] [US2] Unit tests for `check_trust()` in `src/trust.rs` (`#[cfg(test)]` module) -- test: `TrustPolicy::Unconfigured` returns all results unchanged, `Configured` with key in set returns `Valid`, `Configured` with key not in set returns `Untrusted`, `Missing`/`Invalid` statuses pass through unchanged, empty configured set rejects all valid signatures.
- [x] T013 [P] [US2] Integration test for sync trust rejection in `tests/trust_test.rs` -- test: sync accepts events from trusted key, sync rejects ref when any commit has untrusted key (whole-ref rejection), sync with no trusted keys file falls back to accept-all with warning, rejection message includes the untrusted key's base64 value.

### Implementation for User Story 2

<!-- sequential -->
- [x] T014 [US2] Implement `check_trust()` in `src/trust.rs` -- takes `&[SignatureVerificationResult]` and `&TrustPolicy`, returns new `Vec<SignatureVerificationResult>` with `Valid` statuses changed to `Untrusted` for keys not in the trusted set. `Unconfigured` passes through unchanged.

<!-- sequential -->
- [x] T015 [US2] Integrate trust checking into `reconcile_refs()` in `src/sync.rs` -- after `verify_ref()` succeeds, load `TrustPolicy` via `trust::load_trust_policy()`, call `trust::check_trust()` on results, reject ref if any result is `Untrusted` (print untrusted key value). When `Unconfigured`, print warning once per sync. Add `use crate::trust;` import.

**Checkpoint**: Sync enforces key trust. Events from untrusted keys are rejected. Tests from T012-T013 pass.
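
T014's pass-through semantics can be sketched with simplified stand-ins for the real `SignatureVerificationResult`, `VerifyStatus`, and `TrustPolicy` types (here keys are plain strings):

```rust
// Sketch of check_trust(): only Valid results are subject to the trust
// check, so an empty configured set rejects every valid signature while
// Missing/Invalid statuses pass through unchanged.
#[derive(Clone, Debug, PartialEq)]
enum VerifyStatus { Valid, Invalid, Missing, Untrusted }

struct SignatureVerificationResult { key: String, status: VerifyStatus }

enum TrustPolicy { Unconfigured, Configured(Vec<String>) }

fn check_trust(
    results: &[SignatureVerificationResult],
    policy: &TrustPolicy,
) -> Vec<SignatureVerificationResult> {
    results
        .iter()
        .map(|r| {
            let status = match (&r.status, policy) {
                (VerifyStatus::Valid, TrustPolicy::Configured(keys))
                    if !keys.contains(&r.key) => VerifyStatus::Untrusted,
                (s, _) => s.clone(), // Unconfigured passes everything through
            };
            SignatureVerificationResult { key: r.key.clone(), status }
        })
        .collect()
}
```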

---

## Phase 4: User Story 3 -- Manage Trusted Keys (Priority: P3)

**Goal**: Users can list and remove trusted keys to maintain the allowlist.

**Independent Test**: Add keys, list them, remove one, verify the list reflects the removal.

### Tests for User Story 3

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**

<!-- parallel-group: 4 -->
- [x] T016 [P] [US3] Integration tests for `collab key list` in `tests/trust_test.rs` -- test: list with no file prints "No trusted keys configured.", list with keys shows each key and label, list with key without label shows key only.
- [x] T017 [P] [US3] Integration tests for `collab key remove` in `tests/trust_test.rs` -- test: remove existing key prints confirmation with label, remove non-existent key returns error, remove last key leaves empty file (still `Configured`).

### Implementation for User Story 3

<!-- sequential -->
- [x] T018 [US3] Wire `Commands::Key(KeyCmd::List)` handler in `src/lib.rs` -- load trust policy, print each key and label (or "No trusted keys configured." if unconfigured/empty).

<!-- sequential -->
- [x] T019 [US3] Wire `Commands::Key(KeyCmd::Remove { .. })` handler in `src/lib.rs` -- call `trust::remove_trusted_key()`, print removed key and label, or error if not found.

**Checkpoint**: Full key lifecycle (add/list/remove) works. All tests pass.
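
The trusted-keys file format shared by add/list/remove (`<base64-key> <optional label>`, `#` comments, blank lines, dedupe on load) parses roughly as below; the `TrustedKey` shape is an assumption, and the per-line `validate_pubkey()` warning for malformed entries is elided:

```rust
// Sketch of the T008 file parsing for .git/collab/trusted-keys.
#[derive(Debug, PartialEq)]
struct TrustedKey { pubkey: String, label: Option<String> }

fn parse_trusted_keys(contents: &str) -> Vec<TrustedKey> {
    let mut keys: Vec<TrustedKey> = Vec::new();
    for line in contents.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // skip blank lines and comments
        }
        let mut parts = line.splitn(2, ' ');
        let pubkey = parts.next().unwrap().to_string();
        let label = parts
            .next()
            .map(|l| l.trim().to_string())
            .filter(|l| !l.is_empty());
        // Deduplicate on load: first occurrence wins.
        if !keys.iter().any(|k| k.pubkey == pubkey) {
            keys.push(TrustedKey { pubkey, label });
        }
    }
    keys
}
```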

---

## Phase 5: Polish & Cross-Cutting Concerns

**Purpose**: Edge cases, backward compatibility, and final validation.

<!-- parallel-group: 5 -->
- [x] T020 [P] Verify existing tests still pass (`tests/collab_test.rs`, `tests/sync_test.rs`) -- no trusted keys file means fallback behavior, no regressions.
- [x] T021 [P] Add edge case tests in `tests/trust_test.rs` -- corrupted trusted keys file (some valid, some invalid lines), key rotation scenario (old key removed, new key added, old events still valid if old key re-trusted).

<!-- sequential -->
- [x] T022 Run full test suite (`cargo test`) and fix any compilation or test failures.

---

## Dependencies & Execution Order

### Phase Dependencies

- **Phase 1 (Foundational)**: No dependencies -- start immediately. T001 -> T002 -> T003 strictly sequential (each depends on prior compilation).
- **Phase 2 (US1 Add Keys)**: Depends on Phase 1 completion.
- **Phase 3 (US2 Sync Trust)**: Depends on Phase 2 (needs `trust.rs` file I/O and `VerifyStatus::Untrusted`).
- **Phase 4 (US3 List/Remove)**: Depends on Phase 2 (needs `trust.rs` CRUD functions). Can run in parallel with Phase 3.
- **Phase 5 (Polish)**: Depends on Phases 3 and 4.

### Within Each Phase

- Tests MUST be written and FAIL before implementation begins.
- [P] tasks within the same parallel group can run concurrently.
- Sequential tasks must complete in listed order.

### Parallel Opportunities

- **Phase 2**: T004, T005, T006 (tests, different files/modules). T008, T009 (different functions, no overlap).
- **Phase 3**: T012, T013 (tests, different files). T014 then T015 sequential (T015 depends on T014).
- **Phase 4**: T016, T017 (tests, same file but independent test functions). T018 then T019 sequential.
- **Phase 3 and Phase 4** can run in parallel if staffed (different files: `src/sync.rs` vs `src/lib.rs`).
- **Phase 5**: T020, T021 parallel (different test files).

### File Ownership by Task

| File | Tasks |
|------|-------|
| `src/signing.rs` | T001 |
| `src/error.rs` | T002 |
| `src/trust.rs` | T003, T004, T005, T007, T008, T009, T012, T014 |
| `src/cli.rs` | T010 |
| `src/lib.rs` | T003 (mod declaration), T011, T018, T019 |
| `src/sync.rs` | T015 |
| `tests/trust_test.rs` | T006, T013, T016, T017, T021 |

---

## Notes

- [P] tasks = different files, no dependencies
- [Story] label maps task to specific user story for traceability
- Each user story should be independently completable and testable
- Verify tests fail before implementing
- Commit after each task or logical group
- Stop at any checkpoint to validate story independently
diff --git a/specs/004-dashboard-filtering/.analyze-done b/specs/004-dashboard-filtering/.analyze-done
new file mode 100644
index 0000000..4d0472c
--- /dev/null
+++ b/specs/004-dashboard-filtering/.analyze-done
@@ -0,0 +1 @@
2026-03-21T08:38:49+00:00
diff --git a/specs/004-dashboard-filtering/checklists/requirements.md b/specs/004-dashboard-filtering/checklists/requirements.md
new file mode 100644
index 0000000..662e668
--- /dev/null
+++ b/specs/004-dashboard-filtering/checklists/requirements.md
@@ -0,0 +1,34 @@
# Specification Quality Checklist: Dashboard Filtering

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-21
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All items pass. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
diff --git a/specs/004-dashboard-filtering/data-model.md b/specs/004-dashboard-filtering/data-model.md
new file mode 100644
index 0000000..3fb58ea
--- /dev/null
+++ b/specs/004-dashboard-filtering/data-model.md
@@ -0,0 +1,47 @@
# Data Model: Dashboard Filtering

## New Types

### StatusFilter (enum)

Replaces `show_all: bool` on `App`.

| Variant  | Meaning                            |
|----------|------------------------------------|
| `Open`   | Show only open items (default)     |
| `Closed` | Show only closed items             |
| `All`    | Show all items regardless of status|

**Cycling order**: Open → All → Closed → Open (via `a` key).
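
The cycling order above can be sketched as a `next()` method (the derives are assumptions):

```rust
// StatusFilter with the Open -> All -> Closed -> Open cycle bound to 'a'.
#[derive(Clone, Copy, Debug, PartialEq)]
enum StatusFilter { Open, Closed, All }

impl StatusFilter {
    fn next(self) -> StatusFilter {
        match self {
            StatusFilter::Open => StatusFilter::All,
            StatusFilter::All => StatusFilter::Closed,
            StatusFilter::Closed => StatusFilter::Open,
        }
    }
}
```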

## Modified Types

### App (struct)

| Field           | Old Type       | New Type        | Notes                                    |
|-----------------|----------------|-----------------|------------------------------------------|
| `show_all`      | `bool`         | *removed*       | Replaced by `status_filter`              |
| `status_filter` | —              | `StatusFilter`  | New field, defaults to `StatusFilter::Open` |
| `search_query`  | —              | `String`        | New field, current search text           |
| `search_active` | —              | `bool`          | New field, whether search mode is active |

## Unchanged Types (read-only usage)

- `IssueStatus` — used to match against `StatusFilter`
- `PatchStatus` — used to match against `StatusFilter` (maps Open→Open, Closed/Merged→Closed)
- `IssueState.title` — matched against `search_query`
- `PatchState.title` — matched against `search_query`

## State Transitions

```
Normal Mode:
  '/' pressed → search_active = true, search_query = ""
  'a' pressed → status_filter cycles to next variant

Search Mode:
  Escape pressed → search_active = false, search_query = ""
  Backspace pressed → search_query.pop()
  Printable char pressed → search_query.push(char)
  (all other keys ignored while in search mode)
```
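
The search-mode transitions above can be sketched as a small handler. `KeyCode` here is a minimal local stand-in for crossterm's type, and `SearchState` stands in for the relevant `App` fields; the real handler lives in `run_loop()`:

```rust
// Sketch of the Search Mode transitions: Escape exits and clears,
// Backspace pops, printable chars (including '/') append. All other
// keys are ignored while searching (not modeled by this minimal enum).
enum KeyCode { Esc, Backspace, Char(char) }

struct SearchState { active: bool, query: String }

fn handle_search_key(s: &mut SearchState, key: KeyCode) {
    match key {
        KeyCode::Esc => { s.active = false; s.query.clear(); }
        KeyCode::Backspace => { s.query.pop(); }
        KeyCode::Char(c) => s.query.push(c),
    }
}
```
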
diff --git a/specs/004-dashboard-filtering/plan.md b/specs/004-dashboard-filtering/plan.md
new file mode 100644
index 0000000..12cc398
--- /dev/null
+++ b/specs/004-dashboard-filtering/plan.md
@@ -0,0 +1,94 @@
# Implementation Plan: Dashboard Filtering

**Branch**: `004-dashboard-filtering` | **Date**: 2026-03-21 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/004-dashboard-filtering/spec.md`

## Summary

Add text search and status filter cycling to the TUI dashboard. Users press `/` to enter search mode with real-time case-insensitive title filtering, cycle status filters (open/closed/all) with the existing `a` key, and see active filter state in the footer. All changes are confined to `src/tui.rs` — no new dependencies or modules required.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: ratatui 0.30, crossterm 0.29, git2 0.19
**Storage**: N/A (ephemeral filter state, no persistence)
**Testing**: cargo test (integration tests in `tests/`)
**Target Platform**: Terminal (Linux/macOS)
**Project Type**: CLI tool with TUI dashboard
**Performance Goals**: Immediate keystroke response (sub-frame, <16ms)
**Constraints**: Single-threaded event loop, no async
**Scale/Scope**: ~1000 lines in tui.rs, adding ~100-150 lines

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

Constitution is unpopulated (template only). No gates to enforce. User preference for TDD noted from memory — tests should be written before implementation.

## Project Structure

### Documentation (this feature)

```text
specs/004-dashboard-filtering/
├── plan.md              # This file
├── research.md          # Phase 0 output
├── data-model.md        # Phase 1 output
├── spec.md              # Feature specification
├── checklists/          # Quality checklists
└── tasks.md             # Phase 2 output (via /speckit.tasks)
```

### Source Code (repository root)

```text
src/
├── tui.rs               # PRIMARY: All filtering logic added here
└── state.rs             # READ ONLY: IssueStatus, PatchStatus enums used for filtering

tests/
├── common/              # Shared test helpers
└── [existing tests]     # No new test files needed; TUI filtering tested via unit tests in src/tui.rs
```

**Structure Decision**: Single-file change in `src/tui.rs`. The `App` struct gains new fields (`search_query: String`, `search_active: bool`, `status_filter: StatusFilter`). The existing `show_all: bool` is replaced by a `StatusFilter` enum (Open, Closed, All). Visible item methods (`visible_issues`, `visible_patches`) are extended to apply both text and status filters. The `render_footer` function is updated to show active filter state. A new `InputMode` enum or flag manages whether keystrokes go to search input vs normal navigation.

## Design Decisions

### 1. Search Mode State Machine

Add an `InputMode` concept to `App`:
- **Normal**: All existing keybindings work as before
- **Search**: Keystrokes append to `search_query`; only Escape, Backspace, and printable chars are handled

The `/` key transitions Normal → Search. Escape transitions Search → Normal and clears the query.

### 2. Status Filter Enum Replaces `show_all` Boolean

```rust
enum StatusFilter { Open, Closed, All }
```

The `a` key cycles: Open → All → Closed → Open. This replaces the current `show_all: bool` toggle. The list title shows the current filter (e.g., "Issues (closed)").

### 3. Filter Application in Visible Methods

`visible_issues()` and `visible_patches()` apply both filters:
1. Status filter (match on `IssueStatus`/`PatchStatus`)
2. Text filter (case-insensitive substring match on `title`)

Both filters compose — an item must pass both to be visible.
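
A minimal sketch of the composed predicate (issue side), assuming items expose a title and an open/closed status; the names are illustrative, not the actual `visible_issues()` internals:

```rust
// Both filters must pass for an item to be visible.
enum StatusFilter { Open, Closed, All }

fn is_visible(title: &str, is_open: bool, filter: &StatusFilter, query: &str) -> bool {
    let status_ok = match filter {
        StatusFilter::Open => is_open,
        StatusFilter::Closed => !is_open,
        StatusFilter::All => true,
    };
    // Case-insensitive substring match; an empty query matches everything.
    let text_ok = query.is_empty()
        || title.to_lowercase().contains(&query.to_lowercase());
    status_ok && text_ok
}
```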

### 4. Footer Rendering

When search is active: footer shows `Search: {query}_` with a cursor indicator.
When search is inactive but a status filter other than the default (Open) is active: footer shows the active status filter in the hint area.
Normal mode with no special filter: existing key hints shown.

### 5. Selection Reset on Filter Change

Any filter change (typing, status cycle) checks if the current selection is still within bounds. If not, resets to index 0 or None if the list is empty.

### 6. Key Conflict Check

Verified: `/` is not bound to any existing action in `run_loop`. The `a` key is already used for `show_all` toggle and will be repurposed for the status filter cycle.
diff --git a/specs/004-dashboard-filtering/research.md b/specs/004-dashboard-filtering/research.md
new file mode 100644
index 0000000..484652f
--- /dev/null
+++ b/specs/004-dashboard-filtering/research.md
@@ -0,0 +1,31 @@
# Research: Dashboard Filtering

## R1: `/` Key Availability

**Decision**: `/` is available and unbound in the TUI event loop.
**Rationale**: Verified by reading `run_loop` in `src/tui.rs` — the match statement handles `q`, `c` (ctrl), `1`, `2`, `j`/`k`, `d`, `a`, `g`, `o`, `r`, and arrow/page keys. No `/` handler exists.
**Alternatives considered**: None needed.

## R2: Input Mode Pattern in ratatui

**Decision**: Use a boolean `search_active` flag on `App` to switch between normal and search input handling.
**Rationale**: The TUI already uses a simple match-based event loop. Adding a lightweight mode flag at the top of the key handler is the simplest approach. Full state machine (enum with multiple modes) is overkill since there are only two modes.
**Alternatives considered**: Separate `InputMode` enum — rejected as unnecessary complexity for two states.

## R3: Status Filter Cycling

**Decision**: Replace `show_all: bool` with `StatusFilter` enum cycling Open → All → Closed → Open on `a` key.
**Rationale**: The existing `a` key already toggles between open-only and all. Extending to a three-state cycle is natural. The order Open → All → Closed matches frequency of use (most users want open or all).
**Alternatives considered**: Separate keys for each status — rejected; too many keybindings for a simple filter.

## R4: Case-Insensitive Matching

**Decision**: Use `str::to_lowercase()` for both query and title before calling `contains()`.
**Rationale**: Standard Rust approach, no external dependencies needed. Unicode edge cases (Turkish İ, etc.) are acceptable to ignore for a developer tool.
**Alternatives considered**: Regex — overkill. Fuzzy matching — out of scope per spec.

## R5: Existing `follow_link` and `show_all` Interactions

**Decision**: The `follow_link` method currently sets `self.show_all = true` to reveal linked items across status boundaries. This will be updated to set `self.status_filter = StatusFilter::All` instead.
**Rationale**: Direct mechanical replacement. Same behavior, new type.
**Alternatives considered**: None — straightforward refactor.
diff --git a/specs/004-dashboard-filtering/review.md b/specs/004-dashboard-filtering/review.md
new file mode 100644
index 0000000..48f8e30
--- /dev/null
+++ b/specs/004-dashboard-filtering/review.md
@@ -0,0 +1,20 @@
# Pre-Implementation Review

**Feature**: Dashboard Filtering
**Artifacts reviewed**: spec.md, plan.md, tasks.md, data-model.md, research.md, checklists/requirements.md
**Review model**: Claude Opus 4.6 (same conversation)
**Generating model**: Claude Opus 4.6

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS | All 3 user stories and 10 FRs addressed in plan |
| Plan-Tasks Completeness | PASS | Every plan component has corresponding tasks |
| Dependency Ordering | PASS | Setup → Foundational → US1 → US2 → US3 → Polish |
| Parallelization Correctness | PASS | Correctly identifies zero parallelism (single file) |
| Feasibility & Risk | WARN | T017 is complex; T022 was a no-op (fixed) |
| Standards Compliance | WARN | Constitution unpopulated; TDD approach included per user memory |
| Implementation Readiness | WARN | Escape conflict noted; PatchStatus::Merged mapping clarified |

**Overall**: READY WITH WARNINGS (all actionable items addressed)
diff --git a/specs/004-dashboard-filtering/spec.md b/specs/004-dashboard-filtering/spec.md
new file mode 100644
index 0000000..19a3b70
--- /dev/null
+++ b/specs/004-dashboard-filtering/spec.md
@@ -0,0 +1,96 @@
# Feature Specification: Dashboard Filtering

**Feature Branch**: `004-dashboard-filtering`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "Allow filtering of issues/patches from dashboard. Add the ability to filter the issue and patch lists in the TUI dashboard. Users should be able to filter by status (open/closed) and search by title text. A '/' key activates a filter/search input bar, typing filters the visible list in real-time. Escape clears the filter. The status bar should show the active filter. This builds on the existing TUI in src/tui.rs."

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Text Search Filtering (Priority: P1)

A user browsing the TUI dashboard has many issues or patches and wants to quickly find a specific one by title. They press `/` to activate search mode, type a few characters, and the list narrows in real-time to show only items whose titles contain the typed text. Pressing Escape exits search mode and clears the filter, restoring the full list.

**Why this priority**: This is the core filtering interaction — without text search, the feature has no value.

**Independent Test**: Can be fully tested by launching the dashboard with multiple items, pressing `/`, typing partial title text, and verifying only matching items appear. Delivers immediate value for navigating large lists.

**Acceptance Scenarios**:

1. **Given** the dashboard is showing a list of issues, **When** the user presses `/`, **Then** a search input bar appears at the bottom of the screen and the dashboard enters search mode.
2. **Given** search mode is active, **When** the user types characters, **Then** the visible list updates in real-time to show only items whose titles contain the typed text (case-insensitive).
3. **Given** search mode is active with a filter applied, **When** the user presses Escape, **Then** the search bar disappears, the filter is cleared, and the full list is restored.
4. **Given** search mode is active, **When** the user presses Backspace, **Then** the last character is removed from the search input and the filter updates accordingly.
5. **Given** a filter is active and no items match, **When** the list is empty, **Then** the detail pane shows a "no matches" indicator.

---

### User Story 2 - Status Filter Cycling (Priority: P2)

A user wants to see only open or only closed issues/patches. They cycle through status filters (open, closed, all) to narrow the list by item state.

**Why this priority**: Status filtering complements text search and helps users focus on actionable items versus historical ones.

**Independent Test**: Can be tested by toggling between status filter modes and verifying the list contents change to reflect only items of the selected status.

**Acceptance Scenarios**:

1. **Given** the dashboard is showing issues, **When** the user cycles the status filter, **Then** the list shows only items matching the selected status (open, closed, or all).
2. **Given** a status filter is active, **When** the user switches tabs, **Then** the status filter applies to the new tab as well.
3. **Given** a status filter is set to "closed", **When** there are no closed items, **Then** the list is empty and the detail pane indicates no items match.

---

### User Story 3 - Combined Filtering with Status Bar Feedback (Priority: P3)

A user applies both a text search and a status filter simultaneously. The status bar at the bottom of the dashboard shows what filters are currently active so the user always knows why the list may be narrowed.

**Why this priority**: Combining filters and showing active filter state prevents confusion and completes the filtering experience.

**Independent Test**: Can be tested by activating both a text search and a status filter, verifying the list shows only items matching both criteria, and confirming the status bar displays the active filters.

**Acceptance Scenarios**:

1. **Given** a text filter "bug" and status filter "open" are both active, **When** the list renders, **Then** only open items with "bug" in their title are shown.
2. **Given** any filter is active, **When** the user looks at the status bar, **Then** it displays the current filter state (e.g., search text and status filter).
3. **Given** filters are active and the user presses Escape, **When** the search bar closes, **Then** the text filter is cleared but the status filter remains.

---

### Edge Cases

- What happens when the user presses `/` while already in search mode? The keypress is treated as a literal `/` character appended to the search input.
- What happens when the user types a very long search string? The input is displayed with truncation to fit the status bar width.
- What happens when items are refreshed (`r` key) while a filter is active? The filter remains applied to the refreshed data.
- What happens when the user navigates to a selected item and then activates a filter that hides it? The selection resets to the first visible item, or clears if no items match.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST provide a search mode activated by the `/` key that displays an input bar for typing filter text.
- **FR-002**: System MUST filter the visible list in real-time as the user types, matching against item titles (case-insensitive substring match).
- **FR-003**: System MUST clear the text filter and exit search mode when the user presses Escape.
- **FR-004**: System MUST allow cycling between status filters: open only, closed only, and all.
- **FR-005**: System MUST combine text and status filters — an item must match both criteria to be visible.
- **FR-006**: System MUST display the active filter state in the status bar at the bottom of the dashboard.
- **FR-007**: System MUST reset the list selection to the first matching item when a filter change causes the current selection to become hidden.
- **FR-008**: System MUST support Backspace to delete characters from the search input while in search mode.
- **FR-009**: System MUST preserve active filters when refreshing data with the `r` key.
- **FR-010**: System MUST apply filters consistently across both the Issues and Patches tabs.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Users can locate a specific item in a list of 50+ items in under 5 seconds using text search.
- **SC-002**: Filtering feedback is immediate — the list updates with each keystroke with no perceptible delay.
- **SC-003**: Users can determine what filters are active at any time by glancing at the status bar.
- **SC-004**: All existing keyboard shortcuts continue to function when no filter is active (no regressions).

## Assumptions

- The `/` key is not currently bound to any other action in the TUI and is available for search activation.
- Case-insensitive substring matching is sufficient; regex or fuzzy matching is out of scope.
- The existing `show_all` boolean will be replaced or extended by the new status filter cycling mechanism.
- Filter state is ephemeral — it does not persist across dashboard sessions.
diff --git a/specs/004-dashboard-filtering/tasks.md b/specs/004-dashboard-filtering/tasks.md
new file mode 100644
index 0000000..bbb5a4f
--- /dev/null
+++ b/specs/004-dashboard-filtering/tasks.md
@@ -0,0 +1,177 @@
# Tasks: Dashboard Filtering

**Input**: Design documents from `/specs/004-dashboard-filtering/`
**Prerequisites**: plan.md (required), spec.md (required), research.md, data-model.md

**Tests**: Included per TDD preference (user memory). Tests written first, must fail before implementation.

**Organization**: Tasks grouped by user story. All changes in `src/tui.rs` (single file).

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Add new types and modify App struct to support filtering

- [ ] T001 Add `StatusFilter` enum (Open, Closed, All) with a `next()` cycling method in `src/tui.rs`
- [ ] T002 Add `search_query: String`, `search_active: bool`, and `status_filter: StatusFilter` fields to `App` struct, replacing `show_all: bool` in `src/tui.rs`
- [ ] T003 Update `App::new()` to initialize new fields (`status_filter: StatusFilter::Open`, `search_query: String::new()`, `search_active: false`) in `src/tui.rs`

**Checkpoint**: App struct compiles with new fields. Existing functionality is temporarily broken (dangling `show_all` references) until Phase 2.

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Update all existing code that references `show_all` to use `status_filter`, restoring compilation

**⚠️ CRITICAL**: No user story work can begin until this phase is complete

- [ ] T004 Update `visible_issues()` to filter by `status_filter` instead of `show_all` in `src/tui.rs` — match `StatusFilter::Open` to `IssueStatus::Open`, `StatusFilter::Closed` to `IssueStatus::Closed`, `StatusFilter::All` to no filter
- [ ] T005 Update `visible_patches()` to filter by `status_filter` instead of `show_all` in `src/tui.rs` — match `StatusFilter::Open` to `PatchStatus::Open`, `StatusFilter::Closed` to BOTH `PatchStatus::Closed` AND `PatchStatus::Merged` (merged patches are treated as closed for filtering), `StatusFilter::All` to no filter
- [ ] T006 Update `visible_issue_count()` to use `status_filter` instead of `show_all` in `src/tui.rs`
- [ ] T007 Update `follow_link()` to set `self.status_filter = StatusFilter::All` instead of `self.show_all = true` in `src/tui.rs`
- [ ] T008 Update `render_list()` to derive list title from `status_filter` (e.g., "Issues (open)", "Issues (closed)", "Issues (all)") instead of `show_all` in `src/tui.rs`
- [ ] T009 Update the `'a'` key handler in `run_loop()` to call `status_filter.next()` cycle instead of toggling `show_all` in `src/tui.rs`

**Checkpoint**: Project compiles and passes existing tests. `a` key now cycles Open → All → Closed → Open. All existing behavior preserved.
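
The T004/T005 status matching can be sketched as below; the `PatchStatus` variants are taken from the task descriptions:

```rust
// Status matching for patches: Merged patches are treated as closed
// for filtering purposes (per T005).
enum StatusFilter { Open, Closed, All }
enum PatchStatus { Open, Closed, Merged }

fn patch_matches(status: &PatchStatus, filter: &StatusFilter) -> bool {
    match filter {
        StatusFilter::All => true,
        StatusFilter::Open => matches!(status, PatchStatus::Open),
        StatusFilter::Closed => matches!(status, PatchStatus::Closed | PatchStatus::Merged),
    }
}
```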

---

## Phase 3: User Story 1 - Text Search Filtering (Priority: P1) 🎯 MVP

**Goal**: Users press `/` to enter search mode, type to filter items by title in real-time, Escape clears and exits.

**Independent Test**: Launch dashboard with multiple items, press `/`, type partial title, verify only matching items appear.

### Tests for User Story 1 ⚠️

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**

- [ ] T010 [US1] Write unit test in `src/tui.rs` (mod tests): verify `visible_issues()` filters by `search_query` with case-insensitive substring match when `search_active` is true and `search_query` is non-empty
- [ ] T011 [US1] Write unit test in `src/tui.rs` (mod tests): verify `visible_patches()` filters by `search_query` with case-insensitive substring match when `search_active` is true and `search_query` is non-empty
- [ ] T012 [US1] Write unit test in `src/tui.rs` (mod tests): verify `visible_issues()` returns all status-matching items when `search_query` is empty

### Implementation for User Story 1

- [ ] T013 [US1] Add text filter to `visible_issues()`: when `search_query` is non-empty, additionally filter items where `title.to_lowercase().contains(&search_query.to_lowercase())` in `src/tui.rs`
- [ ] T014 [US1] Add text filter to `visible_patches()`: same case-insensitive substring match logic in `src/tui.rs`
- [ ] T015 [US1] Add text filter to `visible_issue_count()`: apply same text filter logic for consistency in `src/tui.rs`
- [ ] T016 [US1] Add `/` key handler in `run_loop()`: when not in search mode, set `search_active = true` and `search_query = ""` in `src/tui.rs`
- [ ] T017 [US1] Add search mode key handling in `run_loop()`: when `search_active` is true, intercept key events BEFORE the normal match block (critical: `Escape` must clear query and exit search mode, NOT quit the TUI — the normal `KeyCode::Esc => return Ok(())` arm must not fire while searching). Handle `Backspace` to pop last char, printable `Char(c)` to append (including `/`), and reset selection after each change in `src/tui.rs`
- [ ] T018 [US1] Add selection reset helper: after any filter change (typing, backspace, escape), check if current selection index >= `visible_count()` and reset to 0 or None in `src/tui.rs`

**Checkpoint**: Text search fully functional. Press `/`, type, list filters. Escape clears. Tests pass.
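The filter logic T013 describes can be sketched in isolation. This is a minimal, dependency-free model, not the real `visible_issues()`: the hypothetical `Issue` struct carries only a `title`, and the real method also applies the status filter first.

```rust
// Hypothetical stand-in for the items filtered in src/tui.rs.
struct Issue {
    title: String,
}

// Case-insensitive substring filter, per T013; an empty query
// leaves the list unfiltered, per T012.
fn filter_by_query<'a>(items: &'a [Issue], query: &str) -> Vec<&'a Issue> {
    if query.is_empty() {
        return items.iter().collect();
    }
    let needle = query.to_lowercase();
    items
        .iter()
        .filter(|i| i.title.to_lowercase().contains(&needle))
        .collect()
}
```

Lowercasing both sides on every comparison is the simple approach the tasks call for; caching the lowered query once per keystroke would be the obvious optimization if profiling ever demands it.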

---

## Phase 4: User Story 2 - Status Filter Cycling (Priority: P2)

**Goal**: The `a` key cycles through the open/all/closed status filters, applied consistently across tabs.

**Independent Test**: Press `a` multiple times, verify list shows only items matching the selected status.

### Tests for User Story 2 ⚠️

- [ ] T019 [US2] Write unit test in `src/tui.rs` (mod tests): verify `StatusFilter::next()` cycles Open → All → Closed → Open
- [ ] T020 [US2] Write unit test in `src/tui.rs` (mod tests): verify `visible_issues()` returns only closed issues when `status_filter` is `Closed`
- [ ] T021 [US2] Write unit test in `src/tui.rs` (mod tests): verify `visible_patches()` returns closed and merged patches when `status_filter` is `Closed`

### Implementation for User Story 2

- [ ] T022 [US2] Add `a` key handler in `run_loop()`: cycle `status_filter` via `StatusFilter::next()` (Open → All → Closed → Open) in `src/tui.rs`
- [ ] T023 [US2] Ensure status filter persists across tab switches — verify `switch_tab()` does NOT reset `status_filter` in `src/tui.rs`
- [ ] T024 [US2] Add selection reset in `a` key handler: after cycling status filter, reset selection to 0 or None based on new `visible_count()` in `src/tui.rs`

**Checkpoint**: Status filter cycling works. `a` cycles through all three states. Tests pass.
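The cycling behavior T019 and T022 test can be sketched as a three-variant enum. Variant names follow the tasks above; the real enum in `src/tui.rs` may differ in derives or ordering.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum StatusFilter {
    Open,
    All,
    Closed,
}

impl StatusFilter {
    // Cycle Open → All → Closed → Open, as the `a` key handler expects.
    fn next(self) -> Self {
        match self {
            StatusFilter::Open => StatusFilter::All,
            StatusFilter::All => StatusFilter::Closed,
            StatusFilter::Closed => StatusFilter::Open,
        }
    }
}
```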

---

## Phase 5: User Story 3 - Combined Filtering with Status Bar Feedback (Priority: P3)

**Goal**: Text and status filters compose, and the footer shows active filter state at all times.

**Independent Test**: Set both a text filter and status filter, verify only doubly-matching items show. Check footer reflects active state.

### Tests for User Story 3 ⚠️

- [ ] T025 [US3] Write unit test in `src/tui.rs` (mod tests): verify `visible_issues()` applies both `status_filter` and `search_query` simultaneously — only items matching both criteria are returned
- [ ] T026 [US3] Write unit test in `src/tui.rs` (mod tests): verify Escape clears text filter but preserves `status_filter`

### Implementation for User Story 3

- [ ] T027 [US3] Update `render_footer()` in `src/tui.rs`: when `search_active` is true, display `Search: {query}_` with cursor indicator instead of normal key hints
- [ ] T028 [US3] Update `render_footer()` in `src/tui.rs`: when `search_active` is false, show status filter state in the `filter_hint` area (replace "a:open only"/"a:show all" with three-state hint based on `status_filter`)
- [ ] T029 [US3] Update `render_footer()` in `src/tui.rs`: add `/:search` to the normal mode key hints

**Checkpoint**: Combined filtering works. Footer shows active filters. All tests pass.
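The footer branching from T027–T029 reduces to a small pure function. The hint layout below is hypothetical; only the `Search: {query}_` format and the `/:search` hint come from the tasks.

```rust
// Sketch of the render_footer() text selection; the real function
// builds ratatui widgets rather than a plain String.
fn footer_text(search_active: bool, query: &str, status_hint: &str) -> String {
    if search_active {
        // Trailing underscore is the cursor indicator from T027.
        format!("Search: {}_", query)
    } else {
        // Hypothetical hint layout; T029 only requires "/:search" to appear.
        format!("q:quit  /:search  a:{}", status_hint)
    }
}
```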

---

## Phase 6: Polish & Cross-Cutting Concerns

**Purpose**: Edge cases, refresh behavior, and final cleanup

- [ ] T030 Ensure `reload()` in `src/tui.rs` preserves `search_query`, `search_active`, and `status_filter` after refresh (verify fields are not reset)
- [ ] T031 Handle empty filtered list in `render_detail()` in `src/tui.rs` — when no items match current filters, show "No matches for current filter" instead of the generic "No issues/patches to display"
- [ ] T032 Handle long search query display in `render_footer()` in `src/tui.rs` — truncate `search_query` to fit available footer width
- [ ] T033 Run `cargo clippy` and fix any warnings introduced by the new code in `src/tui.rs`
- [ ] T034 Run `cargo test` and verify all existing and new tests pass

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies — can start immediately
- **Foundational (Phase 2)**: Depends on Phase 1 — BLOCKS all user stories
- **User Story 1 (Phase 3)**: Depends on Phase 2
- **User Story 2 (Phase 4)**: Depends on Phase 2 (independent of US1)
- **User Story 3 (Phase 5)**: Depends on Phase 3 AND Phase 4 (combines both filters)
- **Polish (Phase 6)**: Depends on all user stories

### Within Each User Story

- Tests MUST be written and FAIL before implementation
- Implementation tasks are sequential (same file: `src/tui.rs`)

### Parallel Opportunities

- **Phase 1**: T001, T002, T003 are sequential (same file, dependent)
- **Phase 2**: T004-T009 are sequential (same file)
- **Phase 3 & 4**: US1 and US2 could theoretically run in parallel after Phase 2, but since all changes are in the same file (`src/tui.rs`), they MUST be sequential
- **No [P] markers**: All tasks modify `src/tui.rs` — no parallel execution possible

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup (T001-T003)
2. Complete Phase 2: Foundational (T004-T009)
3. Complete Phase 3: User Story 1 (T010-T018)
4. **STOP and VALIDATE**: Test text search independently
5. Deploy/demo if ready

### Incremental Delivery

1. Setup + Foundational → `StatusFilter` enum replaces `show_all` (existing behavior preserved)
2. Add User Story 1 → Text search works → Validate
3. Add User Story 2 → Status cycling verified → Validate
4. Add User Story 3 → Combined filtering + footer → Validate
5. Polish → Edge cases, cleanup → Final validation

---

## Notes

- All 34 tasks modify a single file: `src/tui.rs` — no parallelism possible
- TDD approach: 8 test tasks + 26 implementation tasks
- Each user story builds on the foundational `StatusFilter` refactor
- US3 is the only story with cross-story dependency (requires both text and status filters)
diff --git a/specs/005-tui-issue-creation/checklists/requirements.md b/specs/005-tui-issue-creation/checklists/requirements.md
new file mode 100644
index 0000000..54f673a
--- /dev/null
+++ b/specs/005-tui-issue-creation/checklists/requirements.md
@@ -0,0 +1,34 @@
# Specification Quality Checklist: TUI Issue Creation

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-21
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All items pass. Spec is ready for `/speckit.plan`.
diff --git a/specs/005-tui-issue-creation/plan.md b/specs/005-tui-issue-creation/plan.md
new file mode 100644
index 0000000..9924f64
--- /dev/null
+++ b/specs/005-tui-issue-creation/plan.md
@@ -0,0 +1,46 @@
# Implementation Plan: TUI Issue Creation

**Branch**: `005-tui-issue-creation` | **Date**: 2026-03-21 | **Spec**: [spec.md](spec.md)

## Summary

Add inline issue creation to the TUI dashboard. User presses `n` to enter a two-step form (title → body) rendered in the footer area. Submitting calls `issue::open()` and auto-refreshes the list. All changes in `src/tui.rs`.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: ratatui 0.30, crossterm 0.29, git2 0.19
**Testing**: cargo test
**Project Type**: CLI tool with TUI dashboard

## Design Decisions

### 1. Creation Form State Machine

Add an `InputMode` enum to `App`:
- `Normal` — existing behavior
- `Search` — existing search mode (currently `search_active: bool`)
- `CreateTitle` — typing issue title
- `CreateBody` — typing issue body

This replaces the `search_active: bool` flag with a proper enum, since there are now four modes. The `search_query` field is reused as a general input buffer, and a new `create_title` field stores the title while the user enters the body.
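As a minimal sketch, the enum and the affected `App` fields might look like this. The field names `input` and `create_title` here are the ones proposed above; the real struct has many more fields.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum InputMode {
    Normal,      // existing behavior
    Search,      // subsumes today's search_active: bool
    CreateTitle, // typing issue title
    CreateBody,  // typing issue body
}

// Hypothetical slice of the App struct touched by this refactor.
struct App {
    input_mode: InputMode,
    input: String,        // reused buffer (today's search_query)
    create_title: String, // holds the title while the body is typed
}
```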

### 2. Input Handling

All form modes intercept keys before the normal match block (same pattern as search). Escape always cancels/exits the form. Enter submits the current field. Backspace deletes. Printable chars append.

### 3. Issue Creation

On final submit (Enter in body mode, or Escape to skip body), call `crate::issue::open(repo, &title, &body)`. On success, reload the issue list and show confirmation in status_msg. On error, show error in status_msg.
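The two-step flow in sections 1–3 can be modeled as a standalone state machine, separate from the `App` fields themselves. Everything below is a sketch: `FormStep` is a hypothetical type introduced only to show the transitions, and the empty-title rejection mirrors the spec's FR-008.

```rust
#[derive(Clone, Debug, PartialEq)]
enum FormStep {
    Title(String),
    Body { title: String, body: String },
    Done { title: String, body: String }, // terminal: call issue::open() here
}

// What Enter does at each step; Escape in the Body step would also
// produce Done (with an empty body), per design decision 3.
fn on_enter(step: FormStep) -> FormStep {
    match step {
        // Empty/whitespace-only titles are rejected: stay on the title step.
        FormStep::Title(t) if t.trim().is_empty() => FormStep::Title(t),
        FormStep::Title(t) => FormStep::Body { title: t, body: String::new() },
        FormStep::Body { title, body } => FormStep::Done { title, body },
        done => done,
    }
}
```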

### 4. Tab Switching

If user presses `n` while on Patches tab, switch to Issues tab first (since we're creating an issue).

## Source Code

```text
src/
├── tui.rs    # PRIMARY: Add InputMode enum, form handling, issue creation
├── issue.rs  # READ ONLY: issue::open() called for creation
```
diff --git a/specs/005-tui-issue-creation/spec.md b/specs/005-tui-issue-creation/spec.md
new file mode 100644
index 0000000..10d069a
--- /dev/null
+++ b/specs/005-tui-issue-creation/spec.md
@@ -0,0 +1,100 @@
# Feature Specification: TUI Issue Creation

**Feature Branch**: `005-tui-issue-creation`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "Allow creation of issue from dashboard. Add the ability to create a new issue directly from the TUI dashboard without leaving the interface. The user presses a key to open a minimal issue creation form where they can enter a title and optional body. On submit, the issue is created. On cancel (Escape), the form is dismissed. The new issue should appear in the list after creation."

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Create Issue with Title (Priority: P1)

A user browsing the dashboard notices something that needs tracking. They press `n` to open an inline issue creation form. A text input appears where they type the issue title. They press Enter to submit, and the issue is created and immediately visible in the issue list.

**Why this priority**: Creating an issue with a title is the minimum viable interaction — without this, the feature has no value.

**Independent Test**: Can be fully tested by pressing `n`, typing a title, pressing Enter, and verifying the new issue appears in the list.

**Acceptance Scenarios**:

1. **Given** the dashboard is showing the Issues tab, **When** the user presses `n`, **Then** an input form appears at the bottom of the screen prompting for a title.
2. **Given** the title input is active, **When** the user types text and presses Enter, **Then** a new issue is created with that title and an empty body.
3. **Given** the issue was just created, **When** the dashboard refreshes, **Then** the new issue appears in the issue list.
4. **Given** the title input is active, **When** the user presses Escape, **Then** the form is dismissed without creating an issue.
5. **Given** the title input is active, **When** the user presses Enter with an empty title, **Then** the form is dismissed without creating an issue (empty titles are not allowed).

---

### User Story 2 - Create Issue with Title and Body (Priority: P2)

After entering a title, the user can optionally add a body/description to provide more context. After submitting the title, the form transitions to a body input. The user can type the body and press Enter to submit, or press Escape to skip the body and create the issue with title only.

**Why this priority**: A body provides valuable context but is optional — the feature works without it.

**Independent Test**: Can be tested by pressing `n`, entering a title, pressing Enter, then entering a body and pressing Enter again, and verifying the issue has both title and body.

**Acceptance Scenarios**:

1. **Given** the user has entered a title and pressed Enter, **When** the form transitions to body input, **Then** a prompt for the body text appears.
2. **Given** the body input is active, **When** the user types text and presses Enter, **Then** the issue is created with both the title and body.
3. **Given** the body input is active, **When** the user presses Escape, **Then** the issue is created with the title only and an empty body.

---

### User Story 3 - Status Feedback During Creation (Priority: P3)

The user sees clear feedback during the creation process: the current step (title/body), a confirmation message after successful creation, or an error message if something goes wrong.

**Why this priority**: Feedback completes the user experience but the feature works without it.

**Independent Test**: Can be tested by creating an issue and verifying a success message appears in the status bar.

**Acceptance Scenarios**:

1. **Given** the creation form is active, **When** the user looks at the footer, **Then** it shows which field is being edited (e.g., "New issue - Title:" or "New issue - Body:").
2. **Given** an issue was successfully created, **When** the form closes, **Then** a confirmation message appears in the status bar (e.g., "Issue created: {short-id}").
3. **Given** issue creation fails, **When** the error occurs, **Then** an error message appears in the status bar.

---

### Edge Cases

- What happens when the user presses `n` while on the Patches tab? The dashboard switches to the Issues tab before showing the form, since issues are created on the Issues tab.
- What happens when the user presses `n` while in search mode? The `n` keypress is treated as a search character, not an issue creation trigger (search mode intercepts all keys).
- What happens if the title contains only whitespace? It is treated as empty and the form is dismissed without creating an issue.
- What happens if the user presses Backspace during title/body input? The last character is removed from the current input field.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST provide an issue creation mode activated by the `n` key from the Issues tab.
- **FR-002**: System MUST display a title input field when creation mode is activated.
- **FR-003**: System MUST create a new issue when the user submits a non-empty title by pressing Enter.
- **FR-004**: System MUST dismiss the creation form without side effects when the user presses Escape.
- **FR-005**: System MUST refresh the issue list after successful issue creation so the new issue is immediately visible.
- **FR-006**: System MUST optionally prompt for a body after the title is submitted.
- **FR-007**: System MUST support Backspace to delete characters from the current input field.
- **FR-008**: System MUST reject empty or whitespace-only titles (no issue is created).
- **FR-009**: System MUST display a confirmation message in the status bar after successful creation.
- **FR-010**: System MUST display an error message in the status bar if issue creation fails.

### Key Entities

- **Issue Creation Form**: A temporary input state with two sequential fields (title, body) that captures user input and delegates to the existing issue creation logic.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Users can create a new issue from the dashboard in under 10 seconds without leaving the TUI.
- **SC-002**: The newly created issue appears in the list immediately after creation with no manual refresh needed.
- **SC-003**: Cancelling at any point during creation leaves the dashboard in its original state with no side effects.
- **SC-004**: All existing keyboard shortcuts continue to function when the creation form is not active (no regressions).

## Assumptions

- The `n` key is not currently bound to any action in the TUI and is available for issue creation.
- Issue creation uses the existing programmatic interface (not shell-out to the CLI binary).
- The body input is a single line; multi-line body editing is out of scope.
- The creation form appears in the footer area, replacing the status bar temporarily.
diff --git a/specs/005-tui-issue-creation/tasks.md b/specs/005-tui-issue-creation/tasks.md
new file mode 100644
index 0000000..b35b9dd
--- /dev/null
+++ b/specs/005-tui-issue-creation/tasks.md
@@ -0,0 +1,134 @@
# Tasks: TUI Issue Creation

**Input**: Design documents from `/specs/005-tui-issue-creation/`
**Prerequisites**: plan.md (required), spec.md (required for user stories)

**Tests**: Not explicitly requested — test tasks omitted.

**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Replace `search_active: bool` with `InputMode` enum and add supporting fields

- [ ] T001 Add `InputMode` enum (`Normal`, `Search`, `CreateTitle`, `CreateBody`) and replace `search_active: bool` with `input_mode: InputMode` in App struct in `src/tui.rs`
- [ ] T002 Add `create_title: String` field to App struct for storing title while entering body in `src/tui.rs`
- [ ] T003 Update all existing `search_active` references to use `InputMode::Search` / `InputMode::Normal` checks in `src/tui.rs`

**Checkpoint**: Existing search functionality works identically using the new `InputMode` enum

---

## Phase 2: User Story 1 - Create Issue with Title (Priority: P1) 🎯 MVP

**Goal**: User presses `n` to open inline title input, types title, presses Enter to create issue

**Independent Test**: Press `n`, type a title, press Enter — new issue appears in list

### Implementation for User Story 1

- [ ] T004 [US1] Handle `n` keypress in Normal mode: set `input_mode` to `CreateTitle`, clear input buffer, switch to Issues tab if on Patches tab in `src/tui.rs`
- [ ] T005 [US1] Handle key input in `CreateTitle` mode: printable chars append to input buffer, Backspace deletes last char, Escape cancels and returns to Normal mode in `src/tui.rs`
- [ ] T006 [US1] Handle Enter in `CreateTitle` mode: reject empty/whitespace-only title (dismiss form), otherwise call `issue::open(repo, &title, "")`, reload issue list, return to Normal mode in `src/tui.rs`
- [ ] T007 [US1] Render title input form in footer area when `input_mode` is `CreateTitle` — show "New issue - Title: {input}" replacing status bar in `src/tui.rs`

**Checkpoint**: User Story 1 fully functional — user can create issues with title-only from dashboard
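The key handling in T005/T006 can be sketched with a simplified stand-in for crossterm's `KeyCode`. The `Outcome` enum is hypothetical; it shows what the real handler would do (mutate `App`, call `issue::open()`) rather than doing it.

```rust
enum Key {
    Char(char),
    Backspace,
    Enter,
    Esc,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Keep,           // stay in CreateTitle mode
    Cancel,         // dismiss form, back to Normal mode
    Create(String), // create issue with this title
}

// T005/T006: edit the buffer, cancel on Escape, and on Enter either
// reject an empty/whitespace-only title or create the issue.
fn handle_create_title(buf: &mut String, key: Key) -> Outcome {
    match key {
        Key::Char(c) => { buf.push(c); Outcome::Keep }
        Key::Backspace => { buf.pop(); Outcome::Keep }
        Key::Esc => Outcome::Cancel,
        Key::Enter if buf.trim().is_empty() => Outcome::Cancel,
        Key::Enter => Outcome::Create(buf.trim().to_string()),
    }
}
```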

---

## Phase 3: User Story 2 - Create Issue with Title and Body (Priority: P2)

**Goal**: After entering title, user can optionally enter a body before creating the issue

**Independent Test**: Press `n`, enter title, press Enter, enter body, press Enter — issue has both title and body. Or press Escape at body step to create with title only.

### Implementation for User Story 2

- [ ] T008 [US2] Modify Enter handler in `CreateTitle`: instead of creating issue immediately, store title in `create_title` field and transition to `CreateBody` mode in `src/tui.rs`
- [ ] T009 [US2] Handle key input in `CreateBody` mode: printable chars append to input buffer, Backspace deletes last char in `src/tui.rs`
- [ ] T010 [US2] Handle Enter in `CreateBody` mode: call `issue::open(repo, &create_title, &body)`, reload issue list, return to Normal mode in `src/tui.rs`
- [ ] T011 [US2] Handle Escape in `CreateBody` mode: create issue with title only (`issue::open(repo, &create_title, "")`), return to Normal mode in `src/tui.rs`
- [ ] T012 [US2] Render body input form in footer area when `input_mode` is `CreateBody` — show "New issue - Body: {input}" in `src/tui.rs`

**Checkpoint**: User Stories 1 AND 2 both work — full two-step creation flow functional

---

## Phase 4: User Story 3 - Status Feedback During Creation (Priority: P3)

**Goal**: User sees clear feedback: current step label, success confirmation, error message

**Independent Test**: Create an issue and verify success message appears in status bar

### Implementation for User Story 3

- [ ] T013 [US3] On successful issue creation, set `status_msg` to "Issue created: {short-id}" in `src/tui.rs`
- [ ] T014 [US3] On failed issue creation, set `status_msg` to error description and return to Normal mode in `src/tui.rs`

**Checkpoint**: All user stories independently functional with full feedback loop

---

## Phase 5: Polish & Cross-Cutting Concerns

**Purpose**: Edge case handling and robustness

- [ ] T015 Ensure `n` keypress is ignored when `input_mode` is not `Normal` (search mode should still capture it as a character) in `src/tui.rs`
- [ ] T016 Trim whitespace from title before validation (whitespace-only titles should be rejected) in `src/tui.rs`

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies — T001-T003 must complete first
- **User Story 1 (Phase 2)**: Depends on Phase 1 (InputMode enum exists)
- **User Story 2 (Phase 3)**: Depends on Phase 2 (modifies title submit behavior)
- **User Story 3 (Phase 4)**: Depends on Phase 2 (needs creation flow to add feedback to)
- **Polish (Phase 5)**: Depends on all user stories being complete

### Within Each User Story

- All tasks are in `src/tui.rs` so no parallelization within phases
- Tasks within each phase should be done sequentially

### Parallel Opportunities

- User Story 3 (Phase 4) is logically independent of User Story 2 (Phase 3), since feedback does not depend on the two-step flow — but both modify `src/tui.rs`, so in practice they run sequentially
- Polish tasks T015 and T016 are independent of each other

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup (InputMode enum refactor)
2. Complete Phase 2: User Story 1 (title-only creation)
3. **STOP and VALIDATE**: Test title-only creation works
4. Proceed to Phase 3: User Story 2 (add body step)

### Incremental Delivery

1. Setup → InputMode enum ready
2. Add US1 → Title-only creation works (MVP!)
3. Add US2 → Two-step title+body creation works
4. Add US3 → Status feedback works
5. Polish → Edge cases handled

---

## Notes

- All changes are in a single file (`src/tui.rs`) — no parallel file editing possible
- The `issue::open()` function in `src/issue.rs` is read-only — already exists
- Total tasks: 16
- Tasks per user story: US1=4, US2=5, US3=2, Setup=3, Polish=2
- Suggested MVP scope: Phase 1 + Phase 2 (User Story 1)
diff --git a/specs/008-commit-browser/checklists/requirements.md b/specs/008-commit-browser/checklists/requirements.md
new file mode 100644
index 0000000..bb049d2
--- /dev/null
+++ b/specs/008-commit-browser/checklists/requirements.md
@@ -0,0 +1,34 @@
# Specification Quality Checklist: Commit Browser in TUI Dashboard

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-03-21
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- All items pass. Spec is ready for `/speckit.plan`.
diff --git a/specs/008-commit-browser/plan.md b/specs/008-commit-browser/plan.md
new file mode 100644
index 0000000..868a707
--- /dev/null
+++ b/specs/008-commit-browser/plan.md
@@ -0,0 +1,68 @@
# Implementation Plan: Commit Browser in TUI Dashboard

**Branch**: `008-commit-browser` | **Date**: 2026-03-21 | **Spec**: [spec.md](spec.md)

## Summary

Add an in-dashboard commit/event browser to the TUI. When viewing a patch or issue, the user presses a key to see the full event DAG as a scrollable list (type, author, timestamp, signature status). Selecting an event shows its full details including commit ID, payload, and signature info. All changes primarily in `src/tui.rs`, reading from existing `dag::walk_events()` and `signing::verify_signed_event()`.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: ratatui 0.30, crossterm 0.29, git2 0.19, ed25519-dalek 2, serde/serde_json 1
**Testing**: cargo test
**Project Type**: CLI tool with TUI dashboard

## Design Decisions

### 1. ViewMode Extension

Add a `CommitList` variant to the existing `ViewMode` enum:
- `Details` — existing behavior
- `Diff` — existing diff view
- `CommitList` — event history list
- `CommitDetail` — single event detail view

This reuses the existing view mode pattern. The commit browser key (e.g., `c`) toggles into `CommitList` from the detail pane.

### 2. Event Data Loading

On entering `CommitList` mode, call `dag::walk_events(repo, ref_name)` where `ref_name` is constructed from the selected item's ID (e.g., `refs/collab/issues/{id}` or `refs/collab/patches/{id}`). Store the result as `Vec<(Oid, Event)>` in a new `event_history` field on App, along with an `event_list_state: ListState` for navigation.

For signature verification (US3), also call `signing::verify_ref(repo, ref_name)` to get `Vec<SignatureVerificationResult>` and store alongside.
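The ref-name construction above is the only part of the loading step that can be sketched without git2; the format is assumed from the plan's own examples and should be checked against how these refs are actually written elsewhere in the codebase.

```rust
// Build the event-DAG ref name for a selected item, per section 2's
// examples (refs/collab/issues/{id}, refs/collab/patches/{id}).
fn event_ref_name(kind: &str, id: &str) -> String {
    format!("refs/collab/{}/{}", kind, id)
}
```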

### 3. Event List Rendering

Render the event list in the detail pane area (right side). Each row shows:
```
[sig_icon] EventType | author_name | timestamp
```
Where `sig_icon` is a single character: `✓` (valid), `✗` (invalid), `?` (untrusted), `-` (missing).
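The icon mapping is a direct translation of that legend. The `VerifyStatus` variant names below are assumptions; the real enum lives in `signing.rs` and should be matched exactly.

```rust
// Assumed variant names for signing.rs's VerifyStatus.
enum VerifyStatus {
    Valid,
    Invalid,
    Untrusted,
    Missing,
}

// One-character signature indicator for each event-list row.
fn sig_icon(status: &VerifyStatus) -> char {
    match status {
        VerifyStatus::Valid => '✓',
        VerifyStatus::Invalid => '✗',
        VerifyStatus::Untrusted => '?',
        VerifyStatus::Missing => '-',
    }
}
```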

### 4. Event Detail Rendering

On Enter, show the selected event's full details in the detail pane:
- Commit: `{short_oid}`
- Author: `{name} <{email}>`
- Date: `{timestamp}`
- Type: `{action_type}`
- Signature: `{status}` (and pubkey if present)
- Content: event-specific payload (comment body, review verdict+body, etc.)

### 5. Navigation

- `c` from detail pane → enter CommitList mode
- Up/Down in CommitList → navigate events
- Enter in CommitList → show CommitDetail
- Escape in CommitDetail → back to CommitList
- Escape in CommitList → back to previous ViewMode (Details)

## Source Code

```text
src/
├── tui.rs    # PRIMARY: Add ViewMode variants, event loading, rendering, navigation
├── dag.rs    # READ ONLY: walk_events() for loading event history
├── signing.rs # READ ONLY: verify_dag() for signature status
└── event.rs  # READ ONLY: Event/Action types for display formatting
```
diff --git a/specs/008-commit-browser/review.md b/specs/008-commit-browser/review.md
new file mode 100644
index 0000000..2099990
--- /dev/null
+++ b/specs/008-commit-browser/review.md
@@ -0,0 +1,85 @@
# Pre-Implementation Review: 008-commit-browser

**Reviewer**: Claude Opus 4.6 (cross-model review)
**Date**: 2026-03-21

## Summary Table

| Criterion                | Status | Notes                                                        |
|--------------------------|--------|--------------------------------------------------------------|
| Spec-Plan Alignment      | PASS   | All 3 user stories and edge cases addressed                  |
| Plan-Tasks Completeness  | PASS   | 19 tasks cover all plan items                                |
| Dependency Ordering      | PASS   | Phases correctly sequenced; parallel opportunities accurate  |
| Feasibility              | FAIL   | Plan references `signing::verify_dag()` which does not exist |
| Risk                     | MEDIUM | One naming error; `'c'` key is available; single-file scope  |

## Detailed Findings

### 1. Spec-Plan Alignment: PASS

All three user stories are addressed:
- **US1** (event history list): Plan section 2 (event data loading) + section 3 (event list rendering) + section 5 (navigation)
- **US2** (individual event details): Plan section 4 (event detail rendering) + section 5 (Enter/Escape navigation)
- **US3** (signature status): Plan section 3 (sig_icon in list rows) + section 4 (signature info in detail)

All edge cases from the spec are covered by Phase 5 polish tasks (T018, T019) and T009 (error handling).

### 2. Plan-Tasks Completeness: PASS

Every plan section maps to at least one task:
- ViewMode extension -> T001
- Event data loading -> T005, T015
- Event list rendering -> T008, T016
- Event detail rendering -> T012, T013, T017
- Navigation -> T005-T007, T010-T011, T014
- Error handling -> T009
- Edge cases -> T018, T019

No plan items are missing task coverage.

### 3. Dependency Ordering: PASS

Phase ordering is correct:
- Phase 1 (setup) has no dependencies -- correct
- Phase 2 (US1) depends on Phase 1 for ViewMode variants and fields -- correct
- Phase 3 (US2) depends on Phase 2 for loaded event list -- correct
- Phase 4 (US3) depends on Phase 2 but not Phase 3 -- correct, can parallelize with Phase 3
- Phase 5 (polish) depends on all -- correct

### 4. Feasibility: FAIL - Function Name Mismatch

**Critical issue**: The plan references `signing::verify_dag()` in the summary and in design decision 2. This function does not exist. The actual function is:

```
signing::verify_ref(repo: &Repository, ref_name: &str) -> Result<Vec<SignatureVerificationResult>, Error>
```

Tasks T003, T015, T016, and T017 reference `signing::verify_dag()` as well. The function signature and return type are otherwise exactly as the plan describes -- the only error is the name.

**Impact**: Low severity. The fix is a simple rename in the plan/tasks (`verify_dag` -> `verify_ref`). The actual implementation will need to call `signing::verify_ref()`.

**Other feasibility checks -- all pass**:
- `dag::walk_events(repo, ref_name) -> Result<Vec<(Oid, Event)>, Error>` -- confirmed, exists at `dag.rs:59` with exactly the signature the plan describes
- `ViewMode` enum at `tui.rs:30` has `Details` and `Diff` variants -- matches plan
- `App` struct at `tui.rs:68` has `mode: ViewMode`, `status_msg: Option<String>`, `scroll: u16`, `list_state: ListState` -- matches plan assumptions
- `InputMode` enum has `Normal`, `Search`, `CreateTitle` -- matches T019 (but plan should also list `CreateBody` if it exists)
- `Pane` enum has `ItemList` and `Detail` -- matches plan's navigation model
- `SignatureVerificationResult` struct and `VerifyStatus` enum exist in `signing.rs` with the fields described
- Key `'c'` is not bound in Normal mode (only `Ctrl+c`) -- available as planned

### 5. Risk Assessment

| Risk | Severity | Mitigation |
|------|----------|------------|
| `verify_dag` does not exist (it's `verify_ref`) | Low | Rename in implementation; same signature/return type |
| All 19 tasks in single file (`tui.rs`) limits parallelism | Low | Acceptable for feature scope; phases provide natural ordering |
| No test tasks included | Medium | Tasks doc notes "test tasks omitted" -- consider adding integration tests for event loading at minimum |
| `walk_events` + `verify_ref` both walk the DAG independently | Low | Performance acceptable for <100 events per spec SC-003; could optimize later by combining walks |
| `CreateBody` input mode not mentioned in T019 | Low | The grep shows `CreateTitle` exists in InputMode; verify `CreateBody` also exists and add to T019 guard |

## Recommendations

1. **Fix the function name**: Replace all references to `signing::verify_dag()` with `signing::verify_ref()` in plan.md and tasks.md.
2. **Add basic tests**: Even though tests were not requested, consider at least one test for the `action_type_label` helper (T004) and the `format_event_detail` helper (T012) since they are pure functions.
3. **Check CreateBody**: Verify whether `InputMode::CreateBody` exists and ensure T019 guards against it as well.
4. **Consider combined DAG walk**: `walk_events()` and `verify_ref()` both do full revwalks. A future optimization could combine them, but this is not a blocker.
diff --git a/specs/008-commit-browser/spec.md b/specs/008-commit-browser/spec.md
new file mode 100644
index 0000000..c473b95
--- /dev/null
+++ b/specs/008-commit-browser/spec.md
@@ -0,0 +1,99 @@
# Feature Specification: Commit Browser in TUI Dashboard

**Feature Branch**: `008-commit-browser`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "Add the ability to browse commit details from within the dashboard. When viewing a patch or issue, show the event commit history and allow inspecting individual commits (author, timestamp, event type, signature status). Currently you can checkout with 'o' but there's no in-dashboard commit exploration."

## User Scenarios & Testing *(mandatory)*

### User Story 1 - View Event History for an Item (Priority: P1)

A user viewing a patch or issue in the dashboard wants to see the full event history — all the commits that make up the collaboration DAG for that item. They press a key to open a commit/event list showing each event's type (e.g., IssueOpen, PatchReview, PatchComment), author, and timestamp in chronological order.

**Why this priority**: Without seeing the event history, there is no commit browsing at all — this is the core of the feature.

**Independent Test**: Select an issue or patch with multiple events, press the commit browser key, and verify a scrollable list of events appears with type, author, and timestamp for each.

**Acceptance Scenarios**:

1. **Given** the user is viewing an issue or patch in the detail pane, **When** they press the commit browser key, **Then** a list of all events for that item appears showing event type, author name, and timestamp for each entry.
2. **Given** the event list is displayed, **When** the user scrolls up and down, **Then** the list navigates through events correctly.
3. **Given** the event list is displayed, **When** the user presses Escape, **Then** the view returns to the normal detail view.
4. **Given** an issue or patch has only one event (e.g., just the opening event), **When** the user opens the commit browser, **Then** a single entry is shown.

---

### User Story 2 - Inspect Individual Event Details (Priority: P2)

After opening the event list, the user selects a specific event to see its full details: the commit ID, author name and email, timestamp, event type, and the payload contents (e.g., for a comment event, the comment body; for a review, the verdict and body).

**Why this priority**: Seeing the list is useful on its own, but inspecting details completes the browsing experience.

**Independent Test**: Open the event list, select an event, and verify the detail view shows commit ID, author, timestamp, event type, and payload.

**Acceptance Scenarios**:

1. **Given** the event list is displayed, **When** the user selects an event and presses Enter, **Then** the detail view shows the commit ID (short hash), author name and email, timestamp, event type, and event-specific content.
2. **Given** the user is viewing event details, **When** they press Escape, **Then** they return to the event list.
3. **Given** the event is a comment, **When** viewing details, **Then** the comment body is displayed.
4. **Given** the event is a review, **When** viewing details, **Then** the verdict and body are displayed.

---

### User Story 3 - View Signature Status (Priority: P3)

The user wants to know whether each event commit was signed and whether the signature is valid, invalid, missing, or from an untrusted key. This information is shown alongside each event in the list and in the detail view.

**Why this priority**: Signature verification adds trust information but the feature is fully usable without it.

**Independent Test**: Open the event list for an item with signed events and verify signature status indicators appear next to each event.

**Acceptance Scenarios**:

1. **Given** the event list is displayed, **When** an event commit has a valid signature, **Then** a "valid" indicator is shown next to that event.
2. **Given** the event list is displayed, **When** an event commit has no signature, **Then** a "missing" indicator is shown.
3. **Given** the event list is displayed, **When** an event commit has an invalid or untrusted signature, **Then** the appropriate indicator is shown.
4. **Given** the user is viewing event details for a signed event, **When** they look at signature information, **Then** the public key and verification status are displayed.

---

### Edge Cases

- What happens when the user presses the commit browser key with no item selected? Nothing happens — the key is ignored.
- What happens when the DAG ref cannot be read (e.g., corrupted ref)? An error message appears in the status bar and the commit browser is not opened.
- What happens when the event list is very long (hundreds of events)? The list is scrollable and renders efficiently without freezing.
- What happens when the user presses the commit browser key while in search mode or issue creation mode? The keypress is captured by the active input mode, not treated as a commit browser trigger.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST provide a commit browser mode activated by a designated key from the detail pane of any issue or patch.
- **FR-002**: System MUST display a scrollable list of all events for the selected item, ordered chronologically.
- **FR-003**: Each event in the list MUST show the event type, author name, and timestamp.
- **FR-004**: System MUST allow the user to select an event and view its full details including commit ID, author name and email, timestamp, event type, and event payload content.
- **FR-005**: System MUST allow the user to dismiss the commit browser and return to the normal view using Escape.
- **FR-006**: System MUST display signature verification status for each event when signature data is available.
- **FR-007**: System MUST handle errors gracefully when reading event data, showing an error message in the status bar.
- **FR-008**: System MUST support keyboard navigation (up/down arrows) through the event list.

### Key Entities

- **Event Entry**: A single event in the collaboration DAG, containing commit ID, timestamp, author, event type/action, and optional signature information.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Users can view the full event history of any issue or patch without leaving the TUI dashboard.
- **SC-002**: Users can inspect individual event details (commit ID, author, timestamp, type, payload) with no more than 2 keypresses from the detail view.
- **SC-003**: Navigation through the event list is responsive with no perceptible delay for items with up to 100 events.
- **SC-004**: All existing keyboard shortcuts continue to function when the commit browser is not active (no regressions).

## Assumptions

- The commit browser key is not currently bound to any action in the TUI detail pane and is available.
- Event data is read from the existing DAG walking functions — no new data sources are needed.
- The commit browser replaces or overlays the current detail pane content; it does not open a new window or pane.
- Signature verification uses the existing verification infrastructure.
diff --git a/specs/008-commit-browser/tasks.md b/specs/008-commit-browser/tasks.md
new file mode 100644
index 0000000..aac9559
--- /dev/null
+++ b/specs/008-commit-browser/tasks.md
@@ -0,0 +1,132 @@
# Tasks: Commit Browser in TUI Dashboard

**Input**: Design documents from `/specs/008-commit-browser/`
**Prerequisites**: plan.md (required), spec.md (required for user stories)

**Tests**: Not explicitly requested — test tasks omitted.

**Organization**: Tasks grouped by user story for independent implementation and testing.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Extend ViewMode and add event storage fields to App

- [ ] T001 Add `CommitList` and `CommitDetail` variants to `ViewMode` enum in `src/tui.rs`
- [ ] T002 Add `event_history: Vec<(git2::Oid, Event)>` and `event_list_state: ListState` fields to App struct in `src/tui.rs`
- [ ] T003 Add `signature_results: Vec<SignatureVerificationResult>` field to App struct in `src/tui.rs`
- [ ] T004 Add helper method `fn action_type_label(action: &Action) -> &str` to format event types for display in `src/tui.rs`
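The T004 helper can be sketched as a simple exhaustive match. The real `Action` enum lives in this project's event module; the variant names below are illustrative assumptions taken from the event types mentioned in the spec, not the actual definition.

```rust
// Hypothetical stand-in for the project's Action enum (illustrative only).
enum Action {
    IssueOpen,
    IssueComment,
    PatchReview,
    PatchComment,
}

/// Map an event action to a short, fixed label for the list column (T004).
/// `&'static str` satisfies the planned `-> &str` signature.
fn action_type_label(action: &Action) -> &'static str {
    match action {
        Action::IssueOpen => "IssueOpen",
        Action::IssueComment => "IssueComment",
        Action::PatchReview => "PatchReview",
        Action::PatchComment => "PatchComment",
    }
}
```

Because the match is exhaustive, adding a new `Action` variant later becomes a compile error here, which keeps the label column from silently falling out of date.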

**Checkpoint**: App compiles with new fields and ViewMode variants; existing functionality unaffected

---

## Phase 2: User Story 1 - View Event History (Priority: P1) 🎯 MVP

**Goal**: User presses `c` in detail pane to see scrollable event list with type, author, timestamp

**Independent Test**: Select an issue/patch, press `c`, verify event list appears with chronological events

### Implementation for User Story 1

- [ ] T005 [US1] Handle `c` keypress in detail pane Normal mode: construct ref name from selected item ID, call `dag::walk_events()`, store results in `event_history`, set `mode` to `CommitList` in `src/tui.rs`
- [ ] T006 [US1] Handle Up/Down navigation in `CommitList` mode using `event_list_state` in `src/tui.rs`
- [ ] T007 [US1] Handle Escape in `CommitList` mode: clear event history, return to `Details` mode in `src/tui.rs`
- [ ] T008 [US1] Render event list in detail pane when `mode` is `CommitList` — show each event as `EventType | author_name | timestamp` in a scrollable List widget in `src/tui.rs`
- [ ] T009 [US1] Handle error from `walk_events()`: show error in `status_msg` and stay in current mode in `src/tui.rs`

**Checkpoint**: User Story 1 fully functional — user can browse event history for any item

---

## Phase 3: User Story 2 - Inspect Individual Event Details (Priority: P2)

**Goal**: User selects an event and presses Enter to see full details including commit ID, author email, payload

**Independent Test**: Open event list, press Enter on an event, verify detail view shows commit ID, author, type, and payload content

### Implementation for User Story 2

- [ ] T010 [US2] Handle Enter in `CommitList` mode: set `mode` to `CommitDetail` in `src/tui.rs`
- [ ] T011 [US2] Handle Escape in `CommitDetail` mode: return to `CommitList` mode in `src/tui.rs`
- [ ] T012 [US2] Add helper method `fn format_event_detail(oid: &Oid, event: &Event) -> String` that formats commit ID (7-char short hash), author name+email, timestamp, event type, and action-specific payload in `src/tui.rs`
- [ ] T013 [US2] Render event detail in detail pane when `mode` is `CommitDetail` — show formatted detail as a scrollable Paragraph in `src/tui.rs`
- [ ] T014 [US2] Handle scroll (Up/Down or PageUp/PageDown) in `CommitDetail` mode for long content in `src/tui.rs`
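A minimal sketch of the T012 formatter, using stand-in types: the real version takes `&git2::Oid` and the project's `Event` struct, so the `EventInfo` struct and its fields here are assumptions for illustration only.

```rust
// Hypothetical flattened view of (Oid, Event) for illustration.
struct EventInfo {
    oid_hex: String,      // full commit hash as lowercase hex
    author_name: String,
    author_email: String,
    timestamp: String,
    action_label: String, // e.g. the output of action_type_label()
    payload: String,      // action-specific body (comment text, review verdict, ...)
}

/// Format an event for the CommitDetail view (T012): short hash, author,
/// timestamp, type, then the payload as the body.
fn format_event_detail(e: &EventInfo) -> String {
    // A git Oid renders as 40 hex chars; take the first 7 for the short hash.
    let short = &e.oid_hex[..7.min(e.oid_hex.len())];
    format!(
        "Commit:    {}\nAuthor:    {} <{}>\nDate:      {}\nType:      {}\n\n{}",
        short, e.author_name, e.author_email, e.timestamp, e.action_label, e.payload
    )
}
```

With git2 itself, the short hash could instead come from `oid.to_string()` truncated the same way.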

**Checkpoint**: User Stories 1 AND 2 both work — full browse and inspect flow functional

---

## Phase 4: User Story 3 - Signature Status (Priority: P3)

**Goal**: Each event shows signature verification status in the list and in detail view

**Independent Test**: Open event list for signed items, verify signature indicators appear

### Implementation for User Story 3

- [ ] T015 [US3] On entering `CommitList` mode, also call `signing::verify_ref()` and store results in `signature_results` in `src/tui.rs`
- [ ] T016 [US3] Update event list rendering to prepend signature icon (`✓`/`✗`/`?`/`-`) from matched `SignatureVerificationResult` in `src/tui.rs`
- [ ] T017 [US3] Update event detail rendering to include signature status and public key (if present) from `SignatureVerificationResult` in `src/tui.rs`
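The T016 icon mapping can be centralized in one helper. The real `VerifyStatus` enum lives in `signing.rs`; the variant names below are assumptions derived from the spec's four states (valid, invalid, untrusted, missing), so the actual match arms may differ.

```rust
// Hypothetical stand-in for signing.rs's VerifyStatus (illustrative only).
enum VerifyStatus {
    Valid,
    Invalid,
    Untrusted,
    Missing,
}

/// Icon prepended to each event row (T016): '-' also covers events with no
/// matched SignatureVerificationResult at all.
fn signature_icon(status: Option<&VerifyStatus>) -> char {
    match status {
        Some(VerifyStatus::Valid) => '✓',
        Some(VerifyStatus::Invalid) => '✗',
        Some(VerifyStatus::Untrusted) => '?',
        Some(VerifyStatus::Missing) | None => '-',
    }
}
```

Taking an `Option` keeps T016 robust if `verify_ref()` returns fewer results than `walk_events()` (e.g. after a partial verification failure).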

**Checkpoint**: All user stories independently functional with signature verification

---

## Phase 5: Polish & Cross-Cutting Concerns

**Purpose**: Edge case handling and UX refinement

- [ ] T018 Ensure `c` keypress is ignored when not in detail pane or when no item is selected in `src/tui.rs`
- [ ] T019 Ensure `c` keypress is ignored during all input modes (Search, CreateTitle, CreateBody) — guard against all non-Normal InputMode variants in `src/tui.rs`
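The T018/T019 guards can be collapsed into one predicate checked before the `c` handler runs. The real `InputMode` lives in `src/tui.rs`; the variants below mirror the ones named in T019 and are assumptions for illustration.

```rust
// Hypothetical stand-in for the TUI's InputMode (illustrative only).
enum InputMode {
    Normal,
    Search,
    CreateTitle,
    CreateBody,
}

/// 'c' opens the commit browser only from Normal mode with an item selected
/// (T018 + T019). Matching on Normal alone, rather than listing the non-Normal
/// variants, stays correct if new input modes are added later.
fn commit_browser_allowed(mode: &InputMode, item_selected: bool) -> bool {
    matches!(mode, InputMode::Normal) && item_selected
}
```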

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies — T001-T004 must complete first
- **User Story 1 (Phase 2)**: Depends on Phase 1 (ViewMode variants and fields exist)
- **User Story 2 (Phase 3)**: Depends on Phase 2 (event list loaded and navigable)
- **User Story 3 (Phase 4)**: Depends on Phase 2 (event list exists to add indicators to)
- **Polish (Phase 5)**: Depends on all user stories being complete

### Parallel Opportunities

- T002 and T003 can be done in parallel (independent fields)
- US3 (Phase 4) can be done in parallel with US2 (Phase 3) — signature display is independent of detail view
- Polish tasks T018 and T019 are independent

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup (ViewMode + fields)
2. Complete Phase 2: User Story 1 (event list browsing)
3. **STOP and VALIDATE**: Test event browsing works
4. Proceed to Phase 3-4

### Incremental Delivery

1. Setup → ViewMode extended, fields added
2. Add US1 → Event list browsing works (MVP!)
3. Add US2 → Event detail inspection works
4. Add US3 → Signature status shown
5. Polish → Edge cases handled

---

## Notes

- All changes in single file (`src/tui.rs`) — limited parallel editing
- `dag::walk_events()` and `signing::verify_ref()` are read-only dependencies

- Total tasks: 19
- Tasks per story: Setup=4, US1=5, US2=5, US3=3, Polish=2
- Suggested MVP scope: Phase 1 + Phase 2 (event list browsing)
diff --git a/specs/009-open-file-at-comment/checklists/requirements.md b/specs/009-open-file-at-comment/checklists/requirements.md
new file mode 100644
index 0000000..354d9ed
--- /dev/null
+++ b/specs/009-open-file-at-comment/checklists/requirements.md
@@ -0,0 +1,23 @@
# Requirements Checklist: 009-open-file-at-comment

## Functional Requirements

- [x] **FR-001**: System MUST bind the 'e' key in the TUI diff view to trigger the open-file-at-comment action.
- [x] **FR-002**: System MUST determine the inline comment nearest to (at or above) the current scroll position in the rendered diff.
- [x] **FR-003**: System MUST extract the `file` and `line` fields from the selected `InlineComment`.
- [x] **FR-004**: System MUST resolve the editor command from `$VISUAL` (preferred) or `$EDITOR` (fallback).
- [x] **FR-005**: System MUST suspend the TUI (restore terminal state), execute `<editor> +<line> <file>`, then restore the TUI after the editor exits.
- [x] **FR-006**: System MUST display a status message when no editor is configured or the referenced file does not exist.
- [x] **FR-007**: System MUST ignore 'e' keypresses when not in diff view mode or when no inline comments exist near the scroll position.

## User Stories

- [x] **US-001 (P1)**: Open file at inline comment line in $EDITOR
- [x] **US-002 (P2)**: Handle missing files and editor gracefully with status messages

## Success Criteria

- [x] **SC-001**: Pressing 'e' opens correct file at correct line in under 1 second
- [x] **SC-002**: TUI suspends/resumes cleanly around editor launch
- [x] **SC-003**: All error cases handled without panics, with user-visible status messages
- [x] **SC-004**: No regressions to existing keybindings or TUI behavior
diff --git a/specs/009-open-file-at-comment/plan.md b/specs/009-open-file-at-comment/plan.md
new file mode 100644
index 0000000..1e687c5
--- /dev/null
+++ b/specs/009-open-file-at-comment/plan.md
@@ -0,0 +1,108 @@
# Implementation Plan: Open File at Inline Comment Location

**Branch**: `009-open-file-at-comment` | **Date**: 2026-03-21 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/009-open-file-at-comment/spec.md`

## Summary

Add an 'e' keybinding in the TUI patch diff view that opens the file referenced by the nearest inline comment at the exact line number in the user's `$EDITOR`. The implementation lives almost entirely in `src/tui.rs`, reading inline comment data from existing `InlineComment` structs in `src/state.rs` and using `std::process::Command` to launch the editor.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: ratatui 0.30, crossterm 0.29, git2 0.19
**Storage**: N/A (no new persistence; reads existing InlineComment data from git refs)
**Testing**: cargo test, cargo clippy
**Target Platform**: Linux/macOS terminal
**Project Type**: CLI / TUI application
**Performance Goals**: Editor launch in under 1 second
**Constraints**: Must suspend/resume TUI cleanly; must handle missing $EDITOR gracefully
**Scale/Scope**: Single keybinding addition; ~50-100 lines of new code

## Constitution Check

No violations. This is a small, self-contained feature addition that does not introduce new dependencies, new data models, or architectural changes.

## Project Structure

### Documentation (this feature)

```text
specs/009-open-file-at-comment/
├── spec.md
├── plan.md
└── checklists/
    └── requirements.md
```

### Source Code (repository root)

```text
src/
├── tui.rs          # Primary changes: keybinding handler, editor launch, comment-at-scroll lookup
├── state.rs        # No changes needed (InlineComment struct already has file + line)
├── event.rs        # No changes needed (PatchInlineComment action already defined)
└── ...
tests/
└── ...             # New unit tests for comment-at-scroll logic and editor command building
```

**Structure Decision**: All changes are in the existing single-project structure. No new files or modules needed.

## Implementation Details

### Phase 1: Track inline comment positions in rendered diff

The `colorize_diff` function in `src/tui.rs` already renders inline comments into the diff output. During rendering, build a side-channel data structure that maps rendered line numbers (in the `Text` output) to `(file, line)` pairs from the `InlineComment`. This could be a `Vec<(usize, String, u32)>` storing `(rendered_line_index, file_path, source_line)`.

Return this mapping alongside the `Text` from `colorize_diff` (or store it on the `App` struct).

### Phase 2: Find nearest comment from scroll position

Given the current `app.scroll` value and the comment-position mapping from Phase 1, find the inline comment whose rendered line is nearest the scroll position while still at or above it. This is a simple reverse linear scan of the mapping.

Helper function signature:
```rust
fn find_comment_at_scroll(
    comment_positions: &[(usize, String, u32)],
    scroll: u16,
) -> Option<(String, u32)>  // (file_path, line_number)
```
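The reverse linear scan can be implemented directly against that signature. This is a minimal sketch, assuming `comment_positions` is sorted ascending by rendered line index (which holds naturally if it is appended to in render order).

```rust
/// Find the inline comment nearest to, and at or above, the scroll position.
/// Each entry is (rendered_line_index, file_path, source_line).
fn find_comment_at_scroll(
    comment_positions: &[(usize, String, u32)],
    scroll: u16,
) -> Option<(String, u32)> {
    comment_positions
        .iter()
        .rev() // scan backwards so the first match is the closest one above
        .find(|(rendered_line, _, _)| *rendered_line <= scroll as usize)
        .map(|(_, file, line)| (file.clone(), *line))
}
```

Returning owned values keeps the call site free of borrow conflicts with `&mut App` when it goes on to mutate state (suspend the TUI, set `status_msg`).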

### Phase 3: Editor resolution and launch

Resolve the editor command:
1. Check `$VISUAL`, then `$EDITOR`, then fall back to showing a status message.
2. Split the editor string on whitespace to handle editors like `code --wait`.
3. Build and execute `std::process::Command` with `+<line>` and `<file>` arguments.

Before spawning:
- Check that the file exists on disk; if not, set a status message and return.
- Suspend the TUI: call `crossterm::terminal::disable_raw_mode()`, `crossterm::execute!(stdout, LeaveAlternateScreen)`.

After the editor exits:
- Restore the TUI: call `crossterm::terminal::enable_raw_mode()`, `crossterm::execute!(stdout, EnterAlternateScreen)`.
- Force a full redraw.
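The resolution step (item 1 and 2 above) can be sketched as a pure helper. This is illustrative only: the whitespace split is the documented v1 tradeoff and will break on editor paths that contain spaces.

```rust
/// Resolve the editor command line: $VISUAL first, then $EDITOR, split on
/// whitespace so forms like `code --wait` become ["code", "--wait"].
/// Returns None when neither variable yields a usable command.
fn resolve_editor(visual: Option<&str>, editor: Option<&str>) -> Option<Vec<String>> {
    let raw = visual
        .filter(|s| !s.trim().is_empty())
        .or(editor.filter(|s| !s.trim().is_empty()))?;
    let parts: Vec<String> = raw.split_whitespace().map(String::from).collect();
    if parts.is_empty() { None } else { Some(parts) }
}

// At the call site the inputs would come from the environment, roughly:
//   resolve_editor(
//       std::env::var("VISUAL").ok().as_deref(),
//       std::env::var("EDITOR").ok().as_deref(),
//   )
// and the launch (item 3) would then be along the lines of:
//   std::process::Command::new(&parts[0])
//       .args(&parts[1..])
//       .arg(format!("+{}", line))
//       .arg(&file)
//       .status()
```

Keeping the function pure (env values passed in) makes it trivially unit-testable, which matters given the review's note about the user's TDD preference.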

### Phase 4: Keybinding integration

In the main event loop in `src/tui.rs` (around line 258), add a handler for `KeyCode::Char('e')`:
- Guard: only active when `app.tab == Tab::Patches` and `app.mode == ViewMode::Diff`.
- Call `find_comment_at_scroll` to get file and line.
- If found, launch editor. If not found, no-op (or show "No comment at cursor").

### Phase 5: Status message display

Add a `status_msg: Option<String>` field to `App` (if not already present). Render it in the footer area of the TUI. Clear it on the next keypress or after a timeout.

## Key Design Decisions

1. **Comment-at-scroll vs. cursor-based selection**: Using scroll position rather than a dedicated cursor keeps the implementation simple and avoids introducing a new navigation mode. The tradeoff is slightly less precision, but inline comments are visually distinct (magenta) so the user can easily scroll to one.

2. **$VISUAL before $EDITOR**: Following Unix convention where `$VISUAL` is for full-screen editors and `$EDITOR` is for line-based editors. Since we are opening at a specific line in a file, either works, but `$VISUAL` takes precedence per convention.

3. **No new dependencies**: Uses only `std::process::Command` and existing crossterm APIs for terminal suspend/resume. No new crates needed.

## Complexity Tracking

No constitution violations. This is a small, focused feature.
diff --git a/specs/009-open-file-at-comment/review.md b/specs/009-open-file-at-comment/review.md
new file mode 100644
index 0000000..da78353
--- /dev/null
+++ b/specs/009-open-file-at-comment/review.md
@@ -0,0 +1,57 @@
# Pre-Implementation Review

**Feature**: Open File at Inline Comment Location from Dashboard
**Artifacts reviewed**: spec.md, plan.md, tasks.md, checklists/requirements.md
**Review model**: Claude Opus 4.6 (1M context)
**Generating model**: Unknown (spec/plan), Claude Opus 4.6 (tasks)

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS | All 7 FRs, both user stories, and all edge cases covered |
| Plan-Tasks Completeness | PASS | Every plan phase has corresponding tasks; all architectural components covered |
| Dependency Ordering | PASS | Correct phase ordering: setup -> US1 -> US2 -> polish |
| Parallelization Correctness | PASS | No [P] markers used; correct since all tasks touch src/tui.rs |
| Feasibility & Risk | WARN | Two design concerns with colorize_diff refactor and render_detail mutability |
| Standards Compliance | PASS | Constitution is a template (no rules to violate); no new dependencies |
| Implementation Readiness | WARN | Two tasks need more specificity around mutability and terminal lifecycle |

**Overall**: READY WITH WARNINGS

## Findings

### Critical (FAIL -- must fix before implementing)

None.

### Warnings (WARN -- recommend fixing, can proceed)

1. **T005 mutability conflict with render_detail**: The current `render_detail()` function takes `app: &App` (immutable reference). Storing `comment_positions` from `colorize_diff()` back into `app` during rendering requires either (a) changing `render_detail` to take `&mut App`, (b) moving the comment-position computation out of the render path into the pre-render cache block in `run_loop()` (around lines 241-252 where diff_cache is populated), or (c) using interior mutability (`RefCell`). Option (b) is the cleanest approach and aligns with the existing `diff_cache` pattern. The task description acknowledges this with "(requires `app` to be `&mut App` or positions stored via a different mechanism)" but should prescribe a specific approach to avoid implementation ambiguity.

2. **T008 terminal lifecycle**: The `open_editor_at()` function takes a `Terminal` reference for suspend/resume, but the actual `terminal` variable is owned by `run_loop()`, not directly accessible from a standalone function. The function signature should also include `stdout` handling (the current `run()` function calls `stdout().execute(LeaveAlternateScreen)` directly). The implementation should follow the exact pattern from lines 222-229 of `src/tui.rs` for consistency. Consider passing `terminal` as `&mut Terminal<CrosstermBackend<io::Stdout>>` and using `crossterm::execute!(io::stdout(), LeaveAlternateScreen)` directly rather than going through the terminal backend.

3. **Edge case: `$EDITOR` with arguments**: The spec (edge case 4) calls out handling editors like `code --wait`. T007 says "split on whitespace" which is a basic approach but could break paths with spaces (e.g., `/usr/local/My Editor/bin/editor`). This is an acceptable tradeoff for v1 and matches common Unix tool behavior, but worth noting.

4. **No test tasks**: The spec does not explicitly request tests and the tasks omit them. However, `find_comment_at_scroll()` and `resolve_editor()` are pure functions that would benefit from unit tests. The user's memory notes a TDD preference. Consider adding test tasks for these two helpers at minimum.

### Observations (informational)

1. **Verified struct existence**: `InlineComment` exists at `src/state.rs:50` with fields `author`, `file`, `line`, `body`, `timestamp` -- matches spec's Key Entities section exactly. No changes to `state.rs` or `event.rs` needed.

2. **Verified colorize_diff exists**: `colorize_diff()` at `src/tui.rs:706` already tracks `current_file` and `current_new_line` internally, so building the comment-position mapping (T004) is a natural extension of existing logic.

3. **No existing status_msg or InputMode**: The current `tui.rs` has no status message mechanism or input mode enum. T001 introduces `status_msg` which is the minimal approach. If feature 008 (commit browser) lands first, there may be a `status_msg` field already added -- check before implementing.

4. **Scope is well-contained**: 21 tasks, all in `src/tui.rs`, estimated at 50-100 lines of new code. This is proportional to the feature complexity.

5. **Plan phase numbering vs task phases**: The plan describes 5 phases (track positions, find nearest, editor launch, keybinding, status message) while tasks consolidate into 4 phases (setup, US1, US2, polish). This is a valid simplification -- the plan's phases 1-4 map to tasks Phase 1+2 (US1), and plan phase 5 maps to tasks Phase 3 (US2).

6. **`render_detail` currently calls `colorize_diff` inline**: The diff text is computed inside `render_detail()` at line 477. Since `render_detail` takes `&App`, the comment positions cannot be stored there. The recommended approach is to compute comment positions in the same pre-render block as `diff_cache` (lines 241-252 of `run_loop()`), or compute them alongside the diff and store both in the cache.

## Recommended Actions

- [ ] Clarify T005: prescribe computing comment positions in the `run_loop()` pre-render block (alongside `diff_cache` population at lines 241-252) rather than inside `render_detail()`
- [ ] Clarify T008: specify that `terminal` is passed as `&mut Terminal<CrosstermBackend<io::Stdout>>` and that suspend/resume uses `crossterm::execute!(io::stdout(), ...)` directly
- [ ] Consider adding unit test tasks for `find_comment_at_scroll()` and `resolve_editor()` given user's TDD preference
- [ ] Check whether feature 008 has already introduced a `status_msg` field to `App` before implementing T001
diff --git a/specs/009-open-file-at-comment/spec.md b/specs/009-open-file-at-comment/spec.md
new file mode 100644
index 0000000..cf83b74
--- /dev/null
+++ b/specs/009-open-file-at-comment/spec.md
@@ -0,0 +1,75 @@
# Feature Specification: Open File at Inline Comment Location from Dashboard

**Feature Branch**: `009-open-file-at-comment`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "When viewing a patch diff with inline comments in the TUI dashboard, pressing a key (e.g. 'e') on an inline comment should open the referenced file at the exact line in the user's $EDITOR."

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Open file at inline comment line (Priority: P1)

As a code reviewer using the TUI dashboard, I want to press 'e' while viewing a patch diff to open the file referenced by the nearest inline comment at the exact line number in my $EDITOR, so I can quickly jump from reviewing to editing without manually locating the file and line.

**Why this priority**: This is the core feature. Without it, the user must manually copy file paths and line numbers from the diff view, breaking their review flow.

**Independent Test**: Can be fully tested by creating a patch with an inline comment on a known file and line, pressing 'e' in diff view, and verifying $EDITOR opens the correct file at the correct line.

**Acceptance Scenarios**:

1. **Given** the user is viewing a patch diff that has an inline comment on `src/patch.rs:5`, **When** the user scrolls to that comment and presses 'e', **Then** the TUI suspends, `$EDITOR +5 src/patch.rs` is executed, and after the editor exits, the TUI resumes.
2. **Given** the user is viewing a patch diff with multiple inline comments, **When** the user scrolls to a specific comment and presses 'e', **Then** the editor opens at the file and line of the inline comment closest to (at or above) the current scroll position.
3. **Given** the user has `$EDITOR` set to `vim`, **When** they press 'e' on an inline comment, **Then** `vim +<line> <file>` is executed.
4. **Given** the user has `$VISUAL` set but `$EDITOR` is unset, **When** they press 'e', **Then** `$VISUAL` is used as the editor command.

---

### User Story 2 - Handle missing files and editor gracefully (Priority: P2)

As a user, when I press 'e' on an inline comment that references a file that no longer exists or when no editor is configured, I want to see a clear status message rather than a crash.

**Why this priority**: Error handling is important for robustness but is secondary to the core functionality. Users should never experience a panic from this feature.

**Independent Test**: Can be tested by removing the referenced file or unsetting $EDITOR/$VISUAL, pressing 'e', and verifying a status message appears in the TUI footer.

**Acceptance Scenarios**:

1. **Given** an inline comment references `src/deleted.rs:10` but that file does not exist on disk, **When** the user presses 'e', **Then** the TUI displays a status message like "File not found: src/deleted.rs" and does not crash or spawn an editor.
2. **Given** neither `$EDITOR` nor `$VISUAL` is set, **When** the user presses 'e', **Then** the TUI displays a status message like "No editor configured. Set $EDITOR or $VISUAL." and does not crash.
3. **Given** the editor command fails (exits with non-zero), **When** the editor exits, **Then** the TUI resumes normally and optionally shows a brief status message.

---

### Edge Cases

- What happens when the user presses 'e' while not in diff view mode? The keypress should be ignored (no-op).
- What happens when the user presses 'e' while viewing a diff with no inline comments? The keypress should be ignored (no-op).
- What happens when the user presses 'e' on an orphan comment (comment on a file not in the diff)? The editor should still open the file at the referenced line, since the `InlineComment` struct contains the file path and line.
- What happens when `$EDITOR` contains arguments (e.g., `code --wait`)? The command should be split and executed properly.
- What happens when the scroll position is between two inline comments? The nearest comment at or above the scroll position should be selected.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST bind the 'e' key in the TUI diff view to trigger the open-file-at-comment action.
- **FR-002**: System MUST determine the inline comment nearest to (at or above) the current scroll position in the rendered diff.
- **FR-003**: System MUST extract the `file` and `line` fields from the selected `InlineComment`.
- **FR-004**: System MUST resolve the editor command from `$VISUAL` (preferred) or `$EDITOR` (fallback).
- **FR-005**: System MUST suspend the TUI (restore terminal state), execute `<editor> +<line> <file>`, then restore the TUI after the editor exits.
- **FR-006**: System MUST display a status message when no editor is configured or the referenced file does not exist.
- **FR-007**: System MUST ignore 'e' keypresses when not in diff view mode or when no inline comments exist near the scroll position.

### Key Entities

- **InlineComment**: Existing struct in `src/state.rs` with `file: String`, `line: u32`, `body: String`, `author: Author`, `timestamp: String`. No changes needed.
- **App**: Existing TUI application state struct. May need a `status_message: Option<String>` field (or reuse existing mechanism) for transient user feedback.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Pressing 'e' on an inline comment in diff view opens the correct file at the correct line in under 1 second.
- **SC-002**: The TUI suspends cleanly before editor launch and resumes correctly after editor exit with no visual artifacts.
- **SC-003**: All error cases (missing file, missing editor, editor failure) are handled without panics and display a user-visible status message.
- **SC-004**: Existing keybindings and TUI behavior are unaffected by this feature (no regressions).
diff --git a/specs/009-open-file-at-comment/tasks.md b/specs/009-open-file-at-comment/tasks.md
new file mode 100644
index 0000000..87c50ee
--- /dev/null
+++ b/specs/009-open-file-at-comment/tasks.md
@@ -0,0 +1,126 @@
# Tasks: Open File at Inline Comment Location

**Input**: Design documents from `/specs/009-open-file-at-comment/`
**Prerequisites**: plan.md (required), spec.md (required for user stories)

**Tests**: Not explicitly requested in spec -- test tasks omitted.

**Organization**: Tasks grouped by user story. All changes in `src/tui.rs` (single file).

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2)

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Add new fields and helpers to App struct to support comment tracking and status messages

- [ ] T001 Add `status_msg: Option<String>` field to `App` struct in `src/tui.rs` for transient user feedback messages
- [ ] T002 Add `comment_positions: Vec<(usize, String, u32)>` field to `App` struct in `src/tui.rs`, recording one `(rendered_line_index, file_path, source_line)` tuple per inline comment in the rendered diff
- [ ] T003 Update `App::new()` to initialize `status_msg: None` and `comment_positions: Vec::new()` in `src/tui.rs`
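
The Phase 1 struct changes amount to the following sketch (existing `App` fields elided; this is illustrative, not the real struct):

```rust
/// Illustrative subset of App after T001-T002; the real struct's existing
/// fields are elided here.
struct App {
    /// T001: transient user feedback shown in the footer.
    status_msg: Option<String>,
    /// T002: (rendered_line_index, file_path, source_line) per inline comment.
    comment_positions: Vec<(usize, String, u32)>,
}

impl App {
    /// T003: initialize the new fields.
    fn new() -> Self {
        App {
            status_msg: None,
            comment_positions: Vec::new(),
        }
    }
}
```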

**Checkpoint**: App compiles with new fields. Existing functionality unaffected.

---

## Phase 2: User Story 1 - Open File at Inline Comment Line (Priority: P1) -- MVP

**Goal**: Press 'e' in diff view to open the file referenced by the nearest inline comment at the exact line in $EDITOR/$VISUAL

**Independent Test**: Create a patch with an inline comment on a known file and line, press 'e' in diff view, verify the editor opens the correct file at the correct line.

### Implementation for User Story 1

- [ ] T004 [US1] Refactor `colorize_diff()` in `src/tui.rs` to return `(Text, Vec<(usize, String, u32)>)` instead of just `Text` -- track each inline comment's rendered line index alongside its `file` and `line` fields from `InlineComment`
- [ ] T005 [US1] Update the call site of `colorize_diff()` in `render_detail()` in `src/tui.rs` to store the returned comment positions into `app.comment_positions` (requires `app` to be `&mut App` or positions stored via a different mechanism)
- [ ] T006 [US1] Add helper function `find_comment_at_scroll(comment_positions: &[(usize, String, u32)], scroll: u16) -> Option<(String, u32)>` in `src/tui.rs` -- reverse linear scan to find the inline comment whose rendered line is closest to and at or above the current scroll position
- [ ] T007 [US1] Add helper function `resolve_editor() -> Option<Vec<String>>` in `src/tui.rs` -- check `$VISUAL` first, then `$EDITOR`, split on whitespace to handle editors like `code --wait`, return `None` if neither is set
- [ ] T008 [US1] Add function `open_editor_at(file: &str, line: u32, terminal: &mut Terminal<CrosstermBackend<io::Stdout>>) -> Result<(), String>` in `src/tui.rs` -- check file exists on disk with `std::path::Path::new(file).exists()`, suspend TUI (`disable_raw_mode` + `LeaveAlternateScreen`), build and execute `std::process::Command` with `+<line>` and `<file>` args, wait for exit, restore TUI (`enable_raw_mode` + `EnterAlternateScreen`), force redraw by clearing terminal
- [ ] T009 [US1] Add `KeyCode::Char('e')` handler in `run_loop()` in `src/tui.rs` -- guard: only active when `app.tab == Tab::Patches && app.mode == ViewMode::Diff`; call `find_comment_at_scroll()` to get file and line; if found, call `resolve_editor()` then `open_editor_at()`; clear `status_msg` on success
- [ ] T010 [US1] Update `render_footer()` in `src/tui.rs` to include `e:edit` hint when in diff view mode (`app.tab == Tab::Patches && app.mode == ViewMode::Diff`)
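
The reverse linear scan in T006 can be sketched as follows (assuming `comment_positions` is sorted ascending by rendered line index, as `colorize_diff()` would emit it):

```rust
/// T006 sketch: find the inline comment nearest to, and at or above, the
/// current scroll position. Scanning in reverse over ascending positions
/// yields the last comment whose rendered line is <= scroll.
fn find_comment_at_scroll(
    comment_positions: &[(usize, String, u32)],
    scroll: u16,
) -> Option<(String, u32)> {
    comment_positions
        .iter()
        .rev()
        .find(|(rendered, _, _)| *rendered <= scroll as usize)
        .map(|(_, file, line)| (file.clone(), *line))
}
```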

**Checkpoint**: Core open-file-at-comment functionality works end-to-end. User can press 'e' on a diff with inline comments and the editor opens at the correct location.

---

## Phase 3: User Story 2 - Handle Missing Files and Editor Gracefully (Priority: P2)

**Goal**: Show clear status messages for error cases instead of crashing

**Independent Test**: Unset $EDITOR/$VISUAL and press 'e', or press 'e' on a comment referencing a deleted file -- verify status message appears in footer.

### Implementation for User Story 2

- [ ] T011 [US2] In `KeyCode::Char('e')` handler in `src/tui.rs`: when `resolve_editor()` returns `None`, set `app.status_msg = Some("No editor configured. Set $EDITOR or $VISUAL.".to_string())` and return early
- [ ] T012 [US2] In `open_editor_at()` in `src/tui.rs`: when `Path::new(file).exists()` is false, return `Err(format!("File not found: {}", file))` and in the caller set `app.status_msg` to that error message
- [ ] T013 [US2] In `open_editor_at()` in `src/tui.rs`: when the editor command exits with a non-zero status, return `Err(format!("Editor exited with status: {}", code))` and in the caller set `app.status_msg` accordingly
- [ ] T014 [US2] In `KeyCode::Char('e')` handler in `src/tui.rs`: when `find_comment_at_scroll()` returns `None` (no comment near scroll position), set `app.status_msg = Some("No inline comment at current position".to_string())`
- [ ] T015 [US2] Render `status_msg` in `render_footer()` in `src/tui.rs`: when `app.status_msg.is_some()`, display the message in the footer area with a distinct style (e.g., yellow on dark background) instead of normal key hints
- [ ] T016 [US2] Clear `status_msg` on any keypress in `run_loop()` in `src/tui.rs`: at the top of the key event handler block, set `app.status_msg = None` before processing the key
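
The status messages from T011-T014 can be summarized in one sketch. The `EditOutcome` enum is purely illustrative -- the real handler branches inline in `run_loop()` -- but the message strings are the ones the tasks specify:

```rust
/// Hypothetical outcome type collecting the US2 error cases (T011-T014).
enum EditOutcome {
    NoEditor,
    NoComment,
    FileMissing(String),
    EditorFailed(i32),
    Ok,
}

/// Map an outcome to the status message the footer should show, if any.
fn status_for(outcome: EditOutcome) -> Option<String> {
    match outcome {
        EditOutcome::NoEditor => Some("No editor configured. Set $EDITOR or $VISUAL.".into()),
        EditOutcome::NoComment => Some("No inline comment at current position".into()),
        EditOutcome::FileMissing(f) => Some(format!("File not found: {}", f)),
        EditOutcome::EditorFailed(code) => Some(format!("Editor exited with status: {}", code)),
        EditOutcome::Ok => None,
    }
}
```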

**Checkpoint**: All error cases handled gracefully with user-visible status messages. No panics.

---

## Phase 4: Polish & Cross-Cutting Concerns

**Purpose**: Edge cases, cleanup, and final validation

- [ ] T017 Ensure 'e' keypress is a no-op when `app.tab != Tab::Patches` or `app.mode != ViewMode::Diff` in `src/tui.rs` (verify guard condition in T009 is correct)
- [ ] T018 Ensure 'e' keypress is a no-op when the patch has no inline comments in `src/tui.rs` (comment_positions will be empty, find_comment_at_scroll returns None, handled by T014)
- [ ] T019 Ensure `comment_positions` is cleared when switching away from diff view mode or when switching patches in `src/tui.rs`
- [ ] T020 Run `cargo clippy` and fix any warnings introduced by the new code
- [ ] T021 Run `cargo test` and verify all existing tests still pass (no regressions)

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies -- T001-T003 must complete first
- **User Story 1 (Phase 2)**: Depends on Phase 1 (new App fields exist)
- **User Story 2 (Phase 3)**: Depends on Phase 2 (editor launch and comment lookup exist)
- **Polish (Phase 4)**: Depends on Phase 2 and Phase 3

### Within Each Phase

- Phase 1: T001-T003 are sequential (same file, dependent struct changes)
- Phase 2: T004 before T005 (colorize_diff refactor before call site update); T006-T007 independent helpers; T008 depends on T007; T009 depends on T006+T008; T010 independent
- Phase 3: T011-T016 are all sequential (same file, build on each other)
- Phase 4: T017-T019 independent verifications; T020-T021 final checks

### Parallel Opportunities

- All tasks modify `src/tui.rs` -- no parallel execution possible
- T006 and T007 could theoretically be written in parallel (independent helper functions) but since they are in the same file, sequential is safer

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup (T001-T003)
2. Complete Phase 2: User Story 1 (T004-T010)
3. **STOP and VALIDATE**: Test that 'e' opens the editor at the correct file+line
4. Proceed to Phase 3-4

### Incremental Delivery

1. Setup -- App struct extended with new fields
2. Add US1 -- Editor launch from diff view works (MVP!)
3. Add US2 -- Error handling and status messages
4. Polish -- Edge cases, clippy, test validation

---

## Notes

- All 21 tasks modify a single file: `src/tui.rs` -- no parallelism possible
- The `colorize_diff()` refactor (T004) is the most complex task, touching rendering logic
- The `open_editor_at()` function (T008) handles terminal suspend/resume, which must match the pattern used in the existing `run()` function
- `InlineComment` struct in `src/state.rs` already has `file: String` and `line: u32` fields -- no changes needed there
- No `status_msg` or `InputMode` exists in current `tui.rs` -- T001 introduces the status message field
- Total tasks: 21 (Setup=3, US1=7, US2=6, Polish=5)
diff --git a/specs/010-email-patch-import/checklists/requirements.md b/specs/010-email-patch-import/checklists/requirements.md
new file mode 100644
index 0000000..85894ef
--- /dev/null
+++ b/specs/010-email-patch-import/checklists/requirements.md
@@ -0,0 +1,39 @@
# Requirements Checklist: Email/Format-Patch Import

## Functional Requirements

- [x] **FR-001**: System MUST accept a file path argument pointing to a `.patch` file generated by `git format-patch`.
- [x] **FR-002**: System MUST parse the patch file to extract the commit message subject (title), body (description), and diff content.
- [x] **FR-003**: System MUST apply the patch to create a real git commit on a temporary branch (`collab/imported/<short-oid>`).
- [x] **FR-004**: System MUST create a patch DAG entry using the existing `patch::create()` function with the resulting commit OID.
- [x] **FR-005**: System MUST default `--base` to `main` if not specified.
- [x] **FR-006**: System MUST validate that the patch file is a valid `git format-patch` output before attempting to apply.
- [x] **FR-007**: System MUST NOT modify the working tree or current branch when importing a patch.
- [x] **FR-008**: System MUST report the created patch ID to stdout on success.
- [x] **FR-009**: System MUST exit with a non-zero status and descriptive error for malformed patches, missing files, apply conflicts, and missing base branches.
- [x] **FR-010**: System MUST support the `--series` flag for importing multiple patch files as a single patch DAG entry (P3).
- [x] **FR-011**: System MUST roll back any partially-applied commits if a series import fails partway through (P3).

## User Stories

- [x] **US-P1**: Import a .patch file and create patch DAG entry
- [x] **US-P2**: Review imported patches using existing review flow
- [x] **US-P3**: Multi-patch series import

## Edge Cases

- [x] **EC-001**: Patch file path does not exist
- [x] **EC-002**: Patch file is empty (0 bytes)
- [x] **EC-003**: Patch generated against a very different tree (conflicts)
- [x] **EC-004**: Same patch imported twice (two distinct DAG entries)
- [x] **EC-005**: Patch contains binary diffs
- [x] **EC-006**: Base branch does not exist
- [x] **EC-007**: Repository has uncommitted changes (no effect on import)

## Success Criteria

- [x] **SC-001**: End-to-end import-review-merge workflow works
- [x] **SC-002**: Error cases produce clear messages with no side effects
- [x] **SC-003**: Imported patches indistinguishable from local patches in list/show
- [x] **SC-004**: Multi-patch series import works (P3)
- [x] **SC-005**: All code passes cargo test and cargo clippy
diff --git a/specs/010-email-patch-import/plan.md b/specs/010-email-patch-import/plan.md
new file mode 100644
index 0000000..30b10f6
--- /dev/null
+++ b/specs/010-email-patch-import/plan.md
@@ -0,0 +1,170 @@
# Implementation Plan: Email/Format-Patch Import

**Branch**: `010-email-patch-import` | **Date**: 2026-03-21 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/010-email-patch-import/spec.md`

## Summary

Enable mailing-list-style contributions by adding a `git collab patch import <file>` subcommand. The command parses a `git format-patch` output file, applies it to a temporary branch using git2, captures the resulting commit OID, and creates a patch DAG entry via the existing `patch::create()` infrastructure. Review, comments, and merge flow work unchanged.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: git2 0.19, clap 4 (derive), serde/serde_json 1, chrono 0.4, ed25519-dalek 2, thiserror 2
**Storage**: Git refs under `.git/refs/collab/patches/` (existing), temp branches under `refs/heads/collab/imported/`
**Testing**: `cargo test`, `cargo clippy`
**Target Platform**: Linux/macOS CLI
**Project Type**: CLI tool
**Constraints**: Must not modify working tree or current branch during import

## Existing Code Analysis

### `src/patch.rs` - Patch operations module
- `patch::create(repo, title, body, base_ref, head_commit)` creates a patch DAG entry. Takes a `head_commit` string (OID) and `base_ref` string. Returns the patch ID. This is the function the import command will call after applying the patch.
- `patch::merge()`, `patch::review()`, `patch::comment()`, `patch::show()`, `patch::list()` all operate on the DAG entry. No changes needed for imported patches -- they will work as-is if `create()` produces a valid entry.

### `src/event.rs` - Event types
- `Action::PatchCreate { title, body, base_ref, head_commit }` is the event type used by `patch::create()`. The import command produces exactly these fields.
- No new `Action` variants are needed.

### `src/dag.rs` - DAG operations
- `dag::create_root_event()` creates an orphan commit with a signed event blob. Used by `patch::create()`.
- `dag::append_event()` appends to an existing DAG ref. Used by review/comment/merge.
- Signing uses `ed25519_dalek::SigningKey`. The import command will use the maintainer's signing key (same as any other patch create).

### `src/cli.rs` - CLI definitions
- `PatchCmd` enum defines subcommands under `git collab patch`. The `Import` variant will be added here.
- Pattern follows existing subcommands: struct fields map to clap args.

## Project Structure

### Documentation (this feature)

```text
specs/010-email-patch-import/
├── spec.md
├── plan.md
└── checklists/
    └── requirements.md
```

### Source Code (repository root)

```text
src/
├── cli.rs          # Add Import variant to PatchCmd enum
├── main.rs         # Add match arm for PatchCmd::Import
├── patch.rs        # Add import() and import_series() functions
├── event.rs        # No changes needed
├── dag.rs          # No changes needed
└── error.rs        # Add import-specific error variants if needed

tests/
└── patch_import.rs # Integration tests for import functionality
```

**Structure Decision**: All new logic lives in `src/patch.rs` as new public functions alongside existing patch operations. No new modules needed -- the import function is a composition of file parsing + git2 apply + existing `patch::create()`.

## Implementation Phases

### Phase 1: CLI wiring and patch file parsing (P1 foundation)

**Files modified**: `src/cli.rs`, `src/main.rs`, `src/patch.rs`

1. Add `Import` variant to `PatchCmd` in `src/cli.rs`:
   ```rust
   Import {
       /// Path to .patch file
       file: PathBuf,
       /// Base branch ref (default: main)
       #[arg(long, default_value = "main")]
       base: String,
       /// Import multiple patches as a single series
       #[arg(long)]
       series: bool,
   }
   ```

2. Add `import()` function to `src/patch.rs`:
   - Read the patch file from disk (`std::fs::read_to_string`).
   - Validate it looks like a `git format-patch` mbox file (check for `From ` line prefix and `Subject:` header).
   - Parse subject line to extract title (strip `[PATCH]` prefix and number markers).
   - Parse body: everything between the end of headers and the `---` separator before the diff.
   - Extract the raw diff content (everything from `diff --git` onward).

3. Add match arm in `src/main.rs` for `PatchCmd::Import` that calls `patch::import()`.
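
The parsing in step 2 reduces to two string helpers, sketched below with illustrative names. The split at the first `diff --git` line matters: only the diff portion may later be handed to git2.

```rust
/// Sketch: strip the "[PATCH]" / "[PATCH 2/3]" marker from a subject line.
fn parse_title(subject: &str) -> String {
    let s = subject.trim();
    if let Some(rest) = s.strip_prefix('[') {
        if let Some(end) = rest.find(']') {
            return rest[end + 1..].trim().to_string();
        }
    }
    s.to_string()
}

/// Sketch: split an mbox patch into (message portion, raw diff portion),
/// cutting at the first "diff --git" line.
fn split_diff(content: &str) -> Option<(&str, &str)> {
    let idx = content.find("\ndiff --git ")?;
    Some((&content[..idx], &content[idx + 1..]))
}
```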

### Phase 2: Apply patch and create DAG entry (P1 core)

**Files modified**: `src/patch.rs`

1. Resolve the base branch OID using `repo.revparse_single(&format!("refs/heads/{}", base))`.
2. Get the base commit's tree.
3. Parse the diff using `git2::Diff::from_buffer(diff_bytes)`.
4. Apply the diff to the base tree entirely in memory using `repo.apply_to_tree(&base_tree, &diff, None)`, which returns a new in-memory `Index` without touching the on-disk index or working tree (required by the no-working-tree-modification constraint).
5. Write the resulting index as a tree (`index.write_tree_to(repo)`).
6. Create a commit on a temp branch:
   - Branch name: `collab/imported/<short-oid>` (use first 8 chars of tree OID as a placeholder, or the commit OID after creation).
   - Parent: the base commit.
   - Author signature: extracted from the patch file `From:` / `Date:` headers (preserving the original contributor's identity).
   - Committer signature: the maintainer (from `get_author(repo)`).
   - Message: the original commit message from the patch.
7. Create the temp branch ref pointing to the new commit.
8. Call `patch::create(repo, &title, &body, &base, &commit_oid_str)` to create the DAG entry.
9. Print the patch ID to stdout.

### Phase 3: Error handling and validation

**Files modified**: `src/patch.rs`, `src/error.rs`

1. Add error variants to the project's error type:
   - `PatchFileNotFound(PathBuf)`
   - `InvalidPatchFile(String)` -- for malformed files
   - `PatchApplyConflict(String)` -- when the diff cannot apply cleanly
   - `BaseBranchNotFound(String)`
2. Validate the file exists before reading.
3. Validate the base branch exists before attempting apply.
4. If `apply_to_tree()` fails, return `PatchApplyConflict` with details.
5. Ensure no partial state is left behind on any error path (no orphan branches, no partial DAG).
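
The error variants above could take the following shape. This sketch uses plain std `Display` for self-containment; the actual project would derive the messages with `thiserror` like the rest of `src/error.rs`:

```rust
use std::fmt;
use std::path::PathBuf;

/// Sketch of the Phase 3 import errors; messages match the spec's edge cases.
#[derive(Debug)]
enum ImportError {
    PatchFileNotFound(PathBuf),
    InvalidPatchFile(String),
    PatchApplyConflict(String),
    BaseBranchNotFound(String),
}

impl fmt::Display for ImportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ImportError::PatchFileNotFound(p) => write!(f, "file not found: {}", p.display()),
            ImportError::InvalidPatchFile(why) => write!(f, "not a valid patch file: {}", why),
            ImportError::PatchApplyConflict(base) => {
                write!(f, "patch does not apply cleanly to base branch '{}'", base)
            }
            ImportError::BaseBranchNotFound(name) => write!(f, "base branch '{}' not found", name),
        }
    }
}
```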

### Phase 4: Multi-patch series import (P3)

**Files modified**: `src/patch.rs`, `src/cli.rs`

1. Change `file: PathBuf` to `files: Vec<PathBuf>` (positional, `num_args = 1..`) so the command accepts one or more patch files; single-file import is the `files.len() == 1` case, and `--series` governs how multiple files are combined.
2. Add `import_series()` function:
   - Sort patch files by name (convention: `0001-*.patch`, `0002-*.patch`, ...).
   - Detect cover letter (`0000-cover-letter.patch`) and extract title/body from it.
   - Apply patches sequentially, each one's parent being the previous commit.
   - If any patch fails to apply, delete the temp branch (rollback) and return error.
   - On success, create a single DAG entry with head_commit = final commit OID.
   - Title comes from cover letter subject, or first patch subject if no cover letter.
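
The ordering and cover-letter steps can be sketched as a small std-only helper (name illustrative; the real `import_series()` would work on `PathBuf`s):

```rust
/// Sketch: sort a patch series by filename (0001-*.patch, 0002-*.patch, ...)
/// and pull out the cover letter (0000-cover-letter.patch) if present.
fn order_series(mut files: Vec<String>) -> (Option<String>, Vec<String>) {
    files.sort();
    let cover = files
        .iter()
        .position(|f| f.contains("0000-cover-letter"))
        .map(|i| files.remove(i));
    (cover, files)
}
```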

### Phase 5: Tests

**Files created**: `tests/patch_import.rs`

1. **test_import_single_patch**: Generate a patch file in a temp repo, import it, verify DAG entry exists with correct title and OID.
2. **test_import_malformed_file**: Try importing a non-patch file, verify error.
3. **test_import_conflict**: Create a patch against a diverged base, verify conflict error.
4. **test_import_missing_file**: Try importing a nonexistent path, verify error.
5. **test_import_custom_base**: Import with `--base develop`, verify base_ref in DAG.
6. **test_import_series**: Import a 3-patch series, verify single DAG entry with correct final OID (P3).
7. **test_import_series_rollback**: Import a series where patch 2 fails, verify no branches or DAG entries remain (P3).
8. **test_imported_patch_reviewable**: Import a patch, then run review and comment operations on it.

## Key Technical Decisions

1. **Patch parsing**: Use simple string parsing for mbox format rather than adding a mail-parsing crate. The `git format-patch` output format is well-defined and stable.

2. **Apply mechanism**: Use `git2::apply_to_tree()` to apply the diff to the base tree in-memory, then write the tree and create a commit. This avoids touching the working directory or index.

3. **Author preservation**: The commit created in the temp branch preserves the original contributor's author identity from the patch `From:` header. The committer is set to the maintainer. This matches standard `git am` behavior.

4. **Temp branch naming**: Use `collab/imported/<short-oid>` where short-oid is the first 8 characters of the created commit OID. This provides uniqueness and traceability.

5. **No new Action variant**: The imported patch uses the existing `Action::PatchCreate` event. There is no need to distinguish imported patches from locally-created ones in the DAG -- the review flow is identical.

## Complexity Tracking

No constitution violations. The feature adds a single new subcommand and two new functions to an existing module. No new crates, no new abstractions, no new persistence mechanisms.
diff --git a/specs/010-email-patch-import/review.md b/specs/010-email-patch-import/review.md
new file mode 100644
index 0000000..da282ef
--- /dev/null
+++ b/specs/010-email-patch-import/review.md
@@ -0,0 +1,59 @@
# Pre-Implementation Review

**Feature**: Email/Format-Patch Import
**Artifacts reviewed**: spec.md, plan.md, tasks.md, checklists/requirements.md
**Review model**: Claude Opus 4.6 (1M context)
**Generating model**: Unknown (spec/plan generated in earlier session)

## Summary

| Dimension | Verdict | Issues |
|-----------|---------|--------|
| Spec-Plan Alignment | PASS | All user stories and requirements addressed |
| Plan-Tasks Completeness | PASS | Every plan component has corresponding tasks |
| Dependency Ordering | PASS | Correct phase ordering, no circular dependencies |
| Parallelization Correctness | PASS | [P] markers accurate, no same-file conflicts in parallel groups |
| Feasibility & Risk | WARN | Two API concerns worth noting (see findings) |
| Standards Compliance | PASS | Constitution is a template (not customized), no violations possible |
| Implementation Readiness | WARN | One task needs more specificity (see findings) |

**Overall**: READY WITH WARNINGS

## Findings

### Critical (FAIL -- must fix before implementing)

None.

### Warnings (WARN -- recommend fixing, can proceed)

1. **`Diff::from_buffer` input scope**: `git2::Diff::from_buffer()` returns `Diff<'static>` and `repo.apply_to_tree()` accepts `&Diff<'_>`, so the types compose, but the buffer must contain only the raw diff portion of the patch file (from `diff --git` onward), not the full mbox content. The plan states this correctly ("everything from `diff --git` onward"), but if the parser includes mbox headers in the buffer passed to `from_buffer()`, it will fail or produce wrong results. **Recommendation**: T011 should explicitly note that only the diff portion (starting at `diff --git`) is passed to `Diff::from_buffer()`.

2. **CLI `files` field design for T022**: The plan describes changing `file: PathBuf` to accept multiple files for series mode, but clap's `Vec<PathBuf>` with `num_args = 1..` would make the single-file case also use a `Vec`. T022 and T023 should clarify that `import()` receives `&[PathBuf]` with the first element used for single mode and the full slice for series mode, or use separate positional args. This is a minor design choice but should be decided before implementation to avoid rework.

3. **Author signature extraction from patch file**: T011 describes parsing `From:` / `Date:` headers to preserve the original contributor's author identity. The `git2::Signature` requires name, email, and a `Time` struct. Parsing RFC 2822 dates (as used in `git format-patch` output) into `git2::Time` requires manual conversion -- there is no built-in parser in git2. The `chrono` crate (already a dependency) can parse RFC 2822 via `DateTime::parse_from_rfc2822()` and convert to a Unix timestamp for `git2::Time::new()`. This is feasible but non-trivial and should be explicitly called out in T011.

4. **`apply_to_tree` failure modes**: When `apply_to_tree()` fails due to conflicts, git2 returns a generic `Error` with class `Apply`. The error message may not be descriptive enough for the user. T026 mentions binary diffs but the plan should also consider that `apply_to_tree` does not produce merge conflicts in the traditional sense -- it simply fails. The error handling in T013 should wrap the git2 error with a user-friendly message like "patch does not apply cleanly to base branch '<name>'".

### Observations (informational)

1. **No new crates needed**: Confirmed that `git2::Diff::from_buffer()` and `Repository::apply_to_tree()` both exist in git2 0.19.0. Signatures verified against the vendored source.

2. **Existing `Error` enum already has `Io` variant**: The `std::io::Error` variant in `src/error.rs` already covers file-not-found cases. The proposed `PatchFileNotFound(PathBuf)` variant in T002 is more descriptive but overlaps. Either approach works; the explicit variant gives better error messages.

3. **US2 is pure verification**: Correctly identified as requiring no new code. The 4 test tasks (T015-T018) validate that `patch::create()` produces DAG entries fully compatible with existing `show/review/comment/merge` operations. This is the right approach.

4. **Temp branch naming**: The plan uses `refs/heads/collab/imported/<short-oid>`. This creates branches visible in `git branch` output. Consider whether these should be cleaned up after merge, or documented as expected behavior. The spec does not address cleanup.

5. **Duplicate import (EC-004)**: The spec explicitly states "two separate DAG entries are created." This is correct and matches the existing `patch::create()` behavior -- each call creates a new orphan commit with a new OID.

6. **Working tree safety (FR-007)**: Using `apply_to_tree()` operates entirely in-memory on tree objects, never touching the working directory or index. This is the correct approach and satisfies FR-007 by design.

7. **Task count is proportional**: 29 tasks for a medium-complexity feature (new subcommand, file parsing, git operations, series mode) is appropriate. The breakdown avoids tasks that are too large or too granular.

## Recommended Actions

- [ ] Clarify in T011 that `Diff::from_buffer()` receives only the diff portion (from first `diff --git` line), not the full mbox content
- [ ] Clarify in T011 that RFC 2822 date parsing uses `chrono::DateTime::parse_from_rfc2822()` converted to `git2::Time::new(unix_timestamp, offset_minutes)`
- [ ] Decide in T022 whether `Import` uses `files: Vec<PathBuf>` from the start (with single-file being `files.len() == 1`) or keeps `file: PathBuf` until the series phase
- [ ] Ensure T013 wraps `apply_to_tree()` errors with user-friendly context about which base branch the patch failed to apply against
diff --git a/specs/010-email-patch-import/spec.md b/specs/010-email-patch-import/spec.md
new file mode 100644
index 0000000..3cdb1f2
--- /dev/null
+++ b/specs/010-email-patch-import/spec.md
@@ -0,0 +1,113 @@
# Feature Specification: Email/Format-Patch Import

**Feature Branch**: `010-email-patch-import`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "Support email format-patch based contributions for mailing list style workflow"

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Import a .patch file and create patch DAG entry (Priority: P1)

A contributor without push access generates a `.patch` file using `git format-patch` and sends it to a maintainer (via email, file sharing, etc.). The maintainer runs `git collab patch import <file.patch>` which:
1. Parses and validates the patch file.
2. Applies the patch to a temporary branch derived from the current HEAD (or a specified base).
3. Records the resulting commit OID.
4. Creates a patch DAG entry via the existing `patch::create()` infrastructure so that the imported patch appears in `git collab patch list` and is reviewable.

**Why this priority**: This is the core feature. Without import, nothing else works. It delivers the fundamental value proposition: enabling contributions without push access.

**Independent Test**: Can be fully tested by generating a `.patch` file with `git format-patch`, running `git collab patch import <file>`, and verifying the patch appears in `git collab patch list` with correct title, author metadata, and head commit OID.

**Acceptance Scenarios**:

1. **Given** a valid `.patch` file generated by `git format-patch`, **When** the maintainer runs `git collab patch import my.patch`, **Then** the patch is applied to a temp branch `collab/imported/<short-oid>`, a patch DAG entry is created with the commit message as the title, and the patch ID is printed to stdout.
2. **Given** a valid `.patch` file with a commit message containing a subject and body, **When** imported, **Then** the subject becomes the patch title and the body becomes the patch description.
3. **Given** a `.patch` file that does not apply cleanly to the base branch, **When** the maintainer runs `git collab patch import my.patch`, **Then** the command exits with a clear error message indicating the conflict and no DAG entry is created.
4. **Given** a malformed file that is not a valid `git format-patch` output, **When** the maintainer runs `git collab patch import bad.txt`, **Then** the command exits with error "not a valid patch file" and no side effects occur.
5. **Given** the `--base` flag is provided, **When** the maintainer runs `git collab patch import --base develop my.patch`, **Then** the patch is applied against the `develop` branch instead of the default `main`.

---

### User Story 2 - Review imported patches using existing review flow (Priority: P2)

Once a patch has been imported, a maintainer or reviewer uses the existing `git collab patch review`, `git collab patch comment`, `git collab patch diff`, and `git collab patch merge` commands exactly as they would for any locally-created patch. The imported patch is indistinguishable from a push-access patch in the review flow.

**Why this priority**: Leverages existing infrastructure. Validates that the import creates a fully compatible DAG entry. No new code needed if P1 is done correctly, but must be explicitly verified.

**Independent Test**: Import a patch, then run `git collab patch show <id>`, `git collab patch diff <id>`, `git collab patch review <id>`, and `git collab patch merge <id>` to verify each works.

**Acceptance Scenarios**:

1. **Given** an imported patch, **When** the reviewer runs `git collab patch show <id>`, **Then** the patch details are displayed including title, author, base, head commit, and body.
2. **Given** an imported patch, **When** the reviewer runs `git collab patch diff <id>`, **Then** the diff between base and the imported commit is shown.
3. **Given** an imported patch that has been approved, **When** the maintainer runs `git collab patch merge <id>`, **Then** the imported commit is merged into the base branch as normal.
4. **Given** an imported patch, **When** the reviewer runs `git collab patch comment <id> --body "looks good"`, **Then** the comment is appended to the patch DAG.

---

### User Story 3 - Multi-patch series import (Priority: P3)

A contributor generates a multi-commit patch series with `git format-patch -<n>`. The maintainer imports the entire series at once with `git collab patch import *.patch` or `git collab patch import --series 0001.patch 0002.patch 0003.patch`. The patches are applied sequentially, and a single patch DAG entry is created whose head commit points to the final commit in the series. The title is derived from the cover letter (if present) or the first patch subject.

**Why this priority**: Multi-patch series are common in mailing-list workflows but are an enhancement over the single-patch MVP. Can be deferred without blocking basic usage.

**Independent Test**: Generate a 3-commit series with `git format-patch -3`, import all three, and verify a single patch DAG entry is created with the correct head commit pointing to the last applied commit.

**Acceptance Scenarios**:

1. **Given** multiple `.patch` files from a series, **When** the maintainer runs `git collab patch import --series 0001.patch 0002.patch 0003.patch`, **Then** all patches are applied sequentially and a single patch DAG entry is created with the head commit being the final applied commit.
2. **Given** a series where the second patch fails to apply, **When** the maintainer runs the import, **Then** the command rolls back all applied patches, prints an error identifying which patch failed, and no DAG entry is created.
3. **Given** a series with a cover letter (`0000-cover-letter.patch`), **When** imported, **Then** the cover letter subject is used as the patch title and its body as the patch description.

---

### Edge Cases

- What happens when the patch file path does not exist? The command exits with error "file not found: <path>".
- What happens when the patch file is empty (0 bytes)? The command exits with error "not a valid patch file".
- What happens when the patch was generated against a very different tree? The apply fails with a conflict error, no DAG entry is created.
- What happens when the user imports the same patch twice? Two separate DAG entries are created (idempotency is not enforced; each import is a distinct submission).
- What happens when the patch file contains binary diffs? The import handles binary diffs if git2 supports them; otherwise a clear error is returned.
- What happens when the base branch does not exist? The command exits with error "base branch '<name>' not found".
- What happens when the repository has uncommitted changes? The import operates on a detached temp branch and does not affect the working tree; uncommitted changes are irrelevant.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST accept a file path argument pointing to a `.patch` file generated by `git format-patch`.
- **FR-002**: System MUST parse the patch file to extract the commit message subject (title), body (description), and diff content.
- **FR-003**: System MUST apply the patch to create a real git commit on a temporary branch (`collab/imported/<short-oid>`).
- **FR-004**: System MUST create a patch DAG entry using the existing `patch::create()` function with the resulting commit OID.
- **FR-005**: System MUST default `--base` to `main` if not specified.
- **FR-006**: System MUST validate that the patch file is a valid `git format-patch` output before attempting to apply.
- **FR-007**: System MUST NOT modify the working tree or current branch when importing a patch.
- **FR-008**: System MUST report the created patch ID to stdout on success.
- **FR-009**: System MUST exit with a non-zero status and descriptive error for malformed patches, missing files, apply conflicts, and missing base branches.
- **FR-010**: System MUST support the `--series` flag for importing multiple patch files as a single patch DAG entry (P3).
- **FR-011**: System MUST roll back any partially-applied commits if a series import fails partway through (P3).
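As an illustration only (not part of the requirements), the extraction described in FR-002 could look like the following sketch; the `ParsedPatch` struct and function name are assumptions, and series prefixes like `[PATCH 2/3]` are left out for brevity:

```rust
// Illustrative sketch of FR-002. A `git format-patch` file is an mbox entry:
// headers, commit body, a "---" separator, then the diff starting at the
// first "diff --git" line.
struct ParsedPatch {
    title: String,
    body: String,
    diff: String,
}

fn parse_patch(text: &str) -> Option<ParsedPatch> {
    // FR-006: format-patch output begins with an mbox "From <sha> <date>" line.
    if !text.starts_with("From ") {
        return None;
    }
    let subject = text.lines().find(|l| l.starts_with("Subject:"))?;
    let raw = subject.trim_start_matches("Subject:").trim();
    // Strip a bare "[PATCH]" prefix; numbered series prefixes need more work.
    let title = raw.strip_prefix("[PATCH]").map(str::trim).unwrap_or(raw);
    // The commit body sits between the blank line ending the headers and "---".
    let after_headers = text.split_once("\n\n").map(|(_, rest)| rest)?;
    let body = after_headers.split("\n---\n").next().unwrap_or("").trim();
    // Raw diff content runs from the first "diff --git" to end of file.
    let diff_start = text.find("\ndiff --git ")?;
    Some(ParsedPatch {
        title: title.to_string(),
        body: body.to_string(),
        diff: text[diff_start + 1..].to_string(),
    })
}
```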

### Key Entities

- **PatchFile**: A file on disk containing `git format-patch` output. Key attributes: path, parsed subject, parsed body, raw diff content.
- **ImportedPatch**: The result of applying a patch file. Key attributes: commit OID, temp branch name, base ref, title, body.
- **PatchDAGEntry**: An existing entity (via `patch::create()`). Links the imported commit to the collab review system under `refs/collab/patches/<id>`.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: A single `.patch` file generated by `git format-patch` can be imported and reviewed end-to-end (import, show, diff, review, merge) without errors.
- **SC-002**: Malformed or conflicting patch files produce clear, actionable error messages and no side effects (no orphan branches, no partial DAG entries).
- **SC-003**: The imported patch is indistinguishable from a locally-created patch when viewed via `git collab patch list` and `git collab patch show`.
- **SC-004**: A multi-patch series (3+ patches) can be imported as a single reviewable unit (P3).
- **SC-005**: All new code passes `cargo test` and `cargo clippy` with no warnings.

## Assumptions

- Contributors have access to `git format-patch` (standard git tooling).
- The maintainer has push access and a local clone of the repository.
- Patch files follow the standard `git format-patch` mbox format.
- The `git2` crate's `Diff::from_buffer()` and `apply()` APIs can parse and apply format-patch output.
- The existing `patch::create()` and DAG infrastructure does not need modification to support externally-created commits.
diff --git a/specs/010-email-patch-import/tasks.md b/specs/010-email-patch-import/tasks.md
new file mode 100644
index 0000000..3dae834
--- /dev/null
+++ b/specs/010-email-patch-import/tasks.md
@@ -0,0 +1,161 @@
# Tasks: Email/Format-Patch Import

**Input**: Design documents from `/specs/010-email-patch-import/`
**Prerequisites**: plan.md (required), spec.md (required for user stories)

**Tests**: Included per TDD preference. Tests are written first and must fail before implementation.

**Organization**: Tasks grouped by user story. Primary changes in `src/patch.rs`, `src/cli.rs`, `src/main.rs`, `src/error.rs`.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Add CLI wiring and error variants for the import subcommand

- [ ] T001 [US1] Add `Import` variant to `PatchCmd` enum in `src/cli.rs` with `file: PathBuf`, `--base` (default "main"), and `--series` flag
- [ ] T002 [P] [US1] Add import-specific error variants to `src/error.rs`: `PatchFileNotFound(PathBuf)`, `InvalidPatchFile(String)`, `PatchApplyConflict(String)`, `BaseBranchNotFound(String)`
- [ ] T003 [US1] Add match arm for `PatchCmd::Import` in `src/main.rs` that delegates to `patch::import()` (stub initially)
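T002's variants, with `Display` messages matching the spec's edge-case wording, might look like the following hand-rolled sketch (the real code may use `thiserror` instead):

```rust
use std::fmt;
use std::path::PathBuf;

// Sketch of the import-specific error variants for src/error.rs.
#[derive(Debug)]
enum CollabError {
    PatchFileNotFound(PathBuf),
    InvalidPatchFile(String),
    PatchApplyConflict(String),
    BaseBranchNotFound(String),
}

impl fmt::Display for CollabError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Messages mirror the spec's edge-case strings so the
            // `error contains ...` integration tests pass.
            Self::PatchFileNotFound(p) => write!(f, "file not found: {}", p.display()),
            Self::InvalidPatchFile(why) => write!(f, "not a valid patch file: {why}"),
            Self::PatchApplyConflict(why) => write!(f, "patch does not apply: {why}"),
            Self::BaseBranchNotFound(b) => write!(f, "base branch '{b}' not found"),
        }
    }
}
```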

**Checkpoint**: Project compiles with `git collab patch import <file>` recognized by clap. Stub returns an error.

---

## Phase 2: User Story 1 - Import a .patch File and Create Patch DAG Entry (Priority: P1)

**Goal**: Parse a `git format-patch` output file, apply it to a temp branch, and create a patch DAG entry via existing `patch::create()` infrastructure.

**Independent Test**: Generate a `.patch` file with `git format-patch`, run `git collab patch import <file>`, verify the patch appears in `git collab patch list` with correct title and head commit OID.

### Tests for User Story 1

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**

- [ ] T004 [P] [US1] Write integration test `test_import_single_patch` in `tests/patch_import.rs`: create a temp repo, make a commit, generate a `.patch` file with `git format-patch`, import it, verify DAG entry exists with correct title and commit OID
- [ ] T005 [P] [US1] Write integration test `test_import_malformed_file` in `tests/patch_import.rs`: try importing a plain text file, verify error contains "not a valid patch file"
- [ ] T006 [P] [US1] Write integration test `test_import_missing_file` in `tests/patch_import.rs`: try importing a nonexistent path, verify error contains "file not found"
- [ ] T007 [P] [US1] Write integration test `test_import_conflict` in `tests/patch_import.rs`: create a patch against a diverged base, verify error indicates conflict and no DAG entry is created
- [ ] T008 [P] [US1] Write integration test `test_import_custom_base` in `tests/patch_import.rs`: import with `base = "develop"`, verify `base_ref` in DAG entry matches "develop"
- [ ] T009 [P] [US1] Write integration test `test_import_missing_base_branch` in `tests/patch_import.rs`: import with a nonexistent base branch, verify error contains "base branch"
- [ ] T010 [P] [US1] Write integration test `test_import_empty_file` in `tests/patch_import.rs`: import an empty (0-byte) file, verify error contains "not a valid patch file"

### Implementation for User Story 1

- [ ] T011 [US1] Add `patch::parse_patch_file()` helper in `src/patch.rs`: read file from disk, validate `From ` line prefix and `Subject:` header, parse subject (strip `[PATCH]` prefix), parse body (between headers and `---`), extract raw diff content (from `diff --git` onward). Return a struct with `title`, `body`, `diff_bytes`, `author_name`, `author_email`, `author_date`.
- [ ] T012 [US1] Add `patch::apply_patch_to_tree()` helper in `src/patch.rs`: resolve base branch OID via `repo.revparse_single()`, get base commit tree, call `git2::Diff::from_buffer(diff_bytes)`, call `repo.apply_to_tree(&base_tree, &diff, None)` to get new index, write tree via `index.write_tree_to(repo)`, create commit with original author signature and maintainer as committer, parent = base commit. Return the new commit OID.
- [ ] T013 [US1] Add `patch::import()` public function in `src/patch.rs`: validate file exists (return `PatchFileNotFound` if not), validate base branch exists (return `BaseBranchNotFound` if not), call `parse_patch_file()`, call `apply_patch_to_tree()`, create temp branch ref `refs/heads/collab/imported/<short-oid>`, call existing `patch::create(repo, &title, &body, &base, &commit_oid_str)`, print patch ID to stdout. Return `Ok(patch_id)`.
- [ ] T014 [US1] Update match arm in `src/main.rs` to call `patch::import()` with the parsed arguments (replacing the stub from T003)

**Checkpoint**: Single `.patch` file import works end-to-end. All US1 tests pass. `cargo clippy` clean.

---

## Phase 3: User Story 2 - Review Imported Patches Using Existing Review Flow (Priority: P2)

**Goal**: Verify that imported patches are fully compatible with existing review, comment, diff, and merge commands. No new code expected -- this phase is primarily verification.

**Independent Test**: Import a patch, then run `patch::show()`, `patch::review()`, `patch::comment()`, and `patch::merge()` on it.

### Tests for User Story 2

- [ ] T015 [P] [US2] Write integration test `test_imported_patch_show` in `tests/patch_import.rs`: import a patch, call `patch::show()` on it, verify it displays title, author, base, head commit
- [ ] T016 [P] [US2] Write integration test `test_imported_patch_review` in `tests/patch_import.rs`: import a patch, call `patch::review()` with approve verdict, verify review is appended to DAG
- [ ] T017 [P] [US2] Write integration test `test_imported_patch_comment` in `tests/patch_import.rs`: import a patch, call `patch::comment()`, verify comment is appended to DAG
- [ ] T018 [P] [US2] Write integration test `test_imported_patch_merge` in `tests/patch_import.rs`: import a patch, call `patch::merge()`, verify base branch is updated to the imported commit OID

**Checkpoint**: All review flow operations verified on imported patches. No new code needed if tests pass.

---

## Phase 4: User Story 3 - Multi-Patch Series Import with Rollback (Priority: P3)

**Goal**: Import multiple `.patch` files as a sequential series, creating a single patch DAG entry. Roll back on failure.

**Independent Test**: Generate a 3-commit series with `git format-patch -3`, import all three, verify a single patch DAG entry with head commit pointing to the last applied commit.

### Tests for User Story 3

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**

- [ ] T019 [P] [US3] Write integration test `test_import_series` in `tests/patch_import.rs`: create 3 sequential commits, generate patch files, import with `--series`, verify single DAG entry with head commit = final applied commit
- [ ] T020 [P] [US3] Write integration test `test_import_series_rollback` in `tests/patch_import.rs`: create a series where patch 2 conflicts, verify error, no DAG entry created, no orphan temp branches remain
- [ ] T021 [P] [US3] Write integration test `test_import_series_cover_letter` in `tests/patch_import.rs`: create a series with cover letter, import, verify title comes from cover letter subject and body from cover letter body

### Implementation for User Story 3

- [ ] T022 [US3] Update `Import` variant in `src/cli.rs` to accept multiple files: change `file: PathBuf` to `files: Vec<PathBuf>` (positional args, `num_args = 1..`)
- [ ] T023 [US3] Update match arm in `src/main.rs` to dispatch to `patch::import()` for single file or `patch::import_series()` for `--series` with multiple files
- [ ] T024 [US3] Add `patch::import_series()` function in `src/patch.rs`: sort files by name, detect cover letter (`0000-cover-letter.patch`), extract title/body from cover letter or first patch, apply patches sequentially (each patch's parent = previous commit), on failure delete temp branch and return error with which patch failed, on success create single DAG entry with head_commit = final commit OID
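The ordering and cover-letter detection at the start of T024 might look like this sketch (function name and return shape are assumptions):

```rust
use std::path::PathBuf;

/// Sort a series by filename and peel off a leading cover letter, if any.
/// Returns (cover_letter, ordered_patches).
fn order_series(mut files: Vec<PathBuf>) -> (Option<PathBuf>, Vec<PathBuf>) {
    // format-patch numbers files 0001-, 0002-, ... so a lexicographic sort
    // recovers the application order.
    files.sort();
    // A cover letter, when present, is numbered 0000- and carries the
    // series title/description rather than a commit.
    let has_cover = files.first().map_or(false, |f| {
        f.file_name()
            .and_then(|n| n.to_str())
            .map_or(false, |n| n.starts_with("0000-"))
    });
    if has_cover {
        let cover = files.remove(0);
        (Some(cover), files)
    } else {
        (None, files)
    }
}
```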

**Checkpoint**: Multi-patch series import works. Rollback on failure verified. All US3 tests pass.

---

## Phase 5: Polish and Edge Cases

**Purpose**: Error message quality, edge cases, final cleanup

- [ ] T025 [P] Ensure `patch::import()` does not modify working tree or current branch (verify FR-007) -- add assertion test in `tests/patch_import.rs`
- [ ] T026 [P] Handle binary diffs gracefully in `src/patch.rs`: if `git2::Diff::from_buffer()` or `apply_to_tree()` fails on binary content, return a clear error message mentioning binary diffs
- [ ] T027 Verify duplicate import creates separate DAG entries (duplicate-import edge case in the spec) -- add test in `tests/patch_import.rs`
- [ ] T028 Run `cargo clippy` and fix any warnings introduced by the new code
- [ ] T029 Run `cargo test` and verify all existing and new tests pass

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies -- can start immediately
- **User Story 1 (Phase 2)**: Depends on Phase 1 -- BLOCKS US2 and US3
- **User Story 2 (Phase 3)**: Depends on Phase 2 (needs working import)
- **User Story 3 (Phase 4)**: Depends on Phase 2 (extends import to series)
- **Polish (Phase 5)**: Depends on all user stories

### Parallel Opportunities

- **Phase 1**: T001 and T002 can run in parallel (different files: `src/cli.rs` vs `src/error.rs`). T003 depends on T001.
- **Phase 2 tests**: T004-T010 can all run in parallel (same test file but independent tests)
- **Phase 2 implementation**: T011-T013 are sequential (each builds on previous). T014 depends on T013.
- **Phase 3 tests**: T015-T018 can all run in parallel
- **Phase 4 tests**: T019-T021 can all run in parallel
- **Phase 4 implementation**: T022-T024 are sequential
- **Phase 5**: T025-T027 can run in parallel

### Within Each User Story

- Tests MUST be written and FAIL before implementation
- Implementation tasks are sequential within a story

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup (T001-T003)
2. Complete Phase 2: User Story 1 tests + implementation (T004-T014)
3. **STOP and VALIDATE**: Test single-file import end-to-end
4. Deploy/demo if ready

### Incremental Delivery

1. Setup + US1 -> Single patch import works -> Validate
2. Add US2 -> Review flow verified -> Validate
3. Add US3 -> Series import with rollback -> Validate
4. Polish -> Edge cases, cleanup -> Final validation

---

## Notes

- 29 total tasks: 3 setup + 7 US1 tests + 4 US1 impl + 4 US2 tests + 3 US3 tests + 3 US3 impl + 5 polish
- Primary files: `src/patch.rs` (core logic), `src/cli.rs` (CLI wiring), `src/main.rs` (dispatch), `src/error.rs` (error variants), `tests/patch_import.rs` (all tests)
- No new crates required -- `git2::Diff::from_buffer()` and `repo.apply_to_tree()` are available in git2 0.19
- No new `Action` variant needed -- uses existing `Action::PatchCreate`
- TDD approach per user preference: tests first, then implement
diff --git a/specs/011-dashboard-testing/checklists/requirements.md b/specs/011-dashboard-testing/checklists/requirements.md
new file mode 100644
index 0000000..e99c0d8
--- /dev/null
+++ b/specs/011-dashboard-testing/checklists/requirements.md
@@ -0,0 +1,29 @@
# Requirements Checklist: 011-dashboard-testing

## Functional Requirements

- [x] **FR-001**: App struct and its enums (Tab, Pane, ViewMode) MUST be accessible from test code, either via pub(crate) visibility or an in-module test submodule
- [x] **FR-002**: Unit tests MUST cover all state-mutating methods: move_selection, switch_tab, toggle show_all, toggle ViewMode, pane switching
- [x] **FR-003**: Unit tests MUST verify boundary conditions: selection at 0 moving up, selection at last moving down, empty lists
- [x] **FR-004**: Render tests MUST use ratatui::backend::TestBackend to capture rendered output without a real terminal
- [x] **FR-005**: Render tests MUST verify presence of key UI elements (tab bar, list items, detail content, footer) in the buffer
- [x] **FR-006**: Integration tests MUST support feeding sequences of key events through the key-handling logic
- [x] **FR-007**: All tests MUST be isolated -- no shared mutable state between tests
- [x] **FR-008**: Tests requiring git repos MUST use tempfile + git2::Repository::init for ephemeral repos
- [x] **FR-009**: Tests MUST NOT require a real terminal or user interaction
- [x] **FR-010**: All tests MUST pass with `cargo test` and produce no warnings with `cargo clippy`

## User Stories

- [x] **US-001 (P1)**: Unit test App state transitions (key handling, mode changes, filtering) without rendering
- [x] **US-002 (P2)**: Snapshot/render tests to verify TUI output for known states
- [x] **US-003 (P3)**: Integration tests that simulate key sequences and verify final state

## Success Criteria

- [x] **SC-001**: At least 10 unit tests covering all App state transition methods pass in `cargo test`
- [x] **SC-002**: At least 3 render/snapshot tests verify TUI layout for different App states
- [x] **SC-003**: At least 2 integration tests verify multi-step key sequences produce correct final state
- [x] **SC-004**: All tests complete in under 5 seconds total
- [x] **SC-005**: Zero clippy warnings in test code
- [x] **SC-006**: Tests run successfully in CI without a real terminal (headless)
diff --git a/specs/011-dashboard-testing/plan.md b/specs/011-dashboard-testing/plan.md
new file mode 100644
index 0000000..06aa847
--- /dev/null
+++ b/specs/011-dashboard-testing/plan.md
@@ -0,0 +1,145 @@
# Implementation Plan: Dashboard Testing Infrastructure

**Branch**: `011-dashboard-testing` | **Date**: 2026-03-21 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/011-dashboard-testing/spec.md`

## Summary

Add automated testing infrastructure for the TUI dashboard. The App struct in `src/tui.rs` currently has private visibility and tightly couples state management with rendering, making it untestable from outside the module. This plan refactors App to be testable and adds three layers of tests: unit tests for state transitions, render tests using ratatui's TestBackend, and integration tests simulating key sequences.

## Technical Context

**Language/Version**: Rust 2021 edition
**Primary Dependencies**: ratatui 0.30 (includes `backend::TestBackend` for headless render testing), crossterm 0.29, git2 0.19
**Storage**: N/A (tests use ephemeral tempfile repos)
**Testing**: `cargo test`, `cargo clippy`
**Target Platform**: Linux (CI-compatible, headless)
**Project Type**: CLI tool with TUI
**Performance Goals**: All tests complete in under 5 seconds
**Constraints**: No real terminal required; tests must be deterministic and isolated
**Scale/Scope**: ~25 test functions across 3 test layers (18 unit, 5 render, 3 integration per the phase tables below)

## Constitution Check

No violations. This feature adds tests only -- no new runtime dependencies, no architectural changes beyond visibility adjustments.

## Project Structure

### Documentation (this feature)

```text
specs/010-dashboard-testing/
├── spec.md
├── plan.md
└── checklists/
    └── requirements.md
```

### Source Code (repository root)

```text
src/
├── tui.rs           # Refactored: make App, Tab, Pane, ViewMode pub(crate);
│                    #   extract handle_key() from run_loop() inline match
└── ...              # No other source changes

src/tui/
└── tests.rs         # In-module #[cfg(test)] tests (unit + render + integration)
                     # OR: tests added as #[cfg(test)] mod tests at bottom of tui.rs
```

**Structure Decision**: Tests will live inside `src/tui.rs` as a `#[cfg(test)] mod tests` block. This is the simplest approach because it gives tests direct access to private types (App, Tab, Pane, ViewMode) without requiring visibility changes. If the test module grows too large, it can later be extracted to `src/tui/tests.rs` by converting `tui.rs` to `tui/mod.rs`.

## Implementation Phases

### Phase 1: Refactor for testability (P1 prerequisite)

**Goal**: Make the App struct and key-handling logic testable without a terminal.

**Changes to `src/tui.rs`**:

1. Extract key-handling logic from the inline `match key.code` block in `run_loop()` into a standalone method `App::handle_key(&mut self, key: KeyCode, modifiers: KeyModifiers) -> bool` that returns `false` when the app should quit. This separates input processing from the event polling loop and terminal I/O.

2. No visibility changes needed since tests will be in-module (`#[cfg(test)] mod tests` inside `tui.rs`).

**Rationale**: The current `run_loop()` reads terminal events, handles keys, and renders -- all in one loop. Extracting `handle_key()` lets tests call it directly with synthetic key events, without needing crossterm's event system or a real terminal.

### Phase 2: Test helpers and fixtures

**Goal**: Create reusable test helper functions for constructing App instances with known data.

**Helper functions** (inside `#[cfg(test)] mod tests`):

- `make_test_issues(n: usize) -> Vec<IssueState>` -- creates `n` issues with predictable IDs, titles, and statuses (alternating open/closed)
- `make_test_patches(n: usize) -> Vec<PatchState>` -- creates `n` patches with predictable data
- `make_app(issues: usize, patches: usize) -> App` -- convenience wrapper
- `buffer_to_string(buf: &Buffer) -> String` -- extracts text content from a ratatui Buffer for assertions
- `assert_buffer_contains(buf: &Buffer, expected: &str)` -- asserts a string appears somewhere in the rendered buffer

**Data construction**: Test fixtures use `IssueState` and `PatchState` structs directly (from `crate::state`), populated with hardcoded values. No git repos needed for unit or render tests -- only integration tests that exercise `App::reload()` would need a repo, and those can use `tempfile::TempDir` + `git2::Repository::init()`.
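A sketch of `make_test_issues` under assumed `IssueState` fields (the real struct lives in `crate::state` and likely has more fields):

```rust
// Hypothetical stand-in for crate::state::IssueState.
struct IssueState {
    id: String,
    title: String,
    open: bool,
}

/// Create `n` issues with predictable IDs and titles, alternating
/// open/closed so filter tests have both kinds of items.
fn make_test_issues(n: usize) -> Vec<IssueState> {
    (0..n)
        .map(|i| IssueState {
            id: format!("issue-{i}"),
            title: format!("Test issue {i}"),
            open: i % 2 == 0, // even indices open, odd closed
        })
        .collect()
}
```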

### Phase 3: Unit tests for state transitions (P1)

**Goal**: Cover all state-mutating methods with fast, isolated unit tests.

**Tests** (each is a `#[test]` function):

| Test name | What it verifies |
|-----------|-----------------|
| `test_new_app_defaults` | Initial state: tab=Issues, selection=Some(0), scroll=0, pane=ItemList, mode=Details, show_all=false |
| `test_new_app_empty` | When created with empty issues/patches, selection=None |
| `test_move_selection_down` | move_selection(1) increments selected index |
| `test_move_selection_up` | move_selection(-1) decrements selected index |
| `test_move_selection_clamp_bottom` | move_selection(1) at last index stays at last index |
| `test_move_selection_clamp_top` | move_selection(-1) at index 0 stays at 0 |
| `test_move_selection_empty_list` | move_selection on empty list is a no-op |
| `test_switch_tab_issues_to_patches` | switch_tab resets selection, scroll, mode, pane |
| `test_switch_tab_same_tab_noop` | switch_tab to current tab is a no-op |
| `test_toggle_show_all` | Toggling show_all changes visible count and resets selection |
| `test_visible_issues_filters_closed` | visible_issues() excludes closed when show_all=false |
| `test_visible_patches_filters_closed` | visible_patches() excludes closed/merged when show_all=false |
| `test_handle_key_quit` | handle_key('q') returns false (quit signal) |
| `test_handle_key_tab_switch` | handle_key('1'/'2') switches tabs |
| `test_handle_key_diff_toggle_patches` | handle_key('d') on Patches tab toggles ViewMode |
| `test_handle_key_diff_noop_issues` | handle_key('d') on Issues tab does nothing |
| `test_handle_key_pane_toggle` | handle_key(Tab/Enter) toggles pane |
| `test_scroll_in_detail_pane` | j/k in Detail pane changes scroll, not selection |
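The clamping rows above reduce to a small amount of logic; a minimal sketch with assumed field names:

```rust
// Hypothetical slice of the App struct relevant to selection movement.
struct App {
    selected: Option<usize>,
    len: usize, // number of visible items
}

impl App {
    /// Move the selection by `delta`, saturating at both ends.
    fn move_selection(&mut self, delta: isize) {
        // Empty list: selection is None and movement is a no-op.
        let Some(cur) = self.selected else { return };
        let last = self.len.saturating_sub(1);
        // Saturate at 0 going up and at the last index going down.
        self.selected = Some(cur.saturating_add_signed(delta).min(last));
    }
}
```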

### Phase 4: Render/snapshot tests (P2)

**Goal**: Verify TUI layout renders correctly for known states.

**Approach**: Use `ratatui::Terminal::new(TestBackend::new(width, height))` to create an in-memory terminal. Call `terminal.draw(|frame| ui(frame, &mut app))` and inspect `terminal.backend().buffer()`.

**Tests**:

| Test name | What it verifies |
|-----------|-----------------|
| `test_render_issues_tab` | Tab bar shows "1:Issues", list shows issue titles, detail shows selected issue info |
| `test_render_patches_tab` | Tab bar highlights "2:Patches", list shows patch titles |
| `test_render_empty_state` | Detail pane shows "No issues to display." |
| `test_render_footer_keys` | Footer contains key hints (j/k, Tab, q, etc.) |
| `test_render_small_terminal` | Rendering to a 20x10 backend does not panic |

### Phase 5: Integration tests with key sequences (P3)

**Goal**: Verify that realistic multi-step user interactions produce correct final state.

**Tests**:

| Test name | What it verifies |
|-----------|-----------------|
| `test_navigate_to_patch_and_view_diff` | Key sequence: j, j, 2, j, d -> Patches tab, idx 1, Diff mode |
| `test_toggle_filter_and_navigate` | Key sequence: a, j, j -> show_all=true, sees closed items, selection at 2 |
| `test_pane_switching_scroll_isolation` | Tab into detail, scroll down, Tab back -> selection unchanged |

### Phase 6: CI considerations

- All tests use `#[cfg(test)]` and run with `cargo test` -- no special CI setup needed
- `TestBackend` requires no terminal, so tests work in headless CI environments
- No new dev-dependencies needed (`tempfile` is already in Cargo.toml dev-dependencies)
- ratatui 0.30's `TestBackend` is included in the default ratatui crate (no feature flags needed)

## Complexity Tracking

No constitution violations. This feature adds only test code and a minor refactor (extracting `handle_key()`). No new dependencies, no new modules, no architectural changes.
diff --git a/specs/011-dashboard-testing/review.md b/specs/011-dashboard-testing/review.md
new file mode 100644
index 0000000..d25482d
--- /dev/null
+++ b/specs/011-dashboard-testing/review.md
@@ -0,0 +1,112 @@
# Fleet Review: 011-dashboard-testing

**Reviewed**: spec.md, plan.md, tasks.md
**Date**: 2026-03-21

---

## 1. Spec-Plan Alignment

**Status**: PASS with minor note

The plan covers all three user stories (US1: unit tests, US2: render tests, US3: integration tests) and maps directly to the spec's acceptance scenarios. All functional requirements (FR-001 through FR-010) are addressed.

**Minor discrepancy**: The plan's project structure section references `specs/010-dashboard-testing/` (wrong number -- should be `011`). This is cosmetic and does not affect implementation.

**Decision on visibility**: The spec (edge cases section, FR-001) suggests making App `pub(crate)` OR using an in-module test submodule. The plan correctly chooses the in-module approach, which is simpler and avoids polluting the crate's public API. Good decision.

---

## 2. Plan-Tasks Completeness

**Status**: PASS

All plan phases are covered by tasks:

| Plan Phase | Tasks | Coverage |
|-----------|-------|----------|
| Phase 1: Refactor for testability | T001-T004 | handle_key extraction, run_loop update, helpers |
| Phase 2: Test helpers | T003, T026, T033 | Spread across phases where needed |
| Phase 3: Unit tests (US1) | T005-T025 | 21 tests -- exceeds spec's SC-001 (10 minimum) |
| Phase 4: Render tests (US2) | T026-T032 | 6 render tests -- exceeds spec's SC-002 (3 minimum) |
| Phase 5: Integration tests (US3) | T033-T036 | 3 integration tests -- exceeds spec's SC-003 (2 minimum) |
| Phase 6: CI/Polish | T037-T039 | Clippy, timing, headless verification |

All spec edge cases are addressed:
- Empty list navigation: T011, T023
- show_all persistence across tabs: T024
- Selection clamping on filter toggle: T025
- Small terminal: T032

---

## 3. Dependency Ordering

**Status**: PASS

- Phase 1 (refactor) has no prerequisites and must complete before any tests.
- T001 (handle_key extraction) is correctly ordered before T017-T022 (handle_key tests) and T033-T036 (integration tests).
- T003 (helpers) is correctly ordered before Phase 2 tests.
- T026 (buffer helpers) is correctly ordered before Phase 3 render tests.
- T033 (apply_keys helper) is correctly ordered before Phase 4 integration tests.
- Phases 2 and 3 can run in parallel after Phase 1, which is correctly noted.

---

## 4. Feasibility

### 4a. ratatui TestBackend in 0.30

**Status**: VERIFIED

`ratatui::backend::TestBackend` is present in the ratatui 0.30.0 source at `~/.cargo/registry`. It is used extensively in ratatui's own test suite and doc examples. Usage pattern: `Terminal::new(TestBackend::new(width, height))`, then inspect `terminal.backend().buffer()`. No feature flags needed.

### 4b. App struct visibility

**Status**: VERIFIED -- No changes needed

The plan correctly identifies that in-module `#[cfg(test)] mod tests` has access to all private types in `src/tui.rs`. The App struct, Tab, Pane, ViewMode enums, and all methods are accessible without any visibility changes. This is the simplest approach.

### 4c. handle_key extraction feasibility

**Status**: VERIFIED -- Straightforward

The current key handling in `run_loop()` (lines 258-308 of `src/tui.rs`) is a self-contained `match` block on `key.code`. It only mutates `app` fields and calls `app.reload(repo)` for the 'r' key. Extraction into `App::handle_key()` is clean with one caveat:

**Caveat**: The 'r' (reload) key calls `app.reload(repo)` which requires a `&Repository`. The extracted `handle_key()` method either needs a `repo` parameter or the reload case needs special handling. The plan's signature `handle_key(&mut self, code: KeyCode, modifiers: KeyModifiers) -> bool` does not include `repo`. Options:
1. Add `repo: Option<&Repository>` parameter
2. Return an enum `HandleResult { Continue, Quit, Reload }` and let `run_loop` handle reload
3. Accept `repo: &Repository` as a required parameter

**Recommendation**: Option 2 (return enum) is cleanest -- it keeps handle_key free of git2 dependency for testing, and most tests don't need reload behavior. This is a minor design decision to resolve during T001.
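A minimal sketch of the recommended shape (all names assumed; the real method takes `KeyCode`, simplified here to `char` to keep the sketch dependency-free):

```rust
/// What the event loop should do after a key is handled. Keeping reload
/// as a variant means handle_key never needs a &Repository.
#[derive(Debug, PartialEq)]
enum HandleResult {
    Continue,
    Quit,
    Reload,
}

// Hypothetical slice of App state touched by these keys.
struct App {
    show_all: bool,
}

impl App {
    fn handle_key(&mut self, key: char) -> HandleResult {
        match key {
            'q' => HandleResult::Quit,
            // 'r' needs git access, so defer the actual reload to run_loop.
            'r' => HandleResult::Reload,
            'a' => {
                self.show_all = !self.show_all;
                HandleResult::Continue
            }
            _ => HandleResult::Continue,
        }
    }
}
```

`run_loop` then matches on the result and calls `app.reload(repo)` only for `Reload`, so tests never touch git2.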

### 4d. Test data construction

**Status**: VERIFIED with note

`IssueState` and `PatchState` are `pub` structs with `pub` fields, so they can be constructed directly in tests. The `Comment` struct contains a `commit_id: Oid` field from git2. This can be satisfied with `Oid::from_bytes(b"00000000000000000000").unwrap()` (any 20-byte slice works; here, 20 ASCII `'0'` bytes) without needing a real repository. No blocker.

---

## 5. Risk Assessment

| Risk | Severity | Mitigation |
|------|----------|------------|
| handle_key extraction breaks existing behavior | Low | T004 verifies cargo test + clippy pass after refactor; run_loop logic is straightforward |
| TestBackend API differences in ratatui 0.30 vs docs | Low | Verified in source; ratatui's own tests use the same pattern |
| Comment::commit_id requires Oid construction | Low | Oid::from_bytes with a 20-byte placeholder works without a repo |
| Test module grows too large in tui.rs | Medium | Plan acknowledges this; can extract to tui/tests.rs later if needed |
| 'r' key (reload) handling in handle_key | Low | Addressed in feasibility 4c; minor design decision for T001 |

---

## 6. Summary

The spec, plan, and tasks are well-aligned and feasible. The feature is low-risk since it adds only test code and a minor refactor. Key findings:

1. **One design decision needed**: handle_key() must handle the 'r'/reload case that requires a Repository. Recommend returning an action enum instead of a bool.
2. **One typo in plan**: `specs/010-dashboard-testing/` should be `specs/011-dashboard-testing/`.
3. **All external dependencies verified**: TestBackend exists in ratatui 0.30, tempfile is in dev-deps, Oid can be constructed without a repo.
4. **Task count exceeds all success criteria minimums**: 21 unit tests (min 10), 6 render tests (min 3), 3 integration tests (min 2).

**Verdict**: Ready for implementation.
diff --git a/specs/011-dashboard-testing/spec.md b/specs/011-dashboard-testing/spec.md
new file mode 100644
index 0000000..ad10f68
--- /dev/null
+++ b/specs/011-dashboard-testing/spec.md
@@ -0,0 +1,104 @@
# Feature Specification: Dashboard Testing Infrastructure

**Feature Branch**: `011-dashboard-testing`
**Created**: 2026-03-21
**Status**: Draft
**Input**: User description: "Add automated testing infrastructure for the TUI dashboard"

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Unit test App state transitions (Priority: P1)

As a developer, I want to unit test all App state transitions (key handling, mode changes, tab switching, filtering, selection movement) without rendering, so that state logic bugs are caught before they affect the UI.

**Why this priority**: State transitions are the core logic of the dashboard. Testing them without rendering is the simplest, fastest, and highest-value test layer. Every other test layer depends on correct state.

**Independent Test**: Can be fully tested by constructing an App with known data, calling state-mutating methods (move_selection, switch_tab, show_all toggling, ViewMode toggling), and asserting the resulting field values. Delivers confidence that keyboard handling produces correct state changes.

**Acceptance Scenarios**:

1. **Given** an App with 3 open issues, **When** move_selection(1) is called twice, **Then** list_state.selected() == Some(2)
2. **Given** an App on the Issues tab, **When** switch_tab(Tab::Patches) is called, **Then** tab == Tab::Patches, list_state resets to 0, scroll == 0, mode == ViewMode::Details
3. **Given** an App with show_all == false and 2 open + 1 closed issue, **When** show_all is toggled to true, **Then** visible_issue_count() == 3
4. **Given** an App with selection at index 0, **When** move_selection(-1) is called, **Then** selection remains at 0 (no underflow)
5. **Given** an App with selection at last index, **When** move_selection(1) is called, **Then** selection remains at last index (no overflow)
6. **Given** an App in Pane::ItemList, **When** Tab/Enter is pressed, **Then** pane switches to Pane::Detail
7. **Given** an App on Patches tab in ViewMode::Details, **When** 'd' is pressed, **Then** mode switches to ViewMode::Diff and scroll resets to 0
8. **Given** an App on Issues tab, **When** 'd' is pressed, **Then** mode does NOT change (diff toggle only applies to Patches)

---

### User Story 2 - Snapshot/render tests for TUI output (Priority: P2)

As a developer, I want snapshot/render tests that verify the TUI output for known states using ratatui's TestBackend, so that visual regressions in the dashboard layout are detected automatically.

**Why this priority**: Render tests build on correct state (P1) and verify that the visual output matches expectations. They catch layout bugs, missing widgets, and styling issues that unit tests cannot.

**Independent Test**: Can be tested by creating an App with known data, rendering via Terminal<TestBackend>, and asserting that specific strings appear in the rendered buffer at expected positions.

**Acceptance Scenarios**:

1. **Given** an App with 2 issues on the Issues tab, **When** ui() is rendered to a TestBackend, **Then** the buffer contains "1:Issues" and "2:Patches" in the tab bar, and both issue titles appear in the list area
2. **Given** an App with the detail pane focused, **When** rendered, **Then** the detail pane border is yellow and the list pane border is dark gray
3. **Given** an App on Patches tab with a selected patch, **When** rendered in ViewMode::Details, **Then** the detail pane shows "Patch Details" as title and displays patch metadata
4. **Given** an App with show_all == true, **When** rendered, **Then** the list block title contains "(all)"
5. **Given** an empty App (no issues, no patches), **When** rendered, **Then** the detail pane shows "No issues to display." or "No patches to display."

---

### User Story 3 - Integration tests simulating key sequences (Priority: P3)

As a developer, I want integration tests that simulate multi-step key sequences and verify the final App state, so that realistic user interaction flows are validated end-to-end without manual testing.

**Why this priority**: Integration tests exercise the full key-handling path including multiple sequential actions. They catch interaction bugs that individual unit tests miss, but are slower and depend on P1/P2 infrastructure.

**Independent Test**: Can be tested by constructing an App, feeding a sequence of simulated key events through the key-handling logic, and asserting the final state matches expectations.

**Acceptance Scenarios**:

1. **Given** an App with 3 issues and 2 patches, **When** the key sequence [j, j, 2, j, d] is processed, **Then** the app is on Patches tab, selection is at index 1, mode is ViewMode::Diff
2. **Given** an App with 2 open and 1 closed issue, **When** the key sequence [a, j, j] is processed, **Then** show_all is true and selection is at index 2 (the closed issue is now visible)
3. **Given** an App, **When** the key sequence [Tab, k, k, k] is processed in the detail pane, **Then** scroll decreases (saturating at 0) and selection does not change

---

### Edge Cases

- What happens when all items are filtered out (0 visible items) and a navigation key is pressed? Selection should remain None.
- What happens when switching tabs while show_all is true? show_all should persist, but selection resets.
- What happens when toggling show_all reduces visible count below current selection? Selection should clamp to last visible index.
- How does test isolation work when multiple tests run concurrently? Each test must use its own temp git repo or no repo at all.
- What happens when the terminal size is very small (e.g., 10x5)? Render tests should not panic.
- How do we handle the App struct being private to tui.rs? Tests need access -- either make App pub(crate) visibility or add a test module inside tui.rs.

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: App struct and its enums (Tab, Pane, ViewMode) MUST be accessible from test code, either via pub(crate) visibility or an in-module test submodule
- **FR-002**: Unit tests MUST cover all state-mutating methods: move_selection, switch_tab, toggle show_all, toggle ViewMode, pane switching
- **FR-003**: Unit tests MUST verify boundary conditions: selection at 0 moving up, selection at last moving down, empty lists
- **FR-004**: Render tests MUST use ratatui::backend::TestBackend to capture rendered output without a real terminal
- **FR-005**: Render tests MUST verify presence of key UI elements (tab bar, list items, detail content, footer) in the buffer
- **FR-006**: Integration tests MUST support feeding sequences of key events through the key-handling logic
- **FR-007**: All tests MUST be isolated -- no shared mutable state between tests
- **FR-008**: Tests requiring git repos MUST use tempfile + git2::Repository::init for ephemeral repos
- **FR-009**: Tests MUST NOT require a real terminal or user interaction
- **FR-010**: All tests MUST pass with `cargo test` and produce no warnings with `cargo clippy`

### Key Entities

- **App**: The central state struct holding tab, issues, patches, list_state, scroll, pane, mode, show_all, diff_cache
- **TestBackend**: ratatui's in-memory backend that captures rendered frames as a Buffer for assertions
- **IssueState / PatchState**: Data structs from src/state.rs used to populate App for testing

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: At least 10 unit tests covering all App state transition methods pass in `cargo test`
- **SC-002**: At least 3 render/snapshot tests verify TUI layout for different App states
- **SC-003**: At least 2 integration tests verify multi-step key sequences produce correct final state
- **SC-004**: All tests complete in under 5 seconds total
- **SC-005**: Zero clippy warnings in test code
- **SC-006**: Tests run successfully in CI without a real terminal (headless)
diff --git a/specs/011-dashboard-testing/tasks.md b/specs/011-dashboard-testing/tasks.md
new file mode 100644
index 0000000..976b3de
--- /dev/null
+++ b/specs/011-dashboard-testing/tasks.md
@@ -0,0 +1,152 @@
# Tasks: Dashboard Testing Infrastructure

**Input**: Design documents from `/specs/011-dashboard-testing/`
**Prerequisites**: plan.md (required), spec.md (required for user stories)

**Tests**: Tests ARE the deliverable for this feature. All phases include test tasks.

**Organization**: Tasks are grouped by phase matching the plan, with user story traceability.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions

---

## Phase 1: Setup -- Refactor for Testability

**Purpose**: Extract handle_key() and prepare App for direct testing from in-module tests.

- [ ] T001 [US1] Extract key-handling logic from `run_loop()` inline match into `App::handle_key(&mut self, code: KeyCode, modifiers: KeyModifiers) -> bool` in `src/tui.rs` (returns false on quit)
- [ ] T002 [US1] Update `run_loop()` in `src/tui.rs` to call `app.handle_key(key.code, key.modifiers)` instead of inline match
- [ ] T003 [US1] Add `#[cfg(test)] mod tests` block at the bottom of `src/tui.rs` with test helper functions: `make_test_issues(n)`, `make_test_patches(n)`, `make_app(issues, patches)`
- [ ] T004 [US1] Verify `cargo test` and `cargo clippy` pass after refactor (no behavioral changes)

**Checkpoint**: handle_key() is callable directly; test helpers exist; all existing behavior preserved.

---

## Phase 2: User Story 1 -- Unit Tests for State Transitions (Priority: P1)

**Goal**: Cover all state-mutating methods with fast, isolated unit tests in `src/tui.rs` `mod tests`.

**Independent Test**: Construct App with known data, call state-mutating methods, assert field values.

### App Construction Tests

- [ ] T005 [P] [US1] Test `test_new_app_defaults` -- initial state: tab=Issues, selection=Some(0), scroll=0, pane=ItemList, mode=Details, show_all=false in `src/tui.rs`
- [ ] T006 [P] [US1] Test `test_new_app_empty` -- empty issues/patches yields selection=None in `src/tui.rs`

### Selection Movement Tests

- [ ] T007 [P] [US1] Test `test_move_selection_down` -- move_selection(1) increments selected index in `src/tui.rs`
- [ ] T008 [P] [US1] Test `test_move_selection_up` -- move_selection(-1) decrements selected index in `src/tui.rs`
- [ ] T009 [P] [US1] Test `test_move_selection_clamp_bottom` -- move_selection(1) at last index stays at last in `src/tui.rs`
- [ ] T010 [P] [US1] Test `test_move_selection_clamp_top` -- move_selection(-1) at index 0 stays at 0 in `src/tui.rs`
- [ ] T011 [P] [US1] Test `test_move_selection_empty_list` -- move_selection on empty list is a no-op in `src/tui.rs`

### Tab Switching Tests

- [ ] T012 [P] [US1] Test `test_switch_tab_issues_to_patches` -- switch_tab resets selection, scroll, mode, pane in `src/tui.rs`
- [ ] T013 [P] [US1] Test `test_switch_tab_same_tab_noop` -- switch_tab to current tab is a no-op in `src/tui.rs`

### Filtering Tests

- [ ] T014 [P] [US1] Test `test_toggle_show_all` -- toggling show_all changes visible count and resets selection in `src/tui.rs`
- [ ] T015 [P] [US1] Test `test_visible_issues_filters_closed` -- visible_issues() excludes closed when show_all=false in `src/tui.rs`
- [ ] T016 [P] [US1] Test `test_visible_patches_filters_closed` -- visible_patches() excludes non-open when show_all=false in `src/tui.rs`

### handle_key() Tests

- [ ] T017 [P] [US1] Test `test_handle_key_quit` -- handle_key('q') returns false in `src/tui.rs`
- [ ] T018 [P] [US1] Test `test_handle_key_tab_switch` -- handle_key('1'/'2') switches tabs in `src/tui.rs`
- [ ] T019 [P] [US1] Test `test_handle_key_diff_toggle_patches` -- handle_key('d') on Patches tab toggles ViewMode in `src/tui.rs`
- [ ] T020 [P] [US1] Test `test_handle_key_diff_noop_issues` -- handle_key('d') on Issues tab does nothing in `src/tui.rs`
- [ ] T021 [P] [US1] Test `test_handle_key_pane_toggle` -- handle_key(Tab/Enter) toggles pane in `src/tui.rs`
- [ ] T022 [P] [US1] Test `test_scroll_in_detail_pane` -- j/k in Detail pane changes scroll, not selection in `src/tui.rs`

### Edge Case Tests

- [ ] T023 [P] [US1] Test `test_navigate_empty_after_filter` -- selection remains None when all items filtered out in `src/tui.rs`
- [ ] T024 [P] [US1] Test `test_show_all_persists_across_tab_switch` -- show_all stays true after switch_tab in `src/tui.rs`
- [ ] T025 [P] [US1] Test `test_toggle_show_all_clamps_selection` -- if selection > new visible count, clamp to last in `src/tui.rs`

**Checkpoint**: At least 10 unit tests pass covering all state transition methods. SC-001 satisfied.

---

## Phase 3: User Story 2 -- Snapshot/Render Tests (Priority: P2)

**Goal**: Verify TUI layout renders correctly for known states using ratatui TestBackend.

**Independent Test**: Create App, render via Terminal<TestBackend>, assert buffer contents.

- [ ] T026 [US2] Add `buffer_to_string()` and `assert_buffer_contains()` helper functions in `src/tui.rs` `mod tests`
- [ ] T027 [P] [US2] Test `test_render_issues_tab` -- tab bar shows "1:Issues", list shows issue titles, detail shows selected issue in `src/tui.rs`
- [ ] T028 [P] [US2] Test `test_render_patches_tab` -- tab bar highlights "2:Patches", list shows patch titles in `src/tui.rs`
- [ ] T029 [P] [US2] Test `test_render_empty_state` -- detail pane shows "No issues to display." in `src/tui.rs`
- [ ] T030 [P] [US2] Test `test_render_footer_keys` -- footer contains key hints (j/k, Tab, q) in `src/tui.rs`
- [ ] T031 [P] [US2] Test `test_render_show_all_title` -- list block title contains "(all)" when show_all=true in `src/tui.rs`
- [ ] T032 [P] [US2] Test `test_render_small_terminal` -- rendering to 20x10 TestBackend does not panic in `src/tui.rs`

**Checkpoint**: At least 3 render tests verify TUI layout. SC-002 satisfied.

---

## Phase 4: User Story 3 -- Integration Tests with Key Sequences (Priority: P3)

**Goal**: Verify realistic multi-step user interactions produce correct final state.

**Independent Test**: Construct App, feed key sequence through handle_key(), assert final state.

- [ ] T033 [US3] Add `apply_keys(app, keys: &[(KeyCode, KeyModifiers)])` helper in `src/tui.rs` `mod tests`
- [ ] T034 [P] [US3] Test `test_navigate_to_patch_and_view_diff` -- key sequence [j, j, 2, j, d] produces Patches tab, idx 1, Diff mode in `src/tui.rs`
- [ ] T035 [P] [US3] Test `test_toggle_filter_and_navigate` -- key sequence [a, j, j] with mixed statuses, show_all=true, selection at 2 in `src/tui.rs`
- [ ] T036 [P] [US3] Test `test_pane_switching_scroll_isolation` -- Tab into detail, scroll down, Tab back, selection unchanged in `src/tui.rs`

**Checkpoint**: At least 2 integration tests verify multi-step key sequences. SC-003 satisfied.

---

## Phase 5: Polish -- CI and Cross-Cutting

**Purpose**: Ensure all tests are CI-ready and meet quality bar.

- [ ] T037 Run `cargo clippy` and fix any warnings in test code in `src/tui.rs`
- [ ] T038 Run full `cargo test` and verify all tests pass in under 5 seconds (SC-004)
- [ ] T039 Verify tests work headless (no real terminal required) -- confirm TestBackend usage throughout (SC-006)

---

## Dependencies & Execution Order

### Phase Dependencies

- **Phase 1 (Setup)**: No dependencies -- start immediately
- **Phase 2 (US1)**: Depends on Phase 1 (T001-T003 for handle_key and helpers)
- **Phase 3 (US2)**: Depends on Phase 1 (T003 for helpers); can run in parallel with Phase 2
- **Phase 4 (US3)**: Depends on Phase 1 (T001 for handle_key) and T033 helper
- **Phase 5 (Polish)**: Depends on all previous phases

### Parallel Opportunities

- All tests within Phase 2 (T005-T025) are independent and can be written in parallel
- All tests within Phase 3 (T027-T032) are independent and can be written in parallel
- Phase 2 and Phase 3 can proceed in parallel after Phase 1 completes
- All tests within Phase 4 (T034-T036) are independent and can be written in parallel

### Within Each Phase

- Helper functions must be written before tests that use them
- All test code lives in `src/tui.rs` `#[cfg(test)] mod tests` block

---

## Notes

- All test code lives in-module (`#[cfg(test)] mod tests` in `src/tui.rs`) to access private types
- No new dependencies needed -- ratatui 0.30 includes TestBackend, tempfile already in dev-deps
- No visibility changes to App/Tab/Pane/ViewMode required (in-module tests have access)
- Commit after each phase or logical group
diff --git a/src/cli.rs b/src/cli.rs
index 8760148..e7f547b 100644
--- a/src/cli.rs
+++ b/src/cli.rs
@@ -237,4 +237,9 @@ pub enum PatchCmd {
        #[arg(short, long)]
        reason: Option<String>,
    },
    /// Import patches from format-patch files
    Import {
        /// One or more .patch files to import
        files: Vec<std::path::PathBuf>,
    },
}
diff --git a/src/editor.rs b/src/editor.rs
new file mode 100644
index 0000000..1c499fc
--- /dev/null
+++ b/src/editor.rs
@@ -0,0 +1,275 @@
use std::process::Command;

use crate::error::Error;

/// Resolve the user's preferred editor by checking $VISUAL, then $EDITOR.
/// Returns `None` if neither is set or both are empty.
pub fn resolve_editor() -> Option<String> {
    resolve_editor_from(
        std::env::var("VISUAL").ok().as_deref(),
        std::env::var("EDITOR").ok().as_deref(),
    )
}

/// Inner helper: resolve editor from explicit values (testable without env mutation).
fn resolve_editor_from(visual: Option<&str>, editor: Option<&str>) -> Option<String> {
    for val in [visual, editor].into_iter().flatten() {
        let trimmed = val.trim();
        if !trimmed.is_empty() {
            return Some(trimmed.to_string());
        }
    }
    None
}

/// Launch the editor at a specific file and line number.
///
/// The editor string is split on whitespace to support editors like `code --wait`.
/// The command is invoked as: `<editor...> +{line} {file}`.
pub fn open_editor_at(file: &str, line: u32) -> Result<(), Error> {
    let editor_str = resolve_editor().ok_or_else(|| {
        Error::Cmd("No editor configured. Set $EDITOR or $VISUAL.".to_string())
    })?;

    open_editor_at_with(file, line, &editor_str)
}

/// Inner helper: launch an editor command at a specific file and line.
/// Separated from `open_editor_at` so tests can pass an explicit editor string
/// without mutating environment variables.
fn open_editor_at_with(file: &str, line: u32, editor_str: &str) -> Result<(), Error> {
    if !std::path::Path::new(file).exists() {
        return Err(Error::Cmd(format!("File not found: {}", file)));
    }

    let parts: Vec<&str> = editor_str.split_whitespace().collect();
    if parts.is_empty() {
        return Err(Error::Cmd("Editor command is empty".to_string()));
    }

    let program = parts[0];
    let extra_args = &parts[1..];

    let status = Command::new(program)
        .args(extra_args)
        .arg(format!("+{}", line))
        .arg(file)
        .status()
        .map_err(|e| Error::Cmd(format!("Failed to launch editor '{}': {}", program, e)))?;

    if !status.success() {
        let code = status.code().unwrap_or(-1);
        return Err(Error::Cmd(format!("Editor exited with status: {}", code)));
    }

    Ok(())
}

/// Given a list of comment positions `(file, line, rendered_row)` and the
/// current scroll position, find the comment whose rendered row is closest to
/// and at or above the scroll position. Returns `(file, line)` if found.
///
/// The `rendered_position` (third tuple element) is the row index within the
/// rendered diff text where this inline comment appears.
pub fn find_comment_at_scroll(
    comments: &[(String, u32, usize)],
    scroll_pos: u16,
) -> Option<(String, u32)> {
    let scroll = scroll_pos as usize;
    let mut best: Option<&(String, u32, usize)> = None;

    for entry in comments {
        let rendered = entry.2;
        if rendered <= scroll {
            match best {
                Some(prev) if rendered > prev.2 => best = Some(entry),
                None => best = Some(entry),
                _ => {}
            }
        }
    }

    best.map(|e| (e.0.clone(), e.1))
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::io::Write;

    // ---- resolve_editor tests (pure, no env mutation) ----

    #[test]
    fn test_resolve_editor_visual_takes_precedence() {
        let result = resolve_editor_from(Some("nvim"), Some("vi"));
        assert_eq!(result, Some("nvim".to_string()));
    }

    #[test]
    fn test_resolve_editor_falls_back_to_editor() {
        let result = resolve_editor_from(None, Some("nano"));
        assert_eq!(result, Some("nano".to_string()));
    }

    #[test]
    fn test_resolve_editor_none_when_unset() {
        let result = resolve_editor_from(None, None);
        assert_eq!(result, None);
    }

    #[test]
    fn test_resolve_editor_skips_empty_visual() {
        let result = resolve_editor_from(Some("  "), Some("vim"));
        assert_eq!(result, Some("vim".to_string()));
    }

    #[test]
    fn test_resolve_editor_both_empty() {
        let result = resolve_editor_from(Some(""), Some(""));
        assert_eq!(result, None);
    }

    #[test]
    fn test_resolve_editor_trims_whitespace() {
        let result = resolve_editor_from(None, Some("  vim  "));
        assert_eq!(result, Some("vim".to_string()));
    }

    // ---- find_comment_at_scroll tests ----

    #[test]
    fn test_find_comment_at_scroll_empty() {
        let comments: Vec<(String, u32, usize)> = vec![];
        assert_eq!(find_comment_at_scroll(&comments, 10), None);
    }

    #[test]
    fn test_find_comment_at_scroll_exact_match() {
        let comments = vec![
            ("src/main.rs".to_string(), 42, 10),
            ("src/lib.rs".to_string(), 15, 20),
        ];
        let result = find_comment_at_scroll(&comments, 10);
        assert_eq!(result, Some(("src/main.rs".to_string(), 42)));
    }

    #[test]
    fn test_find_comment_at_scroll_picks_closest_above() {
        let comments = vec![
            ("a.rs".to_string(), 1, 5),
            ("b.rs".to_string(), 2, 15),
            ("c.rs".to_string(), 3, 25),
        ];
        // Scroll is at 20, closest at-or-above is rendered_pos=15
        let result = find_comment_at_scroll(&comments, 20);
        assert_eq!(result, Some(("b.rs".to_string(), 2)));
    }

    #[test]
    fn test_find_comment_at_scroll_all_below() {
        let comments = vec![
            ("a.rs".to_string(), 1, 30),
            ("b.rs".to_string(), 2, 40),
        ];
        // Scroll at 10, all comments are below
        let result = find_comment_at_scroll(&comments, 10);
        assert_eq!(result, None);
    }

    #[test]
    fn test_find_comment_at_scroll_first_wins_on_tie() {
        // Two comments at same rendered position -- first in list wins
        // because our > comparison doesn't replace when equal.
        let comments = vec![
            ("a.rs".to_string(), 1, 10),
            ("b.rs".to_string(), 2, 10),
        ];
        let result = find_comment_at_scroll(&comments, 10);
        assert_eq!(result, Some(("a.rs".to_string(), 1)));
    }

    #[test]
    fn test_find_comment_at_scroll_scroll_at_zero() {
        let comments = vec![
            ("a.rs".to_string(), 1, 0),
            ("b.rs".to_string(), 2, 5),
        ];
        let result = find_comment_at_scroll(&comments, 0);
        assert_eq!(result, Some(("a.rs".to_string(), 1)));
    }

    #[test]
    fn test_find_comment_at_scroll_single_comment_above() {
        let comments = vec![("only.rs".to_string(), 99, 3)];
        let result = find_comment_at_scroll(&comments, 100);
        assert_eq!(result, Some(("only.rs".to_string(), 99)));
    }

    // ---- open_editor_at tests (using inner helper, no env mutation) ----

    #[test]
    fn test_open_editor_at_success_with_true() {
        let mut tmp = tempfile::NamedTempFile::new().unwrap();
        writeln!(tmp, "hello").unwrap();
        let path = tmp.path().to_str().unwrap().to_string();

        let result = open_editor_at_with(&path, 1, "true");
        assert!(result.is_ok(), "Expected Ok, got {:?}", result);
    }

    #[test]
    fn test_open_editor_at_file_not_found() {
        let result =
            open_editor_at_with("/tmp/nonexistent_file_for_test_abc123xyz.txt", 1, "true");
        assert!(result.is_err());
        let err_msg = format!("{}", result.unwrap_err());
        assert!(
            err_msg.contains("File not found"),
            "Unexpected error: {}",
            err_msg
        );
    }

    #[test]
    fn test_open_editor_at_nonzero_exit() {
        let mut tmp = tempfile::NamedTempFile::new().unwrap();
        writeln!(tmp, "hello").unwrap();
        let path = tmp.path().to_str().unwrap().to_string();

        let result = open_editor_at_with(&path, 1, "false");
        assert!(result.is_err());
        let err_msg = format!("{}", result.unwrap_err());
        assert!(
            err_msg.contains("Editor exited with status"),
            "Unexpected error: {}",
            err_msg
        );
    }

    #[test]
    fn test_open_editor_at_with_args_in_editor_string() {
        // `true` ignores all arguments, so "true --wait" should succeed
        let mut tmp = tempfile::NamedTempFile::new().unwrap();
        writeln!(tmp, "hello").unwrap();
        let path = tmp.path().to_str().unwrap().to_string();

        let result = open_editor_at_with(&path, 42, "true --wait");
        assert!(result.is_ok(), "Expected Ok, got {:?}", result);
    }

    #[test]
    fn test_open_editor_at_bad_command() {
        let mut tmp = tempfile::NamedTempFile::new().unwrap();
        writeln!(tmp, "hello").unwrap();
        let path = tmp.path().to_str().unwrap().to_string();

        let result = open_editor_at_with(&path, 1, "nonexistent_editor_binary_xyz");
        assert!(result.is_err());
        let err_msg = format!("{}", result.unwrap_err());
        assert!(
            err_msg.contains("Failed to launch editor"),
            "Unexpected error: {}",
            err_msg
        );
    }
}
diff --git a/src/error.rs b/src/error.rs
index 2295ef2..f0ccc73 100644
--- a/src/error.rs
+++ b/src/error.rs
@@ -25,4 +25,10 @@ pub enum Error {

    #[error("untrusted key: {0}")]
    UntrustedKey(String),

    #[error("malformed patch: {0}")]
    MalformedPatch(String),

    #[error("patch apply failed: {0}")]
    PatchApplyFailed(String),
}
diff --git a/src/lib.rs b/src/lib.rs
index 81bdf42..8909477 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,5 +1,6 @@
pub mod cli;
pub mod dag;
pub mod editor;
pub mod error;
pub mod event;
pub mod identity;
@@ -258,6 +259,13 @@ pub fn run(cli: cli::Cli, repo: &Repository) -> Result<(), error::Error> {
                println!("Patch closed.");
                Ok(())
            }
            PatchCmd::Import { files } => {
                let ids = patch::import_series(repo, &files)?;
                for id in &ids {
                    println!("Imported patch {:.8}", id);
                }
                Ok(())
            }
        },
        Commands::Dashboard => tui::run(repo),
        Commands::Sync { remote } => sync::sync(repo, &remote),
diff --git a/src/patch.rs b/src/patch.rs
index b5eb128..6247f81 100644
--- a/src/patch.rs
+++ b/src/patch.rs
@@ -1,4 +1,6 @@
use std::path::Path;

use git2::{Diff, DiffFormat, Repository};

use crate::dag;
use crate::error::Error;
@@ -281,3 +283,187 @@ pub fn close(
    dag::append_event(repo, &ref_name, &event, &sk)?;
    Ok(())
}

// ---------------------------------------------------------------------------
// Patch import from format-patch files
// ---------------------------------------------------------------------------

/// Parsed metadata from a git format-patch mbox header.
struct PatchHeader {
    subject: String,
    body: String,
}

/// Parse a format-patch file into its mbox header metadata and the raw diff portion.
fn parse_format_patch(content: &str) -> Result<(PatchHeader, String), Error> {
    // Find the "---" separator that divides the commit message from the diffstat/diff.
    // The diff starts at the first line matching "diff --git".
    let diff_start = content
        .find("\ndiff --git ")
        .map(|i| i + 1) // skip the leading newline
        .ok_or_else(|| Error::MalformedPatch("no 'diff --git' found in patch file".to_string()))?;

    let header_section = &content[..diff_start];
    let diff_section = &content[diff_start..];

    // Extract Subject line
    let subject_line = header_section
        .lines()
        .find(|l| l.starts_with("Subject:"))
        .ok_or_else(|| Error::MalformedPatch("no Subject header found".to_string()))?;

    // Strip "Subject: " prefix and optional "[PATCH] " or "[PATCH n/m] " prefix
    let subject = subject_line
        .strip_prefix("Subject:")
        .unwrap()
        .trim();
    let subject = if let Some(rest) = subject.strip_prefix("[PATCH") {
        // Skip to the "] " closing bracket
        if let Some(idx) = rest.find("] ") {
            rest[idx + 2..].to_string()
        } else {
            subject.to_string()
        }
    } else {
        subject.to_string()
    };

    // Extract body: everything between the blank line after headers and the "---" separator
    let body = extract_body(header_section);

    // Trim trailing "-- \n2.xx.x\n" signature from diff
    let diff_clean = trim_patch_signature(diff_section);

    Ok((PatchHeader { subject, body }, diff_clean))
}

/// Extract the commit message body from the header section.
/// The body is between the first blank line after headers and the "---" line.
fn extract_body(header_section: &str) -> String {
    let lines: Vec<&str> = header_section.lines().collect();
    let mut body_start = None;
    let mut body_end = None;

    // Find first blank line (end of mail headers)
    for (i, line) in lines.iter().enumerate() {
        if line.is_empty() {
            body_start = Some(i + 1);
            break;
        }
    }

    // Find the "---" separator line (start of diffstat)
    for (i, line) in lines.iter().enumerate().rev() {
        if *line == "---" {
            body_end = Some(i);
            break;
        }
    }

    match (body_start, body_end) {
        (Some(start), Some(end)) if start < end => {
            lines[start..end].join("\n").trim().to_string()
        }
        (Some(start), None) => {
            // No "---" separator, take everything after headers
            lines[start..].join("\n").trim().to_string()
        }
        _ => String::new(),
    }
}

/// Remove trailing git patch signature ("-- \n2.xx.x\n") from diff content.
fn trim_patch_signature(diff: &str) -> String {
    if let Some(idx) = diff.rfind("\n-- \n") {
        diff[..idx + 1].to_string() // keep the trailing newline
    } else {
        diff.to_string()
    }
}

/// Import a single format-patch file, creating a commit and DAG entry.
/// Returns the patch ID.
pub fn import(repo: &Repository, patch_path: &Path) -> Result<String, Error> {
    let content = std::fs::read_to_string(patch_path)?;
    let (header, diff_text) = parse_format_patch(&content)?;

    // Parse the diff with git2
    let diff = Diff::from_buffer(diff_text.as_bytes())
        .map_err(|e| Error::MalformedPatch(format!("invalid diff: {}", e)))?;

    // Get the base (HEAD) commit and its tree
    let head_ref = repo
        .head()
        .map_err(|e| Error::PatchApplyFailed(format!("cannot resolve HEAD: {}", e)))?;
    let head_commit = head_ref
        .peel_to_commit()
        .map_err(|e| Error::PatchApplyFailed(format!("HEAD is not a commit: {}", e)))?;
    let base_tree = head_commit.tree()?;

    // Apply the diff to the base tree in-memory
    let new_index = repo
        .apply_to_tree(&base_tree, &diff, None)
        .map_err(|e| Error::PatchApplyFailed(format!("apply failed: {}", e)))?;

    // Write the index to a tree
    let tree_oid = {
        let mut idx = new_index;
        idx.write_tree_to(repo)?
    };
    let new_tree = repo.find_tree(tree_oid)?;

    // Create a commit on a detached ref (no branch update)
    let author = get_author(repo)?;
    let sig = crate::identity::author_signature(&author)?;
    let commit_msg = format!("imported: {}", header.subject);
    let commit_oid = repo.commit(
        None, // don't update any ref
        &sig,
        &sig,
        &commit_msg,
        &new_tree,
        &[&head_commit],
    )?;

    // Determine the base branch name from HEAD
    let base_ref = repo
        .head()?
        .shorthand()
        .unwrap_or("main")
        .to_string();

    // Create a DAG entry using the existing patch create infrastructure
    let id = create(
        repo,
        &header.subject,
        &header.body,
        &base_ref,
        &commit_oid.to_string(),
        None,
    )?;

    Ok(id)
}

/// Import a series of format-patch files. If any fails, rolls back all
/// previously imported patches from this series.
pub fn import_series(repo: &Repository, files: &[impl AsRef<Path>]) -> Result<Vec<String>, Error> {
    let mut imported_ids: Vec<String> = Vec::new();

    for file in files {
        match import(repo, file.as_ref()) {
            Ok(id) => imported_ids.push(id),
            Err(e) => {
                // Rollback: delete all refs created in this series
                for id in &imported_ids {
                    let ref_name = format!("refs/collab/patches/{}", id);
                    if let Ok(mut reference) = repo.find_reference(&ref_name) {
                        let _ = reference.delete();
                    }
                }
                return Err(e);
            }
        }
    }

    Ok(imported_ids)
}
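
The rollback contract of `import_series` can be sketched without git2: items are applied in order, and the first failure undoes everything applied earlier in the series before the error is returned. This is a minimal stand-in (integers instead of patch files, a `Vec` instead of refs), not the real implementation.

```rust
// All-or-nothing series application: the first failure rolls back
// every earlier success, mirroring the ref deletions in import_series.
fn apply_series(items: &[i64]) -> Result<Vec<i64>, String> {
    let mut applied = Vec::new();
    for &item in items {
        if item < 0 {
            // Rollback: discard everything applied in this series so far.
            applied.clear();
            return Err(format!("item {} failed; series rolled back", item));
        }
        applied.push(item);
    }
    Ok(applied)
}

fn main() {
    assert_eq!(apply_series(&[1, 2, 3]), Ok(vec![1, 2, 3]));
    assert!(apply_series(&[1, -2, 3]).is_err());
    println!("ok");
}
```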
diff --git a/src/tui.rs b/src/tui.rs
index c3c5d8b..0d0840d 100644
--- a/src/tui.rs
+++ b/src/tui.rs
@@ -5,31 +5,42 @@ use std::time::Duration;
use crossterm::event::{self, Event, KeyCode, KeyModifiers};
use crossterm::terminal::{self, EnterAlternateScreen, LeaveAlternateScreen};
use crossterm::ExecutableCommand;
use git2::{Oid, Repository};
use ratatui::prelude::*;
use ratatui::widgets::{Block, Borders, List, ListItem, ListState, Paragraph, Tabs, Wrap};

use crate::error::Error;
use crate::event::Action;
use crate::issue as issue_mod;
use crate::patch as patch_mod;
use crate::state::{self, IssueState, IssueStatus, PatchState, PatchStatus};

#[derive(Debug, PartialEq)]
enum Pane {
    ItemList,
    Detail,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Tab {
    Issues,
    Patches,
}

#[derive(Debug, PartialEq)]
enum ViewMode {
    Details,
    Diff,
    CommitList,
    CommitDetail,
}

#[derive(Debug, PartialEq)]
enum KeyAction {
    Continue,
    Quit,
    Reload,
    OpenCommitBrowser,
}

#[derive(Debug, PartialEq, Clone, Copy)]
@@ -80,6 +91,8 @@ struct App {
    input_buf: String,
    create_title: String,
    status_msg: Option<String>,
    event_history: Vec<(Oid, crate::event::Event)>,
    event_list_state: ListState,
}

impl App {
@@ -103,6 +116,8 @@ impl App {
            input_buf: String::new(),
            create_title: String::new(),
            status_msg: None,
            event_history: Vec::new(),
            event_list_state: ListState::default(),
        }
    }

@@ -248,6 +263,183 @@ impl App {
        }
    }

    fn handle_key(&mut self, code: KeyCode, modifiers: KeyModifiers) -> KeyAction {
        // Handle CommitDetail mode first
        if self.mode == ViewMode::CommitDetail {
            match code {
                KeyCode::Esc => {
                    self.mode = ViewMode::CommitList;
                    self.scroll = 0;
                    return KeyAction::Continue;
                }
                KeyCode::Char('q') => return KeyAction::Quit,
                KeyCode::Char('c') if modifiers.contains(KeyModifiers::CONTROL) => {
                    return KeyAction::Quit;
                }
                KeyCode::Char('j') | KeyCode::Down => {
                    self.scroll = self.scroll.saturating_add(1);
                    return KeyAction::Continue;
                }
                KeyCode::Char('k') | KeyCode::Up => {
                    self.scroll = self.scroll.saturating_sub(1);
                    return KeyAction::Continue;
                }
                KeyCode::PageDown => {
                    self.scroll = self.scroll.saturating_add(20);
                    return KeyAction::Continue;
                }
                KeyCode::PageUp => {
                    self.scroll = self.scroll.saturating_sub(20);
                    return KeyAction::Continue;
                }
                _ => return KeyAction::Continue,
            }
        }

        // Handle CommitList mode
        if self.mode == ViewMode::CommitList {
            match code {
                KeyCode::Esc => {
                    self.event_history.clear();
                    self.event_list_state = ListState::default();
                    self.mode = ViewMode::Details;
                    self.scroll = 0;
                    return KeyAction::Continue;
                }
                KeyCode::Char('q') => return KeyAction::Quit,
                KeyCode::Char('c') if modifiers.contains(KeyModifiers::CONTROL) => {
                    return KeyAction::Quit;
                }
                KeyCode::Char('j') | KeyCode::Down => {
                    let len = self.event_history.len();
                    if len > 0 {
                        let current = self.event_list_state.selected().unwrap_or(0);
                        let new = (current + 1).min(len - 1);
                        self.event_list_state.select(Some(new));
                    }
                    return KeyAction::Continue;
                }
                KeyCode::Char('k') | KeyCode::Up => {
                    if !self.event_history.is_empty() {
                        let current = self.event_list_state.selected().unwrap_or(0);
                        let new = current.saturating_sub(1);
                        self.event_list_state.select(Some(new));
                    }
                    return KeyAction::Continue;
                }
                KeyCode::Enter => {
                    if self.event_list_state.selected().is_some() {
                        self.mode = ViewMode::CommitDetail;
                        self.scroll = 0;
                    }
                    return KeyAction::Continue;
                }
                _ => return KeyAction::Continue,
            }
        }

        // Normal Details/Diff mode handling
        match code {
            KeyCode::Char('q') | KeyCode::Esc => KeyAction::Quit,
            KeyCode::Char('c') if modifiers.contains(KeyModifiers::CONTROL) => KeyAction::Quit,
            KeyCode::Char('c') => {
                // Open commit browser: only when in detail pane with an item selected
                if self.pane == Pane::Detail && self.list_state.selected().is_some() {
                    KeyAction::OpenCommitBrowser
                } else {
                    KeyAction::Continue
                }
            }
            KeyCode::Char('1') => {
                self.switch_tab(Tab::Issues);
                KeyAction::Continue
            }
            KeyCode::Char('2') => {
                self.switch_tab(Tab::Patches);
                KeyAction::Continue
            }
            KeyCode::Char('j') | KeyCode::Down => {
                if self.pane == Pane::ItemList {
                    self.move_selection(1);
                } else {
                    self.scroll = self.scroll.saturating_add(1);
                }
                KeyAction::Continue
            }
            KeyCode::Char('k') | KeyCode::Up => {
                if self.pane == Pane::ItemList {
                    self.move_selection(-1);
                } else {
                    self.scroll = self.scroll.saturating_sub(1);
                }
                KeyAction::Continue
            }
            KeyCode::PageDown => {
                self.scroll = self.scroll.saturating_add(20);
                KeyAction::Continue
            }
            KeyCode::PageUp => {
                self.scroll = self.scroll.saturating_sub(20);
                KeyAction::Continue
            }
            KeyCode::Tab | KeyCode::Enter => {
                self.pane = match self.pane {
                    Pane::ItemList => Pane::Detail,
                    Pane::Detail => Pane::ItemList,
                };
                KeyAction::Continue
            }
            KeyCode::Char('d') => {
                if self.tab == Tab::Patches {
                    match self.mode {
                        ViewMode::Details => {
                            self.mode = ViewMode::Diff;
                            self.scroll = 0;
                        }
                        ViewMode::Diff => {
                            self.mode = ViewMode::Details;
                            self.scroll = 0;
                        }
                        _ => {}
                    }
                }
                KeyAction::Continue
            }
            KeyCode::Char('a') => {
                self.status_filter = self.status_filter.next();
                let count = self.visible_count();
                self.list_state
                    .select(if count > 0 { Some(0) } else { None });
                KeyAction::Continue
            }
            KeyCode::Char('r') => KeyAction::Reload,
            _ => KeyAction::Continue,
        }
    }

    fn selected_item_id(&self) -> Option<String> {
        let idx = self.list_state.selected()?;
        match self.tab {
            Tab::Issues => {
                let visible = self.visible_issues();
                visible.get(idx).map(|i| i.id.clone())
            }
            Tab::Patches => {
                let visible = self.visible_patches();
                visible.get(idx).map(|p| p.id.clone())
            }
        }
    }

    fn selected_ref_name(&self) -> Option<String> {
        let id = self.selected_item_id()?;
        let prefix = match self.tab {
            Tab::Issues => "refs/collab/issues",
            Tab::Patches => "refs/collab/patches",
        };
        Some(format!("{}/{}", prefix, id))
    }

    fn reload(&mut self, repo: &Repository) {
        if let Ok(issues) = state::list_issues(repo) {
            self.issues = issues;
@@ -268,6 +460,115 @@ impl App {
    }
}

fn action_type_label(action: &Action) -> &str {
    match action {
        Action::IssueOpen { .. } => "Issue Open",
        Action::IssueComment { .. } => "Issue Comment",
        Action::IssueClose { .. } => "Issue Close",
        Action::IssueReopen => "Issue Reopen",
        Action::PatchCreate { .. } => "Patch Create",
        Action::PatchRevise { .. } => "Patch Revise",
        Action::PatchReview { .. } => "Patch Review",
        Action::PatchComment { .. } => "Patch Comment",
        Action::PatchInlineComment { .. } => "Inline Comment",
        Action::PatchClose { .. } => "Patch Close",
        Action::PatchMerge => "Patch Merge",
        Action::Merge => "Merge",
        Action::IssueEdit { .. } => "Issue Edit",
        Action::IssueLabel { .. } => "Issue Label",
        Action::IssueUnlabel { .. } => "Issue Unlabel",
        Action::IssueAssign { .. } => "Issue Assign",
        Action::IssueUnassign { .. } => "Issue Unassign",
    }
}

fn format_event_detail(oid: &Oid, event: &crate::event::Event) -> String {
    let short_oid = &oid.to_string()[..7];
    let action_label = action_type_label(&event.action);

    let mut detail = format!(
        "Commit:  {}\nAuthor:  {} <{}>\nDate:    {}\nType:    {}\n",
        short_oid, event.author.name, event.author.email, event.timestamp, action_label,
    );

    // Action-specific payload
    match &event.action {
        Action::IssueOpen { title, body } => {
            detail.push_str(&format!("\nTitle: {}\n", title));
            if !body.is_empty() {
                detail.push_str(&format!("\n{}\n", body));
            }
        }
        Action::IssueComment { body } | Action::PatchComment { body } => {
            detail.push_str(&format!("\n{}\n", body));
        }
        Action::IssueClose { reason } | Action::PatchClose { reason } => {
            if let Some(r) = reason {
                detail.push_str(&format!("\nReason: {}\n", r));
            }
        }
        Action::PatchCreate {
            title,
            body,
            base_ref,
            head_commit,
            ..
        } => {
            detail.push_str(&format!("\nTitle: {}\n", title));
            detail.push_str(&format!("Base:  {}\n", base_ref));
            detail.push_str(&format!("Head:  {}\n", head_commit));
            if !body.is_empty() {
                detail.push_str(&format!("\n{}\n", body));
            }
        }
        Action::PatchRevise { body, head_commit } => {
            detail.push_str(&format!("\nHead: {}\n", head_commit));
            if let Some(b) = body {
                if !b.is_empty() {
                    detail.push_str(&format!("\n{}\n", b));
                }
            }
        }
        Action::PatchReview { verdict, body } => {
            detail.push_str(&format!("\nVerdict: {:?}\n", verdict));
            if !body.is_empty() {
                detail.push_str(&format!("\n{}\n", body));
            }
        }
        Action::PatchInlineComment { file, line, body } => {
            detail.push_str(&format!("\nFile: {}:{}\n", file, line));
            if !body.is_empty() {
                detail.push_str(&format!("\n{}\n", body));
            }
        }
        Action::IssueEdit { title, body } => {
            if let Some(t) = title {
                detail.push_str(&format!("\nNew Title: {}\n", t));
            }
            if let Some(b) = body {
                if !b.is_empty() {
                    detail.push_str(&format!("\nNew Body: {}\n", b));
                }
            }
        }
        Action::IssueLabel { label } => {
            detail.push_str(&format!("\nLabel: {}\n", label));
        }
        Action::IssueUnlabel { label } => {
            detail.push_str(&format!("\nRemoved Label: {}\n", label));
        }
        Action::IssueAssign { assignee } => {
            detail.push_str(&format!("\nAssignee: {}\n", assignee));
        }
        Action::IssueUnassign { assignee } => {
            detail.push_str(&format!("\nRemoved Assignee: {}\n", assignee));
        }
        Action::IssueReopen | Action::PatchMerge | Action::Merge => {}
    }

    detail
}
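
The fixed header block that `format_event_detail` emits before the action-specific payload can be shown in isolation; the values below are made up for illustration, and `detail_header` is a hypothetical extraction, not a function in this module.

```rust
// Builds the common "Commit / Author / Date / Type" header block.
fn detail_header(oid: &str, name: &str, email: &str, ts: &str, label: &str) -> String {
    format!(
        "Commit:  {}\nAuthor:  {} <{}>\nDate:    {}\nType:    {}\n",
        &oid[..7], // short OID, first seven hex digits
        name, email, ts, label,
    )
}

fn main() {
    let d = detail_header(
        "7a63914e1",
        "Test User",
        "test@example.com",
        "2026-01-01T00:00:00Z",
        "Issue Open",
    );
    assert!(d.starts_with("Commit:  7a63914\n"));
    assert!(d.contains("Type:    Issue Open"));
    println!("ok");
}
```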

pub fn run(repo: &Repository) -> Result<(), Error> {
    let issues = state::list_issues(repo)?;
    let patches = state::list_patches(repo)?;
@@ -423,123 +724,112 @@ fn run_loop(
                    InputMode::Normal => {}
                }

                // Handle keys that need repo access or are run_loop-specific
                // before delegating to handle_key
                if app.mode == ViewMode::Details || app.mode == ViewMode::Diff {
                    match key.code {
                        KeyCode::Char('/') => {
                            app.input_mode = InputMode::Search;
                            app.search_query.clear();
                            continue;
                        }
                        KeyCode::Char('n') => {
                            if app.tab != Tab::Issues {
                                app.switch_tab(Tab::Issues);
                            }
                            app.input_mode = InputMode::CreateTitle;
                            app.input_buf.clear();
                            app.create_title.clear();
                            continue;
                        }
                        KeyCode::Char('g') => {
                            if !app.follow_link() {
                                app.status_msg = Some("No linked item to follow".to_string());
                            }
                            continue;
                        }
                        KeyCode::Char('o') => {
                            // Check out the relevant commit for local browsing
                            let checkout_target = match app.tab {
                                Tab::Patches => {
                                    let visible = app.visible_patches();
                                    app.list_state
                                        .selected()
                                        .and_then(|idx| visible.get(idx))
                                        .map(|p| p.head_commit.clone())
                                }
                                Tab::Issues => {
                                    // Find linked patch's head commit, or fall back to closing commit
                                    let visible = app.visible_issues();
                                    app.list_state
                                        .selected()
                                        .and_then(|idx| visible.get(idx))
                                        .and_then(|issue| {
                                            // Try linked patch first
                                            app.patches
                                                .iter()
                                                .find(|p| p.fixes.as_deref() == Some(&issue.id))
                                                .map(|p| p.head_commit.clone())
                                                // Fall back to closing commit
                                                .or_else(|| {
                                                    issue.closed_by.map(|oid| oid.to_string())
                                                })
                                        })
                                }
                            };
                            if let Some(head) = checkout_target {
                                // Exit TUI, checkout, and return
                                terminal::disable_raw_mode()?;
                                stdout().execute(LeaveAlternateScreen)?;
                                let status = std::process::Command::new("git")
                                    .args(["checkout", &head])
                                    .status();
                                match status {
                                    Ok(s) if s.success() => {
                                        println!("Checked out commit: {:.8}", head);
                                        println!("Use 'git checkout -' to return.");
                                    }
                                    Ok(s) => {
                                        eprintln!("git checkout exited with {}", s);
                                    }
                                    Err(e) => {
                                        eprintln!("Failed to run git checkout: {}", e);
                                    }
                                }
                                return Ok(());
                            } else {
                                app.status_msg =
                                    Some("No linked patch to check out".to_string());
                            }
                            continue;
                        }
                        _ => {}
                    }
                }

                match app.handle_key(key.code, key.modifiers) {
                    KeyAction::Quit => return Ok(()),
                    KeyAction::Reload => app.reload(repo),
                    KeyAction::OpenCommitBrowser => {
                        if let Some(ref_name) = app.selected_ref_name() {
                            match crate::dag::walk_events(repo, &ref_name) {
                                Ok(events) => {
                                    app.event_history = events;
                                    app.event_list_state = ListState::default();
                                    if !app.event_history.is_empty() {
                                        app.event_list_state.select(Some(0));
                                    }
                                    app.mode = ViewMode::CommitList;
                                    app.scroll = 0;
                                }
                                Err(e) => {
                                    app.status_msg =
                                        Some(format!("Error loading events: {}", e));
                                }
                            }
                        }
                    }
                    KeyAction::Continue => {}
                }
            }
        }
@@ -669,17 +959,75 @@ fn render_list(frame: &mut Frame, app: &mut App, area: Rect) {
    }
}

fn render_detail(frame: &mut Frame, app: &mut App, area: Rect) {
    let border_style = if app.pane == Pane::Detail {
        Style::default().fg(Color::Yellow)
    } else {
        Style::default().fg(Color::DarkGray)
    };

    // Handle commit browser modes
    if app.mode == ViewMode::CommitList {
        let items: Vec<ListItem> = app
            .event_history
            .iter()
            .map(|(_oid, evt)| {
                let label = action_type_label(&evt.action);
                ListItem::new(format!(
                    "{} | {} | {}",
                    label, evt.author.name, evt.timestamp
                ))
            })
            .collect();

        let list = List::new(items)
            .block(
                Block::default()
                    .borders(Borders::ALL)
                    .title("Event History")
                    .border_style(border_style),
            )
            .highlight_style(
                Style::default()
                    .bg(Color::DarkGray)
                    .add_modifier(Modifier::BOLD),
            )
            .highlight_symbol("> ");

        frame.render_stateful_widget(list, area, &mut app.event_list_state);
        return;
    }

    if app.mode == ViewMode::CommitDetail {
        let content = if let Some(idx) = app.event_list_state.selected() {
            if let Some((oid, evt)) = app.event_history.get(idx) {
                format_event_detail(oid, evt)
            } else {
                "No event selected.".to_string()
            }
        } else {
            "No event selected.".to_string()
        };

        let block = Block::default()
            .borders(Borders::ALL)
            .title("Event Detail")
            .border_style(border_style);

        let para = Paragraph::new(content)
            .block(block)
            .wrap(Wrap { trim: false })
            .scroll((app.scroll, 0));

        frame.render_widget(para, area);
        return;
    }

    let title = match (&app.tab, &app.mode) {
        (Tab::Issues, _) => "Issue Details",
        (Tab::Patches, ViewMode::Details) => "Patch Details",
        (Tab::Patches, ViewMode::Diff) => "Diff",
        _ => "Details",
    };

    let content: Text = match app.tab {
@@ -705,6 +1053,7 @@ fn render_detail(frame: &mut Frame, app: &App, area: Rect) {
                            .unwrap_or("Loading...");
                        colorize_diff(diff_text, &patch.inline_comments)
                    }
                    _ => Text::raw(""),
                },
                None => Text::raw("No matches for current filter."),
            }
@@ -1166,33 +1515,39 @@ fn render_footer(frame: &mut Frame, app: &App, area: Rect) {
        InputMode::Normal => {}
    }

    // Show status message if present
    if let Some(ref msg) = app.status_msg {
        let para = Paragraph::new(format!(" {}", msg))
            .style(Style::default().bg(Color::Yellow).fg(Color::Black));
        frame.render_widget(para, area);
        return;
    }

    let mode_hint = match app.mode {
        ViewMode::CommitList => "  Esc:back",
        ViewMode::CommitDetail => "  Esc:back  j/k:scroll",
        _ => {
            if app.tab == Tab::Patches {
                match app.mode {
                    ViewMode::Details => "  d:diff  c:events",
                    ViewMode::Diff => "  d:details  c:events",
                    _ => "",
                }
            } else {
                "  c:events"
            }
        }
    };
    let filter_hint = match app.status_filter {
        StatusFilter::Open => "a:show all",
        StatusFilter::All => "a:closed",
        StatusFilter::Closed => "a:open only",
    };
    let text = format!(
        " 1:issues  2:patches  j/k:navigate  Tab:pane  {}{}  /:search  n:new issue  g:follow  o:checkout  r:refresh  q:quit",
        filter_hint, mode_hint
    );
    let para = Paragraph::new(text).style(Style::default().bg(Color::DarkGray).fg(Color::White));
    frame.render_widget(para, area);
}

@@ -1396,4 +1751,734 @@ mod tests {

        assert_eq!(app.input_mode, InputMode::Normal);
    }

    // ── Commit browser test helpers ─────────────────────────────────────

    use crate::event::ReviewVerdict;
    use ratatui::backend::TestBackend;
    use ratatui::buffer::Buffer;

    fn test_author() -> Author {
        Author {
            name: "Test User".to_string(),
            email: "test@example.com".to_string(),
        }
    }

    fn make_test_issues(n: usize) -> Vec<IssueState> {
        (0..n)
            .map(|i| IssueState {
                id: format!("{:08x}", i),
                title: format!("Issue {}", i),
                body: format!("Body for issue {}", i),
                status: if i % 2 == 0 {
                    IssueStatus::Open
                } else {
                    IssueStatus::Closed
                },
                close_reason: if i % 2 == 1 {
                    Some("done".to_string())
                } else {
                    None
                },
                closed_by: None,
                labels: vec![],
                assignees: vec![],
                comments: Vec::new(),
                created_at: "2026-01-01T00:00:00Z".to_string(),
                author: test_author(),
            })
            .collect()
    }

    fn make_test_patches(n: usize) -> Vec<PatchState> {
        (0..n)
            .map(|i| PatchState {
                id: format!("p{:07x}", i),
                title: format!("Patch {}", i),
                body: format!("Body for patch {}", i),
                status: if i % 2 == 0 {
                    PatchStatus::Open
                } else {
                    PatchStatus::Closed
                },
                base_ref: "main".to_string(),
                head_commit: format!("h{:07x}", i),
                fixes: None,
                comments: Vec::new(),
                inline_comments: Vec::new(),
                reviews: Vec::new(),
                created_at: "2026-01-01T00:00:00Z".to_string(),
                author: test_author(),
            })
            .collect()
    }

    fn make_app(issues: usize, patches: usize) -> App {
        App::new(make_test_issues(issues), make_test_patches(patches))
    }

    fn render_app(app: &mut App) -> Buffer {
        let backend = TestBackend::new(80, 24);
        let mut terminal = Terminal::new(backend).unwrap();
        terminal.draw(|frame| ui(frame, app)).unwrap();
        terminal.backend().buffer().clone()
    }

    fn buffer_to_string(buf: &Buffer) -> String {
        let area = buf.area;
        let mut s = String::new();
        for y in area.y..area.y + area.height {
            for x in area.x..area.x + area.width {
                let cell = buf.cell((x, y)).unwrap();
                s.push_str(cell.symbol());
            }
            s.push('\n');
        }
        s
    }

    fn assert_buffer_contains(buf: &Buffer, expected: &str) {
        let text = buffer_to_string(buf);
        assert!(
            text.contains(expected),
            "expected buffer to contain {:?}, but it was not found in:\n{}",
            expected,
            text
        );
    }

    /// Create sample event history for testing commit browser
    fn make_test_event_history() -> Vec<(Oid, crate::event::Event)> {
        let oid1 = Oid::from_str("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa").unwrap();
        let oid2 = Oid::from_str("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb").unwrap();
        let oid3 = Oid::from_str("cccccccccccccccccccccccccccccccccccccccc").unwrap();

        vec![
            (
                oid1,
                crate::event::Event {
                    timestamp: "2026-01-01T00:00:00Z".to_string(),
                    author: test_author(),
                    action: Action::IssueOpen {
                        title: "Test Issue".to_string(),
                        body: "This is the body".to_string(),
                    },
                },
            ),
            (
                oid2,
                crate::event::Event {
                    timestamp: "2026-01-02T00:00:00Z".to_string(),
                    author: Author {
                        name: "Other User".to_string(),
                        email: "other@example.com".to_string(),
                    },
                    action: Action::IssueComment {
                        body: "A comment on the issue".to_string(),
                    },
                },
            ),
            (
                oid3,
                crate::event::Event {
                    timestamp: "2026-01-03T00:00:00Z".to_string(),
                    author: test_author(),
                    action: Action::IssueClose {
                        reason: Some("fixed".to_string()),
                    },
                },
            ),
        ]
    }

    // ── action_type_label tests ──────────────────────────────────────────

    #[test]
    fn test_action_type_label_issue_open() {
        let action = Action::IssueOpen {
            title: "t".to_string(),
            body: "b".to_string(),
        };
        assert_eq!(action_type_label(&action), "Issue Open");
    }

    #[test]
    fn test_action_type_label_issue_comment() {
        let action = Action::IssueComment {
            body: "b".to_string(),
        };
        assert_eq!(action_type_label(&action), "Issue Comment");
    }

    #[test]
    fn test_action_type_label_issue_close() {
        let action = Action::IssueClose { reason: None };
        assert_eq!(action_type_label(&action), "Issue Close");
    }

    #[test]
    fn test_action_type_label_issue_reopen() {
        assert_eq!(action_type_label(&Action::IssueReopen), "Issue Reopen");
    }

    #[test]
    fn test_action_type_label_patch_create() {
        let action = Action::PatchCreate {
            title: "t".to_string(),
            body: "b".to_string(),
            base_ref: "main".to_string(),
            head_commit: "abc".to_string(),
            fixes: None,
        };
        assert_eq!(action_type_label(&action), "Patch Create");
    }

    #[test]
    fn test_action_type_label_patch_review() {
        let action = Action::PatchReview {
            verdict: ReviewVerdict::Approve,
            body: "lgtm".to_string(),
        };
        assert_eq!(action_type_label(&action), "Patch Review");
    }

    #[test]
    fn test_action_type_label_inline_comment() {
        let action = Action::PatchInlineComment {
            file: "src/main.rs".to_string(),
            line: 42,
            body: "nit".to_string(),
        };
        assert_eq!(action_type_label(&action), "Inline Comment");
    }

    #[test]
    fn test_action_type_label_merge() {
        assert_eq!(action_type_label(&Action::Merge), "Merge");
    }

    // ── format_event_detail tests ────────────────────────────────────────

    #[test]
    fn test_format_event_detail_issue_open() {
        let oid = Oid::from_str("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa").unwrap();
        let event = crate::event::Event {
            timestamp: "2026-01-01T00:00:00Z".to_string(),
            author: test_author(),
            action: Action::IssueOpen {
                title: "My Issue".to_string(),
                body: "Description here".to_string(),
            },
        };
        let detail = format_event_detail(&oid, &event);
        assert!(detail.contains("aaaaaaa"));
        assert!(detail.contains("Test User <test@example.com>"));
        assert!(detail.contains("2026-01-01T00:00:00Z"));
        assert!(detail.contains("Issue Open"));
        assert!(detail.contains("Title: My Issue"));
        assert!(detail.contains("Description here"));
    }

    #[test]
    fn test_format_event_detail_issue_close_with_reason() {
        let oid = Oid::from_str("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb").unwrap();
        let event = crate::event::Event {
            timestamp: "2026-02-01T00:00:00Z".to_string(),
            author: test_author(),
            action: Action::IssueClose {
                reason: Some("resolved".to_string()),
            },
        };
        let detail = format_event_detail(&oid, &event);
        assert!(detail.contains("Issue Close"));
        assert!(detail.contains("Reason: resolved"));
    }

    #[test]
    fn test_format_event_detail_patch_review() {
        let oid = Oid::from_str("cccccccccccccccccccccccccccccccccccccccc").unwrap();
        let event = crate::event::Event {
            timestamp: "2026-03-01T00:00:00Z".to_string(),
            author: test_author(),
            action: Action::PatchReview {
                verdict: ReviewVerdict::Approve,
                body: "Looks good!".to_string(),
            },
        };
        let detail = format_event_detail(&oid, &event);
        assert!(detail.contains("Patch Review"));
        assert!(detail.contains("Approve"));
        assert!(detail.contains("Looks good!"));
    }

    #[test]
    fn test_format_event_detail_short_oid() {
        let oid = Oid::from_str("1234567890abcdef1234567890abcdef12345678").unwrap();
        let event = crate::event::Event {
            timestamp: "2026-01-01T00:00:00Z".to_string(),
            author: test_author(),
            action: Action::IssueReopen,
        };
        let detail = format_event_detail(&oid, &event);
        assert!(detail.contains("1234567"));
        assert!(detail.contains("Commit:  1234567\n"));
    }

    // ── handle_key tests for 'c' key ─────────────────────────────────────

    #[test]
    fn test_handle_key_c_in_detail_pane_returns_open_commit_browser() {
        let mut app = make_app(3, 0);
        app.pane = Pane::Detail;
        app.list_state.select(Some(0));
        let result = app.handle_key(KeyCode::Char('c'), KeyModifiers::empty());
        assert_eq!(result, KeyAction::OpenCommitBrowser);
    }

    #[test]
    fn test_handle_key_c_in_item_list_pane_is_noop() {
        let mut app = make_app(3, 0);
        app.pane = Pane::ItemList;
        app.list_state.select(Some(0));
        let result = app.handle_key(KeyCode::Char('c'), KeyModifiers::empty());
        assert_eq!(result, KeyAction::Continue);
    }

    #[test]
    fn test_handle_key_c_no_selection_is_noop() {
        let mut app = make_app(0, 0);
        app.pane = Pane::Detail;
        let result = app.handle_key(KeyCode::Char('c'), KeyModifiers::empty());
        assert_eq!(result, KeyAction::Continue);
    }

    #[test]
    fn test_handle_key_ctrl_c_still_quits() {
        let mut app = make_app(3, 0);
        let result = app.handle_key(KeyCode::Char('c'), KeyModifiers::CONTROL);
        assert_eq!(result, KeyAction::Quit);
    }

    // ── CommitList navigation tests ──────────────────────────────────────

    #[test]
    fn test_commit_list_navigate_down() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(0));
        app.mode = ViewMode::CommitList;

        app.handle_key(KeyCode::Char('j'), KeyModifiers::empty());
        assert_eq!(app.event_list_state.selected(), Some(1));
    }

    #[test]
    fn test_commit_list_navigate_up() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(2));
        app.mode = ViewMode::CommitList;

        app.handle_key(KeyCode::Char('k'), KeyModifiers::empty());
        assert_eq!(app.event_list_state.selected(), Some(1));
    }

    #[test]
    fn test_commit_list_navigate_clamp_bottom() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(2));
        app.mode = ViewMode::CommitList;

        app.handle_key(KeyCode::Down, KeyModifiers::empty());
        assert_eq!(app.event_list_state.selected(), Some(2));
    }

    #[test]
    fn test_commit_list_navigate_clamp_top() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(0));
        app.mode = ViewMode::CommitList;

        app.handle_key(KeyCode::Up, KeyModifiers::empty());
        assert_eq!(app.event_list_state.selected(), Some(0));
    }

    #[test]
    fn test_commit_list_escape_returns_to_details() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(1));
        app.mode = ViewMode::CommitList;

        let result = app.handle_key(KeyCode::Esc, KeyModifiers::empty());
        assert_eq!(result, KeyAction::Continue);
        assert_eq!(app.mode, ViewMode::Details);
        assert!(app.event_history.is_empty());
        assert_eq!(app.event_list_state.selected(), None);
    }

    #[test]
    fn test_commit_list_q_quits() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitList;
        let result = app.handle_key(KeyCode::Char('q'), KeyModifiers::empty());
        assert_eq!(result, KeyAction::Quit);
    }

    // ── CommitDetail tests ───────────────────────────────────────────────

    #[test]
    fn test_commit_list_enter_opens_detail() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(1));
        app.mode = ViewMode::CommitList;

        let result = app.handle_key(KeyCode::Enter, KeyModifiers::empty());
        assert_eq!(result, KeyAction::Continue);
        assert_eq!(app.mode, ViewMode::CommitDetail);
        assert_eq!(app.scroll, 0);
    }

    #[test]
    fn test_commit_list_enter_no_selection_stays() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state = ListState::default();
        app.mode = ViewMode::CommitList;

        app.handle_key(KeyCode::Enter, KeyModifiers::empty());
        assert_eq!(app.mode, ViewMode::CommitList);
    }

    #[test]
    fn test_commit_detail_escape_returns_to_list() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(0));
        app.mode = ViewMode::CommitDetail;
        app.scroll = 5;

        let result = app.handle_key(KeyCode::Esc, KeyModifiers::empty());
        assert_eq!(result, KeyAction::Continue);
        assert_eq!(app.mode, ViewMode::CommitList);
        assert_eq!(app.scroll, 0);
        assert_eq!(app.event_history.len(), 3);
    }

    #[test]
    fn test_commit_detail_scroll() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitDetail;
        app.scroll = 0;

        app.handle_key(KeyCode::Char('j'), KeyModifiers::empty());
        assert_eq!(app.scroll, 1);
        app.handle_key(KeyCode::Char('j'), KeyModifiers::empty());
        assert_eq!(app.scroll, 2);
        app.handle_key(KeyCode::Char('k'), KeyModifiers::empty());
        assert_eq!(app.scroll, 1);
    }

    #[test]
    fn test_commit_detail_page_scroll() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitDetail;
        app.scroll = 0;

        app.handle_key(KeyCode::PageDown, KeyModifiers::empty());
        assert_eq!(app.scroll, 20);
        app.handle_key(KeyCode::PageUp, KeyModifiers::empty());
        assert_eq!(app.scroll, 0);
    }

    #[test]
    fn test_commit_detail_q_quits() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitDetail;
        let result = app.handle_key(KeyCode::Char('q'), KeyModifiers::empty());
        assert_eq!(result, KeyAction::Quit);
    }

    // ── Guard tests ──────────────────────────────────────────────────────

    #[test]
    fn test_c_ignored_in_commit_list_mode() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitList;
        let result = app.handle_key(KeyCode::Char('c'), KeyModifiers::empty());
        assert_eq!(result, KeyAction::Continue);
        assert_eq!(app.mode, ViewMode::CommitList);
    }

    #[test]
    fn test_c_ignored_in_commit_detail_mode() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitDetail;
        let result = app.handle_key(KeyCode::Char('c'), KeyModifiers::empty());
        assert_eq!(result, KeyAction::Continue);
        assert_eq!(app.mode, ViewMode::CommitDetail);
    }

    // ── handle_key basic tests ───────────────────────────────────────────

    #[test]
    fn test_handle_key_quit() {
        let mut app = make_app(3, 3);
        assert_eq!(
            app.handle_key(KeyCode::Char('q'), KeyModifiers::empty()),
            KeyAction::Quit
        );
    }

    #[test]
    fn test_handle_key_quit_esc() {
        let mut app = make_app(3, 3);
        assert_eq!(
            app.handle_key(KeyCode::Esc, KeyModifiers::empty()),
            KeyAction::Quit
        );
    }

    #[test]
    fn test_handle_key_quit_ctrl_c() {
        let mut app = make_app(3, 3);
        assert_eq!(
            app.handle_key(KeyCode::Char('c'), KeyModifiers::CONTROL),
            KeyAction::Quit
        );
    }

    #[test]
    fn test_handle_key_tab_switch() {
        let mut app = make_app(3, 3);
        app.handle_key(KeyCode::Char('2'), KeyModifiers::empty());
        assert_eq!(app.tab, Tab::Patches);
        app.handle_key(KeyCode::Char('1'), KeyModifiers::empty());
        assert_eq!(app.tab, Tab::Issues);
    }

    #[test]
    fn test_handle_key_diff_toggle_patches() {
        let mut app = make_app(0, 3);
        app.switch_tab(Tab::Patches);
        assert_eq!(app.mode, ViewMode::Details);
        app.handle_key(KeyCode::Char('d'), KeyModifiers::empty());
        assert_eq!(app.mode, ViewMode::Diff);
        app.handle_key(KeyCode::Char('d'), KeyModifiers::empty());
        assert_eq!(app.mode, ViewMode::Details);
    }

    #[test]
    fn test_handle_key_diff_noop_issues() {
        let mut app = make_app(3, 0);
        assert_eq!(app.tab, Tab::Issues);
        assert_eq!(app.mode, ViewMode::Details);
        app.handle_key(KeyCode::Char('d'), KeyModifiers::empty());
        assert_eq!(app.mode, ViewMode::Details);
    }

    #[test]
    fn test_handle_key_pane_toggle() {
        let mut app = make_app(3, 3);
        assert_eq!(app.pane, Pane::ItemList);
        app.handle_key(KeyCode::Tab, KeyModifiers::empty());
        assert_eq!(app.pane, Pane::Detail);
        app.handle_key(KeyCode::Enter, KeyModifiers::empty());
        assert_eq!(app.pane, Pane::ItemList);
    }

    #[test]
    fn test_scroll_in_detail_pane() {
        let mut app = make_app(3, 0);
        app.list_state.select(Some(0));
        app.pane = Pane::Detail;
        app.scroll = 0;

        app.handle_key(KeyCode::Char('j'), KeyModifiers::empty());
        assert_eq!(app.scroll, 1);
        assert_eq!(app.list_state.selected(), Some(0));

        app.handle_key(KeyCode::Char('k'), KeyModifiers::empty());
        assert_eq!(app.scroll, 0);
    }

    #[test]
    fn test_handle_key_reload() {
        let mut app = make_app(3, 3);
        assert_eq!(
            app.handle_key(KeyCode::Char('r'), KeyModifiers::empty()),
            KeyAction::Reload
        );
    }

    // ── selected_ref_name tests ──────────────────────────────────────────

    #[test]
    fn test_selected_ref_name_issues() {
        let app = make_app(3, 0);
        let ref_name = app.selected_ref_name();
        assert_eq!(
            ref_name,
            Some("refs/collab/issues/00000000".to_string())
        );
    }

    #[test]
    fn test_selected_ref_name_patches() {
        let mut app = make_app(0, 3);
        app.switch_tab(Tab::Patches);
        let ref_name = app.selected_ref_name();
        assert_eq!(
            ref_name,
            Some("refs/collab/patches/p0000000".to_string())
        );
    }

    #[test]
    fn test_selected_ref_name_none_when_empty() {
        let app = make_app(0, 0);
        assert_eq!(app.selected_ref_name(), None);
    }

    // ── Render tests ─────────────────────────────────────────────────────

    #[test]
    fn test_render_issues_tab() {
        let mut app = make_app(3, 2);
        app.status_filter = StatusFilter::All;
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "1:Issues");
        assert_buffer_contains(&buf, "2:Patches");
        assert_buffer_contains(&buf, "00000000");
        assert_buffer_contains(&buf, "00000001");
        assert_buffer_contains(&buf, "00000002");
        assert_buffer_contains(&buf, "Issue 0");
    }

    #[test]
    fn test_render_patches_tab() {
        let mut app = make_app(2, 3);
        app.status_filter = StatusFilter::All;
        app.switch_tab(Tab::Patches);
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "p0000000");
        assert_buffer_contains(&buf, "p0000001");
        assert_buffer_contains(&buf, "p0000002");
        assert_buffer_contains(&buf, "Patch 0");
    }

    #[test]
    fn test_render_empty_state() {
        let mut app = make_app(0, 0);
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "No matches for current filter.");
    }

    #[test]
    fn test_render_footer_keys() {
        let mut app = make_app(3, 3);
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "j/k:navigate");
        assert_buffer_contains(&buf, "Tab:pane");
        assert_buffer_contains(&buf, "c:events");
    }

    #[test]
    fn test_render_commit_list() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(0));
        app.mode = ViewMode::CommitList;
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "Event History");
        assert_buffer_contains(&buf, "Issue Open");
        assert_buffer_contains(&buf, "Issue Comment");
        assert_buffer_contains(&buf, "Issue Close");
    }

    #[test]
    fn test_render_commit_detail() {
        let mut app = make_app(3, 0);
        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(0));
        app.mode = ViewMode::CommitDetail;
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "Event Detail");
        assert_buffer_contains(&buf, "aaaaaaa");
        assert_buffer_contains(&buf, "Test User");
        assert_buffer_contains(&buf, "Issue Open");
    }

    #[test]
    fn test_render_commit_list_footer() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitList;
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "Esc:back");
    }

    #[test]
    fn test_render_commit_detail_footer() {
        let mut app = make_app(3, 0);
        app.mode = ViewMode::CommitDetail;
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "Esc:back");
        assert_buffer_contains(&buf, "j/k:scroll");
    }

    #[test]
    fn test_render_small_terminal() {
        let mut app = make_app(3, 3);
        let backend = TestBackend::new(20, 10);
        let mut terminal = Terminal::new(backend).unwrap();
        terminal.draw(|frame| ui(frame, &mut app)).unwrap();
    }

    // ── Integration: full browse flow ────────────────────────────────────

    #[test]
    fn test_full_commit_browse_flow() {
        let mut app = make_app(3, 0);
        app.pane = Pane::Detail;
        app.list_state.select(Some(0));

        let action = app.handle_key(KeyCode::Char('c'), KeyModifiers::empty());
        assert_eq!(action, KeyAction::OpenCommitBrowser);

        app.event_history = make_test_event_history();
        app.event_list_state.select(Some(0));
        app.mode = ViewMode::CommitList;
        app.scroll = 0;

        app.handle_key(KeyCode::Char('j'), KeyModifiers::empty());
        assert_eq!(app.event_list_state.selected(), Some(1));

        app.handle_key(KeyCode::Enter, KeyModifiers::empty());
        assert_eq!(app.mode, ViewMode::CommitDetail);

        app.handle_key(KeyCode::Char('j'), KeyModifiers::empty());
        assert_eq!(app.scroll, 1);

        app.handle_key(KeyCode::Esc, KeyModifiers::empty());
        assert_eq!(app.mode, ViewMode::CommitList);
        assert_eq!(app.scroll, 0);

        app.handle_key(KeyCode::Esc, KeyModifiers::empty());
        assert_eq!(app.mode, ViewMode::Details);
        assert!(app.event_history.is_empty());
    }

    // ── Status message render test ───────────────────────────────────────

    #[test]
    fn test_render_status_message() {
        let mut app = make_app(3, 0);
        app.status_msg = Some("Error loading events: ref not found".to_string());
        let buf = render_app(&mut app);
        assert_buffer_contains(&buf, "Error loading events");
    }
}
diff --git a/tests/patch_import_test.rs b/tests/patch_import_test.rs
new file mode 100644
index 0000000..0dafb6d
--- /dev/null
+++ b/tests/patch_import_test.rs
@@ -0,0 +1,298 @@
use git2::Repository;
use std::path::{Path, PathBuf};
use tempfile::TempDir;

use git_collab::event::Author;
use git_collab::patch;
use git_collab::state::{self, PatchStatus};

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------

fn alice() -> Author {
    Author {
        name: "Alice".to_string(),
        email: "alice@example.com".to_string(),
    }
}

/// Create a repo with an initial commit so we have a valid HEAD and tree.
fn init_repo_with_commit(dir: &Path, author: &Author) -> Repository {
    let repo = Repository::init(dir).expect("init repo");
    {
        let mut config = repo.config().unwrap();
        config.set_str("user.name", &author.name).unwrap();
        config.set_str("user.email", &author.email).unwrap();
    }

    // Create an initial commit with a file so we have a valid tree/HEAD
    let sig = git2::Signature::now(&author.name, &author.email).unwrap();
    let tree_oid = {
        let blob_oid = repo.blob(b"initial content\n").unwrap();
        let mut tb = repo.treebuilder(None).unwrap();
        tb.insert("README", blob_oid, 0o100644).unwrap();
        tb.write().unwrap()
    };
    {
        let tree = repo.find_tree(tree_oid).unwrap();
        repo.commit(Some("refs/heads/main"), &sig, &sig, "Initial commit", &tree, &[])
            .unwrap();
    }

    // Set HEAD to point to main
    repo.set_head("refs/heads/main").unwrap();

    repo
}

/// Generate a valid git format-patch style .patch file content.
/// This creates a patch that adds a new file called `filename` with `content`.
fn make_format_patch(
    from_name: &str,
    from_email: &str,
    subject: &str,
    body: &str,
    filename: &str,
    content: &str,
) -> String {
    let date = "Thu, 19 Mar 2026 10:30:00 +0000";
    // Build the diff portion
    let lines: Vec<&str> = content.lines().collect();
    let mut diff_lines = String::new();
    for line in &lines {
        diff_lines.push_str(&format!("+{}\n", line));
    }
    let line_count = lines.len();

    format!(
        "From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001\n\
         From: {} <{}>\n\
         Date: {}\n\
         Subject: [PATCH] {}\n\
         \n\
         {}\n\
         ---\n\
         {filename} | {line_count} +\n\
         1 file changed, {line_count} insertions(+)\n\
         create mode 100644 {filename}\n\
         \n\
         diff --git a/{filename} b/{filename}\n\
         new file mode 100644\n\
         index 0000000..1234567\n\
         --- /dev/null\n\
         +++ b/{filename}\n\
         @@ -0,0 +1,{line_count} @@\n\
         {diff_lines}\
         -- \n\
         2.40.0\n",
        from_name, from_email, date, subject, body,
        filename = filename,
        line_count = line_count,
        diff_lines = diff_lines,
    )
}

/// Write patch content to a file and return its path.
fn write_patch_file(dir: &Path, name: &str, content: &str) -> PathBuf {
    let path = dir.join(name);
    std::fs::write(&path, content).unwrap();
    path
}

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

#[test]
fn test_import_single_patch_success() {
    let tmp = TempDir::new().unwrap();
    let repo = init_repo_with_commit(tmp.path(), &alice());

    let patch_content = make_format_patch(
        "Bob",
        "bob@example.com",
        "Add hello.txt",
        "This patch adds a hello file.",
        "hello.txt",
        "Hello, world!\n",
    );

    let patch_dir = TempDir::new().unwrap();
    let patch_file = write_patch_file(patch_dir.path(), "0001-add-hello.patch", &patch_content);

    let id = patch::import(&repo, &patch_file).unwrap();

    // Verify the patch was created in the DAG
    let ref_name = format!("refs/collab/patches/{}", id);
    let patch_state = state::PatchState::from_ref(&repo, &ref_name, &id).unwrap();

    assert_eq!(patch_state.title, "Add hello.txt");
    assert_eq!(patch_state.status, PatchStatus::Open);
    assert_eq!(patch_state.author.name, "Alice"); // importer is the author in the DAG
    assert!(!patch_state.head_commit.is_empty());
    assert_eq!(patch_state.base_ref, "main");
}

#[test]
fn test_import_file_not_found() {
    let tmp = TempDir::new().unwrap();
    let repo = init_repo_with_commit(tmp.path(), &alice());

    let nonexistent = PathBuf::from("/tmp/does-not-exist-12345.patch");
    let result = patch::import(&repo, &nonexistent);
    assert!(result.is_err());
}

#[test]
fn test_import_malformed_patch() {
    let tmp = TempDir::new().unwrap();
    let repo = init_repo_with_commit(tmp.path(), &alice());

    let patch_dir = TempDir::new().unwrap();
    let patch_file = write_patch_file(
        patch_dir.path(),
        "bad.patch",
        "This is not a valid patch file at all.\nJust random text.\n",
    );

    let result = patch::import(&repo, &patch_file);
    assert!(result.is_err());
}

#[test]
fn test_import_creates_dag_entry_readable_by_show() {
    let tmp = TempDir::new().unwrap();
    let repo = init_repo_with_commit(tmp.path(), &alice());

    let patch_content = make_format_patch(
        "Charlie",
        "charlie@example.com",
        "Fix bug in parser",
        "Fixes an off-by-one error in the parser module.",
        "parser.txt",
        "fixed parser code\n",
    );

    let patch_dir = TempDir::new().unwrap();
    let patch_file = write_patch_file(patch_dir.path(), "0001-fix-bug.patch", &patch_content);

    let id = patch::import(&repo, &patch_file).unwrap();

    // Verify it can be resolved and read back through state infrastructure
    let (ref_name, resolved_id) = state::resolve_patch_ref(&repo, &id[..8]).unwrap();
    assert_eq!(resolved_id, id);

    let patch_state = state::PatchState::from_ref(&repo, &ref_name, &resolved_id).unwrap();
    assert_eq!(patch_state.title, "Fix bug in parser");
    assert!(
        patch_state.body.contains("off-by-one"),
        "body should contain the patch description"
    );
}

#[test]
fn test_import_series_multiple_files() {
    let tmp = TempDir::new().unwrap();
    let repo = init_repo_with_commit(tmp.path(), &alice());

    let patch1 = make_format_patch(
        "Bob",
        "bob@example.com",
        "Add file one",
        "First patch in series.",
        "one.txt",
        "one\n",
    );

    let patch2 = make_format_patch(
        "Bob",
        "bob@example.com",
        "Add file two",
        "Second patch in series.",
        "two.txt",
        "two\n",
    );

    let patch_dir = TempDir::new().unwrap();
    let f1 = write_patch_file(patch_dir.path(), "0001-add-one.patch", &patch1);
    let f2 = write_patch_file(patch_dir.path(), "0002-add-two.patch", &patch2);

    let ids = patch::import_series(&repo, &[f1, f2]).unwrap();
    assert_eq!(ids.len(), 2);

    // Both should be valid patches in the DAG
    for id in &ids {
        let (ref_name, _) = state::resolve_patch_ref(&repo, &id[..8]).unwrap();
        let ps = state::PatchState::from_ref(&repo, &ref_name, id).unwrap();
        assert_eq!(ps.status, PatchStatus::Open);
    }
}

#[test]
fn test_import_series_rollback_on_failure() {
    let tmp = TempDir::new().unwrap();
    let repo = init_repo_with_commit(tmp.path(), &alice());

    let good_patch = make_format_patch(
        "Bob",
        "bob@example.com",
        "Add good file",
        "A good patch.",
        "good.txt",
        "good\n",
    );

    let patch_dir = TempDir::new().unwrap();
    let f1 = write_patch_file(patch_dir.path(), "0001-good.patch", &good_patch);
    // A path that does not exist forces the second import in the series to fail
    let f2 = PathBuf::from("/tmp/nonexistent-bad-patch-12345.patch");

    let result = patch::import_series(&repo, &[f1, f2]);
    assert!(result.is_err());

    // After rollback, no patches should exist
    let patches = state::list_patches(&repo).unwrap();
    assert_eq!(patches.len(), 0, "rollback should remove all imported patches");
}

#[test]
fn test_import_patch_with_modification() {
    // Test importing a patch that modifies an existing file (not just new files)
    let tmp = TempDir::new().unwrap();
    let repo = init_repo_with_commit(tmp.path(), &alice());

    // Create a patch that modifies README (which exists in our initial commit)
    let date = "Thu, 19 Mar 2026 10:30:00 +0000";
    // `\n\` ends each logical line of the literal; the trailing backslash
    // strips the next source line's leading indentation from the string.
    let patch_content = format!(
        "From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001\n\
         From: Bob <bob@example.com>\n\
         Date: {date}\n\
         Subject: [PATCH] Update README\n\
         \n\
         Updated the README with more info.\n\
         ---\n\
         README | 2 +-\n\
         1 file changed, 1 insertion(+), 1 deletion(-)\n\
         \n\
         diff --git a/README b/README\n\
         index 1234567..abcdef0 100644\n\
         --- a/README\n\
         +++ b/README\n\
         @@ -1 +1 @@\n\
         -initial content\n\
         +updated content\n\
         -- \n\
         2.40.0\n"
    );

    let patch_dir = TempDir::new().unwrap();
    let patch_file = write_patch_file(patch_dir.path(), "0001-update-readme.patch", &patch_content);

    let id = patch::import(&repo, &patch_file).unwrap();

    // Read the state back via the full ref path, bypassing prefix resolution
    let ref_name = format!("refs/collab/patches/{}", id);
    let ps = state::PatchState::from_ref(&repo, &ref_name, &id).unwrap();
    assert_eq!(ps.title, "Update README");
    assert_eq!(ps.status, PatchStatus::Open);
}