24ac2e14
Add spec for Gerrit-style patchsets
a73x 2026-03-21 17:17
Replaces the current branch-wrapper patch model with revision-aware review: numbered patchsets, per-patchset comments, interdiffs, and configurable merge policy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
diff --git a/specs/015-gerrit-style-patchsets/spec.md b/specs/015-gerrit-style-patchsets/spec.md new file mode 100644 index 0000000..d41a0bb --- /dev/null +++ b/specs/015-gerrit-style-patchsets/spec.md @@ -0,0 +1,220 @@ # Feature Specification: Gerrit-Style Patchsets **Feature Branch**: `015-gerrit-style-patchsets` **Created**: 2026-03-21 **Status**: Draft **Input**: Patches-as-branches are redundant when users share a git host — branches already exist. The differentiator is Gerrit-style change tracking: numbered patchsets, comments anchored to specific revisions, and interdiffs between revisions. This is the thing GitHub PRs are actually bad at. ## Motivation The current patch model wraps a git branch with a review conversation. When collaborators share a remote, the branch already exists — the collab layer adds review metadata but doesn't solve any problem that GitHub/GitLab PRs don't already handle better. What Gerrit gets right (and GitHub gets wrong) is **revision-aware review**: 1. A reviewer leaves inline comments on version 1 of the code 2. The author amends/rebases and pushes version 2 3. The reviewer can see: what changed between v1 and v2 (interdiff), which comments are on stale code, and which lines are untouched 4. Comments stay anchored to the patchset they were made on — they don't float to wrong line numbers This spec transforms git-collab patches into Gerrit-style changes with numbered patchsets. ## User Scenarios & Testing *(mandatory)* ### User Story 1 - Automatic Patchset Recording on Write Operations (Priority: P1) The system automatically detects when the branch tip has changed since the last recorded patchset. This detection happens only during **write operations** that touch the patch DAG — `patch comment`, `patch review`, `patch merge`, and `patch create`. Before appending their own event, these commands check the branch tip against the last recorded patchset's commit OID. If they differ, a `PatchUpdate` event is appended first, then the actual event. Read operations (`patch show`, `patch list`, `patch diff`, TUI) never modify the DAG. Patchset numbers are **not stored in events** — they are derived from DAG order during materialization. The Nth `PatchUpdate` event (in topological order) is patchset N+1 (patchset 1 comes from `PatchCreate`). This avoids conflicts when two collaborators independently detect the same branch change. **Why this priority**: Without patchset recording, nothing else (anchored comments, interdiff) works. Automatic recording on writes means developers don't need to learn a new command — patchsets appear naturally as the review conversation progresses. **Independent Test**: Create a patch (patchset 1 from create), push a commit, run `patch comment`, verify a PatchUpdate event was auto-inserted before the comment event. **Acceptance Scenarios**: 1. **Given** a patch exists with patchset 1, **When** the developer pushes new commits and runs `patch comment <id>`, **Then** a PatchUpdate event is auto-inserted before the comment, and the comment is anchored to patchset 2. 2. **Given** the branch hasn't changed since the last patchset, **When** the developer runs `patch comment`, **Then** no PatchUpdate is inserted — the comment is anchored to the current patchset. 3. **Given** a patch is created with `patch create`, **Then** patchset 1 is recorded in the PatchCreate event with the branch tip commit OID and tree OID. 4. **Given** a patch has 3 patchsets, **When** viewing the patch, **Then** all 3 patchsets are listed with their number (derived from DAG order), commit OID, and timestamp. 5. **Given** a developer rebases their branch (force-push), **When** the next write operation touches the patch, **Then** a PatchUpdate event is auto-inserted capturing the post-rebase state. 6. **Given** the branch has been deleted, **When** a write operation attempts auto-detection, **Then** no PatchUpdate is inserted — the branch is simply unavailable. 7. **Given** two collaborators independently detect the same branch change and create PatchUpdate events, **When** their DAGs are synced and merged, **Then** patchset numbers are derived from topological order — both events contribute but the numbering remains consistent after materialization. --- ### User Story 2 - Inline Comments Anchored to Patchsets (Priority: P1) A reviewer leaves an inline comment on a specific file and line. The comment is anchored to the current patchset number. Comments belong to the patchset they were made on — period. There is no attempt to track or map comments across patchsets. When viewing a specific patchset, you see its comments. When viewing a different patchset, you see that patchset's comments. This is the Gerrit model: each patchset is its own self-contained review surface. **Why this priority**: This is the core value proposition — without patchset-anchored comments, we're just a worse GitHub PR. **Independent Test**: Create a patch, leave an inline comment on patchset 1, publish patchset 2, leave a different comment on patchset 2, verify each patchset shows only its own comments. **Acceptance Scenarios**: 1. **Given** a patch is on patchset 3, **When** a reviewer adds an inline comment, **Then** the comment is recorded with `patchset: 3` in the event. 2. **Given** comments exist on patchsets 1 and 3, **When** viewing with `patch show <id> --patchset 1`, **Then** only patchset 1's comments are shown. 3. **Given** `patch show <id>` is run without `--patchset`, **Then** comments from all patchsets are shown, grouped by patchset number. 4. **Given** an inline comment was made without a patchset number (legacy), **Then** it is displayed with "(patchset: unknown)" and grouped separately. 5. **Given** a reviewer is viewing patchset 2, **When** they add an inline comment, **Then** the comment is anchored to patchset 2 regardless of what the latest patchset is. --- ### User Story 3 - View Interdiff Between Patchsets (Priority: P1) A reviewer wants to see only what changed between two revisions of the patch, not the full diff against the base. They run `patch diff <id> --between 1 2` and see the diff between patchset 1's tree and patchset 2's tree. This is the critical "what did the author actually change in response to my feedback" view. **Why this priority**: This is the killer feature that makes patchsets useful. Without interdiff, numbered patchsets are just bookmarks. **Independent Test**: Create a patch with 3 patchsets where each modifies different files, run `patch diff --between 1 3`, verify the diff shows only the changes between those two snapshots. **Acceptance Scenarios**: 1. **Given** a patch has patchsets 1 and 2, **When** running `patch diff <id> --between 1 2`, **Then** the output is the unified diff between patchset 1's tree and patchset 2's tree. 2. **Given** a patch has patchsets 1, 2, and 3, **When** running `patch diff <id> --between 1 3`, **Then** the diff spans from patchset 1 to 3, skipping 2. 3. **Given** a patch has only 1 patchset, **When** running `patch diff <id> --between 1 2`, **Then** the system errors with "patchset 2 not found". 4. **Given** `patch diff <id>` is run without `--between`, **Then** behavior is unchanged — shows the diff between merge-base and current branch tip (the latest code). 5. **Given** patchsets 1 and 2 have identical trees (edge case: metadata-only revision), **Then** interdiff is empty. --- ### User Story 4 - View Patchset History (Priority: P2) A reviewer or author wants to see the revision history of a patch — all patchsets with their numbers, timestamps, commit OIDs, and which files changed between each consecutive pair. This gives a timeline of how the patch evolved. **Why this priority**: Useful context for reviewers picking up a patch mid-conversation, but not blocking for the core review workflow. **Independent Test**: Create a patch with 3 patchsets, run `patch history <id>`, verify all 3 are listed with timestamps and file-change summaries. **Acceptance Scenarios**: 1. **Given** a patch with 3 patchsets, **When** running `patch history <id>`, **Then** the output lists each patchset with number, timestamp, commit short OID, and a summary like "+2 -1 files changed" relative to the previous patchset. 2. **Given** `--json` flag, **Then** the output is a JSON array of patchset objects. 3. **Given** a patch with only 1 patchset, **Then** the history shows just patchset 1 with "(initial)" label. --- ### User Story 5 - Review Anchored to Patchsets (Priority: P2) A reviewer submits a review (approve/request-changes). The review is anchored to the current patchset. Reviews belong to their patchset, same as comments. `patch show` displays the latest review per reviewer with its patchset number, so it's clear whether the approval is on the current code or an older version. **Why this priority**: Important for understanding review state, but the core approve/request-changes mechanics already exist. **Acceptance Scenarios**: 1. **Given** a reviewer approves on patchset 3 and patchset 4 is recorded, **When** viewing the patch, **Then** the approval is shown as "approved (patchset 3)". 2. **Given** a reviewer requests changes on patchset 2, **When** viewing on patchset 4, **Then** the request-changes review is shown with its patchset number. 3. **Given** `patch show`, **Then** the latest review per reviewer is highlighted, with patchset context. 4. **Given** a configurable merge policy exists, **When** merging, **Then** the policy determines whether approval must be on the latest patchset. See FR-014. --- ### Edge Cases - What if the branch is deleted when auto-detection runs? No new patchset is created. The last recorded patchset remains current. - What if the base branch has moved significantly between patchsets? The interdiff still works — it compares the two patchset trees directly, independent of the base. - What about merge commits in the branch? The patchset snapshots the tree at the branch tip, regardless of history shape. - What if a reviewer comments on a patchset that doesn't exist yet (race condition in distributed sync)? The comment is stored with its patchset number; if the patchset doesn't exist locally, it's displayed with "(patchset N — not yet synced)". - What if the tree hasn't changed but commits were amended (message-only change)? Auto-detection compares commit OID, not tree OID, so a new patchset is recorded. The interdiff will show no code changes. - What about backward compat with existing patches that have no patchsets? They work as today. Inline comments without a patchset field are treated as "patchset: unknown". The `--between` flag is unavailable. The first read that detects a branch tip different from the stored one bootstraps patchset tracking. - What about auto-recording requiring a signing key? The auto-detection must have access to the signing key to append a DAG event. If the key is unavailable (e.g., read-only clone), patchset detection is skipped silently — the patch still works, just without new patchset tracking. ## Requirements *(mandatory)* ### Functional Requirements - **FR-001**: `patch create` MUST record patchset 1 in the PatchCreate event, capturing the branch tip commit OID and tree OID. - **FR-002**: Write operations (`patch comment`, `patch review`, `patch merge`) MUST automatically detect when the branch tip differs from the last recorded patchset and prepend a `PatchUpdate` event before their own event. - **FR-003**: Auto-detection MUST compare commit OIDs (not tree OIDs) to decide whether a new patchset is needed. - **FR-003a**: Patchset numbers MUST be derived from topological DAG order during materialization, NOT stored in events. The Nth `PatchUpdate` in topological order is patchset N+1. - **FR-004**: Inline comments (`patch.inline_comment` events) MUST include a `patchset` field recording the current patchset number at the time of commenting. - **FR-005**: Reviews (`patch.review` events) MUST include a `patchset` field recording the current patchset number at the time of review. - **FR-006**: `patch diff <id> --between N M` MUST compute the diff between patchset N's tree and patchset M's tree. - **FR-007**: `patch diff <id>` without `--between` MUST continue to show the merge-base diff against the current branch tip (existing behavior). - **FR-008**: `patch show` MUST display the current patchset number, list all patchsets, and annotate comments/reviews with their patchset context. - **FR-009**: `patch show --patchset N` MUST filter to show only comments and reviews from patchset N. - **FR-010**: `patch history <id>` MUST list all patchsets with number, timestamp, commit OID, and file-change summary. - **FR-011**: System MUST handle existing patches without patchsets gracefully — they continue to work as today, with comments shown as "patchset: unknown". - **FR-012**: Patchset data MUST be stored in the collab DAG (not external state), so it syncs via the existing sync mechanism. - **FR-013**: Auto-detection on a legacy patch (no patchsets) MUST bootstrap by treating the current branch tip as patchset 1. - **FR-014**: Merge policy MUST be configurable via collab config: `merge.require_approval_on_latest` (boolean, default false). When true, `patch merge` requires at least one approval on the latest patchset. When false, any prior approval suffices (current behavior). ### Key Entities - **Patchset**: A numbered snapshot of the branch state at a point in time. Contains: patchset number (1-indexed, monotonically increasing), commit OID (branch tip), tree OID (for fast interdiff), and timestamp. Stored as `PatchUpdate` events in the review DAG. Patchset 1 is embedded in the `PatchCreate` event. - **Anchored Comment**: An inline comment or review that references a specific patchset number. The patchset field is optional for backward compatibility with legacy comments. - **Interdiff**: The diff between two patchset trees. Computed on-the-fly from stored tree OIDs, not pre-computed. ### Event Schema Changes New event type (patchset number omitted — derived from DAG order): ```json { "type": "patch.update", "commit": "abc123...", "tree": "def456..." } ``` Modified `patch.create` — add optional snapshot fields: ```json { "type": "patch.create", "title": "...", "body": "...", "base_ref": "main", "branch": "feature-foo", "commit": "abc123...", "tree": "def456..." } ``` Modified `patch.inline_comment` — add patchset field (derived number at time of event creation): ```json { "type": "patch.inline_comment", "file": "src/lib.rs", "line": 42, "body": "...", "patchset": 3 } ``` Modified `patch.review` — add patchset field (derived number at time of event creation): ```json { "type": "patch.review", "verdict": "approve", "body": "...", "patchset": 3 } ``` Note: The `patchset` field on comments and reviews stores the derived patchset number at the time the event was created. This is a convenience for display — the authoritative patchset boundaries are always the PatchCreate + PatchUpdate events in topological order. ## Success Criteria *(mandatory)* ### Measurable Outcomes - **SC-001**: `patch update` correctly records patchset snapshots; `patch history` shows all patchsets in order. - **SC-002**: Inline comments reference their patchset number; `patch show` displays patchset context for every comment. - **SC-003**: `patch diff --between N M` produces the correct interdiff between any two patchsets. - **SC-004**: Staleness detection correctly identifies when a comment's file has changed in later patchsets. - **SC-005**: Existing patches without patchsets continue to function identically to current behavior. - **SC-006**: All patchset data survives sync (push/fetch) between collaborators. - **SC-007**: The TUI dashboard displays patchset count and current patchset number for each patch. ## Assumptions - Patchset recording is automatic on write operations only. Reads never modify the DAG. This means a patchset boundary is only recorded when someone actually interacts with the patch (comments, reviews, merges), not on passive viewing. - Tree OIDs are stored alongside commit OIDs for fast interdiff computation (avoids re-walking commit history to find the tree). - Comments and reviews belong to their patchset. No cross-patchset tracking, no line-mapping, no staleness detection. Each patchset is a self-contained review surface. The interdiff is how you see what changed. - Patchset numbers are derived from topological DAG order during materialization, not stored in events. This makes concurrent auto-detection safe — duplicate PatchUpdate events for the same branch change result in consistent numbering after sync. - `PatchRevise` is superseded by `PatchUpdate` — the revise command can be kept as an alias or deprecated. - Patchset 1 data in `PatchCreate` is optional for backward compatibility — old patches without it simply don't have patchset tracking until auto-detection bootstraps it. - Auto-detection requires a signing key to append events. If unavailable, the write operation proceeds without inserting a PatchUpdate (the comment/review is anchored to the last known patchset). - Merge policy is configurable, not hardcoded. Default is permissive (any approval suffices). ## Resolved Questions 1. **Gap between push and first write**: `patch update <id>` (replacing `patch revise`) serves as the explicit "I've pushed new commits, please re-review" signal. It appends a `PatchUpdate` event if the branch has changed. Auto-detection on write operations (`comment`, `review`, `merge`) catches cases where the developer forgets. Both paths produce the same event.