docs/superpowers/plans/2026-04-10-incremental-atlas-upload-implementation.md

Ref: Size: 16.2 KiB
# Incremental Atlas Upload Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Reduce atlas upload cost from ~1.7ms to near-zero by precomputing ASCII glyphs at startup and uploading only dirty atlas rows incrementally.

**Architecture:** Add `last_uploaded_y` and `needs_full_upload` tracking fields to the Atlas struct in `font.zig`. Add `uploadAtlasRegion` to `renderer.zig` with a persistent staging buffer, content-preserving layout transitions, and a dedicated transfer fence. Wire the precompute loop and incremental upload into `main.zig`.

**Tech Stack:** Zig 0.15, Vulkan host-visible staging buffers, image layout transitions, fence synchronization.

---

## File Structure

- Modify: `src/font.zig`
  - Add `last_uploaded_y: u32` and `needs_full_upload: bool` to `Atlas`
  - Update `init()` and `reset()` to set these fields
- Modify: `src/renderer.zig`
  - Add persistent staging buffer + dedicated transfer command buffer + transfer fence to `Context`
  - Add `uploadAtlasRegion(pixels, y_start, y_end, full)` method
  - Keep existing `uploadAtlas` as full-upload convenience wrapper
- Modify: `src/main.zig`
  - Add ASCII precompute loop at startup
  - Replace render-loop atlas upload with incremental path

### Task 1: Add dirty-region tracking fields to Atlas with tests

**Files:**
- Modify: `src/font.zig`
- Test: `src/font.zig`

- [ ] **Step 1: Write the failing tests**

Add at the bottom of `src/font.zig`, after the existing test blocks:

```zig
test "Atlas dirty tracking fields initialized correctly" {
    var atlas = try Atlas.init(std.testing.allocator, 256, 256);
    defer atlas.deinit();

    try std.testing.expectEqual(@as(u32, 0), atlas.last_uploaded_y);
    try std.testing.expect(atlas.needs_full_upload);
}

test "Atlas dirty region covers new glyphs" {
    var atlas = try Atlas.init(std.testing.allocator, 256, 256);
    defer atlas.deinit();

    // After init, cursor_y=0, row_height=1 (for the white pixel)
    const y_start = atlas.last_uploaded_y;
    const y_end = atlas.cursor_y + atlas.row_height;
    try std.testing.expectEqual(@as(u32, 0), y_start);
    try std.testing.expect(y_end > 0);
}

test "Atlas reset restores dirty tracking fields" {
    var atlas = try Atlas.init(std.testing.allocator, 256, 256);
    defer atlas.deinit();

    // Simulate having uploaded some region
    atlas.last_uploaded_y = 50;
    atlas.needs_full_upload = false;

    atlas.reset();

    try std.testing.expectEqual(@as(u32, 0), atlas.last_uploaded_y);
    try std.testing.expect(atlas.needs_full_upload);
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `zig build test 2>&1 | head -20`
Expected: FAIL — `last_uploaded_y` field does not exist.

- [ ] **Step 3: Add the fields to Atlas**

In `src/font.zig`, add to the `Atlas` struct fields (after `dirty: bool`):

```zig
    last_uploaded_y: u32,
    needs_full_upload: bool,
```

In `Atlas.init` (the return struct literal), add:

```zig
            .last_uploaded_y = 0,
            .needs_full_upload = true,
```

In `Atlas.reset`, add at the end (after `self.dirty = true;`):

```zig
        self.last_uploaded_y = 0;
        self.needs_full_upload = true;
```

- [ ] **Step 4: Run test to verify it passes**

Run: `zig build test 2>&1 | tail -5`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add src/font.zig
git commit -m "Add dirty-region tracking fields to Atlas"
```

### Task 2: Add persistent staging buffer and transfer fence to renderer

**Files:**
- Modify: `src/renderer.zig`

- [ ] **Step 1: Add fields to Context struct**

In `src/renderer.zig`, add three new fields to the `Context` struct after `atlas_height: u32`:

```zig
    // Persistent atlas staging buffer (reused across frames)
    atlas_staging_buffer: vk.Buffer,
    atlas_staging_memory: vk.DeviceMemory,
    // Dedicated transfer command buffer + fence
    atlas_transfer_cb: vk.CommandBuffer,
    atlas_transfer_fence: vk.Fence,
```

- [ ] **Step 2: Allocate resources in Context.init**

In `Context.init`, after the atlas sampler creation and before the descriptor set update (around line 910), add:

```zig
        // --- Atlas staging buffer (persistent, reused across frames) ---
        const atlas_staging_size: vk.DeviceSize = @as(vk.DeviceSize, atlas_width) * atlas_height;
        const atlas_staging = try createHostVisibleBuffer(vki, pd_info.physical, vkd, device, atlas_staging_size, .{ .transfer_src_bit = true });
        errdefer {
            vkd.destroyBuffer(device, atlas_staging.buffer, null);
            vkd.freeMemory(device, atlas_staging.memory, null);
        }

        // --- Dedicated atlas transfer command buffer ---
        var atlas_transfer_cb: vk.CommandBuffer = undefined;
        try vkd.allocateCommandBuffers(device, &vk.CommandBufferAllocateInfo{
            .command_pool = command_pool,
            .level = .primary,
            .command_buffer_count = 1,
        }, @ptrCast(&atlas_transfer_cb));

        // --- Atlas transfer fence (starts signaled so first wait is a no-op) ---
        const atlas_transfer_fence = try vkd.createFence(device, &vk.FenceCreateInfo{
            .flags = .{ .signaled_bit = true },
        }, null);
        errdefer vkd.destroyFence(device, atlas_transfer_fence, null);
```

- [ ] **Step 3: Add new fields to the return struct**

In the return struct literal in `Context.init`, add after `.atlas_height = atlas_height`:

```zig
            .atlas_staging_buffer = atlas_staging.buffer,
            .atlas_staging_memory = atlas_staging.memory,
            .atlas_transfer_cb = atlas_transfer_cb,
            .atlas_transfer_fence = atlas_transfer_fence,
```

- [ ] **Step 4: Free resources in Context.deinit**

In `Context.deinit`, add after the atlas memory free (after `self.vkd.freeMemory(self.device, self.atlas_memory, null);`):

```zig
        self.vkd.destroyBuffer(self.device, self.atlas_staging_buffer, null);
        self.vkd.freeMemory(self.device, self.atlas_staging_memory, null);
        self.vkd.destroyFence(self.device, self.atlas_transfer_fence, null);
```

- [ ] **Step 5: Verify it compiles**

Run: `zig build 2>&1 | tail -5`
Expected: BUILD SUCCESS

- [ ] **Step 6: Run tests**

Run: `zig build test 2>&1 | tail -5`
Expected: PASS

- [ ] **Step 7: Commit**

```bash
git add src/renderer.zig
git commit -m "Add persistent staging buffer and transfer fence to renderer"
```

### Task 3: Implement uploadAtlasRegion

**Files:**
- Modify: `src/renderer.zig`

- [ ] **Step 1: Add the uploadAtlasRegion method**

Add after the existing `uploadAtlas` method in `Context`:

```zig
    /// Upload a horizontal band of the atlas (y_start..y_end) to the GPU.
    /// Uses the persistent staging buffer and dedicated transfer command buffer.
    /// If `full` is true, transitions from UNDEFINED (for initial/reset uploads).
    /// Otherwise transitions from SHADER_READ_ONLY (preserves existing data).
    pub fn uploadAtlasRegion(
        self: *Context,
        pixels: []const u8,
        y_start: u32,
        y_end: u32,
        full: bool,
    ) !void {
        if (y_start >= y_end) return;

        const byte_offset: usize = @as(usize, y_start) * self.atlas_width;
        const byte_len: usize = @as(usize, y_end - y_start) * self.atlas_width;

        // Wait for any prior atlas transfer to finish before reusing staging buffer
        _ = try self.vkd.waitForFences(self.device, 1, @ptrCast(&self.atlas_transfer_fence), .true, std.math.maxInt(u64));
        try self.vkd.resetFences(self.device, 1, @ptrCast(&self.atlas_transfer_fence));

        // Copy dirty band into staging buffer
        const mapped = try self.vkd.mapMemory(self.device, self.atlas_staging_memory, 0, @intCast(byte_len), .{});
        @memcpy(@as([*]u8, @ptrCast(mapped))[0..byte_len], pixels[byte_offset .. byte_offset + byte_len]);
        self.vkd.unmapMemory(self.device, self.atlas_staging_memory);

        // Record transfer command
        try self.vkd.resetCommandBuffer(self.atlas_transfer_cb, .{});
        try self.vkd.beginCommandBuffer(self.atlas_transfer_cb, &vk.CommandBufferBeginInfo{
            .flags = .{ .one_time_submit_bit = true },
        });

        // Barrier: old_layout -> TRANSFER_DST
        const old_layout: vk.ImageLayout = if (full) .undefined else .shader_read_only_optimal;
        const barrier_to_transfer = vk.ImageMemoryBarrier{
            .src_access_mask = if (full) @as(vk.AccessFlags, .{}) else .{ .shader_read_bit = true },
            .dst_access_mask = .{ .transfer_write_bit = true },
            .old_layout = old_layout,
            .new_layout = .transfer_dst_optimal,
            .src_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
            .dst_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
            .image = self.atlas_image,
            .subresource_range = .{
                .aspect_mask = .{ .color_bit = true },
                .base_mip_level = 0,
                .level_count = 1,
                .base_array_layer = 0,
                .layer_count = 1,
            },
        };
        const src_stage: vk.PipelineStageFlags = if (full) .{ .top_of_pipe_bit = true } else .{ .fragment_shader_bit = true };
        self.vkd.cmdPipelineBarrier(
            self.atlas_transfer_cb,
            src_stage,
            .{ .transfer_bit = true },
            .{},
            0, null,
            0, null,
            1, @ptrCast(&barrier_to_transfer),
        );

        // Copy staging buffer -> image (dirty band only)
        const region = vk.BufferImageCopy{
            .buffer_offset = 0,
            .buffer_row_length = 0,
            .buffer_image_height = 0,
            .image_subresource = .{
                .aspect_mask = .{ .color_bit = true },
                .mip_level = 0,
                .base_array_layer = 0,
                .layer_count = 1,
            },
            .image_offset = .{ .x = 0, .y = @intCast(y_start), .z = 0 },
            .image_extent = .{ .width = self.atlas_width, .height = y_end - y_start, .depth = 1 },
        };
        self.vkd.cmdCopyBufferToImage(
            self.atlas_transfer_cb,
            self.atlas_staging_buffer,
            self.atlas_image,
            .transfer_dst_optimal,
            1,
            @ptrCast(&region),
        );

        // Barrier: TRANSFER_DST -> SHADER_READ_ONLY
        const barrier_to_shader = vk.ImageMemoryBarrier{
            .src_access_mask = .{ .transfer_write_bit = true },
            .dst_access_mask = .{ .shader_read_bit = true },
            .old_layout = .transfer_dst_optimal,
            .new_layout = .shader_read_only_optimal,
            .src_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
            .dst_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
            .image = self.atlas_image,
            .subresource_range = .{
                .aspect_mask = .{ .color_bit = true },
                .base_mip_level = 0,
                .level_count = 1,
                .base_array_layer = 0,
                .layer_count = 1,
            },
        };
        self.vkd.cmdPipelineBarrier(
            self.atlas_transfer_cb,
            .{ .transfer_bit = true },
            .{ .fragment_shader_bit = true },
            .{},
            0, null,
            0, null,
            1, @ptrCast(&barrier_to_shader),
        );

        try self.vkd.endCommandBuffer(self.atlas_transfer_cb);

        // Submit with dedicated fence (no queueWaitIdle)
        try self.vkd.queueSubmit(self.graphics_queue, 1, @ptrCast(&vk.SubmitInfo{
            .command_buffer_count = 1,
            .p_command_buffers = @ptrCast(&self.atlas_transfer_cb),
        }), self.atlas_transfer_fence);
    }
```

- [ ] **Step 2: Verify it compiles**

Run: `zig build 2>&1 | tail -5`
Expected: BUILD SUCCESS

- [ ] **Step 3: Run tests**

Run: `zig build test 2>&1 | tail -5`
Expected: PASS

- [ ] **Step 4: Commit**

```bash
git add src/renderer.zig
git commit -m "Implement uploadAtlasRegion with incremental uploads"
```

### Task 4: Add ASCII precompute and wire incremental upload into main.zig

**Files:**
- Modify: `src/main.zig`

- [ ] **Step 1: Add ASCII precompute at startup**

In `src/main.zig`, replace the block at lines 171-172:

```zig
    // Upload empty atlas first (so descriptor set is valid)
    try ctx.uploadAtlas(atlas.pixels);
```

With:

```zig
    // Precompute printable ASCII glyphs (32-126) into atlas
    for (32..127) |cp| {
        _ = atlas.getOrInsert(&face, @intCast(cp)) catch |err| switch (err) {
            error.AtlasFull => break,
            else => return err,
        };
    }
    // Upload warm atlas (full upload — descriptor set needs valid data)
    try ctx.uploadAtlas(atlas.pixels);
    atlas.last_uploaded_y = atlas.cursor_y;
    atlas.needs_full_upload = false;
    atlas.dirty = false;
```

- [ ] **Step 2: Replace the render-loop atlas upload**

In `src/main.zig`, replace the atlas upload block (lines 477-482):

```zig
        // Re-upload atlas if new glyphs were added
        if (atlas.dirty) {
            try ctx.uploadAtlas(atlas.pixels);
            atlas.dirty = false;
            render_cache.layout_dirty = true;
        }
```

With:

```zig
        // Re-upload atlas if new glyphs were added (incremental)
        if (atlas.dirty) {
            const y_start = atlas.last_uploaded_y;
            const y_end = atlas.cursor_y + atlas.row_height;
            if (y_start < y_end) {
                try ctx.uploadAtlasRegion(
                    atlas.pixels,
                    y_start,
                    y_end,
                    atlas.needs_full_upload,
                );
                atlas.last_uploaded_y = atlas.cursor_y;
                atlas.needs_full_upload = false;
                render_cache.layout_dirty = true;
            }
            atlas.dirty = false;
        }
```

- [ ] **Step 3: Verify it compiles**

Run: `zig build 2>&1 | tail -5`
Expected: BUILD SUCCESS

- [ ] **Step 4: Run tests**

Run: `zig build test 2>&1 | tail -5`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add src/main.zig
git commit -m "Wire ASCII precompute and incremental atlas upload"
```

### Task 5: Full verification

**Files:**
- Test: `src/font.zig`, `src/renderer.zig`, `src/main.zig`

- [ ] **Step 1: Run the full test suite**

Run: `zig build test`
Expected: PASS

- [ ] **Step 2: Manual smoke test — normal run**

Run: `zig build run`
Expected:
- Terminal opens and shows text correctly (precomputed ASCII atlas).
- Typing normal text works. Cursor renders.
- Exit dumps frame timing stats — atlas_upload should be 0 for most frames.

- [ ] **Step 3: Manual smoke test — Unicode character**

Run inside terminal: `echo "★ ← → ★"`
Expected: Characters render correctly (incremental upload fires for the first time these codepoints appear).

- [ ] **Step 4: Manual smoke test — bench comparison**

Run: `make bench`
Expected:
- atlas_upload avg should drop significantly from the baseline ~1700us.
- Steady-state frames should show atlas_upload near 0.

- [ ] **Step 5: Commit if any fixups were needed**

```bash
git add src/font.zig src/renderer.zig src/main.zig
git commit -m "Fix verification issues for incremental atlas upload"
```

## Self-Review

- **Spec coverage:**
  - `last_uploaded_y` + `needs_full_upload` fields: Task 1
  - `reset()` sets both fields: Task 1
  - Persistent staging buffer: Task 2
  - Transfer fence (starts signaled): Task 2
  - `uploadAtlasRegion` with partial copy: Task 3
  - Layout transition: `UNDEFINED` vs `SHADER_READ_ONLY` based on `full` flag: Task 3
  - Post-copy barrier back to `SHADER_READ_ONLY`: Task 3
  - Fence wait before reusing staging buffer: Task 3
  - No `queueWaitIdle`: Task 3
  - ASCII precompute (32-126): Task 4
  - Render-loop incremental wiring with `y_start < y_end` guard: Task 4
  - `last_uploaded_y = cursor_y` (not `cursor_y + row_height`): Task 4
  - Bench comparison: Task 5
- **Placeholder scan:** No TBD/TODO markers. All code blocks are complete.
- **Type consistency:**
  - `Atlas.last_uploaded_y` and `Atlas.needs_full_upload` defined in Task 1, used in Task 4
  - `Context.atlas_staging_buffer`, `atlas_staging_memory`, `atlas_transfer_cb`, `atlas_transfer_fence` defined in Task 2, used in Task 3
  - `uploadAtlasRegion(pixels, y_start, y_end, full)` defined in Task 3, called in Task 4
  - Existing `uploadAtlas` kept unchanged — used for initial full upload in Task 4