docs/superpowers/plans/2026-04-10-incremental-atlas-upload-implementation.md
# Incremental Atlas Upload Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Reduce atlas upload cost from the ~1.7 ms (~1700 µs) bench baseline to near zero in steady-state frames by precomputing ASCII glyphs at startup and uploading only dirty atlas rows incrementally.
**Architecture:** Add `last_uploaded_y` and `needs_full_upload` tracking fields to the Atlas struct in `font.zig`. Add `uploadAtlasRegion` to `renderer.zig` with a persistent staging buffer, content-preserving layout transitions, and a dedicated transfer fence. Wire the precompute loop and incremental upload into `main.zig`.
**Tech Stack:** Zig 0.15, Vulkan host-visible staging buffers, image layout transitions, fence synchronization.
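To make the goal concrete, here is a rough sketch of how many bytes a full upload copies compared with a single dirty band. It assumes a one-byte-per-pixel atlas (consistent with the `atlas_width * atlas_height` staging size used later in this plan). The helper name `uploadBytes` and the 1024x1024 and 20-row figures are illustrative assumptions, not measurements from this project.

```zig
const std = @import("std");

/// Hypothetical helper: bytes copied for `rows` rows of a
/// one-byte-per-pixel atlas that is `width` pixels wide.
fn uploadBytes(width: u32, rows: u32) u64 {
    // Widen before multiplying so large atlases cannot overflow u32.
    return @as(u64, width) * rows;
}

test "a single dirty band is a small fraction of a full upload" {
    const full = uploadBytes(1024, 1024); // whole atlas (illustrative size)
    const band = uploadBytes(1024, 20); // one 20-pixel-tall glyph row
    try std.testing.expectEqual(@as(u64, 1_048_576), full);
    try std.testing.expectEqual(@as(u64, 20_480), band);
    try std.testing.expect(band * 50 < full);
}
```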
---
## File Structure
- Modify: `src/font.zig`
- Add `last_uploaded_y: u32` and `needs_full_upload: bool` to `Atlas`
- Update `init()` and `reset()` to set these fields
- Modify: `src/renderer.zig`
- Add persistent staging buffer + dedicated transfer command buffer + transfer fence to `Context`
- Add `uploadAtlasRegion(pixels, y_start, y_end, full)` method
- Keep existing `uploadAtlas` unchanged as the full-upload path (used for the initial upload)
- Modify: `src/main.zig`
- Add ASCII precompute loop at startup
- Replace render-loop atlas upload with incremental path
### Task 1: Add dirty-region tracking fields to Atlas with tests
**Files:**
- Modify: `src/font.zig`
- Test: `src/font.zig`
- [ ] **Step 1: Write the failing tests**
Add at the bottom of `src/font.zig`, after the existing test blocks:
```zig
test "Atlas dirty tracking fields initialized correctly" {
var atlas = try Atlas.init(std.testing.allocator, 256, 256);
defer atlas.deinit();
try std.testing.expectEqual(@as(u32, 0), atlas.last_uploaded_y);
try std.testing.expect(atlas.needs_full_upload);
}
test "Atlas dirty region covers new glyphs" {
var atlas = try Atlas.init(std.testing.allocator, 256, 256);
defer atlas.deinit();
// After init, cursor_y=0, row_height=1 (for the white pixel)
const y_start = atlas.last_uploaded_y;
const y_end = atlas.cursor_y + atlas.row_height;
try std.testing.expectEqual(@as(u32, 0), y_start);
try std.testing.expect(y_end > 0);
}
test "Atlas reset restores dirty tracking fields" {
var atlas = try Atlas.init(std.testing.allocator, 256, 256);
defer atlas.deinit();
// Simulate having uploaded some region
atlas.last_uploaded_y = 50;
atlas.needs_full_upload = false;
atlas.reset();
try std.testing.expectEqual(@as(u32, 0), atlas.last_uploaded_y);
try std.testing.expect(atlas.needs_full_upload);
}
```
- [ ] **Step 2: Run test to verify it fails**
Run: `zig build test 2>&1 | head -20`
Expected: FAIL — `last_uploaded_y` field does not exist.
- [ ] **Step 3: Add the fields to Atlas**
In `src/font.zig`, add to the `Atlas` struct fields (after `dirty: bool`):
```zig
last_uploaded_y: u32,
needs_full_upload: bool,
```
In `Atlas.init` (the return struct literal), add:
```zig
.last_uploaded_y = 0,
.needs_full_upload = true,
```
In `Atlas.reset`, add at the end (after `self.dirty = true;`):
```zig
self.last_uploaded_y = 0;
self.needs_full_upload = true;
```
- [ ] **Step 4: Run test to verify it passes**
Run: `zig build test 2>&1 | tail -5`
Expected: PASS
- [ ] **Step 5: Commit**
```bash
git add src/font.zig
git commit -m "Add dirty-region tracking fields to Atlas"
```
### Task 2: Add persistent staging buffer and transfer fence to renderer
**Files:**
- Modify: `src/renderer.zig`
- [ ] **Step 1: Add fields to Context struct**
In `src/renderer.zig`, add three new fields to the `Context` struct after `atlas_height: u32`:
```zig
// Persistent atlas staging buffer (reused across frames)
atlas_staging_buffer: vk.Buffer,
atlas_staging_memory: vk.DeviceMemory,
// Dedicated transfer command buffer + fence
atlas_transfer_cb: vk.CommandBuffer,
atlas_transfer_fence: vk.Fence,
```
- [ ] **Step 2: Allocate resources in Context.init**
In `Context.init`, after the atlas sampler creation and before the descriptor set update (around line 910), add:
```zig
// --- Atlas staging buffer (persistent, reused across frames) ---
const atlas_staging_size: vk.DeviceSize = @as(vk.DeviceSize, atlas_width) * atlas_height;
const atlas_staging = try createHostVisibleBuffer(vki, pd_info.physical, vkd, device, atlas_staging_size, .{ .transfer_src_bit = true });
errdefer {
vkd.destroyBuffer(device, atlas_staging.buffer, null);
vkd.freeMemory(device, atlas_staging.memory, null);
}
// --- Dedicated atlas transfer command buffer ---
var atlas_transfer_cb: vk.CommandBuffer = undefined;
try vkd.allocateCommandBuffers(device, &vk.CommandBufferAllocateInfo{
.command_pool = command_pool,
.level = .primary,
.command_buffer_count = 1,
}, @ptrCast(&atlas_transfer_cb));
// --- Atlas transfer fence (starts signaled so first wait is a no-op) ---
const atlas_transfer_fence = try vkd.createFence(device, &vk.FenceCreateInfo{
.flags = .{ .signaled_bit = true },
}, null);
errdefer vkd.destroyFence(device, atlas_transfer_fence, null);
```
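The reason the fence is created signaled can be illustrated with a small model. This is only a sketch: `MockFence` is a hypothetical stand-in that tracks nothing but the signaled bit, and it is not part of the renderer or of any Vulkan binding.

```zig
const std = @import("std");

/// Hypothetical stand-in for a Vulkan fence, modelling only the signaled bit.
const MockFence = struct {
    signaled: bool,

    /// Returns true if a real wait would have had to block.
    fn wait(self: *const MockFence) bool {
        return !self.signaled;
    }

    /// Models resetFences: the fence becomes unsignaled.
    fn reset(self: *MockFence) void {
        self.signaled = false;
    }

    /// Models the GPU finishing the submitted transfer.
    fn signal(self: *MockFence) void {
        self.signaled = true;
    }
};

test "fence created signaled makes the first wait a no-op" {
    var fence = MockFence{ .signaled = true };
    try std.testing.expect(!fence.wait()); // first upload: nothing to wait for
    fence.reset(); // upload submitted, transfer in flight
    try std.testing.expect(fence.wait()); // a second upload would block here
    fence.signal(); // GPU completes the copy
    try std.testing.expect(!fence.wait());
}
```

If the fence started unsignaled, the very first wait with an unbounded timeout would never return, because no submission has been made that could signal it.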
- [ ] **Step 3: Add new fields to the return struct**
In the return struct literal in `Context.init`, add after `.atlas_height = atlas_height`:
```zig
.atlas_staging_buffer = atlas_staging.buffer,
.atlas_staging_memory = atlas_staging.memory,
.atlas_transfer_cb = atlas_transfer_cb,
.atlas_transfer_fence = atlas_transfer_fence,
```
- [ ] **Step 4: Free resources in Context.deinit**
In `Context.deinit`, add after the atlas memory free (after `self.vkd.freeMemory(self.device, self.atlas_memory, null);`):
```zig
self.vkd.destroyBuffer(self.device, self.atlas_staging_buffer, null);
self.vkd.freeMemory(self.device, self.atlas_staging_memory, null);
self.vkd.destroyFence(self.device, self.atlas_transfer_fence, null);
```
- [ ] **Step 5: Verify it compiles**
Run: `zig build 2>&1 | tail -5`
Expected: BUILD SUCCESS
- [ ] **Step 6: Run tests**
Run: `zig build test 2>&1 | tail -5`
Expected: PASS
- [ ] **Step 7: Commit**
```bash
git add src/renderer.zig
git commit -m "Add persistent staging buffer and transfer fence to renderer"
```
### Task 3: Implement uploadAtlasRegion
**Files:**
- Modify: `src/renderer.zig`
- [ ] **Step 1: Add the uploadAtlasRegion method**
Add after the existing `uploadAtlas` method in `Context`:
```zig
/// Upload a horizontal band of the atlas (y_start..y_end) to the GPU.
/// Uses the persistent staging buffer and dedicated transfer command buffer.
/// If `full` is true, transitions from UNDEFINED (for initial/reset uploads).
/// Otherwise transitions from SHADER_READ_ONLY (preserves existing data).
pub fn uploadAtlasRegion(
self: *Context,
pixels: []const u8,
y_start: u32,
y_end: u32,
full: bool,
) !void {
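if (y_start >= y_end) return;
// The band must lie inside the atlas image, or the copy below reads past `pixels`.
std.debug.assert(y_end <= self.atlas_height);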
const byte_offset: usize = @as(usize, y_start) * self.atlas_width;
const byte_len: usize = @as(usize, y_end - y_start) * self.atlas_width;
// Wait for any prior atlas transfer to finish before reusing staging buffer
_ = try self.vkd.waitForFences(self.device, 1, @ptrCast(&self.atlas_transfer_fence), .true, std.math.maxInt(u64));
try self.vkd.resetFences(self.device, 1, @ptrCast(&self.atlas_transfer_fence));
// Copy dirty band into staging buffer
const mapped = try self.vkd.mapMemory(self.device, self.atlas_staging_memory, 0, @intCast(byte_len), .{});
@memcpy(@as([*]u8, @ptrCast(mapped))[0..byte_len], pixels[byte_offset .. byte_offset + byte_len]);
self.vkd.unmapMemory(self.device, self.atlas_staging_memory);
// Record transfer command
try self.vkd.resetCommandBuffer(self.atlas_transfer_cb, .{});
try self.vkd.beginCommandBuffer(self.atlas_transfer_cb, &vk.CommandBufferBeginInfo{
.flags = .{ .one_time_submit_bit = true },
});
// Barrier: old_layout -> TRANSFER_DST
const old_layout: vk.ImageLayout = if (full) .undefined else .shader_read_only_optimal;
const barrier_to_transfer = vk.ImageMemoryBarrier{
.src_access_mask = if (full) @as(vk.AccessFlags, .{}) else .{ .shader_read_bit = true },
.dst_access_mask = .{ .transfer_write_bit = true },
.old_layout = old_layout,
.new_layout = .transfer_dst_optimal,
.src_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
.dst_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
.image = self.atlas_image,
.subresource_range = .{
.aspect_mask = .{ .color_bit = true },
.base_mip_level = 0,
.level_count = 1,
.base_array_layer = 0,
.layer_count = 1,
},
};
const src_stage: vk.PipelineStageFlags = if (full) .{ .top_of_pipe_bit = true } else .{ .fragment_shader_bit = true };
self.vkd.cmdPipelineBarrier(
self.atlas_transfer_cb,
src_stage,
.{ .transfer_bit = true },
.{},
0, null,
0, null,
1, @ptrCast(&barrier_to_transfer),
);
// Copy staging buffer -> image (dirty band only)
const region = vk.BufferImageCopy{
.buffer_offset = 0,
.buffer_row_length = 0,
.buffer_image_height = 0,
.image_subresource = .{
.aspect_mask = .{ .color_bit = true },
.mip_level = 0,
.base_array_layer = 0,
.layer_count = 1,
},
.image_offset = .{ .x = 0, .y = @intCast(y_start), .z = 0 },
.image_extent = .{ .width = self.atlas_width, .height = y_end - y_start, .depth = 1 },
};
self.vkd.cmdCopyBufferToImage(
self.atlas_transfer_cb,
self.atlas_staging_buffer,
self.atlas_image,
.transfer_dst_optimal,
1,
@ptrCast(&region),
);
// Barrier: TRANSFER_DST -> SHADER_READ_ONLY
const barrier_to_shader = vk.ImageMemoryBarrier{
.src_access_mask = .{ .transfer_write_bit = true },
.dst_access_mask = .{ .shader_read_bit = true },
.old_layout = .transfer_dst_optimal,
.new_layout = .shader_read_only_optimal,
.src_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
.dst_queue_family_index = vk.QUEUE_FAMILY_IGNORED,
.image = self.atlas_image,
.subresource_range = .{
.aspect_mask = .{ .color_bit = true },
.base_mip_level = 0,
.level_count = 1,
.base_array_layer = 0,
.layer_count = 1,
},
};
self.vkd.cmdPipelineBarrier(
self.atlas_transfer_cb,
.{ .transfer_bit = true },
.{ .fragment_shader_bit = true },
.{},
0, null,
0, null,
1, @ptrCast(&barrier_to_shader),
);
try self.vkd.endCommandBuffer(self.atlas_transfer_cb);
// Submit with dedicated fence (no queueWaitIdle)
try self.vkd.queueSubmit(self.graphics_queue, 1, @ptrCast(&vk.SubmitInfo{
.command_buffer_count = 1,
.p_command_buffers = @ptrCast(&self.atlas_transfer_cb),
}), self.atlas_transfer_fence);
}
```
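The only part of the first barrier that depends on `full` is its source side. The sketch below isolates that decision as a pure function so it can be checked without a GPU. `Layout`, `Stage`, `PreCopyBarrier`, and `preCopyBarrier` are hypothetical stand-ins for illustration and do not mirror the real `vk.*` types.

```zig
const std = @import("std");

// Hypothetical stand-ins for the Vulkan enums used in uploadAtlasRegion.
const Layout = enum { undefined_contents, shader_read_only, transfer_dst };
const Stage = enum { top_of_pipe, fragment_shader };

const PreCopyBarrier = struct {
    old_layout: Layout,
    new_layout: Layout = .transfer_dst,
    src_stage: Stage,
    waits_on_shader_reads: bool,
};

/// Chooses the source side of the pre-copy barrier.
/// full == true: old contents may be discarded, nothing to wait for.
/// full == false: existing texels are preserved and prior sampling must finish.
fn preCopyBarrier(full: bool) PreCopyBarrier {
    if (full) {
        return .{
            .old_layout = .undefined_contents,
            .src_stage = .top_of_pipe,
            .waits_on_shader_reads = false,
        };
    }
    return .{
        .old_layout = .shader_read_only,
        .src_stage = .fragment_shader,
        .waits_on_shader_reads = true,
    };
}

test "incremental upload preserves contents and waits on sampling" {
    const b = preCopyBarrier(false);
    try std.testing.expectEqual(Layout.shader_read_only, b.old_layout);
    try std.testing.expectEqual(Layout.transfer_dst, b.new_layout);
    try std.testing.expectEqual(Stage.fragment_shader, b.src_stage);
    try std.testing.expect(b.waits_on_shader_reads);
}
```

Transitioning from an undefined layout lets the driver discard the image contents, which is why the incremental path has to start from the shader-read layout instead.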
- [ ] **Step 2: Verify it compiles**
Run: `zig build 2>&1 | tail -5`
Expected: BUILD SUCCESS
- [ ] **Step 3: Run tests**
Run: `zig build test 2>&1 | tail -5`
Expected: PASS
- [ ] **Step 4: Commit**
```bash
git add src/renderer.zig
git commit -m "Implement uploadAtlasRegion with incremental uploads"
```
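The byte offset and length arithmetic in `uploadAtlasRegion` can be exercised on its own. The following is a minimal sketch assuming a tightly packed, one-byte-per-pixel atlas. `bandSlice` is a hypothetical helper, not an existing function in `renderer.zig`, and unlike the method above it reports an out-of-range band as an error.

```zig
const std = @import("std");

/// Hypothetical helper: returns the bytes of rows [y_start, y_end) from a
/// tightly packed, one-byte-per-pixel atlas that is `width` pixels wide.
/// An empty band yields an empty slice, mirroring the early return above.
fn bandSlice(
    pixels: []const u8,
    width: u32,
    y_start: u32,
    y_end: u32,
) error{OutOfBounds}![]const u8 {
    if (y_start >= y_end) return pixels[0..0];
    const offset = @as(usize, y_start) * width;
    const end = @as(usize, y_end) * width;
    if (end > pixels.len) return error.OutOfBounds;
    return pixels[offset..end];
}

test "band slice selects whole rows" {
    // 4x3 atlas, each row filled with its own row index.
    const pixels = [_]u8{ 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2 };
    const band = try bandSlice(&pixels, 4, 1, 3);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 1, 1, 1, 1, 2, 2, 2, 2 }, band);
    try std.testing.expectError(error.OutOfBounds, bandSlice(&pixels, 4, 2, 4));
}
```

Because the band always spans full rows, the staging buffer can be filled from offset 0 and `buffer_row_length = 0` (tightly packed) stays valid for the copy.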
### Task 4: Add ASCII precompute and wire incremental upload into main.zig
**Files:**
- Modify: `src/main.zig`
- [ ] **Step 1: Add ASCII precompute at startup**
In `src/main.zig`, replace the block at lines 171-172:
```zig
// Upload empty atlas first (so descriptor set is valid)
try ctx.uploadAtlas(atlas.pixels);
```
With:
```zig
// Precompute printable ASCII glyphs (32-126) into atlas
for (32..127) |cp| {
_ = atlas.getOrInsert(&face, @intCast(cp)) catch |err| switch (err) {
error.AtlasFull => break,
else => return err,
};
}
// Upload warm atlas (full upload — descriptor set needs valid data)
try ctx.uploadAtlas(atlas.pixels);
atlas.last_uploaded_y = atlas.cursor_y;
atlas.needs_full_upload = false;
atlas.dirty = false;
```
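As a sanity check on the precompute loop, the sketch below counts the code points in the half-open range `32..127` and gives a rough estimate of the atlas rows they consume. The fixed-cell assumption and the names `printableCount` and `rowsNeeded` are hypothetical. Real glyph bitmaps vary in size, so treat the result as an estimate only.

```zig
const std = @import("std");

const first_printable: u32 = 32; // space
const end_printable: u32 = 127; // exclusive, so DEL (127) is skipped

/// Number of code points visited by `for (32..127)`.
fn printableCount() u32 {
    return end_printable - first_printable;
}

/// Rough estimate of atlas rows consumed, ASSUMING every glyph occupies a
/// fixed `cell_w` x `cell_h` cell packed left to right in shelves.
fn rowsNeeded(atlas_w: u32, cell_w: u32, cell_h: u32, count: u32) u32 {
    std.debug.assert(cell_w > 0 and cell_w <= atlas_w);
    const per_row = atlas_w / cell_w;
    const shelves = (count + per_row - 1) / per_row; // ceiling division
    return shelves * cell_h;
}

test "95 printable glyphs need 80 rows at 10x20 cells in a 256-wide atlas" {
    try std.testing.expectEqual(@as(u32, 95), printableCount());
    // 256 / 10 = 25 glyphs per shelf, ceil(95 / 25) = 4 shelves of 20 rows.
    try std.testing.expectEqual(@as(u32, 80), rowsNeeded(256, 10, 20, 95));
}
```

If the estimate exceeded the atlas height, the `error.AtlasFull => break` arm in the loop above would stop the precompute early rather than fail startup.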
- [ ] **Step 2: Replace the render-loop atlas upload**
In `src/main.zig`, replace the atlas upload block (lines 477-482):
```zig
// Re-upload atlas if new glyphs were added
if (atlas.dirty) {
try ctx.uploadAtlas(atlas.pixels);
atlas.dirty = false;
render_cache.layout_dirty = true;
}
```
With:
```zig
// Re-upload atlas if new glyphs were added (incremental)
if (atlas.dirty) {
const y_start = atlas.last_uploaded_y;
const y_end = atlas.cursor_y + atlas.row_height;
if (y_start < y_end) {
try ctx.uploadAtlasRegion(
atlas.pixels,
y_start,
y_end,
atlas.needs_full_upload,
);
atlas.last_uploaded_y = atlas.cursor_y;
atlas.needs_full_upload = false;
render_cache.layout_dirty = true;
}
atlas.dirty = false;
}
```
- [ ] **Step 3: Verify it compiles**
Run: `zig build 2>&1 | tail -5`
Expected: BUILD SUCCESS
- [ ] **Step 4: Run tests**
Run: `zig build test 2>&1 | tail -5`
Expected: PASS
- [ ] **Step 5: Commit**
```bash
git add src/main.zig
git commit -m "Wire ASCII precompute and incremental atlas upload"
```
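The bookkeeping in the render loop sets `last_uploaded_y` to `cursor_y` rather than to the bottom of the uploaded band. The model below sketches why, assuming the atlas packs glyphs into horizontal shelves where `cursor_y` is the top of the open shelf and `row_height` is its current height. `Model` and its methods are hypothetical and are not the real `Atlas`.

```zig
const std = @import("std");

/// Hypothetical model of the tracking state; not the real `Atlas`.
const Model = struct {
    cursor_y: u32 = 0,
    row_height: u32 = 0,
    last_uploaded_y: u32 = 0,

    /// A glyph of height `h` lands on the currently open shelf.
    fn addToCurrentRow(self: *Model, h: u32) void {
        self.row_height = @max(self.row_height, h);
    }

    /// The open shelf is full: start a new one directly below it.
    fn startNewRow(self: *Model, h: u32) void {
        self.cursor_y += self.row_height;
        self.row_height = h;
    }

    /// Returns the band {y_start, y_end} and advances the marker as the plan does.
    fn upload(self: *Model) [2]u32 {
        const band = [2]u32{ self.last_uploaded_y, self.cursor_y + self.row_height };
        self.last_uploaded_y = self.cursor_y;
        return band;
    }
};

test "open shelf is re-uploaded until a later upload moves past it" {
    var m = Model{};
    m.addToCurrentRow(16);
    try std.testing.expectEqual([2]u32{ 0, 16 }, m.upload());
    // A later glyph on the same shelf is still inside the next band.
    m.addToCurrentRow(16);
    try std.testing.expectEqual([2]u32{ 0, 16 }, m.upload());
    // A new shelf starts: the previous shelf is sent once more, since it
    // may have gained glyphs after the last upload.
    m.startNewRow(16);
    try std.testing.expectEqual([2]u32{ 0, 32 }, m.upload());
    // From here on only the open shelf is re-sent.
    m.addToCurrentRow(16);
    try std.testing.expectEqual([2]u32{ 16, 32 }, m.upload());
}
```

Had the marker been advanced to `cursor_y + row_height`, the second upload in this trace would have produced an empty band and the new glyph on the open shelf would never reach the GPU.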
### Task 5: Full verification
**Files:**
- Test: `src/font.zig`, `src/renderer.zig`, `src/main.zig`
- [ ] **Step 1: Run the full test suite**
Run: `zig build test`
Expected: PASS
- [ ] **Step 2: Manual smoke test — normal run**
Run: `zig build run`
Expected:
- Terminal opens and shows text correctly (precomputed ASCII atlas).
- Typing normal text works. Cursor renders.
- Exit dumps frame timing stats — atlas_upload should be 0 for most frames.
- [ ] **Step 3: Manual smoke test — Unicode character**
Run inside terminal: `echo "★ ← → ★"`
Expected: Characters render correctly (incremental upload fires for the first time these codepoints appear).
- [ ] **Step 4: Manual smoke test — bench comparison**
Run: `make bench`
Expected:
- atlas_upload avg should drop significantly from the baseline ~1700us.
- Steady-state frames should show atlas_upload near 0.
- [ ] **Step 5: Commit if any fixups were needed**
```bash
git add src/font.zig src/renderer.zig src/main.zig
git commit -m "Fix verification issues for incremental atlas upload"
```
## Self-Review
- **Spec coverage:**
- `last_uploaded_y` + `needs_full_upload` fields: Task 1
- `reset()` sets both fields: Task 1
- Persistent staging buffer: Task 2
- Transfer fence (starts signaled): Task 2
- `uploadAtlasRegion` with partial copy: Task 3
- Layout transition: `UNDEFINED` vs `SHADER_READ_ONLY` based on `full` flag: Task 3
- Post-copy barrier back to `SHADER_READ_ONLY`: Task 3
- Fence wait before reusing staging buffer: Task 3
- No `queueWaitIdle`: Task 3
- ASCII precompute (32-126): Task 4
- Render-loop incremental wiring with `y_start < y_end` guard: Task 4
- `last_uploaded_y = cursor_y` (not `cursor_y + row_height`), so the still-open row is re-uploaded if it gains glyphs later: Task 4
- Bench comparison: Task 5
- **Placeholder scan:** No TBD/TODO markers. All code blocks are complete.
- **Type consistency:**
- `Atlas.last_uploaded_y` and `Atlas.needs_full_upload` defined in Task 1, used in Task 4
- `Context.atlas_staging_buffer`, `atlas_staging_memory`, `atlas_transfer_cb`, `atlas_transfer_fence` defined in Task 2, used in Task 3
- `uploadAtlasRegion(pixels, y_start, y_end, full)` defined in Task 3, called in Task 4
- Existing `uploadAtlas` kept unchanged — used for initial full upload in Task 4