# Input-Latency Bench Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Implement a closed-loop keystroke-to-display latency benchmark in waystty, measuring cold (idle) and hot (contended-PTY) latency via in-process `KeyEvent` injection, PUA-codepoint sentinels, and `wp_presentation_time` feedback for the display endpoint.
**Architecture:** A new `WAYSTTY_INPUT_BENCH` mode drives a `BenchDriver` that injects sentinels, scans rendered frames for them, and pairs grid-observations with compositor presentation-feedback to compute latency. A shared grid-lock infrastructure (also applied to the existing output bench) forces a known grid size for reproducibility.
**Tech Stack:** Zig 0.15+, zig-wayland (ifreund), Vulkan WSI (vulkan-zig), Wayland `wp_presentation_time` protocol, existing `bench_stats` module.
**Reference spec:** `docs/superpowers/specs/2026-04-18-input-latency-bench-design.md`
---
## Phase 1 — Shared Bench Infrastructure (grid-lock retrofit)
These tasks apply to *both* existing `WAYSTTY_BENCH` and the new `WAYSTTY_INPUT_BENCH`. They harden reproducibility of what's already there before adding new bench modes.
### Task 1.1: Bench-mode grid-size env vars
**Files:**
- Modify: `src/main.zig:196` (initial_grid constant)
- [ ] **Step 1: Add env parsing for bench grid size**
Replace the `initial_grid` constant and add a helper above `main()`:
```zig
fn benchGridSize() GridSize {
const cols_str = std.posix.getenv("WAYSTTY_BENCH_COLS") orelse "80";
const rows_str = std.posix.getenv("WAYSTTY_BENCH_ROWS") orelse "24";
const cols = std.fmt.parseInt(u16, cols_str, 10) catch 80;
const rows = std.fmt.parseInt(u16, rows_str, 10) catch 24;
return .{ .cols = cols, .rows = rows };
}
fn benchModeActive() bool {
return std.posix.getenv("WAYSTTY_BENCH") != null
or std.posix.getenv("WAYSTTY_INPUT_BENCH") != null;
}
```
Change the `initial_grid` site in `main`:
```zig
// === grid size ===
const initial_grid: GridSize = if (benchModeActive())
benchGridSize()
else
.{ .cols = 80, .rows = 24 };
var cols: u16 = initial_grid.cols;
var rows: u16 = initial_grid.rows;
```
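One gap in the parsing above is that a value such as `WAYSTTY_BENCH_COLS=0` parses successfully and would yield a zero-width grid. The following is a minimal sketch of a stricter parser, not part of the plan as written: `parseGridDim` is a hypothetical helper name, and treating zero as invalid is an assumption about the desired behavior.

```zig
const std = @import("std");

/// Parse one grid dimension from an optional env-var string.
/// Falls back to `default` when the value is missing, malformed, or zero.
/// (Hypothetical helper; not defined anywhere in the existing codebase.)
fn parseGridDim(raw: ?[]const u8, default: u16) u16 {
    const s = raw orelse return default;
    const n = std.fmt.parseInt(u16, s, 10) catch return default;
    return if (n == 0) default else n;
}

test "parseGridDim accepts a plain decimal value" {
    try std.testing.expectEqual(@as(u16, 132), parseGridDim("132", 80));
}
```

If adopted, `benchGridSize` would call it once per dimension with the existing defaults of 80 and 24.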
- [ ] **Step 2: Verify build**
Run: `zig build`
Expected: PASS
- [ ] **Step 3: Commit**
```bash
git add src/main.zig
git commit -m "bench: parse WAYSTTY_BENCH_COLS/ROWS for configurable grid"
```
### Task 1.2: `xdg_toplevel` min/max size hints
**Files:**
- Modify: `src/wayland.zig` (new `Window.setSizeHints` method)
- Modify: `src/main.zig` around the `xdg_toplevel` setup (after window creation, before the first roundtrip at `src/main.zig:213`)
- [ ] **Step 1: Expose size hints on Window in `src/wayland.zig`**
Add a method on `Window` near the existing `setTitle`:
```zig
pub fn setSizeHints(self: *Window, w: u32, h: u32) void {
const iw = @as(i32, @intCast(w));
const ih = @as(i32, @intCast(h));
self.xdg_toplevel.setMinSize(iw, ih);
self.xdg_toplevel.setMaxSize(iw, ih);
}
```
- [ ] **Step 2: Call from main when bench is active**
After `window.height = initial_h;` (around `src/main.zig:211`) and before the roundtrip:
```zig
if (benchModeActive()) {
window.setSizeHints(initial_w, initial_h);
}
```
- [ ] **Step 3: Build and smoke-test**
Run: `zig build && WAYSTTY_BENCH=1 WAYSTTY_BENCH_ROWS=24 WAYSTTY_BENCH_COLS=80 ./zig-out/bin/waystty 2>/tmp/smoke.log; head -40 /tmp/smoke.log`
Expected: launches and exits cleanly; in a floating window on sway, geometry is respected.
- [ ] **Step 4: Commit**
```bash
git add src/main.zig src/wayland.zig
git commit -m "bench: advertise xdg_toplevel min/max size hints in bench mode"
```
### Task 1.3: Abort on compositor resize in bench mode
**Files:**
- Modify: `src/main.zig:409` (resize observer)
- [ ] **Step 1: Add abort on size mismatch in the resize observer**
Replace the block at `src/main.zig:409`:
```zig
if (window.width != last_window_w or window.height != last_window_h) {
if (benchModeActive()) {
std.debug.print(
"\nwaystty bench: compositor sized window to {}x{}, expected {}x{} ({}x{} grid). " ++
"Run in a floating window or non-tiling compositor for reproducible benchmarks.\n",
.{ window.width, window.height, initial_w, initial_h, cols, rows },
);
std.process.exit(2);
}
resize_pending = true;
render_pending = true;
}
```
- [ ] **Step 2: Build**
Run: `zig build`
Expected: PASS
- [ ] **Step 3: Manual sanity — tiling compositor abort**
On sway (tiling mode), run:
`WAYSTTY_BENCH=1 ./zig-out/bin/waystty 2>/tmp/bench-abort.log`
Expected: exits with code 2 and diagnostic message. (In floating mode, no abort.)
- [ ] **Step 4: Commit**
```bash
git add src/main.zig
git commit -m "bench: abort with diagnostic if compositor resizes during bench"
```
### Task 1.4: Print grid size in bench stats output
**Files:**
- Modify: `src/bench_stats.zig:152-162` (`printFrameStats`)
- [ ] **Step 1: Extend signature to take grid dims**
Replace `printFrameStats`:
```zig
pub fn printFrameStats(stats: FrameTimingStats, cols: u16, rows: u16) void {
const row_fmt = "{s:<20}{d:>6}{d:>6}{d:>6}{d:>6}\n";
std.debug.print("\n=== waystty frame timing ({d} frames, {d}x{d} grid) ===\n", .{ stats.frame_count, cols, rows });
std.debug.print("{s:<20}{s:>6}{s:>6}{s:>6}{s:>6} (us)\n", .{ "section", "min", "avg", "p99", "max" });
std.debug.print(row_fmt, .{ "snapshot", stats.snapshot.min, stats.snapshot.avg, stats.snapshot.p99, stats.snapshot.max });
std.debug.print(row_fmt, .{ "row_rebuild", stats.row_rebuild.min, stats.row_rebuild.avg, stats.row_rebuild.p99, stats.row_rebuild.max });
std.debug.print(row_fmt, .{ "atlas_upload", stats.atlas_upload.min, stats.atlas_upload.avg, stats.atlas_upload.p99, stats.atlas_upload.max });
std.debug.print(row_fmt, .{ "instance_upload", stats.instance_upload.min, stats.instance_upload.avg, stats.instance_upload.p99, stats.instance_upload.max });
std.debug.print(row_fmt, .{ "gpu_submit", stats.gpu_submit.min, stats.gpu_submit.avg, stats.gpu_submit.p99, stats.gpu_submit.max });
std.debug.print("----------------------------------------------------\n", .{});
std.debug.print(row_fmt, .{ "total", stats.total.min, stats.total.avg, stats.total.p99, stats.total.max });
}
```
- [ ] **Step 2: Update all call sites in `src/main.zig`**
Run: `grep -n "printFrameStats" src/main.zig`
For each call, pass `cols, rows` as additional args. E.g. `printFrameStats(computeFrameStats(&frame_ring), cols, rows);`
- [ ] **Step 3: Build and test**
Run: `zig build && zig build test`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add src/main.zig src/bench_stats.zig
git commit -m "bench: include grid size in stats header"
```
---
## Phase 2 — Frame-counter plumbing
### Task 2.1: Add `frame_counter` field to `FrameTiming`
**Files:**
- Modify: `src/bench_stats.zig:3-24` (`FrameTiming` struct)
- [ ] **Step 1: Extend the struct**
Add `frame_counter: u64 = 0,` as a field on `FrameTiming` (keep `.total()` unchanged — counter is metadata, not timing):
```zig
pub const FrameTiming = struct {
frame_counter: u64 = 0,
snapshot_us: u32 = 0,
row_rebuild_us: u32 = 0,
atlas_upload_us: u32 = 0,
instance_upload_us: u32 = 0,
gpu_submit_us: u32 = 0,
wait_fences_us: u32 = 0,
acquire_us: u32 = 0,
record_us: u32 = 0,
submit_us: u32 = 0,
present_us: u32 = 0,
pub fn total(self: FrameTiming) u32 {
return self.snapshot_us +
self.row_rebuild_us +
self.atlas_upload_us +
self.instance_upload_us +
self.gpu_submit_us;
}
};
```
- [ ] **Step 2: Update CSV writer to include frame_counter column**
In `writeFrameCsv` (around `src/bench_stats.zig:124`), change the header and row:
```zig
_ = try file.write("frame_counter,frame_idx,snapshot_us,row_rebuild_us,atlas_upload_us,instance_upload_us,gpu_submit_us,wait_fences_us,acquire_us,record_us,submit_us,present_us,total_us\n");
for (entries, 0..) |e, i| {
const line = try std.fmt.bufPrint(&buf, "{d},{d},{d},{d},{d},{d},{d},{d},{d},{d},{d},{d},{d}\n", .{
e.frame_counter,
i,
e.snapshot_us,
e.row_rebuild_us,
e.atlas_upload_us,
e.instance_upload_us,
e.gpu_submit_us,
e.wait_fences_us,
e.acquire_us,
e.record_us,
e.submit_us,
e.present_us,
e.total(),
});
_ = try file.write(line);
}
```
- [ ] **Step 3: Increment in main loop**
In `src/main.zig`, add near the other `var` declarations before the main loop (around `src/main.zig:339`):
```zig
var frame_counter: u64 = 0;
```
At the end of each rendered frame (where the ring push happens — grep `frame_ring.push` to find it), set the counter on the timing struct before pushing, then increment:
```zig
timing.frame_counter = frame_counter;
frame_ring.push(timing);
frame_counter +%= 1;
```
(Use the exact local name of the timing variable at that site; adjust if named differently.)
- [ ] **Step 4: Build and test**
Run: `zig build && zig build test`
Expected: PASS. Existing tests continue to pass (they don't touch `frame_counter`).
- [ ] **Step 5: Commit**
```bash
git add src/bench_stats.zig src/main.zig
git commit -m "bench: add frame_counter to FrameTiming for sample correlation"
```
### Task 2.2: Test that `frame_counter` round-trips through the ring
**Files:**
- Modify: `src/bench_stats.zig` (add new test)
- [ ] **Step 1: Add test**
Append to the test block:
```zig
test "FrameTimingRing preserves frame_counter through wrap" {
var ring = FrameTimingRing{};
for (0..FrameTimingRing.capacity + 5) |i| {
ring.push(.{ .frame_counter = i, .snapshot_us = @intCast(i) });
}
var buf: [FrameTimingRing.capacity]FrameTiming = undefined;
const ordered = ring.orderedSlice(&buf);
try std.testing.expectEqual(@as(u64, 5), ordered[0].frame_counter);
try std.testing.expectEqual(@as(u64, FrameTimingRing.capacity + 4), ordered[ordered.len - 1].frame_counter);
}
```
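The test above relies on two properties of `FrameTimingRing`: it overwrites the oldest entries once full, and `orderedSlice` returns entries oldest-first. The real ring lives in `src/bench_stats.zig` and is not reproduced in this plan, so the generic `Ring` below is only a hypothetical stand-in that sketches those assumed semantics.

```zig
const std = @import("std");

/// Hypothetical stand-in for FrameTimingRing: a fixed-capacity ring that
/// overwrites the oldest entry once it is full.
fn Ring(comptime T: type, comptime cap: usize) type {
    return struct {
        const Self = @This();
        pub const capacity = cap;

        items: [cap]T = undefined,
        head: usize = 0, // next write position
        len: usize = 0, // number of valid entries, at most `cap`

        pub fn push(self: *Self, v: T) void {
            self.items[self.head] = v;
            self.head = (self.head + 1) % cap;
            if (self.len < cap) self.len += 1;
        }

        /// Copy entries into `out`, oldest first, and return the filled prefix.
        pub fn orderedSlice(self: *const Self, out: *[cap]T) []T {
            const start = (self.head + cap - self.len) % cap;
            for (0..self.len) |i| out[i] = self.items[(start + i) % cap];
            return out[0..self.len];
        }
    };
}
```

With a capacity of 4 and six pushes of the values 0 through 5, the two oldest values are overwritten and the ordered view starts at 2, which mirrors the `capacity + 5` pattern in the test.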
- [ ] **Step 2: Run test**
Run: `zig build test 2>&1 | grep -E "PASS|FAIL|error"`
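Expected: no `FAIL` or `error` lines. `zig build test` prints nothing on success, so empty output here means the test passed.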
- [ ] **Step 3: Commit**
```bash
git add src/bench_stats.zig
git commit -m "bench: test frame_counter preservation across ring wrap"
```
---
## Phase 3 — `wp_presentation_time` protocol binding
### Task 3.1: Add protocol XML to the Wayland scanner
**Files:**
- Modify: `build.zig:35-40`
- [ ] **Step 1: Register the protocol**
After `scanner.addSystemProtocol("stable/xdg-shell/xdg-shell.xml");`:
```zig
scanner.addSystemProtocol("stable/presentation-time/presentation-time.xml");
```
And after `scanner.generate("xdg_wm_base", 6);`:
```zig
scanner.generate("wp_presentation", 1);
```
- [ ] **Step 2: Build**
Run: `zig build`
Expected: PASS. (Requires `wayland-protocols` system package — if missing, install it via the distro's wayland-protocols dev package.)
- [ ] **Step 3: Commit**
```bash
git add build.zig
git commit -m "bench: register wp_presentation_time protocol in build.zig"
```
### Task 3.2: Bind `wp_presentation` global in the Wayland layer
**Files:**
- Modify: `src/wayland.zig` — find the `Globals` struct and the registry listener
- [ ] **Step 1: Find the Globals struct**
Run: `grep -n "struct.*Globals\|pub const Globals\|globals:\|seat: ?\|compositor: ?" src/wayland.zig | head -20`
Locate where other globals like `seat`, `compositor`, `data_device_manager` are declared. Add a new field:
```zig
wp_presentation: ?*wp.Presentation = null,
```
(Adjust the Wayland protocol import — the generated module exposes `wp` as a namespace; follow the pattern already used for `xdg`/`wl`.)
- [ ] **Step 2: Handle the global in the registry listener**
Find the `registryListener` (or similarly named) that dispatches `registry.global` events. In the switch on interface name, add a branch for `wp_presentation`:
```zig
} else if (std.mem.eql(u8, interface, "wp_presentation")) {
globals.wp_presentation = registry.bind(name, wp.Presentation, 1) catch null;
}
```
(Follow the exact pattern of neighboring `std.mem.eql(u8, interface, "wl_seat")` branches.)
- [ ] **Step 3: Import the generated namespace at the top of `src/wayland.zig`**
Find the existing `const wl = @import("wayland").client.wl;` line and add a parallel:
```zig
const wp = @import("wayland").client.wp;
```
(If the generated module uses a different namespace (e.g. `wp_presentation` rather than `wp`), use whatever the scanner emits — check `zig-cache`'s generated wayland.zig.)
- [ ] **Step 4: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add src/wayland.zig
git commit -m "bench: bind wp_presentation global"
```
### Task 3.3: Wrap `wp_presentation_feedback` with a Zig-friendly callback
**Files:**
- Modify: `src/wayland.zig`
- [ ] **Step 1: Add a `PresentationFeedback` wrapper type**
Near the existing Window / Keyboard types, add:
```zig
pub const PresentationFeedback = struct {
pub const Event = union(enum) {
presented: struct { tv_sec: u64, tv_nsec: u32, refresh: u32, flags: u32 },
discarded: void,
};
feedback: *wp.PresentationFeedback,
user_data: ?*anyopaque = null,
callback: ?*const fn (user_data: ?*anyopaque, ev: Event) void = null,
pub fn init(
presentation: *wp.Presentation,
surface: *wl.Surface,
user_data: ?*anyopaque,
callback: *const fn (?*anyopaque, Event) void,
) !*PresentationFeedback {
const alloc = std.heap.c_allocator; // arena-free, lives until presented/discarded
const self = try alloc.create(PresentationFeedback);
errdefer alloc.destroy(self); // free the wrapper if the feedback request below fails
self.* = .{
.feedback = try presentation.feedback(surface),
.user_data = user_data,
.callback = callback,
};
self.feedback.setListener(*PresentationFeedback, feedbackListener, self);
return self;
}
fn feedbackListener(
_: *wp.PresentationFeedback,
event: wp.PresentationFeedback.Event,
self: *PresentationFeedback,
) void {
switch (event) {
.presented => |p| {
const tv_sec = (@as(u64, p.tv_sec_hi) << 32) | p.tv_sec_lo;
if (self.callback) |cb| {
cb(self.user_data, .{ .presented = .{
.tv_sec = tv_sec,
.tv_nsec = p.tv_nsec,
.refresh = p.refresh,
.flags = @bitCast(p.flags),
} });
}
self.destroy();
},
.discarded => {
if (self.callback) |cb| cb(self.user_data, .discarded);
self.destroy();
},
else => {}, // sync_output events — ignore
}
}
fn destroy(self: *PresentationFeedback) void {
self.feedback.destroy();
std.heap.c_allocator.destroy(self);
}
};
```
(Allocator choice: `c_allocator` because lifetime is tied to async Wayland events, not main-loop ownership. If the project already has a conventional allocator for this, use it.)
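The `presented` event delivers seconds split into two 32-bit halves plus a nanosecond remainder, and the latency math later in this plan works in whole nanoseconds. A minimal sketch of that flattening is shown below; `presentedNs` is a hypothetical helper name, and where the conversion ultimately lives (in the wrapper or in the bench driver) is left open here.

```zig
const std = @import("std");

/// Combine the split seconds field and the nanosecond remainder from a
/// wp_presentation_feedback.presented event into one nanosecond timestamp.
/// (Hypothetical helper. Realistic clock values are far below u64 overflow.)
fn presentedNs(tv_sec_hi: u32, tv_sec_lo: u32, tv_nsec: u32) u64 {
    const sec = (@as(u64, tv_sec_hi) << 32) | tv_sec_lo;
    return sec * std.time.ns_per_s + tv_nsec;
}

test "presentedNs flattens seconds and nanoseconds" {
    try std.testing.expectEqual(@as(u64, 1_000_000_500), presentedNs(0, 1, 500));
}
```

The resulting value is only comparable with an injection timestamp taken from the same clock the compositor announces via the `wp_presentation.clock_id` event.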
- [ ] **Step 2: Build**
Run: `zig build`
Expected: PASS. If compilation fails on event or field names, check them against the scanner-generated bindings in the Zig cache; the generated names can differ slightly from the ones shown above.
- [ ] **Step 3: Commit**
```bash
git add src/wayland.zig
git commit -m "bench: add PresentationFeedback wrapper with typed callback"
```
### Task 3.4: Smoke-test `wp_presentation` under sway
**Files:**
- Create: `src/tools/presentation_smoke.zig` (new, small standalone program)
- [ ] **Step 1: Write a minimal smoke test**
```zig
const std = @import("std");
const wayland_client = @import("wayland-client");
pub fn main() !void {
var gpa: std.heap.DebugAllocator(.{}) = .init;
defer _ = gpa.deinit();
const alloc = gpa.allocator();
const conn = try wayland_client.Connection.init(alloc);
defer conn.deinit();
if (conn.globals.wp_presentation == null) {
std.debug.print("FAIL: wp_presentation global not advertised by compositor\n", .{});
std.process.exit(1);
}
std.debug.print("OK: wp_presentation bound\n", .{});
}
```
- [ ] **Step 2: Add build step in `build.zig`**
After other tools (grep for `bench_baseline` to find the pattern), add:
```zig
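const presentation_smoke_exe = b.addExecutable(.{
.name = "presentation-smoke",
// Zig 0.15 takes the root source through a module rather than
// `.root_source_file` directly on the executable options.
.root_module = b.createModule(.{
.root_source_file = b.path("src/tools/presentation_smoke.zig"),
.target = target,
.optimize = optimize,
}),
});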
presentation_smoke_exe.root_module.addImport("wayland-client", wayland_mod);
const run_presentation_smoke = b.addRunArtifact(presentation_smoke_exe);
const smoke_step = b.step("presentation-smoke", "Verify wp_presentation binding");
smoke_step.dependOn(&run_presentation_smoke.step);
```
- [ ] **Step 3: Run**
Run: `zig build presentation-smoke`
Expected: prints `OK: wp_presentation bound`. On compositors without the protocol, FAILs with a clear message.
- [ ] **Step 4: Commit**
```bash
git add src/tools/presentation_smoke.zig build.zig
git commit -m "bench: add presentation-smoke tool for wp_presentation binding check"
```
### Task 3.5: Request feedback per-frame in the renderer
**Files:**
- Modify: `src/renderer.zig` (around the swapchain `queuePresentKHR` call) and `src/main.zig`
- [ ] **Step 1: Find the present call**
Run: `grep -n "queuePresentKHR\|present_info\|queue_present" src/renderer.zig`
Expected: one location where `vkQueuePresentKHR` is invoked.
- [ ] **Step 2: Expose a pre-present hook**
Add a function pointer field on the renderer context (or equivalent) that's called just before `queuePresentKHR`:
```zig
// In the Context struct definition
pre_present_hook: ?*const fn (ctx: ?*anyopaque) void = null,
pre_present_ctx: ?*anyopaque = null,
```
Immediately before the `queuePresentKHR` call site:
```zig
if (ctx.pre_present_hook) |h| h(ctx.pre_present_ctx);
```
- [ ] **Step 3: Wire the hook from main**
In `src/main.zig`, after the bench driver is initialized (Task 4.1+), set:
```zig
ctx.pre_present_hook = &benchPrePresentHook;
ctx.pre_present_ctx = &bench_driver;
```
For now, leave the hook body as a placeholder fn that does nothing — actual feedback-request logic lands in Phase 6. Define:
```zig
fn benchPrePresentHook(opaque_ctx: ?*anyopaque) void {
_ = opaque_ctx;
// Populated in Task 6.3
}
```
- [ ] **Step 4: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add src/renderer.zig src/main.zig
git commit -m "bench: add pre-present hook in renderer for wp_presentation_feedback"
```
---
## Phase 4 — BenchDriver skeleton + PTY plumbing
### Task 4.1: Env parsing + `Scenario` enum
**Files:**
- Create: `src/bench_input.zig`
- Modify: `build.zig` (register module)
- Modify: `src/main.zig`
- [ ] **Step 1: Create `src/bench_input.zig` scaffold**
```zig
const std = @import("std");
pub const Scenario = enum {
cold,
hot,
both,
pub fn parse(s: []const u8) ?Scenario {
if (std.mem.eql(u8, s, "cold")) return .cold;
if (std.mem.eql(u8, s, "hot")) return .hot;
if (std.mem.eql(u8, s, "both")) return .both;
if (std.mem.eql(u8, s, "1")) return .both; // WAYSTTY_INPUT_BENCH=1 selects both scenarios
return null;
}
};
pub const Config = struct {
scenario: Scenario,
samples_per_scenario: u32 = 500,
max_frames_per_sample: u32 = 60,
cols: u16 = 80,
rows: u16 = 24,
};
pub fn readConfigFromEnv() ?Config {
const val = std.posix.getenv("WAYSTTY_INPUT_BENCH") orelse return null;
const sc = Scenario.parse(val) orelse {
std.debug.print("WAYSTTY_INPUT_BENCH: invalid scenario '{s}', expected cold|hot|both\n", .{val});
std.process.exit(2);
};
return .{
.scenario = sc,
.cols = if (std.posix.getenv("WAYSTTY_BENCH_COLS")) |s|
(std.fmt.parseInt(u16, s, 10) catch 80)
else 80,
.rows = if (std.posix.getenv("WAYSTTY_BENCH_ROWS")) |s|
(std.fmt.parseInt(u16, s, 10) catch 24)
else 24,
};
}
test "Scenario.parse" {
try std.testing.expectEqual(@as(?Scenario, .cold), Scenario.parse("cold"));
try std.testing.expectEqual(@as(?Scenario, .hot), Scenario.parse("hot"));
try std.testing.expectEqual(@as(?Scenario, .both), Scenario.parse("both"));
try std.testing.expectEqual(@as(?Scenario, null), Scenario.parse("nope"));
}
```
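`both` is a request rather than a scenario the driver measures directly; `BenchDriver.init` later in this plan starts a `both` run in `cold`. The sketch below expands a requested scenario into an ordered list of concrete runs. It assumes `hot` follows `cold`, and `runsFor` plus the three array names are hypothetical, introduced only for illustration.

```zig
const std = @import("std");

// Local copy of the enum so the sketch is self-contained.
const Scenario = enum { cold, hot, both };

const cold_only = [_]Scenario{.cold};
const hot_only = [_]Scenario{.hot};
const cold_then_hot = [_]Scenario{ .cold, .hot };

/// Expand a requested scenario into the concrete runs, in order.
/// (Hypothetical helper; the cold-before-hot ordering is an assumption.)
fn runsFor(requested: Scenario) []const Scenario {
    return switch (requested) {
        .cold => &cold_only,
        .hot => &hot_only,
        .both => &cold_then_hot,
    };
}

test "runsFor expands both into two runs" {
    try std.testing.expectEqual(@as(usize, 2), runsFor(.both).len);
}
```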
- [ ] **Step 2: Register module in `build.zig`**
After other module declarations (grep `bench_stats_mod` pattern), add:
```zig
const bench_input_mod = b.createModule(.{
.root_source_file = b.path("src/bench_input.zig"),
.target = target,
.optimize = optimize,
});
```
And at the waystty executable's `addImport` block, add:
```zig
exe.root_module.addImport("bench_input", bench_input_mod);
```
Also add a test step for it:
```zig
const bench_input_tests = b.addTest(.{ .root_module = bench_input_mod });
const run_bench_input_tests = b.addRunArtifact(bench_input_tests);
const test_step = b.step("test", "Run tests"); // if a test step already exists, just add the dep
test_step.dependOn(&run_bench_input_tests.step);
```
(If there's already a `test` step — check with `grep "b.step(\"test\"" build.zig` — add `test_step.dependOn(&run_bench_input_tests.step);` to the existing one.)
- [ ] **Step 3: Run tests**
Run: `zig build test 2>&1 | tail -20`
Expected: `Scenario.parse` passes.
- [ ] **Step 4: Commit**
```bash
git add src/bench_input.zig build.zig
git commit -m "bench: create bench_input module with Scenario enum + Config"
```
### Task 4.2: PTY termios ECHO verification helper
**Files:**
- Modify: `src/pty.zig`
- [ ] **Step 1: Find PTY spawn**
Run: `grep -n "pub fn spawn\|tcsetattr\|termios\|ECHO" src/pty.zig | head`
- [ ] **Step 2: Add a helper**
Near the existing `spawn` method:
```zig
pub fn ensureEcho(slave_fd: std.posix.fd_t) !void {
    // std.posix.tcgetattr returns the termios by value, and lflag is a
    // packed struct of bools, so ECHO is a field rather than a bit mask.
    var tio = try std.posix.tcgetattr(slave_fd);
    if (!tio.lflag.ECHO) {
        tio.lflag.ECHO = true;
        try std.posix.tcsetattr(slave_fd, .NOW, tio);
    }
}
```
(Adjust namespaces if Zig stdlib differs slightly — find by grepping `tcgetattr` in the stdlib.)
- [ ] **Step 3: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add src/pty.zig
git commit -m "bench: add Pty.ensureEcho helper"
```
### Task 4.3: Cold PTY spawn (`cat > /dev/null`)
**Files:**
- Modify: `src/main.zig` (spawn block around lines 276-304)
- [ ] **Step 1: Extend the shell-selection logic**
Replace the block from `const is_bench = ...` through `defer p.deinit();` (roughly `src/main.zig:276-305`):
```zig
const bench_input_cfg = @import("bench_input").readConfigFromEnv();
const is_output_bench = std.posix.getenv("WAYSTTY_BENCH") != null;
const bench_unthrottled = is_output_bench and std.posix.getenv("WAYSTTY_BENCH_UNTHROTTLED") != null;
// Shell + args choice
const ShellPlan = struct {
shell: [:0]const u8,
args: ?[]const [:0]const u8,
};
const shell_plan: ShellPlan = if (bench_input_cfg) |cfg| blk: {
const sh_args: []const [:0]const u8 = switch (cfg.scenario) {
.cold, .both => &.{ "-c", "cat > /dev/null" },
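// `seq` instead of `{1..500}`: brace expansion is a bashism, and /bin/sh
// may be dash, which would pass the braces through and emit a single `x`.
.hot => &.{ "-c", "yes \"$(printf 'x%.0s' $(seq 1 500))\" | pv -qL 24K" },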
};
break :blk .{ .shell = try alloc.dupeZ(u8, "/bin/sh"), .args = sh_args };
} else if (is_output_bench) blk: {
break :blk .{ .shell = try alloc.dupeZ(u8, "/bin/sh"), .args = null };
} else blk: {
const shell_env = std.posix.getenv("SHELL") orelse "/bin/sh";
break :blk .{ .shell = try alloc.dupeZ(u8, shell_env), .args = null };
};
defer alloc.free(shell_plan.shell);
const bench_script: ?[:0]const u8 = if (is_output_bench)
@embedFile("bench_workload")
else
null;
if (is_output_bench) {
if (bench_unthrottled) {
std.debug.print("[bench] mode: UNTHROTTLED (not freeze-safe)\n", .{});
} else {
std.debug.print("[bench] mode: THROTTLED (vsync-paced)\n", .{});
}
}
if (bench_input_cfg) |cfg| {
std.debug.print("[input-bench] scenario: {s}, grid: {d}x{d}\n", .{ @tagName(cfg.scenario), cfg.cols, cfg.rows });
}
const pty_args = if (shell_plan.args) |a| a else if (bench_script) |script| &[_][:0]const u8{ "-c", script } else null;
var p = try pty.Pty.spawn(.{
.cols = cols,
.rows = rows,
.shell = shell_plan.shell,
.shell_args = pty_args,
});
defer p.deinit();
try pty.Pty.ensureEcho(p.slave_fd); // if slave_fd isn't public, expose it; otherwise do inside Pty.spawn
term.setWritePtyCallback(&p, &writePtyFromTerminal);
```
(If `p.slave_fd` is not exposed, modify `src/pty.zig` to expose it, or call `ensureEcho` from inside `Pty.spawn`.)
- [ ] **Step 2: Build and smoke-test cold**
Run: `zig build && WAYSTTY_INPUT_BENCH=cold WAYSTTY_BENCH_COLS=80 WAYSTTY_BENCH_ROWS=24 ./zig-out/bin/waystty 2>/tmp/bench-cold.log &`
Let it run ~2s, then kill. Inspect `/tmp/bench-cold.log` — should show the `[input-bench] scenario: cold` line.
- [ ] **Step 3: Commit**
```bash
git add src/main.zig src/pty.zig
git commit -m "bench: spawn bench-specific PTY children for cold/hot scenarios"
```
### Task 4.4: `pv` availability check for hot mode
**Files:**
- Modify: `src/main.zig` (inside the hot-scenario branch)
- [ ] **Step 1: Add a pre-spawn check**
Add a file-scope helper above `main` (Zig has no nested function declarations, so it cannot live inside the branch itself):
```zig
fn assertPvAvailable(alloc: std.mem.Allocator) void {
const res = std.process.Child.run(.{
.allocator = alloc,
.argv = &.{ "sh", "-c", "command -v pv" },
}) catch {
std.debug.print("waystty input-bench hot: `pv` not found. Install with your package manager (e.g. `pacman -S pv`).\n", .{});
std.process.exit(2);
};
alloc.free(res.stdout);
alloc.free(res.stderr);
if (res.term != .Exited or res.term.Exited != 0) {
std.debug.print("waystty input-bench hot: `pv` not found. Install with your package manager (e.g. `pacman -S pv`).\n", .{});
std.process.exit(2);
}
}
```
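Call it in the hot arm. The arm needs its own label because the enclosing `if` branch already uses `blk`, and Zig rejects redefining a label in a nested scope. The command uses `seq` rather than `{1..500}` because brace expansion is a bashism and `/bin/sh` may be dash:
```zig
.hot => hot: {
assertPvAvailable(alloc);
break :hot &.{ "-c", "yes \"$(printf 'x%.0s' $(seq 1 500))\" | pv -qL 24K" };
},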
```
- [ ] **Step 2: Manual test without pv**
Run: `PATH=/usr/bin:/bin WAYSTTY_INPUT_BENCH=hot WAYSTTY_BENCH_COLS=80 ./zig-out/bin/waystty 2>/tmp/bench-hot.log || echo "exit $?"`
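Expected: if `pv` is on that `PATH`, the bench runs; otherwise it exits with code 2 and prints the diagnostic. `pv` usually installs to `/usr/bin`, so to exercise the failure path, uninstall it temporarily or point `PATH` at directories that contain `sh` but not `pv`.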
- [ ] **Step 3: Commit**
```bash
git add src/main.zig
git commit -m "bench: fail loudly if pv is missing for hot scenario"
```
### Task 4.5: Child teardown (SIGTERM → 100ms → SIGKILL → waitpid)
**Files:**
- Modify: `src/pty.zig`
- [ ] **Step 1: Add a `gracefulTeardown` method on Pty**
```zig
pub fn gracefulTeardown(self: *Pty) void {
    if (self.pid <= 0) return;
    // Mark the child as reaped on every exit path so a second call
    // (explicit teardown followed by deinit) is a no-op instead of
    // waiting on a pid that no longer exists.
    defer self.pid = -1;
    std.posix.kill(self.pid, std.posix.SIG.TERM) catch {};
    // Poll waitpid for up to 100 ms before escalating.
    var elapsed_ms: u32 = 0;
    while (elapsed_ms < 100) : (elapsed_ms += 10) {
        const res = std.posix.waitpid(self.pid, std.posix.W.NOHANG);
        if (res.pid != 0) return;
        std.Thread.sleep(10 * std.time.ns_per_ms);
    }
    std.posix.kill(self.pid, std.posix.SIG.KILL) catch {};
    _ = std.posix.waitpid(self.pid, 0);
}
```
(Cross-check exact `waitpid` / `WNOHANG` / `SIG.TERM` spellings against Zig stdlib.)
- [ ] **Step 2: Call from deinit or bench scenario switch**
Make `Pty.deinit` call `gracefulTeardown` before closing fds if the child is still running. For `both` scenario switching, expose a public method to call explicitly.
- [ ] **Step 3: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add src/pty.zig
git commit -m "bench: gracefulTeardown for PTY children (SIGTERM→grace→SIGKILL)"
```
### Task 4.6: Suppress `.key` events in bench mode
**Files:**
- Modify: `src/main.zig:372-399` (keyboard event loop)
- [ ] **Step 1: Gate the `.key` processing**
Replace the loop at `src/main.zig:374-398`:
```zig
for (keyboard.event_queue.items) |ev| {
if (ev.action == .release) continue;
if (bench_input_cfg != null) {
// Bench mode: drop real keyboard .key events so ambient typing
// can't perturb measurements. Modifiers/enter/leave/repeat state
// on the Keyboard struct still update via the listener callbacks.
continue;
}
// ... existing clipboard/paste/encode path
}
```
(Rewrap the remaining body unchanged under the `else` / after the continue.)
- [ ] **Step 2: Build and smoke-test**
Run: `zig build && WAYSTTY_INPUT_BENCH=cold ./zig-out/bin/waystty 2>/tmp/sm.log &`
Type in the window (focus must be on it); verify no characters appear. Kill.
- [ ] **Step 3: Commit**
```bash
git add src/main.zig
git commit -m "bench: drop real keyboard .key events in input-bench mode"
```
---
## Phase 5 — Sentinel allocator + injection
### Task 5.1: PUA sentinel allocator
**Files:**
- Modify: `src/bench_input.zig`
- [ ] **Step 1: Add SentinelAlloc**
Append to `src/bench_input.zig`:
```zig
pub const SentinelAlloc = struct {
const PUA_START: u21 = 0xE000;
const PUA_COUNT: u32 = 4096;
next: u32 = 0,
pub fn take(self: *SentinelAlloc) u21 {
const idx = self.next % PUA_COUNT;
self.next +%= 1;
return PUA_START + @as(u21, @intCast(idx));
}
};
/// Encode a codepoint as UTF-8 into `buf`. Returns the length written.
pub fn encodeCodepoint(cp: u21, buf: *[4]u8) u3 {
const n = std.unicode.utf8Encode(cp, buf) catch unreachable;
return @intCast(n);
}
test "SentinelAlloc rotates through 4096 PUA codepoints" {
var a: SentinelAlloc = .{};
const first = a.take();
try std.testing.expectEqual(@as(u21, 0xE000), first);
for (1..4096) |_| _ = a.take();
// Next should wrap to 0xE000 again
try std.testing.expectEqual(@as(u21, 0xE000), a.take());
}
test "encodeCodepoint produces valid 3-byte UTF-8 for PUA" {
var buf: [4]u8 = undefined;
const n = encodeCodepoint(0xE000, &buf);
try std.testing.expectEqual(@as(u3, 3), n);
try std.testing.expectEqual(@as(u8, 0xEE), buf[0]);
try std.testing.expectEqual(@as(u8, 0x80), buf[1]);
try std.testing.expectEqual(@as(u8, 0x80), buf[2]);
}
```
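The allocator hands out codepoints, but the driver later needs to know whether the active sentinel has reached the rendered grid. The sketch below shows one way that check could look. The flat, row-major `[]const u21` grid is an assumption made for illustration (the real terminal snapshot type is not shown in this plan), and `isSentinel` and `gridContains` are hypothetical names.

```zig
const std = @import("std");

const pua_start: u21 = 0xE000;
const pua_count: u21 = 4096;

/// True when `cp` lies in the sentinel range 0xE000..0xEFFF.
fn isSentinel(cp: u21) bool {
    return cp >= pua_start and cp < pua_start + pua_count;
}

/// Hypothetical grid scan: `cells` is assumed to be a flat, row-major
/// slice holding one codepoint per cell.
fn gridContains(cells: []const u21, sentinel: u21) bool {
    std.debug.assert(isSentinel(sentinel));
    for (cells) |c| {
        if (c == sentinel) return true;
    }
    return false;
}

test "gridContains finds an echoed sentinel" {
    const cells = [_]u21{ 'a', 'b', 0xE001, ' ' };
    try std.testing.expect(gridContains(&cells, 0xE001));
}
```

Because each sample uses a different codepoint until the 4096-entry range wraps, a stale sentinel left on screen from an earlier sample is unlikely to be mistaken for the active one.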
- [ ] **Step 2: Run tests**
Run: `zig build test 2>&1 | tail`
Expected: both tests PASS.
- [ ] **Step 3: Commit**
```bash
git add src/bench_input.zig
git commit -m "bench: SentinelAlloc and encodeCodepoint helpers"
```
### Task 5.2: Fabricate `KeyEvent` injector
**Files:**
- Modify: `src/bench_input.zig`
- Review: `src/wayland.zig` (Keyboard.KeyEvent type)
- [ ] **Step 1: Inspect KeyEvent type**
Run: `grep -n "pub const KeyEvent\|KeyEvent = struct\|action:.*\\.\\(press\\|release\\)\|utf8:" src/wayland.zig | head -10`
Note the exact field set — likely something like:
```zig
pub const KeyEvent = struct {
action: enum { press, release },
keysym: u32,
serial: u32,
utf8: [16]u8,
utf8_len: u8,
// ... possibly mods
};
```
- [ ] **Step 2: Add `injectSentinel` in bench_input.zig**
```zig
const wayland_client = @import("wayland-client");
pub fn injectSentinel(
keyboard: *wayland_client.Keyboard,
sentinel_cp: u21,
) !void {
var utf8: [16]u8 = @splat(0);
var enc: [4]u8 = undefined;
const n = encodeCodepoint(sentinel_cp, &enc);
@memcpy(utf8[0..n], enc[0..n]);
const ev = wayland_client.Keyboard.KeyEvent{
.action = .press,
.keysym = 0,
.serial = 0,
.utf8 = utf8,
.utf8_len = n,
};
try keyboard.event_queue.append(ev);
}
```
(Adjust field names to the exact struct — fill with zeros for any required fields not shown above.)
- [ ] **Step 3: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add src/bench_input.zig
git commit -m "bench: injectSentinel pushes fabricated KeyEvent onto queue"
```
---
## Phase 6 — Pair-on-arrival matching
### Task 6.1: `Sample` and `BenchDriver` skeletons
**Files:**
- Modify: `src/bench_input.zig`
- [ ] **Step 1: Add Sample + state**
```zig
pub const Sample = struct {
sentinel: u21,
t_inject_ns: u64,
injected_frame: u64,
grid_seen_frame: ?u64 = null,
presented_ns: ?u64 = null,
timed_out: bool = false,
pub fn complete(self: Sample) bool {
return self.timed_out or (self.grid_seen_frame != null and self.presented_ns != null);
}
pub fn latencyNs(self: Sample) ?u64 {
const p = self.presented_ns orelse return null;
return p - self.t_inject_ns;
}
};
pub const SampleBuffer = struct {
const cap = 2000; // 2 scenarios × 500 samples + headroom
items: [cap]Sample = undefined,
count: usize = 0,
pub fn push(self: *SampleBuffer, s: Sample) void {
if (self.count < cap) {
self.items[self.count] = s;
self.count += 1;
}
}
};
```
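`latencyNs` subtracts two unsigned timestamps, which traps in safe builds if the presentation time ever precedes the injection time, for instance when the two values come from different clocks. A minimal sketch of a checked variant follows; `checkedLatencyNs` is a hypothetical name, and returning `null` on a negative interval is an assumption about how such samples should be handled.

```zig
const std = @import("std");

/// Latency in nanoseconds, or null when the presentation timestamp precedes
/// the injection timestamp (which would point at mismatched clocks).
/// (Hypothetical helper, not part of the plan's Sample type.)
fn checkedLatencyNs(t_inject_ns: u64, presented_ns: u64) ?u64 {
    return std.math.sub(u64, presented_ns, t_inject_ns) catch null;
}

test "checkedLatencyNs computes a forward interval" {
    try std.testing.expectEqual(@as(?u64, 16_600), checkedLatencyNs(1_000, 17_600));
}
```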
- [ ] **Step 2: Build + test**
Run: `zig build test 2>&1 | tail -5`
Expected: no new test failures.
- [ ] **Step 3: Commit**
```bash
git add src/bench_input.zig
git commit -m "bench: Sample and SampleBuffer data types"
```
### Task 6.2: BenchDriver struct + tick entry points
**Files:**
- Modify: `src/bench_input.zig`
- [ ] **Step 1: Add driver**
```zig
pub const Phase = enum { idle, running, done };
pub const BenchDriver = struct {
cfg: Config,
alloc: std.mem.Allocator,
sentinels: SentinelAlloc = .{},
in_flight: ?Sample = null,
samples: SampleBuffer = .{},
pending_feedback: std.AutoArrayHashMapUnmanaged(u64, u64) = .{}, // frame_counter -> presented_ns
current_phase: Phase = .idle,
current_scenario: Scenario = .cold,
scenario_sample_count: u32 = 0,
early_timeouts: u32 = 0,
early_samples: u32 = 0,
pub fn init(alloc: std.mem.Allocator, cfg: Config) BenchDriver {
return .{
.cfg = cfg,
.alloc = alloc,
.current_scenario = switch (cfg.scenario) {
.cold, .both => .cold,
.hot => .hot,
},
};
}
pub fn deinit(self: *BenchDriver) void {
self.pending_feedback.deinit(self.alloc);
}
/// Called before processing keyboard events; decides whether to inject.
pub fn preTick(
self: *BenchDriver,
keyboard: *wayland_client.Keyboard,
frame_counter: u64,
) !void {
if (self.current_phase != .running) return;
if (self.in_flight != null) return;
const sentinel = self.sentinels.take();
try injectSentinel(keyboard, sentinel);
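// Timestamp on CLOCK_MONOTONIC so it is comparable with presentation
// timestamps. Assumption: the compositor advertises CLOCK_MONOTONIC via
// wp_presentation.clock_id. std.time.Instant reads CLOCK_BOOTTIME on Linux
// and its `timestamp` is a timespec, not an integer.
const ts = try std.posix.clock_gettime(std.posix.CLOCK.MONOTONIC);
self.in_flight = .{
.sentinel = sentinel,
.t_inject_ns = @as(u64, @intCast(ts.sec)) * std.time.ns_per_s + @as(u64, @intCast(ts.nsec)),
.injected_frame = frame_counter,
};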
}
/// Called after term.snapshot — scan the grid for the active sentinel.
pub fn postFrameGridScan(
self: *BenchDriver,
frame_counter: u64,
grid_contains_sentinel: bool,
) void {
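if (self.in_flight == null) return;
const s = &self.in_flight.?;
if (s.grid_seen_frame == null and grid_contains_sentinel) {
s.grid_seen_frame = frame_counter;
// Feedback for this frame may already be queued (feedback-first order).
self.tryFinalize();
return;
}
// The frame budget also covers a sample whose grid frame was seen but whose
// feedback never arrives (for example, because it was discarded).
if (frame_counter - s.injected_frame >= self.cfg.max_frames_per_sample) {
s.timed_out = true;
self.finalizeSample();
}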
}
/// Called from the presentation feedback callback.
pub fn recordPresented(self: *BenchDriver, frame_counter: u64, presented_ns: u64) void {
_ = self.pending_feedback.put(self.alloc, frame_counter, presented_ns) catch return;
self.tryFinalize();
}
/// Called on presentation-feedback discarded event (no-op; we simply keep waiting).
pub fn recordDiscarded(self: *BenchDriver, frame_counter: u64) void {
_ = self;
_ = frame_counter;
}
fn tryFinalize(self: *BenchDriver) void {
if (self.in_flight == null) return;
const s = self.in_flight.?;
const gsf = s.grid_seen_frame orelse return;
const p = self.pending_feedback.get(gsf) orelse return;
self.in_flight.?.presented_ns = p;
self.finalizeSample();
}
fn finalizeSample(self: *BenchDriver) void {
const sample = self.in_flight.?;
self.in_flight = null;
self.samples.push(sample);
self.scenario_sample_count += 1;
// WSI fallback detection: if >10% of first 50 time out, abort.
if (self.early_samples < 50) {
self.early_samples += 1;
if (sample.timed_out) self.early_timeouts += 1;
if (self.early_samples == 50 and self.early_timeouts > 5) {
std.debug.print(
"waystty input-bench: {d}/50 early samples timed out. " ++
"Likely wp_presentation.feedback commit race with Mesa WSI. " ++
"Investigate VK_KHR_present_wait as an alternative.\n",
.{self.early_timeouts},
);
std.process.exit(3);
}
}
if (self.scenario_sample_count >= self.cfg.samples_per_scenario) {
self.advanceScenario();
}
}
fn advanceScenario(self: *BenchDriver) void {
if (self.cfg.scenario == .both and self.current_scenario == .cold) {
self.current_scenario = .hot;
self.scenario_sample_count = 0;
// main.zig's scenario sequencer will respawn the child
self.current_phase = .idle; // pauses until sequencer re-arms
} else {
self.current_phase = .done;
}
}
pub fn start(self: *BenchDriver) void {
self.current_phase = .running;
}
pub fn finished(self: *const BenchDriver) bool {
return self.current_phase == .done;
}
};
```
- [ ] **Step 2: Build**
Run: `zig build`
Expected: PASS. If the build fails on a std API mismatch (for example, the timestamp call in `preTick`), adjust that call to the installed Zig version.
- [ ] **Step 3: Commit**
```bash
git add src/bench_input.zig
git commit -m "bench: BenchDriver skeleton with pre/post-tick entry points"
```
### Task 6.3: Populate pre-present hook to request feedback
**Files:**
- Modify: `src/main.zig`
- [ ] **Step 1: Replace the placeholder `benchPrePresentHook`**
```zig
fn benchPrePresentHook(opaque_ctx: ?*anyopaque) void {
const driver: *bench_input.BenchDriver = @ptrCast(@alignCast(opaque_ctx orelse return));
if (driver.current_phase != .running) return;
// Request feedback on the upcoming commit, associate with the *next* frame_counter
// (the one about to be rendered).
const fc = driver.next_expected_frame orelse return;
const feedback_ctx = blk: {
const ctx = alloc_g.create(FeedbackCtx) catch return;
ctx.* = .{ .driver = driver, .frame_counter = fc };
break :blk ctx;
};
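_ = wayland_client.PresentationFeedback.init(
globals_g.wp_presentation.?,
surface_g,
feedback_ctx,
&onPresentationFeedback,
) catch {
// Without a feedback object the callback never fires, so free the context here.
alloc_g.destroy(feedback_ctx);
return;
};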
}
const FeedbackCtx = struct { driver: *bench_input.BenchDriver, frame_counter: u64 };
fn onPresentationFeedback(
opaque_ctx: ?*anyopaque,
ev: wayland_client.PresentationFeedback.Event,
) void {
const ctx: *FeedbackCtx = @ptrCast(@alignCast(opaque_ctx orelse return));
defer alloc_g.destroy(ctx);
switch (ev) {
.presented => |p| {
const ns: u64 = p.tv_sec * std.time.ns_per_s + p.tv_nsec;
ctx.driver.recordPresented(ctx.frame_counter, ns);
},
.discarded => ctx.driver.recordDiscarded(ctx.frame_counter),
}
}
```
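The `.presented` branch above assumes the project's `PresentationFeedback.Event` wrapper already exposes a combined `tv_sec`. On the wire, `wp_presentation_feedback.presented` carries the seconds as two 32-bit halves (`tv_sec_hi`, `tv_sec_lo`) plus `tv_nsec`. If the wrapper passes those through unchanged, the conversion could look like this sketch; `presentedNs` is a hypothetical helper name.

```zig
const std = @import("std");

/// Combine the split timestamp carried by `wp_presentation_feedback.presented`
/// (tv_sec_hi, tv_sec_lo, tv_nsec) into nanoseconds.
fn presentedNs(tv_sec_hi: u32, tv_sec_lo: u32, tv_nsec: u32) u64 {
    const sec: u64 = (@as(u64, tv_sec_hi) << 32) | tv_sec_lo;
    return sec * std.time.ns_per_s + tv_nsec;
}

test "presentedNs combines the split seconds field" {
    try std.testing.expectEqual(@as(u64, 2_000_000_500), presentedNs(0, 2, 500));
    try std.testing.expectEqual(
        (@as(u64, 1) << 32) * std.time.ns_per_s,
        presentedNs(1, 0, 0),
    );
}
```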
(`alloc_g`, `globals_g`, `surface_g` are file-scoped globals initialized in `main()`; declare them at the top of `main.zig` and assign during init. If the codebase prefers to avoid globals, thread the context through the hook's opaque pointer as a `struct { driver, alloc, globals, surface }` instead.)
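As a rough illustration of the globals-free alternative, the sketch below bundles the hook's dependencies into one struct and recovers it from the opaque pointer. `Driver` and `HookCtx` are hypothetical stand-ins; the real context would also carry the `wp_presentation` global and the surface.

```zig
const std = @import("std");

// Hypothetical stand-in for bench_input.BenchDriver; only the field used here.
const Driver = struct { hooks_run: u32 = 0 };

/// Everything the hook needs, bundled so no file-scoped globals are required.
const HookCtx = struct {
    driver: *Driver,
    alloc: std.mem.Allocator,
};

/// Same shape as `benchPrePresentHook`: recover the typed context from the
/// opaque pointer and bail out if none was registered.
fn prePresentHook(opaque_ctx: ?*anyopaque) void {
    const ctx: *HookCtx = @ptrCast(@alignCast(opaque_ctx orelse return));
    ctx.driver.hooks_run += 1;
}

test "hook recovers its context from the opaque pointer" {
    var driver: Driver = .{};
    var ctx: HookCtx = .{ .driver = &driver, .alloc = std.testing.allocator };
    prePresentHook(&ctx);
    prePresentHook(null); // no context registered: the hook is a no-op
    try std.testing.expectEqual(@as(u32, 1), driver.hooks_run);
}
```

The context must outlive every hook invocation, so in `main()` it would live next to the driver storage rather than on a short-lived stack frame.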
- [ ] **Step 2: Add `next_expected_frame` field and increment logic to `BenchDriver`**
In `src/bench_input.zig`:
```zig
next_expected_frame: ?u64 = null,
```
In `preTick`, after injecting, set:
```zig
self.next_expected_frame = frame_counter; // this frame is the candidate
```
After each frame renders, the main loop calls the following to bump the counter:
```zig
pub fn notifyFramePresented(self: *BenchDriver, frame_counter: u64) void {
self.next_expected_frame = frame_counter + 1;
}
```
- [ ] **Step 3: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add src/main.zig src/bench_input.zig
git commit -m "bench: wire pre-present hook to wp_presentation_feedback"
```
### Task 6.4: Grid scan after each frame
**Files:**
- Modify: `src/main.zig` (around the `term.snapshot` call site)
- [ ] **Step 1: Find snapshot site**
Run: `grep -n "term.snapshot" src/main.zig`
- [ ] **Step 2: After snapshot, if in bench mode, scan for sentinel**
Within the render branch, immediately after the snapshot is taken:
```zig
if (bench_driver_ptr) |drv| {
if (drv.in_flight) |s| {
const found = gridContainsCodepoint(&snapshot_view, s.sentinel);
drv.postFrameGridScan(frame_counter, found);
}
}
```
Add a helper (place near other snapshot utilities):
```zig
fn gridContainsCodepoint(snap: *const vt.Snapshot, cp: u21) bool {
// Iterate every visible cell and compare codepoint. Implementation depends
// on the Snapshot API — use the existing row-iteration pattern.
for (snap.rows) |row| {
for (row.cells) |cell| {
if (cell.codepoint == cp) return true;
}
}
return false;
}
```
(If `Snapshot.rows[*].cells[*].codepoint` has a different shape, adapt to match. Grep for `.snapshot()` in vt.zig and follow the existing walking pattern.)
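For example, if the snapshot turned out to store cells in one flat, row-major slice rather than per-row slices, the scan might be adapted along these lines. `FlatSnapshot` and `Cell` are hypothetical shapes used only for illustration; the real layout is whatever `vt.zig` defines.

```zig
const std = @import("std");

// Hypothetical snapshot shape: one flat, row-major cell slice plus dimensions.
const Cell = struct { codepoint: u21 };
const FlatSnapshot = struct {
    cols: usize,
    rows: usize,
    cells: []const Cell,
};

fn flatGridContainsCodepoint(snap: *const FlatSnapshot, cp: u21) bool {
    // Only the visible cols * rows prefix is scanned.
    for (snap.cells[0 .. snap.cols * snap.rows]) |cell| {
        if (cell.codepoint == cp) return true;
    }
    return false;
}

test "flat scan finds a PUA sentinel" {
    const cells = [_]Cell{
        .{ .codepoint = 'a' },
        .{ .codepoint = 0xE000 },
        .{ .codepoint = ' ' },
        .{ .codepoint = 'b' },
    };
    const snap: FlatSnapshot = .{ .cols = 2, .rows = 2, .cells = &cells };
    try std.testing.expect(flatGridContainsCodepoint(&snap, 0xE000));
    try std.testing.expect(!flatGridContainsCodepoint(&snap, 0xE001));
}
```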
- [ ] **Step 3: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add src/main.zig
git commit -m "bench: scan rendered frame grid for sentinel codepoint"
```
### Task 6.5: Pre-tick injection wired from main loop
**Files:**
- Modify: `src/main.zig` (keyboard events block)
- [ ] **Step 1: Call `driver.preTick` at the top of each main-loop iteration**
Immediately before the existing `keyboard.tickRepeat()` call (around `src/main.zig:373`):
```zig
if (bench_driver_ptr) |drv| {
try drv.preTick(&keyboard, frame_counter);
}
```
- [ ] **Step 2: Declare & initialize `bench_driver_ptr`**
Near the other `var` declarations before the main loop:
```zig
var bench_driver_storage: ?bench_input.BenchDriver = if (bench_input_cfg) |cfg|
bench_input.BenchDriver.init(alloc, cfg)
else
null;
defer if (bench_driver_storage) |*d| d.deinit();
const bench_driver_ptr: ?*bench_input.BenchDriver = if (bench_driver_storage) |*d| d else null;
if (bench_driver_ptr) |d| d.start();
```
- [ ] **Step 3: Build and run cold smoke**
Run: `zig build && WAYSTTY_INPUT_BENCH=cold ./zig-out/bin/waystty 2>/tmp/bench-in.log`
Expected, in a floating window: the bench runs, injects sentinels, and eventually prints "done" and exits. If it hangs or exits with status 3, check `/tmp/bench-in.log` for the "early samples timed out" diagnostic printed by `finalizeSample`.
- [ ] **Step 4: Commit**
```bash
git add src/main.zig
git commit -m "bench: inject sentinels per iteration from BenchDriver.preTick"
```
### Task 6.6: Unit test the pair-on-arrival state machine
**Files:**
- Modify: `src/bench_input.zig`
- [ ] **Step 1: Add tests**
```zig
test "BenchDriver completes sample: grid first, then feedback" {
const cfg: Config = .{ .scenario = .cold, .samples_per_scenario = 1 };
var d: BenchDriver = .init(std.testing.allocator, cfg);
defer d.deinit();
d.start();
d.in_flight = .{ .sentinel = 0xE000, .t_inject_ns = 1000, .injected_frame = 10 };
d.postFrameGridScan(10, true);
try std.testing.expect(d.in_flight != null);
d.recordPresented(10, 5000);
try std.testing.expectEqual(@as(usize, 1), d.samples.count);
try std.testing.expectEqual(@as(u64, 4000), d.samples.items[0].latencyNs().?);
}
test "BenchDriver completes sample: feedback first, then grid" {
const cfg: Config = .{ .scenario = .cold, .samples_per_scenario = 1 };
var d: BenchDriver = .init(std.testing.allocator, cfg);
defer d.deinit();
d.start();
d.in_flight = .{ .sentinel = 0xE000, .t_inject_ns = 1000, .injected_frame = 10 };
d.recordPresented(11, 5500);
d.postFrameGridScan(11, true);
try std.testing.expectEqual(@as(usize, 1), d.samples.count);
try std.testing.expectEqual(@as(u64, 4500), d.samples.items[0].latencyNs().?);
}
test "BenchDriver times out after max_frames_per_sample" {
const cfg: Config = .{ .scenario = .cold, .samples_per_scenario = 1, .max_frames_per_sample = 5 };
var d: BenchDriver = .init(std.testing.allocator, cfg);
defer d.deinit();
d.start();
d.in_flight = .{ .sentinel = 0xE000, .t_inject_ns = 1000, .injected_frame = 10 };
var f: u64 = 10;
while (f <= 15) : (f += 1) d.postFrameGridScan(f, false);
try std.testing.expectEqual(@as(usize, 1), d.samples.count);
try std.testing.expect(d.samples.items[0].timed_out);
}
```
- [ ] **Step 2: Run**
Run: `zig build test 2>&1 | tail -10`
Expected: all three new tests PASS.
- [ ] **Step 3: Commit**
```bash
git add src/bench_input.zig
git commit -m "bench: test pair-on-arrival state machine and timeout"
```
---
## Phase 7 — Scenario sequencer + Makefile target
### Task 7.1: Restart PTY child between cold and hot
**Files:**
- Modify: `src/main.zig`
- [ ] **Step 1: Detect scenario completion + respawn**
Inside the main loop, after calling `drv.postFrameGridScan` or near the end of the loop body:
```zig
if (bench_driver_ptr) |drv| {
if (drv.current_phase == .idle and drv.cfg.scenario == .both and drv.current_scenario == .hot) {
// Transition cold -> hot: teardown existing child, spawn the hot one.
p.gracefulTeardown();
p.deinit();
assertPvAvailable(alloc);
p = try pty.Pty.spawn(.{
.cols = cols,
.rows = rows,
.shell = shell_plan.shell,
// `{1..500}` is brace expansion (bash/zsh); a plain POSIX `sh` passes it through literally.
.shell_args = &.{ "-c", "yes \"$(printf 'x%.0s' {1..500})\" | pv -qL 24K" },
});
try pty.Pty.ensureEcho(p.slave_fd);
term.setWritePtyCallback(&p, &writePtyFromTerminal);
drv.start();
}
if (drv.finished()) {
// Print stats and exit — see Task 8.1
bench_input.printStats(drv, cols, rows);
return;
}
}
```
- [ ] **Step 2: Build**
Run: `zig build`
Expected: PASS.
- [ ] **Step 3: Commit**
```bash
git add src/main.zig
git commit -m "bench: restart PTY child between cold and hot scenarios"
```
### Task 7.2: `bench-input` Makefile target
**Files:**
- Modify: `Makefile`
- [ ] **Step 1: Add target**
After the `bench` target, add:
```makefile
# Expected runtime: ~15s cold + ~25s hot = ~40s total
# Requires: pv (for hot-mode rate limiting)
bench-input:
	$(ZIG) build -Doptimize=$(OPT)
	WAYSTTY_INPUT_BENCH=both ./zig-out/bin/waystty 2>bench-input.log || true
	@echo "--- input latency ---"
	@grep -A 20 "waystty input latency" bench-input.log || echo "(no timing data found)"
```
Also append `bench-input` to the `.PHONY` line.
- [ ] **Step 2: Commit**
```bash
git add Makefile
git commit -m "bench: add bench-input Makefile target"
```
---
## Phase 8 — Output
### Task 8.1: Print stats with grid header + per-scenario rows
**Files:**
- Modify: `src/bench_input.zig`
- [ ] **Step 1: Add `printStats`**
```zig
pub fn printStats(drv: *const BenchDriver, cols: u16, rows: u16) void {
// Split samples by scenario. With current design, cold samples are pushed
// first, then hot — track via scenario transition.
// For simplicity in v1, we tag samples with their scenario at push time:
// TODO: add `scenario` field to Sample — see Task 8.1 step 2.
_ = drv;
_ = cols;
_ = rows;
}
```
**Note:** the `Sample` struct defined in Task 6.1 does not carry a scenario, so `printStats` cannot split samples yet. Fix that before proceeding:
- [ ] **Step 2: Add `scenario: Scenario` to Sample**
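Edit Task 6.1's `Sample` struct. The new field has no default, so every `Sample` literal must set it, including the `d.in_flight = .{ ... }` literals in the Task 6.6 tests (add `.scenario = .cold` there):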
```zig
pub const Sample = struct {
scenario: Scenario,
sentinel: u21,
t_inject_ns: u64,
injected_frame: u64,
grid_seen_frame: ?u64 = null,
presented_ns: ?u64 = null,
timed_out: bool = false,
// ... rest unchanged
};
```
In `preTick`, when constructing the in-flight sample:
```zig
self.in_flight = .{
.scenario = self.current_scenario,
// ... existing
};
```
- [ ] **Step 3: Implement printStats**
```zig
pub fn printStats(drv: *const BenchDriver, cols: u16, rows: u16) void {
var cold_buf: [2000]u64 = undefined;
var hot_buf: [2000]u64 = undefined;
var cold_to: u32 = 0;
var hot_to: u32 = 0;
var cold_n: usize = 0;
var hot_n: usize = 0;
for (drv.samples.items[0..drv.samples.count]) |s| {
if (s.timed_out) {
switch (s.scenario) {
.cold => cold_to += 1,
.hot => hot_to += 1,
.both => unreachable,
}
continue;
}
const lat = s.latencyNs() orelse continue;
switch (s.scenario) {
.cold => { cold_buf[cold_n] = lat; cold_n += 1; },
.hot => { hot_buf[hot_n] = lat; hot_n += 1; },
.both => unreachable,
}
}
std.debug.print(
"\n=== waystty input latency ({d} cold, {d} hot, {d}x{d} grid) ===\n",
.{ cold_n, hot_n, cols, rows },
);
std.debug.print("{s:<10}{s:>8}{s:>8}{s:>8}{s:>8}{s:>8} (us) timeouts\n",
.{ "scenario", "min", "avg", "p50", "p99", "max" });
printRow("cold", cold_buf[0..cold_n], cold_to);
printRow("hot", hot_buf[0..hot_n], hot_to);
}
fn printRow(label: []const u8, vals: []u64, timeouts: u32) void {
if (vals.len == 0) {
std.debug.print("{s:<10} (no samples) timeouts {d}\n", .{ label, timeouts });
return;
}
std.mem.sort(u64, vals, {}, std.sort.asc(u64));
var sum: u128 = 0;
for (vals) |v| sum += v;
const avg = @as(u64, @intCast(sum / vals.len));
const p50_idx = vals.len / 2;
const p99_idx = (vals.len * 99) / 100;
std.debug.print(
"{s:<10}{d:>8}{d:>8}{d:>8}{d:>8}{d:>8} {d}\n",
.{ label, vals[0] / 1000, avg / 1000, vals[p50_idx] / 1000, vals[p99_idx] / 1000, vals[vals.len - 1] / 1000, timeouts },
);
}
```
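To make the percentile arithmetic in `printRow` concrete, the sketch below extracts the index rule into a hypothetical `percentileIndex` helper and evaluates it for a 500-sample scenario. With 500 sorted samples the p99 index is 495, which leaves four samples above it.

```zig
const std = @import("std");

/// Index rule used by `printRow`, extracted for illustration.
fn percentileIndex(len: usize, pct: usize) usize {
    return (len * pct) / 100;
}

test "percentile indices for 500 sorted samples" {
    try std.testing.expectEqual(@as(usize, 250), percentileIndex(500, 50));
    try std.testing.expectEqual(@as(usize, 495), percentileIndex(500, 99));
    // The index stays in bounds even for a single sample.
    try std.testing.expectEqual(@as(usize, 0), percentileIndex(1, 99));
}
```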
- [ ] **Step 4: Build and run full cycle**
Run: `zig build && WAYSTTY_INPUT_BENCH=both ./zig-out/bin/waystty 2>/tmp/in-full.log; tail -20 /tmp/in-full.log`
Expected: output matches the spec's sample stats block.
- [ ] **Step 5: Commit**
```bash
git add src/bench_input.zig
git commit -m "bench: printStats for input-latency with per-scenario rows"
```
### Task 8.2 (post-headline): Per-stage breakdown for p99 samples
**Files:**
- Modify: `src/bench_input.zig`
- [ ] **Step 1: Join samples with `FrameTiming`**
Add to `bench_input.zig`:
```zig
pub fn printP99Breakdown(
drv: *const BenchDriver,
ring: *const bench_stats.FrameTimingRing,
scenario: Scenario,
) void {
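// Pick the p99 latency with the same index rule as printRow, then locate
// a sample that carries it.
var lats: [2000]u64 = undefined; // same capacity as SampleBuffer
var n: usize = 0;
for (drv.samples.items[0..drv.samples.count]) |s| {
if (s.scenario != scenario) continue;
const lat = s.latencyNs() orelse continue;
lats[n] = lat;
n += 1;
}
if (n == 0) return;
std.mem.sort(u64, lats[0..n], {}, std.sort.asc(u64));
const best_lat = lats[(n * 99) / 100];
var best_idx: ?usize = null;
for (drv.samples.items[0..drv.samples.count], 0..) |s, i| {
if (s.scenario != scenario) continue;
const lat = s.latencyNs() orelse continue;
if (lat == best_lat) { best_idx = i; break; }
}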
const idx = best_idx orelse return;
const sample = drv.samples.items[idx];
const frame = sample.grid_seen_frame orelse return;
// Find timing entry with that frame_counter
var ordered: [bench_stats.FrameTimingRing.capacity]bench_stats.FrameTiming = undefined;
const entries = ring.orderedSlice(&ordered);
for (entries) |ft| {
if (ft.frame_counter == frame) {
std.debug.print(
"\np99 {s} breakdown (latency {d}us, frame {d}):\n" ++
" snapshot {d}, row_rebuild {d}, atlas_upload {d}, instance_upload {d}, gpu_submit {d}\n",
.{ @tagName(scenario), best_lat / 1000, frame,
ft.snapshot_us, ft.row_rebuild_us, ft.atlas_upload_us,
ft.instance_upload_us, ft.gpu_submit_us },
);
return;
}
}
std.debug.print("(p99 frame {d} already evicted from timing ring)\n", .{ frame });
}
```
- [ ] **Step 2: Call from main after `printStats`**
```zig
bench_input.printStats(drv, cols, rows);
bench_input.printP99Breakdown(drv, &frame_ring, .cold);
bench_input.printP99Breakdown(drv, &frame_ring, .hot);
```
- [ ] **Step 3: Build and run**
Run: `zig build && WAYSTTY_INPUT_BENCH=both ./zig-out/bin/waystty 2>/tmp/in.log; tail -25 /tmp/in.log`
Expected: breakdown lines appear after the main table.
- [ ] **Step 4: Commit**
```bash
git add src/bench_input.zig src/main.zig
git commit -m "bench: p99 per-stage breakdown joined on frame_counter"
```
---
## Phase 9 — Final smoke test + docs
### Task 9.1: End-to-end smoke on floating window
**Files:** none modified.
- [ ] **Step 1: Launch a floating waystty on sway**
```bash
# Ensure sway config has rule: `for_window [app_id="waystty"] floating enable`
zig build -Doptimize=ReleaseFast
WAYSTTY_INPUT_BENCH=both ./zig-out/bin/waystty 2>bench-input.log
```
Expected: runs for ~40s, prints stats with grid=80×24, cold < hot, both with low timeouts (< 5%).
- [ ] **Step 2: Record baseline numbers**
Record the output in a freeform note, or at least sanity-check it: cold p50 should be on the order of one refresh interval (~16.7 ms at 60 Hz), and hot p99 should exceed cold p99.
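The refresh-interval comparison can be written down as a quick check. This is only a sketch: `withinRefreshIntervals` and its two-interval bound are illustrative assumptions, not thresholds from the spec.

```zig
const std = @import("std");

/// Refresh interval in nanoseconds for a display running at `hz`.
fn refreshIntervalNs(hz: u64) u64 {
    return std.time.ns_per_s / hz;
}

/// Hypothetical sanity check: is a latency within `max_intervals` refresh
/// intervals? The bound is an illustration, not a value from the spec.
fn withinRefreshIntervals(latency_ns: u64, hz: u64, max_intervals: u64) bool {
    return latency_ns <= refreshIntervalNs(hz) * max_intervals;
}

test "60 Hz refresh interval is about 16.7 ms" {
    try std.testing.expectEqual(@as(u64, 16_666_666), refreshIntervalNs(60));
    try std.testing.expect(withinRefreshIntervals(20 * std.time.ns_per_ms, 60, 2));
    try std.testing.expect(!withinRefreshIntervals(40 * std.time.ns_per_ms, 60, 2));
}
```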
### Task 9.2: Run full test suite
- [ ] **Step 1: All tests pass**
Run: `zig build test 2>&1 | tail -30`
Expected: every test passes.
- [ ] **Step 2: Existing bench still works**
Run: `make bench`
Expected: output includes the new grid-size line; no regression in numbers.
- [ ] **Step 3: Commit any final adjustments**
```bash
git add -u
git commit -m "bench: final polish"
```
---
## Self-review notes
**Spec coverage:**
- Goal / cold + hot metrics → Tasks 4.3, 7.1, 8.1.
- `wp_presentation_time` endpoint → Phase 3 + Task 6.3.
- In-process KeyEvent injection → Task 5.2.
- Echo-gated closed loop → Task 6.2 (in_flight guard).
- MAILBOX preserved → no change made (default stays).
- PUA sentinels → Task 5.1.
- Fixed grid (shared) → Phase 1.
- Termios ECHO → Task 4.2.
- Child teardown → Task 4.5.
- WSI fallback → Task 6.2 (finalizeSample's early_timeouts check).
- Frame-counter correlation → Phase 2 + Task 8.2.
- Pair-on-arrival + discarded handling → Tasks 6.1, 6.2, 6.3 (driver doesn't advance on discarded; keeps listening).
**Dependencies between tasks:** Phase 3 blocks Task 6.3. Phase 4 depends on Phase 1 (for the env vars). All other tasks run in linear order.
**Compositor compatibility:** The whole bench assumes a compositor that (a) honors `xdg_toplevel` size hints for floating surfaces, and (b) implements `wp_presentation_time`. sway does both. Compositors that don't will fail at Task 1.3 (size mismatch) or Task 3.4 (global missing) — both with clear diagnostics.