8ac98377
docs: add exfiltration detection design spec
a73x 2026-03-29 16:00
diff --git a/docs/superpowers/specs/2026-03-29-exfil-detection-design.md b/docs/superpowers/specs/2026-03-29-exfil-detection-design.md
new file mode 100644
index 0000000..bc8c27e
--- /dev/null
+++ b/docs/superpowers/specs/2026-03-29-exfil-detection-design.md
@@ -0,0 +1,106 @@
# Exfiltration Detection for nono-proxy

## Problem

The nono proxy currently does host-based allowlisting only. A sandboxed process can still exfiltrate sensitive data (SSH keys, passwords, API tokens) to an approved host. We need content inspection to detect and block this.

## Approach

Inline MITM in the existing proxy. Extend the single `nono-proxy` binary to do TLS interception on CONNECT requests, enabling full request body scanning for both HTTP and HTTPS traffic.

## CA Certificate Management

On first startup, `nono-proxy` checks for `~/.local/share/nono/ca.key` and `~/.local/share/nono/ca.pem`. If missing, it generates:

- An ECDSA P-256 CA private key
- A self-signed CA certificate ("Nono Proxy CA", 10-year validity)

Saved to disk and reused on subsequent runs.

Per-host leaf certificates are generated on-the-fly at CONNECT time, signed by this CA, and cached in-memory (keyed by hostname) for the process lifetime.

The `nono` wrapper script is updated to:

- Bind-mount `ca.pem` into the sandbox (read-only)
- Set `SSL_CERT_FILE` and `NODE_EXTRA_CA_CERTS` so tools inside the sandbox trust it

## MITM CONNECT Handling

The current `handleConnect` does a blind TCP tunnel. The new flow:

1. Hijack the client connection, send `200 Connection Established`
2. Generate (or fetch from cache) a leaf cert for the requested hostname, signed by the nono CA
3. Wrap the client connection in a `tls.Server` using that leaf cert
4. Establish a real `tls.Client` connection to the target host
5. Read HTTP requests from the client-side TLS connection, run them through the scanner, and if clean, forward to the target
6. Relay the response back to the client

For non-HTTP protocols over CONNECT (e.g.
WebSockets upgrade after initial HTTP), forward the upgraded connection as a raw tunnel after the initial request passes scanning.

## Request Body Scanner

A `scanner` package with a `Scan(body []byte) []Finding` function. Each `Finding` has a `Rule` name and a `Match` snippet (truncated for logging, not the full secret).

### Default Rules

| Rule | Pattern |
|------|---------|
| `ssh-private-key` | `-----BEGIN (OPENSSH\|RSA\|DSA\|EC\|ED25519) PRIVATE KEY-----` |
| `pgp-private-key` | `-----BEGIN PGP PRIVATE KEY BLOCK-----` |
| `basic-auth` | `Authorization: Basic` header |
| `bearer-token` | `Authorization: Bearer` header |
| `aws-access-key` | `AKIA[0-9A-Z]{16}` |
| `github-token` | `gh[ps]_[A-Za-z0-9_]{36,}` |
| `openai-key` | `sk-[A-Za-z0-9]{32,}` |
| `password-field` | `password=` or `"password":` in body |
| `env-file` | 3+ consecutive lines matching `[A-Z_]+=.+` |

### Configurable Rules

Rules are loaded from `~/.local/share/nono/rules.yaml`. On first run, `nono-proxy` writes a default file with the built-in rules if one doesn't exist. Users can add, remove, or modify rules.

Format:

```yaml
rules:
  - name: ssh-private-key
    pattern: "-----BEGIN (OPENSSH|RSA|DSA|EC|ED25519) PRIVATE KEY-----"
  - name: github-token
    pattern: "gh[ps]_[A-Za-z0-9_]{36,}"
```

Each rule is a name + regex pattern. The scanner compiles them at startup and returns an error if any pattern is invalid.
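
A minimal sketch of how the scanner could compile rules and report findings, assuming rules arrive as an in-memory slice (YAML loading via `gopkg.in/yaml.v3` is left out to keep the example stdlib-only). `Rule`, `Finding`, `Scanner`, and `Scan` come from this spec; `NewScanner`, `compiledRule`, and the `maxSnippet` length are hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Rule mirrors one entry in rules.yaml: a name plus a regex pattern.
type Rule struct {
	Name    string
	Pattern string
}

// Finding names the rule that matched and carries a truncated snippet,
// so logs never contain the full secret.
type Finding struct {
	Rule  string
	Match string
}

// compiledRule is a hypothetical internal type pairing a name with its regex.
type compiledRule struct {
	name string
	re   *regexp.Regexp
}

// Scanner holds the compiled rule set.
type Scanner struct {
	rules []compiledRule
}

// maxSnippet is an assumed truncation length; the spec does not fix one.
const maxSnippet = 12

// NewScanner (hypothetical constructor) compiles every rule up front and
// fails on the first invalid pattern, as the spec requires at startup.
func NewScanner(rules []Rule) (*Scanner, error) {
	s := &Scanner{}
	for _, r := range rules {
		re, err := regexp.Compile(r.Pattern)
		if err != nil {
			return nil, fmt.Errorf("rule %q: %w", r.Name, err)
		}
		s.rules = append(s.rules, compiledRule{name: r.Name, re: re})
	}
	return s, nil
}

// Scan returns at most one Finding per rule: the first match, truncated.
func (s *Scanner) Scan(body []byte) []Finding {
	var findings []Finding
	for _, r := range s.rules {
		m := r.re.Find(body)
		if m == nil {
			continue
		}
		if len(m) > maxSnippet {
			m = m[:maxSnippet]
		}
		findings = append(findings, Finding{Rule: r.name, Match: string(m)})
	}
	return findings
}

func main() {
	s, err := NewScanner([]Rule{
		{Name: "aws-access-key", Pattern: `AKIA[0-9A-Z]{16}`},
		{Name: "github-token", Pattern: `gh[ps]_[A-Za-z0-9_]{36,}`},
	})
	if err != nil {
		panic(err)
	}
	for _, f := range s.Scan([]byte("key=AKIAABCDEFGHIJKLMNOP")) {
		fmt.Printf("%s: %s...\n", f.Rule, f.Match)
	}
}
```

Go's `regexp` package uses RE2 syntax, so user-supplied patterns that rely on backreferences or lookaheads would be rejected at compile time, which surfaces as the startup error described above.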

### Behavior

- Scans outbound request bodies only (not responses)
- Reads the full request body via `io.ReadAll`, scans, and if clean replays via `bytes.Reader`
- On match: logs `BLOCKED <method> <host> [rule1, rule2]`, returns 403 with message like `"request blocked: contains sensitive data (ssh-private-key)"`

## Request Flow

```
Client in sandbox
  -> plain HTTP or CONNECT to nono-proxy (port 9854)
  -> host allowlist check (existing logic, unchanged)
  -> if CONNECT: MITM TLS termination, read inner HTTP request
  -> read request body, run scanner rules
  -> if findings: log BLOCKED, return 403
  -> if clean: forward to target, relay response
```

## Code Changes

- `proxy/proxy.go`: `Proxy` struct gains `caKey`/`caCert` fields and `certCache map[string]*tls.Certificate`. `New()` takes a CA path in addition to the hosts file. `handleConnect` replaced with MITM flow. `handleHTTP` gets scanner check before forwarding.
- `scanner/`: new package with `Rule`, `Finding`, `Scanner` (loads rules from YAML, compiles regexes, exposes `Scan([]byte) []Finding`)
- `ca/`: new package with `LoadOrCreateCA(dir string)` and `GenerateLeafCert(host string, ca)` functions
- `cmd/nono-proxy/main.go`: loads CA at startup, passes to `proxy.New()`, writes default `rules.yaml` if missing
- `nono` script: adds `--ro-bind` for `ca.pem`, sets `SSL_CERT_FILE` and `NODE_EXTRA_CA_CERTS`

## New Dependencies

- `gopkg.in/yaml.v3` for rules config
- Everything else is stdlib (`crypto/x509`, `crypto/tls`, `crypto/ecdsa`)

## Scan Direction

- Request bodies only (outbound exfiltration detection)
- No size cap: large uploads are themselves suspicious