# Phase 5 Runbook (Session Reuse Prototype)
This runbook describes how to start a minimal `k_server` + `k_proxy` prototype for session reuse testing.

Last updated: 2026-04-25

Related browser demo:
- `k_client_portal.py` can now be used in `k_client` at `http://127.0.0.1:8766` to show:
  - registration
  - current registered-user list from `k_proxy`
  - unregister from the browser page
  - login with card approval/denial
  - protected `k_server` counter access
  - logout
  - explicit "k_server was not called" behavior when login is denied
## What This Prototype Covers
- `k_proxy` creates short-lived sessions.
- Session creation uses a card-presence check (`fido2_probe.py --json`) as the current auth gate.
- Valid sessions can repeatedly access a protected `k_server` counter endpoint without re-running card auth each request.
- Session status and logout/invalidation paths are implemented.
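The session behavior listed above can be sketched as a small in-memory store. This is a minimal illustration, not the code in `k_proxy_app.py`: the class name `SessionStore`, its methods, and the injectable `clock` parameter are hypothetical, and the 300-second default simply mirrors the `--session-ttl 300` value used later in this runbook.

```python
import secrets
import threading
import time


class SessionStore:
    """Hypothetical in-memory session store with a fixed TTL (illustration only)."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._lock = threading.Lock()
        self._sessions = {}  # token -> (username, expires_at)

    def create(self, username):
        # Intended to be called only after the auth gate (card check) has passed.
        token = secrets.token_urlsafe(32)
        with self._lock:
            self._sessions[token] = (username, self._clock() + self._ttl)
        return token

    def lookup(self, token):
        # Return the username for a live session, or None if unknown or expired.
        with self._lock:
            entry = self._sessions.get(token)
            if entry is None:
                return None
            username, expires_at = entry
            if self._clock() >= expires_at:
                del self._sessions[token]
                return None
            return username

    def invalidate(self, token):
        # Logout path: True if a session was actually removed.
        with self._lock:
            return self._sessions.pop(token, None) is not None
```

In this sketch a protected request only needs `lookup(token)`, which is why repeated counter calls do not touch the card again until the session expires or is invalidated.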
## Modes
There are two useful ways to run this prototype:
- Same-VM quickstart: `k_proxy` and `k_server` run on one VM for app-local testing.
- Split-VM chain: the `k_proxy` service runs in the `k_proxy` VM, the `k_server` service runs in the `k_server` VM, and the Qubes forwarding layer must permit the chain.
## Start Services
### Same-VM quickstart
This matches the code defaults and is useful for basic app behavior only.
In the chosen VM:
```bash
python3 /home/user/chromecard/k_server_app.py --host 127.0.0.1 --port 8780 --proxy-token dev-proxy-token
```
In the same VM:
```bash
python3 /home/user/chromecard/k_proxy_app.py \
--host 127.0.0.1 \
--port 8770 \
--session-ttl 300 \
--server-base-url http://127.0.0.1:8780 \
--proxy-token dev-proxy-token
```
### Split-VM chain
This is the current Qubes target shape.
In `k_server` VM:
```bash
python3 /home/user/chromecard/k_server_app.py \
--host 127.0.0.1 \
--port 8780 \
--proxy-token dev-proxy-token \
--tls-certfile /home/user/chromecard/tls/phase2/k_server.crt \
--tls-keyfile /home/user/chromecard/tls/phase2/k_server.key
```
In `k_proxy` VM:
```bash
qvm-connect-tcp 9780:k_server:8780
```
Then, still in the `k_proxy` VM, start the proxy against the forwarded port:
```bash
python3 /home/user/chromecard/k_proxy_app.py \
--host 127.0.0.1 \
--port 8771 \
--session-ttl 300 \
--server-base-url https://127.0.0.1:9780 \
--server-ca-file /home/user/chromecard/tls/phase2/ca.crt \
--proxy-token dev-proxy-token \
--tls-certfile /home/user/chromecard/tls/phase2/k_proxy.crt \
--tls-keyfile /home/user/chromecard/tls/phase2/k_proxy.key
```
In `k_client` VM:
```bash
qvm-connect-tcp 9771:k_proxy:8771
```
Notes:
- Current validated split-VM path is `k_client localhost:9771 -> k_proxy localhost:8771 -> k_proxy localhost:9780 forward -> k_server localhost:8780`.
- Use `--cacert /home/user/chromecard/tls/phase2/ca.crt` for TLS verification in `curl`-based checks.
- Raw VM-IP routing is not the validated path for the current prototype.
## Ownership And Concurrency
- `k_proxy` is authoritative for session state.
- `k_server` is authoritative for the protected counter state.
- Sessions are in-memory only in `k_proxy` and are lost on proxy restart.
- The protected counter is in-memory only in `k_server` and resets on server restart.
- Both services use `ThreadingHTTPServer`.
- `k_proxy` guards its session store with a single process-local lock.
- `k_server` guards counter increments with a single process-local lock.
- Qubes localhost forwarders are transport plumbing only; they are not a source of state authority.
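The single-lock approach described above can be illustrated with a short sketch. The names `ProtectedCounter` and `burst` are hypothetical and are not taken from `k_server_app.py`; the thread pool only stands in for the handler threads that `ThreadingHTTPServer` would spawn.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ProtectedCounter:
    """Hypothetical stand-in for the in-memory counter held by k_server."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write happens under the lock, so two concurrent
        # callers can never be handed the same value.
        with self._lock:
            self._value += 1
            return self._value


def burst(counter, requests=20, workers=8):
    # Simulate concurrent handler threads hitting the counter.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(counter.increment) for _ in range(requests)]
        return sorted(f.result() for f in futures)
```

Because every increment is serialized by one process-local lock, a burst returns a contiguous run of values. The same property does not extend across processes or restarts, which matches the in-memory caveats listed above.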
## Test Flow
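Use the proxy port that matches the mode you started:
- Same-VM quickstart: `8770`
- Split-VM chain: `9771` from `k_client`, `8771` inside `k_proxy`

The commands below use `https://` and `--cacert`, which matches the split-VM chain. The same-VM quickstart is started without `--tls-certfile` and `--tls-keyfile`, so it presumably serves plain HTTP; in that mode, switch to `http://` and drop `--cacert`.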
Create a session (runs auth gate once):
```bash
curl -sS --cacert /home/user/chromecard/tls/phase2/ca.crt -X POST https://127.0.0.1:<proxy-port>/session/login \
-H 'Content-Type: application/json' \
-d '{"username":"alice"}'
```
Copy `session_token` from the response, then:
```bash
TOKEN='<paste-token>'
```
Check session:
```bash
curl -sS --cacert /home/user/chromecard/tls/phase2/ca.crt -X POST https://127.0.0.1:<proxy-port>/session/status \
-H "Authorization: Bearer $TOKEN"
```
Call protected resource multiple times (should not require new login):
```bash
curl -sS --cacert /home/user/chromecard/tls/phase2/ca.crt -X POST https://127.0.0.1:<proxy-port>/resource/counter \
-H "Authorization: Bearer $TOKEN"
curl -sS --cacert /home/user/chromecard/tls/phase2/ca.crt -X POST https://127.0.0.1:<proxy-port>/resource/counter \
-H "Authorization: Bearer $TOKEN"
```
Logout/invalidate:
```bash
curl -sS --cacert /home/user/chromecard/tls/phase2/ca.crt -X POST https://127.0.0.1:<proxy-port>/session/logout \
-H "Authorization: Bearer $TOKEN"
```
Re-check after logout (should fail with 401):
```bash
curl -i --cacert /home/user/chromecard/tls/phase2/ca.crt -X POST https://127.0.0.1:<proxy-port>/resource/counter \
-H "Authorization: Bearer $TOKEN"
```
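The same flow can be driven from Python instead of `curl`. The following is a rough sketch under stated assumptions: the helper names `build_request` and `call` are hypothetical, the endpoints and the CA path are the ones used in the commands above, and it assumes every response body is JSON, which this runbook only confirms for the login response (`session_token`).

```python
import json
import ssl
import urllib.request

CA_FILE = "/home/user/chromecard/tls/phase2/ca.crt"  # CA path from this runbook


def build_request(base_url, path, token=None, body=None):
    # Every endpoint in the test flow above is called with POST.
    headers = {}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(base_url + path, data=data, headers=headers, method="POST")


def call(request, ca_file=CA_FILE):
    # Verify the server certificate against the prototype CA, like curl --cacert.
    # Non-2xx answers (such as the 401 after logout) raise urllib.error.HTTPError.
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(request, context=context, timeout=60) as response:
        return response.status, json.loads(response.read().decode("utf-8"))


# Example against a live chain (not executed here):
#   base = "https://127.0.0.1:9771"
#   _, login = call(build_request(base, "/session/login", body={"username": "alice"}))
#   token = login["session_token"]
#   call(build_request(base, "/resource/counter", token=token))
#   call(build_request(base, "/session/logout", token=token))
```

Keeping request construction separate from the network call makes the request shape easy to inspect without a running proxy.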
## Regression Script
For the split-VM chain, use the host-side regression helper:
```bash
/home/user/chromecard/phase5_chain_regression.sh
```
Defaults:
- Drives the test from `k_client` over SSH.
- Uses `https://127.0.0.1:9771` and `/home/user/chromecard/tls/phase2/ca.crt` inside `k_client`.
- Logs in as `alice`.
- Runs `20` counter requests at parallelism `8`.
- Verifies that returned counter values are unique and gap-free, then logs out and checks for `401` after logout.
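The "unique and gap-free" check can be expressed in a few lines. How `phase5_chain_regression.sh` implements it is not shown in this runbook, so the function below is only an assumed equivalent with a hypothetical name. It also assumes no other client increments the counter during the burst.

```python
def check_unique_gap_free(values):
    """Return (ok, detail) for the counter values collected from one burst."""
    if not values:
        return False, "no values returned"
    if len(set(values)) != len(values):
        return False, "duplicate counter values"
    lowest, highest = min(values), max(values)
    # With no duplicates, the values are contiguous exactly when the
    # span between min and max equals the number of values.
    if highest - lowest + 1 != len(values):
        return False, "gap in counter values"
    return True, f"{lowest}..{highest}"
```

A duplicate would point at a lost update in `k_server`, while a gap would point at a request that incremented the counter but whose response never reached the client.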
Useful overrides:
```bash
REQUESTS=50 PARALLELISM=12 /home/user/chromecard/phase5_chain_regression.sh
```
```bash
/home/user/chromecard/phase5_chain_regression.sh --username alice --client-host k_client
```
For the browser-facing `k_client` page, use the Playwright regression spec:
```bash
npm install
npx playwright install
npm run test:k-client
```
Notes:
- default target is `http://127.0.0.1:8766`
- override with `PORTAL_BASE_URL=http://127.0.0.1:8766`
- the spec expects manual card confirmation during register and login
- timeouts can be tuned with `CARD_REGISTRATION_TIMEOUT_MS` and `CARD_LOGIN_TIMEOUT_MS`
- from this host, a forwarded portal URL was used successfully:
  - `PORTAL_BASE_URL=http://127.0.0.1:18766 npm run test:k-client`
Verified result on 2026-04-25:
- Live split-VM chain passed end-to-end.
- Login, session status, counter reuse, and logout all worked from `k_client`.
- A `20` request / `8` worker concurrency burst returned unique, gap-free counter values `23..42`.
- The Playwright browser regression for `k_client_portal.py` also passed end-to-end:
  - register
  - login
  - protected counter
  - logout
  - unregister
## Current Limitations
- This uses card-presence probing, not a full WebAuthn assertion verification path.
- Intended as a Phase 5 starter for session semantics and proxy/server behavior.
- Session and counter state are currently process-local only; restart loses state.
- Upstream trust still relies on a shared static `X-Proxy-Token`.
- Experimental direct FIDO2 mode now exists in `k_proxy_app.py` behind `--auth-mode fido2-direct`, but it is not the default runtime:
  - direct registration on the current `k_proxy` card/library stack still fails with `No compatible PIN/UV protocols supported!`
  - a CTAP1 fallback probe did not complete quickly enough to promote as the working path
  - the deployed service was restored to default `probe` mode so the validated Phase 5 chain remains usable
- Raw CTAP debugging helper now exists at `/home/user/chromecard/raw_ctap_probe.py`:
  - use it on `k_proxy` to exercise low-level `makeCredential` / `getAssertion`
  - it logs keepalive callbacks and transport exceptions
- Current blocker before the next direct-auth attempt:
  - `k_proxy` currently has no visible `/dev/hidraw*`
  - `python3 /home/user/chromecard/fido2_probe.py --list` in `k_proxy` returns `No CTAP HID devices found.`
  - restore card visibility first, then retry the raw CTAP probe, pausing to tell the user when to press `yes` or `no` on the card
- Latest retry after card reattach:
  - `/dev/hidraw0` and `/dev/hidraw1` are visible in `k_proxy` again
  - `/dev/hidraw0` opens successfully as the normal user, but `/dev/hidraw1` is still permission-denied
  - raw `makeCredential` still shows no card prompt, so the hang is before the firmware confirmation UI
  - hidraw inspection confirms `/dev/hidraw0` is the real FIDO interface and `/dev/hidraw1` is a separate vendor HID interface
  - manual CTAPHID `INIT` written directly to `/dev/hidraw0` gets no reply at all within `3s`
  - rerunning `webauthn_local_demo.py` inside `k_proxy` also shows no card prompt on register
  - next step is to recover the USB/Qubes transport path before retrying direct auth
  - after a full power cycle and reattach, manual CTAPHID `INIT` replies again and `webauthn_local_demo.py` registration succeeds again
  - direct `raw_ctap_probe.py --device-path /dev/hidraw0 make-credential --rp-id localhost` also succeeds again after pressing `yes` on the card
  - `k_proxy_app.py --auth-mode fido2-direct` was patched to use low-level CTAP2 and to auto-detect the working `/dev/hidraw*` node when the card re-enumerates
  - after additional fixes for hidraw lifetime, VM-side `python-fido2` response mapping, and CTAP payload shape, `/enroll/register` now succeeds again for `directtest`
  - `/session/login` for `directtest` now also succeeds after card confirmation and returns `auth_mode: "fido2_assertion"`
  - `/session/status` succeeds
  - protected `/resource/counter` succeeds again through `k_proxy -> k_server`
  - `/session/logout` succeeds
  - post-logout protected access returns `401`
  - temporary direct-mode hidraw lifetime logging was removed after diagnosis
  - `phase5_chain_regression.sh` now supports card-interactive direct auth via `--interactive-card --expect-auth-mode fido2_assertion`
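For reference, the manual CTAPHID `INIT` check mentioned in the log above can be sketched as follows. This is not the contents of `raw_ctap_probe.py`: the function names are hypothetical, the 64-byte report size is an assumption (it is the usual size for FIDO HID devices), and the `/dev/hidraw0` default and 3-second timeout are taken from the notes above. The packet layout follows the CTAPHID framing: 4-byte channel ID, command byte with the high bit set, 2-byte big-endian length, then payload.

```python
import os
import select
import struct

CTAPHID_BROADCAST_CID = 0xFFFFFFFF
CTAPHID_INIT = 0x06
TYPE_INIT = 0x80   # high bit marks an initialization packet
REPORT_SIZE = 64   # assumed HID report size


def build_init_packet(nonce):
    if len(nonce) != 8:
        raise ValueError("CTAPHID INIT nonce must be 8 bytes")
    # CID (4 bytes) | CMD (1 byte) | BCNT (2 bytes, big-endian) | payload
    header = struct.pack(">IBH", CTAPHID_BROADCAST_CID, TYPE_INIT | CTAPHID_INIT, len(nonce))
    return (header + nonce).ljust(REPORT_SIZE, b"\x00")


def parse_init_response(report, nonce):
    # Response payload starts with the echoed nonce (8 bytes), then the new channel ID (4 bytes).
    _cid, cmd, bcnt = struct.unpack(">IBH", report[:7])
    payload = report[7:7 + bcnt]
    if cmd != (TYPE_INIT | CTAPHID_INIT) or payload[:8] != nonce:
        return None
    return struct.unpack(">I", payload[8:12])[0]


def probe(device_path="/dev/hidraw0", timeout=3.0):
    # Returns the allocated channel ID, or None if the card stays silent.
    nonce = os.urandom(8)
    fd = os.open(device_path, os.O_RDWR)
    try:
        # Linux hidraw expects a leading report-ID byte; 0 for unnumbered reports.
        os.write(fd, b"\x00" + build_init_packet(nonce))
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return None
        return parse_init_response(os.read(fd, REPORT_SIZE), nonce)
    finally:
        os.close(fd)
```

A `None` result from `probe()` corresponds to the "no reply at all within `3s`" symptom, which points at the USB/Qubes transport rather than at the FIDO2 library stack.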