# Workplan

Last updated: 2026-04-25

This is the execution plan for making ChromeCard FIDO2 development and validation reproducible on this machine.

## Constraints

- Treat `/home/user/chromecard/CR_SDK_CK-main` as read-only.
- Keep helper scripts such as `fido2_probe.py` and `webauthn_local_demo.py` at `/home/user/chromecard`.
- Target deployment model is Qubes OS with 3 AppVMs based on `debian-13-xfce`: `k_client`, `k_proxy`, `k_server`.
- Current authenticator link is card -> `k_proxy` (USB), but the architecture must allow migration to wireless phone-mediated validation.
- VM execution path is SSH-first for experiments: `ssh ` and `scp :~`.

## Goals

- Re-establish deterministic host-to-card FIDO2 communication over USB HID/CTAPHID.
- Restore a buildable/flashable firmware workspace for `CR_SDK_CK-main`.
- Turn ad-hoc demos into a repeatable verification flow.
- Stand up chained TLS communication in Qubes: `k_client -> k_proxy -> k_server`.
- Support both the login flow (browser in `k_client`) and the user enrollment flow (process in `k_client`).
- Minimize repeated card prompts by introducing secure session reuse after successful authentication.
- Implement a protected dummy resource on `k_server` (monotonic counter) for end-to-end validation.
- Ensure `k_proxy` and `k_server` are thread-safe and support concurrent access.
- Prepare the `k_proxy` auth path for a future transport shift: USB-direct -> wireless phone bridge.

## Phase 0: Qubes VM Baseline (Blocking)

1. Provision/verify AppVMs.
   - Ensure `k_client`, `k_proxy`, `k_server` exist and are based on `debian-13-xfce`.
2. Assign functional responsibilities.
   - `k_client`: browser client + enrollment process.
   - `k_proxy`: USB card access + proxy/auth bridge.
   - `k_server`: protected resource/service endpoint.
3. Define TLS endpoints and certificates.
   - `k_proxy` presents a TLS service to `k_client`.
   - `k_server` presents a TLS service to `k_proxy`.
   - Trust roots and the cert distribution model are documented per VM.
Exit criteria:
- All 3 VMs exist, boot, and have clearly defined service ownership.

## Phase 1: Qubes Firewall Policy (Blocking)

1. Enforce allowed forward paths only.
   - Allow `k_client` outbound TLS only to `k_proxy` service port(s).
   - Allow `k_proxy` outbound TLS only to `k_server` service port(s).
   - Deny direct `k_client` to `k_server` traffic.
2. Validate return path behavior.
   - Confirm responses propagate back through established flows.
3. Verify with simple probes.
   - TLS handshake and HTTP(S) checks from `k_client` to `k_proxy`.
   - TLS handshake and HTTP(S) checks from `k_proxy` to `k_server`.

Exit criteria:
- Policy matches the intended chain and is test-verified.

Status (2026-04-24, remote diagnostics):
- Confirmed the active blocker remains Phase 1 network policy/pathing.
- Evidence from live VM probes:
  - `k_client (10.137.0.16) -> k_proxy (10.137.0.12:8771)`: TCP timeout.
  - `k_proxy (10.137.0.12) -> k_server (10.137.0.13:8780)`: upstream timeout.
- Local service health inside each VM is good, so the failure is inter-VM reachability, not local process startup.

Status (2026-04-25, after restart and service recovery):
- Refined blocker: this is currently a qrexec/`qubes.ConnectTCP` refusal problem, not an app-local listener problem.
- Current evidence:
  - `k_proxy` local `/health` is up on `127.0.0.1:8771`
  - `k_server` local `/health` is up on `127.0.0.1:8780`
  - `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
  - `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- Immediate next action for Phase 1:
  - verify and fix the dom0 policy/mechanism that should permit `qubes.ConnectTCP` forwarding for the chain ports

## Phase 2: TLS Certificates and Service Endpoints

1. Certificate model.
   - Create or import a CA and issue certs for `k_proxy` and `k_server`.
   - Install trust roots in the client VM(s) that need validation.
2. Service shape.
   - `k_server`: HTTPS service exposing protected resource endpoint(s), including a monotonic counter endpoint.
   - `k_proxy`: minimal HTTPS API gateway service (a full web server framework is not required).
3. Endpoint contract.
   - Define the request/response schema between `k_client` and `k_proxy`.
   - Define the upstream request contract from `k_proxy` to `k_server`.

Exit criteria:
- Mutual TLS trust decisions are documented and tested.
- HTTPS calls succeed on both links with expected cert validation.

Status (2026-04-25):
- Implemented HTTPS listeners in both prototype services.
- Added local CA + service certificate generation in `generate_phase2_certs.py`.
- Verified the working Qubes path is localhost forwarding plus TLS:
  - `k_client` local `9771` forwards to `k_proxy:8771`
  - `k_proxy` local `9780` forwards to `k_server:8780`
- Verified cert validation on both hops using the generated CA.
- Verified the end-to-end HTTPS flow:
  - `k_client -> k_proxy` login over TLS
  - `k_proxy -> k_server` protected counter call over TLS
  - session reuse still works across repeated protected requests
- Phase 2 is now effectively complete for the current prototype shape.

## Phase 2.5: Define State Ownership and Concurrency Model

1. State ownership.
   - Decide where user/session state is authoritative (`k_proxy`, `k_server`, or a split model).
   - Define the token/session format and validation boundary.
2. Concurrency controls.
   - Define a thread-safe strategy for the session store and shared counters.
   - Define locking/atomic-update semantics for counter increments and session updates.
3. Runtime model.
   - Choose a service runtime/config that supports simultaneous requests safely.

Exit criteria:
- Architecture clearly documents state authority and race-free update rules.
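The race-free update rule above can be sketched with the stdlib-only approach the prototype already leans on (one lock around each shared mutation). The class name below is illustrative, not code from `k_server_app.py`:

```python
import threading

class MonotonicCounter:
    """Race-free increasing counter for the protected dummy resource."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def next(self) -> int:
        # One lock around the read-modify-write keeps increments atomic
        # under ThreadingHTTPServer's per-request threads.
        with self._lock:
            self._value += 1
            return self._value
```

The same pattern (single lock, short critical section) covers the session map; anything slow, such as an upstream HTTPS call, should happen outside the lock.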
Next action (2026-04-25):
- Move into Phase 2.5 and make the current prototype decisions explicit:
  - authority for session state remains `k_proxy`
  - `k_server` remains the authority for the protected counter/resource state
  - localhost Qubes forwarders are part of the active runtime model for the two TLS hops
  - define concurrency assumptions and limits around the session store, forwarders, and counter access

Status (2026-04-25):
- Current ownership model is now explicit:
  - `k_proxy` is authoritative for session creation, expiry, lookup, and logout
  - `k_server` is authoritative for the protected monotonic counter
  - `k_client` is a client only; it holds bearer tokens but is not a state authority
- Current validation boundary is explicit:
  - `k_proxy` validates bearer tokens against its in-memory session store
  - `k_server` trusts only requests that arrive with the configured `X-Proxy-Token`
  - `k_server` does not currently validate end-user session tokens directly
- Current concurrency strategy is explicit:
  - `k_proxy` uses `ThreadingHTTPServer` plus one lock around the in-memory session map
  - `k_server` uses `ThreadingHTTPServer` plus one lock around counter increments
  - upstream HTTPS calls from `k_proxy` are made outside the session-store lock
- Current runtime limits are explicit:
  - sessions are process-local and disappear on `k_proxy` restart
  - counter state is process-local and resets on `k_server` restart
  - transport relies on Qubes localhost forwarders `9771` and `9780`
- Phase 2.5 is complete for the current prototype shape.

## Phase 3: Recover Basic Device Visibility on `k_proxy` (Blocking)

1. Verify the physical + USB enumeration path.
   - Check the cable/port and confirm the device appears in USB listings.
   - Confirm `/dev/hidraw*` nodes appear when the card is connected.
2. Validate Linux permissions.
   - Install/update the udev rule for the ChromeCard HID VID/PID.
   - Reload udev and verify non-root read/write access to the hidraw node.
3. Re-run the host probe.
   - Run `python3 /home/user/chromecard/fido2_probe.py --list`.
   - Run `python3 /home/user/chromecard/fido2_probe.py --json`.
   - Record VID/PID/path and the CTAP2 `getInfo` output in `Setup.md`.

Exit criteria:
- At least one CTAP HID device is listed.
- `--json` returns valid `ctap2_info`.

## Phase 4: Re-validate Local WebAuthn Demo on `k_proxy`

1. Start the local demo server.
   - Run `python3 /home/user/chromecard/webauthn_local_demo.py`.
   - Confirm the URL is `http://localhost:8765`.
2. Exercise register/login.
   - Register a test user.
   - Authenticate with the same user.
   - Capture errors (if any) and update `Setup.md`.
3. Decide the next demo hardening step:
   - keep bring-up-only mode, or
   - add signature verification for attestation/assertion.

Exit criteria:
- Register and login both complete with card interaction prompts.

Status (2026-04-24):
- Completed in `k_proxy` using `http://localhost:8765`.
- Registration result: `ok=true`, `username=alice`, `credential_count=1`.
- Authentication result: `ok=true`, `username=alice`, `authenticated=true`.

## Phase 5: Implement Proxy Auth + Session Reuse

1. Authenticate via card once per session window.
   - `k_proxy` handles initial auth using the connected card.
   - On success, create session state for `k_client`.
2. Session model.
   - Prefer a server-side session store or a signed session token.
   - Include TTL/expiry, rotation, and an explicit invalidation/logout path.
   - Do not expose card secrets or long-lived auth material to `k_client`.
3. Proxying behavior.
   - With a valid session: `k_proxy` forwards the request to `k_server` and returns the result.
   - Without a valid session: require a fresh card-backed auth flow.

Exit criteria:
- Repeated authorized requests do not require card interaction until session expiry.
- Expired/invalid sessions are correctly rejected.
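The session model above (server-side store, TTL expiry, explicit logout, no card material exposed to the client) can be sketched as follows; the class name, method names, and default TTL are illustrative, not the `k_proxy_app.py` API:

```python
import secrets
import threading
import time

class SessionStore:
    """Server-side session store: opaque bearer tokens, TTL expiry, logout.

    Tokens carry no card secrets; they only index proxy-local state.
    """

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._lock = threading.Lock()
        self._sessions: dict[str, float] = {}  # token -> expiry deadline
        self._ttl = ttl_seconds

    def create(self) -> str:
        # Called only after a successful card-backed authentication.
        token = secrets.token_urlsafe(32)
        with self._lock:
            self._sessions[token] = time.monotonic() + self._ttl
        return token

    def is_valid(self, token: str) -> bool:
        with self._lock:
            deadline = self._sessions.get(token)
            if deadline is None:
                return False
            if time.monotonic() >= deadline:
                del self._sessions[token]  # lazy expiry garbage collection
                return False
            return True

    def logout(self, token: str) -> None:
        with self._lock:
            self._sessions.pop(token, None)
```

A valid token lets `k_proxy` forward to `k_server` without a new card prompt; an expired or logged-out token forces the fresh card-backed flow.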
Status (2026-04-24):
- Started with a runnable prototype:
  - `/home/user/chromecard/k_proxy_app.py`
  - `/home/user/chromecard/k_server_app.py`
  - `/home/user/chromecard/PHASE5_RUNBOOK.md`
- Implemented in the prototype:
  - session create/status/logout endpoints in `k_proxy`
  - TTL-based server-side session store with expiry garbage collection
  - protected monotonic counter endpoint in `k_server` with thread-safe increments
  - proxy forwarding from `k_proxy` to `k_server` using a shared upstream token
- The current auth gate for session creation is a card-presence probe (`fido2_probe.py --json`), pending upgrade to a full assertion verification path.

Status (2026-04-25):
- Prototype services were restarted successfully after the VM restart.
- Current split-VM test shape:
  - `k_proxy` listening on `127.0.0.1:8771`
  - `k_server` listening on `127.0.0.1:8780`
- Phase 5 application logic is runnable locally inside each VM, but end-to-end validation is still blocked by the Phase 1 qrexec forwarding refusal.

## Phase 5.5: Implement Dummy Resource + Access Policy on `k_server`

1. Protected dummy resource.
   - Add an endpoint returning an increasing number.
   - Require valid upstream auth/session context from `k_proxy`.
2. Optional user/session handling.
   - Add minimal user/session checks if `k_server` is chosen as authority (or partial authority).
3. Correctness under concurrency.
   - Ensure increments are monotonic and race-safe under parallel calls.

Exit criteria:
- Authorized requests obtain consistently increasing values.
- Unauthorized requests are rejected.

## Phase 6: Integrate Client Enrollment + Proxy Login Flow

1. Enrollment process in `k_client`.
   - Start a process from `k_client` that captures new-user enrollment intent/data.
   - Route enrollment requests to `k_proxy` over TLS.
2. Card-mediated login in `k_proxy`.
   - `k_proxy` uses the connected card for FIDO2/WebAuthn operations.
   - `k_proxy` authenticates toward `k_server` over TLS.
3. Browser flow in `k_client`.
   - Browser traffic goes only to `k_proxy`.
   - Validate end-to-end login to the `k_server` resource through the proxy chain.

Exit criteria:
- Enrollment and login both function end-to-end via `k_client -> k_proxy -> k_server`.

Status (2026-04-25):
- Added the first `k_client` implementation at `/home/user/chromecard/k_client_portal.py`.
- Current prototype flow:
  - the browser now targets `k_proxy` directly over `https://127.0.0.1:9771`
  - `k_client_portal.py` remains only as a temporary bridge page
  - `k_proxy` continues to authenticate with the card and forward to `k_server`
- Verified end-to-end through the portal:
  - enroll `alice`
  - login succeeds
  - session status succeeds
  - protected counter succeeds repeatedly with session reuse
  - logout succeeds
- Enrollment contract progress:
  - `k_proxy` now exposes prototype enrollment endpoints
  - proxy-side enrollment storage exists and is checked before login is allowed
  - direct browser/API traffic can now use those proxy endpoints without going through the local bridge
- Phase 6 is materially further along for the current prototype shape:
  - the direct browser target is `k_proxy`
  - the login/resource flow is integrated on the direct proxy path
  - enrollment now has a real client->proxy path
  - the `k_client` bridge remains only for transition/compatibility
  - final enrollment semantics are still provisional

## Phase 6.5: Concurrency and Multi-Client Test Setup

1. Single-VM concurrency tests.
   - Generate parallel request bursts from `k_client` to `k_proxy`.
   - Verify response integrity, session reuse behavior, and error rates.
2. Multi-client tests.
   - Run requests from multiple `k_client` instances (or equivalent parallel clients) concurrently.
   - Verify isolation between users/sessions.
3. Acceptance checks.
   - No race-related crashes/corruption in `k_proxy` or `k_server`.
   - Counter/resource behavior remains correct under load.
   - Session reuse reduces card prompts while preserving authorization checks.

Exit criteria:
- Test results demonstrate stable concurrent operation with documented limits.
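A self-contained stand-in for the burst test can be sketched like this; it substitutes an in-process locked counter for the real `k_client -> k_proxy -> k_server` chain, so the endpoint path, payload shape, and request counts are assumptions, not the deployed contract:

```python
import json
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# In-process stand-in for the k_server counter endpoint.
counter_lock = threading.Lock()
counter = 0

class CounterHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        global counter
        with counter_lock:
            counter += 1
            value = counter
        body = json.dumps({"value": value}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args) -> None:
        pass  # keep the burst output quiet

def run_burst(n_requests: int = 50, workers: int = 8) -> list:
    """Fire parallel requests and return the counter values observed."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CounterHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/counter"
    try:
        def hit(_: int) -> int:
            with urllib.request.urlopen(url) as resp:
                return json.loads(resp.read())["value"]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(hit, range(n_requests)))
    finally:
        server.shutdown()
```

The acceptance check falls out directly: if the sorted observed values form an unbroken `1..N` range, no increment was lost or duplicated under load.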
## Phase 7: Restore Firmware Build/Flash Path

1. Validate SDK tree completeness.
   - Confirm the presence of `mvp`, `setup`, `components`, and `samples` under `CR_SDK_CK-main`.
   - If missing, obtain the full repository/checkpoint and document the source.
2. Install/enable build tools.
   - Ensure `west` and `nrfjprog` are available in the shell.
   - Confirm the target board/toolchain match (`nrf7002dk/nrf5340/cpuapp`, NCS `v2.9.2` baseline in docs).
3. Run a baseline build+flash.
   - From `CR_SDK_CK-main`, run `./scripts/build_flash_mvp.sh`.
   - If flashing fails, run the documented recovery and retry.

Exit criteria:
- Successful `west build` and `west flash`.

## Phase 8: Consolidate Documentation and Paths

1. Remove path drift between docs and actual files.
   - Keep `fido2_probe.py` and `webauthn_local_demo.py` at the workspace root.
   - Ensure docs never instruct placing helper scripts under `CR_SDK_CK-main`.
   - Update references consistently in all docs.
2. Keep `Setup.md` current.
   - After each significant change, update the status snapshot and outcomes.
3. Add a minimal reproducibility checklist.
   - One command list for probe + demo + build/flash prechecks.
4. Maintain Markdown execution records continuously.
   - `Setup.md` and `Workplan.md` are the canonical living docs for this workspace.
   - Re-scan relevant `.md` files before each new execution cycle and reconcile drift.
   - Record date-stamped session notes when priorities or blockers change.

Status (2026-04-24, markdown maintenance):
- Re-scanned the active workspace Markdown set and the main source-tree reference docs.
- No workplan phase change was required from this pass.
- The ongoing documentation watch item remains path drift in `CR_SDK_CK-main/README_HOST.md`, which still uses historical `./scripts/...` helper locations instead of workspace-root helper paths.
- Operational note: the markdown scan path now runs cleanly after the policy adjustment when invoked without a login shell.
Status (2026-04-24, chain probe retry):
- Phase 1 remains blocked, but the failure point is now narrowed further:
  - the current refusal occurs at Qubes `qubes.ConnectTCP` policy/service evaluation for ports `22`, `8770`, and `8780`
  - this happens before any end-to-end app-level request can be retried
- Practical implication:
  - do not spend time on `k_proxy_app.py` / `k_server_app.py` request handling until qrexec forwarding permits the intended hops again
  - the next recovery action is to fix/activate the relevant Qubes `qubes.ConnectTCP` policy, then re-run the qrexec bridge checks before testing the HTTP flow

Status (2026-04-25, post-restart probe):
- Corrected the client-facing proxy port reference to `8771`.
- SSH access to `k_proxy` and card visibility recovered after the VM restart.
- New immediate blockers:
  - `k_proxy` service not listening on `127.0.0.1:8771`
  - `k_server` service not listening on `127.0.0.1:8780`
  - qrexec forwarding for `8771` and `8780` still returns `Request refused`
- The next retry should start services first, then re-test qrexec forwarding, and only then attempt the end-to-end client flow.

Status (2026-04-25, service restart):
- Local VM services are running again on the intended loopback ports:
  - `k_server`: `127.0.0.1:8780`
  - `k_proxy`: `127.0.0.1:8771`
- Phase 1 remains blocked specifically by the qrexec policy/forwarding refusal on those ports.
- The next action is no longer app startup; it is fixing the `qubes.ConnectTCP` allow path for `8771` and `8780`.
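For reference, an explicit-destination allow path of the kind this next action targets would normally live in a dom0 policy file. The filename and exact rules below are an illustrative sketch of the Qubes 4.x `policy.d` format, not a record of the policy actually applied here:

```
# /etc/qubes/policy.d/30-chromecard.policy  (hypothetical filename)
# service          argument  source    destination  action
qubes.ConnectTCP   +8771     k_client  k_proxy      allow
qubes.ConnectTCP   +8780     k_proxy   k_server     allow
qubes.ConnectTCP   *         @anyvm    @anyvm       deny
```

The explicit source/destination pairs match the intended chain and nothing else, which is exactly the Phase 1 policy shape: each hop allowed on its one port, everything else denied.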
Status (2026-04-25, in-VM forwarding test):
- Verified that using `qvm-connect-tcp` inside the source VMs still does not complete the client->proxy hop:
  - the bind succeeds locally, but the first real connection gets `Request refused`
- An independent app-layer blocker was also found in `k_proxy`:
  - `python-fido2` is missing there, so local `/session/login` currently fails before card auth can succeed
- Current ordered blockers:
  - first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
  - second: install `python-fido2` in `k_proxy`
  - third: re-test end-to-end login and then the proxy->server counter flow

Status (2026-04-25, after python3-fido2 install):
- The `python3-fido2` blocker in `k_proxy` is resolved.
- Updated ordered blockers:
  - first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
  - second: restore CTAP HID device visibility/access in `k_proxy` (`No CTAP HID devices found`)
  - third: re-test end-to-end login and then the proxy->server counter flow

Status (2026-04-25, card reattached):
- CTAP HID visibility/access in `k_proxy` is restored.
- Local proxy login is working again with the attached card.
- The only currently confirmed blocker for the end-to-end path is the `k_client -> k_proxy:8771` qrexec/`qvm-connect-tcp` refusal.

Status (2026-04-25, clean forward retest):
- The retest shows the same qrexec failure mode on both hops, not just the client-facing one.
- Updated blocker statement:
  - the effective `qubes.ConnectTCP` allow path is failing for both
    - `k_client -> k_proxy:8771`
    - `k_proxy -> k_server:8780`
- App services and the card path are currently good; forwarding remains the single active system blocker.

Status (2026-04-25, dom0 policy fix validated):
- The explicit-destination dom0 `qubes.ConnectTCP` policy fix resolved forwarding on both hops.
- Current verified working chain:
  - `k_client -> k_proxy:8771`
  - `k_proxy -> k_server:8780`
- Current verified prototype behavior:
  - session login works from `k_client`
  - session status works
  - the protected counter flow reaches `k_server`
  - session reuse avoids re-login for repeated counter calls
  - logout invalidates the session, and subsequent protected access returns `401`
- The immediate networking blocker is cleared.

Exit criteria:
- A new team member can follow the docs end-to-end without path or tooling ambiguity.

## Phase 9: Migrate to Phone-Mediated Wireless Validation (Future)

1. Auth transport abstraction in `k_proxy`.
   - Introduce/keep a transport interface for authenticator operations.
   - Implement at least two backends:
     - USB-direct backend (current).
     - Phone-wireless backend (future).
2. Wireless phone integration.
   - Define the protocol between `k_proxy` and the phone service.
   - Define secure pairing/authentication and message integrity for the wireless link.
   - Add timeout/retry behavior and offline handling.
3. Functional equivalence tests.
   - Verify login/enrollment behavior is unchanged at the API level for `k_client`.
   - Verify session reuse still works and card prompts are not increased unexpectedly.

Exit criteria:
- `k_proxy` can validate via the wireless phone path with no client-facing API changes.

## Inputs Expected During This Session

- Exact observed behavior on reconnect attempts (USB/hidraw/probe).
- Whether we should pull server-side code now.
- Any board/firmware variants different from the default documentation assumptions.
- Preferred TLS ports, certificate approach, and hostname scheme for `k_client`, `k_proxy`, `k_server`.
- Session TTL and invalidation requirements for cached authenticated access.
- Decision on where user/session authority lives (`k_proxy` vs `k_server` vs split).
- Target concurrency level for validation (parallel clients and parallel requests per client).
- Preferred wireless transport/protocol between `k_proxy` and the phone (for the future phase).
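The Phase 9 transport abstraction can be sketched as a small interface inside `k_proxy`; all class and method names here are hypothetical, and both concrete backends are placeholders rather than working implementations:

```python
from abc import ABC, abstractmethod

class AuthenticatorTransport(ABC):
    """Interface the proxy auth path talks to; backends hide USB vs wireless."""

    @abstractmethod
    def get_assertion(self, rp_id: str, challenge: bytes) -> dict:
        """Perform an authenticator assertion and return the parsed result."""

class UsbDirectTransport(AuthenticatorTransport):
    """Current path: CTAPHID to the attached card (details elided)."""

    def get_assertion(self, rp_id: str, challenge: bytes) -> dict:
        raise NotImplementedError("wrap the existing USB fido2 path here")

class PhoneWirelessTransport(AuthenticatorTransport):
    """Future path: forward the request to a paired phone service."""

    def get_assertion(self, rp_id: str, challenge: bytes) -> dict:
        raise NotImplementedError("defined once the phone protocol exists")

def login(transport: AuthenticatorTransport, rp_id: str, challenge: bytes) -> dict:
    # Session-creation code depends only on the interface, so swapping
    # USB-direct for phone-wireless needs no client-facing API change.
    return transport.get_assertion(rp_id, challenge)
```

Because `login` accepts any `AuthenticatorTransport`, the Phase 9 exit criterion (wireless backend, unchanged client API) reduces to writing one new subclass.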
## Session Maintenance Notes (2026-04-24)

- Top-level Markdown review completed for `PHASE5_RUNBOOK.md`, `Setup.md`, and `Workplan.md`.
- The current execution plan remains in sync with the Phase 5 runbook:
  - prototype services at `/home/user/chromecard/k_proxy_app.py` and `/home/user/chromecard/k_server_app.py`
  - run sequence documented in `/home/user/chromecard/PHASE5_RUNBOOK.md`
- No phase ordering or blocker changes were required from this review pass.
- Remote execution support is now active and validated:
  - `ssh` command execution works for `k_client`, `k_proxy`, `k_server`
  - `scp` push to a VM home works (validated on `k_proxy`)