diff --git a/Setup.md b/Setup.md
index 715b521..2f0123c 100644
--- a/Setup.md
+++ b/Setup.md
@@ -133,6 +133,10 @@ Thread-safety expectation:
 - `/home/user/chromecard/k_proxy_app.py`
 - `/home/user/chromecard/k_server_app.py`
 - `/home/user/chromecard/PHASE5_RUNBOOK.md`
+- Remote VM access is now available via SSH/SCP aliases:
+  - command execution: `ssh <vm> <command>`
+  - file copy to VM home: `scp <file> <vm>:~`
+  - validated hosts: `k_client`, `k_proxy`, `k_server`
 - `west` is not currently installed/in PATH: `west not found`.
 - The checked-out `CR_SDK_CK-main` tree appears incomplete for documented sysbuild role layout:
   - missing: `mvp`, `setup`, `components`, `samples`
@@ -156,6 +160,154 @@ Session note (2026-04-24):
 - Local WebAuthn demo completed successfully for user `alice` (register + login).
 - Phase 5 starter implementation added with session TTL, logout/invalidation, and proxy->server protected counter forwarding.
 
+Session note (2026-04-24, doc maintenance):
+- Top-level Markdown files were re-scanned: `PHASE5_RUNBOOK.md`, `Setup.md`, `Workplan.md`.
+- `PHASE5_RUNBOOK.md` remains consistent with the current Phase 5 prototype paths and flow.
+- No plan/setup drift was found requiring behavioral changes; docs remain aligned.
+- SSH-based VM operation was validated for `k_client`, `k_proxy`, `k_server` (Debian `13.4` confirmed remotely).
+- SCP file transfer to the `k_proxy` home directory was validated with read-back.
+
+Session note (2026-04-24, remote flow diagnostics):
+- VM script staging gap found: `/home/user/chromecard/k_proxy_app.py`, `k_server_app.py`, and helper files were missing on the AppVMs and were copied via `scp`.
+- Services were started in the VMs and verified locally:
+  - `k_proxy` local health OK on `127.0.0.1:8770` and `127.0.0.1:8771`
+  - `k_server` local health OK on `127.0.0.1:8780`
+- Verified VM IPs during this run:
+  - `k_proxy`: `10.137.0.12`
+  - `k_server`: `10.137.0.13`
+  - `k_client`: `10.137.0.16`
+- Current chain failure is network pathing/firewall:
+  - `k_client -> k_proxy` (`10.137.0.12:8771`) times out.
+  - `k_proxy -> k_server` (`10.137.0.13:8780`) times out.
+  - Proxy returns upstream error payload: `server unavailable: timed out`.
+
+Session note (2026-04-24, markdown re-scan):
+- Re-read top-level workspace Markdown files: `Setup.md`, `Workplan.md`, `PHASE5_RUNBOOK.md`.
+- Re-skimmed source-tree reference docs in `CR_SDK_CK-main`, including `BUILD.md`, `README.md`, `README_HOST.md`, `RELEASE.md`, and `distribute_bundle.md`.
+- Current workspace docs remain aligned with the verified execution record.
+- Source-tree doc drift remains unchanged:
+  - `README_HOST.md` still points to `./scripts/fido2_probe.py` and `./scripts/webauthn_local_demo.py`.
+  - Active workspace policy continues to treat those paths as historical; the maintained helper paths remain `/home/user/chromecard/fido2_probe.py` and `/home/user/chromecard/webauthn_local_demo.py`.
+- Source-tree build docs continue to describe a full SDK layout with `mvp`, `setup`, `components`, and `samples`, which is still not present in the current local checkout snapshot.
+
+Session note (2026-04-24, policy retry):
+- The markdown re-scan was retried after local policy changes.
+- Re-running the workspace doc scan with a non-login shell completed cleanly, without the earlier SSH/socat startup noise in the command output.
+
+Session note (2026-04-24, chain probe retry):
+- Re-probed the Qubes access path for `k_client -> k_proxy -> k_server`.
+- Local forwarded SSH listener ports still exist on the host: + - `0.0.0.0:2222` -> `qrexec-client-vm 'k_client' qubes.ConnectTCP+22` + - `0.0.0.0:2223` -> `qrexec-client-vm 'k_proxy' qubes.ConnectTCP+22` + - `0.0.0.0:2224` -> `qrexec-client-vm 'k_server' qubes.ConnectTCP+22` +- These forwarded SSH ports currently fail immediately: + - `ssh k_client` / `ssh k_proxy` / `ssh k_server` close immediately on localhost forwarded ports. + - Direct `qrexec-client-vm qubes.ConnectTCP+22` returns `Request refused`. +- Chain ports are currently blocked at the same qrexec layer: + - `qrexec-client-vm k_proxy qubes.ConnectTCP+8770` -> `Request refused` + - `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused` +- This means the current blocker is active qrexec policy/service refusal for `qubes.ConnectTCP`, not the Python service code in `k_proxy_app.py` or `k_server_app.py`. +- Separate SSH config issue remains on the host: + - `/etc/ssh/ssh_config.d/20-systemd-ssh-proxy.conf` is still owned `root:root` but mode `777`, which causes OpenSSH to reject it as insecure on the normal login-shell path. + +Session note (2026-04-25, post-restart probe): +- Correct client-facing proxy port is `8771` for the current split-VM chain checks. +- SSH to `k_proxy` is working again. +- `k_proxy` card visibility is restored after VM restart and card reconnect: + - `/dev/hidraw0` and `/dev/hidraw1` are present in `k_proxy` +- Current service state after restart: + - `k_proxy` has no listener on `127.0.0.1:8771` + - `k_server` has no listener on `127.0.0.1:8780` +- Current qrexec chain state after restart: + - `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused` + - `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused` +- Practical meaning: + - SSH and card attachment recovered + - phase-5 app services are not currently running in the VMs + - qrexec forwarding for the chain ports is still being refused + +Session note (2026-04-25, service restart): +- `k_server_app.py` was restarted successfully in `k_server`: + - PID `1320` + - listening on `127.0.0.1:8780` + - `/health` returns `{"ok": true, "service": "k_server", ...}` +- `k_proxy_app.py` was restarted successfully in `k_proxy`: + - PID `2774` + - listening on `127.0.0.1:8771` + - `/health` returns `{"ok": true, "service": "k_proxy", "active_sessions": 0, ...}` +- Despite local service recovery, qrexec forwarding is still denied: + - `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused` + - `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused` + +Session note (2026-04-25, in-VM forwarding test): +- Tested the intended in-VM forwarding path with `qvm-connect-tcp` instead of host-side `qrexec-client-vm`. +- Forwarders start and bind locally: + - in `k_client`: `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771` + - in `k_proxy`: `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780` +- But the actual client->proxy connection is still refused when used: + - `k_client` forward log shows `Request refused` + - `socat` reports child exit status `126` and `Connection reset by peer` +- Local login on `k_proxy` reaches the app but fails on the auth dependency: + - `POST /session/login` to `http://127.0.0.1:8771` returns `401` + - details: `Missing dependency: python-fido2 ... No module named 'fido2'` +- `k_server` was not reached during this login test; current `k_server.log` only shows `/health`. 
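+
+The failure signature repeated above (local bind succeeds; the first real connection is refused with child exit status `126`) can be reproduced per hop with a short probe. A minimal sketch, assuming the same VM names and ports as in these notes (run inside `k_client`; swap names/ports for the proxy->server hop); it is illustrative only, not part of the runbook:
+
+```sh
+# 1. Anything already bound on the forward port locally?
+ss -tlnp | grep ':8771 ' || echo "no local listener on 8771"
+
+# 2. Direct qrexec service call; a dom0 policy denial surfaces here
+#    as "Request refused" before any HTTP traffic is possible.
+qrexec-client-vm k_proxy qubes.ConnectTCP+8771 </dev/null
+
+# 3. Bind a local forward, then force a real connection through it.
+qvm-connect-tcp 8771:k_proxy:8771 &
+sleep 1
+curl -sS -m 5 http://127.0.0.1:8771/health || echo "bind ok, but the qrexec hop was refused"
+```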
+ +Session note (2026-04-25, after python3-fido2 install): +- `k_proxy` was restarted after `python3-fido2` installation and now listens again on `127.0.0.1:8771`. +- The previous Python import blocker is resolved; local login now reaches the CTAP probe path. +- Current local login result on `k_proxy`: + - `{"ok": false, "error": "card auth failed", "details": "No CTAP HID devices found."}` +- Current forwarded login result from `k_client` is still not completing: + - `curl http://127.0.0.1:8771/session/login` -> `Empty reply from server` + - `qvm_connect_8771.log` still shows repeated `Request refused` and child exit status `126` +- Practical meaning: + - Python dependency issue in `k_proxy` is fixed + - card access inside `k_proxy` is currently missing again at CTAP/HID level + - `k_client -> k_proxy` qrexec forwarding is still effectively denied/refused + +Session note (2026-04-25, card reattached): +- Card visibility in `k_proxy` is restored again: + - `/dev/hidraw0` and `/dev/hidraw1` present + - `fido2_probe.py --list` detects ChromeCard on `/dev/hidraw0` +- Local login on `k_proxy` now succeeds again: + - `POST /session/login` on `127.0.0.1:8771` returns `200` + - session creation for user `alice` succeeded +- Remaining failure is isolated to the client-facing qrexec path: + - `k_client` -> `localhost:8771` through `qvm-connect-tcp` still returns `Empty reply from server` + - `qvm_connect_8771.log` still shows `Request refused` + +Session note (2026-04-25, clean forward retest): +- Re-ran both forwards and exercised each hop immediately after local bind. +- `k_proxy -> k_server`: + - `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780` in `k_proxy` + - first real `POST /resource/counter` through that forward returns `Empty reply from server` + - `qvm_connect_8780.log` then records `Request refused` with child exit status `126` +- `k_client -> k_proxy`: + - `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771` in `k_client` + - first real `POST /session/login` through that forward returns `Empty reply from server` + - `qvm_connect_8771.log` records `Request refused` with child exit status `126` +- Conclusion from this retest: + - both forwards fail in the same way + - local bind succeeds, but the actual qrexec `qubes.ConnectTCP` request is refused when the first connection is attempted + +Session note (2026-04-25, dom0 policy fix validated): +- After changing dom0 policy to use explicit destination VMs instead of `@default` for `qubes.ConnectTCP`, both forwards now work. +- Verified hop 1: + - in `k_proxy`, `POST http://127.0.0.1:8780/resource/counter` with `X-Proxy-Token: dev-proxy-token` succeeds + - response included counter value `1` +- Verified hop 2: + - in `k_client`, `POST http://127.0.0.1:8771/session/login` succeeds + - session token is returned through the `k_client -> k_proxy` forward +- Verified full end-to-end flow from `k_client`: + - login succeeded and returned session token + - `POST /session/status` succeeded + - `POST /resource/counter` succeeded twice with upstream values `2` and `3` + - `POST /session/logout` succeeded + - post-logout `POST /resource/counter` correctly returned `401 invalid or expired session` +- Current conclusion: + - `k_client -> k_proxy -> k_server` chain is operational + - session reuse and logout behavior are working in the current prototype + ## Known FIDO2 Transport Boundary - FIDO2 on this firmware is handled via USB HID (CTAPHID), not Wi-Fi/BLE/MQTT. 
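+
+The dom0 policy change validated above is described but not quoted in these notes. As a reference for the shape of the fix, a sketch follows; the file name is hypothetical and the exact lines are assumptions, recording only that explicit destination VMs replaced `@default` for `qubes.ConnectTCP`:
+
+```sh
+# Hypothetical dom0 policy drop (Qubes R4.1+ policy.d format):
+# service          argument  source    destination  action
+sudo tee /etc/qubes/policy.d/30-chromecard.policy <<'EOF'
+qubes.ConnectTCP  +8771     k_client  k_proxy      allow
+qubes.ConnectTCP  +8780     k_proxy   k_server     allow
+qubes.ConnectTCP  *         @anyvm    @anyvm       deny
+EOF
+```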
diff --git a/Workplan.md b/Workplan.md
index 6f624f0..d5f1b78 100644
--- a/Workplan.md
+++ b/Workplan.md
@@ -10,6 +10,7 @@ This is the execution plan for making ChromeCard FIDO2 development and validatio
 - Keep helper scripts such as `fido2_probe.py` and `webauthn_local_demo.py` at `/home/user/chromecard`.
 - Target deployment model is Qubes OS with 3 AppVMs based on `debian-13-xfce`: `k_client`, `k_proxy`, `k_server`.
 - Current authenticator link is card->`k_proxy` (USB), but architecture must allow migration to wireless phone-mediated validation.
+- VM execution path is SSH-first for experiments: `ssh <vm> <command>` and `scp <file> <vm>:~`.
 
 ## Goals
 
@@ -58,6 +59,13 @@ Exit criteria:
 
 Exit criteria:
 - Policy matches intended chain and is test-verified.
 
+Status (2026-04-24, remote diagnostics):
+- Confirmed active blocker remains Phase 1 network policy/pathing.
+- Evidence from live VM probes:
+  - `k_client (10.137.0.16) -> k_proxy (10.137.0.12:8771)`: TCP timeout.
+  - `k_proxy (10.137.0.12) -> k_server (10.137.0.13:8780)`: upstream timeout.
+- Local service health inside each VM is good, so the failure is inter-VM reachability, not local process startup.
+
 ## Phase 2: TLS Certificates and Service Endpoints
 
 1. Certificate model.
@@ -251,6 +259,79 @@ Exit criteria:
 - Re-scan relevant `.md` files before each new execution cycle and reconcile drift.
 - Record date-stamped session notes when priorities or blockers change.
 
+Status (2026-04-24, markdown maintenance):
+- Re-scanned the active workspace Markdown set and the main source-tree reference docs.
+- No workplan phase change was required from this pass.
+- Ongoing documentation watch item remains path drift in `CR_SDK_CK-main/README_HOST.md`, which still uses historical `./scripts/...` helper locations instead of workspace-root helper paths.
+- Operational note: the markdown scan path now runs cleanly after policy adjustment when invoked without a login shell.
+
+Status (2026-04-24, chain probe retry):
+- Phase 1 remains blocked, but the failure point is now narrowed further:
+  - the current refusal occurs at Qubes `qubes.ConnectTCP` policy/service evaluation for ports `22`, `8770`, and `8780`
+  - this happens before any end-to-end app-level request can be retried
+- Practical implication:
+  - do not spend time on `k_proxy_app.py` / `k_server_app.py` request handling until qrexec forwarding permits the intended hops again
+  - the next recovery action is to fix/activate the relevant Qubes `qubes.ConnectTCP` policy and then re-run the qrexec bridge checks before testing the HTTP flow
+
+Status (2026-04-25, post-restart probe):
+- Corrected the client-facing proxy port reference to `8771`.
+- SSH access to `k_proxy` and card visibility recovered after VM restart.
+- New immediate blockers are:
+  - `k_proxy` service not listening on `127.0.0.1:8771`
+  - `k_server` service not listening on `127.0.0.1:8780`
+  - qrexec forwarding for `8771` and `8780` still returns `Request refused`
+- The next retry should start services first, then re-test qrexec forwarding, and only then attempt the end-to-end client flow.
+
+Status (2026-04-25, service restart):
+- Local VM services are running again on the intended loopback ports:
+  - `k_server`: `127.0.0.1:8780`
+  - `k_proxy`: `127.0.0.1:8771`
+- Phase 1 remains blocked specifically by qrexec policy/forwarding refusal on those ports.
+- Next action is no longer app startup; it is fixing the `qubes.ConnectTCP` allow path for `8771` and `8780`.
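+
+Status note (illustrative): the restart flow above was SSH-driven. A minimal sketch, assuming the validated `ssh` aliases and the staged script paths from `Setup.md`; the `nohup` launch style and the `k_proxy` log file name are assumptions, not runbook commands:
+
+```sh
+# Restart each service in its VM, then confirm local health over loopback.
+ssh k_server 'nohup python3 /home/user/chromecard/k_server_app.py >~/k_server.log 2>&1 &'
+ssh k_server 'curl -s http://127.0.0.1:8780/health'   # expect {"ok": true, "service": "k_server", ...}
+
+ssh k_proxy 'nohup python3 /home/user/chromecard/k_proxy_app.py >~/k_proxy.log 2>&1 &'
+ssh k_proxy 'curl -s http://127.0.0.1:8771/health'    # expect {"ok": true, "service": "k_proxy", ...}
+```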
+ +Status (2026-04-25, in-VM forwarding test): +- Verified that using `qvm-connect-tcp` inside the source VMs still does not complete the client->proxy hop: + - bind succeeds locally, but first real connection gets `Request refused` +- Independent app-layer blocker also found in `k_proxy`: + - `python-fido2` is missing there, so local `/session/login` currently fails before card auth can succeed +- Current ordered blockers: + - first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771` + - second: install `python-fido2` in `k_proxy` + - third: re-test end-to-end login and then proxy->server counter flow + +Status (2026-04-25, after python3-fido2 install): +- `python3-fido2` blocker in `k_proxy` is resolved. +- Updated ordered blockers: + - first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771` + - second: restore CTAP HID device visibility/access in `k_proxy` (`No CTAP HID devices found`) + - third: re-test end-to-end login and then proxy->server counter flow + +Status (2026-04-25, card reattached): +- CTAP HID visibility/access in `k_proxy` is restored. +- Local proxy login is working again with the attached card. +- The only currently confirmed blocker for the end-to-end path is the `k_client -> k_proxy:8771` qrexec/`qvm-connect-tcp` refusal. + +Status (2026-04-25, clean forward retest): +- The retest shows the same qrexec failure mode on both hops, not just the client-facing one. +- Updated blocker statement: + - effective `qubes.ConnectTCP` allow path is failing for both + - `k_client -> k_proxy:8771` + - `k_proxy -> k_server:8780` +- App services and card path are currently good; forwarding remains the single active system blocker. + +Status (2026-04-25, dom0 policy fix validated): +- The explicit-destination dom0 `qubes.ConnectTCP` policy fix resolved forwarding on both hops. +- Current verified working chain: + - `k_client -> k_proxy:8771` + - `k_proxy -> k_server:8780` +- Current verified prototype behavior: + - session login works from `k_client` + - session status works + - protected counter flow reaches `k_server` + - session reuse avoids re-login for repeated counter calls + - logout invalidates the session and subsequent protected access returns `401` +- Immediate networking blocker is cleared. + Exit criteria: - New team member can follow docs end-to-end without path or tooling ambiguity. @@ -284,3 +365,14 @@ Exit criteria: - Decision on where user/session authority lives (`k_proxy` vs `k_server` vs split). - Target concurrency level for validation (parallel clients and parallel requests per client). - Preferred wireless transport/protocol between `k_proxy` and phone (for future phase). + +## Session Maintenance Notes (2026-04-24) + +- Top-level Markdown review completed for `PHASE5_RUNBOOK.md`, `Setup.md`, and `Workplan.md`. +- Current execution plan remains in sync with the Phase 5 runbook: + - prototype services at `/home/user/chromecard/k_proxy_app.py` and `/home/user/chromecard/k_server_app.py` + - run sequence documented in `/home/user/chromecard/PHASE5_RUNBOOK.md` +- No phase ordering or blocker changes were required from this review pass. +- Remote execution support is now active and validated: + - `ssh` command execution works for `k_client`, `k_proxy`, `k_server` + - `scp` push to VM home works (validated on `k_proxy`)
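+
+Status appendix (illustrative): the verified end-to-end behavior above can be re-checked from `k_client` with plain `curl` through the `8771` forward. The endpoint paths, user `alice`, and the post-logout `401` match the session notes; the JSON field names and the session-token header below are assumptions about the prototype, not its documented API:
+
+```sh
+# Login through the k_client -> k_proxy forward and extract the session token.
+LOGIN=$(curl -s -X POST http://127.0.0.1:8771/session/login \
+        -H 'Content-Type: application/json' -d '{"username": "alice"}')
+TOKEN=$(printf '%s' "$LOGIN" | python3 -c 'import json,sys; print(json.load(sys.stdin)["token"])')
+
+# Session status, protected counter (proxy -> server hop), then logout.
+curl -s -X POST http://127.0.0.1:8771/session/status   -H "X-Session-Token: $TOKEN"
+curl -s -X POST http://127.0.0.1:8771/resource/counter -H "X-Session-Token: $TOKEN"
+curl -s -X POST http://127.0.0.1:8771/session/logout   -H "X-Session-Token: $TOKEN"
+
+# Post-logout protected access should be rejected; expect HTTP 401 here.
+curl -s -o /dev/null -w '%{http_code}\n' -X POST \
+     http://127.0.0.1:8771/resource/counter -H "X-Session-Token: $TOKEN"
+```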