Update Qubes chain status docs

Morten V. Christiansen 2026-04-25 01:12:47 +02:00
parent 37600548ac
commit 6db7a7e217
2 changed files with 244 additions and 0 deletions

Setup.md

@@ -133,6 +133,10 @@ Thread-safety expectation:
- `/home/user/chromecard/k_proxy_app.py`
- `/home/user/chromecard/k_server_app.py`
- `/home/user/chromecard/PHASE5_RUNBOOK.md`
- Remote VM access is now available via SSH/SCP aliases (see the config sketch after this list):
- command execution: `ssh <host> <cmd>`
- file copy to VM home: `scp <file> <host>:~`
- validated hosts: `k_client`, `k_proxy`, `k_server`
- `west` is not currently installed/in PATH: `west not found`.
- The checked-out `CR_SDK_CK-main` tree appears incomplete for documented sysbuild role layout:
- missing: `mvp`, `setup`, `components`, `samples`
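The alias setup referenced above can be realized with a client-side SSH config pointing at the forwarded localhost listeners recorded later in this file (ports `2222`-`2224`). A minimal sketch, assuming the default Qubes AppVM user name `user`:

```sh
# Minimal ~/.ssh/config sketch for the validated host aliases.
# Assumptions: forwarded listeners on localhost:2222-2224 (see the
# chain probe notes below) and the default Qubes user "user".
cat >> ~/.ssh/config <<'EOF'
Host k_client
  HostName 127.0.0.1
  Port 2222
  User user
Host k_proxy
  HostName 127.0.0.1
  Port 2223
  User user
Host k_server
  HostName 127.0.0.1
  Port 2224
  User user
EOF
```

With that in place, `ssh k_proxy <cmd>` and `scp <file> k_proxy:~` behave as described above.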
@@ -156,6 +160,154 @@ Session note (2026-04-24):
- Local WebAuthn demo completed successfully for user `alice` (register + login).
- Phase 5 starter implementation added with session TTL, logout/invalidation, and proxy->server protected counter forwarding.
Session note (2026-04-24, doc maintenance):
- Top-level Markdown files were re-scanned: `PHASE5_RUNBOOK.md`, `Setup.md`, `Workplan.md`.
- `PHASE5_RUNBOOK.md` remains consistent with the current Phase 5 prototype paths and flow.
- No plan/setup drift was found requiring behavioral changes; docs remain aligned.
- SSH-based VM operation was validated for `k_client`, `k_proxy`, `k_server` (Debian `13.4` confirmed remotely).
- SCP file transfer to `k_proxy` home directory was validated with read-back.
Session note (2026-04-24, remote flow diagnostics):
- VM script staging gap found: `/home/user/chromecard/k_proxy_app.py`, `k_server_app.py`, and helper files were missing on AppVMs and were copied via `scp`.
- Services were started in VMs and verified locally:
- `k_proxy` local health OK on `127.0.0.1:8770` and `127.0.0.1:8771`
- `k_server` local health OK on `127.0.0.1:8780`
- Verified VM IPs during this run:
- `k_proxy`: `10.137.0.12`
- `k_server`: `10.137.0.13`
- `k_client`: `10.137.0.16`
- Current chain failure is network pathing/firewall:
- `k_client -> k_proxy` (`10.137.0.12:8771`) times out.
- `k_proxy -> k_server` (`10.137.0.13:8780`) times out.
- Proxy returns upstream error payload: `server unavailable: timed out`.
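A probe sequence reproducing the state above might look like this (IPs, ports, and the `/health` endpoint are from this run; the `--max-time` values are illustrative):

```sh
# Inside k_proxy: local health checks (both OK in this run).
curl -s http://127.0.0.1:8770/health
curl -s http://127.0.0.1:8771/health

# Inside k_server: local health check (OK).
curl -s http://127.0.0.1:8780/health

# Inside k_client: the cross-VM hop that timed out.
curl -s --max-time 5 http://10.137.0.12:8771/health \
  || echo 'k_client -> k_proxy timed out'

# Inside k_proxy: the upstream hop that timed out.
curl -s --max-time 5 http://10.137.0.13:8780/health \
  || echo 'k_proxy -> k_server timed out'
```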
Session note (2026-04-24, markdown re-scan):
- Re-read top-level workspace Markdown files: `Setup.md`, `Workplan.md`, `PHASE5_RUNBOOK.md`.
- Re-skimmed source-tree reference docs in `CR_SDK_CK-main`, including `BUILD.md`, `README.md`, `README_HOST.md`, `RELEASE.md`, and `distribute_bundle.md`.
- Current workspace docs remain aligned with the verified execution record.
- Source-tree doc drift remains unchanged:
- `README_HOST.md` still points to `./scripts/fido2_probe.py` and `./scripts/webauthn_local_demo.py`.
- Active workspace policy continues to treat those paths as historical; maintained helper paths remain `/home/user/chromecard/fido2_probe.py` and `/home/user/chromecard/webauthn_local_demo.py`.
- Source-tree build docs continue to describe a full SDK layout with `mvp`, `setup`, `components`, and `samples`, which is still not present in the current local checkout snapshot.
Session note (2026-04-24, policy retry):
- Markdown re-scan was retried after local policy changes.
- Re-running the workspace doc scan with a non-login shell completed cleanly, without the earlier SSH/socat startup noise in command output.
Session note (2026-04-24, chain probe retry):
- Re-probed the Qubes access path for `k_client -> k_proxy -> k_server`.
- Local forwarded SSH listener ports still exist on the host:
- `0.0.0.0:2222` -> `qrexec-client-vm 'k_client' qubes.ConnectTCP+22`
- `0.0.0.0:2223` -> `qrexec-client-vm 'k_proxy' qubes.ConnectTCP+22`
- `0.0.0.0:2224` -> `qrexec-client-vm 'k_server' qubes.ConnectTCP+22`
- These forwarded SSH ports currently fail immediately:
- `ssh k_client` / `ssh k_proxy` / `ssh k_server` close immediately on localhost forwarded ports.
- Direct `qrexec-client-vm <target> qubes.ConnectTCP+22` returns `Request refused`.
- Chain ports are currently blocked at the same qrexec layer:
- `qrexec-client-vm k_proxy qubes.ConnectTCP+8770` -> `Request refused`
- `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- This means the current blocker is active qrexec policy/service refusal for `qubes.ConnectTCP`, not the Python service code in `k_proxy_app.py` or `k_server_app.py`.
- Separate SSH config issue remains on the host:
- `/etc/ssh/ssh_config.d/20-systemd-ssh-proxy.conf` is still owned `root:root` but mode `777`, which causes OpenSSH to reject it as insecure on the normal login-shell path.
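The forwarded listeners above follow the standard Qubes `socat` + `qrexec-client-vm` pattern; a sketch of how they are typically started, plus the obvious permission fix, is below (the exact local invocation was not recorded, so the flags are the usual ones, not a verbatim copy):

```sh
# Host-side forwarded SSH listeners (standard Qubes pattern).
socat TCP-LISTEN:2222,reuseaddr,fork EXEC:"qrexec-client-vm k_client qubes.ConnectTCP+22" &
socat TCP-LISTEN:2223,reuseaddr,fork EXEC:"qrexec-client-vm k_proxy qubes.ConnectTCP+22" &
socat TCP-LISTEN:2224,reuseaddr,fork EXEC:"qrexec-client-vm k_server qubes.ConnectTCP+22" &

# The config fragment is already root-owned, so tightening the mode
# from 777 is enough to stop OpenSSH from rejecting it.
sudo chmod 644 /etc/ssh/ssh_config.d/20-systemd-ssh-proxy.conf
```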
Session note (2026-04-25, post-restart probe):
- Correct client-facing proxy port is `8771` for the current split-VM chain checks.
- SSH to `k_proxy` is working again.
- `k_proxy` card visibility is restored after VM restart and card reconnect:
- `/dev/hidraw0` and `/dev/hidraw1` are present in `k_proxy`
- Current service state after restart:
- `k_proxy` has no listener on `127.0.0.1:8771`
- `k_server` has no listener on `127.0.0.1:8780`
- Current qrexec chain state after restart:
- `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
- `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- Practical meaning:
- SSH and card attachment recovered
- phase-5 app services are not currently running in the VMs
- qrexec forwarding for the chain ports is still being refused
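A quick check for the no-listener state noted above, run inside each VM:

```sh
# Look for the phase-5 loopback listeners; prints a marker when absent.
ss -tlnp | grep -E ':(8771|8780)' || echo 'no phase-5 listener'
```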
Session note (2026-04-25, service restart):
- `k_server_app.py` was restarted successfully in `k_server`:
- PID `1320`
- listening on `127.0.0.1:8780`
- `/health` returns `{"ok": true, "service": "k_server", ...}`
- `k_proxy_app.py` was restarted successfully in `k_proxy`:
- PID `2774`
- listening on `127.0.0.1:8771`
- `/health` returns `{"ok": true, "service": "k_proxy", "active_sessions": 0, ...}`
- Despite local service recovery, qrexec forwarding is still denied:
- `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
- `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
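A restart sequence consistent with the record above might look like this; the exact invocation is an assumption (only `k_server.log` is confirmed later in these notes):

```sh
# Restart the services in their VMs (invocation details are assumptions).
ssh k_server 'cd ~/chromecard && nohup python3 k_server_app.py >> k_server.log 2>&1 & echo "pid $!"'
ssh k_proxy  'cd ~/chromecard && nohup python3 k_proxy_app.py  >> k_proxy.log  2>&1 & echo "pid $!"'

# Confirm the loopback health state recorded above.
ssh k_server 'curl -s http://127.0.0.1:8780/health'
ssh k_proxy  'curl -s http://127.0.0.1:8771/health'
```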
Session note (2026-04-25, in-VM forwarding test):
- Tested the intended in-VM forwarding path with `qvm-connect-tcp` instead of host-side `qrexec-client-vm`.
- Forwarders start and bind locally:
- in `k_client`: `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771`
- in `k_proxy`: `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780`
- But the actual client->proxy connection is still refused when used:
- `k_client` forward log shows `Request refused`
- `socat` reports child exit status `126` and `Connection reset by peer`
- Local login on `k_proxy` reaches the app but fails on the auth dependency:
- `POST /session/login` to `http://127.0.0.1:8771` returns `401`
- details: `Missing dependency: python-fido2 ... No module named 'fido2'`
- `k_server` was not reached during this login test; current `k_server.log` only shows `/health`.
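The forwarding test above in command form; the login body shape is an assumption, the rest matches the recorded run:

```sh
# In k_client: bind localhost:8771, forwarding to k_proxy:8771 over qrexec.
qvm-connect-tcp 8771:k_proxy:8771

# In k_proxy: bind localhost:8780, forwarding to k_server:8780.
qvm-connect-tcp 8780:k_server:8780

# First real use from k_client; in this run the underlying qrexec request
# was refused and the connection reset despite the successful local bind.
curl -s -X POST http://127.0.0.1:8771/session/login \
  -H 'Content-Type: application/json' \
  -d '{"username": "alice"}'   # field names are an assumption
```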
Session note (2026-04-25, after python3-fido2 install):
- `k_proxy` was restarted after `python3-fido2` installation and now listens again on `127.0.0.1:8771`.
- The previous Python import blocker is resolved; local login now reaches the CTAP probe path.
- Current local login result on `k_proxy`:
- `{"ok": false, "error": "card auth failed", "details": "No CTAP HID devices found."}`
- Current forwarded login result from `k_client` is still not completing:
- `curl http://127.0.0.1:8771/session/login` -> `Empty reply from server`
- `qvm_connect_8771.log` still shows repeated `Request refused` and child exit status `126`
- Practical meaning:
- Python dependency issue in `k_proxy` is fixed
- card access inside `k_proxy` is currently missing again at CTAP/HID level
- `k_client -> k_proxy` qrexec forwarding is still effectively denied/refused
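For reference, the dependency fix: package installs inside a Qubes AppVM do not persist, so the install belongs in the `debian-13-xfce` template, followed by an AppVM restart:

```sh
# In the debian-13-xfce template VM:
sudo apt update && sudo apt install -y python3-fido2

# From dom0: restart the AppVM so it picks up the template change.
qvm-shutdown --wait k_proxy && qvm-start k_proxy
```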
Session note (2026-04-25, card reattached):
- Card visibility in `k_proxy` is restored again:
- `/dev/hidraw0` and `/dev/hidraw1` present
- `fido2_probe.py --list` detects ChromeCard on `/dev/hidraw0`
- Local login on `k_proxy` now succeeds again:
- `POST /session/login` on `127.0.0.1:8771` returns `200`
- session creation for user `alice` succeeded
- Remaining failure is isolated to the client-facing qrexec path:
- `k_client` -> `localhost:8771` through `qvm-connect-tcp` still returns `Empty reply from server`
- `qvm_connect_8771.log` still shows `Request refused`
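The card-visibility check used above, inside `k_proxy`:

```sh
# Confirm HID nodes and CTAP-level detection after reattaching the card.
ls -l /dev/hidraw*
python3 /home/user/chromecard/fido2_probe.py --list
```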
Session note (2026-04-25, clean forward retest):
- Re-ran both forwards and exercised each hop immediately after local bind.
- `k_proxy -> k_server`:
- `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780` in `k_proxy`
- first real `POST /resource/counter` through that forward returns `Empty reply from server`
- `qvm_connect_8780.log` then records `Request refused` with child exit status `126`
- `k_client -> k_proxy`:
- `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771` in `k_client`
- first real `POST /session/login` through that forward returns `Empty reply from server`
- `qvm_connect_8771.log` records `Request refused` with child exit status `126`
- Conclusion from this retest:
- both forwards fail in the same way
- local bind succeeds, but the actual qrexec `qubes.ConnectTCP` request is refused when the first connection is attempted
Session note (2026-04-25, dom0 policy fix validated):
- After changing dom0 policy to use explicit destination VMs instead of `@default` for `qubes.ConnectTCP`, both forwards now work.
- Verified hop 1:
- in `k_proxy`, `POST http://127.0.0.1:8780/resource/counter` with `X-Proxy-Token: dev-proxy-token` succeeds
- response included counter value `1`
- Verified hop 2:
- in `k_client`, `POST http://127.0.0.1:8771/session/login` succeeds
- session token is returned through the `k_client -> k_proxy` forward
- Verified full end-to-end flow from `k_client`:
- login succeeded and returned session token
- `POST /session/status` succeeded
- `POST /resource/counter` succeeded twice with upstream values `2` and `3`
- `POST /session/logout` succeeded
- post-logout `POST /resource/counter` correctly returned `401 invalid or expired session`
- Current conclusion:
- `k_client -> k_proxy -> k_server` chain is operational
- session reuse and logout behavior are working in the current prototype
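A dom0 policy sketch consistent with the fix described above; the policy file name is an assumption, and the lines use Qubes 4.1+ policy syntax with explicit destination VMs instead of `@default`:

```sh
# In dom0: allow exactly the two intended hops, with explicit destinations.
sudo tee /etc/qubes/policy.d/30-chromecard.policy <<'EOF'
qubes.ConnectTCP +8771 k_client k_proxy allow
qubes.ConnectTCP +8780 k_proxy k_server allow
EOF
```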
## Known FIDO2 Transport Boundary
- FIDO2 on this firmware is handled via USB HID (CTAPHID), not Wi-Fi/BLE/MQTT.

Workplan.md

@@ -10,6 +10,7 @@ This is the execution plan for making ChromeCard FIDO2 development and validatio
- Keep helper scripts such as `fido2_probe.py` and `webauthn_local_demo.py` at `/home/user/chromecard`.
- Target deployment model is Qubes OS with 3 AppVMs based on `debian-13-xfce`: `k_client`, `k_proxy`, `k_server`.
- Current authenticator link is card->`k_proxy` (USB), but architecture must allow migration to wireless phone-mediated validation.
- VM execution path is SSH-first for experiments: `ssh <host> <cmd>` and `scp <file> <host>:~`.
## Goals ## Goals
@@ -58,6 +59,13 @@ Exit criteria:
Exit criteria:
- Policy matches intended chain and is test-verified.
Status (2026-04-24, remote diagnostics):
- Confirmed active blocker remains Phase 1 network policy/pathing.
- Evidence from live VM probes:
- `k_client (10.137.0.16) -> k_proxy (10.137.0.12:8771)`: TCP timeout.
- `k_proxy (10.137.0.12) -> k_server (10.137.0.13:8780)`: upstream timeout.
- Local service health inside each VM is good, so failure is inter-VM reachability, not local process startup.
## Phase 2: TLS Certificates and Service Endpoints
1. Certificate model.
@@ -251,6 +259,79 @@ Exit criteria:
- Re-scan relevant `.md` files before each new execution cycle and reconcile drift.
- Record date-stamped session notes when priorities or blockers change.
Status (2026-04-24, markdown maintenance):
- Re-scanned the active workspace Markdown set and the main source-tree reference docs.
- No workplan phase change was required from this pass.
- Ongoing documentation watch item remains path drift in `CR_SDK_CK-main/README_HOST.md`, which still uses historical `./scripts/...` helper locations instead of workspace-root helper paths.
- Operational note: the markdown scan path now runs cleanly after policy adjustment when invoked without a login shell.
Status (2026-04-24, chain probe retry):
- Phase 1 remains blocked, but the failure point is now narrowed further:
- current refusal occurs at Qubes `qubes.ConnectTCP` policy/service evaluation for ports `22`, `8770`, and `8780`
- this happens before any end-to-end app-level request can be retried
- Practical implication:
- do not spend time on `k_proxy_app.py` / `k_server_app.py` request handling until qrexec forwarding permits the intended hops again
- next recovery action is to fix/activate the relevant Qubes `qubes.ConnectTCP` policy and then re-run the qrexec bridge checks before testing HTTP flow
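A minimal bridge check for that retest, run from each calling VM; success is a connection that stays open instead of an immediate `Request refused`:

```sh
# From k_client: probe the proxy-facing bridge.
qrexec-client-vm k_proxy qubes.ConnectTCP+8771 </dev/null

# From k_proxy: probe the server-facing bridge.
qrexec-client-vm k_server qubes.ConnectTCP+8780 </dev/null
```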
Status (2026-04-25, post-restart probe):
- Corrected the client-facing proxy port reference to `8771`.
- SSH access to `k_proxy` and card visibility recovered after VM restart.
- New immediate blockers are:
- `k_proxy` service not listening on `127.0.0.1:8771`
- `k_server` service not listening on `127.0.0.1:8780`
- qrexec forwarding for `8771` and `8780` still returns `Request refused`
- Next retry should start services first, then re-test qrexec forwarding and only then attempt end-to-end client flow.
Status (2026-04-25, service restart):
- Local VM services are running again on the intended loopback ports:
- `k_server`: `127.0.0.1:8780`
- `k_proxy`: `127.0.0.1:8771`
- Phase 1 remains blocked specifically by qrexec policy/forwarding refusal on those ports.
- Next action is no longer app startup; it is fixing the `qubes.ConnectTCP` allow path for `8771` and `8780`.
Status (2026-04-25, in-VM forwarding test):
- Verified that using `qvm-connect-tcp` inside the source VMs still does not complete the client->proxy hop:
- bind succeeds locally, but first real connection gets `Request refused`
- Independent app-layer blocker also found in `k_proxy`:
- `python-fido2` is missing there, so local `/session/login` currently fails before card auth can succeed
- Current ordered blockers:
- first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
- second: install `python-fido2` in `k_proxy`
- third: re-test end-to-end login and then proxy->server counter flow
Status (2026-04-25, after python3-fido2 install):
- `python3-fido2` blocker in `k_proxy` is resolved.
- Updated ordered blockers:
- first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
- second: restore CTAP HID device visibility/access in `k_proxy` (`No CTAP HID devices found`)
- third: re-test end-to-end login and then proxy->server counter flow
Status (2026-04-25, card reattached):
- CTAP HID visibility/access in `k_proxy` is restored.
- Local proxy login is working again with the attached card.
- The only currently confirmed blocker for the end-to-end path is the `k_client -> k_proxy:8771` qrexec/`qvm-connect-tcp` refusal.
Status (2026-04-25, clean forward retest):
- The retest shows the same qrexec failure mode on both hops, not just the client-facing one.
- Updated blocker statement:
- effective `qubes.ConnectTCP` allow path is failing for both
- `k_client -> k_proxy:8771`
- `k_proxy -> k_server:8780`
- App services and card path are currently good; forwarding remains the single active system blocker.
Status (2026-04-25, dom0 policy fix validated):
- The explicit-destination dom0 `qubes.ConnectTCP` policy fix resolved forwarding on both hops.
- Current verified working chain:
- `k_client -> k_proxy:8771`
- `k_proxy -> k_server:8780`
- Current verified prototype behavior:
- session login works from `k_client`
- session status works
- protected counter flow reaches `k_server`
- session reuse avoids re-login for repeated counter calls
- logout invalidates the session and subsequent protected access returns `401`
- Immediate networking blocker is cleared.
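An end-to-end verification sketch matching the validated behavior; run inside `k_client` with both forwards active. Endpoint paths are from this plan; the session-token header and JSON field names are assumptions:

```sh
# Login through the k_client -> k_proxy forward and capture the token
# (the "token" field name is an assumption).
TOKEN=$(curl -s -X POST http://127.0.0.1:8771/session/login \
  -H 'Content-Type: application/json' -d '{"username": "alice"}' \
  | python3 -c 'import sys, json; print(json.load(sys.stdin)["token"])')

# Session reuse: status plus two counter calls on a single login.
curl -s -X POST http://127.0.0.1:8771/session/status   -H "X-Session-Token: $TOKEN"
curl -s -X POST http://127.0.0.1:8771/resource/counter -H "X-Session-Token: $TOKEN"
curl -s -X POST http://127.0.0.1:8771/resource/counter -H "X-Session-Token: $TOKEN"

# Logout, then expect 401 "invalid or expired session" on further access.
curl -s -X POST http://127.0.0.1:8771/session/logout   -H "X-Session-Token: $TOKEN"
curl -s -X POST http://127.0.0.1:8771/resource/counter -H "X-Session-Token: $TOKEN"
```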
Exit criteria:
- New team member can follow docs end-to-end without path or tooling ambiguity.
@@ -284,3 +365,14 @@ Exit criteria:
- Decision on where user/session authority lives (`k_proxy` vs `k_server` vs split).
- Target concurrency level for validation (parallel clients and parallel requests per client).
- Preferred wireless transport/protocol between `k_proxy` and phone (for future phase).
## Session Maintenance Notes (2026-04-24)
- Top-level Markdown review completed for `PHASE5_RUNBOOK.md`, `Setup.md`, and `Workplan.md`.
- Current execution plan remains in sync with the Phase 5 runbook:
- prototype services at `/home/user/chromecard/k_proxy_app.py` and `/home/user/chromecard/k_server_app.py`
- run sequence documented in `/home/user/chromecard/PHASE5_RUNBOOK.md`
- No phase ordering or blocker changes were required from this review pass.
- Remote execution support is now active and validated:
- `ssh` command execution works for `k_client`, `k_proxy`, `k_server`
- `scp` push to VM home works (validated on `k_proxy`)
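The validation behind the last two bullets, in command form (the test file name is illustrative):

```sh
# Remote command execution check (e.g. confirming the Debian release).
ssh k_proxy 'cat /etc/debian_version'

# SCP push with read-back, validated on k_proxy.
echo 'scp check' > note.txt
scp note.txt k_proxy:~ && ssh k_proxy 'cat ~/note.txt'
```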