Update Qubes chain status docs
commit 6db7a7e217 (parent 37600548ac)

Setup.md | 152
@@ -133,6 +133,10 @@ Thread-safety expectation:

- `/home/user/chromecard/k_proxy_app.py`
- `/home/user/chromecard/k_server_app.py`
- `/home/user/chromecard/PHASE5_RUNBOOK.md`
- Remote VM access is now available via SSH/SCP aliases (assumed alias entries are sketched at the end of this hunk):
  - command execution: `ssh <host> <cmd>`
  - file copy to VM home: `scp <file> <host>:~`
  - validated hosts: `k_client`, `k_proxy`, `k_server`
- `west` is not currently installed/in PATH: `west not found`.
- The checked-out `CR_SDK_CK-main` tree appears incomplete for the documented sysbuild role layout:
  - missing: `mvp`, `setup`, `components`, `samples`
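
For orientation, the `ssh <host>`/`scp <file> <host>:~` aliases above most plausibly resolve through plain `~/.ssh/config` entries pointing at the qrexec-forwarded localhost SSH ports recorded in the chain-probe note further down (2222-2224). The entries below are an assumed sketch, not a copy of the actual host configuration:

```sh
# Assumed ~/.ssh/config entries behind the k_client/k_proxy/k_server aliases.
# Ports follow the qrexec-forwarded SSH listeners (2222-2224) documented in
# the chain-probe note below; user/home layout follows the documented paths.
cat >> ~/.ssh/config <<'EOF'
Host k_client
    HostName 127.0.0.1
    Port 2222
    User user
Host k_proxy
    HostName 127.0.0.1
    Port 2223
    User user
Host k_server
    HostName 127.0.0.1
    Port 2224
    User user
EOF
```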

@@ -156,6 +160,154 @@ Session note (2026-04-24):

- Local WebAuthn demo completed successfully for user `alice` (register + login).
- Phase 5 starter implementation added with session TTL, logout/invalidation, and proxy->server protected counter forwarding.

Session note (2026-04-24, doc maintenance):

- Top-level Markdown files were re-scanned: `PHASE5_RUNBOOK.md`, `Setup.md`, `Workplan.md`.
- `PHASE5_RUNBOOK.md` remains consistent with the current Phase 5 prototype paths and flow.
- No plan/setup drift requiring behavioral changes was found; docs remain aligned.
- SSH-based VM operation was validated for `k_client`, `k_proxy`, `k_server` (Debian `13.4` confirmed remotely).
- SCP file transfer to the `k_proxy` home directory was validated with read-back.

Session note (2026-04-24, remote flow diagnostics):

- VM script staging gap found: `/home/user/chromecard/k_proxy_app.py`, `k_server_app.py`, and helper files were missing on the AppVMs and were copied over via `scp`.
- Services were started in the VMs and verified locally (see the probe sketch after this note):
  - `k_proxy` local health OK on `127.0.0.1:8770` and `127.0.0.1:8771`
  - `k_server` local health OK on `127.0.0.1:8780`
- Verified VM IPs during this run:
  - `k_proxy`: `10.137.0.12`
  - `k_server`: `10.137.0.13`
  - `k_client`: `10.137.0.16`
- Current chain failure is network pathing/firewall:
  - `k_client -> k_proxy` (`10.137.0.12:8771`) times out.
  - `k_proxy -> k_server` (`10.137.0.13:8780`) times out.
  - the proxy returns the upstream error payload `server unavailable: timed out`.
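
These local health checks can be repeated over the SSH aliases; a minimal sketch (the `/health` endpoint path is quoted in the service-restart note further down):

```sh
# Loopback health probes, run from the dev machine via the SSH aliases.
ssh k_proxy  "curl -s http://127.0.0.1:8770/health && curl -s http://127.0.0.1:8771/health"
ssh k_server "curl -s http://127.0.0.1:8780/health"
```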

Session note (2026-04-24, markdown re-scan):

- Re-read top-level workspace Markdown files: `Setup.md`, `Workplan.md`, `PHASE5_RUNBOOK.md`.
- Re-skimmed source-tree reference docs in `CR_SDK_CK-main`, including `BUILD.md`, `README.md`, `README_HOST.md`, `RELEASE.md`, and `distribute_bundle.md`.
- Current workspace docs remain aligned with the verified execution record.
- Source-tree doc drift remains unchanged:
  - `README_HOST.md` still points to `./scripts/fido2_probe.py` and `./scripts/webauthn_local_demo.py`.
  - active workspace policy continues to treat those paths as historical; the maintained helper paths remain `/home/user/chromecard/fido2_probe.py` and `/home/user/chromecard/webauthn_local_demo.py`.
  - source-tree build docs continue to describe a full SDK layout with `mvp`, `setup`, `components`, and `samples`, which is still not present in the current local checkout snapshot.

Session note (2026-04-24, policy retry):

- The Markdown re-scan was retried after local policy changes.
- Re-running the workspace doc scan with a non-login shell completed cleanly, without the earlier SSH/socat startup noise in command output.

Session note (2026-04-24, chain probe retry):

- Re-probed the Qubes access path for `k_client -> k_proxy -> k_server`.
- Local forwarded SSH listener ports still exist on the host:
  - `0.0.0.0:2222` -> `qrexec-client-vm 'k_client' qubes.ConnectTCP+22`
  - `0.0.0.0:2223` -> `qrexec-client-vm 'k_proxy' qubes.ConnectTCP+22`
  - `0.0.0.0:2224` -> `qrexec-client-vm 'k_server' qubes.ConnectTCP+22`
- These forwarded SSH ports currently fail immediately:
  - `ssh k_client` / `ssh k_proxy` / `ssh k_server` close immediately on the localhost forwarded ports.
  - direct `qrexec-client-vm <target> qubes.ConnectTCP+22` returns `Request refused`.
- Chain ports are currently blocked at the same qrexec layer (see the probe sketch after this note):
  - `qrexec-client-vm k_proxy qubes.ConnectTCP+8770` -> `Request refused`
  - `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- This means the current blocker is an active qrexec policy/service refusal for `qubes.ConnectTCP`, not the Python service code in `k_proxy_app.py` or `k_server_app.py`.
- A separate SSH config issue remains on the host:
  - `/etc/ssh/ssh_config.d/20-systemd-ssh-proxy.conf` is still owned `root:root` but has mode `777`, which causes OpenSSH to reject it as insecure on the normal login-shell path.
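
Both the refusals and the ssh_config problem are reproducible/fixable with one-liners; a sketch (the `644` target mode is an assumption; any non-world-writable mode satisfies OpenSSH):

```sh
# Reproduce the qrexec-layer refusals (expected output: "Request refused").
qrexec-client-vm k_proxy  qubes.ConnectTCP+8770
qrexec-client-vm k_server qubes.ConnectTCP+8780

# Fix the world-writable ssh_config fragment that OpenSSH rejects as insecure.
sudo chmod 644 /etc/ssh/ssh_config.d/20-systemd-ssh-proxy.conf
```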

Session note (2026-04-25, post-restart probe):

- The correct client-facing proxy port is `8771` for the current split-VM chain checks.
- SSH to `k_proxy` is working again.
- `k_proxy` card visibility is restored after VM restart and card reconnect:
  - `/dev/hidraw0` and `/dev/hidraw1` are present in `k_proxy`
- Current service state after restart:
  - `k_proxy` has no listener on `127.0.0.1:8771`
  - `k_server` has no listener on `127.0.0.1:8780`
- Current qrexec chain state after restart:
  - `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
  - `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- Practical meaning:
  - SSH and card attachment recovered
  - Phase 5 app services are not currently running in the VMs
  - qrexec forwarding for the chain ports is still being refused

Session note (2026-04-25, service restart):

- `k_server_app.py` was restarted successfully in `k_server` (a restart sketch follows this note):
  - PID `1320`
  - listening on `127.0.0.1:8780`
  - `/health` returns `{"ok": true, "service": "k_server", ...}`
- `k_proxy_app.py` was restarted successfully in `k_proxy`:
  - PID `2774`
  - listening on `127.0.0.1:8771`
  - `/health` returns `{"ok": true, "service": "k_proxy", "active_sessions": 0, ...}`
- Despite local service recovery, qrexec forwarding is still denied:
  - `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
  - `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
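
A restart sequence consistent with this note, run over the SSH aliases; the `nohup` usage and the `k_proxy.log` name are assumptions (only `k_server.log` is attested below), not the recorded commands:

```sh
# Illustrative service restart; app paths are from the notes, logging is assumed.
ssh k_server "nohup python3 /home/user/chromecard/k_server_app.py >> ~/k_server.log 2>&1 & echo PID \$!"
ssh k_proxy  "nohup python3 /home/user/chromecard/k_proxy_app.py  >> ~/k_proxy.log  2>&1 & echo PID \$!"

# Confirm the loopback listeners and health responses.
ssh k_server "curl -s http://127.0.0.1:8780/health"
ssh k_proxy  "curl -s http://127.0.0.1:8771/health"
```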

Session note (2026-04-25, in-VM forwarding test):

- Tested the intended in-VM forwarding path with `qvm-connect-tcp` instead of host-side `qrexec-client-vm`.
- Forwarders start and bind locally (see the sketch after this note):
  - in `k_client`: `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771`
  - in `k_proxy`: `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780`
- But the actual client->proxy connection is still refused when used:
  - the `k_client` forward log shows `Request refused`
  - `socat` reports child exit status `126` and `Connection reset by peer`
- Local login on `k_proxy` reaches the app but fails on the auth dependency:
  - `POST /session/login` to `http://127.0.0.1:8771` returns `401`
  - details: `Missing dependency: python-fido2 ... No module named 'fido2'`
- `k_server` was not reached during this login test; the current `k_server.log` shows only `/health` requests.
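
The forwarding attempt in this note can be reproduced with the stock Qubes helper; a sketch (backgrounding and the log redirect are assumptions; the `qvm_connect_8771.log` name itself appears in later notes):

```sh
# Inside k_client: bind localhost:8771 and forward to k_proxy:8771 over qrexec.
qvm-connect-tcp 8771:k_proxy:8771 > ~/qvm_connect_8771.log 2>&1 &

# First real use of the forward; at this stage it produced "Empty reply from
# server" while the log recorded "Request refused" (child exit status 126).
curl -s -X POST http://127.0.0.1:8771/session/login
```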

Session note (2026-04-25, after python3-fido2 install):

- `k_proxy` was restarted after the `python3-fido2` installation (install sketch below) and now listens again on `127.0.0.1:8771`.
- The previous Python import blocker is resolved; local login now reaches the CTAP probe path.
- Current local login result on `k_proxy`:
  - `{"ok": false, "error": "card auth failed", "details": "No CTAP HID devices found."}`
- Current forwarded login result from `k_client` is still not completing:
  - `curl http://127.0.0.1:8771/session/login` -> `Empty reply from server`
  - `qvm_connect_8771.log` still shows repeated `Request refused` and child exit status `126`
- Practical meaning:
  - the Python dependency issue in `k_proxy` is fixed
  - card access inside `k_proxy` is currently missing again at the CTAP/HID level
  - `k_client -> k_proxy` qrexec forwarding is still effectively denied/refused
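
The dependency fix itself, as a hedged sketch (the Debian package name `python3-fido2` is from this note; the import check is illustrative):

```sh
# Install the FIDO2 library in k_proxy and confirm the module now imports.
ssh k_proxy "sudo apt-get install -y python3-fido2"
ssh k_proxy "python3 -c 'import fido2; print(fido2.__version__)'"
```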

Session note (2026-04-25, card reattached):

- Card visibility in `k_proxy` is restored again:
  - `/dev/hidraw0` and `/dev/hidraw1` are present
  - `fido2_probe.py --list` detects the ChromeCard on `/dev/hidraw0`
- Local login on `k_proxy` now succeeds again:
  - `POST /session/login` on `127.0.0.1:8771` returns `200`
  - session creation for user `alice` succeeded
- The remaining failure is isolated to the client-facing qrexec path:
  - `k_client` -> `localhost:8771` through `qvm-connect-tcp` still returns `Empty reply from server`
  - `qvm_connect_8771.log` still shows `Request refused`

Session note (2026-04-25, clean forward retest):

- Re-ran both forwards and exercised each hop immediately after the local bind.
- `k_proxy -> k_server`:
  - `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780` in `k_proxy`
  - the first real `POST /resource/counter` through that forward returns `Empty reply from server`
  - `qvm_connect_8780.log` then records `Request refused` with child exit status `126`
- `k_client -> k_proxy`:
  - `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771` in `k_client`
  - the first real `POST /session/login` through that forward returns `Empty reply from server`
  - `qvm_connect_8771.log` records `Request refused` with child exit status `126`
- Conclusion from this retest:
  - both forwards fail in the same way
  - the local bind succeeds, but the actual qrexec `qubes.ConnectTCP` request is refused when the first connection is attempted

Session note (2026-04-25, dom0 policy fix validated):

- After changing the dom0 policy to use explicit destination VMs instead of `@default` for `qubes.ConnectTCP`, both forwards now work (see the policy sketch after this note).
- Verified hop 1:
  - in `k_proxy`, `POST http://127.0.0.1:8780/resource/counter` with `X-Proxy-Token: dev-proxy-token` succeeds
  - the response included counter value `1`
- Verified hop 2:
  - in `k_client`, `POST http://127.0.0.1:8771/session/login` succeeds
  - the session token is returned through the `k_client -> k_proxy` forward
- Verified full end-to-end flow from `k_client`:
  - login succeeded and returned a session token
  - `POST /session/status` succeeded
  - `POST /resource/counter` succeeded twice with upstream values `2` and `3`
  - `POST /session/logout` succeeded
  - post-logout `POST /resource/counter` correctly returned `401 invalid or expired session`
- Current conclusion:
  - the `k_client -> k_proxy -> k_server` chain is operational
  - session reuse and logout behavior are working in the current prototype
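
The note does not record the exact dom0 rules; the snippet below is a plausible reconstruction under the Qubes 4.1+ policy format, with the policy file name assumed:

```sh
# dom0: explicit-destination qubes.ConnectTCP rules (illustrative; file name assumed).
sudo tee /etc/qubes/policy.d/30-chromecard.policy >/dev/null <<'EOF'
qubes.ConnectTCP +8771 k_client k_proxy allow
qubes.ConnectTCP +8780 k_proxy k_server allow
EOF
```

And a sketch of the verified end-to-end sequence from `k_client` (endpoint paths are from this note; how the session token is attached to follow-up requests is not recorded here and is omitted):

```sh
# End-to-end smoke sequence through the k_client -> k_proxy forward on :8771.
curl -s -X POST http://127.0.0.1:8771/session/login     # 200, returns session token
curl -s -X POST http://127.0.0.1:8771/session/status    # 200 (token attached)
curl -s -X POST http://127.0.0.1:8771/resource/counter  # upstream counter value
curl -s -X POST http://127.0.0.1:8771/session/logout    # invalidates the session
curl -s -X POST http://127.0.0.1:8771/resource/counter  # 401 invalid or expired session
```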

## Known FIDO2 Transport Boundary

- FIDO2 on this firmware is handled via USB HID (CTAPHID), not Wi-Fi/BLE/MQTT.

Workplan.md | 92

@@ -10,6 +10,7 @@ This is the execution plan for making ChromeCard FIDO2 development and validation

- Keep helper scripts such as `fido2_probe.py` and `webauthn_local_demo.py` at `/home/user/chromecard`.
- Target deployment model is Qubes OS with 3 AppVMs based on `debian-13-xfce`: `k_client`, `k_proxy`, `k_server`.
- Current authenticator link is card->`k_proxy` (USB), but the architecture must allow migration to wireless phone-mediated validation.
- VM execution path is SSH-first for experiments: `ssh <host> <cmd>` and `scp <file> <host>:~`.

## Goals

@@ -58,6 +59,13 @@ Exit criteria:

Exit criteria:

- Policy matches the intended chain and is test-verified.

Status (2026-04-24, remote diagnostics):

- Confirmed the active blocker remains Phase 1 network policy/pathing.
- Evidence from live VM probes (a probe sketch follows this list):
  - `k_client (10.137.0.16) -> k_proxy (10.137.0.12:8771)`: TCP timeout.
  - `k_proxy (10.137.0.12) -> k_server (10.137.0.13:8780)`: upstream timeout.
- Local service health inside each VM is good, so the failure is inter-VM reachability, not local process startup.
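
A sketch of probes matching this evidence (the `-m 5` timeout value and the `/health` path are assumptions):

```sh
# Direct inter-VM reachability probes over the VM IPs; both timed out in this run.
ssh k_client "curl -m 5 -s http://10.137.0.12:8771/health"
ssh k_proxy  "curl -m 5 -s http://10.137.0.13:8780/health"
```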

## Phase 2: TLS Certificates and Service Endpoints

1. Certificate model.

@@ -251,6 +259,79 @@ Exit criteria:

- Re-scan relevant `.md` files before each new execution cycle and reconcile drift.
- Record date-stamped session notes when priorities or blockers change.

Status (2026-04-24, markdown maintenance):

- Re-scanned the active workspace Markdown set and the main source-tree reference docs.
- No workplan phase change was required from this pass.
- The ongoing documentation watch item remains path drift in `CR_SDK_CK-main/README_HOST.md`, which still uses the historical `./scripts/...` helper locations instead of the workspace-root helper paths.
- Operational note: the Markdown scan path now runs cleanly after the policy adjustment when invoked without a login shell.

Status (2026-04-24, chain probe retry):

- Phase 1 remains blocked, but the failure point is now narrowed further:
  - the current refusal occurs at Qubes `qubes.ConnectTCP` policy/service evaluation for ports `22`, `8770`, and `8780`
  - this happens before any end-to-end app-level request can be retried
- Practical implication:
  - do not spend time on `k_proxy_app.py` / `k_server_app.py` request handling until qrexec forwarding permits the intended hops again
  - the next recovery action is to fix/activate the relevant Qubes `qubes.ConnectTCP` policy and then re-run the qrexec bridge checks before testing the HTTP flow

Status (2026-04-25, post-restart probe):

- Corrected the client-facing proxy port reference to `8771`.
- SSH access to `k_proxy` and card visibility recovered after the VM restart.
- The new immediate blockers are:
  - the `k_proxy` service not listening on `127.0.0.1:8771`
  - the `k_server` service not listening on `127.0.0.1:8780`
  - qrexec forwarding for `8771` and `8780` still returning `Request refused`
- The next retry should start the services first, then re-test qrexec forwarding, and only then attempt the end-to-end client flow.

Status (2026-04-25, service restart):

- Local VM services are running again on the intended loopback ports:
  - `k_server`: `127.0.0.1:8780`
  - `k_proxy`: `127.0.0.1:8771`
- Phase 1 remains blocked specifically by the qrexec policy/forwarding refusal on those ports.
- The next action is no longer app startup; it is fixing the `qubes.ConnectTCP` allow path for `8771` and `8780`.

Status (2026-04-25, in-VM forwarding test):

- Verified that using `qvm-connect-tcp` inside the source VMs still does not complete the client->proxy hop:
  - the bind succeeds locally, but the first real connection gets `Request refused`
- An independent app-layer blocker was also found in `k_proxy`:
  - `python-fido2` is missing there, so local `/session/login` currently fails before card auth can succeed
- Current ordered blockers:
  - first: an effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
  - second: install `python-fido2` in `k_proxy`
  - third: re-test end-to-end login and then the proxy->server counter flow

Status (2026-04-25, after python3-fido2 install):

- The `python3-fido2` blocker in `k_proxy` is resolved.
- Updated ordered blockers:
  - first: an effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
  - second: restore CTAP HID device visibility/access in `k_proxy` (`No CTAP HID devices found`)
  - third: re-test end-to-end login and then the proxy->server counter flow

Status (2026-04-25, card reattached):

- CTAP HID visibility/access in `k_proxy` is restored.
- Local proxy login is working again with the attached card.
- The only currently confirmed blocker for the end-to-end path is the `k_client -> k_proxy:8771` qrexec/`qvm-connect-tcp` refusal.

Status (2026-04-25, clean forward retest):

- The retest shows the same qrexec failure mode on both hops, not just the client-facing one.
- Updated blocker statement: the effective `qubes.ConnectTCP` allow path is failing for both
  - `k_client -> k_proxy:8771`
  - `k_proxy -> k_server:8780`
- App services and the card path are currently good; forwarding remains the single active system blocker.

Status (2026-04-25, dom0 policy fix validated):

- The explicit-destination dom0 `qubes.ConnectTCP` policy fix resolved forwarding on both hops.
- Current verified working chain:
  - `k_client -> k_proxy:8771`
  - `k_proxy -> k_server:8780`
- Current verified prototype behavior:
  - session login works from `k_client`
  - session status works
  - the protected counter flow reaches `k_server`
  - session reuse avoids re-login for repeated counter calls
  - logout invalidates the session, and subsequent protected access returns `401`
- The immediate networking blocker is cleared.

Exit criteria:

- A new team member can follow the docs end-to-end without path or tooling ambiguity.

@@ -284,3 +365,14 @@ Exit criteria:

- Decision on where user/session authority lives (`k_proxy` vs `k_server` vs split).
- Target concurrency level for validation (parallel clients and parallel requests per client).
- Preferred wireless transport/protocol between `k_proxy` and phone (for a future phase).

## Session Maintenance Notes (2026-04-24)

- Top-level Markdown review completed for `PHASE5_RUNBOOK.md`, `Setup.md`, and `Workplan.md`.
- The current execution plan remains in sync with the Phase 5 runbook:
  - prototype services at `/home/user/chromecard/k_proxy_app.py` and `/home/user/chromecard/k_server_app.py`
  - run sequence documented in `/home/user/chromecard/PHASE5_RUNBOOK.md`
- No phase ordering or blocker changes were required from this review pass.
- Remote execution support is now active and validated (see the sketch after this list):
  - `ssh` command execution works for `k_client`, `k_proxy`, `k_server`
  - `scp` push to VM home works (validated on `k_proxy`)
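
A minimal validation sketch for the `scp` push with read-back (the marker file name is illustrative):

```sh
# Push a marker file to k_proxy's home directory and read it back to confirm.
echo "scp-validation $(date -Is)" > /tmp/scp_check.txt
scp /tmp/scp_check.txt k_proxy:~
ssh k_proxy "cat ~/scp_check.txt"
```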