# Setup
Last updated: 2026-04-25
This is a living setup/status file for the local ChromeCard workspace at `/home/user/chromecard`.
Update this file whenever environment status or verified behavior changes.
## Repository Policy
- Treat `/home/user/chromecard/CR_SDK_CK-main` as read-only in this workflow.
- Do not add or modify helper/test scripts inside `CR_SDK_CK-main`.
- Keep host-side helper scripts at workspace root (`/home/user/chromecard`).
## Documentation Maintenance
- Canonical living status docs for this workspace are:
- `/home/user/chromecard/Setup.md`
- `/home/user/chromecard/Workplan.md`
- After each meaningful execution step, update at least:
- `Setup.md` for observed environment/runtime state
- `Workplan.md` for phase progress and next blocking action
- Keep helper script paths consistent in docs:
- `/home/user/chromecard/fido2_probe.py`
- `/home/user/chromecard/webauthn_local_demo.py`
- Treat `CR_SDK_CK-main/README_HOST.md` as historical reference unless its script paths are aligned with this workspace policy.
## Scope
- Experimental ChromeCard connected over USB.
- Firmware source tree: `/home/user/chromecard/CR_SDK_CK-main`.
- Host-side FIDO2 demo tools:
- `/home/user/chromecard/fido2_probe.py`
- `/home/user/chromecard/webauthn_local_demo.py`
- Target runtime platform: Qubes OS with 3 AppVMs:
- `k_client` (browser + enrollment process)
- `k_proxy` (card-connected proxy/auth client)
- `k_server` (protected resource/backend)
## Planned Transport Evolution
- Current phase assumption: card is connected directly to `k_proxy` (USB).
- Future target: card is connected to a phone, and `k_proxy` performs validation through a wireless link to that phone.
- Design implication: keep authenticator transport behind an abstraction in `k_proxy` so USB-direct and phone-wireless backends can be swapped without changing client/server API contracts.
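The abstraction described above can be sketched as a small Python interface; this is an illustration only (class and method names are assumptions, not existing workspace code), showing how a USB-direct backend and a future phone-wireless backend would plug into the same proxy logic:

```python
from abc import ABC, abstractmethod

class AuthenticatorTransport(ABC):
    """Abstract card transport so USB-direct and phone-wireless
    backends can be swapped without touching the client/server API."""

    @abstractmethod
    def get_info(self) -> dict:
        """Return the authenticator's CTAP2 getInfo-style payload."""

    @abstractmethod
    def authenticate(self, challenge: bytes) -> dict:
        """Run one card-backed authentication round for a challenge."""

class UsbHidTransport(AuthenticatorTransport):
    """Current phase: card attached to k_proxy over USB HID (stubbed)."""
    def get_info(self):
        raise NotImplementedError("would open /dev/hidraw* via CTAPHID")
    def authenticate(self, challenge):
        raise NotImplementedError

class FakeTransport(AuthenticatorTransport):
    """Test double: lets proxy logic be exercised without hardware."""
    def get_info(self):
        return {"versions": ["FIDO_2_0"], "options": {"up": True}}
    def authenticate(self, challenge):
        return {"ok": True, "challenge_len": len(challenge)}
```

A phone-wireless backend would become a third subclass; the proxy's session code only ever sees `AuthenticatorTransport`.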
## Target Qubes Topology
- Base template for all AppVMs: `debian-13-xfce`.
- Allowed network paths:
- `k_client` -> `k_proxy` over TLS
- `k_proxy` -> `k_server` over TLS
- Response traffic returns on those established connections.
- Disallowed direct path:
- `k_client` -> `k_server` (direct access should be blocked).
Functional roles:
- `k_client`:
- Browser-only traffic client.
- Runs a user enrollment process.
- `k_proxy`:
- Current: connected to the ChromeCard over USB.
- Future: connects wirelessly to phone-attached card for validation.
- Accepts TLS requests from `k_client`.
- Uses card-backed FIDO2/WebAuthn operations to authenticate user/session.
- Calls `k_server` over TLS after successful authorization.
- Returns proxied data and session information to `k_client`.
- `k_server`:
- Hosts resource(s) requiring login via the proxy-mediated flow.
- Provides a dummy protected resource for early integration testing (monotonic increasing number/counter).
- May hold user/session state logic needed for authorization decisions.
UI baseline for each AppVM (start-menu visible apps):
- Firefox
- XFCE Terminal
- File Manager
## Target Request Flow
1. `k_client` sends HTTPS request to `k_proxy`.
2. `k_proxy` validates/authenticates user via card-backed flow.
3. If allowed, `k_proxy` opens HTTPS request to `k_server` resource.
4. `k_server` responds to `k_proxy`.
5. `k_proxy` returns response payload to `k_client` plus session state.
6. Subsequent requests reuse session state so card auth is not required every request.
Implementation note:
- `k_proxy` does not need a full web server stack; a minimal TLS API service is sufficient.
- Session state should be integrity-protected (signed/encrypted token or server-side session ID) with TTL and revocation behavior defined.
- `k_proxy` and `k_server` must be safe under concurrent access (thread-safe state handling).
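One minimal shape for the integrity-protected session state above is an HMAC-signed token with an embedded expiry; a sketch under assumptions (field names, key handling, and the TTL default are illustrative, not the prototype's actual format):

```python
import base64
import hashlib
import hmac
import json
import time

def issue_token(secret: bytes, username: str, ttl_s: int = 600) -> str:
    """Sign {username, exp} so the proxy can validate later requests
    without re-running card auth until the TTL expires."""
    body = json.dumps({"u": username, "exp": time.time() + ttl_s}).encode()
    sig = hmac.new(secret, body, hashlib.sha256).digest()
    return "%s.%s" % (base64.urlsafe_b64encode(body).decode(),
                      base64.urlsafe_b64encode(sig).decode())

def verify_token(secret: bytes, token: str):
    """Return the username if signature and TTL check out, else None."""
    try:
        body_b64, sig_b64 = token.split(".")
        body = base64.urlsafe_b64decode(body_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except Exception:
        return None
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):  # reject tampering
        return None
    claims = json.loads(body)
    if time.time() > claims["exp"]:             # reject expired sessions
        return None
    return claims["u"]
```

Revocation still needs server-side state (a deny-list or a server-side session store instead of signed tokens); signing alone only covers integrity and TTL.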
## Minimum Service Behavior (Current Target)
- `k_server`:
- Expose protected endpoint returning an increasing integer value (dummy resource).
- Increment behavior must remain correct under concurrent requests.
- Optionally expose/maintain user/session validation logic.
- `k_proxy`:
- Accept concurrent HTTPS requests from one or more `k_client` instances.
- Perform card-backed auth when no valid session is present.
- Cache and validate session state so repeated requests avoid card access until expiry.
- Forward authorized requests to `k_server` and return upstream data plus session info.
Thread-safety expectation:
- Shared mutable state (counter, session store, user state) must be protected against races.
- Parallel requests must not corrupt session records or return duplicate/skipped counter values caused by unsafe updates.
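The counter expectation above can be made concrete with an explicit lock around the read-modify-write (a minimal illustration, not the actual `k_server_app.py` code):

```python
import threading

class SafeCounter:
    """Monotonic counter for the dummy protected resource; the lock
    guarantees no duplicate or skipped values under parallel requests."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:  # read-modify-write must be atomic
            self._value += 1
            return self._value

def hammer(counter, n_threads=8, per_thread=250):
    """Crude race check: many threads each draw several values."""
    seen = []
    seen_lock = threading.Lock()
    def worker():
        got = [counter.next() for _ in range(per_thread)]
        with seen_lock:
            seen.extend(got)
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

If the lock is removed, `hammer` will eventually surface duplicated values; the same pattern applies to the proxy's session store.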
## Test Topology Requirement
- Support concurrency testing from multiple simultaneous clients:
- multiple browser tabs/processes in one `k_client`, and/or
- multiple `k_client` AppVM instances if available.
- Validate both correctness and stability under load:
- session reuse works as intended
- unauthorized access stays blocked
- protected counter/resource remains consistent.
## Current Status Snapshot (2026-04-24)
- AppVM OS version is confirmed: Debian `13.4` on `k_server`, and the same on `k_client`/`k_proxy`.
- Python in AppVMs is available: `Python 3.13.5`.
- `python3 /home/user/chromecard/fido2_probe.py --list` in `k_proxy` now detects ChromeCard on `/dev/hidraw0` (`vid:pid=4617:5` in decimal, i.e. `0x1209:0x0005`).
- HID raw device nodes are now visible in `k_proxy`:
- `/dev/hidraw0` -> `crw-rw----+`
- `/dev/hidraw1` -> `crw-------`
- `python3 /home/user/chromecard/fido2_probe.py --json` succeeds and returns CTAP2 `getInfo`:
- versions: `["FIDO_2_0"]`
- aaguid: `1234567890abcdef0123456789abcdef`
- options: `rk=false`, `up=true`, `uv=true`
- max_msg_size: `1024`
- Local WebAuthn demo (`http://localhost:8765` in `k_proxy`) succeeded:
- register: `ok=true`, `username=alice`, `credential_count=1`
- login/auth: `ok=true`, `username=alice`, `authenticated=true`
- Phase 5 prototype services are now available:
- `/home/user/chromecard/k_proxy_app.py`
- `/home/user/chromecard/k_server_app.py`
- `/home/user/chromecard/PHASE5_RUNBOOK.md`
- Remote VM access is now available via SSH/SCP aliases:
- command execution: `ssh <host> <cmd>`
- file copy to VM home: `scp <file> <host>:~`
- validated hosts: `k_client`, `k_proxy`, `k_server`
- `west` is not currently installed/in PATH: `west not found`.
- The checked-out `CR_SDK_CK-main` tree appears incomplete relative to the documented sysbuild role layout:
- missing: `mvp`, `setup`, `components`, `samples`
- `CR_SDK_CK-main/scripts/build_flash_mvp.sh` exists, but it expects the above role directories.
- Python helper scripts were intentionally moved out of `CR_SDK_CK-main/scripts` and are now maintained at workspace root.
- Qubes AppVM baseline is now up: `k_client`, `k_proxy`, `k_server` can start and have terminals running.
Implication:
- Live FIDO2 connectivity from `k_proxy` to ChromeCard is confirmed over USB HID/CTAPHID.
- Local browser WebAuthn register/login flow is confirmed working in `k_proxy`.
- We cannot currently run the documented firmware build/flash flow.
Session note (2026-04-24):
- Markdown tracking was reviewed and normalized around `Setup.md` + `Workplan.md` as the active, continuously updated execution record.
- AppVM template decision recorded: use `debian-13-xfce` for `k_client`, `k_proxy`, and `k_server`.
- VM start attempt failed with Xen toolstack error: `libxenlight have failed to create new domain 'k_client'`.
- VM start blocker was resolved by reducing VM memory to `400` MiB; all three AppVMs now start.
- Runtime check from VMs: Debian `13.4` and Python `3.13.5`; `k_proxy` still shows `no hidraw devices`.
- After USB assignment to `k_proxy`, `/dev/hidraw0` and `/dev/hidraw1` appeared.
- CTAP probe re-run succeeded with detected ChromeCard device and valid CTAP2 `getInfo` response.
- Local WebAuthn demo completed successfully for user `alice` (register + login).
- Phase 5 starter implementation added with session TTL, logout/invalidation, and proxy->server protected counter forwarding.
Session note (2026-04-24, doc maintenance):
- Top-level Markdown files were re-scanned: `PHASE5_RUNBOOK.md`, `Setup.md`, `Workplan.md`.
- `PHASE5_RUNBOOK.md` remains consistent with the current Phase 5 prototype paths and flow.
- No plan/setup drift was found requiring behavioral changes; docs remain aligned.
- SSH-based VM operation was validated for `k_client`, `k_proxy`, `k_server` (Debian `13.4` confirmed remotely).
- SCP file transfer to `k_proxy` home directory was validated with read-back.
Session note (2026-04-24, remote flow diagnostics):
- VM script staging gap found: `/home/user/chromecard/k_proxy_app.py`, `k_server_app.py`, and helper files were missing on AppVMs and were copied via `scp`.
- Services were started in VMs and verified locally:
- `k_proxy` local health OK on `127.0.0.1:8770` and `127.0.0.1:8771`
- `k_server` local health OK on `127.0.0.1:8780`
- Verified VM IPs during this run:
- `k_proxy`: `10.137.0.12`
- `k_server`: `10.137.0.13`
- `k_client`: `10.137.0.16`
- Current chain failure is network pathing/firewall:
- `k_client -> k_proxy` (`10.137.0.12:8771`) times out.
- `k_proxy -> k_server` (`10.137.0.13:8780`) times out.
- Proxy returns upstream error payload: `server unavailable: timed out`.
Session note (2026-04-24, markdown re-scan):
- Re-read top-level workspace Markdown files: `Setup.md`, `Workplan.md`, `PHASE5_RUNBOOK.md`.
- Re-skimmed source-tree reference docs in `CR_SDK_CK-main`, including `BUILD.md`, `README.md`, `README_HOST.md`, `RELEASE.md`, and `distribute_bundle.md`.
- Current workspace docs remain aligned with the verified execution record.
- Source-tree doc drift remains unchanged:
- `README_HOST.md` still points to `./scripts/fido2_probe.py` and `./scripts/webauthn_local_demo.py`.
- Active workspace policy continues to treat those paths as historical; maintained helper paths remain `/home/user/chromecard/fido2_probe.py` and `/home/user/chromecard/webauthn_local_demo.py`.
- Source-tree build docs continue to describe a full SDK layout with `mvp`, `setup`, `components`, and `samples`, which is still not present in the current local checkout snapshot.
Session note (2026-04-24, policy retry):
- Markdown re-scan was retried after local policy changes.
- Re-running the workspace doc scan with a non-login shell completed cleanly, without the earlier SSH/socat startup noise in command output.
Session note (2026-04-24, chain probe retry):
- Re-probed the Qubes access path for `k_client -> k_proxy -> k_server`.
- Local forwarded SSH listener ports still exist on the host:
- `0.0.0.0:2222` -> `qrexec-client-vm 'k_client' qubes.ConnectTCP+22`
- `0.0.0.0:2223` -> `qrexec-client-vm 'k_proxy' qubes.ConnectTCP+22`
- `0.0.0.0:2224` -> `qrexec-client-vm 'k_server' qubes.ConnectTCP+22`
- These forwarded SSH ports currently fail immediately:
- `ssh k_client` / `ssh k_proxy` / `ssh k_server` close immediately on localhost forwarded ports.
- Direct `qrexec-client-vm <target> qubes.ConnectTCP+22` returns `Request refused`.
- Chain ports are currently blocked at the same qrexec layer:
- `qrexec-client-vm k_proxy qubes.ConnectTCP+8770` -> `Request refused`
- `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- This means the current blocker is active qrexec policy/service refusal for `qubes.ConnectTCP`, not the Python service code in `k_proxy_app.py` or `k_server_app.py`.
- Separate SSH config issue remains on the host:
  - `/etc/ssh/ssh_config.d/20-systemd-ssh-proxy.conf` is still owned `root:root` but has world-writable mode `777`, which causes OpenSSH to reject it as insecure on the normal login-shell path.
Session note (2026-04-25, post-restart probe):
- Correct client-facing proxy port is `8771` for the current split-VM chain checks.
- SSH to `k_proxy` is working again.
- `k_proxy` card visibility is restored after VM restart and card reconnect:
- `/dev/hidraw0` and `/dev/hidraw1` are present in `k_proxy`
- Current service state after restart:
- `k_proxy` has no listener on `127.0.0.1:8771`
- `k_server` has no listener on `127.0.0.1:8780`
- Current qrexec chain state after restart:
- `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
- `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- Practical meaning:
- SSH and card attachment recovered
- phase-5 app services are not currently running in the VMs
- qrexec forwarding for the chain ports is still being refused
Session note (2026-04-25, service restart):
- `k_server_app.py` was restarted successfully in `k_server`:
- PID `1320`
- listening on `127.0.0.1:8780`
- `/health` returns `{"ok": true, "service": "k_server", ...}`
- `k_proxy_app.py` was restarted successfully in `k_proxy`:
- PID `2774`
- listening on `127.0.0.1:8771`
- `/health` returns `{"ok": true, "service": "k_proxy", "active_sessions": 0, ...}`
- Despite local service recovery, qrexec forwarding is still denied:
- `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
- `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
Session note (2026-04-25, markdown refresh):
- Re-read the active workspace markdown files:
- `Setup.md`
- `Workplan.md`
- `PHASE5_RUNBOOK.md`
- Corrected the Phase 5 runbook to distinguish the old same-VM quickstart from the current split-VM chain usage.
- Current documented client-facing proxy port for split-VM tests is `8771`.
- Current documented blocker remains unchanged:
- local service health inside `k_proxy` and `k_server` is good
- inter-VM forwarding via `qubes.ConnectTCP` is still refused
Session note (2026-04-25, Phase 2 HTTPS bring-up):
- Added direct TLS support to:
- `/home/user/chromecard/k_proxy_app.py`
- `/home/user/chromecard/k_server_app.py`
- Added local certificate generator:
- `/home/user/chromecard/generate_phase2_certs.py`
- Generated local CA and service certs at:
- `/home/user/chromecard/tls/phase2/ca.crt`
- `/home/user/chromecard/tls/phase2/k_proxy.crt`
- `/home/user/chromecard/tls/phase2/k_server.crt`
- Certificate generation was corrected to include subject key identifier and authority key identifier so Python TLS verification succeeds.
- Current validated HTTPS shape is Qubes-localhost forwarding, not raw VM-IP routing:
- in `k_client`: `qvm-connect-tcp 9771:k_proxy:8771`
- in `k_proxy`: `qvm-connect-tcp 9780:k_server:8780`
- `k_proxy` listens on `https://127.0.0.1:8771`
- `k_server` listens on `https://127.0.0.1:8780`
- `k_proxy` upstream is `https://127.0.0.1:9780`
- Verified HTTPS checks:
- `k_client -> k_proxy` `/health` over TLS succeeds with `--cacert /home/user/chromecard/tls/phase2/ca.crt`
- `k_proxy -> k_server` `/health` and `/resource/counter` over TLS succeed through the `9780` forwarder
- end-to-end `k_client -> k_proxy -> k_server` login + session reuse succeeded over HTTPS
- End-to-end verified results:
- login returned `ok=true` for `alice`
- first protected counter call returned value `1`
- second protected counter call returned value `2`
- session status remained valid after reuse
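The verified `--cacert` checks correspond to a small client-side TLS setup in Python; a sketch assuming the phase2 CA path from above (the function names are illustrative):

```python
import ssl
import urllib.request

def build_client_context(cafile=None):
    """Trust only the provided CA bundle (e.g. the phase2 CA at
    /home/user/chromecard/tls/phase2/ca.crt) when given; otherwise
    fall back to the system trust store."""
    ctx = ssl.create_default_context(cafile=cafile)
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx

def fetch_health(base_url, cafile=None):
    """GET /health over TLS, verifying the server cert against cafile."""
    ctx = build_client_context(cafile)
    with urllib.request.urlopen(base_url + "/health",
                                context=ctx, timeout=5) as resp:
        return resp.read().decode()
```

For example, `fetch_health("https://127.0.0.1:8771", "/home/user/chromecard/tls/phase2/ca.crt")` would mirror the verified `k_client -> k_proxy` health check, provided the service certificate covers `127.0.0.1` in its subject/SAN.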
Session note (2026-04-25, in-VM forwarding test):
- Tested the intended in-VM forwarding path with `qvm-connect-tcp` instead of host-side `qrexec-client-vm`.
- Forwarders start and bind locally:
- in `k_client`: `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771`
- in `k_proxy`: `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780`
- But the actual client->proxy connection is still refused when used:
- `k_client` forward log shows `Request refused`
- `socat` reports child exit status `126` and `Connection reset by peer`
- Local login on `k_proxy` reaches the app but fails on the auth dependency:
- `POST /session/login` to `http://127.0.0.1:8771` returns `401`
- details: `Missing dependency: python-fido2 ... No module named 'fido2'`
- `k_server` was not reached during this login test; current `k_server.log` only shows `/health`.
Session note (2026-04-25, after python3-fido2 install):
- `k_proxy` was restarted after `python3-fido2` installation and now listens again on `127.0.0.1:8771`.
- The previous Python import blocker is resolved; local login now reaches the CTAP probe path.
- Current local login result on `k_proxy`:
- `{"ok": false, "error": "card auth failed", "details": "No CTAP HID devices found."}`
- Current forwarded login result from `k_client` is still not completing:
- `curl http://127.0.0.1:8771/session/login` -> `Empty reply from server`
- `qvm_connect_8771.log` still shows repeated `Request refused` and child exit status `126`
- Practical meaning:
- Python dependency issue in `k_proxy` is fixed
- card access inside `k_proxy` is currently missing again at CTAP/HID level
- `k_client -> k_proxy` qrexec forwarding is still effectively denied/refused
Session note (2026-04-25, card reattached):
- Card visibility in `k_proxy` is restored again:
- `/dev/hidraw0` and `/dev/hidraw1` present
- `fido2_probe.py --list` detects ChromeCard on `/dev/hidraw0`
- Local login on `k_proxy` now succeeds again:
- `POST /session/login` on `127.0.0.1:8771` returns `200`
- session creation for user `alice` succeeded
- Remaining failure is isolated to the client-facing qrexec path:
- `k_client` -> `localhost:8771` through `qvm-connect-tcp` still returns `Empty reply from server`
- `qvm_connect_8771.log` still shows `Request refused`
Session note (2026-04-25, clean forward retest):
- Re-ran both forwards and exercised each hop immediately after local bind.
- `k_proxy -> k_server`:
- `qvm-connect-tcp 8780:k_server:8780` binds `localhost:8780` in `k_proxy`
- first real `POST /resource/counter` through that forward returns `Empty reply from server`
- `qvm_connect_8780.log` then records `Request refused` with child exit status `126`
- `k_client -> k_proxy`:
- `qvm-connect-tcp 8771:k_proxy:8771` binds `localhost:8771` in `k_client`
- first real `POST /session/login` through that forward returns `Empty reply from server`
- `qvm_connect_8771.log` records `Request refused` with child exit status `126`
- Conclusion from this retest:
- both forwards fail in the same way
- local bind succeeds, but the actual qrexec `qubes.ConnectTCP` request is refused when the first connection is attempted
Session note (2026-04-25, dom0 policy fix validated):
- After changing dom0 policy to use explicit destination VMs instead of `@default` for `qubes.ConnectTCP`, both forwards now work.
- Verified hop 1:
- in `k_proxy`, `POST http://127.0.0.1:8780/resource/counter` with `X-Proxy-Token: dev-proxy-token` succeeds
- response included counter value `1`
- Verified hop 2:
- in `k_client`, `POST http://127.0.0.1:8771/session/login` succeeds
- session token is returned through the `k_client -> k_proxy` forward
- Verified full end-to-end flow from `k_client`:
- login succeeded and returned session token
- `POST /session/status` succeeded
- `POST /resource/counter` succeeded twice with upstream values `2` and `3`
- `POST /session/logout` succeeded
- post-logout `POST /resource/counter` correctly returned `401 invalid or expired session`
- Current conclusion:
- `k_client -> k_proxy -> k_server` chain is operational
- session reuse and logout behavior are working in the current prototype
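The verified login -> counter -> logout sequence can be sketched as a small client-side wrapper; the HTTP layer is injected so the sketch stays network- and hardware-free, and field names such as `session_token` are assumptions about the prototype's JSON shape, not confirmed from the code:

```python
class ProxySession:
    """Client-side session handling for the k_client -> k_proxy flow:
    card auth happens once at login, then the cached token is reused."""

    def __init__(self, post):
        # post(path, payload, token) -> dict; injected HTTP transport
        self._post = post
        self._token = None

    def login(self, username):
        resp = self._post("/session/login", {"username": username}, None)
        if not resp.get("ok"):
            raise RuntimeError("login failed: %r" % resp)
        self._token = resp["session_token"]  # assumed field name
        return resp

    def counter(self):
        # Reuses the session token; no card access on repeat requests.
        return self._post("/resource/counter", {}, self._token)

    def logout(self):
        resp = self._post("/session/logout", {}, self._token)
        self._token = None
        return resp
```

Injecting `post` also keeps this usable for concurrency tests: multiple `ProxySession` instances can share one HTTP transport while holding distinct tokens.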
## Known FIDO2 Transport Boundary
- FIDO2 on this firmware is handled via USB HID (CTAPHID), not Wi-Fi/BLE/MQTT.
- Key code points in `CR_SDK_CK-main`:
- `mgr_fido2.c`: `mgr_fido2_init()` registers `fido2_ctaphid_handle_packet`.
- `ctaphid.c`: `fido2_ctaphid_handle_packet(...)`.
- `cr_config.h`: FIDO2 HID report descriptor definitions.
## Host Bring-Up Steps (How To Get To A Working FIDO2 Check)
1. Confirm USB enumeration and HID visibility.
- Replug card with a known data-capable cable.
- Check: `ls -l /dev/hidraw*`
2. If needed, grant Linux HID access for this device.
- Add rule at `/etc/udev/rules.d/70-chromecard-fido.rules`:
```udev
SUBSYSTEM=="hidraw", ATTRS{idVendor}=="1209", ATTRS{idProduct}=="0005", MODE="0660", TAG+="uaccess"
```
- Reload/apply rules and replug the device.
3. Verify CTAP HID presence.
- `python3 /home/user/chromecard/fido2_probe.py --list`
- Then:
- `python3 /home/user/chromecard/fido2_probe.py --json`
4. Run local WebAuthn bring-up demo.
- `python3 /home/user/chromecard/webauthn_local_demo.py`
- Open `http://localhost:8765` (use `localhost`, not `127.0.0.1`).
5. Execute register/login test.
- Register a user.
- Login with the same user.
- Confirm no origin/challenge mismatch errors.
## Build/Flash Prerequisites (How To Get To Firmware Build)
1. Ensure full SDK checkout layout exists under `CR_SDK_CK-main`:
- `mvp`
- `setup`
- `components`
- `samples`
2. Ensure toolchain is available in shell:
- `west --version`
- `nrfjprog --version`
3. Once layout/tooling are in place, run:
- `cd /home/user/chromecard/CR_SDK_CK-main`
- `./scripts/build_flash_mvp.sh`
## Open Gaps To Resolve
- Whether a full `CR_SDK_CK-main` checkout (with role directories) is available locally.
- Whether server-side code should be pulled now for broader CIP/WebAuthn integration testing.
- Exact Qubes firewall and service binding rules to enforce the `k_client -> k_proxy -> k_server` chain.
- Exact enrollment process interface running in `k_client` and how it reaches `k_proxy`.
- Upgrade Phase 5 auth gate from card-presence probe to full WebAuthn assertion verification for session creation.
- Precise ownership split of session/user state between `k_proxy` and `k_server`.
- Concrete concurrency limits and acceptance criteria (requests/sec, parallel clients, latency/error thresholds).
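For the open qrexec binding-rule gap above, the dom0 fix that unblocked the chain (explicit destinations instead of `@default`) suggests a policy shape like the following. This is a hedged sketch assuming the Qubes 4.x policy file format; the filename and exact lines are illustrative, not copied from the working dom0 configuration:

```
# /etc/qubes/policy.d/30-chromecard-chain.policy (illustrative)
# Allow only the two chain hops, pinned to explicit destinations.
qubes.ConnectTCP +8771  k_client  k_proxy   allow
qubes.ConnectTCP +8780  k_proxy   k_server  allow
# Everything else stays denied, including k_client -> k_server.
qubes.ConnectTCP *      @anyvm    @anyvm    deny
```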