Setup

Last updated: 2026-04-25

This is a living setup/status file for the local ChromeCard workspace at /home/user/chromecard. Update this file whenever environment status or verified behavior changes.

Repository Policy

  • Treat /home/user/chromecard/CR_SDK_CK-main as read-only in this workflow.
  • Do not add or modify helper/test scripts inside CR_SDK_CK-main.
  • Keep host-side helper scripts at workspace root (/home/user/chromecard).

Documentation Maintenance

  • Canonical living status docs for this workspace are:
    • /home/user/chromecard/Setup.md
    • /home/user/chromecard/Workplan.md
  • After each meaningful execution step, update at least:
    • Setup.md for observed environment/runtime state
    • Workplan.md for phase progress and next blocking action
  • Keep helper script paths consistent in docs:
    • /home/user/chromecard/fido2_probe.py
    • /home/user/chromecard/webauthn_local_demo.py
  • Treat CR_SDK_CK-main/README_HOST.md as historical reference unless its script paths are aligned with this workspace policy.

Scope

  • Experimental ChromeCard connected over USB.
  • Firmware source tree: /home/user/chromecard/CR_SDK_CK-main.
  • Host-side FIDO2 demo tools:
    • /home/user/chromecard/fido2_probe.py
    • /home/user/chromecard/webauthn_local_demo.py
  • Target runtime platform: Qubes OS with 3 AppVMs:
    • k_client (browser + enrollment process)
    • k_proxy (card-connected proxy/auth client)
    • k_server (protected resource/backend)

Planned Transport Evolution

  • Current phase assumption: card is connected directly to k_proxy (USB).
  • Future target: card is connected to a phone, and k_proxy performs validation through a wireless link to that phone.
  • Design implication: keep authenticator transport behind an abstraction in k_proxy so USB-direct and phone-wireless backends can be swapped without changing client/server API contracts.
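
A minimal sketch of that abstraction seam (class and method names here are illustrative, not the current k_proxy_app.py API; the real backends would wrap python-fido2 and the future phone link):

```python
from abc import ABC, abstractmethod

class AuthenticatorTransport(ABC):
    """Boundary k_proxy talks through: CTAP request bytes in, response bytes out."""
    @abstractmethod
    def exchange(self, request: bytes) -> bytes:
        ...

class UsbHidTransport(AuthenticatorTransport):
    """Current phase: card attached to k_proxy over USB HID (e.g. /dev/hidraw0)."""
    def __init__(self, hidraw_path: str = "/dev/hidraw0"):
        self.hidraw_path = hidraw_path
    def exchange(self, request: bytes) -> bytes:
        raise NotImplementedError("wrap the python-fido2 HID device here")

class PhoneWirelessTransport(AuthenticatorTransport):
    """Future phase: card attached to a phone, reached over a wireless link."""
    def __init__(self, phone_endpoint: str):
        self.phone_endpoint = phone_endpoint
    def exchange(self, request: bytes) -> bytes:
        raise NotImplementedError("relay the CTAP exchange to the phone")

class LoopbackTransport(AuthenticatorTransport):
    """Test double: echoes the request so the seam can be exercised offline."""
    def exchange(self, request: bytes) -> bytes:
        return request

def card_exchange(transport: AuthenticatorTransport, request: bytes) -> bytes:
    # k_proxy auth code depends only on the interface, so USB-direct and
    # phone-wireless backends swap without changing client/server contracts.
    return transport.exchange(request)
```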

Target Qubes Topology

  • Base template for all AppVMs: debian-13-xfce.
  • Allowed network paths:
    • k_client -> k_proxy over TLS
    • k_proxy -> k_server over TLS
    • Response traffic returns on those established connections.
  • Disallowed direct path:
    • k_client -> k_server (direct access should be blocked).

Functional roles:

  • k_client:
    • Browser-only traffic client.
    • Runs a user enrollment process.
  • k_proxy:
    • Current: connected to the ChromeCard over USB.
    • Future: connects wirelessly to phone-attached card for validation.
    • Accepts TLS requests from k_client.
    • Uses card-backed FIDO2/WebAuthn operations to authenticate user/session.
    • Calls k_server over TLS after successful authorization.
    • Returns proxied data and session information to k_client.
  • k_server:
    • Hosts resource(s) requiring login via the proxy-mediated flow.
    • Provides a dummy protected resource for early integration testing (a monotonically increasing counter).
    • May hold user/session state logic needed for authorization decisions.

UI baseline for each AppVM (start-menu visible apps):

  • Firefox
  • XFCE Terminal
  • File Manager

Target Request Flow

  1. k_client sends HTTPS request to k_proxy.
  2. k_proxy validates/authenticates user via card-backed flow.
  3. If allowed, k_proxy opens HTTPS request to k_server resource.
  4. k_server responds to k_proxy.
  5. k_proxy returns response payload to k_client plus session state.
  6. Subsequent requests reuse session state so card auth is not required every request.

Implementation note:

  • k_proxy does not need a full web server stack; a minimal TLS API service is sufficient.
  • Session state should be integrity-protected (signed/encrypted token or server-side session ID) with TTL and revocation behavior defined.
  • k_proxy and k_server must be safe under concurrent access (thread-safe state handling).

Minimum Service Behavior (Current Target)

  • k_server:
    • Expose protected endpoint returning an increasing integer value (dummy resource).
    • Increment behavior must remain correct under concurrent requests.
    • Optionally expose/maintain user/session validation logic.
  • k_proxy:
    • Accept concurrent HTTPS requests from one or more k_client instances.
    • Perform card-backed auth when no valid session is present.
    • Cache and validate session state so repeated requests avoid card access until expiry.
    • Forward authorized requests to k_server and return upstream data plus session info.

Thread-safety expectation:

  • Shared mutable state (counter, session store, user state) must be protected against races.
  • Parallel requests must not corrupt session records or return duplicate/skipped counter values caused by unsafe updates.
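
A minimal illustration of the counter expectation (not the actual k_server_app.py code):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class MonotonicCounter:
    """Counter safe under concurrent handler threads (ThreadingHTTPServer-style)."""
    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def next(self) -> int:
        # The read-modify-write must be atomic; without the lock, parallel
        # requests can observe duplicate or skipped values.
        with self._lock:
            self._value += 1
            return self._value

counter = MonotonicCounter()
with ThreadPoolExecutor(max_workers=8) as pool:
    values = list(pool.map(lambda _: counter.next(), range(100)))
assert sorted(values) == list(range(1, 101))  # unique and gap-free
```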

Test Topology Requirement

  • Support concurrency testing from multiple simultaneous clients:
    • multiple browser tabs/processes in one k_client, and/or
    • multiple k_client AppVM instances if available.
  • Validate both correctness and stability under load:
    • session reuse works as intended
    • unauthorized access stays blocked
    • protected counter/resource remains consistent.

Current Status Snapshot (2026-04-24)

  • AppVM OS version is confirmed as Debian 13.4 on k_server, k_client, and k_proxy.
  • Python in AppVMs is available: Python 3.13.5.
  • python3 /home/user/chromecard/fido2_probe.py --list in k_proxy now detects ChromeCard on /dev/hidraw0 (vid:pid=4617:5).
  • HID raw device nodes are now visible in k_proxy:
    • /dev/hidraw0 -> crw-rw----+
    • /dev/hidraw1 -> crw-------
  • python3 /home/user/chromecard/fido2_probe.py --json succeeds and returns CTAP2 getInfo:
    • versions: ["FIDO_2_0"]
    • aaguid: 1234567890abcdef0123456789abcdef
    • options: rk=false, up=true, uv=true
    • max_msg_size: 1024
  • Local WebAuthn demo (http://localhost:8765 in k_proxy) succeeded:
    • register: ok=true, username=alice, credential_count=1
    • login/auth: ok=true, username=alice, authenticated=true
  • Phase 5 prototype services are now available:
    • /home/user/chromecard/k_proxy_app.py
    • /home/user/chromecard/k_server_app.py
    • /home/user/chromecard/PHASE5_RUNBOOK.md
  • Remote VM access is now available via SSH/SCP aliases:
    • command execution: ssh <host> <cmd>
    • file copy to VM home: scp <file> <host>:~
    • validated hosts: k_client, k_proxy, k_server
  • west is not currently installed/in PATH: west not found.
  • The checked-out CR_SDK_CK-main tree appears incomplete relative to the documented sysbuild role layout:
    • missing: mvp, setup, components, samples
  • CR_SDK_CK-main/scripts/build_flash_mvp.sh exists, but it expects the above role directories.
  • Python helper scripts were intentionally moved out of CR_SDK_CK-main/scripts and are now maintained at workspace root.
  • Qubes AppVM baseline is now up: k_client, k_proxy, k_server can start and have terminals running.

Implication:

  • Live FIDO2 connectivity from k_proxy to ChromeCard is confirmed over USB HID/CTAPHID.
  • Local browser WebAuthn register/login flow is confirmed working in k_proxy.
  • We cannot currently run the documented firmware build/flash flow.

Session note (2026-04-24):

  • Markdown tracking was reviewed and normalized around Setup.md + Workplan.md as the active, continuously updated execution record.
  • AppVM template decision recorded: use debian-13-xfce for k_client, k_proxy, and k_server.
  • VM start attempt failed with Xen toolstack error: libxenlight have failed to create new domain 'k_client'.
  • VM start blocker was resolved by reducing VM memory to 400 MiB; all three AppVMs now start.
  • Runtime check from VMs: Debian 13.4 and Python 3.13.5; k_proxy still shows no hidraw devices.
  • After USB assignment to k_proxy, /dev/hidraw0 and /dev/hidraw1 appeared.
  • CTAP probe re-run succeeded with detected ChromeCard device and valid CTAP2 getInfo response.
  • Local WebAuthn demo completed successfully for user alice (register + login).
  • Phase 5 starter implementation added with session TTL, logout/invalidation, and proxy->server protected counter forwarding.

Session note (2026-04-24, doc maintenance):

  • Top-level Markdown files were re-scanned: PHASE5_RUNBOOK.md, Setup.md, Workplan.md.
  • PHASE5_RUNBOOK.md remains consistent with the current Phase 5 prototype paths and flow.
  • No plan/setup drift was found requiring behavioral changes; docs remain aligned.
  • SSH-based VM operation was validated for k_client, k_proxy, k_server (Debian 13.4 confirmed remotely).
  • SCP file transfer to k_proxy home directory was validated with read-back.

Session note (2026-04-24, remote flow diagnostics):

  • VM script staging gap found: /home/user/chromecard/k_proxy_app.py, k_server_app.py, and helper files were missing on AppVMs and were copied via scp.
  • Services were started in VMs and verified locally:
    • k_proxy local health OK on 127.0.0.1:8770 and 127.0.0.1:8771
    • k_server local health OK on 127.0.0.1:8780
  • Verified VM IPs during this run:
    • k_proxy: 10.137.0.12
    • k_server: 10.137.0.13
    • k_client: 10.137.0.16
  • Current chain failure is network pathing/firewall:
    • k_client -> k_proxy (10.137.0.12:8771) times out.
    • k_proxy -> k_server (10.137.0.13:8780) times out.
    • Proxy returns upstream error payload: server unavailable: timed out.

Session note (2026-04-24, markdown re-scan):

  • Re-read top-level workspace Markdown files: Setup.md, Workplan.md, PHASE5_RUNBOOK.md.
  • Re-skimmed source-tree reference docs in CR_SDK_CK-main, including BUILD.md, README.md, README_HOST.md, RELEASE.md, and distribute_bundle.md.
  • Current workspace docs remain aligned with the verified execution record.
  • Source-tree doc drift remains unchanged:
    • README_HOST.md still points to ./scripts/fido2_probe.py and ./scripts/webauthn_local_demo.py.
    • Active workspace policy continues to treat those paths as historical; maintained helper paths remain /home/user/chromecard/fido2_probe.py and /home/user/chromecard/webauthn_local_demo.py.
  • Source-tree build docs continue to describe a full SDK layout with mvp, setup, components, and samples, which is still not present in the current local checkout snapshot.

Session note (2026-04-24, policy retry):

  • Markdown re-scan was retried after local policy changes.
  • Re-running the workspace doc scan with a non-login shell completed cleanly, without the earlier SSH/socat startup noise in command output.

Session note (2026-04-24, chain probe retry):

  • Re-probed the Qubes access path for k_client -> k_proxy -> k_server.
  • Local forwarded SSH listener ports still exist on the host:
    • 0.0.0.0:2222 -> qrexec-client-vm 'k_client' qubes.ConnectTCP+22
    • 0.0.0.0:2223 -> qrexec-client-vm 'k_proxy' qubes.ConnectTCP+22
    • 0.0.0.0:2224 -> qrexec-client-vm 'k_server' qubes.ConnectTCP+22
  • These forwarded SSH ports currently fail immediately:
    • ssh k_client / ssh k_proxy / ssh k_server close immediately on localhost forwarded ports.
    • Direct qrexec-client-vm <target> qubes.ConnectTCP+22 returns Request refused.
  • Chain ports are currently blocked at the same qrexec layer:
    • qrexec-client-vm k_proxy qubes.ConnectTCP+8770 -> Request refused
    • qrexec-client-vm k_server qubes.ConnectTCP+8780 -> Request refused
  • This means the current blocker is active qrexec policy/service refusal for qubes.ConnectTCP, not the Python service code in k_proxy_app.py or k_server_app.py.
  • Separate SSH config issue remains on the host:
    • /etc/ssh/ssh_config.d/20-systemd-ssh-proxy.conf is still owned root:root but mode 777, which causes OpenSSH to reject it as insecure on the normal login-shell path.

Session note (2026-04-25, post-restart probe):

  • Correct client-facing proxy port is 8771 for the current split-VM chain checks.
  • SSH to k_proxy is working again.
  • k_proxy card visibility is restored after VM restart and card reconnect:
    • /dev/hidraw0 and /dev/hidraw1 are present in k_proxy
  • Current service state after restart:
    • k_proxy has no listener on 127.0.0.1:8771
    • k_server has no listener on 127.0.0.1:8780
  • Current qrexec chain state after restart:
    • qrexec-client-vm k_proxy qubes.ConnectTCP+8771 -> Request refused
    • qrexec-client-vm k_server qubes.ConnectTCP+8780 -> Request refused
  • Practical meaning:
    • SSH and card attachment recovered
    • phase-5 app services are not currently running in the VMs
    • qrexec forwarding for the chain ports is still being refused

Session note (2026-04-25, service restart):

  • k_server_app.py was restarted successfully in k_server:
    • PID 1320
    • listening on 127.0.0.1:8780
    • /health returns {"ok": true, "service": "k_server", ...}
  • k_proxy_app.py was restarted successfully in k_proxy:
    • PID 2774
    • listening on 127.0.0.1:8771
    • /health returns {"ok": true, "service": "k_proxy", "active_sessions": 0, ...}
  • Despite local service recovery, qrexec forwarding is still denied:
    • qrexec-client-vm k_proxy qubes.ConnectTCP+8771 -> Request refused
    • qrexec-client-vm k_server qubes.ConnectTCP+8780 -> Request refused

Session note (2026-04-25, markdown refresh):

  • Re-read the active workspace markdown files:
    • Setup.md
    • Workplan.md
    • PHASE5_RUNBOOK.md
  • Corrected the Phase 5 runbook to distinguish the old same-VM quickstart from the current split-VM chain usage.
  • Current documented client-facing proxy port for split-VM tests is 8771.
  • Current documented blocker remains unchanged:
    • local service health inside k_proxy and k_server is good
    • inter-VM forwarding via qubes.ConnectTCP is still refused

Session note (2026-04-25, Phase 2 HTTPS bring-up):

  • Added direct TLS support to:
    • /home/user/chromecard/k_proxy_app.py
    • /home/user/chromecard/k_server_app.py
  • Added local certificate generator:
    • /home/user/chromecard/generate_phase2_certs.py
  • Generated local CA and service certs at:
    • /home/user/chromecard/tls/phase2/ca.crt
    • /home/user/chromecard/tls/phase2/k_proxy.crt
    • /home/user/chromecard/tls/phase2/k_server.crt
  • Certificate generation was corrected to include subject key identifier and authority key identifier so Python TLS verification succeeds.
  • Current validated HTTPS shape is Qubes-localhost forwarding, not raw VM-IP routing:
    • in k_client: qvm-connect-tcp 9771:k_proxy:8771
    • in k_proxy: qvm-connect-tcp 9780:k_server:8780
    • k_proxy listens on https://127.0.0.1:8771
    • k_server listens on https://127.0.0.1:8780
    • k_proxy upstream is https://127.0.0.1:9780
  • Verified HTTPS checks:
    • k_client -> k_proxy /health over TLS succeeds with --cacert /home/user/chromecard/tls/phase2/ca.crt
    • k_proxy -> k_server /health and /resource/counter over TLS succeed through the 9780 forwarder
    • end-to-end k_client -> k_proxy -> k_server login + session reuse succeeded over HTTPS
  • End-to-end verified results:
    • login returned ok=true for alice
    • first protected counter call returned value 1
    • second protected counter call returned value 2
    • session status remained valid after reuse
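
The curl --cacert checks above can be mirrored from Python; a sketch, assuming the Phase 2 CA path recorded above (function names are illustrative):

```python
import json
import ssl
import urllib.request

def make_client_context(ca_file: str) -> ssl.SSLContext:
    """TLS client context that trusts only the local Phase 2 CA, not the system store."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # implies CERT_REQUIRED + hostname checks
    ctx.load_verify_locations(cafile=ca_file)
    return ctx

def get_json(url: str,
             ca_file: str = "/home/user/chromecard/tls/phase2/ca.crt") -> dict:
    """GET a JSON endpoint over a verified forward, e.g. https://127.0.0.1:9771/health."""
    with urllib.request.urlopen(url, context=make_client_context(ca_file),
                                timeout=5) as resp:
        return json.loads(resp.read())
```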

Session note (2026-04-25, Phase 2.5 ownership and concurrency):

  • Current prototype state ownership is now explicit:
    • k_proxy is authoritative for session state
    • k_server is authoritative for protected resource state
    • k_client is not authoritative for either session validity or counter/resource state
  • Current session model in k_proxy:
    • server-side in-memory session store only
    • opaque bearer token generated by secrets.token_urlsafe(32)
    • per-session fields are username and expires_at
    • expiry is enforced in k_proxy; k_server does not validate client sessions directly
  • Current resource model in k_server:
    • in-memory monotonic counter guarded by a lock
    • access allowed only when request arrives from k_proxy with the expected X-Proxy-Token
  • Current concurrency model in code:
    • both services use ThreadingHTTPServer
    • k_proxy protects session-map mutations and garbage collection with a single lock
    • k_server protects counter increments with a single lock
    • TLS verification and upstream fetches happen outside the session lock in k_proxy
  • Current runtime assumptions and limits:
    • Qubes localhost forwarders are treated as transport plumbing, not as state authorities
    • if k_proxy restarts, in-memory sessions are lost
    • if k_server restarts, the in-memory counter resets
    • the current shared X-Proxy-Token is a prototype trust mechanism, not a final authorization design
  • Practical meaning:
    • race-free behavior is currently defined for session CRUD and counter increments inside one process per VM
    • persistence, distributed session authority, and multi-proxy/multi-server coordination are not implemented yet
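
The session model above, sketched (field names follow this note; it is illustrative, not the exact k_proxy_app.py implementation):

```python
import secrets
import threading
import time

class SessionStore:
    """In-memory session authority in k_proxy: opaque bearer tokens with TTL."""
    def __init__(self, ttl_s: float = 300.0):
        self._ttl = ttl_s
        self._lock = threading.Lock()
        self._sessions = {}  # token -> {"username": ..., "expires_at": ...}

    def create(self, username: str) -> str:
        token = secrets.token_urlsafe(32)  # opaque, unguessable bearer token
        with self._lock:
            self._sessions[token] = {"username": username,
                                     "expires_at": time.time() + self._ttl}
        return token

    def lookup(self, token: str):
        """Return the username for a live session, else None; expired entries are dropped."""
        now = time.time()
        with self._lock:
            entry = self._sessions.get(token)
            if entry is None:
                return None
            if entry["expires_at"] < now:
                del self._sessions[token]  # lazy garbage collection under the lock
                return None
            return entry["username"]

    def revoke(self, token: str) -> None:
        with self._lock:
            self._sessions.pop(token, None)
```

As noted above, slow work (TLS verification, upstream fetches) should happen outside the lock; only session-map mutations are guarded.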

Session note (2026-04-25, Phase 6 client portal prototype):

  • Added browser-facing client process:
    • /home/user/chromecard/k_client_portal.py
  • Current Phase 6 prototype shape:
    • portal runs in k_client on http://127.0.0.1:8766
    • portal keeps local enrolled username state in k_client
    • portal calls k_proxy over the validated TLS forward https://127.0.0.1:9771
  • Current local enrollment model:
    • enrollment is a client-local username selection stored by the portal
    • no dedicated server-side enrollment API exists yet
  • Verified portal API flow in k_client:
    • GET /health returns ok=true
    • POST /api/enroll with alice succeeds
    • POST /api/login succeeds and returns a proxy session token
    • POST /api/status succeeds
    • POST /api/resource/counter succeeds twice with upstream values 3 and 4
    • POST /api/logout succeeds
  • Current implication:
    • k_client now has a concrete client-side process instead of only runbook curls
    • browser-facing flow is now available through the local portal
    • next hardening step is to replace client-local enrollment with the intended enrollment contract and decide whether browser traffic should eventually talk to k_proxy directly or continue through a local client portal

Session note (2026-04-25, Phase 6 enrollment contract):

  • Added proxy-side enrollment API and storage:
    • POST /enroll/register
    • GET /enroll/status?username=<name>
    • persisted prototype store at /home/user/chromecard/k_proxy_enrollments.json in k_proxy
  • Current enrollment authority is now k_proxy, not the k_client portal.
  • Current portal behavior:
    • portal enrollment calls k_proxy over TLS
    • portal keeps only a preferred local username for convenience
    • portal login now depends on proxy-side enrollment existing
  • Verified behavior:
    • direct proxy login for unenrolled bob returns {"ok": false, "error": "user not enrolled", ...}
    • portal enrollment of alice succeeds and persists in proxy-side enrollment storage
    • proxy enrollment status for alice returns ok=true
    • portal login and protected counter access still succeed after enrollment
  • Practical meaning:
    • Phase 6 now has a real k_client -> k_proxy enrollment request path
    • the remaining gap is not basic routing; it is deciding the final enrollment semantics and whether the browser should stay behind a local portal or talk to k_proxy directly

Session note (2026-04-25, browser target moved to k_proxy):

  • k_proxy now serves the browser-facing portal UI directly on / over https://127.0.0.1:9771.
  • k_client_portal.py is now a temporary bridge page:
    • it points users to https://127.0.0.1:9771/
    • it is no longer the primary browser target
  • Verified direct browser/API target behavior from k_client:
    • GET https://127.0.0.1:9771/ returns the proxy portal HTML
    • GET https://127.0.0.1:9771/health returns ok=true
    • direct POST /enroll/register for carol succeeds
    • direct POST /session/login for carol succeeds
  • Current implication:
    • browser traffic is now intended to go straight to k_proxy
    • the k_client portal remains only as a temporary bridge/compatibility layer

Session note (2026-04-25, k_client browser flow page):

  • k_client_portal.py now also serves a local browser demo page again on http://127.0.0.1:8766 inside k_client.
  • The page is useful as an operator/demo surface:
    • register user
    • login with card approval or denial in k_proxy
    • call the protected k_server counter
    • logout
  • The page now also exposes current proxy enrollment state:
    • shows the registered users visible in k_proxy
    • lets the operator select a listed user into the username field
    • lets the operator unregister users from the browser page
    • login now uses the current username field instead of only the portal's last remembered user
  • It also makes the negative path explicit:
    • if login is denied on the card, the page reports that k_server was not called
  • Primary browser-facing app logic still lives on k_proxy, but the k_client page is now a concrete demo/control surface rather than just a redirect.

Session note (2026-04-25, provisional enrollment hardening):

  • The enrollment contract in k_proxy is now explicit but provisional.
  • Current prototype enrollment rules:
    • usernames are canonicalized to lowercase
    • allowed username pattern is 3-32 chars using lowercase letters, digits, ., _, -
    • optional display_name is allowed up to 64 chars
    • enrollment create is create-only and duplicate create returns user already enrolled
    • enrollment update is a separate operation
    • enrollment delete is a separate operation and removes any active sessions for that username
  • Current enrollment endpoints on k_proxy:
    • POST /enroll/register
    • GET /enroll/status?username=<name>
    • POST /enroll/update
    • POST /enroll/delete
    • GET /enroll/list
  • Verified behavior from k_client against https://127.0.0.1:9771:
    • invalid username A! is rejected
    • create for dave with display_name succeeds
    • duplicate create for dave is rejected
    • update for dave succeeds
    • list returns enrolled users and metadata
    • delete for dave succeeds
    • login for deleted dave fails with user not enrolled
  • Deliberate current limit:
    • enrollment itself still does not require card presence; only login does
    • this was kept lightweight because the enrollment semantics are expected to change later

Session note (2026-04-25, Phase 6.5 concurrency probe):

  • Added reproducible concurrency probe:
    • /home/user/chromecard/phase65_concurrency_probe.py
    • probe now supports --max-workers so client-side fan-out can be swept explicitly
  • Successful baseline run from k_client against direct proxy path:
    • 3 users
    • 4 protected requests per user
    • 12/12 requests succeeded
    • counter values were unique and contiguous from 6 to 17
    • max observed latency was about 457 ms
  • Larger follow-up run exposed current limit:
    • 5 users
    • 5 protected requests per user
    • 18/25 requests succeeded
    • failures returned TLS EOF / upstream unavailable errors
    • successful counter values were still unique and contiguous from 18 to 35
    • max observed latency was about 758 ms
  • Additional Phase 6.5 diagnosis:
    • fixed a keep-alive/body-drain bug in the HTTP/1.1 experiment so k_server no longer misparses follow-on requests as {}POST
    • added an upstream connection pool in k_proxy; current default/test setting clamps k_proxy -> k_server to one pooled TLS connection
    • despite that change, a full fan-out run with 25 in-flight protected calls still fails on client-observed TLS EOFs
    • a worker-limited run now passes cleanly:
      • 5 users
      • 5 protected requests per user
      • 25/25 requests succeeded with --max-workers 10
    • raising client-side fan-out still breaks:
      • 22/25 requests succeeded with --max-workers 15
      • 15/25 requests succeeded with fully unbounded 25 workers in the latest rerun
  • Current diagnosis:
    • the protected counter and session logic stay correct under load; successful values remain unique and contiguous
    • k_proxy and k_server can complete the requests that actually reach them
    • the primary collapse point in current testing is the client-facing Qubes forwarder on 9771
    • qvm_connect_9771.log shows qrexec-agent-data / data-vchan failures and repeated xs_transaction_start: No space left on device
    • qvm_connect_9780.log also showed earlier qrexec failures, but the latest worker-threshold evidence points first to connection fan-out on k_client -> k_proxy
  • Practical meaning:
    • the application logic is good for moderate concurrent use in the current prototype
    • the direct browser path appears stable around 10 in-flight protected calls in the current Qubes setup
    • the current concurrency ceiling is being set by Qubes forwarding behavior rather than by the monotonic counter logic
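
The probe's worker sweep and pass criteria can be sketched like this (illustrative only; the real probe lives at /home/user/chromecard/phase65_concurrency_probe.py):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def fan_out(call, total: int, max_workers: int) -> list:
    """Issue `total` protected calls with at most `max_workers` in flight,
    mirroring the probe's --max-workers sweep."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda _: call(), range(total)))

def unique_and_contiguous(values) -> bool:
    """Pass criterion for successful counter values: no duplicates, no gaps."""
    s = sorted(values)
    if not s:
        return True
    return len(set(s)) == len(s) and s == list(range(s[0], s[0] + len(s)))
```

Sweeping max_workers while keeping total fixed is what separated the ~10 in-flight stable point from the unbounded-fan-out failures above.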

Session note (2026-04-25, in-VM forwarding test):

  • Tested the intended in-VM forwarding path with qvm-connect-tcp instead of host-side qrexec-client-vm.
  • Forwarders start and bind locally:
    • in k_client: qvm-connect-tcp 8771:k_proxy:8771 binds localhost:8771
    • in k_proxy: qvm-connect-tcp 8780:k_server:8780 binds localhost:8780
  • But the actual client->proxy connection is still refused when used:
    • k_client forward log shows Request refused
    • socat reports child exit status 126 and Connection reset by peer
  • Local login on k_proxy reaches the app but fails on the auth dependency:
    • POST /session/login to http://127.0.0.1:8771 returns 401
    • details: Missing dependency: python-fido2 ... No module named 'fido2'
  • k_server was not reached during this login test; current k_server.log only shows /health.

Session note (2026-04-25, after python3-fido2 install):

  • k_proxy was restarted after python3-fido2 installation and now listens again on 127.0.0.1:8771.
  • The previous Python import blocker is resolved; local login now reaches the CTAP probe path.
  • Current local login result on k_proxy:
    • {"ok": false, "error": "card auth failed", "details": "No CTAP HID devices found."}
  • Current forwarded login result from k_client is still not completing:
    • curl http://127.0.0.1:8771/session/login -> Empty reply from server
    • qvm_connect_8771.log still shows repeated Request refused and child exit status 126
  • Practical meaning:
    • Python dependency issue in k_proxy is fixed
    • card access inside k_proxy is currently missing again at CTAP/HID level
    • k_client -> k_proxy qrexec forwarding is still effectively denied/refused

Session note (2026-04-25, card reattached):

  • Card visibility in k_proxy is restored again:
    • /dev/hidraw0 and /dev/hidraw1 present
    • fido2_probe.py --list detects ChromeCard on /dev/hidraw0
  • Local login on k_proxy now succeeds again:
    • POST /session/login on 127.0.0.1:8771 returns 200
    • session creation for user alice succeeded
  • Remaining failure is isolated to the client-facing qrexec path:
    • k_client -> localhost:8771 through qvm-connect-tcp still returns Empty reply from server
    • qvm_connect_8771.log still shows Request refused

Session note (2026-04-25, clean forward retest):

  • Re-ran both forwards and exercised each hop immediately after local bind.
  • k_proxy -> k_server:
    • qvm-connect-tcp 8780:k_server:8780 binds localhost:8780 in k_proxy
    • first real POST /resource/counter through that forward returns Empty reply from server
    • qvm_connect_8780.log then records Request refused with child exit status 126
  • k_client -> k_proxy:
    • qvm-connect-tcp 8771:k_proxy:8771 binds localhost:8771 in k_client
    • first real POST /session/login through that forward returns Empty reply from server
    • qvm_connect_8771.log records Request refused with child exit status 126
  • Conclusion from this retest:
    • both forwards fail in the same way
    • local bind succeeds, but the actual qrexec qubes.ConnectTCP request is refused when the first connection is attempted

Session note (2026-04-25, dom0 policy fix validated):

  • After changing dom0 policy to use explicit destination VMs instead of @default for qubes.ConnectTCP, both forwards now work.
  • Verified hop 1:
    • in k_proxy, POST http://127.0.0.1:8780/resource/counter with X-Proxy-Token: dev-proxy-token succeeds
    • response included counter value 1
  • Verified hop 2:
    • in k_client, POST http://127.0.0.1:8771/session/login succeeds
    • session token is returned through the k_client -> k_proxy forward
  • Verified full end-to-end flow from k_client:
    • login succeeded and returned session token
    • POST /session/status succeeded
    • POST /resource/counter succeeded twice with upstream values 2 and 3
    • POST /session/logout succeeded
    • post-logout POST /resource/counter correctly returned 401 invalid or expired session
  • Current conclusion:
    • k_client -> k_proxy -> k_server chain is operational
    • session reuse and logout behavior are working in the current prototype

Session note (2026-04-25, live chain re-validation and regression helper):

  • Re-validated the split-VM chain after restart using the current TLS/localhost-forward shape:
    • k_client local 9771 -> k_proxy:8771
    • k_proxy local 9780 -> k_server:8780
  • Verified live service state during this run:
    • k_server local https://127.0.0.1:8780/health returned ok=true
    • k_proxy local https://127.0.0.1:8771/health returned ok=true
    • k_proxy local https://127.0.0.1:9780/health reached k_server
    • k_client local https://127.0.0.1:9771/health reached k_proxy
  • Verified end-to-end behavior from k_client:
    • login for alice succeeded
    • session status succeeded
    • protected counter calls succeeded with session reuse
    • logout succeeded
    • post-logout protected access returned 401 invalid or expired session
  • Added reproducible regression helper at:
    • /home/user/chromecard/phase5_chain_regression.sh
  • Verified the new helper end-to-end on 2026-04-25:
    • default run uses 20 requests at parallelism 8
    • returned values were unique and gap-free
    • latest verified counter range from the helper was 43..62
  • Practical meaning:
    • the current blocker is no longer Qubes forwarding for the base Phase 5 chain
    • the current next-step gap is auth semantics, not transport bring-up

Session note (2026-04-25, direct FIDO2 auth attempt):

  • Added an experimental direct FIDO2 path in /home/user/chromecard/k_proxy_app.py:
    • runtime switch: --auth-mode fido2-direct
    • default runtime remains probe
  • Added a low-level CTAP helper at /home/user/chromecard/raw_ctap_probe.py:
    • purpose: bypass Fido2Client and exercise raw CTAP2 makeCredential / getAssertion
    • logs keepalive callbacks and exact transport exceptions for host-side debugging
  • Direct-mode intent:
    • replace the legacy fido2_probe.py --json session gate
    • perform real credential registration and real assertion verification locally in k_proxy with python-fido2
  • Current observed blocker on k_proxy:
    • direct make_credential fails with No compatible PIN/UV protocols supported!
    • reproduces outside the app in a minimal VM-side probe, so this is not just a handler bug
    • likely cause is the current card / python-fido2 stack selecting a PIN/UV-dependent CTAP2 path for registration
  • Additional probe:
    • a forced CTAP1 fallback experiment did not fail immediately, but also did not complete quickly enough to treat as a usable working path in this turn
  • Latest live blocker (2026-04-25, after refactor/deploy):
    • direct probing is currently blocked before the card Yes/No UI stage because k_proxy no longer sees any CTAP HID device
    • ssh k_proxy "python3 /home/user/chromecard/fido2_probe.py --list" now returns No CTAP HID devices found.
    • ssh k_proxy "ls -l /dev/hidraw*" shows no hidraw nodes at the moment
  • Follow-up after card reattach (2026-04-25):
    • k_proxy again shows /dev/hidraw0 and /dev/hidraw1
    • direct node-open check confirms /dev/hidraw0 is readable as the normal user
    • /dev/hidraw1 still returns PermissionError: [Errno 13] Permission denied
    • raw makeCredential probe still produced no on-card registration prompt, so the host path is hanging before the firmware Yes/No UI
    • hidraw mapping confirms /dev/hidraw0 is the FIDO interface:
      • report descriptor begins with usage page 0xF1D0
      • get_descriptor('/dev/hidraw0') returns report_size_in=64, report_size_out=64
    • /dev/hidraw1 is a separate vendor HID interface with usage page 0xFF00
    • stale Python probes holding /dev/hidraw0 were cleared, but behavior did not change
    • a manual CTAPHID INIT packet sent directly to /dev/hidraw0 writes successfully and still gets no response within 3s
    • this places the current blocker below python-fido2: raw HID traffic is not getting a CTAPHID reply after the latest reattach
    • webauthn_local_demo.py was re-run inside k_proxy after reattach and still produced no card prompt on register
    • that confirms the current failure is below both the browser WebAuthn path and the direct python-fido2 path
    • after a full power cycle and reattach, manual CTAPHID INIT on /dev/hidraw0 started replying again
    • webauthn_local_demo.py register in k_proxy then succeeded again, confirming the card transport was recovered by the power cycle
    • direct host-side registration via raw_ctap_probe.py --device-path /dev/hidraw0 make-credential --rp-id localhost also succeeded again after pressing yes on the card
    • returned credential material included:
      • fmt="none"
      • credential id 7986cfcf45663f625eb7fc7b52640d83cf3d0e8a6627eeadaba3126406b1e0b8
    • this confirms the recovered direct path now reaches the real card confirmation UI and completes CTAP2 makeCredential
    • k_proxy_app.py --auth-mode fido2-direct was then patched to:
      • use low-level CTAP2 instead of the higher-level Fido2Client registration/assertion calls
      • open the explicit FIDO node /dev/hidraw0 instead of scanning devices
      • cache the direct device handle instead of reopening it for each operation
    • interim blockers (resolved later this session):
      • narrowed through repeated retries to a mix of hidraw node disappearance, older python-fido2 response-mapping requirements, and CTAP payload-shape mismatches
    • latest verified state:
      • after reattach with healthy CTAPHID INIT, real app registration through k_proxy_app.py --auth-mode fido2-direct now succeeds
      • /enroll/register for directtest returned ok=true and has_credential=true
      • real app login through /session/login for directtest also now succeeds after card confirmation
      • returned auth_mode is fido2_assertion
      • session status succeeds
      • protected /resource/counter access succeeds again through k_proxy -> k_server
      • logout succeeds
      • post-logout protected access returns 401
      • direct mode no longer depends on a fixed /dev/hidraw0 path
      • after a later re-enumeration where the card appeared on /dev/hidraw1, k_proxy_app.py was patched to probe available /dev/hidraw* nodes and select the first working CTAPHID device automatically
      • browser registration then worked again without changing the configured --direct-device-path
    • temporary direct-mode hidraw lifetime logging has been removed again after diagnosis
    • /home/user/chromecard/phase5_chain_regression.sh now supports the direct-auth baseline via:
      • --interactive-card
      • --login-timeout
      • --expect-auth-mode fido2_assertion
  • Practical outcome for this session:
    • the experimental direct mode is kept in code for follow-up work
    • the deployed k_proxy service was restored to default probe mode
    • verified alice login still works afterward, so the validated Phase 5 baseline remains intact
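
The manual CTAPHID INIT probe used above to localize the transport failure can be sketched as follows. This is a reconstruction from the FIDO CTAP specification's HID framing (broadcast channel 0xFFFFFFFF, command 0x86, 8-byte nonce), not the exact code in raw_ctap_probe.py:

```python
import os
import secrets
import select
import struct

def build_init_packet(nonce: bytes, report_len: int = 64) -> bytes:
    """CTAPHID initialization packet: 4-byte channel ID (broadcast),
    command byte with the high bit set (0x80 | 0x06 = 0x86 for INIT),
    2-byte big-endian payload length, 8-byte nonce, zero padding."""
    assert len(nonce) == 8
    frame = struct.pack(">IBH", 0xFFFFFFFF, 0x86, len(nonce)) + nonce
    return frame.ljust(report_len, b"\x00")

def probe_init(path: str = "/dev/hidraw0", timeout_s: float = 3.0) -> bool:
    """Send INIT and report whether the card echoes our nonce in time.
    Requires the card; returns False in the hang state described above."""
    nonce = secrets.token_bytes(8)
    fd = os.open(path, os.O_RDWR)
    try:
        os.write(fd, build_init_packet(nonce))
        ready, _, _ = select.select([fd], [], [], timeout_s)
        if not ready:
            return False  # no CTAPHID reply within the timeout
        reply = os.read(fd, 64)
        # Reply layout: CID (4) + cmd (1) + length (2), then the nonce.
        return reply[7:15] == nonce
    finally:
        os.close(fd)
```

On a healthy attach, probe_init() returns True with no on-card interaction; the post-reattach failure state above shows as a False return after the 3 s timeout.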

Known FIDO2 Transport Boundary

  • FIDO2 on this firmware is handled via USB HID (CTAPHID), not Wi-Fi/BLE/MQTT.
  • Key code points in CR_SDK_CK-main:
    • mgr_fido2.c: mgr_fido2_init() registers fido2_ctaphid_handle_packet.
    • ctaphid.c: fido2_ctaphid_handle_packet(...).
    • cr_config.h: FIDO2 HID report descriptor definitions.
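
For orientation on the host side of that boundary, this is a hedged sketch (from the FIDO CTAP specification, not from the firmware sources) of how one CTAPHID message maps onto the 64-byte HID reports that fido2_ctaphid_handle_packet(...) consumes:

```python
import struct

def ctaphid_frames(cid: int, cmd: int, payload: bytes,
                   report_len: int = 64) -> list:
    """Split one CTAPHID message into fixed-size HID reports.

    Initialization packet: CID + (0x80 | cmd) + 2-byte length + payload.
    Continuation packets: CID + sequence number (0, 1, ...) + payload.
    """
    head = struct.pack(">IBH", cid, 0x80 | cmd, len(payload))
    frames = [(head + payload[:report_len - 7]).ljust(report_len, b"\x00")]
    offset, seq = report_len - 7, 0
    while offset < len(payload):
        chunk = payload[offset:offset + report_len - 5]
        frames.append((struct.pack(">IB", cid, seq) + chunk)
                      .ljust(report_len, b"\x00"))
        offset += report_len - 5
        seq += 1
    return frames
```

Presumably the reassembly logic in ctaphid.c is the mirror image of this split, which is why a 64-byte report size shows up in both the firmware descriptor definitions and the host-side probes.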

Host Bring-Up Steps (How To Get To A Working FIDO2 Check)

  1. Confirm USB enumeration and HID visibility.
  • Replug the card with a known data-capable cable.
  • Check: ls -l /dev/hidraw*
  2. If needed, grant Linux HID access for this device.
  • Add rule at /etc/udev/rules.d/70-chromecard-fido.rules:
    SUBSYSTEM=="hidraw", ATTRS{idVendor}=="1209", ATTRS{idProduct}=="0005", MODE="0660", TAG+="uaccess"
  • Reload/apply rules and replug the device.
  3. Verify CTAP HID presence.
  • python3 /home/user/chromecard/fido2_probe.py --list
  • Then: python3 /home/user/chromecard/fido2_probe.py --json
  • For raw CTAP debugging on k_proxy:
  • python3 /home/user/chromecard/raw_ctap_probe.py info
  • python3 /home/user/chromecard/raw_ctap_probe.py make-credential --rp-id localhost
  4. Run local WebAuthn bring-up demo.
  • python3 /home/user/chromecard/webauthn_local_demo.py
  • Open http://localhost:8765 (use localhost, not 127.0.0.1).
  5. Execute register/login test.
  • Register a user.
  • Login with the same user.
  • Confirm no origin/challenge mismatch errors.
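
When an origin/challenge mismatch does show up in step 5, it is usually visible directly in the clientDataJSON the browser returns. A minimal sketch of that check (field names are from the WebAuthn spec; the expected origin http://localhost:8765 comes from the demo step above):

```python
import base64
import json

def b64url_decode(s: str) -> bytes:
    """Decode base64url with or without padding."""
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))

def client_data_mismatches(client_data_b64: str,
                           expected_challenge: bytes,
                           expected_origin: str = "http://localhost:8765",
                           expected_type: str = "webauthn.create") -> list:
    """Return a list of mismatch descriptions; an empty list means OK."""
    data = json.loads(b64url_decode(client_data_b64))
    problems = []
    if data.get("type") != expected_type:
        problems.append("type: %r" % data.get("type"))
    if b64url_decode(data.get("challenge", "")) != expected_challenge:
        problems.append("challenge mismatch")
    if data.get("origin") != expected_origin:
        # e.g. the page was opened via 127.0.0.1 instead of localhost
        problems.append("origin: %r" % data.get("origin"))
    return problems
```

A non-empty result on register/login points at exactly the mismatch class step 5 is guarding against, without digging through server logs.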

Build/Flash Prerequisites (How To Get To Firmware Build)

  1. Ensure full SDK checkout layout exists under CR_SDK_CK-main:
  • mvp
  • setup
  • components
  • samples
  2. Ensure toolchain is available in shell:
  • west --version
  • nrfjprog --version
  3. Once layout/tooling are in place, run:
  • cd /home/user/chromecard/CR_SDK_CK-main
  • ./scripts/build_flash_mvp.sh

Open Gaps To Resolve

  • Whether a full CR_SDK_CK-main checkout (with role directories) is available locally.
  • Whether server-side code should be pulled now for broader CIP/WebAuthn integration testing.
  • Exact enrollment process interface running in k_client and how it reaches k_proxy.
  • Upgrade Phase 5 auth gate from card-presence probe to full WebAuthn assertion verification for session creation.
  • Determine the viable path for real credential registration on k_proxy:
    • enable whatever PIN/UV support the card expects for direct CTAP2 registration, or
    • adopt a different one-time enrollment path that can persist real credential material for later direct assertion verification.
  • Restore card visibility inside k_proxy so direct probes can reach the card UI again:
    • /dev/hidraw* must exist in k_proxy
    • fido2_probe.py --list must detect the card before the raw Yes/No probe can continue
  • Identify why the host probe hangs before card UI even with /dev/hidraw0 readable:
    • determine why CTAPHID INIT on the correct FIDO hidraw node receives no reply after reattach
    • likely recovery targets are the Qubes USB mediation path, a fresh USB reassign, or a k_proxy VM/device reset
  • Precise ownership split of session/user state between k_proxy and k_server.
  • Concrete concurrency limits and acceptance criteria (requests/sec, parallel clients, latency/error thresholds).