Workplan
Last updated: 2026-05-08
This is the execution plan for making ChromeCard FIDO2 development and validation reproducible on this machine.
Constraints
- Treat /home/user/chromecard/CR_SDK_CK-main as read-only.
- Keep helper scripts such as fido2_probe.py and webauthn_local_demo.py at /home/user/chromecard.
- Target deployment model is Qubes OS with 3 AppVMs based on debian-13-xfce: k_client, k_proxy, k_server.
- Current authenticator link is card -> k_proxy (USB), but architecture must allow migration to wireless phone-mediated validation.
- VM execution path is SSH-first for experiments: ssh <host> <cmd> and scp <file> <host>:~.
Goals
- Re-establish deterministic host-to-card FIDO2 communication over USB HID/CTAPHID.
- Restore a buildable/flashable firmware workspace for CR_SDK_CK-main.
- Turn ad-hoc demos into a repeatable verification flow.
- Stand up chained TLS communication in Qubes: k_client -> k_proxy -> k_server.
- Support both login flow (browser in k_client) and user enrollment flow (process in k_client).
- Minimize repeated card prompts by introducing secure session reuse after successful authentication.
- Implement a protected dummy resource on k_server (monotonic counter) for end-to-end validation.
- Ensure k_proxy and k_server are thread-safe and support concurrent access.
- Prepare k_proxy auth path for future transport shift: USB-direct -> wireless phone bridge.
Phase 0: Qubes VM Baseline (Blocking)
- Provision/verify AppVMs.
- Ensure k_client, k_proxy, k_server exist and are based on debian-13-xfce.
- Assign functional responsibilities.
- k_client: browser client + enrollment process.
- k_proxy: USB card access + proxy/auth bridge.
- k_server: protected resource/service endpoint.
- Define TLS endpoints and certificates.
- k_proxy presents TLS service to k_client.
- k_server presents TLS service to k_proxy.
- Trust roots and cert distribution model documented per VM.
Exit criteria:
- All 3 VMs exist, boot, and have clearly defined service ownership.
Phase 1: Qubes Firewall Policy
- Enforce allowed forward paths only.
- Allow k_client outbound TLS only to k_proxy service port(s).
- Allow k_proxy outbound TLS only to k_server service port(s).
- Deny direct k_client to k_server traffic.
- Validate return path behavior.
- Confirm responses propagate back through established flows.
- Verify with simple probes.
- TLS handshake and HTTP(S) checks from k_client to k_proxy.
- TLS handshake and HTTP(S) checks from k_proxy to k_server.
Exit criteria:
- Policy matches intended chain and is test-verified.
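The "simple probes" above could be scripted along the following lines. This is a minimal sketch, not one of the workspace helper scripts: the function name probe_tls and its parameters are hypothetical, and the CA file path would be whatever the certificate model in Phase 2 produces.

```python
import socket
import ssl
from typing import Optional, Tuple


def probe_tls(host: str, port: int, ca_file: Optional[str] = None,
              server_name: Optional[str] = None,
              timeout: float = 3.0) -> Tuple[bool, str]:
    """Attempt a TCP connect plus TLS handshake; return (ok, detail)."""
    # With ca_file set, only that CA is trusted; otherwise system roots apply.
    context = ssl.create_default_context(cafile=ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=server_name or host) as tls:
                return True, f"handshake ok ({tls.version()})"
    except ssl.SSLError as exc:
        # Certificate validation or protocol errors after TCP connected.
        return False, f"TLS failure: {exc}"
    except OSError as exc:
        # Timeouts, connection refused, unreachable hosts.
        return False, f"TCP failure: {exc}"
```

Separating "TCP failure" from "TLS failure" in the result helps distinguish a firewall/forwarding problem from a certificate problem when a hop does not respond.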
Status (2026-04-24, remote diagnostics):
- Confirmed active blocker remains Phase 1 network policy/pathing.
- Evidence from live VM probes:
- k_client (10.137.0.16) -> k_proxy (10.137.0.12:8771): TCP timeout.
- k_proxy (10.137.0.12) -> k_server (10.137.0.13:8780): upstream timeout.
- Local service health inside each VM is good, so failure is inter-VM reachability, not local process startup.
Status (2026-04-25, after restart and service recovery):
- Refined blocker: this is currently a qrexec/qubes.ConnectTCP refusal problem, not an app-local listener problem.
- Current evidence:
- k_proxy local /health is up on 127.0.0.1:8771
- k_server local /health is up on 127.0.0.1:8780
- qrexec-client-vm k_proxy qubes.ConnectTCP+8771 -> Request refused
- qrexec-client-vm k_server qubes.ConnectTCP+8780 -> Request refused
- Immediate next action for Phase 1:
- verify and fix the dom0 policy/mechanism that should permit qubes.ConnectTCP forwarding for the chain ports
Status (2026-04-25, dom0 policy fix validated):
- The forwarding blocker is cleared for the current prototype shape.
- Verified working chain:
- k_client localhost 9771 -> k_proxy:8771
- k_proxy localhost 9780 -> k_server:8780
- Verified outcome:
- TLS health checks pass on both hops
- end-to-end login, session status, protected counter access, and logout all succeed from k_client
- Phase 1 is complete for the current localhost-forwarded qubes.ConnectTCP design.
Phase 2: TLS Certificates and Service Endpoints
- Certificate model.
- Create or import CA and issue certs for k_proxy and k_server.
- Install trust roots in client VM(s) that need validation.
- Service shape.
- k_server: HTTPS service exposing protected resource endpoint(s), including a monotonic counter endpoint.
- k_proxy: minimal HTTPS API gateway service (full web server framework not required).
- Endpoint contract.
- Define request/response schema between k_client and k_proxy.
- Define upstream request contract from k_proxy to k_server.
Exit criteria:
- Mutual TLS trust decisions are documented and tested.
- HTTPS calls succeed on both links with expected cert validation.
Status (2026-04-25):
- Implemented HTTPS listeners in both prototype services.
- Added local CA + service certificate generation in generate_phase2_certs.py.
- Verified the working Qubes path is localhost forwarding plus TLS:
- k_client local 9771 forwards to k_proxy:8771
- k_proxy local 9780 forwards to k_server:8780
- Verified cert validation on both hops using the generated CA.
- Verified end-to-end HTTPS flow:
- k_client -> k_proxy login over TLS
- k_proxy -> k_server protected counter call over TLS
- session reuse still works across repeated protected requests
- Phase 2 is now effectively complete for the current prototype shape.
Phase 2.5: Define State Ownership and Concurrency Model
- State ownership.
- Decide where user/session state is authoritative (k_proxy, k_server, or split model).
- Define token/session format and validation boundary.
- Concurrency controls.
- Define thread-safe strategy for session store and shared counters.
- Define locking/atomic/update semantics for counter increments and session updates.
- Runtime model.
- Choose service runtime/config that supports simultaneous requests safely.
Exit criteria:
- Architecture clearly documents state authority and race-free update rules.
Next action (2026-04-25):
- Move into Phase 2.5 and make the current prototype decisions explicit:
- authority for session state remains k_proxy
- k_server remains authority for the protected counter/resource state
- localhost Qubes forwarders are part of the active runtime model for the two TLS hops
- define concurrency assumptions and limits around session store, forwarders, and counter access
Status (2026-04-25):
- Current ownership model is now explicit:
- k_proxy is authoritative for session creation, expiry, lookup, and logout
- k_server is authoritative for the protected monotonic counter
- k_client is a client only; it holds bearer tokens but is not a state authority
- Current validation boundary is explicit:
- k_proxy validates bearer tokens against its in-memory session store
- k_server trusts only requests that arrive with the configured X-Proxy-Token
- k_server does not currently validate end-user session tokens directly
- Current concurrency strategy is explicit:
- k_proxy uses ThreadingHTTPServer plus one lock around the in-memory session map
- k_server uses ThreadingHTTPServer plus one lock around counter increments
- upstream HTTPS calls from k_proxy are made outside the session-store lock
- Current runtime limits are explicit:
- sessions are process-local and disappear on k_proxy restart
- counter state is process-local and resets on k_server restart
- transport relies on Qubes localhost forwarders 9771 and 9780
- Phase 2.5 is complete for the current prototype shape.
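The k_server side of this concurrency strategy (ThreadingHTTPServer plus one lock around counter increments, gated by X-Proxy-Token) can be sketched roughly as follows. This is an illustration under assumptions, not the contents of k_server_app.py: the class names, the /resource/counter path on k_server, the JSON field name, and the token value are all hypothetical, and TLS is omitted for brevity.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Counter:
    """Process-local monotonic counter; the lock makes increments atomic."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def next(self) -> int:
        with self._lock:
            self._value += 1
            return self._value


COUNTER = Counter()
PROXY_TOKEN = "change-me"  # hypothetical shared upstream token


class Handler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        if self.path != "/resource/counter":  # assumed path
            self.send_error(404)
            return
        if self.headers.get("X-Proxy-Token") != PROXY_TOKEN:
            self.send_error(401)
            return
        body = json.dumps({"value": COUNTER.next()}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # One thread per request; all threads share COUNTER and its lock.
    ThreadingHTTPServer(("127.0.0.1", 8780), Handler).serve_forever()
```

Because the counter lives only in process memory, this sketch has the same runtime limit noted above: the value resets when the process restarts.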
Phase 3: Recover Basic Device Visibility on k_proxy (Blocking)
- Verify physical + USB enumeration path.
- Check cable/port and confirm device appears in USB listings.
- Confirm /dev/hidraw* nodes appear when card is connected.
- Validate Linux permissions.
- Install/update udev rule for ChromeCard HID VID/PID.
- Reload udev and verify non-root read/write access to hidraw node.
- Re-run host probe.
- Run python3 /home/user/chromecard/fido2_probe.py --list.
- Run python3 /home/user/chromecard/fido2_probe.py --json.
- Record VID/PID/path and CTAP2 getInfo output in Setup.md.
Exit criteria:
- At least one CTAP HID device is listed.
- --json returns valid ctap2_info.
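Before running the probe, the non-root hidraw access check from the permissions step can be done with a few lines of Python. This is a minimal sketch; the function name hidraw_access_report is hypothetical and it only reports what the current user can open, it does not identify which node is the FIDO interface.

```python
import glob
import os
from typing import Dict, List


def hidraw_access_report(pattern: str = "/dev/hidraw*") -> List[Dict[str, object]]:
    """List matching device nodes with read/write access for the current user."""
    report = []
    for path in sorted(glob.glob(pattern)):
        report.append({
            "path": path,
            "readable": os.access(path, os.R_OK),
            "writable": os.access(path, os.W_OK),
        })
    return report


if __name__ == "__main__":
    nodes = hidraw_access_report()
    if not nodes:
        print("no hidraw nodes found; check cable, USB attachment, and enumeration")
    for node in nodes:
        print(node)
```

An empty list points at enumeration (cable, port, USB attachment to the VM), while a node that is present but not readable/writable points at the udev rule.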
Phase 4: Re-validate Local WebAuthn Demo on k_proxy
- Start local demo server.
- Run python3 /home/user/chromecard/webauthn_local_demo.py.
- Confirm URL is http://localhost:8765.
- Exercise register/login.
- Register a test user.
- Authenticate with same user.
- Capture errors (if any) and update Setup.md.
- Decide next demo hardening step.
- Keep bring-up-only mode, or
- add signature verification for attestation/assertion.
Exit criteria:
- Register and login both complete with card interaction prompts.
Status (2026-04-24):
- Completed in k_proxy using http://localhost:8765.
- Registration result: ok=true, username=alice, credential_count=1.
- Authentication result: ok=true, username=alice, authenticated=true.
Phase 5: Implement Proxy Auth + Session Reuse
- Authenticate via card once per session window.
- k_proxy handles initial auth using connected card.
- On success, create session state for k_client.
- Session model.
- Prefer server-side session store or signed session token.
- Include TTL/expiry, rotation, and explicit invalidation/logout path.
- Do not expose card secrets or long-lived auth material to k_client.
- Proxying behavior.
- With valid session: k_proxy forwards request to k_server and returns result.
- Without valid session: require fresh card-backed auth flow.
Exit criteria:
- Repeated authorized requests do not require card interaction until session expiry.
- Expired/invalid sessions are correctly rejected.
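A server-side session store with TTL, explicit invalidation, and one lock around the session map might look roughly like this. It is a sketch of the session model described above, not the k_proxy_app.py implementation; the class name SessionStore, its method names, and the 300-second default are assumptions for illustration.

```python
import secrets
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class SessionStore:
    """In-memory bearer-token sessions with expiry; safe for concurrent use."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._sessions: Dict[str, Tuple[str, float]] = {}  # token -> (user, expiry)

    def create(self, username: str) -> str:
        token = secrets.token_urlsafe(32)  # opaque; carries no card material
        with self._lock:
            self._collect_expired_locked()
            self._sessions[token] = (username, self._clock() + self._ttl)
        return token

    def lookup(self, token: str) -> Optional[str]:
        with self._lock:
            entry = self._sessions.get(token)
            if entry is None:
                return None
            username, expires_at = entry
            if self._clock() >= expires_at:
                del self._sessions[token]
                return None
            return username

    def invalidate(self, token: str) -> bool:
        with self._lock:
            return self._sessions.pop(token, None) is not None

    def _collect_expired_locked(self) -> None:
        # Caller must hold the lock.
        now = self._clock()
        for token in [t for t, (_, exp) in self._sessions.items() if now >= exp]:
            del self._sessions[token]
```

The injectable clock is only there so that expiry can be tested without sleeping; upstream calls to k_server would be made after lookup returns, outside the lock.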
Status (2026-04-24):
- Started with a runnable prototype:
- /home/user/chromecard/k_proxy_app.py
- /home/user/chromecard/k_server_app.py
- /home/user/chromecard/PHASE5_RUNBOOK.md
- Implemented in prototype:
- session create/status/logout endpoints in k_proxy
- TTL-based server-side session store with expiry garbage collection
- protected monotonic counter endpoint in k_server with thread-safe increments
- proxy forwarding from k_proxy to k_server using a shared upstream token
- Current auth gate for session creation is card-presence probe (fido2_probe.py --json), pending upgrade to full assertion verification path.
Status (2026-04-25):
- Prototype services were re-started successfully after VM restart.
- Current split-VM test shape is:
- k_proxy listening on 127.0.0.1:8771
- k_server listening on 127.0.0.1:8780
- End-to-end validation is now passing through the live chain from k_client.
- Current verified behavior:
- login succeeds for alice
- session status succeeds
- repeated protected counter requests succeed with session reuse
- logout succeeds
- post-logout protected access returns 401
- Added repeatable host-side regression helper:
- /home/user/chromecard/phase5_chain_regression.sh
- Phase 5 is complete for the current prototype semantics.
- Experimental follow-up in code:
- k_proxy_app.py now also has --auth-mode fido2-direct
- this mode attempts direct credential registration and direct assertion verification with python-fido2
- it is not the deployed default because direct registration currently fails on k_proxy with "No compatible PIN/UV protocols supported!"
- /home/user/chromecard/raw_ctap_probe.py now exists for lower-level CTAP2 probing with keepalive/error logging
- latest retry result: after reattaching the card, k_proxy again exposes /dev/hidraw0 and /dev/hidraw1, but raw makeCredential still reaches no Yes/No card prompt
- /dev/hidraw0 opens successfully as the normal user; /dev/hidraw1 is still permission-denied
- manual CTAPHID testing now shows /dev/hidraw0 is the correct FIDO interface and a direct INIT write gets no response at all
- rerunning webauthn_local_demo.py inside k_proxy also still gives no card prompt, so the current break is below both browser WebAuthn and direct host probes
- after a full power cycle and reattach, manual CTAPHID INIT replies again and browser registration in webauthn_local_demo.py succeeds again
- direct raw_ctap_probe.py --device-path /dev/hidraw0 make-credential --rp-id localhost now also succeeds again after card confirmation
- k_proxy_app.py --auth-mode fido2-direct has been moved onto low-level CTAP2 with hidraw auto-detection; it still accepts --direct-device-path, but no longer breaks if the card re-enumerates onto /dev/hidraw1
- after repeated fixes for hidraw lifetime, VM-side python-fido2 response mapping, and CTAP payload shape, real app registration now succeeds for directtest
Phase 5.5: Implement Dummy Resource + Access Policy on k_server
- Protected dummy resource.
- Add endpoint returning increasing number.
- Require valid upstream auth/session context from k_proxy.
- Optional user/session handling.
- Add minimal user/session checks if k_server is chosen as authority (or partial authority).
- Correctness under concurrency.
- Ensure increments are monotonic and race-safe under parallel calls.
Exit criteria:
- Authorized requests obtain consistent increasing values.
- Unauthorized requests are rejected.
Status (2026-04-25):
- The protected counter resource is implemented and validated in the live split-VM chain.
- Verified behavior:
- authorized requests from k_proxy obtain increasing values
- unauthorized post-logout requests from k_client are rejected with 401
- 20 concurrent protected requests through the chain returned unique, gap-free values
- Phase 5.5 is complete for the current prototype shape.
Phase 6: Integrate Client Enrollment + Proxy Login Flow
- Enrollment process in k_client.
- Start process from k_client that captures new-user enrollment intent/data.
- Route enrollment requests to k_proxy over TLS.
- Card-mediated login in k_proxy.
- k_proxy uses connected card for FIDO2/WebAuthn operations.
- k_proxy authenticates toward k_server over TLS.
- Browser flow in k_client.
- Browser traffic goes only to k_proxy.
Immediate next action:
- Preserve the now-working direct auth path as a tested option while keeping the default deployed baseline stable.
- Verified end-to-end state:
- direct /enroll/register succeeds for directtest
- direct /session/login succeeds for directtest
- /session/status succeeds
- protected /resource/counter succeeds through k_proxy -> k_server
- /session/logout succeeds
- post-logout protected access returns 401
- Next work should be cleanup/hardening:
- decide whether to keep directtest enrollment
- rerun phase5_chain_regression.sh --interactive-card --expect-auth-mode fido2_assertion against the current direct-auth baseline
- decide when fido2-direct should replace probe as the default deployed auth mode
Exit criteria:
- Enrollment and login both function end-to-end via k_client -> k_proxy -> k_server.
Status (2026-04-25):
- Added first k_client implementation at /home/user/chromecard/k_client_portal.py.
- Current prototype flow:
- browser now targets k_proxy directly over https://127.0.0.1:9771
- k_client_portal.py also serves a local browser flow page on http://127.0.0.1:8766
- k_proxy continues to authenticate with the card and forward to k_server
- the k_client page now also lists registered users from k_proxy
- the k_client page can unregister users from the browser
- the portal login action now uses the current username field instead of only the remembered local user
- a Playwright regression spec now exists for the browser flow in tests/k_client_portal.spec.js
- the Playwright browser regression has now passed end-to-end once from this host against a forwarded portal URL
- Verified end-to-end through the portal:
- enroll alice
- login succeeds
- session status succeeds
- protected counter succeeds repeatedly with session reuse
- logout succeeds
- Enrollment contract progress:
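- k_proxy now exposes prototype enrollment endpoints
- proxy-side enrollment storage exists and is checked before login is allowed
- direct browser/API traffic can now use those proxy endpoints without going through the local bridge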
- direct browser/API traffic can now use those proxy endpoints without going through the local bridge
- Phase 6 is materially further along for the current prototype shape:
- direct browser target is on k_proxy
- login/resource flow is integrated on the direct proxy path
- enrollment now has a real client->proxy path
- the k_client page is now a usable demo/operator surface in addition to the direct proxy path
- final enrollment semantics are still provisional
Status (2026-04-25, enrollment hardening):
- Added a more explicit provisional enrollment contract in k_proxy:
- username normalization and validation
- optional display_name
- separate create, update, delete, status, and list operations
- delete invalidates existing sessions for that username
- Verified the hardened behaviors on the direct proxy path.
- Phase 6 is now strong enough to treat the browser/proxy flow as a stable prototype baseline.
- The remaining reason Phase 6 is not "final" is product semantics, not missing basic mechanics:
- whether enrollment should require card presence
- what user attributes belong in enrollment
- what re-enroll and recovery should mean
Status (2026-04-25, Phase 6.5 initial concurrency results):
- Added reproducible probe script at /home/user/chromecard/phase65_concurrency_probe.py.
- Probe now supports --max-workers so client-side fan-out can be tested separately from total request count.
- Moderate direct-path concurrency passes:
- 3 users x 4 requests
- 12/12 successful protected calls
- counter values remained unique and contiguous
- Larger direct-path concurrency currently fails:
- 5 users x 5 requests
- only 18/25 successful protected calls
- failed calls report TLS EOF / upstream unavailable errors
- Follow-up findings are more precise:
- body-drain handling was fixed for the HTTP/1.1 keep-alive experiment
- k_proxy -> k_server upstream concurrency is now clampable and currently tested at one pooled connection
- 5 users x 5 requests passes at 25/25 when client fan-out is limited to --max-workers 10
- the same total load still fails at higher fan-out:
- 22/25 at --max-workers 15
- 15/25 at fully unbounded 25 workers in the latest rerun
- Current bottleneck is still not counter correctness:
- successful results still show unique, contiguous counter values
- k_proxy and k_server complete the requests that actually arrive
- Current likely bottleneck is the client-facing Qubes forwarding layer:
- qvm_connect_9771.log shows qrexec data-vchan failures
- observed message includes xs_transaction_start: No space left on device
- qvm_connect_9780.log showed earlier failures too, but the latest threshold test points first to connection fan-out on k_client -> k_proxy
- Phase 6.5 is therefore started but not complete:
- application-level concurrency looks acceptable at moderate load
- current working envelope is roughly 10 in-flight protected calls on the direct browser path
- higher-load failures still need Qubes forwarding diagnosis before the phase can be closed
Status (2026-04-25, Phase 5 regression helper):
- Added repeatable split-VM regression helper:
- /home/user/chromecard/phase5_chain_regression.sh
- Verified helper result on the live chain:
- 20 requests at parallelism 8
- login/session-status/counter/logout sequence completed successfully
- returned counter values were unique and gap-free
- latest verified helper range was 43..62
- Current implication:
- the Phase 5 baseline is now reproducible
- next work should target auth semantics rather than basic chain bring-up
Phase 6.5: Concurrency and Multi-Client Test Setup
- Single-VM concurrency tests.
- Generate parallel request bursts from k_client to k_proxy.
- Verify response integrity, session reuse behavior, and error rates.
- Multi-client tests.
- Run requests from multiple k_client instances (or equivalent parallel clients) concurrently.
- Verify isolation between users/sessions.
- Acceptance checks.
- No race-related crashes/corruption in k_proxy or k_server.
- Counter/resource behavior remains correct under load.
- Session reuse reduces card prompts while preserving authorization checks.
Exit criteria:
- Test results demonstrate stable concurrent operation with documented limits.
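The core of such a burst test, bounded client fan-out plus a uniqueness/contiguity check on the returned counter values, can be sketched as below. This is not phase65_concurrency_probe.py; run_burst, summarize, and make_local_counter are hypothetical names, and the local counter stands in for a real HTTPS call to the protected endpoint.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_burst(call: Callable[[], int], total: int, max_workers: int) -> List[int]:
    """Issue `total` calls with at most `max_workers` in flight; collect results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(call) for _ in range(total)]
        return [future.result() for future in futures]


def summarize(values: List[int]) -> Dict[str, object]:
    """Report whether counter values are unique and form a gap-free range."""
    ordered = sorted(values)
    unique = len(set(ordered)) == len(ordered)
    contiguous = (unique and bool(ordered)
                  and ordered[-1] - ordered[0] + 1 == len(ordered))
    return {
        "count": len(ordered),
        "unique": unique,
        "contiguous": contiguous,
        "range": (ordered[0], ordered[-1]) if ordered else None,
    }


def make_local_counter(start: int = 0) -> Callable[[], int]:
    """Stand-in for the protected counter call; replace with a real request."""
    lock = threading.Lock()
    state = {"value": start}

    def call() -> int:
        with lock:
            state["value"] += 1
            return state["value"]

    return call


if __name__ == "__main__":
    print(summarize(run_burst(make_local_counter(), total=25, max_workers=10)))
```

Keeping total request count and max_workers as separate parameters mirrors the distinction the status notes draw between total load and client-side fan-out.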
Phase 7: Restore Firmware Build/Flash Path
- Validate SDK tree completeness.
- Confirm presence of mvp, setup, components, samples under CR_SDK_CK-main.
- If missing, obtain full repository/checkpoint and document source.
- Install/enable build tools.
- Ensure west and nrfjprog are available in shell.
- Confirm target board/toolchain match (nrf7002dk/nrf5340/cpuapp, NCS v2.9.2 baseline in docs).
- Run baseline build+flash.
- From CR_SDK_CK-main, run ./scripts/build_flash_mvp.sh.
- If flashing fails, run documented recovery and retry.
Exit criteria:
- Successful west build and west flash.
Phase 8: Consolidate Documentation and Paths
- Remove path drift between docs and actual files.
- Keep fido2_probe.py and webauthn_local_demo.py at workspace root.
- Ensure docs never instruct placing helper scripts under CR_SDK_CK-main.
- Update references consistently in all docs.
- Keep Setup.md current.
- After each significant change, update status snapshot and outcomes.
- Add minimal reproducibility checklist.
- One command list for probe + demo + build/flash prechecks.
- Maintain Markdown execution records continuously.
- Setup.md and Workplan.md are the canonical living docs for this workspace.
- Re-scan relevant .md files before each new execution cycle and reconcile drift.
- Record date-stamped session notes when priorities or blockers change.
Status (2026-04-24, markdown maintenance):
- Re-scanned the active workspace Markdown set and the main source-tree reference docs.
- No workplan phase change was required from this pass.
- Ongoing documentation watch item remains path drift in CR_SDK_CK-main/README_HOST.md, which still uses historical ./scripts/... helper locations instead of workspace-root helper paths.
- Operational note: the markdown scan path now runs cleanly after policy adjustment when invoked without a login shell.
Status (2026-04-24, chain probe retry):
- Phase 1 remains blocked, but the failure point is now narrowed further:
- current refusal occurs at Qubes qubes.ConnectTCP policy/service evaluation for ports 22, 8770, and 8780
- this happens before any end-to-end app-level request can be retried
- Practical implication:
- do not spend time on k_proxy_app.py/k_server_app.py request handling until qrexec forwarding is permitting the intended hops again
- next recovery action is to fix/activate the relevant Qubes qubes.ConnectTCP policy and then re-run the qrexec bridge checks before testing HTTP flow
Status (2026-04-25, post-restart probe):
- Corrected the client-facing proxy port reference to 8771.
- SSH access to k_proxy and card visibility recovered after VM restart.
- New immediate blockers are:
- k_proxy service not listening on 127.0.0.1:8771
- k_server service not listening on 127.0.0.1:8780
- qrexec forwarding for 8771 and 8780 still returns Request refused
- Next retry should start services first, then re-test qrexec forwarding and only then attempt end-to-end client flow.
Status (2026-04-25, service restart):
- Local VM services are running again on the intended loopback ports:
- k_server: 127.0.0.1:8780
- k_proxy: 127.0.0.1:8771
- Phase 1 remains blocked specifically by qrexec policy/forwarding refusal on those ports.
- Next action is no longer app startup; it is fixing the qubes.ConnectTCP allow path for 8771 and 8780.
Status (2026-04-25, in-VM forwarding test):
- Verified that using qvm-connect-tcp inside the source VMs still does not complete the client->proxy hop:
- bind succeeds locally, but first real connection gets Request refused
- Independent app-layer blocker also found in k_proxy:
- python-fido2 is missing there, so local /session/login currently fails before card auth can succeed
- Current ordered blockers:
- first: effective Qubes/qrexec allow path for k_client -> k_proxy:8771
- second: install python-fido2 in k_proxy
- third: re-test end-to-end login and then proxy->server counter flow
Status (2026-04-25, after python3-fido2 install):
- python3-fido2 blocker in k_proxy is resolved.
- Updated ordered blockers:
- first: effective Qubes/qrexec allow path for k_client -> k_proxy:8771
- second: restore CTAP HID device visibility/access in k_proxy (No CTAP HID devices found)
- third: re-test end-to-end login and then proxy->server counter flow
Status (2026-04-25, card reattached):
- CTAP HID visibility/access in k_proxy is restored.
- Local proxy login is working again with the attached card.
- The only currently confirmed blocker for the end-to-end path is the k_client -> k_proxy:8771 qrexec/qvm-connect-tcp refusal.
Status (2026-04-25, clean forward retest):
- The retest shows the same qrexec failure mode on both hops, not just the client-facing one.
- Updated blocker statement:
- effective qubes.ConnectTCP allow path is failing for both:
- k_client -> k_proxy:8771
- k_proxy -> k_server:8780
- App services and card path are currently good; forwarding remains the single active system blocker.
Status (2026-04-25, dom0 policy fix validated):
- The explicit-destination dom0 qubes.ConnectTCP policy fix resolved forwarding on both hops.
- Current verified working chain:
- k_client -> k_proxy:8771
- k_proxy -> k_server:8780
- Current verified prototype behavior:
- session login works from k_client
- session status works
- protected counter flow reaches k_server
- session reuse avoids re-login for repeated counter calls
- logout invalidates the session and subsequent protected access returns 401
- Immediate networking blocker is cleared.
Exit criteria:
- New team member can follow docs end-to-end without path or tooling ambiguity.
Phase 9: Migrate to Phone-Mediated Wireless Validation
Status (2026-05-04): ACTIVE — Architecture v2 adopted; Component 1 + Component 2 CONNECT handler complete
Architecture v2 changes (2026-05-04)
The following changes replace the v1 architecture. Source: chromecard_arkitektur_v2.docx.
Component 2 no longer calls endpoints: Component 2 returns the WebAuthn token to whoever asked (Component 1). It is Component 1 that calls the endpoint with the token. This is the most important behavioral change.
New Component 3 (external client): A compiled binary (Go recommended, Rust alternative) installed on external client computers. Replaces the old browser-proxy-configuration approach. Tasks: find the phone (currently hardcoded IP+port — rendezvous TBD), forward validation requests to Component 1, receive token back, call the protected endpoint directly, return response to browser.
Flow A splits into two paths:
- Phone browser: Browser → Component 1 → Component 2 (returns token) → Component 1 calls endpoint → resource
- External client: Browser → Component 3 → Component 1 → Component 2 (returns token) → Component 1 → Component 3 calls endpoint → resource
Platform note: Android needs no extra infrastructure. iOS requires a push-relay (APNs) for background operation — platform priority is an open decision.
New open decisions: Rendezvous mechanism for Component 3; iOS vs Android priority.
Architectural decision (2026-05-08), token binding model:
- Current choice: per-request authentication. No session is opened. Each request to a gated resource requires a fresh FIDO2 assertion from the card, with the challenge bound to the specific request (URL + method + nonce). The server verifies that the assertion's challenge matches the resource being requested, so a token cannot be replayed for a different resource.
- Consequence: one card interaction per request. This is intentional for now.
- May change to: session model (one card interaction opens a time-limited session for all gated resources). If changed, the token must at minimum be bound to a specific server (audience) to prevent cross-server replay.
- Trigger for revisiting: user experience, i.e. if per-request card interaction proves too slow or disruptive.
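One way to bind a challenge to a specific request (URL + method + nonce) is to hash the three values with explicit length prefixes. The plan does not specify the encoding, so everything below is an assumption for illustration: the function names, the SHA-256 choice, and the length-prefixed layout are hypothetical, not the adopted wire format.

```python
import hashlib
import hmac
import secrets


def request_challenge(method: str, url: str, nonce: bytes) -> bytes:
    """Derive a 32-byte challenge bound to one method, URL, and nonce."""
    digest = hashlib.sha256()
    for part in (method.upper().encode("utf-8"), url.encode("utf-8"), nonce):
        # Length prefixes keep ("GE", "Thttps://...") distinct from ("GET", "https://...").
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def challenge_matches(presented: bytes, method: str, url: str, nonce: bytes) -> bool:
    """Server-side check that an assertion's challenge fits the requested resource."""
    return hmac.compare_digest(presented, request_challenge(method, url, nonce))


if __name__ == "__main__":
    nonce = secrets.token_bytes(16)  # server-issued, single use
    challenge = request_challenge("GET", "https://example.test/resource/counter", nonce)
    print(challenge_matches(challenge, "GET", "https://example.test/resource/counter", nonce))
    print(challenge_matches(challenge, "GET", "https://example.test/other", nonce))
```

Replay protection would additionally require the server to remember which nonces it has issued and to reject a nonce after its first use; that bookkeeping is left out of this sketch.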
Target architecture (v2)
Four physical devices: optional client computer, phone, chromecard, server.
Phone components:
- Component 1 — Proxy + gating filter: Receives requests from phone browser and from external clients via Component 3. Per-request: gated host → forward to Component 2, receive WebAuthn token back, call endpoint with token (TLS); non-gated → forward directly to internet on port 80 (no TLS, bypasses auth entirely).
- Component 2 — WebAuthn client + URL recognition: Always returns token to caller, never calls endpoints itself. Detects registration URL → admin registration flow (admin fingerprint); other gated URLs → FIDO2 assertion flow (user fingerprint → token returned to Component 1).
- Registration page: Local web app on phone; admin fingerprint access control enforced by card.
- Component 3 (external client): Compiled binary, finds phone, relays auth through Component 1, calls endpoint with received token.
Three flows:
- Flow A (phone browser): Browser → Comp 1 → Comp 2 → card → token → Comp 1 → endpoint → resource
- Flow A (external client): Browser → Comp 3 → Comp 1 → Comp 2 → card → token → Comp 1 → Comp 3 → endpoint → resource
- Flow B: Browser → Comp 1 → Comp 2 (registration URL) → card (admin biometric) → enroll/delete user
- Flow C: Non-gated host → Comp 1 → internet port 80 (no TLS, no card)
Open decisions: PIN on card; user DB on-card vs. external; network-level access control on registration page; Component 3 rendezvous mechanism; iOS vs Android priority.
Development chain (Qubes): k_client browser → k_phone (Flutter Android) → USB HID → ChromeCard → k_server
The k_phone Flutter app replaces k_proxy entirely. It presents the same HTTP API as k_proxy_app.py
so k_client_portal.py and the browser portal work without changes.
Development environment: Mac (not Qubes). Android emulator is incompatible with Xen/Qubes. All
k_phone development and testing runs on the Mac with the Android emulator and card_emulator_bridge.py.
Work completed (2026-04-29)
- Flutter project scaffolded at k_phone/ (no flutter create; fully hand-written)
- 10+ Android build issues resolved (AGP, Gradle, Kotlin, desugaring, notification channel, foreground service type)
- k_phone/lib/ctaphid_channel.dart: full CTAPHID framing + USB/emulator dual-transport
- Fixed: persistent socket subscription (single-subscription stream cannot use await for ... break per packet)
- Fixed: _emulatorSocketOpen flag prevents dead-socket writes from raising StateError
- Fixed: emulator round-trip sends all request packets before reading (no per-packet blocking)
- k_phone/lib/proxy_service.dart: full HTTP proxy; all endpoints implemented, error handling hardened
- Fixed: card-error try-catch separated from DB StateError catch (was masking socket errors as "user already enrolled")
- autoStart: true for emulator testing; revert to false for production builds
- k_phone/lib/enrollment_db.dart: enrollment model + JSON persistence via path_provider
- k_phone/lib/fido2_ops.dart: CTAP2 makeCredential, getAssertion, ECDSA-P256 assertion verification
- Fixed: CTAP2 command prefix bytes (0x01/0x02) prepended to CBOR payload per CTAP2-over-CTAPHID spec
- k_phone/lib/session_manager.dart: in-memory bearer token sessions; hasAnyActiveSession() added for gated-proxy forwarding (personal-device model: any live session authorises gated traffic)
- k_phone/lib/k_server_client.dart: HTTP forwarder to k_server
- k_phone/android/app/src/main/kotlin/.../MainActivity.kt: USB HID Kotlin platform channel
- tests/card_emulator_bridge.py: asyncio CTAPHID TCP bridge wrapping CardEmulator for emulator dev
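For reference, the CTAPHID framing and CTAP2 command prefix mentioned in the list above follow this general shape. The sketch is in Python rather than Dart and is not taken from ctaphid_channel.dart; the function names are hypothetical, and it assumes the common 64-byte HID report size.

```python
from typing import List

HID_REPORT_SIZE = 64               # assumed report size
CTAPHID_CBOR = 0x10                # CTAPHID command carrying a CTAP2 request
INIT_DATA = HID_REPORT_SIZE - 7    # CID(4) + CMD(1) + BCNT(2) header
CONT_DATA = HID_REPORT_SIZE - 5    # CID(4) + SEQ(1) header
MAX_PAYLOAD = INIT_DATA + 128 * CONT_DATA  # sequence numbers 0..127


def ctap2_request(command: int, cbor_body: bytes) -> bytes:
    """Prefix the CBOR body with the CTAP2 command byte (0x01 makeCredential, 0x02 getAssertion)."""
    return bytes([command]) + cbor_body


def frame_ctaphid(cid: bytes, cmd: int, payload: bytes) -> List[bytes]:
    """Split a payload into one initialization packet plus continuation packets."""
    if len(cid) != 4:
        raise ValueError("channel id must be 4 bytes")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for CTAPHID")
    header = cid + bytes([0x80 | cmd]) + len(payload).to_bytes(2, "big")
    packets = [(header + payload[:INIT_DATA]).ljust(HID_REPORT_SIZE, b"\x00")]
    rest = payload[INIT_DATA:]
    seq = 0
    while rest:
        packet = cid + bytes([seq]) + rest[:CONT_DATA]
        packets.append(packet.ljust(HID_REPORT_SIZE, b"\x00"))
        rest = rest[CONT_DATA:]
        seq += 1
    return packets
```

Building the whole packet list up front fits the "send all request packets before reading" fix noted above, since the caller can write every packet and only then wait for the response.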
Work completed (2026-05-02)
- k_phone/lib/filter_proxy.dart: Component 1 implemented: HTTP proxy with gating filter
- Plain HTTP to gated host: rewritten to relative path and forwarded to Component 2
- HTTPS CONNECT to gated host: CONNECT request relayed to Component 2; tunnel opened on 200, denied on 4xx
- All other traffic forwarded directly to target host
- Gated hosts file: gated_hosts.txt in app documents directory (one host or host:port per line)
- Default seeded with httpbin.org on first run
- k_phone/test/filter_proxy_test.dart: full test suite for Component 1 (gated matching, HTTP routing, CONNECT routing, edge cases)
- k_phone/test/enrollment_test.dart: full test suite for EnrollmentDb (register, list, delete, persistence, update)
Work completed (2026-05-02, session 2)
- `k_phone/lib/proxy_service.dart`: `_handleConnect` added to `_ProxyServer`
  - Dispatched from `_handleRequest` for `CONNECT` method
  - Checks `_sessions.hasAnyActiveSession()`; returns 407 if no active session
  - Extracts upstream host:port from `Host` header
  - Opens TCP socket to upstream target (the real external server: httpbin.org, etc.)
  - Detaches the HTTP socket (`detachSocket(writeHeaders: false)`) and writes `200 Connection Established` manually
  - Pipes bytes bidirectionally: client ↔ upstream
  - k_server is not involved in CONNECT tunnels; Component 2 connects directly to the real target
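
The CONNECT decision described above (gate on an active session, then resolve the upstream target) can be sketched in Python. This is an illustration only, not the Dart `_handleConnect`; the helper names `parse_connect_target` and `connect_status_line` are hypothetical, and the default port of 443 is an assumption.

```python
def parse_connect_target(authority: str, default_port: int = 443) -> tuple[str, int]:
    """Split 'host[:port]' or '[ipv6]:port' into (host, port)."""
    authority = authority.strip()
    if authority.startswith("["):
        # Bracketed IPv6 literal, optionally followed by ':port'.
        host, _, rest = authority[1:].partition("]")
        port = rest[1:] if rest.startswith(":") else ""
    else:
        host, _, port = authority.partition(":")
    if not host:
        raise ValueError(f"no host in {authority!r}")
    return host, int(port) if port else default_port


def connect_status_line(has_active_session: bool) -> bytes:
    """Response written to the client before any bytes are tunnelled."""
    if not has_active_session:
        return b"HTTP/1.1 407 Proxy Authentication Required\r\n\r\n"
    return b"HTTP/1.1 200 Connection Established\r\n\r\n"
```

Only after the 200 line is written would the proxy start piping bytes in both directions. A production 407 response would normally also carry a `Proxy-Authenticate` header, which this sketch omits.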
Verified on emulator (2026-04-29)
POST /enroll/register → makeCredential via bridge → has_credential: true ✓
POST /session/login → getAssertion + ECDSA verify → auth_mode: fido2_assertion ✓
POST /session/status → 299 s remaining ✓
POST /session/logout → invalidated: true ✓
POST /resource/counter → internal error (k_server not running locally — expected)
POST /resource/counter (after logout) → 401 invalid or expired session ✓
Bridge log confirmed:
CTAP2 cmd=0x01 body=180 bytes → makeCredential OK auth_data=164 bytes
CTAP2 cmd=0x02 body=113 bytes → getAssertion OK auth_data=37 bytes sig=71 bytes
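
The session behaviour exercised in this run (a TTL counting down from 300 s, and a token rejected after logout) can be sketched as a small in-memory store. This is a hypothetical Python illustration, not the Dart `session_manager.dart`. The class and method names are assumptions, and it reads "32-byte hex" as 32 random bytes encoded as 64 hex characters.

```python
import secrets
import time

TTL_SECONDS = 300  # session TTL used by the plan


class SessionStore:
    """In-memory bearer-token sessions with a fixed TTL (hypothetical sketch)."""

    def __init__(self, ttl: float = TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so tests can control time
        self._sessions: dict[str, tuple[str, float]] = {}  # token -> (user, expiry)

    def login(self, username: str) -> str:
        token = secrets.token_hex(32)  # 32 random bytes, hex-encoded
        self._sessions[token] = (username, self._clock() + self._ttl)
        return token

    def remaining(self, token: str) -> int:
        """Whole seconds left, or 0 if the token is unknown or expired."""
        entry = self._sessions.get(token)
        if entry is None:
            return 0
        left = entry[1] - self._clock()
        if left <= 0:
            del self._sessions[token]  # drop expired entries lazily
            return 0
        return int(left)

    def logout(self, token: str) -> bool:
        """Invalidate a token; returns True if it existed."""
        return self._sessions.pop(token, None) is not None

    def has_any_active_session(self) -> bool:
        now = self._clock()
        return any(expiry > now for _, expiry in self._sessions.values())
```

The sketch is not thread-safe; a concurrent server would need a lock around `_sessions`.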
Work completed (2026-05-05, v2 architecture refactor)
k_phone (Dart):
- `filter_proxy_test.dart`: rewritten for v2 semantics; gated HTTP now hits a mock endpoint with Bearer token, not Component 2 directly. 24/24 tests pass.
- `filter_proxy.dart`: extracted `_writeProxyHeaders` and `_forwardHttpRequest` helpers to eliminate ~30 lines of duplication between `_handleGatedHttp` and `_handleDirectHttp`; simplified `_handleDirectHttp` signature (redundant `host`/`port` params removed).
- `session_manager.dart`: added `static const int ttlSeconds = 300` (public); `_ttl` now references it.
- `portal_html.dart` (new): extracted 400-line HTML blobs (`kPortalHtml`, `kEnrollHtml`, `kPortalHtmlBytes`, `kEnrollHtmlBytes`) from `proxy_service.dart`.
- `proxy_service.dart`: imports `portal_html.dart`; removed `_kSessionTtlSeconds` constant (replaced with `SessionManager.ttlSeconds`); merged `_serveHtml`/`_serveEnrollHtml` into `_serveHtmlBytes(req, bytes)`; extracted `_parseUsername` and `_parseUsernameAndDisplay` helpers, eliminating repeated validation boilerplate; removed dead `_loadTlsContext` stub; simplified `start()` TLS branch. File: 872 → 455 lines.
- `k_server_client.dart`: deleted (dead code, no longer imported anywhere).
component3 (Go):
- `gated.go`: `IsGated(host, port string)` replaces `IsGated(host string)`. The old version silently missed `host:port` entries in gated_hosts.txt; the new one checks both the bare hostname and `host:port`.
- `proxy.go`: `handleHTTP` extracts `port` from the URL (defaults to `"80"`) and passes it to `IsGated`; `handleConnect` passes `portStr` to `IsGated`.
- `phone.go`: added `getToken()` calling `/auth/get-token`, which avoids FIDO2 card interaction if the phone already has an active session. `EnsureSession()` tries `getToken()` first and falls back to `login()`. Fixed `login()` JSON field: `expires_in` → `ttl_seconds` (actual server field name).
- `go build ./...` passes.
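
The two-way match that `IsGated(host, port)` now performs can be shown with a short Python sketch. This is an illustration, not the Go code in `gated.go`. The names `load_gated_hosts` and `is_gated` are hypothetical, and the case-insensitive comparison and `#` comment handling are assumptions.

```python
def load_gated_hosts(text: str) -> set[str]:
    """Parse gated_hosts.txt content: one 'host' or 'host:port' per line."""
    entries = set()
    for line in text.splitlines():
        line = line.strip().lower()
        # Blank lines are skipped; treating '#' lines as comments is an assumption.
        if line and not line.startswith("#"):
            entries.add(line)
    return entries


def is_gated(entries: set[str], host: str, port: str) -> bool:
    """A target is gated if either the bare host or host:port is listed."""
    host = host.lower()
    return host in entries or f"{host}:{port}" in entries
```

With this shape, a bare `httpbin.org` entry gates every port on that host, while `example.com:8443` gates only that port. The port-specific case is the one a host-only lookup would miss.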
Parallel-change note: Component 1 and Component 3 share the same proxy logic
Component 3 (component3/) and Component 1 (k_phone/lib/filter_proxy.dart) implement the same core behaviour: intercept HTTP/HTTPS traffic, decide per-request whether the target is gated, fetch a WebAuthn token if so, and call the endpoint directly with the token. Any structural change to one (new gating logic, token-binding changes, CONNECT handling, error semantics) will almost certainly need a corresponding change in the other. Treat them as a pair: when modifying Component 3, check Component 1 for the same fix, and vice versa.
Work completed (2026-05-08, per-request token binding)
- `fido2_ops.dart`: `GetAssertionResult` now includes `clientDataJson`; `getAssertion()` accepts optional `challenge` param for binding.
- `proxy_service.dart`: `_handleAuthGetToken` rewritten. It accepts `{url, method, nonce}`, derives `challenge = SHA256(url|method|nonce)`, calls the card (`getAssertion`), and returns a self-contained assertion bundle as a base64url Bearer token. No session involved.
- `filter_proxy.dart`: `_getAuthToken(uri, method)` generates a secure 16-byte nonce, posts `{url, method, nonce}` to Component 2, and uses the returned assertion token directly.
- `component3/phone.go`: rewritten as stateless `GetTokenForRequest(url, method)`; no session caching, no mutex, no expiry tracking.
- `component3/proxy.go`: `handleHTTP` uses `GetTokenForRequest(r.URL.String(), r.Method)`.
- `component3/main.go`: `--user` flag removed (Component 2 picks the enrolled user).
- `k_server_app.py`: `_verify_assertion_token()` added. It decodes the bundle, verifies path+method match, verifies the challenge claim, and verifies the ECDSA-P256 signature over authData||clientDataHash using the public key extracted from the bundle's credentialData. `_is_proxy_authorized()` accepts either X-Proxy-Token (legacy k_proxy path) or a Bearer assertion token.
- `filter_proxy_test.dart`: 2 new tests for `/auth/get-token` body fields (url, method, nonce). 48/48 tests pass.
- `tests/test_k_server.py`: 17 Python tests for `_verify_assertion_token`: 12 unit tests with synthetic P-256 keys, 5 round-trip tests via `CardEmulator`. All pass.
- 48/48 Flutter tests pass; `go build ./...` clean; `flutter analyze` no issues.
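
The challenge derivation behind per-request binding can be sketched as follows. This is a hedged Python illustration, not the code in `proxy_service.dart` or `k_server_app.py`. It assumes the `|` in `SHA256(url|method|nonce)` is a literal separator between UTF-8 strings and that the nonce travels as a hex string; the function names are hypothetical.

```python
import base64
import hashlib
import secrets


def new_nonce() -> str:
    """16 random bytes, hex-encoded (the encoding is an assumption)."""
    return secrets.token_hex(16)


def derive_challenge(url: str, method: str, nonce: str) -> bytes:
    """challenge = SHA256(url|method|nonce), with '|' taken as a literal separator."""
    material = "|".join((url, method, nonce)).encode("utf-8")
    return hashlib.sha256(material).digest()


def b64url(data: bytes) -> str:
    """Unpadded base64url, the usual encoding for WebAuthn challenges."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
```

A verifier that recomputes `derive_challenge` from the request it actually received, and compares it with the challenge in the signed client data, would reject a token replayed against a different URL or method.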
Work completed (2026-05-08, Playwright acceptance tests for k_phone)
- `tests/k_phone_portal.spec.js` (new): Portal UI acceptance tests (enroll → login → status → list → logout → delete). DOM assertions against `#storedUser`, `#sessionActive`, `#log`. Also tests empty-username and unknown-user error paths.
  - Run: `K_PHONE_BASE_URL=http://phone-ip:8771 npx playwright test tests/k_phone_portal.spec.js`
- `tests/k_phone_proxy.spec.js` (new): Proxy routing acceptance tests. Four serial tests that prove Component 1's routing decisions:
  1. No users → non-gated request passes through (< 500).
  2. No users → gated request rejected with 407 (Component 2 has no enrolled user).
  3. Register user (card fingerprint) → non-gated still passes through.
  4. With enrolled user → gated request succeeds after card assertion (200); response body proves Bearer token was forwarded to target.
  - Uses Node `http` module for proxy requests (absolute URI / proxy protocol).
  - Uses Playwright `page` fixture for enrollment in test 3 (card interaction).
  - `GATED_URL` defaults to `http://httpbin.org/get`; point at `http://k-server-ip:8780/resource/counter` (`GATED_METHOD=POST`) for full chain validation including token signature verification.
  - Run: `K_PHONE_PROXY=http://phone-ip:8888 K_PHONE_BASE_URL=http://phone-ip:8771 npx playwright test tests/k_phone_proxy.spec.js`
Work completed (2026-05-09, Android Playwright tests passing)
- `tests/k_phone_android.spec.js`: all 4 tests pass (16 s total). Two root causes fixed:
  - `launchBrowser()` hangs on Chrome 145. Replaced with: write proxy flag to `/data/local/tmp/chrome-command-line`, force-stop + restart Chrome, `adb forward tcp:9222 localabstract:chrome_devtools_remote`, `chromium.connectOverCDP()`. CDP polling loop handles startup variance (≤ 15 s).
  - Stale emulator socket after bridge restart. `proxy_service.dart`: added `_ensureCardOpen()`, which checks `isCardAttached()` and re-runs `_tryOpenCard()` if the socket is closed. Called before `makeCredential` and `getAssertion` in all three handler paths (enroll, session login, `/auth/get-token`).
- `playwright.config.js`: global timeout reduced from 180 s → 60 s.
- `adb` auto-detected at `~/Library/Android/sdk/platform-tools/adb` without PATH changes.
- `card_emulator_bridge.py` is long-running; no restart needed between test runs.
Next action
- Deploy to a real Android phone with physical ChromeCard via USB
- Verify USB HID path (Kotlin MainActivity.kt platform channel, hidraw node auto-detection)
- Run `phase5_chain_regression.sh` against `k_phone` on Android with k_server running
k_phone API contract (must match k_proxy_app.py exactly)
- `GET /health`
- `POST /enroll/register` `{"username","display_name"}`
- `GET /enroll/status?username=`
- `POST /enroll/update` `{"username","display_name"}`
- `POST /enroll/delete` `{"username"}`
- `GET /enroll/list`
- `POST /session/login` `{"username"}`
- `POST /session/status`
- `POST /session/logout`
- `POST /resource/counter` (forwarded to k_server with X-Proxy-Token)
Key design decisions
- rp_id: `"localhost"`, origin: `"https://localhost"` (matches k_proxy_app.py defaults)
- clientDataHash = SHA256(clientDataJSON), where clientDataJSON = `{"type":"webauthn.create","challenge":"<b64>","origin":"https://localhost","crossOrigin":false}`
- credential_data_b64 stores `AttestedCredentialData` bytes = `aaguid(16) + credIdLen(2) + credId(n) + coseKey`
- Signature verification: ECDSA-SHA256(authData || clientDataHash, P-256 pubKey extracted from COSE key)
- No begin/complete HTTP round-trip — registration and auth are each a single HTTP call (same as Python)
- Sessions: server-side in-memory, TTL 300 s (matching Python default), token = 32-byte hex
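
The byte layouts in the decisions above can be made concrete with a small Python sketch using only the standard library. It is an illustration, not the project's parser: the names `parse_attested_credential_data` and `signed_message` are hypothetical, the COSE key is left as undecoded CBOR bytes, and `credIdLen` is read as a big-endian 16-bit integer as WebAuthn specifies.

```python
import hashlib
import struct
from typing import NamedTuple


class AttestedCredentialData(NamedTuple):
    aaguid: bytes         # 16 bytes
    credential_id: bytes  # credIdLen bytes
    cose_key: bytes       # CBOR-encoded COSE public key, not decoded here


def parse_attested_credential_data(blob: bytes) -> AttestedCredentialData:
    """Split aaguid(16) + credIdLen(2) + credId(n) + coseKey."""
    if len(blob) < 18:
        raise ValueError("too short for aaguid + credIdLen")
    aaguid = blob[:16]
    (cred_id_len,) = struct.unpack(">H", blob[16:18])
    end = 18 + cred_id_len
    if len(blob) < end:
        raise ValueError("credential id truncated")
    return AttestedCredentialData(aaguid, blob[18:end], blob[end:])


def signed_message(auth_data: bytes, client_data_json: bytes) -> bytes:
    """Bytes covered by the signature: authData || SHA256(clientDataJSON)."""
    return auth_data + hashlib.sha256(client_data_json).digest()
```

The output of `signed_message` is what an ECDSA-SHA256 verifier would check against the P-256 public key decoded from `cose_key`; that step needs a CBOR decoder and a crypto library, so it is left out of this sketch.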
start bridge for emulator testing
uv run --python 3.12 --with fido2 --with cbor2 --with cryptography tests/card_emulator_bridge.py
Phase 9 exit criteria
- `k_phone` presents identical HTTP API to `k_proxy_app.py` (so k_client works unchanged)
- Registration and login both complete via `card_emulator_bridge.py` in emulator testing
- With physical ChromeCard plugged into Android phone: full register → login → counter → logout works
- `phase5_chain_regression.sh` passes against `k_phone` on Android
Current Next Step
Status (2026-04-29):
- Phase 9 emulator milestone complete: makeCredential + getAssertion verified via CardEmulator bridge.
- Next blocking step: deploy to real Android phone with ChromeCard over USB.
- k_server is not running in the Mac test environment; counter endpoint will work once running in Qubes.
Phase status (2026-04-29):
- Phase 6.5 (concurrency): deferred. ~10 in-flight ceiling is acceptable.
- Phase 7 (firmware build/flash): blocked on Chrome Roads (card vendor).
- Phase 9 (phone integration): emulator FIDO2 verified; physical phone + USB HID path is next.
Status (2026-04-26, markdown maintenance):
- Re-scanned `Setup.md`, `Workplan.md`, and `PHASE5_RUNBOOK.md` against the current workspace files.
Inputs Expected During This Session
- Exact observed behavior on reconnect attempts (USB/hidraw/probe).
- Whether we should pull server-side code now.
- Any board/firmware variants different from default documentation assumptions.
- Preferred TLS ports, certificate approach, and hostname scheme for `k_client`, `k_proxy`, `k_server`.
- Session TTL and invalidation requirements for cached authenticated access.
- Decision on where user/session authority lives (`k_proxy` vs `k_server` vs split).
- Target concurrency level for validation (parallel clients and parallel requests per client).
- Preferred wireless transport/protocol between `k_proxy` and phone (for future phase).
Session Maintenance Notes (2026-04-24)
- Top-level Markdown review completed for `PHASE5_RUNBOOK.md`, `Setup.md`, and `Workplan.md`.
- Current execution plan remains in sync with the Phase 5 runbook:
  - prototype services at `/home/user/chromecard/k_proxy_app.py` and `/home/user/chromecard/k_server_app.py`
  - run sequence documented in `/home/user/chromecard/PHASE5_RUNBOOK.md`
- No phase ordering or blocker changes were required from this review pass.
- Remote execution support is now active and validated:
  - `ssh` command execution works for `k_client`, `k_proxy`, `k_server`
  - `scp` push to VM home works (validated on `k_proxy`)