Workplan
Last updated: 2026-04-25
This is the execution plan for making ChromeCard FIDO2 development and validation reproducible on this machine.
Constraints
- Treat `/home/user/chromecard/CR_SDK_CK-main` as read-only.
- Keep helper scripts such as `fido2_probe.py` and `webauthn_local_demo.py` at `/home/user/chromecard`.
- Target deployment model is Qubes OS with 3 AppVMs based on `debian-13-xfce`: `k_client`, `k_proxy`, `k_server`.
- Current authenticator link is card -> `k_proxy` (USB), but the architecture must allow migration to wireless phone-mediated validation.
- VM execution path is SSH-first for experiments: `ssh <host> <cmd>` and `scp <file> <host>:~`.
Goals
- Re-establish deterministic host-to-card FIDO2 communication over USB HID/CTAPHID.
- Restore a buildable/flashable firmware workspace for `CR_SDK_CK-main`.
- Turn ad-hoc demos into a repeatable verification flow.
- Stand up chained TLS communication in Qubes: `k_client -> k_proxy -> k_server`.
- Support both login flow (browser in `k_client`) and user enrollment flow (process in `k_client`).
- Minimize repeated card prompts by introducing secure session reuse after successful authentication.
- Implement a protected dummy resource on `k_server` (monotonic counter) for end-to-end validation.
- Ensure `k_proxy` and `k_server` are thread-safe and support concurrent access.
- Prepare `k_proxy` auth path for future transport shift: USB-direct -> wireless phone bridge.
Phase 0: Qubes VM Baseline (Blocking)
- Provision/verify AppVMs.
- Ensure `k_client`, `k_proxy`, `k_server` exist and are based on `debian-13-xfce`.
- Assign functional responsibilities.
- `k_client`: browser client + enrollment process.
- `k_proxy`: USB card access + proxy/auth bridge.
- `k_server`: protected resource/service endpoint.
- Define TLS endpoints and certificates.
- `k_proxy` presents TLS service to `k_client`.
- `k_server` presents TLS service to `k_proxy`.
- Trust roots and cert distribution model documented per VM.
Exit criteria:
- All 3 VMs exist, boot, and have clearly defined service ownership.
Phase 1: Qubes Firewall Policy (Blocking)
- Enforce allowed forward paths only.
- Allow `k_client` outbound TLS only to `k_proxy` service port(s).
- Allow `k_proxy` outbound TLS only to `k_server` service port(s).
- Deny direct `k_client` to `k_server` traffic.
- Validate return path behavior.
- Confirm responses propagate back through established flows.
- Verify with simple probes.
- TLS handshake and HTTP(S) checks from `k_client` to `k_proxy`.
- TLS handshake and HTTP(S) checks from `k_proxy` to `k_server`.
Exit criteria:
- Policy matches intended chain and is test-verified.
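The simple probes above can be scripted so that a timeout (policy/pathing problem) is distinguishable from a refusal (nothing listening or connection rejected). The following is a minimal sketch, not part of the existing helper scripts; the function names `probe_tcp` and `probe_tls` are hypothetical, and the CA file path and hostname passed to `probe_tls` are assumptions to be filled in per VM.

```python
import socket
import ssl


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"


def probe_tls(host: str, port: int, cafile: str, server_hostname: str,
              timeout: float = 3.0) -> str:
    """Complete a verified TLS handshake and return the negotiated protocol version.

    Raises ssl.SSLError (or OSError) if the handshake or certificate check fails.
    cafile and server_hostname are assumptions: they depend on how the certs were issued.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=server_hostname) as tls:
            return tls.version()
```

Running `probe_tcp` first keeps the two failure classes apart: only an `open` result makes a follow-up `probe_tls` call meaningful.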
Status (2026-04-24, remote diagnostics):
- Confirmed active blocker remains Phase 1 network policy/pathing.
- Evidence from live VM probes:
- `k_client (10.137.0.16) -> k_proxy (10.137.0.12:8771)`: TCP timeout.
- `k_proxy (10.137.0.12) -> k_server (10.137.0.13:8780)`: upstream timeout.
- Local service health inside each VM is good, so failure is inter-VM reachability, not local process startup.
Status (2026-04-25, after restart and service recovery):
- Refined blocker: this is currently a qrexec/`qubes.ConnectTCP` refusal problem, not an app-local listener problem.
- Current evidence:
- `k_proxy` local `/health` is up on `127.0.0.1:8771`
- `k_server` local `/health` is up on `127.0.0.1:8780`
- `qrexec-client-vm k_proxy qubes.ConnectTCP+8771` -> `Request refused`
- `qrexec-client-vm k_server qubes.ConnectTCP+8780` -> `Request refused`
- Immediate next action for Phase 1:
- verify and fix the dom0 policy/mechanism that should permit `qubes.ConnectTCP` forwarding for the chain ports
Phase 2: TLS Certificates and Service Endpoints
- Certificate model.
- Create or import CA and issue certs for `k_proxy` and `k_server`.
- Install trust roots in client VM(s) that need validation.
- Service shape.
- `k_server`: HTTPS service exposing protected resource endpoint(s), including a monotonic counter endpoint.
- `k_proxy`: minimal HTTPS API gateway service (full web server framework not required).
- Endpoint contract.
- Define request/response schema between `k_client` and `k_proxy`.
- Define upstream request contract from `k_proxy` to `k_server`.
Exit criteria:
- Mutual TLS trust decisions are documented and tested.
- HTTPS calls succeed on both links with expected cert validation.
Status (2026-04-25):
- Implemented HTTPS listeners in both prototype services.
- Added local CA + service certificate generation in `generate_phase2_certs.py`.
- Verified the working Qubes path is localhost forwarding plus TLS:
- `k_client` local `9771` forwards to `k_proxy:8771`
- `k_proxy` local `9780` forwards to `k_server:8780`
- Verified cert validation on both hops using the generated CA.
- Verified end-to-end HTTPS flow:
- `k_client -> k_proxy` login over TLS
- `k_proxy -> k_server` protected counter call over TLS
- session reuse still works across repeated protected requests
- Phase 2 is now effectively complete for the current prototype shape.
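For reference, certificate validation against a locally generated CA can be expressed on the client side roughly as follows. This is a sketch under assumptions: `make_client_context` is a hypothetical helper, not a function from `generate_phase2_certs.py`, and the CA file location is whatever the cert generation step produced.

```python
import ssl
from typing import Optional


def make_client_context(ca_path: Optional[str] = None) -> ssl.SSLContext:
    """Build a verifying TLS client context.

    When ca_path is given, only that CA bundle is trusted; when it is None,
    the system default trust store is used instead.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_path)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


# Example use through a localhost forwarder (assumes the service certificate
# is valid for the name "localhost"; adjust if the certs use another name):
#
#   import urllib.request
#   ctx = make_client_context("/path/to/ca.pem")   # path is a placeholder
#   urllib.request.urlopen("https://localhost:9771/health", context=ctx)
```

Because the forwarders make both hops appear as `localhost` connections, the name in each service certificate has to match whatever hostname the client actually dials; that mapping is worth recording next to the trust-root notes.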
Phase 2.5: Define State Ownership and Concurrency Model
- State ownership.
- Decide where user/session state is authoritative (`k_proxy`, `k_server`, or split model).
- Define token/session format and validation boundary.
- Concurrency controls.
- Define thread-safe strategy for session store and shared counters.
- Define locking/atomic/update semantics for counter increments and session updates.
- Runtime model.
- Choose service runtime/config that supports simultaneous requests safely.
Exit criteria:
- Architecture clearly documents state authority and race-free update rules.
Next action (2026-04-25):
- Move into Phase 2.5 and make the current prototype decisions explicit:
- authority for session state remains `k_proxy`
- `k_server` remains authority for the protected counter/resource state
- localhost Qubes forwarders are part of the active runtime model for the two TLS hops
- define concurrency assumptions and limits around session store, forwarders, and counter access
Status (2026-04-25):
- Current ownership model is now explicit:
- `k_proxy` is authoritative for session creation, expiry, lookup, and logout
- `k_server` is authoritative for the protected monotonic counter
- `k_client` is a client only; it holds bearer tokens but is not a state authority
- Current validation boundary is explicit:
- `k_proxy` validates bearer tokens against its in-memory session store
- `k_server` trusts only requests that arrive with the configured `X-Proxy-Token`
- `k_server` does not currently validate end-user session tokens directly
- Current concurrency strategy is explicit:
- `k_proxy` uses `ThreadingHTTPServer` plus one lock around the in-memory session map
- `k_server` uses `ThreadingHTTPServer` plus one lock around counter increments
- upstream HTTPS calls from `k_proxy` are made outside the session-store lock
- Current runtime limits are explicit:
- sessions are process-local and disappear on `k_proxy` restart
- counter state is process-local and resets on `k_server` restart
- transport relies on Qubes localhost forwarders `9771` and `9780`
- Phase 2.5 is complete for the current prototype shape.
Phase 3: Recover Basic Device Visibility on k_proxy (Blocking)
- Verify physical + USB enumeration path.
- Check cable/port and confirm device appears in USB listings.
- Confirm `/dev/hidraw*` nodes appear when card is connected.
- Validate Linux permissions.
- Install/update udev rule for ChromeCard HID VID/PID.
- Reload udev and verify non-root read/write access to hidraw node.
- Re-run host probe.
- Run `python3 /home/user/chromecard/fido2_probe.py --list`.
- Run `python3 /home/user/chromecard/fido2_probe.py --json`.
- Record VID/PID/path and CTAP2 `getInfo` output in `Setup.md`.
Exit criteria:
- At least one CTAP HID device is listed.
- `--json` returns valid `ctap2_info`.
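Before running the probe, the hidraw permission check can be done without any FIDO2 library. The sketch below is an assumption-level helper (the name `list_hidraw_nodes` is hypothetical); it lists every hidraw node, not only CTAP HID devices, so it complements rather than replaces `fido2_probe.py --list`.

```python
import glob
import os
from typing import Dict, List


def list_hidraw_nodes(pattern: str = "/dev/hidraw*") -> List[Dict[str, object]]:
    """Return each matching node and whether the current user can read/write it."""
    nodes = []
    for path in sorted(glob.glob(pattern)):
        nodes.append({
            "path": path,
            "read": os.access(path, os.R_OK),
            "write": os.access(path, os.W_OK),
        })
    return nodes


if __name__ == "__main__":
    for node in list_hidraw_nodes():
        print(node)
```

An empty result points at USB attachment or enumeration; a node with `read` or `write` set to `False` for a non-root user points at the udev rule.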
Phase 4: Re-validate Local WebAuthn Demo on k_proxy
- Start local demo server.
- Run `python3 /home/user/chromecard/webauthn_local_demo.py`.
- Confirm URL is `http://localhost:8765`.
- Exercise register/login.
- Register a test user.
- Authenticate with same user.
- Capture errors (if any) and update `Setup.md`.
- Decide next demo hardening step.
- Keep bring-up-only mode, or
- add signature verification for attestation/assertion.
Exit criteria:
- Register and login both complete with card interaction prompts.
Status (2026-04-24):
- Completed in `k_proxy` using `http://localhost:8765`.
- Registration result: `ok=true`, `username=alice`, `credential_count=1`.
- Authentication result: `ok=true`, `username=alice`, `authenticated=true`.
Phase 5: Implement Proxy Auth + Session Reuse
- Authenticate via card once per session window.
- `k_proxy` handles initial auth using connected card.
- On success, create session state for `k_client`.
- Session model.
- Prefer server-side session store or signed session token.
- Include TTL/expiry, rotation, and explicit invalidation/logout path.
- Do not expose card secrets or long-lived auth material to `k_client`.
- Proxying behavior.
- With valid session: `k_proxy` forwards request to `k_server` and returns result.
- Without valid session: require fresh card-backed auth flow.
Exit criteria:
- Repeated authorized requests do not require card interaction until session expiry.
- Expired/invalid sessions are correctly rejected.
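A server-side session store that satisfies these exit criteria could look roughly like the sketch below. It is illustrative only: the class name `SessionStore`, the 300-second default TTL, and the injectable `clock` parameter are assumptions, and token rotation is deliberately left out to keep the example short.

```python
import secrets
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class SessionStore:
    """In-memory, process-local session map with TTL expiry."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._lock = threading.Lock()
        self._sessions: Dict[str, Tuple[str, float]] = {}  # token -> (user, expiry)

    def create(self, username: str) -> str:
        """Issue a new random bearer token after a successful card-backed auth."""
        token = secrets.token_urlsafe(32)
        with self._lock:
            self._purge_expired()
            self._sessions[token] = (username, self._clock() + self._ttl)
        return token

    def lookup(self, token: str) -> Optional[str]:
        """Return the username for a live session, or None if unknown or expired."""
        with self._lock:
            self._purge_expired()
            entry = self._sessions.get(token)
            return entry[0] if entry else None

    def logout(self, token: str) -> bool:
        """Invalidate a session explicitly; True if a session was removed."""
        with self._lock:
            return self._sessions.pop(token, None) is not None

    def _purge_expired(self) -> None:
        # Caller must hold self._lock.
        now = self._clock()
        for tok in [t for t, (_, exp) in self._sessions.items() if exp <= now]:
            del self._sessions[tok]
```

Only the opaque token leaves the proxy, which keeps card-derived material out of `k_client`; a request whose `lookup` returns `None` would fall back to the fresh card-backed auth flow.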
Status (2026-04-24):
- Started with a runnable prototype:
- `/home/user/chromecard/k_proxy_app.py`
- `/home/user/chromecard/k_server_app.py`
- `/home/user/chromecard/PHASE5_RUNBOOK.md`
- Implemented in prototype:
- session create/status/logout endpoints in `k_proxy`
- TTL-based server-side session store with expiry garbage collection
- protected monotonic counter endpoint in `k_server` with thread-safe increments
- proxy forwarding from `k_proxy` to `k_server` using a shared upstream token
- Current auth gate for session creation is card-presence probe (`fido2_probe.py --json`), pending upgrade to full assertion verification path.
Status (2026-04-25):
- Prototype services were re-started successfully after VM restart.
- Current split-VM test shape is:
- `k_proxy` listening on `127.0.0.1:8771`
- `k_server` listening on `127.0.0.1:8780`
- Phase 5 application logic is runnable locally inside each VM, but end-to-end validation is still blocked by Phase 1 qrexec forwarding refusal.
Phase 5.5: Implement Dummy Resource + Access Policy on k_server
- Protected dummy resource.
- Add endpoint returning increasing number.
- Require valid upstream auth/session context from `k_proxy`.
- Optional user/session handling.
- Add minimal user/session checks if `k_server` is chosen as authority (or partial authority).
- Correctness under concurrency.
- Ensure increments are monotonic and race-safe under parallel calls.
Exit criteria:
- Authorized requests obtain consistent increasing values.
- Unauthorized requests are rejected.
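The two exit criteria can be captured in one small unit: check the shared upstream token, then increment under a lock. This is a sketch, not the code in `k_server_app.py`; the class name `ProtectedCounter` and the choice of `401` for a rejected request are assumptions.

```python
import hmac
import threading
from typing import Optional, Tuple


class ProtectedCounter:
    """Monotonic counter that only advances for callers presenting the shared token."""

    def __init__(self, proxy_token: str) -> None:
        self._proxy_token = proxy_token.encode()
        self._lock = threading.Lock()
        self._value = 0

    def handle(self, presented_token: Optional[str]) -> Tuple[int, Optional[int]]:
        """Return (status, value); presented_token is the X-Proxy-Token header value."""
        presented = (presented_token or "").encode()
        # Constant-time comparison avoids leaking token contents through timing.
        if not hmac.compare_digest(presented, self._proxy_token):
            return 401, None
        with self._lock:
            self._value += 1
            return 200, self._value
```

Reading the new value inside the same critical section as the increment is what guarantees that no two callers ever receive the same number.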
Phase 6: Integrate Client Enrollment + Proxy Login Flow
- Enrollment process in `k_client`.
- Start process from `k_client` that captures new-user enrollment intent/data.
- Route enrollment requests to `k_proxy` over TLS.
- Card-mediated login in `k_proxy`.
- `k_proxy` uses connected card for FIDO2/WebAuthn operations.
- `k_proxy` authenticates toward `k_server` over TLS.
- Browser flow in `k_client`.
- Browser traffic goes only to `k_proxy`.
- Validate end-to-end login to `k_server` resource through proxy chain.
Exit criteria:
- Enrollment and login both function end-to-end via `k_client -> k_proxy -> k_server`.
Phase 6.5: Concurrency and Multi-Client Test Setup
- Single-VM concurrency tests.
- Generate parallel request bursts from `k_client` to `k_proxy`.
- Verify response integrity, session reuse behavior, and error rates.
- Multi-client tests.
- Run requests from multiple `k_client` instances (or equivalent parallel clients) concurrently.
- Verify isolation between users/sessions.
- Acceptance checks.
- No race-related crashes/corruption in `k_proxy` or `k_server`.
- Counter/resource behavior remains correct under load.
- Session reuse reduces card prompts while preserving authorization checks.
Exit criteria:
- Test results demonstrate stable concurrent operation with documented limits.
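A burst test of this kind can be driven by a small harness that is independent of the transport. In the sketch below, `run_burst` is a hypothetical helper and `fetch` stands in for whatever callable performs one protected counter request and returns the integer it received; the real HTTP call is intentionally not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def run_burst(fetch: Callable[[], int], requests: int, workers: int) -> Dict[str, object]:
    """Call fetch() `requests` times across `workers` threads and summarize outcomes."""

    def one_call(_: int) -> Tuple[str, object]:
        try:
            return "ok", fetch()
        except Exception as exc:  # record the failure instead of aborting the burst
            return "error", repr(exc)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        outcomes: List[Tuple[str, object]] = list(pool.map(one_call, range(requests)))

    values = [v for kind, v in outcomes if kind == "ok"]
    return {
        "ok": len(values),
        "errors": len(outcomes) - len(values),
        "duplicates": len(values) - len(set(values)),  # any duplicate means a race
        "max": max(values) if values else None,
    }
```

For a correct counter that started at zero, a clean run reports zero errors, zero duplicates, and a maximum equal to the number of requests; recording the `requests` and `workers` values used gives the documented limits the exit criteria ask for.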
Phase 7: Restore Firmware Build/Flash Path
- Validate SDK tree completeness.
- Confirm presence of `mvp`, `setup`, `components`, `samples` under `CR_SDK_CK-main`.
- If missing, obtain full repository/checkpoint and document source.
- Install/enable build tools.
- Ensure `west` and `nrfjprog` are available in shell.
- Confirm target board/toolchain match (`nrf7002dk/nrf5340/cpuapp`, NCS `v2.9.2` baseline in docs).
- Run baseline build+flash.
- From `CR_SDK_CK-main`, run `./scripts/build_flash_mvp.sh`.
- If flashing fails, run documented recovery and retry.
Exit criteria:
- Successful `west build` and `west flash`.
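The tree-completeness and tool-availability checks in this phase are easy to run as a read-only precheck before any build attempt. The helper below is a sketch; `precheck` is a hypothetical name, and it only inspects `PATH` and directory presence, so it never writes to the SDK tree.

```python
import os
import shutil
from typing import Dict, List, Sequence

REQUIRED_TOOLS = ("west", "nrfjprog")
REQUIRED_DIRS = ("mvp", "setup", "components", "samples")


def precheck(sdk_root: str,
             tools: Sequence[str] = REQUIRED_TOOLS,
             dirs: Sequence[str] = REQUIRED_DIRS) -> Dict[str, List[str]]:
    """Report tools missing from PATH and expected directories absent under sdk_root."""
    return {
        "missing_tools": [t for t in tools if shutil.which(t) is None],
        "missing_dirs": [d for d in dirs
                         if not os.path.isdir(os.path.join(sdk_root, d))],
    }


if __name__ == "__main__":
    print(precheck("/home/user/chromecard/CR_SDK_CK-main"))
```

Two empty lists mean the build script has its basic prerequisites; this does not verify the board target or the NCS version.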
Phase 8: Consolidate Documentation and Paths
- Remove path drift between docs and actual files.
- Keep `fido2_probe.py` and `webauthn_local_demo.py` at workspace root.
- Ensure docs never instruct placing helper scripts under `CR_SDK_CK-main`.
- Update references consistently in all docs.
- Keep `Setup.md` current.
- After each significant change, update status snapshot and outcomes.
- Add minimal reproducibility checklist.
- One command list for probe + demo + build/flash prechecks.
- Maintain Markdown execution records continuously.
- `Setup.md` and `Workplan.md` are the canonical living docs for this workspace.
- Re-scan relevant `.md` files before each new execution cycle and reconcile drift.
- Record date-stamped session notes when priorities or blockers change.
Status (2026-04-24, markdown maintenance):
- Re-scanned the active workspace Markdown set and the main source-tree reference docs.
- No workplan phase change was required from this pass.
- Ongoing documentation watch item remains path drift in `CR_SDK_CK-main/README_HOST.md`, which still uses historical `./scripts/...` helper locations instead of workspace-root helper paths.
- Operational note: the markdown scan path now runs cleanly after policy adjustment when invoked without a login shell.
Status (2026-04-24, chain probe retry):
- Phase 1 remains blocked, but the failure point is now narrowed further:
- current refusal occurs at Qubes `qubes.ConnectTCP` policy/service evaluation for ports `22`, `8770`, and `8780`
- this happens before any end-to-end app-level request can be retried
- Practical implication:
- do not spend time on `k_proxy_app.py`/`k_server_app.py` request handling until qrexec forwarding is permitting the intended hops again
- next recovery action is to fix/activate the relevant Qubes `qubes.ConnectTCP` policy and then re-run the qrexec bridge checks before testing HTTP flow
Status (2026-04-25, post-restart probe):
- Corrected the client-facing proxy port reference to `8771`.
- SSH access to `k_proxy` and card visibility recovered after VM restart.
- New immediate blockers are:
- `k_proxy` service not listening on `127.0.0.1:8771`
- `k_server` service not listening on `127.0.0.1:8780`
- qrexec forwarding for `8771` and `8780` still returns `Request refused`
- Next retry should start services first, then re-test qrexec forwarding and only then attempt end-to-end client flow.
Status (2026-04-25, service restart):
- Local VM services are running again on the intended loopback ports:
- `k_server`: `127.0.0.1:8780`
- `k_proxy`: `127.0.0.1:8771`
- Phase 1 remains blocked specifically by qrexec policy/forwarding refusal on those ports.
- Next action is no longer app startup; it is fixing the `qubes.ConnectTCP` allow path for `8771` and `8780`.
Status (2026-04-25, in-VM forwarding test):
- Verified that using `qvm-connect-tcp` inside the source VMs still does not complete the client->proxy hop:
- bind succeeds locally, but first real connection gets `Request refused`
- Independent app-layer blocker also found in `k_proxy`:
- `python-fido2` is missing there, so local `/session/login` currently fails before card auth can succeed
- Current ordered blockers:
- first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
- second: install `python-fido2` in `k_proxy`
- third: re-test end-to-end login and then proxy->server counter flow
Status (2026-04-25, after python3-fido2 install):
- `python3-fido2` blocker in `k_proxy` is resolved.
- Updated ordered blockers:
- first: effective Qubes/qrexec allow path for `k_client -> k_proxy:8771`
- second: restore CTAP HID device visibility/access in `k_proxy` (`No CTAP HID devices found`)
- third: re-test end-to-end login and then proxy->server counter flow
Status (2026-04-25, card reattached):
- CTAP HID visibility/access in `k_proxy` is restored.
- Local proxy login is working again with the attached card.
- The only currently confirmed blocker for the end-to-end path is the `k_client -> k_proxy:8771` qrexec/`qvm-connect-tcp` refusal.
Status (2026-04-25, clean forward retest):
- The retest shows the same qrexec failure mode on both hops, not just the client-facing one.
- Updated blocker statement:
- effective `qubes.ConnectTCP` allow path is failing for both `k_client -> k_proxy:8771` and `k_proxy -> k_server:8780`
- App services and card path are currently good; forwarding remains the single active system blocker.
Status (2026-04-25, dom0 policy fix validated):
- The explicit-destination dom0 `qubes.ConnectTCP` policy fix resolved forwarding on both hops.
- Current verified working chain:
- `k_client -> k_proxy:8771`
- `k_proxy -> k_server:8780`
- Current verified prototype behavior:
- session login works from `k_client`
- session status works
- protected counter flow reaches `k_server`
- session reuse avoids re-login for repeated counter calls
- logout invalidates the session and subsequent protected access returns `401`
- Immediate networking blocker is cleared.
Exit criteria:
- New team member can follow docs end-to-end without path or tooling ambiguity.
Phase 9: Migrate to Phone-Mediated Wireless Validation (Future)
- Auth transport abstraction in `k_proxy`.
- Introduce/keep a transport interface for authenticator operations.
- Implement at least two backends:
- USB-direct backend (current).
- Phone-wireless backend (future).
- Wireless phone integration.
- Define protocol between `k_proxy` and phone service.
- Define secure pairing/authentication and message integrity for wireless link.
- Add timeout/retry behavior and offline handling.
- Functional equivalence tests.
- Verify login/enrollment behavior is unchanged at API level for `k_client`.
- Verify session reuse still works and card prompts are not increased unexpectedly.
Exit criteria:
- `k_proxy` can validate via wireless phone path with no client-facing API changes.
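One way to keep the client-facing API unchanged is to put a narrow interface between the session logic and the authenticator path. The sketch below is purely illustrative: `AuthenticatorTransport`, `is_available`, `get_assertion`, and `pick_transport` are hypothetical names, and the real USB-direct and phone-wireless backends would be subclasses that are not shown here.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class AuthenticatorTransport(ABC):
    """Hypothetical interface that both backends would implement."""

    name = "abstract"

    @abstractmethod
    def is_available(self) -> bool:
        """True when this backend can currently reach an authenticator."""

    @abstractmethod
    def get_assertion(self, challenge: bytes) -> bytes:
        """Return the authenticator's response to the given challenge."""


def pick_transport(backends: Iterable[AuthenticatorTransport]) -> AuthenticatorTransport:
    """Return the first available backend, in caller-supplied preference order."""
    for backend in backends:
        if backend.is_available():
            return backend
    raise RuntimeError("no authenticator transport available")
```

With this shape, the login handler would call `pick_transport([...]).get_assertion(challenge)` and never branch on which link is in use, which is also what makes the functional equivalence tests above straightforward to write against fake backends.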
Inputs Expected During This Session
- Exact observed behavior on reconnect attempts (USB/hidraw/probe).
- Whether we should pull server-side code now.
- Any board/firmware variants different from default documentation assumptions.
- Preferred TLS ports, certificate approach, and hostname scheme for `k_client`, `k_proxy`, `k_server`.
- Session TTL and invalidation requirements for cached authenticated access.
- Decision on where user/session authority lives (`k_proxy` vs `k_server` vs split).
- Target concurrency level for validation (parallel clients and parallel requests per client).
- Preferred wireless transport/protocol between `k_proxy` and phone (for future phase).
Session Maintenance Notes (2026-04-24)
- Top-level Markdown review completed for `PHASE5_RUNBOOK.md`, `Setup.md`, and `Workplan.md`.
- Current execution plan remains in sync with the Phase 5 runbook:
- prototype services at `/home/user/chromecard/k_proxy_app.py` and `/home/user/chromecard/k_server_app.py`
- run sequence documented in `/home/user/chromecard/PHASE5_RUNBOOK.md`
- No phase ordering or blocker changes were required from this review pass.
- Remote execution support is now active and validated:
- `ssh` command execution works for `k_client`, `k_proxy`, `k_server`
- `scp` push to VM home works (validated on `k_proxy`)