Tonight the project formerly known as PropCo became Honest CAM — a Community Association Management platform built on top of Xero — and went from a blank docs folder to a working portal with billing, OCR, a study quiz, and 384 passing tests. Here's how six PRs stacked up between midnight and 3am.
Writing the plan before writing the code
Before touching any implementation I dropped nine planning documents into docs/. Each feature — OCR, Xero sync, compliance tracking, the dashboard, search — got its own spec with three sections: Why, How, and Open Questions. When an AI agent picks up the implementation later, it's grading itself against written prose instead of arguing architecture in PR comments. And when future-me asks "why validation-based confidence instead of asking the model for a score?", the answer is right there under a heading called exactly that.
AI-powered OCR: reading 941 PDFs without writing OCR code
Phase 5 of the pipeline reads reorganized PDFs and pulls structured financial fields into JSON sidecars. The big unlock is Anthropic's native PDF support — the API accepts raw PDF bytes as a document content block, so there's zero pypdf or Tesseract code to maintain. Per-category prompts tell the model what to extract (line items from bank statements, amounts from invoices, dates from compliance notices), and a validation layer checks the results with deterministic rules instead of trusting the model's self-reported confidence.
A two-tier model ladder keeps costs sane: every document hits Haiku first. If validation fails (missing fields, amounts that don't reconcile), it retries on Sonnet. On a 941-doc batch, only the documents that fail validation pay Sonnet prices, and the validation layer is what makes the ladder safe: a cheap extraction is accepted only when it passes the same deterministic rules.
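Here's a minimal sketch of how a ladder like that might be wired up. It is not the project's actual code: the tier labels, the field names (`vendor`, `date`, `total_cents`, `line_items`), and the `extract` callable are all assumptions for illustration, and the model call itself is left as a function you pass in.

```python
from typing import Callable

# Hypothetical tier labels, cheapest first; real model IDs would go here.
LADDER = ["haiku", "sonnet"]

REQUIRED_FIELDS = ("vendor", "date", "total_cents", "line_items")


def validate_invoice(fields: dict) -> list[str]:
    """Deterministic checks; an empty list means the extraction passes."""
    errors = [
        f"missing field: {key}"
        for key in REQUIRED_FIELDS
        if fields.get(key) in (None, "", [])
    ]
    if not errors:
        # Amounts are integer cents so the reconciliation check is exact.
        line_sum = sum(item["amount_cents"] for item in fields["line_items"])
        if line_sum != fields["total_cents"]:
            errors.append(
                f"line items sum to {line_sum}, total is {fields['total_cents']}"
            )
    return errors


def extract_with_ladder(
    pdf_bytes: bytes,
    extract: Callable[[str, bytes], dict],
    validate: Callable[[dict], list[str]] = validate_invoice,
    ladder: list[str] = LADDER,
) -> dict:
    """Try each tier in order and stop at the first result that validates."""
    fields: dict = {}
    errors: list[str] = []
    for model in ladder:
        fields = extract(model, pdf_bytes)  # assumed wrapper around the API call
        errors = validate(fields)
        if not errors:
            break
    # If every tier failed, the last attempt is returned with its errors
    # so a human can review it.
    return {"model": model, "fields": fields, "errors": errors}
```

The point of the design is that the model never grades itself: the same rules decide whether the cheap answer is good enough and whether the expensive answer still needs a human.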
The MVP: Xero sync, billing, reporting, budgets
This is the monster PR — +11,183 lines, 251 tests. Vendor alias resolution collapses 149 name variations across 58 vendors into canonical IDs via an O(1) lookup map. Category-based account routing maps 44 OCR categories to Xero chart-of-accounts codes. OAuth 2.0 with PKCE handles auth, and every sync is idempotent via SHA-256 content hashing.
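To make the alias map and the content-hash idempotency concrete, here is a rough sketch under stated assumptions. The normalization rules, the vendor IDs, the record shape, and the `already_synced` set are hypothetical stand-ins; the real implementation may differ in all of them.

```python
import hashlib
import json


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def build_alias_map(vendors: dict[str, list[str]]) -> dict[str, str]:
    """Flatten {canonical_id: [aliases]} into one dict for constant-time lookup."""
    alias_map: dict[str, str] = {}
    for vendor_id, aliases in vendors.items():
        for alias in aliases:
            alias_map[normalize(alias)] = vendor_id
    return alias_map


def content_hash(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def records_to_push(records: list[dict], already_synced: set[str]) -> list[dict]:
    """Return only records whose hash has not been seen; re-running is a no-op."""
    fresh = []
    for record in records:
        digest = content_hash(record)
        if digest in already_synced:
            continue
        already_synced.add(digest)
        fresh.append(record)
    return fresh
```

In a real sync the set of seen hashes would have to be persisted between runs (a table or a sidecar file); the in-memory set here only shows the check.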
On the billing side: monthly assessments for 12 units, late fees with configurable grace periods, per-unit ledgers with FIFO payment application (oldest charge first, per Florida statute), and 30/60/90-day aging. Reporting covers budget-vs-actual variance, operating and reserve fund balances per Florida Statutes Chapter 718, 1099-NEC flagging for vendors paid over $600, and board meeting packet generation. Everything has a --dry-run mode because I want a human to review proposed Xero writes before anything commits to a real ledger.
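A small sketch of FIFO payment application and aging, to show the mechanics. The `Charge` class, the bucket names, and the bucket boundaries (under 30 days is current, then 30, 60, and 90+) are my assumptions for illustration. It orders charges by due date only and does not model any priority among interest, late fees, and costs that the governing statute may require.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Charge:
    due: date
    amount_cents: int
    paid_cents: int = 0

    @property
    def open_cents(self) -> int:
        return self.amount_cents - self.paid_cents


def apply_payment_fifo(charges: list[Charge], payment_cents: int) -> int:
    """Apply a payment to the oldest open charges first.

    Mutates the charges in place and returns any unapplied remainder.
    """
    remaining = payment_cents
    for charge in sorted(charges, key=lambda c: c.due):
        if remaining == 0:
            break
        applied = min(charge.open_cents, remaining)
        charge.paid_cents += applied
        remaining -= applied
    return remaining


def aging_buckets(charges: list[Charge], as_of: date) -> dict[str, int]:
    """Sum open balances by days past due (assumed bucket boundaries)."""
    buckets = {"current": 0, "30": 0, "60": 0, "90+": 0}
    for charge in charges:
        if charge.open_cents == 0:
            continue
        days = (as_of - charge.due).days
        if days < 30:
            key = "current"
        elif days < 60:
            key = "30"
        elif days < 90:
            key = "60"
        else:
            key = "90+"
        buckets[key] += charge.open_cents
    return buckets
```

Keeping amounts in integer cents avoids floating-point drift when a payment is split across several charges.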
Owner portal: FastAPI + HTMX, no passwords
Unit owners get a self-service portal — current balance, aging breakdown, recent transactions, document library, and Stripe Checkout for payments. The front-end is pure HTMX and Jinja2 templates. Zero custom JavaScript. The dashboard auto-refreshes every 60 seconds via an HTMX swap.
Auth is magic links: enter your email, get a signed token (15-minute expiry), click it, get a session cookie. No passwords to store, no password resets to build. Session cookies are stateless, so any app instance can validate a request without a shared session store, which keeps horizontal scaling simple.
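For a sense of what a signed, expiring token involves, here is a standard-library sketch using HMAC-SHA256. It is an illustration, not the portal's code: the token layout, the `SECRET` placeholder, and the function names are assumptions, and a production app would more likely lean on a vetted signing library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"replace-me"  # hypothetical; load a real key from configuration
TOKEN_TTL_SECONDS = 15 * 60  # the 15-minute expiry


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str, secret: bytes) -> str:
    return _b64(hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).digest())


def make_token(email: str, now: Optional[float] = None, secret: bytes = SECRET) -> str:
    """Build 'payload.signature' where the payload carries email and expiry."""
    issued = int(time.time() if now is None else now)
    claims = {"email": email, "exp": issued + TOKEN_TTL_SECONDS}
    payload = _b64(json.dumps(claims).encode("utf-8"))
    return f"{payload}.{_sign(payload, secret)}"


def verify_token(
    token: str, now: Optional[float] = None, secret: bytes = SECRET
) -> Optional[str]:
    """Return the email if the signature is valid and unexpired, else None."""
    try:
        payload, signature = token.split(".")
    except ValueError:
        return None
    expected = _sign(payload, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(_unb64(payload))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        return None
    return claims["email"]
```

One thing this sketch does not do is make the link single-use; a token stays valid until it expires, so enforcing one click per link would need a small server-side record of redeemed tokens.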
Fixing wrong exam answers (and expanding to 105 questions)
The study module had six factually wrong answers — most from 2024-2025 legislative changes (SB 154, HB 913) that updated thresholds the question bank still had at pre-reform numbers. Coastal inspection age, reserve waiver vote threshold, pre-license hours — all wrong. If somebody studied against those and walked into the real exam confident, that's a genuine failure mode.
The bank grew from 38 to 105 questions, weighted toward topics the exam actually tests hard: assessment collection, estoppel fees, developer turnover, fining procedures, lien expiration, and the post-Surfside condo safety legislation.
Mobile study quiz with browser-powered audio
The 105 questions now have a real interface: a mobile-first quiz with big tap targets, one question per screen, HTMX partial swaps, and a progress bar. A floating button toggles audio mode that reads questions aloud using the browser's SpeechSynthesis API. Zero API cost, and with an on-device voice it needs no network, so it works offline, even on a dollar-store Android.
An admin dashboard at /study/admin tracks multiple trainees through the material — accuracy, weak topics, exam readiness — so I can see at a glance who's prepared when new CAM hires start studying.
PRs:
- #2 — Architecture and feature docs
- #3 — OCR pipeline for PDF field extraction
- #4 — Platform MVP with Xero sync, billing, reporting, and budgets
- #5 — Owner portal with FastAPI, HTMX, and magic-link auth
- #6 — Fix study errors and expand to 105 questions
- #7 — Study portal with mobile quiz, audio, and team training