Add NL-query backend (W2): local translator + safe named-query runner

Read-only "ask the database in plain English" backend. Translation runs on the local Qwen via Spark Control (question -> {intent, slots}); nothing leaves the box, no Claude and no redaction boundary (the simplification chosen after pressure-testing). The safe surface is a curated catalog of ~12 hand-written parameterized queries; a slot validator is the trust boundary (no generic SQL, no dynamic identifiers). POST /api/query/nl + GET /api/query/catalog, gated require_bot_or_admin, read-only, audited. Soft-delete-correct per table. Local Qwen translated 12/12 real example questions correctly against the live Spark. Web "Ask" box and Matrix bot still to come (steps 4-5).
2026-06-18 18:35:41 -05:00
parent a166b49397
commit 6c29c22601
13 changed files with 1348 additions and 13 deletions
@@ -48,6 +48,7 @@ cd start9/0.4 && make
 - `backend/redaction/` — `scrub.py` + `client.py`: the scrub→Claude→re-hydrate privacy boundary.
 - `backend/ingest/` — chunk→embed→Qdrant + retrieval modes.
 - `backend/entity_*.py` — entity resolution/merge (the two-investor-model reconciliation).
+- `backend/nl_query/` — read-only natural-language query (W2): `intents.py` (curated parameterized query catalog), `runner.py` (slot validator = trust boundary), `translate.py` (local-Qwen question→{intent,slots}). See the nl-query guide.
 - `backend/matrix_intake/` — Matrix intake bot (separate process; `matrix-nio`, isolated to this component): typed message → local-Qwen parse → in-thread approve → write via the CRM's own `log-communication`. See the matrix-intake guide.
 - `frontend/index.html` — the entire UI.
 - `docs/` — architecture, phase plans, contracts, runbooks (see Deeper docs). `docs/guides/` — scoped subsystem rules (see below).
@@ -65,6 +66,7 @@ Subsystem rules live in `docs/guides/` and lazy-load in Claude Code via `.claude
 - **Email capture / drafts + digest send** (`backend/email_integration/`, `backend/digest_mailer.py`, `backend/smtp_send.py`) → `docs/guides/email.md`
 - **Building or deploying the s9pk** (`start9/`) → `docs/guides/packaging.md`
 - **Matrix intake bot** (`backend/matrix_intake/`) → `docs/guides/matrix-intake.md`
+- **Natural-language query** (`backend/nl_query/`) → `docs/guides/nl-query.md`

 ## Conventions

@@ -104,17 +106,13 @@ Subsystem rules live in `docs/guides/` and lazy-load in Claude Code via `.claude

 ## Current state

-_Phase 0 + Phase 1 built; **box live at v0.1.0:91; repo at v0.1.0:92** (v92 = reminders/follow-ups — built + tested locally 2026-06-18, **deploy pending**. Box deployed & verified live 2026-06-18 — `installed-version`=0.1.0:91, server up on :8080, clean; the StartOS version-graph traversal logs an inert down-to-39-then-up because the per-version `up`/`down` hooks are no-ops — real SQLite migrations run in-app at startup). **The fundraising grid + email capture is the canonical system of record.** Deploy/feature history: git log + `start9/0.4/startos/versions/`; longer-term backlog/debt: `ROADMAP.md` / `EVALUATION.md`._
+_Phase 0 + Phase 1 built; **box live at v0.1.0:91; repo at v0.1.0:92** (reminders, deploy pending). **The fundraising grid + email capture is the canonical system of record.** Active thread: **W2 natural-language query** (backend done + validated locally; web/Matrix UI next). Deploy/feature history: git log + `start9/0.4/startos/versions/`; longer-term backlog/debt: `ROADMAP.md` / `EVALUATION.md`._

- **Reminders & follow-ups (W1) — BUILT + tested locally 2026-06-18 (repo v0.1.0:92, deploy pending).** First step of the agreed reminders → NL-search → bot-mutations plan (`ROADMAP.md` "Follow-ups/reminders + NL search + bot grid-mutations"; **overarching constraint: keep LP data off third-party LLMs — the dominant risk, above write-safety**). First-class tickler tied to the grid: `reminders` table (in-app migration `0006`; logical FK to `fundraising_investors.id` + denormalized name, like `0005`), full CRUD (`GET/POST/PATCH/DELETE /api/reminders`; soft-delete; open/done/snoozed/cancelled; assignee; `source` human/bot/automation; accepts `source_row_id` so the grid stays decoupled), a read-only **derived `reminder_status` grid column** (overdue/due_soon/open — injected + stripped like `pipeline_stage`; **filterable so a saved view can later drive the follow-up view off real reminders, not the binary `follow_up` checkbox**), an orphan reconciler (`reconcile_grid_reminders`), a **Reminders** page + Dashboard **"Reminders Due"** card + **"Reminders due"** daily-digest section, and a per-investor **`last_activity_at`** recency rollup (shared building block for the W2 NL "not nurtured" query). **Pure local CRM — no LLM path, no leak surface.** Snooze keeps a reminder `open` with a pushed-out date (reliably reappears); the `snoozed` status is an explicit "mute" (Edit only). Tests: `test_reminders.py` + digest reminders test (**31/31 green, render-smoke green**). **Not yet deployed** — needs an s9pk build + install (authorize first; verify `0006` against a DB copy). Deferred fast-follow **W1b** = nurture-gap auto-suggested reminders.
+- **W2 — natural-language query (read-only): BACKEND built + tested + validated locally 2026-06-18; web/Matrix UI next.** `backend/nl_query/` — 12 curated parameterized queries + a slot validator (the trust boundary; no generic SQL) + a **local-Qwen** translator (question→{intent,slots} via Spark Control; nothing leaves the box, **no Claude, no redaction** — the simplification Grant chose). `POST /api/query/nl` (also accepts direct `{intent,slots}`) + `GET /api/query/catalog`, `require_bot_or_admin`, audited (`entity_type='nl_query'`). **Local Qwen translated 12/12 of Grant's real example questions correctly against the live Spark** — settles local-only (Claude not needed). Soft-delete-correct per table (gotcha: `fundraising_*` has **no `deleted_at`** — `graveyard` is the axis; emails via a live `eam` sighting). Guide: `docs/guides/nl-query.md`. **Next: step 4 web "Ask" box (Communications tab), step 5 Matrix `@bot <q>`** — both thin clients of the endpoint. Not committed at deploy/version level yet; folds into the next s9pk.

- **Email-proposal review over Matrix + a `bot` role — DEPLOYED, LIVE & smoke-tested 2026-06-18 (box v0.1.0:91, Spark bot `b2690c4`).** The CRM-drafted "proposed grid notes" gain: (1) a click-to-view **inline source-email popup** on the Email Capture page (`GET /api/email/detail` — from/to/cc/date/subject + scrollable body); (2) a **CRM→Matrix review bridge** — the bot pulls pending proposals (`GET /api/intake/email-proposals`), posts dash-framed review cards (note names who emailed whom, not "Sent/Received") to a dedicated room (`MATRIX_EMAIL_REVIEW_ROOM`), and relays in-thread yes/no/NL-edit (`POST .../decide`), kept in sync with the web panel (decide on either → the other reflects it). **Decided threads are redacted whole** (card + replies; the bot holds a redact/mod power level) so — with Element's "show deleted messages" OFF — the main chat *and* threads view clear completely (confirmed the intended UX). New **`bot` role** (authenticated, never admin; `require_bot_or_admin`) gates the agent endpoints; state in `email_proposal_matrix` (email-migration `0003`). Full mechanics, deploy gotchas, and the `redact_resolved.py` backfill tool: `docs/guides/matrix-intake.md`.
+- **W1 — reminders & follow-ups: BUILT + tested locally (v0.1.0:92), DEPLOY PENDING.** First-class tickler tied to the grid (migration `0006`; CRUD `GET/POST/PATCH/DELETE /api/reminders`; derived `reminder_status` grid column; Reminders page + dashboard card + digest section; the `last_activity_at` recency rollup that W2 reuses). Needs s9pk build + install (authorize first; verify `0006` against a DB copy). Deferred **W1b** = nurture-gap auto-suggested reminders.

- **Adopt the Pipeline — LIVE & smoked (v0.1.0:88):** the grid drives the deal board. "Add to Pipeline" row action → durably-linked `opportunities` row via `opportunities.fundraising_investor_id` (migration 0005); grid owns link+seed, board owns stage/probability/owner; "Remove" soft-deletes the opp, grid row intact; deleting a grid investor archives its orphaned opp. Detail + locked decisions: `ROADMAP.md` "Adopt the Pipeline".
-
- **Matrix intake bot — LIVE (Spark, container `matrix-intake`, bot `b2690c4`):** typed Matrix message → local-Qwen parse → in-thread human approval → write via `POST /api/fundraising/log-communication` (`source="matrix_intake"`); new-vs-existing via `GET /api/intake/match`. Fuzzy match + numbered disambiguation shortlist, conversational NL revise (email-integrity preserved), and the `INTAKE_TEAM_ROSTER` outreach frame all shipped & mostly smoked. **One path still un-smoked in-room: the fuzzy disambiguation numbered-pick grammar.** Runs as a docker-compose service (`restart: unless-stopped`). Guide: `docs/guides/matrix-intake.md`.
- **Working (all draft-only):** CRM + ingest (chunk→embed→Qdrant + retrieval) + redaction boundary; Gmail capture (DWD) + email-activity propose→approve; Thesis Workshop + Architect (Claude) with dual-approval gate; Outreach Draft Assistant + follow-up radar + per-user voice + Tier-B in-thread Gmail draft creation.
- **Tests:** **30/30 backend green** (`python3 backend/run_tests.py`), `py_compile` clean; frontend render-smoke gates the default `make` build.
- **Debt (P2, not deploy-blocking; full list `EVALUATION.md`):** reports-subsystem soft-delete sweep — **pipeline/opportunities aggregates fixed v87**; remaining: the dashboard **communications** aggregates (`recent_comms`/`comms_this_month`/`meetings_this_month`) + activity report + report-endpoint tests; `?limit=abc` crashes the request thread; auth regression test for the 3 v79-gated GETs (`/api/users`, `/api/email/status`, `/api/email/accounts`); scrub-gateway TLS verify off; hardcoded Spark/Qdrant IPs + **oversized StartOS package icon** (fix before the next s9pk upload); the 5.4k-line `server.py` monolith.
- **Open / risks:** the v2.0 reserve-asset spine is the *working* approved spine but **not a canonical `thesis_version`** (needs Grant + Jonathan dual sign-off; Appendix-A conviction incl. ~40% Strike stays Grant's working read, not fed to the engine); **Claude/Architect path still unverified live on the box**; the intake matcher reads only the grid blob (not classic `contacts`); doc drift — `crm-overview.md` + `EVALUATION.md` still call `lp_profiles` live (doc-auditor pass).
- **Next:** 1) **deploy v0.1.0:92 (reminders)** to the box — needs authorization; verify migration `0006` against a copy of `data/crm.db`, then `make` + install + browser-verify the Reminders page/grid chip/dashboard card (only render-smoke ran locally, not a live authenticated click-through); 2) **W2 — NL→safe-query** (the agreed plan's next build; validated filter-AST, Claude behind redaction, only the question text leaves the box; web + Matrix; = old "search item 3"), then **W3 — bot grid-mutations** behind the Matrix approval gate; 3) **W1b** nurture-gap auto-suggested reminders (fast-follow once recency proven); 4) spark-control intake dashboard card + extract intake bot to its own repo (ROADMAP); 5) in-room smoke of the intake **disambiguation** numbered-pick grammar; 6) Grant + Jonathan freeze v2.0 canonical; 7) reply-all for Tier-B drafts; then clear the P2 debt (reports comms-aggregate soft-delete sweep, `?limit=abc` crash, auth regression test, oversized StartOS icon). **Possible follow-ups:** email-review `since`-floor on `to_post`; Pipeline drag-and-drop stage moves.
+- **Done & live (detail in git log / ROADMAP):** email-proposal Matrix review + `bot` role (box v91); grid-driven Pipeline (v88); Matrix intake bot (Spark `matrix-intake` container); Gmail capture (DWD) + propose→approve + daily digest; Thesis Workshop + Architect (Claude, dual-approval); outreach drafts + radar. All draft-only.
+- **Tests:** **34/34 backend green** (`python3 backend/run_tests.py`; +`nl_query/` suite), `py_compile` clean; render-smoke gates `make`.
+- **Next (priority order):** 1) **W2 step 4** web Ask box, then **step 5** Matrix `@bot`; 2) **deploy reminders (v92) + W2 together** — bump to **v0.1.0:93**, build s9pk, install, browser-verify (authorize first; verify `0006` against a DB copy); 3) **W3** bot grid-mutations behind the Matrix approval gate (local-Qwen parse); 4) **W1b** nurture-gap reminders; 5) Grant + Jonathan freeze v2.0 canonical; 6) in-room smoke of the intake disambiguation numbered-pick grammar; then P2 debt (reports comms-aggregate soft-delete sweep, `?limit=abc` crash, auth regression test, oversized StartOS icon).
+- **Open / risks:** W2 translation only **happy-path-validated** (typos/ambiguous/no-match phrasings shake out in live use); **Claude/Architect path still unverified live on the box**; v2.0 reserve-asset spine is the *working approved* spine but **not canonical** (needs dual sign-off); doc drift — `crm-overview.md` + `EVALUATION.md` still call `lp_profiles` live.