---
paths:
  - backend/nl_query/**
---

# Natural-language query (W2)

Read this before editing the NL-query surface (`backend/nl_query/`). It is the read-only "ask the database in plain English" layer — web "Ask" box + Matrix `@bot `.

## The trust model — named intents, not a query language

There is **no generic SQL/AST compiler and no dynamically-built identifiers.** Every query is a fixed, hand-written, reviewed, parameterized statement in `intents.py`; the only thing a caller (or the model) controls is a small set of typed **slot values**, bound as `?` params. `runner.validate` is the trust boundary: it accepts only a known intent key and coerces each slot to its declared type, rejecting anything off-spec. A request that's wrong is rejected; it can never name a table/column, pick an operator, or write SQL. `run_query` never raises — every failure returns a structured error dict (a bad `limit=abc` must not crash the thread).

To add a capability: add a `run_*` + a registry entry (with its `slots` spec) in `intents.py`; the translator prompt and the UI pick it up automatically from `catalog()`. Add a test case.

## Local-only — no Claude, no redaction here

Translation (question → `{intent, slots}`) runs on the **local Qwen via Spark Control** (`translate.py`, reusing `ingest/llm.py`), the same sanctioned local leg as intake/digest. The question never leaves the box, so there is **no Claude path and no redaction boundary** — that was the whole point of the W2 simplification (the *answer* is sensitive and never leaves; the *question* is generic English, translated locally). Validated **12/12** on real example questions against the live Spark (2026-06-18).

The model output is still untrusted: it goes straight through `runner.validate`, so a hallucinated intent is rejected. If the local model ever proves too weak, a Claude-behind-redaction translator could drop in as an alternative `chat_fn` without touching the validator/executor — deliberately **not** built.
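
Because the model's output is treated as untrusted input, the safety argument rests on the validate step. The sketch below illustrates that shape under stated assumptions; it is not the project's code. The `INTENTS` layout, the `recent_reminders` intent, the `validate`/`run_query` signatures, and every error key other than `unknown_intent` are hypothetical names chosen for the example.

```python
import sqlite3
from typing import Any

MAX_ROWS = 500  # global row ceiling, as stated in this guide


def run_recent_reminders(conn: sqlite3.Connection, slots: dict) -> list:
    # Hypothetical intent: fixed SQL text, the only variable part is a bound `?` param.
    return conn.execute(
        "SELECT id, title FROM reminders WHERE deleted_at IS NULL "
        "ORDER BY id DESC LIMIT ?",
        (slots["limit"],),
    ).fetchall()


# Hypothetical registry layout: intent key -> runner + typed slot spec.
INTENTS: dict[str, dict[str, Any]] = {
    "recent_reminders": {
        "run": run_recent_reminders,
        "slots": {"limit": {"type": int, "default": 10}},
    },
}


def validate(intent: Any, slots: Any) -> tuple:
    """Return (clean_slots, None) on success or (None, error_dict) on rejection."""
    spec = INTENTS.get(intent) if isinstance(intent, str) else None
    if spec is None:
        return None, {"error": "unknown_intent"}
    if not isinstance(slots, dict):
        return None, {"error": "bad_slots"}
    unknown = set(slots) - set(spec["slots"])
    if unknown:
        # Callers cannot smuggle in extra keys such as a table or column name.
        return None, {"error": "unknown_slot", "slots": sorted(map(str, unknown))}
    clean = {}
    for name, decl in spec["slots"].items():
        raw = slots.get(name, decl["default"])
        try:
            clean[name] = decl["type"](raw)  # coerce to the declared type
        except (TypeError, ValueError, OverflowError):
            return None, {"error": "bad_slot", "slot": name}
    if "limit" in clean:
        clean["limit"] = max(1, min(clean["limit"], MAX_ROWS))
    return clean, None


def run_query(conn: sqlite3.Connection, intent: Any, slots: Any) -> dict:
    """Never raises: every failure comes back as a structured error dict."""
    clean, err = validate(intent, slots)
    if err is not None:
        return err
    try:
        rows = INTENTS[intent]["run"](conn, clean)
    except sqlite3.Error as exc:
        return {"error": "sql_fault", "detail": str(exc)}
    return {"intent": intent, "slots": clean, "rows": rows}
```

With this shape, a call such as `run_query(conn, "recent_reminders", {"limit": "abc"})` returns an error dict instead of raising, and a made-up intent key is rejected before any SQL runs.
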
**Results never go to any model.** Summaries are deterministic local strings; rows render client-side. Never add a "summarize these rows with an LLM" step — that re-introduces the leak.

## Soft-delete per table (the gotcha the design reviews caught)

The `fundraising_*` tables are a **hard-rebuilt projection** of the grid blob and have **no `deleted_at` column** — do NOT add `deleted_at IS NULL` to them (it raises). Their live/retired axis is the **`graveyard` flag** (exclude `graveyard = 1` for "live"). Other tables:

- `reminders` / `opportunities` / `communications` → filter `deleted_at IS NULL`.
- `emails` have no `deleted_at`; "live" = a non-tombstoned sighting (`EXISTS email_account_messages … deleted_at IS NULL`), mirroring `query_email_activity` / the digest.

`intents._last_activity_by_investor` **mirrors** `server.last_activity_by_investor` (duplicated to avoid importing the `__main__` server module — helpers take a `conn`, never import server). Keep the two in sync; the soft-delete test guards the copy.

## Email/comms intents are MATCHED-ONLY

The email-touching intents (`recent_emails`, `comms_by_user`, `email_counts_by_user`, `investor_last_contact`) surface only **investor-linked** email — an `email_investor_links` row must exist — exactly like the Communications panel's `query_email_activity`. Captured internal/vendor/personal mail is never counted or listed. The gate is `EXISTS (SELECT 1 FROM email_investor_links l WHERE l.email_id = e.id)`. **`comms_by_user` / `email_counts_by_user` originally omitted this** and counted the user's *entire* sent corpus — fixed; the runner test now seeds an unmatched sent email to guard it. Add this gate to any new email intent.

## Endpoint, caps, audit

- `POST /api/query/nl` (`require_bot_or_admin`, read-only) — body `{question}` (local translate) or `{intent, slots}` (direct, e.g. a UI re-run). Returns `{intent, slots, rows, summary, question}`. `GET /api/query/catalog` returns the askable surface for the UI.
- **Clients (thin):** the **Matrix Q&A** surface is built — it lives bot-side in `backend/matrix_intake/query.py` (trigger grammar + deterministic answer rendering) + `crm_client.nl_query`, and ships on the Spark (no s9pk for the bot). Two entry points: a **dedicated Q&A room** (`MATRIX_QUERY_ROOM`, every message is a question) and the `?`/`@bot` trigger in the intake room. **It depends on this endpoint being live on the box** — which lands with the v93 s9pk (reminders + W2); deploy the bot only after that, or it 404s. See the matrix-intake guide. The **web "Ask" box** (Communications tab) is the remaining client.
- Status: local-model outage → **503**; unexpected SQL fault → **500**; everything else (a hit, or a soft `no_match`/`unknown_intent`) → **200** with the structured result, because the UI always wants the interpreted query back, not a bare code.
- Every executed query writes an audit row (`audit_log`, `entity_type='nl_query'`) so a query through a leaked/automated credential is detectable. Global row ceiling `MAX_ROWS=500`.

## Tests + dev harness

`test_nl_query.py` (runner: every intent + soft-delete on both recency legs + injection-safety + caps), `test_translate.py` (offline translator via an injected `chat_fn`), and `test_nl_query_endpoint.py` (HTTP auth/wiring/503, local model forced down via a dead `SPARK_CONTROL_URL` port). `try_questions.py` is a dev harness (not a test) that fires questions at the real local model and prints the translation — the cheap way to check quality.
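
The injected-`chat_fn` pattern used for offline translator tests can be illustrated with a small sketch. It is a guess at the general shape, not the contents of `translate.py` or `test_translate.py`: the `translate` signature, the prompt text, the fake functions, and the `model_unavailable` error key are all assumptions for the example. Only the `{intent, slots}` output shape, the `recent_emails` intent name, and the soft `no_match` result come from this guide.

```python
import json
from typing import Callable


def translate(question: str, chat_fn: Callable[[str], str]) -> dict:
    """Hypothetical translator: ask chat_fn for JSON, return {intent, slots} or an error dict."""
    prompt = (
        "Map the question to JSON with keys 'intent' and 'slots'.\n"
        f"Question: {question}"
    )
    try:
        raw = chat_fn(prompt)
    except ConnectionError:
        # Assumed outage signal; an endpoint could map this to a 503.
        return {"error": "model_unavailable"}
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):
        return {"error": "no_match"}  # the model replied with something that is not JSON
    if not isinstance(data, dict) or not isinstance(data.get("intent"), str):
        return {"error": "no_match"}
    slots = data.get("slots")
    # The result is still untrusted and would go through the validator next.
    return {"intent": data["intent"], "slots": slots if isinstance(slots, dict) else {}}


# Fakes standing in for the local model, so the tests need no network.
def fake_ok(prompt: str) -> str:
    return '{"intent": "recent_emails", "slots": {"limit": 5}}'


def fake_garbage(prompt: str) -> str:
    return "Sure! Here is your answer"


def fake_down(prompt: str) -> str:
    raise ConnectionError("spark control unreachable")
```

Passing the model call in as a parameter is what keeps such tests offline: each fake pins one behavior (a clean answer, non-JSON chatter, an outage), while a harness like `try_questions.py` covers the real model separately.
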