---
paths:
- backend/nl_query/**
---

# Natural-language query (W2)

Read this before editing the NL-query surface (`backend/nl_query/`). It is the read-only
"ask the database in plain English" layer: the web "Ask" box plus Matrix `@bot <question>`.
## The trust model: named intents, not a query language

There is **no generic SQL/AST compiler and no dynamically-built identifiers.** Every query is
a fixed, hand-written, reviewed, parameterized statement in `intents.py`; the only thing a
caller (or the model) controls is a small set of typed **slot values**, bound as `?` params.
`runner.validate` is the trust boundary: it accepts only a known intent key and coerces each
slot to its declared type, rejecting anything off-spec. A malformed request is rejected;
it can never name a table/column, pick an operator, or write SQL. `run_query` never raises:
every failure returns a structured error dict (a bad `limit=abc` must not crash the thread).

To add a capability: add a `run_*` function and a registry entry (with its `slots` spec) in
`intents.py`; the translator prompt and the UI pick it up automatically from `catalog()`.
Add a test case.
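
The validate-then-run flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `intents.py` or `runner.py`: the registry shape, the `recent_reminders` intent, the `title` column, the error codes other than `unknown_intent`, and the way the `MAX_ROWS` ceiling is applied to `limit` are all hypothetical.

```python
import sqlite3

MAX_ROWS = 500  # global row ceiling named in this guide


def run_recent_reminders(conn, slots):
    # Fixed, parameterized statement: slot values are bound as ? params only.
    sql = ("SELECT id, title FROM reminders "
           "WHERE deleted_at IS NULL ORDER BY id DESC LIMIT ?")
    return conn.execute(sql, (slots["limit"],)).fetchall()


# Hypothetical registry shape: intent key -> runner plus typed slot spec.
INTENTS = {
    "recent_reminders": {
        "run": run_recent_reminders,
        "slots": {"limit": {"type": int, "default": 10}},
    },
}


def validate(intent, slots):
    """Trust boundary: known intent key, declared slots only, coerced types."""
    if not isinstance(intent, str) or intent not in INTENTS:
        raise ValueError("unknown_intent")
    if not isinstance(slots, dict):
        raise ValueError("slots must be an object")
    spec = INTENTS[intent]
    unknown = set(slots) - set(spec["slots"])
    if unknown:
        raise ValueError("unknown slot: " + ", ".join(sorted(map(str, unknown))))
    clean = {}
    for name, decl in spec["slots"].items():
        raw = slots.get(name, decl["default"])
        try:
            clean[name] = decl["type"](raw)
        except (TypeError, ValueError):
            raise ValueError(f"bad value for slot {name!r}")
    if "limit" in clean:
        # Assumed way of enforcing the ceiling: clamp into [1, MAX_ROWS].
        clean["limit"] = max(1, min(clean["limit"], MAX_ROWS))
    return spec, clean


def run_query(conn, intent, slots):
    """Never raises: every failure comes back as a structured error dict."""
    try:
        spec, clean = validate(intent, {} if slots is None else slots)
        rows = spec["run"](conn, clean)
        return {"intent": intent, "slots": clean, "rows": rows}
    except ValueError as exc:
        return {"error": "rejected", "detail": str(exc)}
    except Exception as exc:  # SQL faults and anything unexpected
        return {"error": "fault", "detail": str(exc)}
```

In this sketch a request such as `{"limit": "abc"}` or `{"table": "users"}` comes back as an error dict, and the SQL text is never built from caller input.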
## Local-only: no Claude, no redaction here

Translation (question → `{intent, slots}`) runs on the **local Qwen via Spark Control**
(`translate.py`, reusing `ingest/llm.py`), the same sanctioned local leg as intake/digest. The
question never leaves the box, so there is **no Claude path and no redaction boundary**; that
was the whole point of the W2 simplification (the *answer* is sensitive and never leaves; the
*question* is generic English, translated locally). Validated **12/12** on real example
questions against the live Spark (2026-06-18). The model output is still untrusted: it goes
straight through `runner.validate`, so a hallucinated intent is rejected. If the local model
ever proves too weak, a Claude-behind-redaction translator could drop in as an alternative
`chat_fn` without touching the validator/executor. That translator is deliberately **not** built.

**Results never go to any model.** Summaries are deterministic local strings; rows render
client-side. Never add a "summarize these rows with an LLM" step, because that re-introduces the leak.
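
A rough sketch of how an injected `chat_fn` keeps the translator swappable and the model output untrusted is shown below. It is illustrative only: the prompt wording, the `catalog` shape, the `summarize` helper, and the choice of `no_match` for unparseable replies are assumptions, not the contents of `translate.py`.

```python
import json


def translate(question, chat_fn, catalog):
    """Ask the local model for {intent, slots}; treat the reply as untrusted."""
    prompt = (
        "Pick one intent and fill its slots. Reply with JSON only.\n"
        f"Intents: {json.dumps(catalog)}\n"
        f"Question: {question}"
    )
    try:
        reply = json.loads(chat_fn(prompt))
    except (ValueError, TypeError):
        # Not JSON (or not even a string): report a soft miss, never raise.
        return {"error": "no_match", "question": question}
    if not isinstance(reply, dict) or not isinstance(reply.get("intent"), str):
        return {"error": "no_match", "question": question}
    slots = reply.get("slots")
    return {
        "intent": reply["intent"],
        "slots": slots if isinstance(slots, dict) else {},
        "question": question,
    }


def summarize(intent, rows):
    """Deterministic local string; the rows are never sent to a model."""
    noun = "row" if len(rows) == 1 else "rows"
    return f"{intent}: {len(rows)} {noun}"
```

Note that `translate` here does not check whether the intent exists. A hallucinated intent would still have to pass the validator before anything runs, which is the property the section above relies on.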
## Soft-delete per table (the gotcha the design reviews caught)

The `fundraising_*` tables are a **hard-rebuilt projection** of the grid blob and have **no
`deleted_at` column**. Do NOT add `deleted_at IS NULL` to them (the query raises). Their
live/retired axis is the **`graveyard` flag** (exclude `graveyard = 1` for "live"). Other tables:

- `reminders` / `opportunities` / `communications` → filter `deleted_at IS NULL`.
- `emails` have no `deleted_at`; "live" = a non-tombstoned sighting (`EXISTS email_account_messages … deleted_at IS NULL`), mirroring `query_email_activity` / the digest.

`intents._last_activity_by_investor` **mirrors** `server.last_activity_by_investor` (duplicated
to avoid importing the `__main__` server module; helpers take a `conn`, never import server).
Keep the two in sync; the soft-delete test guards the copy.
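
The per-table "live" filters can be demonstrated with a small in-memory example. This sketch assumes SQLite and uses hypothetical minimal schemas: the `fundraising_investors` table name and the `name`/`title` columns are invented for illustration, and only `deleted_at` and `graveyard` come from this guide. The `emails` case is left out because its join columns are not given here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas; only deleted_at and graveyard are from the guide.
    CREATE TABLE fundraising_investors (
        id INTEGER PRIMARY KEY, name TEXT, graveyard INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE reminders (
        id INTEGER PRIMARY KEY, title TEXT, deleted_at TEXT);
    INSERT INTO fundraising_investors (name, graveyard)
        VALUES ('Acme', 0), ('Old Fund', 1);
    INSERT INTO reminders (title, deleted_at)
        VALUES ('call', NULL), ('stale', '2026-01-01');
""")

# Live rows in a fundraising_* projection: exclude graveyard = 1.
live_investors = conn.execute(
    "SELECT name FROM fundraising_investors WHERE graveyard = 0"
).fetchall()

# Live rows in a soft-deleted table: deleted_at IS NULL.
live_reminders = conn.execute(
    "SELECT title FROM reminders WHERE deleted_at IS NULL"
).fetchall()

# The wrong filter on a projection table fails at execution time.
try:
    conn.execute("SELECT name FROM fundraising_investors WHERE deleted_at IS NULL")
    wrong_filter_error = None
except sqlite3.OperationalError as exc:
    wrong_filter_error = str(exc)  # SQLite reports the missing column
```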
## Endpoint, caps, audit

- `POST /api/query/nl` (`require_bot_or_admin`, read-only): body `{question}` (local translate)
  or `{intent, slots}` (direct, e.g. a UI re-run). Returns
  `{intent, slots, rows, summary, question}`. `GET /api/query/catalog` returns the askable
  surface for the UI.
- **Clients (thin):** the **Matrix Q&A** surface is built. It lives bot-side in
  `backend/matrix_intake/query.py` (trigger grammar + deterministic answer rendering) +
  `crm_client.nl_query`, and ships on the Spark (no s9pk for the bot). Two entry points: a
  **dedicated Q&A room** (`MATRIX_QUERY_ROOM`, every message is a question) and the `?`/`@bot`
  trigger in the intake room. **It depends on this endpoint being live on the box**, which lands
  with the v93 s9pk (reminders + W2); deploy the bot only after that, or it 404s. See the
  matrix-intake guide. The **web "Ask" box** (Communications tab) is the remaining client.
- Status: local-model outage → **503**; unexpected SQL fault → **500**; everything else
  (a hit, or a soft `no_match`/`unknown_intent`) → **200** with the structured result, because
  the UI always wants the interpreted query back, not a bare code.
- Every executed query writes an audit row (`audit_log`, `entity_type='nl_query'`) so a query
  through a leaked/automated credential is detectable. The global row ceiling is `MAX_ROWS=500`.
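
The status rule in the list above might be expressed as a single mapping function. This is a sketch only: `model_unavailable` and `sql_fault` are hypothetical error codes standing in for whatever the endpoint uses internally, while `no_match` and `unknown_intent` are the soft outcomes named above. The two request bodies use an invented question and intent.

```python
# The two request shapes accepted by POST /api/query/nl (values are made up).
BODY_FROM_QUESTION = {"question": "which reminders are due this week?"}
BODY_DIRECT = {"intent": "recent_reminders", "slots": {"limit": 10}}


def http_status(result):
    """Map a structured query result to the HTTP status described above."""
    error = result.get("error")
    if error == "model_unavailable":  # hypothetical code: local-model outage
        return 503
    if error == "sql_fault":          # hypothetical code: unexpected SQL fault
        return 500
    # A hit, or a soft no_match / unknown_intent: 200 with the structured body.
    return 200
```

Keeping soft misses at 200 means a client can always read `intent` and `slots` from the body and show the user how the question was interpreted.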
## Tests + dev harness

`test_nl_query.py` (runner: every intent + soft-delete on both recency legs + injection-safety
+ caps), `test_translate.py` (offline translator via an injected `chat_fn`), and
`test_nl_query_endpoint.py` (HTTP auth/wiring/503, local model forced down via a dead
`SPARK_CONTROL_URL` port). `try_questions.py` is a dev harness (not a test) that fires
questions at the real local model and prints the translation, which is the cheap way to check quality.
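
One way to force the local model "down" for an endpoint test is to point `SPARK_CONTROL_URL` at a port nothing listens on. The helpers below are a hypothetical sketch of that trick using only the standard library; they are not taken from `test_nl_query_endpoint.py`, and the helper names are invented.

```python
import socket


def dead_local_port():
    """Bind an ephemeral localhost port, then release it so nothing listens there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def port_is_open(host, port, timeout=1.0):
    """True only if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # connection refused, timeout, unreachable
        return False


def force_model_down(env):
    """Point SPARK_CONTROL_URL in an environ-like mapping at a dead port."""
    port = dead_local_port()
    env["SPARK_CONTROL_URL"] = f"http://127.0.0.1:{port}"
    return port
```

A test could call `force_model_down(os.environ)` (or a monkeypatched copy) before hitting the endpoint and then assert on the 503. There is a small window in which another process could claim the released port, so treat this as a convenience rather than a guarantee.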