6c29c22601
Read-only "ask the database in plain English" backend. Translation runs on
the local Qwen via Spark Control (question -> {intent, slots}); nothing leaves
the box, no Claude and no redaction boundary (the simplification chosen after
pressure-testing). The safe surface is a curated catalog of ~12 hand-written
parameterized queries; a slot validator is the trust boundary (no generic SQL,
no dynamic identifiers). POST /api/query/nl + GET /api/query/catalog, gated
require_bot_or_admin, read-only, audited. Soft-delete-correct per table.
Local Qwen translated 12/12 real example questions correctly against the live
Spark. Web "Ask" box and Matrix bot still to come (steps 4-5).
70 lines
4.2 KiB
Markdown
---
paths:
  - backend/nl_query/**
---

# Natural-language query (W2)

Read this before editing the NL-query surface (`backend/nl_query/`). It is the read-only
"ask the database in plain English" layer: web "Ask" box + Matrix `@bot <question>`.

## The trust model: named intents, not a query language

There is **no generic SQL/AST compiler and no dynamically-built identifiers.** Every query is
a fixed, hand-written, reviewed, parameterized statement in `intents.py`; the only thing a
caller (or the model) controls is a small set of typed **slot values**, bound as `?` params.
`runner.validate` is the trust boundary: it accepts only a known intent key and coerces each
slot to its declared type, rejecting anything off-spec. A request that's wrong is rejected;
it can never name a table/column, pick an operator, or write SQL. `run_query` never raises:
every failure returns a structured error dict (a bad `limit=abc` must not crash the thread).

To add a capability: add a `run_*` + a registry entry (with its `slots` spec) in `intents.py`;
the translator prompt and the UI pick it up automatically from `catalog()`. Add a test case.

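
A minimal sketch of the pattern this section describes, not the actual `intents.py` / `runner.py` code. The registry shape (`REGISTRY`, `sql`, `slots`), the example intent `overdue_reminders`, the `SlotError` class, and the error-dict keys are all hypothetical; only `validate`, `run_query`, `MAX_ROWS`, and the `unknown_intent` kind are names taken from this document.

```python
import sqlite3

MAX_ROWS = 500  # global row ceiling named in this document

# Hypothetical registry: one fixed SQL string per intent plus a typed slot spec.
REGISTRY = {
    "overdue_reminders": {
        "sql": "SELECT id, title FROM reminders "
               "WHERE deleted_at IS NULL AND due_date < ? "
               "ORDER BY due_date LIMIT ?",
        "slots": {"before": str, "limit": int},  # declaration order == ? order
    },
}


class SlotError(ValueError):
    """Raised by validate() for anything off-spec (hypothetical helper)."""


def validate(intent, slots):
    """Accept only a known intent key and coerce each slot to its declared type."""
    spec = REGISTRY.get(intent) if isinstance(intent, str) else None
    if spec is None:
        raise SlotError("unknown_intent")
    if not isinstance(slots, dict):
        raise SlotError("slots must be an object")
    unexpected = set(slots) - set(spec["slots"])
    if unexpected:
        raise SlotError(f"unexpected slots: {sorted(unexpected)}")
    clean = {}
    for name, typ in spec["slots"].items():
        if name not in slots:
            raise SlotError(f"missing slot: {name}")
        try:
            clean[name] = typ(slots[name])
        except (TypeError, ValueError):
            raise SlotError(f"bad value for slot: {name}") from None
    if "limit" in clean:
        clean["limit"] = max(1, min(clean["limit"], MAX_ROWS))
    return clean


def run_query(conn, intent, slots):
    """Never raises: every failure comes back as a structured error dict."""
    try:
        clean = validate(intent, slots)
        # Slot values are only ever bound as parameters, never formatted into SQL.
        params = [clean[name] for name in REGISTRY[intent]["slots"]]
        rows = conn.execute(REGISTRY[intent]["sql"], params).fetchall()
        return {"ok": True, "intent": intent, "slots": clean, "rows": rows}
    except SlotError as exc:
        return {"ok": False, "error": str(exc)}
    except sqlite3.Error as exc:
        return {"ok": False, "error": f"sql_fault: {exc}"}
```

In this sketch a request such as `{"limit": "abc"}` comes back as an error dict instead of an exception, and an intent key that is not in the registry never reaches the database.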
## Local-only: no Claude, no redaction here

Translation (question → `{intent, slots}`) runs on the **local Qwen via Spark Control**
(`translate.py`, reusing `ingest/llm.py`), the same sanctioned local leg as intake/digest. The
question never leaves the box, so there is **no Claude path and no redaction boundary**; that
was the whole point of the W2 simplification (the *answer* is sensitive and never leaves; the
*question* is generic English, translated locally). Validated **12/12** on real example
questions against the live Spark (2026-06-18). The model output is still untrusted: it goes
straight through `runner.validate`, so a hallucinated intent is rejected. If the local model
ever proves too weak, a Claude-behind-redaction translator could drop in as an alternative
`chat_fn` without touching the validator/executor. That alternative is deliberately **not** built.

**Results never go to any model.** Summaries are deterministic local strings; rows render
client-side. Never add a "summarize these rows with an LLM" step: that re-introduces the leak.

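
A rough sketch of the translation step with an injected `chat_fn`, under stated assumptions: `chat_fn` is taken to be a callable that receives a prompt string and returns the model's raw text, and the prompt wording, `KNOWN_INTENTS`, and the function name `translate` are hypothetical. In the real code the untrusted output goes through `runner.validate`; here a minimal membership check stands in for it.

```python
import json

# Hypothetical stand-in for the intent keys that catalog() would expose.
KNOWN_INTENTS = {"overdue_reminders", "recent_emails"}


def translate(question, chat_fn):
    """Turn a question into {intent, slots} using an injected chat function.

    The model's reply is untrusted: anything that is not valid JSON naming a
    known intent with a dict of slots is reported as a soft miss.
    """
    prompt = (
        "Pick one intent from " + ", ".join(sorted(KNOWN_INTENTS))
        + ' and reply with JSON {"intent": ..., "slots": {...}}.\n'
        + "Question: " + question
    )
    raw = chat_fn(prompt)
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return {"error": "no_match"}
    if not isinstance(parsed, dict):
        return {"error": "no_match"}
    intent = parsed.get("intent")
    slots = parsed.get("slots", {})
    if not isinstance(intent, str) or intent not in KNOWN_INTENTS:
        return {"error": "unknown_intent"}
    if not isinstance(slots, dict):
        return {"error": "unknown_intent"}
    return {"intent": intent, "slots": slots}
```

Because the model is just a parameter, an offline test can pass a lambda that returns canned text, and an alternative translator would only need to supply a different `chat_fn`.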
## Soft-delete per table (the gotcha the design reviews caught)

The `fundraising_*` tables are a **hard-rebuilt projection** of the grid blob and have **no
`deleted_at` column**. Do NOT add `deleted_at IS NULL` to them (it raises). Their live/retired
axis is the **`graveyard` flag** (exclude `graveyard = 1` for "live"). Other tables:

- `reminders` / `opportunities` / `communications` → filter `deleted_at IS NULL`.
- `emails` has no `deleted_at` column; "live" = a non-tombstoned sighting (`EXISTS email_account_messages … deleted_at IS NULL`), mirroring `query_email_activity` / the digest.

`intents._last_activity_by_investor` **mirrors** `server.last_activity_by_investor` (duplicated
to avoid importing the `__main__` server module; helpers take a `conn`, never import server).
Keep the two in sync; the soft-delete test guards the copy.

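
The per-table rules in this section can be tried against a throwaway in-memory database. This is only an illustration: the schema is a guess, `fundraising_investors` is a hypothetical stand-in for the `fundraising_*` tables, the `email_id` join column is assumed, and `graveyard` is assumed to be a non-null 0/1 flag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fundraising_investors (id INTEGER, name TEXT, graveyard INTEGER);
    CREATE TABLE reminders (id INTEGER, title TEXT, deleted_at TEXT);
    CREATE TABLE emails (id INTEGER, subject TEXT);
    CREATE TABLE email_account_messages (email_id INTEGER, deleted_at TEXT);

    INSERT INTO fundraising_investors VALUES (1, 'Live LP', 0), (2, 'Retired LP', 1);
    INSERT INTO reminders VALUES (1, 'call back', NULL), (2, 'old', '2026-01-01');
    INSERT INTO emails VALUES (1, 'seen'), (2, 'tombstoned everywhere');
    INSERT INTO email_account_messages VALUES (1, NULL), (2, '2026-01-01');
""")

# One fixed "live" predicate per table family; never a shared deleted_at filter.
LIVE_QUERIES = {
    "fundraising": "SELECT name FROM fundraising_investors WHERE graveyard = 0",
    "reminders": "SELECT title FROM reminders WHERE deleted_at IS NULL",
    "emails": """
        SELECT e.subject FROM emails e
        WHERE EXISTS (SELECT 1 FROM email_account_messages m
                      WHERE m.email_id = e.id AND m.deleted_at IS NULL)
    """,
}

live = {key: [row[0] for row in conn.execute(sql)] for key, sql in LIVE_QUERIES.items()}

# Filtering a fundraising_* table on deleted_at fails: the column does not exist.
try:
    conn.execute("SELECT name FROM fundraising_investors WHERE deleted_at IS NULL")
    raised = False
except sqlite3.OperationalError:
    raised = True
```

Each family keeps its own predicate, which is why a single shared "not deleted" helper would be the wrong abstraction here.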
## Endpoint, caps, audit

- `POST /api/query/nl` (`require_bot_or_admin`, read-only): body `{question}` (local translate)
  or `{intent, slots}` (direct, e.g. a UI re-run). Returns
  `{intent, slots, rows, summary, question}`. `GET /api/query/catalog` returns the askable
  surface for the UI.
- Status: local-model outage → **503**; unexpected SQL fault → **500**; everything else
  (a hit, or a soft `no_match`/`unknown_intent`) → **200** with the structured result, because
  the UI always wants the interpreted query back, not a bare code.
- Every executed query writes an audit row (`audit_log`, `entity_type='nl_query'`) so a query
  through a leaked/automated credential is detectable. Global row ceiling `MAX_ROWS=500`.

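
The status policy and the audit write can be sketched as two small helpers. Everything here beyond the status codes, `audit_log`, `entity_type='nl_query'`, and the soft kinds `no_match` / `unknown_intent` is an assumption: the function names, the `model_unavailable` and `sql_fault` error kinds, and the `actor` / `detail` columns are hypothetical.

```python
import json
import sqlite3


def status_for(result):
    """Map a structured NL-query result dict to an HTTP status code.

    Assumes an optional "error" key holding a short kind string.
    """
    kind = result.get("error")
    if kind == "model_unavailable":  # hypothetical kind: local-model outage
        return 503
    if kind == "sql_fault":          # hypothetical kind: unexpected SQL fault
        return 500
    return 200                       # a hit, or a soft no_match / unknown_intent


def record_audit(conn, actor, result):
    """Write one audit row per executed query (column names are assumed)."""
    detail = json.dumps({"intent": result.get("intent"), "slots": result.get("slots")})
    conn.execute(
        "INSERT INTO audit_log (entity_type, actor, detail) VALUES (?, ?, ?)",
        ("nl_query", actor, detail),
    )
```

Keeping the mapping in one pure function makes the "200 for soft misses" rule easy to test without standing up the HTTP layer.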
## Tests + dev harness

`test_nl_query.py` (runner: every intent + soft-delete on both recency legs + injection-safety + caps),
`test_translate.py` (offline translator via an injected `chat_fn`), and
`test_nl_query_endpoint.py` (HTTP auth/wiring/503, local model forced down via a dead
`SPARK_CONTROL_URL` port). `try_questions.py` is a dev harness (not a test) that fires
questions at the real local model and prints the translation: the cheap way to check quality.
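
For orientation, an injection-safety check might look roughly like the following. This does not reproduce `test_nl_query.py`; the `find_reminders` helper, the table layout, and the test name are hypothetical, and the point is only that a hostile slot value bound as a `?` parameter is compared as data.

```python
import sqlite3


def find_reminders(conn, title):
    """Fixed statement; the caller's value is only ever a bound parameter."""
    sql = "SELECT id FROM reminders WHERE deleted_at IS NULL AND title = ?"
    return [row[0] for row in conn.execute(sql, (title,))]


def test_slot_value_cannot_inject_sql():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reminders (id INTEGER, title TEXT, deleted_at TEXT)")
    conn.execute("INSERT INTO reminders VALUES (1, 'call back', NULL)")

    hostile = "x' OR '1'='1'; DROP TABLE reminders; --"
    # The hostile string is compared as a literal title, so nothing matches.
    assert find_reminders(conn, hostile) == []
    # The table is still there with its row intact.
    assert conn.execute("SELECT COUNT(*) FROM reminders").fetchone()[0] == 1
    assert find_reminders(conn, "call back") == [1]
```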