6c29c22601
Read-only "ask the database in plain English" backend. Translation runs on
the local Qwen via Spark Control (question -> {intent, slots}), so nothing leaves
the box: no Claude and no redaction boundary (the simplification chosen after
pressure-testing). The safe surface is a curated catalog of ~12 hand-written
parameterized queries, and a slot validator is the trust boundary (no generic SQL,
no dynamic identifiers). Endpoints: POST /api/query/nl and GET /api/query/catalog,
gated by require_bot_or_admin, read-only, audited. Soft-delete semantics are
respected per table. The local Qwen translated 12/12 real example questions
correctly against the live Spark. Web "Ask" box and Matrix bot still to come
(steps 4-5).
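A minimal sketch of how a curated catalog plus a slot validator can act as the trust boundary described above. The real catalog and validator are not shown on this page, so everything here is hypothetical and for illustration only: the `CATALOG` layout, the intent names, the table and column names, and `validate_slots`. The point is that model output only ever selects a fixed SQL string and supplies bound parameters.

```python
import re


def _int_in(lo, hi):
    """Build a checker that accepts only integers within [lo, hi]."""
    def check(v):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(v, bool) or not isinstance(v, int) or not lo <= v <= hi:
            raise ValueError(f"expected integer in [{lo}, {hi}]")
        return v
    return check


def _short_text(v):
    """Accept a short plain-text value (names, cities); reject anything else."""
    if not isinstance(v, str):
        raise ValueError("expected a string")
    v = v.strip()
    if not re.fullmatch(r"[\w .,&'-]{1,80}", v):
        raise ValueError("expected short plain-text value")
    return v


# Hypothetical catalog: intent -> fixed SQL text plus one checker per slot.
# Table and column names are assumptions, not the project's real schema.
CATALOG = {
    "top_investors_by_commitment": {
        "sql": "SELECT name, committed FROM investors "
               "WHERE deleted_at IS NULL ORDER BY committed DESC LIMIT :limit",
        "slots": {"limit": _int_in(1, 100)},
    },
    "investors_in_city": {
        "sql": "SELECT name FROM investors "
               "WHERE deleted_at IS NULL AND city = :city",
        "slots": {"city": _short_text},
    },
}


def validate_slots(intent, slots):
    """Return (sql, params) for a known intent, or raise ValueError."""
    entry = CATALOG.get(intent)
    if entry is None:
        raise ValueError(f"unknown intent: {intent!r}")
    if not isinstance(slots, dict) or set(slots) != set(entry["slots"]):
        raise ValueError("slots do not match the catalog entry")
    params = {name: check(slots[name]) for name, check in entry["slots"].items()}
    return entry["sql"], params
```

Because the SQL text comes only from the catalog and slot values are passed as bound parameters, a mistranslated or adversarial question can at worst select the wrong curated query; it cannot introduce new SQL or identifiers.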
52 lines
2.2 KiB
Python
#!/usr/bin/env python3
"""Dev harness: fire questions at the LOCAL model and print how each is translated.

Lets you eyeball whether the local Qwen maps real questions to the right curated query
(intent + slots), against your real Spark, with NO UI, auth, HTTP, or deploy. This is the
cheap way to validate translation quality before building the web/Matrix surfaces. It only
translates (it does not touch the DB), so no data is needed and nothing leaves the box.

NOT shipped and NOT a test (no `test_` prefix); a developer convenience.

Needs SPARK_CONTROL_URL set (read from the repo .env) and the Spark reachable.
Run:
    python3 backend/nl_query/try_questions.py  # the built-in sample set
    python3 backend/nl_query/try_questions.py "when did we last email Acme?"
"""
import os
import sys

sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))  # backend/
import nl_query  # noqa: E402

SAMPLES = [
    "Which investors haven't we reached out to in the last 3 months?",
    "Which investors do we owe follow-ups to?",
    "What is Acme Capital's email and how much have they committed across funds?",
    "When did we last reach out to Acme Capital?",
    "What were the last 10 investor emails we sent, and who to?",
    "What were the last 10 investor emails we received?",
    "Who are all the investors located in Austin?",
    "List our top 10 investors by committed capital.",
    "List our top 10 pipeline investors by stage and the most recent conversation.",
    "What is our total pipeline in dollars, split by stage?",
    "What were the last investor emails sent by Grant?",
    "How many emails has Jonathan sent this week, this month, and year to date?",
]


def main():
    questions = sys.argv[1:] or SAMPLES
    print(f"Translating {len(questions)} question(s) on the local model "
          f"(SPARK_CONTROL_URL={os.environ.get('SPARK_CONTROL_URL', '(unset)')})\n")
    for q in questions:
        r = nl_query.translate(q)
        if r.get("error"):
            print(f" ? {q}\n -> [{r['error']}] {r.get('detail', '')}\n")
        else:
            print(f" ? {q}\n -> {r['intent']} slots={r['slots']}\n")


if __name__ == "__main__":
    main()
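The loop in `main()` relies on `nl_query.translate` returning a dict in one of two shapes: one carrying an `error` key (with an optional `detail`), or one carrying `intent` and `slots`. The sketch below exercises that formatting without a reachable Spark. `format_result` and `fake_translate` are hypothetical helpers, and the canned intent name and error code are made up for illustration; they are not part of the harness.

```python
def format_result(question, result):
    """Render one translation result in the same layout the harness prints."""
    if result.get("error"):
        return f" ? {question}\n -> [{result['error']}] {result.get('detail', '')}"
    return f" ? {question}\n -> {result['intent']} slots={result['slots']}"


def fake_translate(question):
    """Hypothetical stand-in for nl_query.translate; returns canned shapes only."""
    if "Austin" in question:
        return {"intent": "investors_in_city", "slots": {"city": "Austin"}}
    return {"error": "no_match", "detail": "no catalog query fits this question"}


if __name__ == "__main__":
    for q in ["Who are all the investors located in Austin?", "What's the weather?"]:
        print(format_result(q, fake_translate(q)))
```

Swapping `fake_translate` for the real `nl_query.translate` gives the same output layout as the harness, which makes it easy to check the printing logic separately from translation quality.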