ten31-database

Author	SHA1	Message	Date
Keysat	7285bb0e52	Add regression tests for v74 fixes; close soft-delete leak in list-view aggregates Lock in the three v0.1.0:74 security/privacy fixes with regression tests, and fix a same-class soft-delete leak surfaced while writing them. - backend/test_assets_traversal.py: boots the real server, proves /assets/ path-traversal vectors (incl. a real decoy file and the live crm.db, plain and URL-encoded) 404 and leak nothing, while a legit asset still serves 200. - backend/test_soft_delete_reads.py: get-by-id 404s soft-deleted rows and nested + list-view aggregates exclude soft-deleted children. - backend/mcp/test_outreach_redaction.py: an unknown free-prose name is tokenized away from the Claude payload but re-hydrated locally, and the path fails closed (no Claude call) when the local NER model is down. - backend/run_tests.py: aggregate runner (each backend/**/test_*.py in its own subprocess); replaces the manual for-loop. 16/16 green. A reviewer pass on the tests confirmed the soft-delete filter was missing from list-view aggregate sub-selects: org contact_count/total_funded and contacts comm_count/last_contact_date counted soft-deleted rows. Add `deleted_at IS NULL` to those four (server.py) and regression-cover them. The reports subsystem (dashboard/pipeline/LP-breakdown, ~16 aggregate queries) has the same leak and is logged as P2 for a dedicated pass. Not yet built or deployed — bump the package version before the next s9pk build.	2026-06-13 00:26:22 -05:00
Keysat	aec2b7775b	Harden privacy boundary and asset serving (v0.1.0:74) Fixes from the 2026-06-12 full-eval (P0 + two P1s); code-only, no schema change. Without these the "private CRM" premise was breachable on the LAN: - P0: the /assets/ route joined the request path onto FRONTEND_DIR without normalizing '..' (get_path/urlparse pass it through), so an unauthenticated GET /assets/../../data/crm.db read any file the process could — the LP DB, the JWT signing secret (-> admin-token forgery), the Gmail key. Add a realpath containment check that 404s anything resolving outside FRONTEND_ROOT. - P1: the LP-outreach drafter built its redaction Boundary with no ner_fn, so unknown people/firms in raw email bodies reached Claude in the clear. Pass the local-Qwen NER backstop (ner_fn=_ner_local), matching architect_grounding; fails closed via the existing scrub_unavailable path if the local model is down. - P1: get-by-id handlers leaked soft-deleted records by direct ID. Add deleted_at IS NULL to every get-by-id path — contacts, organizations, opportunities, lp_profiles — and to the nested related-data sub-selects in the contact/opportunity detail payloads, matching the list-handler convention. Bumps the package to v0.1.0:74 (utils.ts + versions/v0.1.0.74.ts + graph). Full report in EVALUATION.md; remaining P2/P3 triaged in AGENTS.md Current state.	2026-06-12 18:01:48 -05:00
Keysat	fffc90c7a4	Replace v5 settlement spine with v2.0 reserve-asset spine (v0.1.0:73) Swap the dead "scarcity as the connecting idea" / bitcoin-as-settlement spine for the v2.0 reserve-asset spine (bitcoin = apex non-debasable reserve asset; debasement = forcing function; AI = abundance engine; throughline is an asset-value/capital-flow claim, not settlement; three seams Energy<->Compute, Debasement<->Bitcoin, AI<->Data-Ownership) everywhere it was still encoded in live code, the seed, and the docs. - architect_agent.py / outreach_agent.py: both system prompts carried "scarcity as the connecting idea" and shipped settlement framing into every generated draft; rewritten to the reserve-asset spine. - thesis_seed.py: THROUGHLINE, PILLAR_1, the AI/energy-operator segment angle, and THESIS_V2 corrected and voice-cleaned (no em dash / "X, not Y" / "bet"). PILLAR_2/3 (real revenue, founder access) kept. - ensure_thesis_v2_promoted / revert_thesis_v2_promotion: make the v2.0 spine the working APPROVED spine and re-ground/clean the core nodes, deployment-state-invariant (structural targeting, not body text) and fully reversible (captures prior body/title/status/deleted_at). NODE level only: never sets a thesis_version canonical (guardrail #4); no hard deletes (guardrail #3). Wired into init_db after the v2 candidate stage. - docs/thesis-handoff.md replaced wholesale with the complete v2.0 doc; Ten31_Agentic_Build_Plan.md + PHASE_1.md throughline glosses updated. The v2.0 spine remains an unratified draft from the signal-engine workstream: canonical freeze stays the partners' dual sign-off, and Appendix-A conviction/exposure figures stay Grant's working read. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-09 08:22:24 -05:00
Keysat	606b336a00	outreach: voice by-purpose (larger sample) + Tier-B Gmail draft creation (v0.1.0:71) (1) Voice: _voice_examples now picks the sender's prior sent emails OF THE SAME PURPOSE (PURPOSE_PATTERNS keyword cues per outreach type), larger sample (8) weighted by purpose then recency — not just recent. meta carries on_topic for transparency. (2) Tier-B sending (gmail.compose now authorized in Workspace DWD). New email_integration/compose.py create_outreach_draft: mints a compose-scoped DWD token for the sender (credentials._mint/access_token_for parameterized by scope; GMAIL_COMPOSE_SCOPE), builds an RFC822 message, and POSTs gmail.drafts.create into the SENDER's mailbox — as an in-thread reply (threadId + In-Reply-To/References, recipient = matched LP address) when there's an active thread, else a fresh email. NEVER sends — the human sends from Gmail (guardrails #4, #6). Route POST /api/outreach/gmail-draft; UI "Create Gmail draft" button + "Open Gmail Drafts" link. Tests: test_compose.py (parse/reply-target/RFC822+threading). Message construction unit-verified; the live drafts.create runs on the box. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 22:30:05 -05:00
Keysat	49f84ca9a4	outreach: per-user voice from own emails + transparency; active-thread context (v0.1.0:70) Voice upgrade. draft_outreach now learns the SENDER's voice: the codified rules PLUS a few-shot of that user's own recent sent emails (_voice_examples; from_email = the sender, de-identified in the same scrub batch as the recipient context, reference-only). The response returns which of the sender's emails were used (subject + date + recipient), shown in the UI as "Voice based on: …" — transparency to avoid the black-box problem. Falls back to rules-only with a clear note when the user has no captured sent email. Context restructured: _context groups the investor's email by thread and labels the most recent thread as the "Active conversation (what you are replying to)" with earlier emails as background, so replies stay on-topic instead of dredging old threads. Sender email resolved in handle_outreach_draft (users table by user_id). Test extended (active/background split, voice examples + meta, no-sender fallback). Fixed a UI bug the preview caught: the manual Draft button was onClick={draft}, which passed the click event as the investor arg after draft() gained params -> circular-JSON error; now onClick={()=>draft()}. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 22:06:38 -05:00
Keysat	787d580550	outreach: follow-up radar — deterministic "needs attention" + one-click draft (v0.1.0:69) The Outreach page now opens with a "Needs attention" list. A deterministic scan (outreach_agent.follow_up_radar) surfaces investors per the email history: tier 0 "you owe a reply" (their email is the most recent, unanswered, >=3d), tier 1 flagged + quiet, tier 2 warm lead gone quiet (no contact in >=45d). Most urgent first; every reason is verifiable from the data (no LLM in the surfacing — the deliberate fix for the trust problem that sank objection-grounding). Excludes graveyard; needs email history. One click sets the investor + suggested type (follow-up/nurture) and runs the existing outreach drafter. Route GET /api/outreach/radar. Test mcp/test_outreach.py extended (owe-reply/warm-quiet/recent/graveyard/order). Verified live in preview. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 21:31:52 -05:00
Keysat	b5619d61e1	outreach: Outreach Draft Assistant — tailored LP drafts (v0.1.0:68) First proactive-messaging build. New "Outreach" page (all authenticated users): pick an investor + type (intro / follow-up / fund update / meeting follow-up / nurture) + optional guidance; the agent drafts a tailored LP email in Ten31's voice, grounded in the thesis + that investor's CRM notes and matched email history. The draft is editable + copyable; nothing is sent (draft-only — guardrails #4, #6). Sovereignty: the thesis is Ten31's own non-sensitive messaging (to Claude as-is); the LP context is scrubbed through the redaction boundary before Claude, drafted with placeholders, and re-hydrated locally — the LP list never reaches the API. Fails closed (scrub_unavailable / claude_not_configured / rehydrate_failed quarantines a hallucinated-token draft). Backend: mcp/outreach_agent.py (context assembly + scrub + Claude + rehydrate, reusing architect_agent's client/thesis/voice + the Boundary); routes GET /api/outreach/investors, POST /api/outreach/draft; logged. Test mcp/test_outreach.py (context assembly). Verified in preview: page/selector/types/guidance render, fail-closed at the key-less Claude step (scrub ran locally first), success rendering verified with a mocked ok draft. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-08 20:06:46 -05:00
Keysat	6d6f4bcc7e	Thesis Workshop redesign: edit/choose/delete + approve-as-current (v0.1.0:56) Addresses Grant's feedback that the Workshop was confusing and underbuilt (no delete, no approve, redundant generate-vs-feedback panels, and a stray "0" on segment lines). Backend (architect_tools.py + server.py routes/handlers): - retire_node: soft-delete a node + its subtree (reversible). DELETE /api/thesis/nodes/{id}. - choose_variant: 'Use this' — keep this option, soft-delete the others in its group, mark it approved. POST /api/thesis/nodes/{id}/choose. - upsert_thesis_node gains actor_type so a manual human edit is recorded as 'human'. PUT /api/thesis/nodes/{id} edits a part's text directly. - handle_approve_line: one-click 'approve as current' — records this admin's approval on the line's in-review version (creating + submitting one from the live tree if none), promoting to canonical at the required distinct-approval count. POST /api/thesis/lines/{key}/approve. Frontend (ThesisWorkshop redesign): - Merged the redundant "Generate options" + "Give feedback" panels into one "Ask the Architect for options" box (revise was just generate-with-guidance). - Per option: Use this / Edit (inline) / Delete. Per part: edit + delete via the same. - "Approve as current" bar with dual-sign-off state + a "Current ✓" badge, and a one-line "how it works" hint. Refreshes the tree after every action. - Fixed the stray "0": `{line.is_core && <badge>}` rendered 0 for non-core lines (SQLite integer 0); now `{!!line.is_core && ...}`. Verified: backend test_thesis_actions.py (choose/edit/retire-subtree/dual-approval->canonical), and a live in-browser smoke test (JSX compiles, Workshop renders, options show Use/Edit/Delete, approve returns 1-of-2, no runtime errors). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 18:29:47 -05:00
Keysat	2e70b34592	Architect grounding boundary: redaction/re-hydration privacy gate (v0.1.0:55) Phase 1 Workstream D. Lets the Architect ground the thesis in REAL recurring LP objections without any LP identity reaching the Claude API. Layered, defense-in-depth, fail-closed by construction (docs/redaction-rehydration.md). backend/redaction/: - scrub.py: the leak-proof core. Drops Tier-1 (labelled/structured account/wire/SSN/IBAN/SWIFT/passport, separator-tolerant); tokenizes known LP entities (dictionary from the canonical layer, unicode-folded + hyphen-extended) and structured PII (emails, scheme-less/social URLs, intl+ext phones, currency-cued amounts, ISO/worded/numeric/quarter dates, addresses, bare long digit runs); pre-neutralizes injected [TYPE_N] strings; single-pass rehydrate; metadata-only audit logging (the pseudonym map is the de-anon key — local-only, never logged/sent). Hardened across THREE adversarial leak-hunts (worded/coded amounts, intl phones, NFD/ligature/zero-width names, slash/comma SSN, SWIFT, alpha-prefixed accounts, substance-preserving false-positive fixes). - client.py: Boundary — one scrub/rehydrate contract, SCRUB_BACKEND=local (default) or gateway (Spark Control /scrub + /rehydrate). Fails closed (db_path required; dictionary build errors propagate; strict rehydrate returns tokenized-not-de-anon text). - test_scrub_leak.py, test_reidentification.py: golden-file leak + re-identification suites (synthetic only, guardrail #9), regression-locking every leak-hunt vector. backend/mcp/architect_grounding.py: the flow — retrieve (local) -> minimize-first (local Qwen) -> scrub (+ local-Qwen NER backstop for unknown names) -> Claude over the de-identified register only -> re-hydrate locally -> human review. FAILS CLOSED if the local model is unreachable or a hallucinated token appears. test_grounding_boundary.py proves nothing sensitive reaches Claude and the three fail-closed paths. server.py: POST /api/architect/ground (admin) wires retrieval -> ground_objections. docker_entrypoint.sh: SCRUB_BACKEND (default local). docs/spark-control-scrub-endpoints.md: the gateway handover spec (Option 1 — caller supplies the entity dictionary). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 17:06:29 -05:00
Keysat	dd25bbc08d	Architect agent: Claude-powered thesis generation (backend scaffolding) - backend/mcp/architect_agent.py: generate_options + revise on Claude (prompt-cached thesis context, claude-opus-4-8, Ten31 voice rules). Writes N variant drafts to a node's variant group; nothing canonical without human approval. Fails gracefully if the API key / SDK is absent. - server.py endpoints: GET /api/architect/status, GET /api/thesis/{key}/tree, GET /api/thesis/nodes/{id}/variants, POST .../generate, POST .../feedback, POST /api/thesis/lines, POST /api/thesis/lines/{key}/nodes. architect_tools gains get_node_variants. - Dockerfile installs `anthropic`; docker_entrypoint loads ANTHROPIC_API_KEY from /data/secrets/anthropic-api-key (self-disabling until the key is dropped in). Full HTTP surface verified end-to-end (graceful 502 without a key). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 13:25:47 -05:00
Keysat	dd2c34d7bc	Phase 1: investor↔contacts (member_of), system status, thesis seed v1 - entity_resolution: emit member_of relationship edges (contact -> investor), so one investor entity owns many contacts (institution) and a HNWI is the N=1 case; crm_tools.get_investor_contacts + get_entity contacts/member_of; MCP tool. - seed_synthetic: multi-contact institutions to exercise it (Harbor & Vine = 5). - server.py: GET /api/system/status (index/entity/thesis/activity health) for an in-app status view (no shell needed to verify the index). - docs/thesis-seed-v1.md: grounded v1 thesis (throughline, 6 pillars, objections, per-segment angles, voice) drawn from Ten31's newsletter/site/essays. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 10:47:26 -05:00
Keysat	3e199fd8d5	Phase 1 Workstream A+E: thesis substrate + dual-approval gate - migration 0002_phase1_architect: thesis_lines (core spine + per-segment lines), thesis_nodes (+ append-only revisions), thesis_versions (one-canonical-per-line DB invariant), thesis_reviews (dual approval + feedback), segments. Reversible. - backend/mcp/architect_tools.py: agent draft tools (node tree, versions, segments, get_canonical fails-closed) — NO self-approval path. MCP-exposed. - backend/thesis_review.py + server.py routes: human-gated approval. Dual sign-off via thesis_required_approvals; atomic supersede; every action logged. - docs/PHASE_1.md (kickoff brief); docs/OPERATIONS.md (partner guide); start9/0.4 "Resolve duplicate names" fuzzy action. Verified on synthetic data: dual approval promotes correctly, exactly one canonical survives supersede, get_canonical fails closed, full interaction_log. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 10:20:00 -05:00
Keysat	c7ce44d963	Phase 0 foundation: canonical schema, ingest pipeline, CRM MCP server Workstream A–C substrate for the Ten31 agentic system: - A1: docs/crm-overview.md; CLAUDE.md conventions + guardrail #9 - A2: additive/reversible core migration (canonical_entities, entity_links, interaction_log, relationship_edges, soft-delete) + ledgered runner - B1/B3: chunking + deterministic entity resolution (backend/ingest) - B2: dense (bge-m3) + BM25 sparse ingest to Qdrant crm_chunks - C: CRM MCP server (reads, retrieval modes, logged writes) — no outbound tools - docs: redaction/re-hydration, Gmail enablement runbook - synthetic test data; .env.example; housekeeping (.gitignore, untrack crm.db, drop legacy files + start9/0.3.5) Verified end-to-end on synthetic data + live Sparks (hybrid > dense on entity queries). Real backfill runs on Ten31 infra; index holds synthetic data only. Branch snapshot also captures pre-existing working-tree changes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-05 08:13:35 -05:00

13 Commits
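
Note on aec2b7775b: the message describes the /assets/ fix as a realpath containment check that 404s anything resolving outside the frontend root. The repository's actual handler is not shown in this log, so the following is only a minimal sketch of that technique. The function name `resolve_asset` and its parameters are hypothetical, and the sketch assumes the caller has already stripped the `/assets/` prefix from the request path.

```python
import os
from urllib.parse import unquote


def resolve_asset(frontend_root, request_path):
    """Return the real path of a servable asset, or None if the caller should 404.

    Hypothetical helper, not the repository's code.
    """
    root = os.path.realpath(frontend_root)
    # Decode percent-escapes first so '%2e%2e' is treated like '..'.
    rel = unquote(request_path).lstrip("/")
    try:
        # realpath collapses '..' segments and follows symlinks.
        candidate = os.path.realpath(os.path.join(root, rel))
        inside = os.path.commonpath([root, candidate]) == root
    except ValueError:
        # Embedded NUL bytes or paths on different drives: treat as not found.
        return None
    if not inside or not os.path.isfile(candidate):
        return None
    return candidate
```

Comparing with `os.path.commonpath` rather than a plain string prefix avoids accepting a sibling directory such as `frontend_evil` when the root is `frontend`.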