Retire lp_profiles + LP Tracker; repoint Dashboard committed to the grid (v0.1.0:78)

The fundraising grid + email capture is the canonical system of record. lp_profiles was a superseded single-fund model with no reachable create/edit path, and the LP Tracker page was already orphaned (no nav entry + a redirect bouncing it to the grid). - Remove /api/lp-profiles* endpoints + handlers, the unused lp-breakdown report, the contact-dossier LP section, the demo-seed LP block, and (frontend) the LPTrackerPage component + its lp-tracker->fundraising-grid redirect. - Dashboard "Total Committed" now sums fundraising_investors.total_invested (graveyarded investors excluded) instead of the orphaned lp_profiles table, which read ~$0. "Total Funded" dropped: the grid tracks commitments, not a funded amount, and the frontend never rendered it. - Leave the empty lp_profiles table/index, the contact-delete soft-delete cascade, and the --reset-all-data clear in place (never-hard-delete). - Tests: add test_dashboard_report.py; update test_soft_delete_reads.py. 21/21 green.
2026-06-16 10:48:53 -05:00
parent 5cda84a7c0
commit 108210d8e1
8 changed files with 180 additions and 486 deletions
@@ -108,6 +108,38 @@ Have the CRM send a **daily digest email** summarizing each registered user's ac

 Open design questions (settled at build time): send time = **6 PM box-local** (configurable in the admin panel), covering the ~24h window up to send; empty days = **always send** with a "no activity" note; summary granularity = **one per-user narrative** plus a **by-investor structured section** (inbound + outbound, team-wide) added 2026-06-16; enable/time live in the **admin panel** (DB-backed), not StartOS actions.

+### Email/communication search + natural-language query
+*Requested 2026-06-16. Three increments, **sequenced 1 → 2 → 3** (1 and 2 first as a quick increment; 3 is a separate, larger build after). Origin: Grant asked whether we can query "emails sent to a specific investor" / "activity by user," and floated NL queries like "existing investors who have committed capital across our funds that we haven't emailed in a while."*
+
+**Context — the data is captured but currently has NO front-end.** The entire Gmail email schema (`emails`, `email_threads`, `email_investor_links`, `email_account_messages`, `email_activity_proposals`, …) exists and is populated by the DWD capture pipeline, but is surfaced **nowhere** in `frontend/index.html` today (only as inputs to the daily digest). So all three items below are about making already-captured data queryable/visible. Email bodies of *matched* emails are already chunked + embedded into Qdrant with `{lp_id, lp_name, doc_type:"email", date_ts}` metadata.
+
+**Caveat that shapes all three — the two-model join.** "Emails to an investor" link to the **fundraising grid** (`email_investor_links.fundraising_investor_id`); "committed capital" lives in the grid too (`fundraising_commitments`, multi-fund). But manually-logged `communications` and `lp_profiles` (single-fund) live in the **classic** model, and the two models are only bridged by fuzzy email/name matching (no authoritative join key). Any query spanning "committed capital" + "email recency" must reckon with this. Prefer the grid side as the higher-signal source (matcher already does).
+
+**1. Activity query endpoints + panel (do first).** The logic already exists and is tested inside `backend/digest_builder.py` — `collect_user_activity()` (per team-member, sent vs received, with matched investor names) and `collect_investor_activity()` (re-pivoted by investor, team-wide). Expose them as on-demand endpoints (e.g. `GET /api/activity?user_id=…&since=…&until=…` and `…?investor_id=…`) returning the actual records (not just the counts that `/api/reports/activity` gives today), plus a simple UI panel. Answers "emails to investor X" and "what has user Y sent lately" interactively. Small build — mostly assembling tested parts + a thin UI. Soft-delete filter every read.
+
+**2. Email content search box (do first, alongside 1).** Wire a search box onto the email bodies **already indexed in Qdrant** (capability is ~80% built — see the retrieval modes in `backend/ingest/search.py` and the MCP `hybrid_search`/`semantic_search`/`keyword_search` tools). This is semantic/lexical search over email *content* ("find where we discussed the mining deal"), distinct from the structured filters in item 1. Decide placement (global search bar vs. a dedicated email/search page — note there's no email UI at all today, so this may pair naturally with surfacing threads). Small.
+
+**3. Natural-language → safe structured query (separate, larger, after 1 & 2).** An LLM translates a plain-English question into a **safe, read-only** DB query against the CRM, for relational/analytical questions that semantic search *cannot* answer — Grant's example ("committed across funds AND not emailed in a while") is joins + aggregates + recency, not a text-topic match. Design constraints (locked at request time, refine at build):
+  - **LLM = Claude behind the redaction boundary** (better at text-to-SQL than local Qwen; the scrub→Claude→re-hydrate path already exists for the PII concern). Not Spark — Spark Control offers embeddings/rerank/RAG + local chat, but **no text-to-SQL**.
+  - **Safety is the hard part, not the parsing.** Do NOT hand the LLM open-ended SQL against the live DB (soft-delete leaks, injection, runaway scans). Constrain it: read-only connection/view, a curated/parameterized query surface or a validated query AST, soft-delete-filtered views, row/time caps. Treat as its own designed feature with its own tests.
+  - Must reckon with the two-model join caveat above (capital lives in the grid; recency from email links).
+
+### Consolidate on the fundraising grid as canonical; retire vestigial classic-CRM surfaces
+*Decided 2026-06-16. The CRM carries two stacked models: the original generic CRM (contacts / lp_profiles / opportunities / manual communications) and the fundraising grid + email capture. The team uses the grid; most classic surfaces are un-adopted (verified on the box: Pipeline + Communications empty, Contacts auto-populated from the grid). **Decision: the fundraising grid + email capture is the canonical system of record;** prune or repurpose the rest rather than maintain a parallel half-empty CRM.*
+
+**Retire `lp_profiles` + LP Tracker — DONE in code (2026-06-16), not yet deployed.** 21/21 backend tests green, `py_compile` clean. Needs a version bump + install to reach the box.
+- Removed the orphaned `LPTrackerPage` component + the `lp-tracker`→`fundraising-grid` redirect (frontend).
+- Removed the `/api/lp-profiles*` endpoints (list/get/create/update) and their handlers, the unused `lp-breakdown` report + route, the contact-dossier LP display (frontend + the `lp_profile` block in `handle_get_contact`), and the demo-seed LP block.
+- **Dashboard KPIs repointed:** "Total Committed" now sums `fundraising_investors.total_invested` (the canonical grid rollup), **excluding graveyarded investors** so the headline reflects live committed capital — a deliberate divergence from `/api/fundraising/relational-summary`, which sums all rows. "Total Funded" dropped — the grid has no funded-vs-committed concept and the frontend never rendered it. (If a funded/wired status is wanted later, that's a new grid feature, not a revival of lp_profiles.) Regression-guarded by `test_dashboard_report.py`.
+- **Left in place (intentional):** the empty `lp_profiles` table + index (no destructive drop, per never-hard-delete); the contact-delete soft-delete cascade; the `--reset-all-data` clear; and the inert MOCK_MODE `mockDb.lp_profiles` fixtures (dev-only fallback, never hits the backend — its dashboard mock still reads mock lp_profiles, a known dev-only divergence from the real backend). Updated `test_soft_delete_reads.py` to drop the now-removed `lp_profile` assertions (kept its org `total_funded` opportunities-aggregate checks).
+
+**Adopt the Pipeline — wire it to the grid.**
+- Pipeline (`opportunities`) is fully built and functional but unused. Keep it: it's the one classic surface that tracks something the grid doesn't — a forward-looking deal funnel (stage, `expected_amount × probability`, owner, close date) vs. the grid's actual committed dollars + flags.
+- New idea (Grant, 2026-06-16): let users **flag an investor in the grid as a pipeline opportunity** (a grid column/control) so it **auto-creates / syncs an `opportunities` row** that loads into the Pipeline board. Design the grid↔pipeline link (which fund seeds it? what sets stage/expected amount? keep them reconciled). Turns Pipeline from a disconnected second data-entry surface into a view driven by the canonical grid.
+- Revisit the stray contact-create side-door (the "Create Opportunity" modal `POST /api/contacts`, `frontend/index.html:6030`) once the grid-driven flow exists.
+
+**Keep the Contacts table — as the read-only per-person directory it already is.** Confirmed 2026-06-16: the grid models **investor entity → many people** correctly today. The grid "contacts" column is a multi-pill editor; each pill syncs to a `fundraising_contacts` row AND its own classic `contacts` row (5-person family office → 1 investor + 5 contacts, linked via `fundraising_contacts.contact_id`, migration 0004). The Contacts page is **read-only for creation** (header: "added from the Fundraising Grid"; no New-Contact button), edit-only via the detail slide-over — the desired flow already holds. Email capture already rolls **multiple people up to one investor** (matcher indexes each pill's email separately, all → same `fundraising_investor_id`; `email_investor_links` records both investor and specific person). No build here — future email-surfacing UI should present comms grouped by investor across all its people.
+
 ## Definition of done for "Airtable substitute" v1
 - Team can manage all investors in one master table
 - Saved views replicate current Airtable workflows