Venture CRM Roadmap (Airtable Replacement)

Current status

  • Premium Airtable-like frontend grid exists and is actively iterating.
  • Backend now has production-grade APIs for:
    • GET /api/fundraising/state
    • PUT /api/fundraising/state (with optimistic version check)
    • GET /api/fundraising/export
    • POST /api/fundraising/backup
    • POST /api/fundraising/restore-preview
    • POST /api/fundraising/restore
    • GET /api/fundraising/backups
    • GET/PATCH /api/fundraising/backup-policy
    • GET /api/fundraising/relational-summary
    • GET /api/feature-requests
    • POST /api/feature-requests
    • PATCH /api/feature-requests/:id
  • New DB tables:
    • fundraising_state
    • fundraising_investors
    • fundraising_contacts
    • fundraising_funds
    • fundraising_commitments
    • fundraising_views
    • feature_requests
    • app_settings
  • Grid saves/restores now sync into relational fundraising tables automatically.
  • Formula engine is now sandboxed (no eval/new Function) with expanded function support.
  • Automation engine v1 added:
    • Rule table + toggle API
    • List memberships (main, follow_up, graveyard, longshot, all)
    • Automation run log
  • Collaboration/reliability additions:
    • Unified activity feed API (audit + automation + backup)
    • Backup integrity verification API
    • Better version-conflict metadata (updated_at, updated_by)
  • Security hardening additions:
    • Basic IP rate limiting (login and write APIs)
    • Configurable CORS origin (CRM_CORS_ORIGIN)
    • Production secret enforcement (CRM_ENV=production requires CRM_SECRET_KEY)
    • Security status API + go-live checklist (SECURITY.md)

Phase 1 (Production foundation)

  1. Persist grid + views on backend
  • Wire frontend fundraising grid reads/writes to /api/fundraising/state.
  • Keep localStorage only as emergency fallback.
  • Add autosave debounce and conflict handling (expected_version).
  2. Admin-invite auth model
  • Disable self-register for non-admin users.
  • Add admin-only invite/create-user endpoint.
  • Keep role model: admin, member.
  3. Deployment and remote access
  • Add docker-compose for one-command launch.
  • Reverse proxy + TLS option (Caddy/Traefik) for non-Tailscale deployments.
  • Recommended for your use case: Tailscale private access to laptop host.
  4. Data safety and operations
  • Automated nightly SQLite backups and restore test script.
  • Add /api/fundraising/export for JSON snapshot export.
  • Add health/readiness checks.
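
The expected_version conflict handling in item 1 could work roughly as follows. This is a minimal sketch, not the actual backend handler: the fundraising_state column names (payload, version, updated_at, updated_by) and the helper names make_demo_db and put_state are assumptions made for illustration.

```python
import json
import sqlite3


def make_demo_db():
    """In-memory stand-in for the real database (column names are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fundraising_state ("
        " id INTEGER PRIMARY KEY, payload TEXT, version INTEGER,"
        " updated_at TEXT, updated_by TEXT)"
    )
    conn.execute(
        "INSERT INTO fundraising_state"
        " VALUES (1, '{}', 1, CURRENT_TIMESTAMP, 'seed')"
    )
    conn.commit()
    return conn


def put_state(conn, payload, expected_version, user):
    """Write the grid state only if the caller saw the latest version."""
    cur = conn.execute(
        "UPDATE fundraising_state"
        " SET payload = ?, version = version + 1,"
        "     updated_at = CURRENT_TIMESTAMP, updated_by = ?"
        " WHERE id = 1 AND version = ?",
        (json.dumps(payload), user, expected_version),
    )
    conn.commit()
    if cur.rowcount == 1:
        return {"ok": True, "version": expected_version + 1}
    # Stale write: report who holds the newer version so the UI can prompt.
    version, updated_at, updated_by = conn.execute(
        "SELECT version, updated_at, updated_by"
        " FROM fundraising_state WHERE id = 1"
    ).fetchone()
    return {
        "ok": False,
        "conflict": {
            "version": version,
            "updated_at": updated_at,
            "updated_by": updated_by,
        },
    }
```

Putting the version comparison inside the UPDATE's WHERE clause keeps the check and the write atomic, so two autosaves racing on the same version cannot both succeed.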

Phase 2 (Airtable parity)

  1. Advanced views
  • Multi-condition filter groups (AND/OR groups)
  • Multi-column sorting
  • Pinned/frozen columns
  • Personal vs shared views
  2. Formula engine v2
  • Add functions: SUM, MIN, MAX, ROUND, ABS, CONCAT (done)
  • Type-aware formulas and better errors
  • Dependency graph and recalculation rules
  3. Activity + audit
  • Record-level change history in UI
  • Last modified by / at fields
  • Restore archived rows
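
For the "dependency graph and recalculation rules" item under Formula engine v2, one possible shape is a topological ordering of formula columns. The sketch below is illustrative only: Python is used to match the backend, the real formula engine may live elsewhere, and the column names and the helpers recalc_order, affected_by and find_cycle are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def recalc_order(deps):
    """Order columns so every formula comes after the columns it reads.

    deps maps each formula column to the columns it depends on.
    """
    return list(TopologicalSorter(deps).static_order())


def affected_by(deps, changed):
    """Formula columns to recompute, in order, after `changed` is edited."""
    dirty = {changed}
    result = []
    for col in recalc_order(deps):
        # A formula is dirty if any column it reads is dirty.
        if col in deps and dirty & set(deps[col]):
            dirty.add(col)
            result.append(col)
    return result


def find_cycle(deps):
    """Return the columns forming a circular reference, or None."""
    try:
        recalc_order(deps)
    except CycleError as exc:
        return exc.args[1]
    return None
```

With deps = {"total": ["fund_1", "fund_2"], "share": ["total", "target"]}, editing fund_1 marks total and then share for recalculation, while editing target touches only share.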

Phase 3 (Team workflow and automation)

  1. Tasks/reminders tied to investors/contacts
  2. Automation rules (graveyard/follow-up triggers)
  3. Email/communication integrations (optional)
  4. Granular permissions (if team grows)

Backlog (post-Phase-1 agentic)

Daily activity digest (email to the team)

Requested 2026-06-15. Phase A deployed (v0.1.0:76). Phase B deployed & verified live in v0.1.0:77 (2026-06-16) — digest content + Spark summarization + daily scheduler + by-investor section + admin-panel control + on-demand send. Auto-send defaults OFF until an admin enables it in Settings → Admin.

Decisions (locked 2026-06-15): recipients = all active admins; summarization = Spark-LLM narrative (never Claude — un-anonymized substance stays local); granularity = grouped by user (→ per investor).

Send transport (DECIDED 2026-06-15): Gmail domain-wide delegation (not SMTP). The box's existing service-account grant (which powers email capture) includes gmail.compose, which authorizes users.messages.send; this was verified by a token-mint probe and a live messages.send to Grant. So the digest sends through the account the CRM already uses: no app password, no new account, no admin change. The narrow gmail.send scope is not granted, so the sender must request gmail.compose.

Phase A — DONE: (v0.1.0:75) configureDigestSmtp Start9 action + docker_entrypoint.sh SMTP_* export + backend/smtp_send.py + admin POST /api/admin/digest/test-email (recipient-restricted to the admin set — not an open relay) + Settings button. (v0.1.0:76, redeploy pending) backend/email_integration/gmail_send.py (users.messages.send via DWD/compose) + backend/digest_mailer.py (Gmail-DWD preferred, SMTP fallback); the endpoint + button route through it; sender = CRM_DIGEST_SENDER else first active admin. Tests: test_smtp_send.py, test_smtp_endpoint.py, test_gmail_send.py.

Phase B (DONE, 2026-06-15/16): backend/digest_builder.py builds two sections: by team member (per-user Spark narrative + both directions, with a deterministic fallback) and by investor (team-wide, inbound + outbound, deduped per email, structured). Soft-delete filtered throughout. backend/email_integration/digest_scheduler.py is an always-on daily thread that re-reads a DB-backed policy each cycle and sends once/day at the configured hour to all active admins (window cursor in app_settings). Control moved out of env into the admin panel: app_settings.digest_policy + GET/PATCH /api/admin/digest/policy + a Settings → Admin enable toggle + send-time dropdown (env vars only seed the first-boot default). Plus admin POST /api/admin/digest/send-now + a "Send Digest Now" button. Decisions settled: 6 PM default, always-send (empty-day note), per-user narrative + by-investor section, in-app control (not StartOS). Tests: backend/test_digest_builder.py. Detail: docs/guides/email.md.
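
The once-per-day decision the scheduler thread makes each cycle can be sketched as a pure function. This is an assumption-laden illustration, not the code in digest_scheduler.py: the policy keys ("enabled", "hour") and the use of a last-sent date as the cursor are hypothetical.

```python
from datetime import date, datetime


def should_send(now, policy, last_sent):
    """Decide whether the digest is due on this scheduler cycle.

    now       -- current box-local datetime
    policy    -- dict re-read from the database each cycle (keys assumed)
    last_sent -- date of the most recent send, or None if never sent
    """
    if not policy.get("enabled", False):
        return False  # auto-send defaults to OFF
    if now.hour < policy.get("hour", 18):
        return False  # too early today (6 PM default)
    # At most one send per calendar day.
    return last_sent is None or last_sent < now.date()
```

Because the policy is passed in on every call, a change made in the admin panel takes effect on the next cycle without restarting the thread.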

Have the CRM send a daily digest email summarizing each registered user's activity — primarily who emailed which investors and the substance of those emails — to the fund principal (and eventually other admins). Scales with the synced-user count: 2 users synced today, ~5 eventually.

  • Source data: the captured email-activity already flowing through the Gmail DWD propose→approve pipeline (backend/email_integration/), keyed per registered user → per investor/contact. Optionally fold in other CRM activity (audit feed, automation runs, new opportunities) later.
  • Send path is NEW capability. Today nothing leaves the box: the system only captures Gmail and creates drafts. This needs outbound SMTP.
    • StartOS 0.4 has a system-wide SMTP account (since v0.4.0-beta.9): the user configures it once for the whole server and services read it via sdk.getSystemSmtp(effects).const(), which returns a T.SmtpValue (host, port, from, username, password, security). Wire the digest sender to that rather than hardcoding any account.
    • Implementation path (researched 2026-06-15, our SDK pin ^0.4.0-beta.66): model a manageSMTP action on gitea-startos / vaultwarden-startos, that is, a three-way selection (system / custom / disabled) built on sdk.inputSpecConstants.smtpInputSpec, persisted to storeJson, with main.ts injecting SMTP_HOST/PORT/USER/PASS/FROM/SECURITY env vars into the daemon exec block (same shape as the existing setAnthropicApiKey.ts action). The Python sender reads them via os.environ and opens smtplib.SMTP/SMTP_SSL.
    • "Custom SMTP" is a dedicated per-package account, fully independent of the server's system SMTP: the custom branch never calls getSystemSmtp, so the digest can send through its own provider even on a box with no system account configured (confirmed in both reference packages). This is the likely fit here: a digest-only mailbox separate from anyone's Gmail.
    • Note StartOS 0.4 dropped the old Config/Properties manifest spec: SMTP config is an action + storeJson, not a manifest config field.
    • SDK note (verified 2026-06-15): our pin ^0.4.0-beta.66 resolves to exactly 0.4.0-beta.66 (caret on a prerelease stays within the 0.4.0 tuple). Its SMTP surface (getSystemSmtp → T.SmtpValue {host, port, from, username, password, security}; inputSpecConstants.smtpInputSpec with providers gmail/ses/sendgrid/mailgun/protonmail/other and selection disabled/system/custom; smtpShape; smtpPrefill) is byte-identical to the 1.5.3 reference packages (verified from published tarballs; repo node_modules is absent). Build against beta.66 as-is: no SDK bump needed (moving to 1.x is a major-track change with broad blast radius across startos/, and nothing about SMTP justifies it).
  • Analysis runs on Spark, never Claude. The digest is deliberately un-anonymized (real LP names + email substance), so any summarization/analysis must go through Spark Control to local models — this is the one path that intentionally bypasses the scrub→Claude→re-hydrate boundary, because keeping the substance local is the whole point. Never route digest content to Claude.
  • Exempt from "agents draft, humans send." That rule governs outward LP/prospect contact. This is an internal ops digest to the team's own inboxes — a different category — so an automated daily send here does not violate the draft-only guardrail. State this explicitly at build time.
  • Scheduling: a daily cron, naturally co-located with the existing backend/email_integration/scheduler.py sync cadence.
  • Soft-delete: every aggregate/read in the digest must filter deleted_at IS NULL (see the standing soft-delete rule).
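
The env-var SMTP path described in the bullets above might look like the sketch below. It is not the contents of backend/smtp_send.py: the SMTP_* variable names follow the list in the text (SMTP_HOST/PORT/USER/PASS/FROM/SECURITY), but the SMTP_SECURITY values ("tls" for implicit TLS, anything else treated as STARTTLS), the default ports, and the function names are assumptions.

```python
import os
import smtplib
from email.message import EmailMessage


def smtp_settings(env=os.environ):
    """Collect SMTP_* variables injected by the service wrapper."""
    security = env.get("SMTP_SECURITY", "starttls").lower()  # values assumed
    default_port = 465 if security == "tls" else 587
    return {
        "host": env["SMTP_HOST"],
        "port": int(env.get("SMTP_PORT", default_port)),
        "user": env.get("SMTP_USER", ""),
        "password": env.get("SMTP_PASS", ""),
        "sender": env["SMTP_FROM"],
        "security": security,
    }


def build_message(settings, recipients, subject, body):
    """Assemble a plain-text digest message."""
    msg = EmailMessage()
    msg["From"] = settings["sender"]
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(settings, msg):
    """Open SMTP_SSL for implicit TLS, otherwise SMTP upgraded via STARTTLS."""
    if settings["security"] == "tls":
        client = smtplib.SMTP_SSL(settings["host"], settings["port"], timeout=30)
    else:
        client = smtplib.SMTP(settings["host"], settings["port"], timeout=30)
    with client:
        if settings["security"] != "tls":
            client.starttls()
        if settings["user"]:
            client.login(settings["user"], settings["password"])
        client.send_message(msg)
```

Restricting the recipients passed to build_message to the active admin set, as the test-email endpoint does, keeps this from acting as an open relay.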

Open design questions (settled at build time): send time = 6 PM box-local (configurable in the admin panel), covering the ~24h window up to send; empty days = always send with a "no activity" note; summary granularity = one per-user narrative plus a by-investor structured section (inbound + outbound, team-wide) added 2026-06-16; enable/time live in the admin panel (DB-backed), not StartOS actions.

Email/communication search + natural-language query

Requested 2026-06-16. Three increments, sequenced 1 → 2 → 3 (1 and 2 first as a quick increment; 3 is a separate, larger build after). Origin: Grant asked whether we can query "emails sent to a specific investor" / "activity by user," and floated NL queries like "existing investors who have committed capital across our funds that we haven't emailed in a while."

Context — the data is captured but currently has NO front-end. The entire Gmail email schema (emails, email_threads, email_investor_links, email_account_messages, email_activity_proposals, …) exists and is populated by the DWD capture pipeline, but is surfaced nowhere in frontend/index.html today (only as inputs to the daily digest). So all three items below are about making already-captured data queryable/visible. Email bodies of matched emails are already chunked + embedded into Qdrant with {lp_id, lp_name, doc_type:"email", date_ts} metadata.

Caveat that shapes all three — the two-model join. "Emails to an investor" link to the fundraising grid (email_investor_links.fundraising_investor_id); "committed capital" lives in the grid too (fundraising_commitments, multi-fund). But manually-logged communications and lp_profiles (single-fund) live in the classic model, and the two models are only bridged by fuzzy email/name matching (no authoritative join key). Any query spanning "committed capital" + "email recency" must reckon with this. Prefer the grid side as the higher-signal source (matcher already does).

1. Activity query endpoints + panel (do first). The logic already exists and is tested inside backend/digest_builder.py: collect_user_activity() (per team-member, sent vs received, with matched investor names) and collect_investor_activity() (re-pivoted by investor, team-wide). Expose them as on-demand endpoints (e.g. GET /api/activity?user_id=…&since=…&until=… and …?investor_id=…) returning the actual records (not just the counts that /api/reports/activity gives today), plus a simple UI panel. Answers "emails to investor X" and "what has user Y sent lately" interactively. Small build: mostly assembling tested parts + a thin UI. Soft-delete filter every read.
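
A by-investor read behind such an endpoint could be sketched as below. This does not reproduce collect_investor_activity(): the table names emails and email_investor_links and the column fundraising_investor_id come from this document, while every other column (direction, subject, sent_at, deleted_at on both tables) and the helper names are assumptions.

```python
import sqlite3


def demo_db():
    """In-memory tables with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE emails (id INTEGER PRIMARY KEY, direction TEXT,
                             subject TEXT, sent_at TEXT, deleted_at TEXT);
        CREATE TABLE email_investor_links (email_id INTEGER,
                             fundraising_investor_id INTEGER, deleted_at TEXT);
        INSERT INTO emails VALUES
            (1, 'outbound', 'Q2 update', '2026-06-10', NULL),
            (2, 'inbound', 'Re: Q2 update', '2026-06-11', '2026-06-12'),
            (3, 'outbound', 'Old intro', '2026-01-05', NULL),
            (4, 'outbound', 'Other LP', '2026-06-10', NULL);
        INSERT INTO email_investor_links VALUES
            (1, 7, NULL), (2, 7, NULL), (3, 7, NULL), (4, 8, NULL);
        """
    )
    return conn


def emails_for_investor(conn, investor_id, since, until):
    """Return matching email records, skipping soft-deleted rows."""
    return conn.execute(
        """
        SELECT e.id, e.direction, e.subject, e.sent_at
        FROM emails e
        JOIN email_investor_links l ON l.email_id = e.id
        WHERE l.fundraising_investor_id = ?
          AND e.sent_at >= ? AND e.sent_at < ?
          AND e.deleted_at IS NULL
          AND l.deleted_at IS NULL
        ORDER BY e.sent_at
        """,
        (investor_id, since, until),
    ).fetchall()
```

The soft-delete filter is applied to both the email and the link row, since either one can be deleted independently.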

2. Email content search box (do first, alongside 1). Wire a search box onto the email bodies already indexed in Qdrant (capability is ~80% built — see the retrieval modes in backend/ingest/search.py and the MCP hybrid_search/semantic_search/keyword_search tools). This is semantic/lexical search over email content ("find where we discussed the mining deal"), distinct from the structured filters in item 1. Decide placement (global search bar vs. a dedicated email/search page — note there's no email UI at all today, so this may pair naturally with surfacing threads). Small.

3. Natural-language → safe structured query (separate, larger, after 1 & 2). An LLM translates a plain-English question into a safe, read-only DB query against the CRM, for relational/analytical questions that semantic search cannot answer — Grant's example ("committed across funds AND not emailed in a while") is joins + aggregates + recency, not a text-topic match. Design constraints (locked at request time, refine at build):

  • LLM = Claude behind the redaction boundary (better at text-to-SQL than local Qwen; the scrub→Claude→re-hydrate path already exists for the PII concern). Not Spark — Spark Control offers embeddings/rerank/RAG + local chat, but no text-to-SQL.
  • Safety is the hard part, not the parsing. Do NOT hand the LLM open-ended SQL against the live DB (soft-delete leaks, injection, runaway scans). Constrain it: read-only connection/view, a curated/parameterized query surface or a validated query AST, soft-delete-filtered views, row/time caps. Treat as its own designed feature with its own tests.
  • Must reckon with the two-model join caveat above (capital lives in the grid; recency from email links).
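
One way to realize the "curated/parameterized query surface" constraint is to have the LLM emit only a query name plus parameters, which the backend validates against an allowlist before running on a read-only connection. The sketch below is one possible design under stated assumptions: the plan format, the QUERIES registry, the MAX_ROWS cap, and all column names other than fundraising_investor_id and total_invested are hypothetical.

```python
import sqlite3

MAX_ROWS = 200  # hard cap on returned rows (value assumed)

# The only SQL the model can trigger; it never writes SQL itself.
QUERIES = {
    "stale_committed_investors": {
        "params": {"cutoff": str},
        "sql": """
            SELECT i.id, i.name
            FROM fundraising_investors i
            WHERE i.deleted_at IS NULL
              AND i.total_invested > 0
              AND NOT EXISTS (
                  SELECT 1
                  FROM email_investor_links l
                  JOIN emails e ON e.id = l.email_id
                  WHERE l.fundraising_investor_id = i.id
                    AND e.deleted_at IS NULL
                    AND e.sent_at >= :cutoff)
            ORDER BY i.name
            LIMIT :limit
        """,
    },
}


def lock_read_only(conn):
    """Make SQLite reject writes on this connection."""
    conn.execute("PRAGMA query_only = ON")


def run_plan(conn, plan):
    """Validate an LLM-produced plan, then run the matching curated query."""
    spec = QUERIES.get(plan.get("query"))
    if spec is None:
        raise ValueError("unknown query")
    params = plan.get("params", {})
    if set(params) != set(spec["params"]):
        raise ValueError("unexpected or missing parameters")
    for name, expected_type in spec["params"].items():
        if not isinstance(params[name], expected_type):
            raise ValueError(f"bad type for {name}")
    return conn.execute(spec["sql"], {**params, "limit": MAX_ROWS}).fetchall()
```

Here Grant's example maps to a single named query, so the soft-delete filters and the row cap are fixed in reviewed SQL rather than left to the model.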

Consolidate on the fundraising grid as canonical; retire vestigial classic-CRM surfaces

Decided 2026-06-16. The CRM carries two stacked models: the original generic CRM (contacts / lp_profiles / opportunities / manual communications) and the fundraising grid + email capture. The team uses the grid; most classic surfaces are un-adopted (verified on the box: Pipeline + Communications empty, Contacts auto-populated from the grid). Decision: the fundraising grid + email capture is the canonical system of record; prune or repurpose the rest rather than maintain a parallel half-empty CRM.

Retire lp_profiles + LP Tracker: DONE & deployed live (v0.1.0:78, 2026-06-16). 21/21 backend tests green, py_compile clean; installed to the box (installed-version 0.1.0:78, migration chain …77→78 clean, server up on :8080).

  • Removed the orphaned LPTrackerPage component + the lp-tracker → fundraising-grid redirect (frontend).
  • Removed the /api/lp-profiles* endpoints (list/get/create/update) and their handlers, the unused lp-breakdown report + route, the contact-dossier LP display (frontend + the lp_profile block in handle_get_contact), and the demo-seed LP block.
  • Dashboard KPIs repointed: "Total Committed" now sums fundraising_investors.total_invested (the canonical grid rollup), excluding graveyarded investors so the headline reflects live committed capital — a deliberate divergence from /api/fundraising/relational-summary, which sums all rows. "Total Funded" dropped — the grid has no funded-vs-committed concept and the frontend never rendered it. (If a funded/wired status is wanted later, that's a new grid feature, not a revival of lp_profiles.) Regression-guarded by test_dashboard_report.py.
  • Left in place (intentional): the empty lp_profiles table + index (no destructive drop, per never-hard-delete); the contact-delete soft-delete cascade; the --reset-all-data clear; and the inert MOCK_MODE mockDb.lp_profiles fixtures (dev-only fallback, never hits the backend — its dashboard mock still reads mock lp_profiles, a known dev-only divergence from the real backend). Updated test_soft_delete_reads.py to drop the now-removed lp_profile assertions (kept its org total_funded opportunities-aggregate checks).
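
The divergence between the dashboard "Total Committed" and the relational-summary total can be illustrated with two small queries. This is a sketch only: fundraising_investors.total_invested is named in this document, but the list_name column used to mark graveyarded investors, the deleted_at filter on the summary query, and the function names are assumptions.

```python
import sqlite3


def demo_db():
    """In-memory table with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE fundraising_investors (
            id INTEGER PRIMARY KEY, name TEXT, total_invested INTEGER,
            list_name TEXT, deleted_at TEXT);
        INSERT INTO fundraising_investors VALUES
            (1, 'Alpha LP', 100, 'main', NULL),
            (2, 'Beta LP', 50, 'graveyard', NULL),
            (3, 'Gamma LP', 25, 'follow_up', NULL);
        """
    )
    return conn


def dashboard_total_committed(conn):
    """Headline KPI: live committed capital, graveyard excluded."""
    return conn.execute(
        "SELECT COALESCE(SUM(total_invested), 0)"
        " FROM fundraising_investors"
        " WHERE deleted_at IS NULL AND list_name != 'graveyard'"
    ).fetchone()[0]


def relational_summary_total(conn):
    """Summary-style total: every investor row, whatever its list."""
    return conn.execute(
        "SELECT COALESCE(SUM(total_invested), 0)"
        " FROM fundraising_investors"
        " WHERE deleted_at IS NULL"
    ).fetchone()[0]
```

With the demo rows, the dashboard figure is 125 and the summary figure is 175; the 50 difference is the graveyarded investor.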

Adopt the Pipeline — wire it to the grid.

  • Pipeline (opportunities) is fully built and functional but unused. Keep it: it's the one classic surface that tracks something the grid doesn't — a forward-looking deal funnel (stage, expected_amount × probability, owner, close date) vs. the grid's actual committed dollars + flags.
  • New idea (Grant, 2026-06-16): let users flag an investor in the grid as a pipeline opportunity (a grid column/control) so it auto-creates / syncs an opportunities row that loads into the Pipeline board. Design the grid↔pipeline link (which fund seeds it? what sets stage/expected amount? keep them reconciled). Turns Pipeline from a disconnected second data-entry surface into a view driven by the canonical grid.
  • Revisit the stray contact-create side-door (the "Create Opportunity" modal POST /api/contacts, frontend/index.html:6030) once the grid-driven flow exists.

Keep the Contacts table — as the read-only per-person directory it already is. Confirmed 2026-06-16: the grid models investor entity → many people correctly today. The grid "contacts" column is a multi-pill editor; each pill syncs to a fundraising_contacts row AND its own classic contacts row (5-person family office → 1 investor + 5 contacts, linked via fundraising_contacts.contact_id, migration 0004). The Contacts page is read-only for creation (header: "added from the Fundraising Grid"; no New-Contact button), edit-only via the detail slide-over — the desired flow already holds. Email capture already rolls multiple people up to one investor (matcher indexes each pill's email separately, all → same fundraising_investor_id; email_investor_links records both investor and specific person). No build here — future email-surfacing UI should present comms grouped by investor across all its people.
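
The roll-up of several people to one investor can be pictured as a lookup keyed by each person's email address. The sketch below is a simplified illustration, not the actual matcher: the function names and the tuple layout are hypothetical, and the real matcher may also use fuzzy name matching.

```python
def build_email_index(contacts):
    """Index each person's email separately.

    contacts -- iterable of (email, fundraising_investor_id, contact_id)
    """
    index = {}
    for email, investor_id, contact_id in contacts:
        index[email.strip().lower()] = (investor_id, contact_id)
    return index


def match(index, address):
    """Return (investor_id, contact_id) for an address, or None."""
    return index.get(address.strip().lower())


# A five-person family office would contribute five entries that all
# share one investor id; two are shown here.
family_office = [
    ("ann@example.com", 42, 501),
    ("bob@example.com", 42, 502),
]
index = build_email_index(family_office)
```

Returning both ids mirrors how email_investor_links records the investor and the specific person, which lets a future UI group comms by investor while still showing who was on each thread.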

Definition of done for "Airtable substitute" v1

  • Team can manage all investors in one master table
  • Saved views replicate current Airtable workflows
  • CSV import from Airtable is reliable and repeatable
  • Data persists safely and supports multi-user access
  • Auth is invite-only and backups are automated