Workstream A–C substrate for the Ten31 agentic system:
- A1: docs/crm-overview.md; CLAUDE.md conventions + guardrail #9
- A2: additive/reversible core migration (canonical_entities, entity_links, interaction_log, relationship_edges, soft-delete) + ledgered runner
- B1/B3: chunking + deterministic entity resolution (backend/ingest)
- B2: dense (bge-m3) + BM25 sparse ingest to Qdrant crm_chunks
- C: CRM MCP server (reads, retrieval modes, logged writes); no outbound tools
- docs: redaction/re-hydration, Gmail enablement runbook
- synthetic test data; .env.example; housekeeping (.gitignore, untrack crm.db, drop legacy files + start9/0.3.5)

Verified end-to-end on synthetic data + live Sparks (hybrid > dense on entity queries). Real backfill runs on Ten31 infra; index holds synthetic data only. Branch snapshot also captures pre-existing working-tree changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CRM Overview — Storage, Data Model, API, Auth
Workstream A1 deliverable (see PHASE_0.md). Read-only documentation of the existing CRM as of 2026-06. Every concrete claim is anchored to file:line. This is a description of what exists today, not a proposal — the schema-extension proposal for A2 lives separately.
0. TL;DR for Phase 0
- One Python file, no framework. The whole backend is backend/server.py (~4,530 lines): a stdlib http.server.ThreadingHTTPServer with a hand-written CRMHandler(BaseHTTPRequestHandler) and manual path dispatch. requirements.txt lists FastAPI/SQLAlchemy/Alembic/Pydantic but none are imported; they are vestigial.
- Storage is one SQLite file (data/crm.db), WAL mode, opened fresh per request. Schema is created idempotently in-code at boot. There is no Alembic; "migrations" are CREATE TABLE IF NOT EXISTS + best-effort ALTER TABLE ADD COLUMN.
- Two parallel investor data models coexist with no shared key: (1) the classic contacts / organizations / opportunities / communications / lp_profiles CRM, and (2) the newer, actively-used fundraising_* collaborative grid. They are bridged only by fuzzy name/email matching. This duality is the central entity-resolution problem for Phase 0.
- A real Gmail subsystem (backend/email_integration/) stores threaded correspondence in crm.db and matches emails to investors, but is self-disabling (off unless a service-account key is present).
- Auth is a single scheme: username/password → HS256 JWT (Bearer header), re-validated against the users table each request; two roles (admin/member). The X_API_KEY named in CLAUDE.md/PHASE_0.md does not exist in the code; it is aspirational.
- Guardrail flags: all deletes are hard deletes (violates guardrail #3 as written); a destructive POST /api/admin/reset-all-data exists; audit_log is mutation-only and is not the append-only interaction log Phase 0 wants.
1. Storage engine & where it runs
1.1 Runtime
- Server: ThreadingHTTPServer((HOST, PORT), CRMHandler), daemon_threads = True, serve_forever() (backend/server.py:4509). Handler class at backend/server.py:1418. Pure Python stdlib (http.server); not FastAPI/uvicorn despite backend/requirements.txt:1-2.
- Concurrency model: one OS thread per request. Safe because each request opens its own short-lived SQLite connection under WAL (rationale documented inline at backend/server.py:4506-4508).
- Request lifecycle: banned-IP check → per-IP rate limit → (email module hook) → manual if path == … dispatch. Body is read once and cached on the handler instance (get_body(), backend/server.py:1433). Malformed JSON silently becomes {}.
- Background threads: a backup scheduler loops every 60 s (start_backup_scheduler, backend/server.py:1367); an optional Gmail sync scheduler starts only if CRM_GMAIL_INTEGRATION_ENABLED is truthy (backend/server.py:4498).
- Default bind: 0.0.0.0:8080, plain HTTP. TLS is expected to be terminated upstream (Start9 / Tailscale).
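The per-IP rate-limit step in the request lifecycle above can be pictured with a small sliding-window limiter. This is a minimal sketch, not the code in backend/server.py: the class name, its methods, and the lock are assumptions for illustration; only the 20/min login limit comes from §4.

```python
import threading
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `limit` hits per window per IP."""

    def __init__(self, limit, window_sec=60.0):
        self.limit = limit
        self.window_sec = window_sec
        self._hits = defaultdict(deque)  # ip -> timestamps of recent hits
        self._lock = threading.Lock()    # handlers run one thread per request

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        with self._lock:
            hits = self._hits[ip]
            # Drop timestamps that have slid out of the window.
            while hits and now - hits[0] >= self.window_sec:
                hits.popleft()
            if len(hits) >= self.limit:
                return False
            hits.append(now)
            return True


# Example: a login limiter at 20 requests per minute per IP.
login_limiter = SlidingWindowLimiter(limit=20)
```

A handler would call allow(client_ip) before dispatch and answer with an error status when it returns False.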
1.2 SQLite configuration
get_db() (backend/server.py:77-84) sets, on every connection:
- PRAGMA journal_mode=WAL: concurrent readers + single writer (this is what makes the ingest reader safe against the live writer).
- PRAGMA foreign_keys=ON: FKs are enforced at runtime (per-connection in SQLite, so re-set each time).
- PRAGMA busy_timeout=5000: 5 s wait on a lock.
- row_factory = sqlite3.Row.
The Gmail module re-implements the identical pragmas (email_integration/scheduler.py:49, email_integration/routes.py:89) rather than import server.py, to avoid a circular import.
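For orientation, a connection helper with these settings could look roughly like the following. It is a sketch reconstructed from the description above, not a copy of get_db(); the db_path parameter is an assumption (the real function presumably reads CRM_DB_PATH).

```python
import sqlite3


def get_db(db_path):
    """Open a short-lived connection with the pragmas described in section 1.2."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row             # rows addressable by column name
    conn.execute("PRAGMA journal_mode=WAL")    # concurrent readers, single writer
    conn.execute("PRAGMA foreign_keys=ON")     # per-connection, so set every time
    conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5 s on a lock
    return conn
```

Because foreign_keys and busy_timeout are per-connection settings, any second code path that opens the file (such as the Gmail module) has to repeat them, which matches the duplication noted above.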
1.3 Schema bootstrap & "migrations"
- init_db() (backend/server.py:86) runs once at startup, before binding. One big executescript of CREATE TABLE/INDEX IF NOT EXISTS (backend/server.py:91-405) creates both data models plus app_settings.
- Core "migrations": a hardcoded list of ALTER TABLE … ADD COLUMN wrapped in a try/except that swallows OperationalError (backend/server.py:407-418): additive-only, idempotent-by-failure. No version table, no down-migrations. (Currently adds city/state/country/location_query to contacts and lead_source to fundraising_investors.)
- The only real migration runner is in the Gmail module: email_integration.db.apply_migrations() (email_integration/db.py:23) runs numbered NNNN_*.sql files lexicographically. There is one today: migrations/0001_email_tables.sql. ⚠️ This is called from init_db() guarded only by ImportError (backend/server.py:421-427), so on any image where the package is importable, the email tables are created even when Gmail sync is disabled.
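The "idempotent-by-failure" pattern for the core columns can be sketched as below. The column names come from the list above; the TEXT types, the ADDITIVE_COLUMNS constant, and the function name are assumptions for illustration.

```python
import sqlite3

# (table, column, declared type); the types are assumed, not read from the code.
ADDITIVE_COLUMNS = [
    ("contacts", "city", "TEXT"),
    ("contacts", "state", "TEXT"),
    ("contacts", "country", "TEXT"),
    ("contacts", "location_query", "TEXT"),
    ("fundraising_investors", "lead_source", "TEXT"),
]


def apply_additive_columns(conn, columns=ADDITIVE_COLUMNS):
    """Try each ADD COLUMN; an existing column raises and is skipped."""
    added = []
    for table, column, decl in columns:
        try:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
            added.append((table, column))
        except sqlite3.OperationalError:
            # Duplicate column on a re-run. Note this also hides a missing
            # table or a typo, which is the weakness of the pattern.
            pass
    conn.commit()
    return added
```

Running it twice against the same database adds the columns once and does nothing the second time, but nothing records which changes were applied or when.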
1.4 data/ layout
- crm.db, crm.db-wal, crm.db-shm: the DB + WAL + shared-memory.
- backups/: JSON snapshots of the fundraising grid state only (not the whole DB), written by the backup scheduler.
- secrets/: holds gmail-service-account.json (mode 600).
- email_attachments/: Gmail attachment bytes on disk (created on the 0.4 image).
- .crm-secret: persisted JWT secret, written by the container entrypoint (not the app).
1.5 Production deployment (StartOS)
Package id ten-database ("Ten31 Database"). Both generations run the same app (python3 /app/backend/server.py) in a python:3.11-slim container; all state on a single persistent volume main mounted at /data.
- start9/0.4/: current/live target. Manifest & lifecycle are TypeScript under start9/0.4/startos/ (manifest/index.ts, main.ts, backups.ts, interfaces.ts). Built for x86_64 + aarch64. Whole-volume backups (sdk.Backups.ofVolumes('main')). The richer docker_entrypoint.sh creates /data/{backups,secrets,email_attachments}, persists CRM_SECRET_KEY to /data/.crm-secret, and conditionally enables Gmail iff /data/secrets/gmail-service-account.json exists (then exports DWD env: CRM_GMAIL_AUTH_METHOD=dwd, CRM_GMAIL_WORKSPACE_DOMAIN=ten31.xyz, sync interval 180 min). Version notes record the 0.3.5→0.4 migration is complete and the live /data volume is the sole source of truth (no more baked-in seed snapshot).
- start9/0.3.5/: legacy. Hand-written YAML manifest (manifest.yaml), arm64-only, Tor 80→8080 + LAN 443(ssl)→8080. Superseded by 0.4.
- Local/dev: start.sh runs python3 backend/server.py with dev defaults. start_beta.sh is a Tailscale launcher that sources .env.beta, forces CRM_ENV=production, and requires a ≥24-char CRM_SECRET_KEY. In production mode the app refuses to start without CRM_SECRET_KEY (backend/server.py:4487).
1.6 Environment variables (for CLAUDE.md "CRM connection vars")
Core server (backend/server.py:42-71): CRM_DATA_DIR, CRM_FRONTEND_DIR, CRM_DB_PATH (default <DATA_DIR>/crm.db), CRM_SECRET_KEY (JWT signing — required in production), CRM_HOST (default 0.0.0.0), CRM_PORT (default 8080), CRM_CORS_ORIGIN (default *), CRM_ENV (default development), CRM_LOGIN_RATE_LIMIT_PER_MIN, CRM_WRITE_RATE_LIMIT_PER_MIN, CRM_GET_RATE_LIMIT_PER_MIN, CRM_ABUSE_404_THRESHOLD, CRM_ABUSE_404_WINDOW_SEC, CRM_ABUSE_BAN_SEC, CRM_SEED_DEMO_DATA, CRM_GMAIL_INTEGRATION_ENABLED.
Gmail module (email_integration/config.py:80-101): CRM_GMAIL_AUTH_METHOD (dwd/oauth), CRM_GMAIL_SA_KEY_PATH, CRM_GMAIL_WORKSPACE_DOMAIN, CRM_GMAIL_OAUTH_CLIENT_ID/SECRET/REDIRECT_URI, CRM_GMAIL_SECRET_KEY (AES key for OAuth-token-at-rest — separate from CRM_SECRET_KEY), CRM_GMAIL_SYNC_INTERVAL_MIN, CRM_GMAIL_BACKFILL_PAGE_SIZE, CRM_GMAIL_MAX_ATTACHMENT_MB, plus rate/retry knobs.
There is no network DB protocol. "Connecting to the CRM" means either (a) opening the same SQLite file (CRM_DB_PATH), which is only possible co-located with the /data volume, or (b) HTTP at http://<host>:8080 with a Bearer JWT.
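Option (a) is most safely done read-only. A minimal sketch using the sqlite3 URI form with mode=ro follows; the function name is an assumption, and the journal mode is not set here because WAL is a persistent property of the database file set by the live server.

```python
import sqlite3
from pathlib import Path


def open_readonly(db_path):
    """Hypothetical ingest-side helper: open the CRM file without write access."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    conn.row_factory = sqlite3.Row
    conn.execute("PRAGMA busy_timeout=5000")  # same 5 s lock wait as the server
    return conn
```

Any INSERT, UPDATE, or DELETE on such a connection raises sqlite3.OperationalError, so a bug in the ingest path cannot mutate the live CRM.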
2. Data model
PKs are 8-char truncated UUIDs (generate_id() = str(uuid.uuid4())[:8], backend/server.py:522). Timestamps are ISO-8601 UTC strings (now(), backend/server.py:525). JSON-bearing TEXT columns (tags, attendees, options) are json.dumps'd on write and auto-parsed on read by row_to_dict() (backend/server.py:506-517).
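To make the ID scheme concrete: the first 8 hex characters of a UUID4 carry 32 random bits, so the chance of at least one duplicate in a table can be estimated with the usual birthday approximation. The two helpers mirror the definitions quoted above (the exact timestamp format of now() is an assumption); collision_probability is a hypothetical name added for illustration.

```python
import math
import uuid
from datetime import datetime, timezone


def generate_id():
    """8-char truncated UUID4, as described above (32 random bits)."""
    return str(uuid.uuid4())[:8]


def now():
    """ISO-8601 UTC timestamp string; the precise format is assumed."""
    return datetime.now(timezone.utc).isoformat()


def collision_probability(n_rows, bits=32):
    """Birthday approximation: P(at least one duplicate among n_rows ids)."""
    pairs = n_rows * (n_rows - 1) / 2
    return 1.0 - math.exp(-pairs / 2**bits)
```

Under this estimate a single table of 10,000 rows has roughly a 1% chance of containing a duplicate ID, and the odds pass one half well before 100,000 rows. The risk grows further if IDs from several tables are pooled into one key space without a namespace.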
2.1 Classic CRM model
| Table | Role | Key columns / notes |
|---|---|---|
| users | auth + ownership principal | username/email UNIQUE, password_hash, role ∈ {admin,member}, is_active. First user forced admin. (backend/server.py:92) |
| organizations | weak parent of contacts/opps | name (not unique), type (free-text, default other), tags JSON, description. (backend/server.py:104) |
| contacts | the hub | first_name/last_name (req), organization_id (FK SET NULL), contact_type (free-text; load-bearing values prospect/investor), status (default active), source, tags JSON, notes, linkedin_url. (backend/server.py:123) |
| opportunities | deal pipeline | contact_id (req, FK CASCADE), stage (allowlist PIPELINE_STAGES at backend/server.py:1380, enforced only on the stage endpoint), commitment_amount, expected_amount, fund_name, owner_id, lost_reason. (backend/server.py:148) |
| lp_profiles | closed-LP extension | 1:1 with a contact (contact_id UNIQUE, FK CASCADE). Holds commitment_amount, funded_amount, accredited (bare 0/1), legal_docs_signed, wire_received, k1_sent, investor_type (free-text). (backend/server.py:186) |
| custom_fields / custom_field_values | EAV custom fields | Dead: schema exists but has no routes/handlers; only ever wiped by reset. Do not build on this. (backend/server.py:206) |
| tags | global tag palette | name UNIQUE + color. Not FK-linked to the per-row tags JSON arrays; just an autocomplete source. (backend/server.py:237) |
| audit_log | mutation diff trail | user_id, entity_type, entity_id, action, changes JSON. Mutation-only, no reads, no actor/agent dimension. (backend/server.py:227) |
How an LP is represented: a single contacts row is the canonical record; contact_type carries the funnel stage (prospect→investor). Promotion to investor is a side effect of creating an lp_profiles row (backend/server.py:2834) or of fundraising-grid sync (backend/server.py:788). The contact dossier is assembled by GET /api/contacts/{id} (backend/server.py:2008): contact + last-20 communications + all opportunities + the one lp_profile. Note the unreconciled double-modeling of money: in-flight commitment lives on opportunities, closed commitment lives on lp_profiles, and the grid has a third copy in fundraising_commitments.
2.2 Fundraising grid model (newer, actively used)
A real-time collaborative spreadsheet the partners actually edit: funds are columns, investors are rows, dollar commitments are cells, plus saved views, live presence/cell-locks, and a small automation engine.
- Authoritative store = one JSON blob: fundraising_state.grid_json + views_json (row id='main', backend/server.py:258). Reads/exports come straight from this.
- Normalized tables are a derived mirror, fully rebuilt from the JSON on every write by sync_fundraising_relational() (backend/server.py:945): fundraising_investors (keyed by source_row_id), fundraising_funds (by column_id), fundraising_commitments (cells), fundraising_contacts, fundraising_views. ⚠️ fundraising_contacts/fundraising_commitments/views get fresh UUIDs on every save (DELETE+reinsert); only fundraising_investors.id and fundraising_funds.id are stable. Don't persist external references to the volatile ones.
- Automation engine (run_fundraising_automations, backend/server.py:668): currently ignores the rules' condition_json/action_json and uses hard-coded flag logic (graveyard→graveyard list, follow_up→follow_up list, everyone→all), rebuilding fundraising_list_memberships and logging changes to fundraising_automation_runs. So the rules table is display/config surface, not a live interpreter.
- Backups: JSON-only filesystem snapshots of grid state to data/backups/ (manual/auto/pre_restore), governed by a policy in app_settings. Restore overwrites state then re-syncs the mirror.
2.3 The two-model bridge (the central problem)
There is no foreign key between fundraising_investors and the classic contacts/organizations/opportunities/lp_profiles. They are joined only by best-effort name/email matching, essentially one-directional grid → classic:
- Grid → classic (write-through): every grid save pushes each grid contact into classic contacts/organizations via _upsert_contact_from_fundraising (backend/server.py:730), matching by lowercased email else (name + org). No stored key links the resulting contacts.id back to fundraising_investors.id; it re-matches by name/email each time.
- Grid-logged comms → classic communications: POST /api/fundraising/log-communication (backend/server.py:2561) writes into the classic communications table and appends a dated line to the grid row's notes.
- Classic → grid (partial reverse mirror): _sync_contact_to_fundraising_state (backend/server.py:815) patches an existing matching grid row but will not create a new investor row.
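The "lowercased email else (name + org)" rule used by the write-through can be illustrated with a small key function. This is a sketch of the idea only: match_key is a hypothetical name, and the whitespace normalisation is an assumption about behaviour that the source does not spell out.

```python
def match_key(email, first_name, last_name, org_name):
    """Return the key a grid contact would be matched on.

    Prefer the lowercased email; fall back to (name, org) when it is empty.
    """
    email = (email or "").strip().lower()
    if email:
        return ("email", email)
    # Collapse case and repeated whitespace so trivial variants compare equal.
    name = " ".join(f"{first_name or ''} {last_name or ''}".lower().split())
    org = " ".join((org_name or "").lower().split())
    return ("name_org", name, org)
```

Because the key is recomputed on every save rather than stored, a renamed contact without an email produces a different key and can silently yield a second contacts row, which is why the bridge needs a persisted join key.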
Net: the same investor can simultaneously exist as a fundraising_investors row, one-or-more contacts rows, an organizations row, and an lp_profiles row, with no authoritative join key. The email matcher treats the fundraising side as higher-signal (matcher.py:103). lp_profiles is entirely outside the fundraising subsystem.
2.4 Notes, interactions & correspondence (Phase-0 critical)
Three subsystems hold embeddable text:
(a) communications (backend/server.py:168) — the primary human-logged activity store. One row per note/call/email/meeting/text (type is free-text; UI offers those 5, frontend/index.html:4220). Columns: contact_id (req, CASCADE), opportunity_id (SET NULL), subject, body, communication_date (the event timestamp, distinct from created_at — this is the date_ts source), outcome, next_action, attendees JSON. Written by POST /api/communications and by the fundraising log endpoint. Hard delete at backend/server.py:2758.
(b) Scattered free-text fields worth embedding: contacts.notes, lp_profiles.notes, fundraising_investors.notes (a running, newline-appended outreach log mirroring the grid Notes column), opportunities.description/next_step, organizations.description.
(c) Gmail correspondence (backend/email_integration/, schema in migrations/0001_email_tables.sql):
- emails: canonical record deduped by RFC message_id; subject, from_*, to/cc/bcc JSON, sent_at, body_text, body_html, snippet, is_matched, match_status. ⚠️ Bodies are stored only for matched emails; unmatched emails are metadata-only with the body nulled (sync.py:319).
- email_threads: thread roll-up; threading via RFC In-Reply-To/References chain then Gmail thread id (threads.py:38).
- email_account_messages: per-mailbox sighting (dedup across team inboxes).
- email_attachments: metadata; bytes on disk, deduped by SHA-256.
- email_investor_links: the entity linkage. Populates any subset of fundraising_investor_id / fundraising_contact_id / contact_id / organization_id (all soft references, no FK) with match_kind (exact_email conf 1.0 / domain_match conf 0.6) and confidence. A single email can link to several entities at once.
- email_sync_runs: records per-run observability.
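The two match kinds in email_investor_links can be pictured as follows. This is an illustrative sketch, not the logic in matcher.py: the function name, the lookup dictionaries, and the choice to emit a domain link alongside an exact link are all assumptions; only the match_kind labels and confidence values come from the description above.

```python
def match_address(address, known_emails, known_domains):
    """Return link dicts for one email address.

    known_emails:  {email -> entity reference}   (hypothetical lookup)
    known_domains: {domain -> entity reference}  (hypothetical lookup)
    """
    addr = address.strip().lower()
    links = []
    if addr in known_emails:
        links.append({"entity": known_emails[addr],
                      "match_kind": "exact_email", "confidence": 1.0})
    domain = addr.rpartition("@")[2]
    if domain in known_domains:
        links.append({"entity": known_domains[domain],
                      "match_kind": "domain_match", "confidence": 0.6})
    return links
```

Keeping the confidence on each link lets downstream retrieval filter on it, for example treating only exact_email links as authoritative.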
audit_log is not the interaction log. It is mutation-only (≈26 write sites), logs no reads, and has no agent/actor dimension. Phase 0's "append-only interaction log of every agent action and every human touch" (Workstream A2) needs a new table, not a repurpose.
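One way such a table could enforce "append-only" inside SQLite itself is with triggers that reject UPDATE and DELETE. The sketch below is a proposal-shaped illustration, not the schema used by A2: every column name, the trigger names, and log_interaction are hypothetical.

```python
import json
import sqlite3

# Hypothetical schema; the real A2 table may differ in every column.
INTERACTION_LOG_DDL = """
CREATE TABLE IF NOT EXISTS interaction_log (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    ts          TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    actor_type  TEXT NOT NULL CHECK (actor_type IN ('human', 'agent')),
    actor_id    TEXT NOT NULL,
    action      TEXT NOT NULL,      -- covers reads as well as writes
    entity_type TEXT,
    entity_id   TEXT,
    detail      TEXT                -- JSON payload
);
CREATE TRIGGER IF NOT EXISTS interaction_log_no_update
BEFORE UPDATE ON interaction_log
BEGIN SELECT RAISE(ABORT, 'interaction_log is append-only'); END;
CREATE TRIGGER IF NOT EXISTS interaction_log_no_delete
BEFORE DELETE ON interaction_log
BEGIN SELECT RAISE(ABORT, 'interaction_log is append-only'); END;
"""


def log_interaction(conn, actor_type, actor_id, action,
                    entity_type=None, entity_id=None, detail=None):
    """Append one row; there is deliberately no update or delete helper."""
    conn.execute(
        "INSERT INTO interaction_log "
        "(actor_type, actor_id, action, entity_type, entity_id, detail) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (actor_type, actor_id, action, entity_type, entity_id,
         json.dumps(detail) if detail is not None else None),
    )
    conn.commit()
```

The triggers guard against accidental mutation through normal SQL; anyone able to drop the triggers or the table can still bypass them, so this complements rather than replaces filesystem-level backups.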
3. API / route surface
Full REST verbs exist (mutations are not tunneled through POST): do_GET (1589), do_POST (1727), do_PUT (1791), do_PATCH (1817), do_DELETE (1845), do_OPTIONS (1580), all in backend/server.py. Routing is a flat if/elif ladder: exact string for collections, re.match(r'^/api/x/[^/]+$') for items, path params parsed positionally with path.split('/'). The Gmail module hooks in at the top of do_GET/do_POST via try_handle(self) (email_integration/routes.py:49), claiming any /api/email/* path.
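The dispatch style described here (exact string for collections, a regex for items, positional path.split('/') for the id) looks roughly like this. The handler labels are placeholders, not the real handler names in backend/server.py.

```python
import re


def dispatch_get(path):
    """Return (handler_label, path_param) for a GET path, ladder-style."""
    if path == "/api/contacts":
        return ("list_contacts", None)
    elif re.match(r"^/api/contacts/[^/]+/communications$", path):
        # '/api/contacts/<id>/communications' -> ['', 'api', 'contacts', '<id>', ...]
        return ("contact_communications", path.split("/")[3])
    elif re.match(r"^/api/contacts/[^/]+$", path):
        return ("get_contact", path.split("/")[3])
    return ("not_found", None)
```

With this style, adding a route means adding another branch in each relevant do_* method, and the order of branches matters whenever patterns overlap.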
Response envelope: reads/writes → {"data": …} (+total/limit/offset for lists); errors → {"error": msg}; create → 201; auth → bare {"token","user"}; email handlers use ad-hoc keys. CORS allows all verbs; Access-Control-Allow-Origin echoes CRM_CORS_ORIGIN (default *).
Auth column: None = public · Bearer = any active user · Admin = require_admin.
| Method | Path | Purpose | Auth |
|---|---|---|---|
| GET | /, /index.html, /assets/* | Serve SPA + static | None |
| GET | /api/health | Liveness | None |
| GET | /api/bootstrap/status | First-run check | None |
| POST | /api/auth/login | Login → JWT | None |
| POST | /api/auth/register | First-user registration (self-disables) | None |
| GET/POST | /api/contacts | List/search · Create | Bearer |
| GET/PUT/DELETE | /api/contacts/{id} | Detail (dossier) · Update · Hard delete | Bearer |
| GET | /api/contacts/{id}/communications | Per-contact interaction history | Bearer |
| GET/POST · GET/PUT/DELETE | /api/organizations[/{id}] | Org CRUD | Bearer |
| GET/POST · GET/PUT/DELETE | /api/opportunities[/{id}] | Opp CRUD | Bearer |
| PATCH | /api/opportunities/{id}/stage | Move pipeline stage (validated) | Bearer |
| GET/POST · GET/PUT/DELETE | /api/communications[/{id}] | Comms CRUD | Bearer |
| GET/POST · GET/PUT | /api/lp-profiles[/{id}] | LP-profile CRUD (no delete route) | Bearer |
| GET | /api/reports/{dashboard,pipeline,lp-breakdown,activity} | Aggregates | Bearer |
| GET | /api/export/contacts | Export all contacts (returns JSON, not CSV) | Bearer |
| POST | /api/import/csv | Bulk import from JSON rows | Bearer |
| GET/POST · PATCH | /api/feature-requests[/{id}] | Feature-request tracker | Bearer |
| GET | /api/users | List users (no hashes) | Bearer |
| POST · PATCH | /api/admin/users[/{id}] | Create / update user | Admin |
| POST | /api/admin/reset-all-data | ⚠️ Wipe CRM (confirm phrase RESET ALL DATA) | Admin |
| GET | /api/audit-log | Mutation audit trail | Admin |
| GET | /api/security/status | Config/security status | Admin |
| GET/PUT | /api/fundraising/state | Get / save grid (optimistic version, 409 on conflict) | Bearer |
| GET/POST | /api/fundraising/collab/{state,heartbeat} | Presence + cell locks | Bearer |
| POST | /api/fundraising/log-communication | Log comm + append grid note | Bearer |
| GET | /api/fundraising/{export,relational-summary} | Export / counts | Bearer |
| GET | /api/fundraising/activity | Merged audit+automation+backup feed | Admin |
| GET/PATCH | /api/fundraising/automations[/{id}] | Automation rules | Admin |
| GET | /api/fundraising/automation-runs | Run history | Admin |
| GET/POST | /api/fundraising/{backups,backup,backup-verify} | Backup mgmt | Admin |
| POST | /api/fundraising/{restore-preview,restore} | Restore grid | Admin |
| GET/PATCH | /api/fundraising/backup-policy | Backup policy | Admin |
| GET | /api/email/{status,accounts,threads} | Sync status / accounts / matched threads | Bearer + flag |
| GET | /api/email/oauth/{start,callback} | Per-user OAuth (callback is state-token gated, no Bearer) | mixed + flag |
| POST | /api/email/accounts/{enroll-all,enroll} · /sync/run-now · /rematch | Enrollment & sync ops | Admin + flag |
Defined but NOT routed: handle_list_tags/create/delete exist (backend/server.py:3366-3400) but no /api/tags route is wired; the custom_fields tables have no routes at all. Treat both as dead for Phase 0.
4. Authentication & authorization
- Login: POST /api/auth/login → handle_login (backend/server.py:1880). Looks up active user, verify_password (bcrypt, PBKDF2-SHA256 fallback, backend/server.py:444), issues create_token (HS256 JWT via PyJWT, HMAC fallback; claims user_id/username/role/exp/iat; 24 h expiry; backend/server.py:464).
- Per-request verification: get_user() (backend/server.py:1458) reads Authorization: Bearer, decode_token (pins algorithms=["HS256"], so no alg:none downgrade), then re-loads the user row and rejects if missing/inactive. Identity (incl. role) comes from the DB row, not token claims, so deactivation and role changes take effect immediately.
- No cookies, no logout, no refresh, no revocation. The only early kill-switch is is_active=0.
- Bootstrap: GET /api/bootstrap/status (public) reports setup_required. POST /api/auth/register (public, self-disables once any user exists) creates and force-promotes the first user to admin. A separate flag-gated seed_demo_data() hardcodes admin/admin123 + grant/password and prints them; dev-only, off by default (backend/server.py:4351-4374).
- Roles: only admin/member. Enforcement is an inline require_admin(user) (backend/server.py:541) at the top of each admin handler; no middleware. No row-level authorization: any active member can read/edit all LP and fundraising data; created_by/owner_id are informational only.
- X_API_KEY does not exist in code. Repo-wide it appears only in CLAUDE.md and docs/PHASE_0.md. There is no API-key header path and no service-auth distinct from the user JWT. (The Bearer tokens in email_integration/ are outbound Google OAuth tokens, unrelated.)
- Secrets: JWT key CRM_SECRET_KEY (random per-process default; hard-fails in production if unset, backend/server.py:4487). Gmail OAuth tokens are encrypted at rest with AES-256-GCM keyed off the separate CRM_GMAIL_SECRET_KEY (email_integration/crypto.py:37), a genuinely sound scheme. The Google service-account key lives at data/secrets/gmail-service-account.json (0600).
- Hardening present: per-IP sliding-window rate limits (login 20/min, writes 300/min, GETs 600/min); auto-ban of scanner IPs after a 404 burst (record_404, backend/server.py:1520). Absent: security headers (HSTS/CSP/X-Frame-Options), CORS defaults to wildcard *, X-Forwarded-For is trusted (only safe behind a controlled proxy).
5. Phase-0 implications (carry into A2/B/C)
- Pick a canonical LP identity. The classic vs fundraising duality (§2.3) means entity resolution (A4/B3) must collapse across both models (grid investor + grid contacts + classic contact/org/lp_profile) into one canonical lp_id, not just dedupe name variants within one table. The email matcher's preference (fundraising_contacts > contacts) is a hint that the grid is the operationally-live LP graph, but closed financials/accreditation live only in lp_profiles.
- Canonical ID host. contacts.id is the natural join target (lp_profiles + opportunities FK to it; the dossier is keyed on it), but the 8-char truncated UUID is a uniqueness risk if it becomes the Qdrant payload key; resolve before indexing.
- Interaction log is greenfield. audit_log won't do (mutation-only, no actor/agent dimension). A2 should add a dedicated append-only interaction_log table and route all MCP/agent writes through it (guardrail #5).
- Soft-delete gap. Everything is hard-deleted today (guardrail #3 risk) and there's no tombstone for an idempotent re-embed to detect removals. A2 should add a deleted_at/archive flag and have ingest treat it as a delete-from-index signal.
- Chunk sources (per EMBEDDINGS.md): one chunk per communications row; one per matched emails row (only these have bodies); split the fundraising_investors.notes outreach log per dated line; one chunk each for the scattered note/description fields. Keep ids/names/dates/types/confidence as filterable payload; embed only subject/body/note prose. date_ts = epoch of the event timestamp (communication_date / sent_at), not created_at. Key Qdrant point ids on stable ids (emails.id), namespaced to avoid the 8-char collision risk.
- Migration mechanism. No Alembic. Either extend the in-code idempotent pattern (CREATE … IF NOT EXISTS + try/except ALTER ADD COLUMN) or adopt the email module's numbered-.sql runner for the core schema (recommended; add a schema_migrations ledger). SQLite ALTER TABLE is limited to rename, add-column and (from SQLite 3.35) drop-column, with no in-place type or constraint changes, which conveniently pushes toward the additive/reversible guardrail.
- Ingest connection. No network DB protocol: open data/crm.db read-only (sqlite3 URI mode=ro) with the same WAL/busy_timeout pragmas, co-located with the /data volume; under WAL the reader does not block the live writer. Reserve HTTP+JWT for any write-back. Decide the MCP↔CRM auth boundary explicitly (recommend: read-only direct SQLite for ingest; a constrained service principal for write-back) since X_API_KEY is unimplemented.
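The recommended migration mechanism (numbered .sql files applied in lexicographic order, plus a schema_migrations ledger) could be as small as the sketch below. It is an illustration under assumptions: the function signature and ledger columns are hypothetical and are not taken from email_integration/db.py.

```python
import sqlite3
from pathlib import Path


def apply_migrations(conn, migrations_dir):
    """Apply unapplied NNNN_*.sql files in name order; return the names run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        " name TEXT PRIMARY KEY,"
        " applied_at TEXT NOT NULL DEFAULT (datetime('now')))"
    )
    applied = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    ran = []
    for path in sorted(Path(migrations_dir).glob("[0-9][0-9][0-9][0-9]_*.sql")):
        if path.name in applied:
            continue  # already recorded in the ledger
        # executescript commits any open transaction, then runs the file.
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (path.name,))
        conn.commit()
        ran.append(path.name)
    return ran
```

Unlike the try/except ALTER pattern, the ledger records exactly which files ran and when. The script and its ledger row are not written atomically in this sketch, so each .sql file should still be safe to re-run after a crash.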
6. Open questions for the owner
- Which model is canonical for an LP: the contacts row or the fundraising_investors grid row? (Determines the ingest spine and the canonical-ID target.)
- Is the Gmail integration enabled on the live Start9 box, and has a backfill run? If not, the Phase-0 corpus is just communications + note fields until it is. (The email schema exists regardless.)
- Should custom_fields (EAV) and the unrouted tags CRUD be revived or removed? Leaving the EAV risks a second divergent custom-data path next to the live fundraising custom columns.
- Accreditation today is a single boolean (lp_profiles.accredited) with no QP flag / method / date. Where should the 506(b)/506(c) + accreditation/QP fields counsel will require (guardrail #6) live?
- MCP↔CRM auth: build a real X_API_KEY service-key path, authenticate the MCP server as a dedicated CRM user (24 h JWT, must re-login), or read SQLite directly? And does the agent principal need a new least-privilege role below member?
- fundraising_state.grid_json vs the normalized mirror: confirmed authoritative = the JSON blob; is reading the normalized mirror (consistent after each save) acceptable for ingest, treating the JSON as the re-derivable source of truth?
Sources: backend/server.py, backend/email_integration/*, backend/email_integration/migrations/0001_email_tables.sql, start9/0.3.5/*, start9/0.4/*. Generated from a structured multi-agent read of the codebase, cross-checked against the live data/crm.db schema (currently a near-empty seeded instance: 1 user, 9 funds, 4 views, 2 automation rules — the real corpus lives on the Start9 deployment).