Phase 0 foundation: canonical schema, ingest pipeline, CRM MCP server
Workstream A–C substrate for the Ten31 agentic system:
- A1: docs/crm-overview.md; CLAUDE.md conventions + guardrail #9
- A2: additive/reversible core migration (canonical_entities, entity_links, interaction_log, relationship_edges, soft-delete) + ledgered runner
- B1/B3: chunking + deterministic entity resolution (backend/ingest)
- B2: dense (bge-m3) + BM25 sparse ingest to Qdrant crm_chunks
- C: CRM MCP server (reads, retrieval modes, logged writes); no outbound tools
- docs: redaction/re-hydration, Gmail enablement runbook
- synthetic test data; .env.example; housekeeping (.gitignore, untrack crm.db, drop legacy files + start9/0.3.5)

Verified end-to-end on synthetic data + live Sparks (hybrid > dense on entity queries).
Real backfill runs on Ten31 infra; index holds synthetic data only.
Branch snapshot also captures pre-existing working-tree changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -0,0 +1,57 @@
import { VersionInfo } from '@start9labs/start-sdk'

// Post-migration cleanup + hardening release.
//
// Context:
//   * 0.1.0:39 was the first 0.4 package and shipped a baked-in
//     /data snapshot that docker_entrypoint.sh copied into the
//     mounted `main` volume on first boot (only if the volume was
//     empty). That snapshot did its job and the live host now has
//     a populated /data with all real investor + fundraising data.
//   * 0.1.0:40 removes the seed snapshot from the image and the
//     seeding logic from the entrypoint. The live /data volume is
//     the sole source of truth from here on. StartOS preserves the
//     volume across sideloads, so this upgrade does not disturb
//     any data; it just slims the image and removes a code path
//     that should never run again.
//   * 0.1.0:40 also hardens the backend HTTP server against the
//     vulnerability scanners that find the StartTunnel-exposed
//     interface within hours of going live:
//       - HTTPServer → ThreadingHTTPServer so one slow request or
//         a wave of scanner probes can't block legit users.
//       - Per-IP GET rate limit (default 600/min) in addition to
//         the existing login/write limits.
//       - 404-burst auto-ban: any IP that produces ABUSE_404_THRESHOLD
//         404s within ABUSE_404_WINDOW_SEC (default 15 in 60s) is
//         parked on a class-level blacklist for ABUSE_BAN_SEC
//         (default 15 minutes). Banned IPs get an instant 429 with
//         no DB or filesystem work.
//       - All limits stay tunable via env vars
//         (CRM_GET_RATE_LIMIT_PER_MIN, CRM_ABUSE_404_THRESHOLD,
//         CRM_ABUSE_404_WINDOW_SEC, CRM_ABUSE_BAN_SEC).
//
// No data migration is required: the SQLite schema is unchanged
// and the live DB on /data is left exactly as-is.
export const v_0_1_0_40 = VersionInfo.of({
  version: '0.1.0:40',
  releaseNotes: {
    en_US: [
      'Removes the baked-in /data seed snapshot now that the',
      '0.3.5 → 0.4 migration is complete. The live /data volume',
      'on the StartOS host is the sole source of truth and is',
      'preserved across sideloads, so no live data is touched by',
      'this upgrade. Image is smaller and the first-boot seeding',
      'code path has been removed. Also hardens the backend',
      'against vulnerability scanners hitting the public',
      'StartTunnel interface: the HTTP server is now multi-threaded',
      'so one slow request can no longer block legit users, GET',
      'requests are rate-limited per IP, and any IP that bursts',
      'too many 404s in a short window is auto-banned for 15',
      'minutes with no DB work performed.',
    ].join(' '),
  },
  migrations: {
    up: async () => {},
    down: async () => {},
  },
})
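
The 404-burst auto-ban described in the comment block of this file can be illustrated with a small sliding-window counter. This is only a sketch of the idea, written in TypeScript to match this file; the real backend code is not part of this diff and may work differently. The names AbuseConfig, configFromEnv, AbuseGuard, record404 and isBanned are hypothetical. Only the env var names and the defaults (15 404s in 60s, 15-minute ban) come from the comment above.

```typescript
// Hypothetical sketch, not the backend's implementation.
interface AbuseConfig {
  threshold: number // 404s inside the window that trigger a ban
  windowSec: number // length of the sliding window, in seconds
  banSec: number // how long a banned IP stays banned, in seconds
}

// Read the limits from an env-like map, falling back to the defaults
// stated in the release comment when a value is missing or not a
// positive number.
function configFromEnv(env: Record<string, string | undefined>): AbuseConfig {
  const num = (key: string, fallback: number): number => {
    const parsed = Number(env[key])
    return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback
  }
  return {
    threshold: num('CRM_ABUSE_404_THRESHOLD', 15),
    windowSec: num('CRM_ABUSE_404_WINDOW_SEC', 60),
    banSec: num('CRM_ABUSE_BAN_SEC', 900),
  }
}

class AbuseGuard {
  private readonly cfg: AbuseConfig
  // ip -> timestamps (in seconds) of that IP's recent 404 responses
  private readonly hits = new Map<string, number[]>()
  // ip -> time (in seconds) at which the ban expires
  private readonly bannedUntil = new Map<string, number>()

  constructor(cfg: AbuseConfig) {
    this.cfg = cfg
  }

  // Cheap check meant to run before any DB or filesystem work.
  isBanned(ip: string, now: number): boolean {
    const until = this.bannedUntil.get(ip)
    if (until === undefined) return false
    if (now >= until) {
      this.bannedUntil.delete(ip) // ban expired, forget it
      return false
    }
    return true
  }

  // Record one 404 for this IP; returns true if it triggered a ban.
  record404(ip: string, now: number): boolean {
    const cutoff = now - this.cfg.windowSec
    const recent = (this.hits.get(ip) ?? []).filter((t) => t > cutoff)
    recent.push(now)
    if (recent.length >= this.cfg.threshold) {
      this.hits.delete(ip)
      this.bannedUntil.set(ip, now + this.cfg.banSec)
      return true
    }
    this.hits.set(ip, recent)
    return false
  }
}
```

A request handler built on this sketch would call isBanned first and answer 429 immediately when it returns true, and call record404 whenever it is about to send a 404. Passing the clock in as `now` keeps the logic deterministic to test. The same window-of-timestamps structure could plausibly back the per-IP GET rate limit as well.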