5e291203a5
User-feedback-driven release after testing v1.1.0:3. Nine themes:
1. Multi-config persistence
- New AIConfigProfile table (per-user). Save N configs, toggle one
active. Switching providers no longer wipes the previous setup.
- UserPreferences gains activeAIConfigId; legacy single-config
columns are mirrored from the active profile so existing reads
keep working without conditional logic.
- Idempotent boot migration lifts any existing single-config row
into a default profile.
2. Ollama auto-detect
- The "Add config" form probes /api/tags on the StartOS internal
addresses (ollama.startos / ollama.embassy on :11434). If
reachable: URL pre-fills, model field becomes a dropdown of
installed models. Fixes the copy-paste UX.
3. Curated model dropdowns for major providers
- Claude: Opus 4.7, Sonnet 4.6 (1M ctx), Haiku 4.5
- OpenAI: GPT-5.5, 5.4, 5.4-mini, 5.4-nano
- Gemini: 3.1-pro-preview, 2.5-pro, 2.5-flash, etc.
- "Other (type your own)" stays for niche models.
- Fixes "I tried gemini-3.0-pro and got 404."
4. Background generation
- lib/ai/generationRunner.ts: detached runner with in-memory
pub/sub bus. POST /api/ai/generate kicks it off and returns
immediately. SSE stream attaches by id. The runner survives
request cancellation; navigating away no longer kills it.
- New AIGeneration columns: progressText (in-flight stream),
durationMs (final wall-clock).
- Generate UI shows a banner explaining background-safety.
- History detail page polls progress + renders partial JSON
live for cross-process resume (page refresh, new tab).
5. System prompt overhaul
- lib/ai/systemPromptBase.ts: structural contract prepended to
every template. Forces JSON-only output, library-exerciseId
usage (kills "exerciseId doesn't belong to this user" errors),
and per-resistance-exercise suggestedWeight (with-history vs
without-history variants).
- aiExerciseSchema + ProgramExercise gain suggestedWeight +
suggestedWeightUnit. Starting a workout from a ProgramDay
pre-populates SetLog.weight from the suggestion.
6. Test connection improvements
- Latency in seconds (was ms — confusing for slow Ollama).
- Stale "✓ Connected" clears on form change.
- Per-config Test (no need to activate first).
- Generous maxOutputTokens for thinking models.
- Gemini surfaces finishReason on empty response (e.g. "blocked
by safety filter") instead of generic "empty response."
- Test endpoint accepts a draft body so you can verify before
saving + before activating.
7. History detail view
- Click row → full program tree + exact prompts sent. Apply from
here without re-generating. Pending rows poll for progress.
8. Sidebar sub-navigation
- AI: Generate / History / Templates
- Settings: General / Password / Sessions / AI integration /
Export / Instance (admin) / Danger zone, with anchor scroll.
9. API key UX
- "Key saved" indicator on saved configs (was confusing to see
an empty input after a successful save).
Schema migrations (additive, idempotent in entrypoint):
- AIConfigProfile table created
- UserPreferences.activeAIConfigId
- AIGeneration.progressText + durationMs
- ProgramExercise.suggestedWeight + suggestedWeightUnit
Tests: 16 new (systemPromptBase, modelMenu, generationRunner). 177
total pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
161 lines
6.8 KiB
TypeScript
161 lines
6.8 KiB
TypeScript
/**
|
|
* Per-model pricing in USD per million tokens. Used to estimate the
|
|
* cost of an AIGeneration row from its tokensIn/tokensOut.
|
|
*
|
|
* Prices change. This table is a best-effort starting point for
|
|
* common models as of mid-2026; users on other models will see
|
|
* `null` cost (we still surface token counts). Updating: edit this
|
|
* file and ship — no schema change needed.
|
|
*
|
|
* Matching strategy: case-insensitive prefix lookup against the
|
|
* user's configured model string. Model names like
|
|
* "claude-sonnet-4-5-20251022" match the "claude-sonnet-4-5" prefix.
|
|
*
|
|
* Keys are organized by provider for readability but the lookup is
|
|
* provider-agnostic — the model string is the key.
|
|
*/
|
|
|
|
interface PriceEntry {
|
|
inputPerM: number; // USD per 1M input tokens
|
|
outputPerM: number; // USD per 1M output tokens
|
|
}
|
|
|
|
const PRICES: Record<string, PriceEntry> = {
|
|
// Anthropic Claude (Messages API) — opus tier $15/$75, sonnet $3/$15,
|
|
// haiku $0.80/$4. New point releases inherit their tier's pricing.
|
|
'claude-opus-4-7': { inputPerM: 15, outputPerM: 75 },
|
|
'claude-opus-4-6': { inputPerM: 15, outputPerM: 75 },
|
|
'claude-opus-4-5': { inputPerM: 15, outputPerM: 75 },
|
|
'claude-opus-4': { inputPerM: 15, outputPerM: 75 },
|
|
'claude-sonnet-4-6': { inputPerM: 3, outputPerM: 15 },
|
|
'claude-sonnet-4-5': { inputPerM: 3, outputPerM: 15 },
|
|
'claude-sonnet-4': { inputPerM: 3, outputPerM: 15 },
|
|
'claude-haiku-4-5': { inputPerM: 0.8, outputPerM: 4 },
|
|
'claude-haiku-4': { inputPerM: 0.8, outputPerM: 4 },
|
|
'claude-3-7-sonnet': { inputPerM: 3, outputPerM: 15 },
|
|
'claude-3-5-sonnet': { inputPerM: 3, outputPerM: 15 },
|
|
'claude-3-5-haiku': { inputPerM: 0.8, outputPerM: 4 },
|
|
|
|
// OpenAI — gpt-5.x flagships ~$1.25-$2/$10-$15, mini/nano cheaper
|
|
'gpt-5.5': { inputPerM: 2, outputPerM: 15 },
|
|
'gpt-5.4': { inputPerM: 1.5, outputPerM: 12 },
|
|
'gpt-5.4-mini': { inputPerM: 0.3, outputPerM: 2.4 },
|
|
'gpt-5.4-nano': { inputPerM: 0.06, outputPerM: 0.5 },
|
|
'gpt-5.3': { inputPerM: 1.5, outputPerM: 12 },
|
|
'gpt-5.2': { inputPerM: 1.5, outputPerM: 12 },
|
|
'gpt-5.1': { inputPerM: 1.25, outputPerM: 10 },
|
|
'gpt-5': { inputPerM: 1.25, outputPerM: 10 },
|
|
'gpt-5-mini': { inputPerM: 0.25, outputPerM: 2 },
|
|
'gpt-5-nano': { inputPerM: 0.05, outputPerM: 0.4 },
|
|
'gpt-4o': { inputPerM: 2.5, outputPerM: 10 },
|
|
'gpt-4o-mini': { inputPerM: 0.15, outputPerM: 0.6 },
|
|
'o1': { inputPerM: 15, outputPerM: 60 },
|
|
'o3': { inputPerM: 2, outputPerM: 8 },
|
|
'o3-mini': { inputPerM: 1.1, outputPerM: 4.4 },
|
|
'o4-mini': { inputPerM: 1.1, outputPerM: 4.4 },
|
|
|
|
// Google Gemini — Gemini 3.1 Pro is $2/$12 standard; >200K ctx is 2x.
|
|
'gemini-3.1-pro-preview': { inputPerM: 2, outputPerM: 12 },
|
|
'gemini-3.1-pro': { inputPerM: 2, outputPerM: 12 },
|
|
'gemini-3-pro-preview': { inputPerM: 2, outputPerM: 12 },
|
|
'gemini-3-pro': { inputPerM: 2, outputPerM: 12 },
|
|
'gemini-2.5-pro': { inputPerM: 1.25, outputPerM: 10 },
|
|
'gemini-2.5-flash': { inputPerM: 0.3, outputPerM: 2.5 },
|
|
'gemini-2.0-flash': { inputPerM: 0.1, outputPerM: 0.4 },
|
|
'gemini-2.0-pro': { inputPerM: 1.25, outputPerM: 5 },
|
|
'gemini-1.5-pro': { inputPerM: 1.25, outputPerM: 5 },
|
|
'gemini-1.5-flash': { inputPerM: 0.075, outputPerM: 0.3 },
|
|
};
|
|
|
|
/**
|
|
* Per-provider model menus — source of truth for the "Model" dropdown
|
|
* in Settings → AI integration. `recommended` floats to the top. Users
|
|
* can still type a custom model name (the dropdown has an "Other"
|
|
* option that switches to free-text input). Order = display order.
|
|
*
|
|
* Update these when new models ship. Keys correspond to provider IDs
|
|
* in lib/ai/providers/index.ts.
|
|
*/
|
|
export interface ModelOption {
|
|
/** Exact API model identifier */
|
|
id: string;
|
|
/** Human-readable label shown in the dropdown */
|
|
label: string;
|
|
/** Floats to the top + gets a "★" mark */
|
|
recommended?: boolean;
|
|
}
|
|
|
|
export const MODEL_MENU: Record<string, ModelOption[]> = {
|
|
claude: [
|
|
{ id: 'claude-opus-4-7', label: 'Claude Opus 4.7 (most capable)', recommended: true },
|
|
{ id: 'claude-sonnet-4-6', label: 'Claude Sonnet 4.6 (1M context, fast)', recommended: true },
|
|
{ id: 'claude-haiku-4-5', label: 'Claude Haiku 4.5 (cheapest, fastest)', recommended: true },
|
|
{ id: 'claude-opus-4-6', label: 'Claude Opus 4.6' },
|
|
{ id: 'claude-sonnet-4-5', label: 'Claude Sonnet 4.5' },
|
|
{ id: 'claude-3-7-sonnet-latest', label: 'Claude 3.7 Sonnet' },
|
|
],
|
|
openai: [
|
|
{ id: 'gpt-5.5', label: 'GPT-5.5 (most capable)', recommended: true },
|
|
{ id: 'gpt-5.4', label: 'GPT-5.4', recommended: true },
|
|
{ id: 'gpt-5.4-mini', label: 'GPT-5.4 Mini (cheap, fast)', recommended: true },
|
|
{ id: 'gpt-5.4-nano', label: 'GPT-5.4 Nano (cheapest)' },
|
|
{ id: 'gpt-5', label: 'GPT-5' },
|
|
{ id: 'gpt-4o', label: 'GPT-4o (legacy)' },
|
|
{ id: 'o3', label: 'o3 (reasoning)' },
|
|
],
|
|
gemini: [
|
|
{ id: 'gemini-3.1-pro-preview', label: 'Gemini 3.1 Pro Preview (most capable)', recommended: true },
|
|
{ id: 'gemini-2.5-pro', label: 'Gemini 2.5 Pro', recommended: true },
|
|
{ id: 'gemini-2.5-flash', label: 'Gemini 2.5 Flash (cheap, fast)', recommended: true },
|
|
{ id: 'gemini-2.0-flash', label: 'Gemini 2.0 Flash' },
|
|
{ id: 'gemini-1.5-pro', label: 'Gemini 1.5 Pro (legacy)' },
|
|
],
|
|
// openai-compatible + ollama: no curated menu — model names are
|
|
// gateway- or host-specific. Ollama auto-detects via /api/tags.
|
|
'openai-compatible': [],
|
|
ollama: [],
|
|
};
|
|
|
|
/** Find the price entry whose key is a (case-insensitive) prefix of the model string. */
|
|
export function findPrice(model: string): PriceEntry | null {
|
|
const m = model.toLowerCase();
|
|
// Longest-prefix-first so e.g. "claude-sonnet-4-5" beats "claude-sonnet-4".
|
|
const sortedKeys = Object.keys(PRICES).sort((a, b) => b.length - a.length);
|
|
for (const key of sortedKeys) {
|
|
if (m.startsWith(key.toLowerCase())) return PRICES[key];
|
|
}
|
|
return null;
|
|
}
|
|
|
|
/**
|
|
* Estimate the USD cost of a generation. Returns null if the model
|
|
* isn't in the price table or if either token count is missing.
|
|
* Ollama and openai-compatible custom gateways always return null
|
|
* (they're either free or self-priced).
|
|
*/
|
|
export function estimateCost(opts: {
|
|
provider: string;
|
|
model: string;
|
|
tokensIn: number | null;
|
|
tokensOut: number | null;
|
|
}): number | null {
|
|
if (opts.provider === 'ollama') return 0; // self-hosted, no per-token cost
|
|
if (opts.provider === 'openai-compatible') return null; // we don't know the gateway's pricing
|
|
if (opts.tokensIn == null || opts.tokensOut == null) return null;
|
|
const price = findPrice(opts.model);
|
|
if (!price) return null;
|
|
return (
|
|
(opts.tokensIn / 1_000_000) * price.inputPerM +
|
|
(opts.tokensOut / 1_000_000) * price.outputPerM
|
|
);
|
|
}
|
|
|
|
/** Format USD to a string suitable for a UI label. Below $0.01 -> "<$0.01". */
|
|
export function formatCost(usd: number | null): string {
|
|
if (usd == null) return '—';
|
|
if (usd === 0) return 'free';
|
|
if (usd < 0.01) return '<$0.01';
|
|
if (usd < 1) return `$${usd.toFixed(3)}`;
|
|
return `$${usd.toFixed(2)}`;
|
|
}
|