v0.27.2:0 - vision check tool + mark Qwen3.6 vision-capable

Qwen3.6-35B-A3B is multimodal (vision tower on disk) but was labelled
text-only. Mark it [vision, reasoning] and add a 'Vision check' button on
the running vision-capable card: upload an image + prompt -> existing /v1
passthrough proxy -> show the model's text. Confirmed 7/7 fields on a
business card. Records the Gemma-4-26B deferral + research findings.
This commit is contained in:
Keysat
2026-06-18 18:14:30 -05:00
parent c846386c1a
commit 9a3bf9ed86
6 changed files with 120 additions and 5 deletions
+15
View File
@@ -365,6 +365,21 @@
</form>
</dialog>
<dialog id="vision-dialog" class="modal">
<form method="dialog" class="modal-form" id="vision-form">
<h3>Vision check<span id="vc-model" class="muted small"></span></h3>
<p class="muted small">Send an image to the running model and see what it reads back — handy for confirming OCR on a real photo (e.g. a business card). Sent over the same <code>/v1</code> endpoint your apps use; nothing leaves the LAN.</p>
<label class="modal-row"><span>Image</span><input type="file" id="vc-file" accept="image/*"></label>
<img id="vc-preview" class="vc-preview hidden" alt="selected image preview">
<label class="modal-row"><span>Prompt</span><textarea id="vc-prompt" rows="3">This is a business card. Extract every field as JSON with keys: name, title, company, phone, email, website, address. Output only the JSON.</textarea></label>
<div class="vc-result hidden" id="vc-result"></div>
<div class="modal-actions">
<button type="button" id="vc-run" class="btn primary">Run</button>
<button class="btn" value="cancel">Close</button>
</div>
</form>
</dialog>
<section id="download-panel" class="download-panel hidden">
<div class="download-form" id="download-form">
<label class="dl-row">