v0.27.2:0 - vision check tool + mark Qwen3.6 vision-capable

Qwen3.6-35B-A3B is multimodal (its vision tower is on disk) but was labelled text-only. Mark it [vision, reasoning] and add a 'Vision check' button to the card of a running vision-capable model: upload an image + prompt -> existing /v1 passthrough proxy -> show the model's text. Confirmed 7/7 fields extracted from a business card. Also records the Gemma-4-26B deferral + research findings.
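
A rough sketch of the image + prompt -> /v1 -> text flow described above, not the code in this commit. It assumes the passthrough proxy accepts an OpenAI-style chat request at /v1/chat/completions with the image sent as a data: URL; the path, the payload shape, and the helper names (buildVisionRequest, extractText, readAsDataUrl, runVisionCheck) are all hypothetical. Only the element ids (vc-file, vc-prompt, vc-result) are taken from the dialog markup in the diff.

```javascript
// Hypothetical helper: builds an OpenAI-style chat request carrying one image.
// The payload shape is an assumption; the commit only says the request goes
// through the existing /v1 passthrough proxy.
function buildVisionRequest(model, prompt, imageDataUrl) {
  return {
    model,
    messages: [{
      role: "user",
      content: [
        { type: "text", text: prompt },
        { type: "image_url", image_url: { url: imageDataUrl } },
      ],
    }],
  };
}

// Pulls the assistant text out of an OpenAI-style response, or "" if absent.
function extractText(response) {
  const content = response?.choices?.[0]?.message?.content;
  return typeof content === "string" ? content : "";
}

// Reads a File as a data: URL (browser only).
function readAsDataUrl(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });
}

// Browser-only wiring; element ids come from the dialog markup in this diff.
async function runVisionCheck(model) {
  const file = document.getElementById("vc-file").files[0];
  if (!file) return; // nothing selected yet
  const prompt = document.getElementById("vc-prompt").value;
  const body = buildVisionRequest(model, prompt, await readAsDataUrl(file));
  const res = await fetch("/v1/chat/completions", { // assumed path
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const out = document.getElementById("vc-result");
  out.textContent = res.ok ? extractText(await res.json()) : `HTTP ${res.status}`;
  out.classList.remove("hidden");
}
```

Keeping the request builder and the response parser as pure functions means they can be checked without a browser; runVisionCheck would be attached to the vc-run button's click handler.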
2026-06-18 18:14:30 -05:00
parent c846386c1a
commit 9a3bf9ed86
6 changed files with 120 additions and 5 deletions
@@ -365,6 +365,21 @@
        </form>
      </dialog>

+      <dialog id="vision-dialog" class="modal">
+        <form method="dialog" class="modal-form" id="vision-form">
+          <h3>Vision check<span id="vc-model" class="muted small"></span></h3>
+          <p class="muted small">Send an image to the running model and see what it reads back — handy for confirming OCR on a real photo (e.g. a business card). Sent over the same <code>/v1</code> endpoint your apps use; nothing leaves the LAN.</p>
+          <label class="modal-row"><span>Image</span><input type="file" id="vc-file" accept="image/*"></label>
+          <img id="vc-preview" class="vc-preview hidden" alt="selected image preview">
+          <label class="modal-row"><span>Prompt</span><textarea id="vc-prompt" rows="3">This is a business card. Extract every field as JSON with keys: name, title, company, phone, email, website, address. Output only the JSON.</textarea></label>
+          <div class="vc-result hidden" id="vc-result"></div>
+          <div class="modal-actions">
+            <button type="button" id="vc-run" class="btn primary">Run</button>
+            <button class="btn" value="cancel">Close</button>
+          </div>
+        </form>
+      </dialog>
+
      <section id="download-panel" class="download-panel hidden">
        <div class="download-form" id="download-form">
          <label class="dl-row">