v0.27.2:0 - vision check tool + mark Qwen3.6 vision-capable
Qwen3.6-35B-A3B is multimodal (its vision tower is on disk) but was labelled text-only. Mark it [vision, reasoning] and add a 'Vision check' button to the card of a running vision-capable model: upload an image and a prompt, send them through the existing /v1 passthrough proxy, and show the model's text reply. Confirmed 7/7 fields extracted from a business card. Also records the Gemma-4-26B deferral and research findings.
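The flow in the message (image + prompt, through the /v1 proxy, back to text) could look roughly like the sketch below. It is not the code from this commit: it assumes the proxy accepts OpenAI-style chat completions at `/v1/chat/completions`, and the function names `buildVisionRequest`, `extractText`, and `runVisionCheck` are hypothetical.

```javascript
// Sketch only. Assumptions (not taken from this commit): the /v1 proxy
// speaks the OpenAI-style chat-completions format, and the image is sent
// inline as a data: URL. All function names here are hypothetical.

/** Build a chat-completions body carrying one prompt and one image. */
function buildVisionRequest(model, prompt, imageDataUrl) {
  if (!imageDataUrl.startsWith("data:image/")) {
    throw new Error("expected a data:image/... URL");
  }
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      },
    ],
  };
}

/** Pull the assistant's text out of a chat-completions response. */
function extractText(response) {
  const choice = response && response.choices && response.choices[0];
  const content = choice && choice.message && choice.message.content;
  return typeof content === "string" ? content : "";
}

/** POST the request to the proxy; an empty baseUrl means same origin. */
async function runVisionCheck(model, prompt, imageDataUrl, baseUrl = "") {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildVisionRequest(model, prompt, imageDataUrl)),
  });
  if (!res.ok) throw new Error(`vision check failed: HTTP ${res.status}`);
  return extractText(await res.json());
}
```

In the page, the data URL would typically come from `FileReader.readAsDataURL` on the selected file, and the returned text would be written into the result element with `textContent` rather than `innerHTML`, so model output is never interpreted as markup.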
@@ -365,6 +365,21 @@
</form>
</dialog>
<dialog id="vision-dialog" class="modal">
<form method="dialog" class="modal-form" id="vision-form">
<h3>Vision check<span id="vc-model" class="muted small"></span></h3>
<p class="muted small">Send an image to the running model and see what it reads back — handy for confirming OCR on a real photo (e.g. a business card). Sent over the same <code>/v1</code> endpoint your apps use; nothing leaves the LAN.</p>
<label class="modal-row"><span>Image</span><input type="file" id="vc-file" accept="image/*"></label>
<img id="vc-preview" class="vc-preview hidden" alt="selected image preview">
<label class="modal-row"><span>Prompt</span><textarea id="vc-prompt" rows="3">This is a business card. Extract every field as JSON with keys: name, title, company, phone, email, website, address. Output only the JSON.</textarea></label>
<div class="vc-result hidden" id="vc-result"></div>
<div class="modal-actions">
<button type="button" id="vc-run" class="btn primary">Run</button>
<button class="btn" value="cancel">Close</button>
</div>
</form>
</dialog>
<section id="download-panel" class="download-panel hidden">
<div class="download-form" id="download-form">
<label class="dl-row">
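The default prompt in the dialog asks for seven keys, and the commit message reports 7/7 on a business card. A check of that kind could be scripted along the lines below. This is a hypothetical helper, not part of the commit: the name `scoreCardReply`, the tolerance for a Markdown code fence around the JSON, and the rule that a field counts only when it is a non-empty string are all assumptions.

```javascript
// Hypothetical helper: score a model reply against the seven keys named
// in the dialog's default prompt. Not part of this commit.
const EXPECTED_KEYS = [
  "name", "title", "company", "phone", "email", "website", "address",
];

function scoreCardReply(replyText) {
  const total = EXPECTED_KEYS.length;
  const nothingFound = { found: 0, total, missing: [...EXPECTED_KEYS] };

  // Assumption: some models wrap the JSON in a Markdown code fence even
  // when told to output only JSON, so strip one if present.
  const stripped = replyText
    .replace(/^\s*`{3}(?:json)?\s*/i, "")
    .replace(/\s*`{3}\s*$/, "");

  let parsed;
  try {
    parsed = JSON.parse(stripped);
  } catch (err) {
    return nothingFound; // reply was not valid JSON
  }
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return nothingFound; // valid JSON, but not an object of fields
  }

  // A field counts as found only if it is a non-empty string.
  const missing = EXPECTED_KEYS.filter((key) => {
    const value = parsed[key];
    return typeof value !== "string" || value.trim() === "";
  });
  return { found: total - missing.length, total, missing };
}
```

A reply that yields `found === total` corresponds to the 7/7 result quoted in the message; anything lower lists the absent keys in `missing`, which is usually enough to tell an OCR miss from a formatting problem.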