
goal: Multimodal Attachments for Vision-Enabled Chat (Files & Images → LLM Vision) #217

@locnguyen1986

Description


🎯 Goal

Multimodal Attachments for Vision-Enabled Chat (Files & Images → LLM Vision)

📖 Context

Users want to ask questions about screenshots, diagrams, PDFs, and photos inside chat. We’ll let them attach files/images and have Jan-Server preprocess and feed them to vision-capable LLMs (via vLLM) so the model can “see” and reason over content alongside text.
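A minimal sketch of the kind of request this implies, sent to a vision-capable model behind vLLM's OpenAI-compatible chat completions API. The endpoint URL, API key, and model name are placeholders for illustration, not Jan-Server's actual configuration:

```python
# Sketch: pass an attached image plus the user's question to a vision model via
# vLLM's OpenAI-compatible API. Endpoint and model below are assumed, not decided.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed vLLM server

with open("screenshot.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="Qwen/Qwen2-VL-7B-Instruct",  # placeholder vision model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What does this screenshot show?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```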

✅ Scope

  • Upload & reference: TBD
  • Vision routing: TBD
  • Preprocessing pipeline: TBD
  • Context assembly: TBD
  • Storage & access: TBD

❌ Out of Scope

  • Image generation/editing (diffusion/paint)
  • Video or audio modalities (transcription, VQA on video)
  • Full document management features (versioning, sharing UI)
  • Complex table extraction or layout-aware PDF QA beyond basic OCR + page images

❓ Open questions

  • Model targeting: which default vision model(s) should we ship? Should requests auto-fall back to text-only models, passing only the extracted OCR text, when no vision model is available?
  • Limits: max image resolution (e.g., 2048px longest side), max pages per PDF, max total bytes per request?
  • OCR: which engine by default; language packs; accuracy vs speed tradeoffs; opt-out per org?
  • Security: default denylist/allowlist domains for remote fetches; redaction for sensitive OCR text?
  • Budgeting: how to prioritize pages/regions when context budget is tight (first N pages, heuristic on images with text density)?
  • Caching: should preprocessed artifacts (thumbnails, OCR text) be cached by content hash to cut latency, and with what default TTLs? See the sketch after this list.
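
One way the limits and caching questions above could fit together, sketched under stated assumptions: downscale images to a maximum longest side and cache derived artifacts keyed by a hash of the original bytes. The 2048px limit, the cache directory layout, and the injected `ocr` callable are illustrative assumptions, not decided defaults:

```python
# Sketch: downscale an attached image and cache its thumbnail + OCR text by content hash.
import hashlib
import io
from pathlib import Path
from typing import Callable

from PIL import Image

CACHE_DIR = Path("/tmp/attachment-cache")   # assumed cache location
MAX_LONGEST_SIDE = 2048                     # example limit from the bullet above


def preprocess_image(data: bytes, ocr: Callable[[Image.Image], str]) -> dict:
    """Downscale and OCR an image, reusing cached artifacts when the bytes match."""
    digest = hashlib.sha256(data).hexdigest()
    entry = CACHE_DIR / digest
    thumb_path, text_path = entry / "thumb.png", entry / "ocr.txt"

    if thumb_path.exists() and text_path.exists():
        # Cache hit: identical content hash means the preprocessed artifacts still apply.
        return {"thumbnail": thumb_path, "ocr_text": text_path.read_text()}

    img = Image.open(io.BytesIO(data)).convert("RGB")
    scale = MAX_LONGEST_SIDE / max(img.size)
    if scale < 1:  # only shrink, never upscale
        img = img.resize((round(img.width * scale), round(img.height * scale)))

    entry.mkdir(parents=True, exist_ok=True)
    img.save(thumb_path)
    text = ocr(img)
    text_path.write_text(text)
    return {"thumbnail": thumb_path, "ocr_text": text}
```

TTL enforcement is left out here; an eviction pass over `CACHE_DIR` (or a real cache backend) would answer the open question about default lifetimes.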
