👁️ Vision Council

Multimodal models that understand images, screenshots, and other visual content. They are suited to OCR, diagram analysis, and visual Q&A.

📷 Use case: Screenshot review, OCR, visual analysis

Models: 7
Providers: 6
Max tok/s: 225
Best price: Free

📊 Council Models

Model	Provider	Speed	Rating	Context	Price	Notes
llama-4-scout-17b	Groq	225 t/s	★★★½	128K	Free	Fastest vision model
step-3-vl	StepFun	93 t/s	★★★★★	32K	$9/mo	StepFun vision-language model
claude-sonnet-4-5	Venice	35 t/s	★★★★½	200K	$20/mo	Uncensored, best vision quality
google/gemini-2.5-pro:free	OpenRouter	46 t/s	★★★★★	1M	Free	1M context, free tier
google/gemini-2.5-pro	ZenMux	37 t/s	★★★★★	1M	Free	1M context, 700 req/day
gemma-4-31b-turbo-tee	Chutes TEE	28 t/s	★★★½	128K	$9/mo	TEE-secured vision model
kimi-k2.5-tee	Chutes TEE	28 t/s	★★★★★	128K	$9/mo	Vision + video understanding

⚖️ Pros & Cons

✅ Pros

Gemini 2.5 Pro has a 1M-token context window, enough to analyze entire documents
Free options available (Groq llama-4-scout, OpenRouter, ZenMux)
Claude Sonnet 4.5 via Venice is uncensored for all content
TEE models for privacy-sensitive image analysis
kimi-k2.5 supports video frame understanding

⚠️ Cons

Vision models are generally slower than text-only models
Image-analysis quality varies significantly across providers
TEE vision models are the slowest (28 t/s)
StepFun's context window is limited (32K)

🚀 Try This Council

Send an image to a Vision Council model for analysis. The request below targets Gemini 2.5 Pro through OpenRouter.

# Vision Council: Analyze an image with Gemini 2.5 Pro
curl -X POST https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro:free",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe what you see in this image in detail."},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }],
    "max_tokens": 1000
  }'
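
To put the same question to several council models, the single request above can be wrapped in a small loop. The following is a minimal sketch, not an official client. The function names (build_payload, ask_model, ask_council) and the "endpoint|key|model" entry format are hypothetical, and it assumes every provider you pass in exposes an OpenAI-compatible chat completions endpoint, as OpenRouter does in the example above.

```shell
#!/bin/sh
# Sketch: send one image prompt to several OpenAI-compatible vision endpoints.
# All function names here are illustrative, not part of any provider's tooling.

# Build the chat-completions JSON body for one model.
# Assumes model, prompt and image URL contain no double quotes or backslashes.
build_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":[{"type":"text","text":"%s"},{"type":"image_url","image_url":{"url":"%s"}}]}],"max_tokens":1000}' \
    "$1" "$2" "$3"
}

# POST the payload to one endpoint and print the raw JSON response.
# Arguments: endpoint, API key, model, prompt, image URL.
ask_model() {
  curl -sS -X POST "$1" \
    -H "Authorization: Bearer $2" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$3" "$4" "$5")"
}

# Query each "endpoint|key|model" entry in turn, printing a header per model.
# Arguments: prompt, image URL, then one or more entries.
ask_council() {
  council_prompt="$1"
  council_image="$2"
  shift 2
  for entry in "$@"; do
    endpoint="${entry%%|*}"   # text before the first "|"
    rest="${entry#*|}"        # text after the first "|"
    api_key="${rest%%|*}"
    model="${rest#*|}"
    echo "=== $model ==="
    ask_model "$endpoint" "$api_key" "$model" "$council_prompt" "$council_image"
    echo
  done
}
```

After sourcing the functions, a call might look like this, with one entry per model; the OpenRouter entry uses the values from the example above, and any further entries need that provider's own endpoint, key, and model ID:

ask_council "Describe what you see in this image in detail." "https://example.com/photo.jpg" "https://openrouter.ai/api/v1/chat/completions|$OPENROUTER_API_KEY|google/gemini-2.5-pro:free"

The requests run one after another, which keeps the output easy to compare and stays gentle on free-tier rate limits.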
