v1.78.5-stable - Native OCR Support
Deploy this version
- Docker
  docker run \
    -e STORE_MODEL_IN_DB=True \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:v1.78.5-stable
- Pip
  pip install litellm==1.78.5
Key Highlights
- Native OCR Endpoints - Native /v1/ocr endpoint support with cost tracking for Mistral OCR and Azure AI OCR
- Global Vendor Discounts - Specify global vendor discount percentages for accurate cost tracking and reporting
- Team Spending Reports - Team admins can now export detailed spending reports for their teams
- Claude Haiku 4.5 - Day 0 support for Claude Haiku 4.5 across Bedrock, Vertex AI, and OpenRouter with 200K context window
- GPT-5-Codex - Support for GPT-5-Codex via Responses API on OpenAI and Azure
- Performance Improvements - Major router optimizations: O(1) model lookups, 10-100x faster shallow copy of model aliases, 30-40% faster timing calls, and hash generation reduced from O(n²) to O(n)
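To make the vendor discount highlight concrete, here is a minimal arithmetic sketch of how a percentage discount changes a tracked cost. The function name `apply_vendor_discount` is hypothetical, and these notes do not show the actual LiteLLM setting, so treat this as an illustration of the calculation only.

```python
def apply_vendor_discount(list_price_cost: float, discount_percent: float) -> float:
    """Return the cost after a percentage discount (hypothetical helper, not a LiteLLM API)."""
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount_percent must be between 0 and 100")
    return list_price_cost * (1 - discount_percent / 100)


# A request that costs $0.02 at list price, with an assumed 5% vendor discount.
discounted = apply_vendor_discount(0.02, 5)
print(round(discounted, 6))
```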
New Models / Updated Models
New Model Support
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features | 
|---|---|---|---|---|---|
| Anthropic | claude-haiku-4-5 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching, computer use | 
| Anthropic | claude-haiku-4-5-20251001 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching, computer use | 
| Bedrock | anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching | 
| Bedrock | global.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching | 
| Bedrock | jp.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (JP Cross-Region) | 
| Bedrock | us.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (US region) | 
| Bedrock | eu.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (EU region) | 
| Bedrock | apac.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (APAC region) | 
| Bedrock | au.anthropic.claude-haiku-4-5-20251001-v1:0 | 200K | $1.10 | $5.50 | Chat, reasoning, vision, function calling, prompt caching (AU region) | 
| Vertex AI | vertex_ai/claude-haiku-4-5@20251001 | 200K | $1.00 | $5.00 | Chat, reasoning, vision, function calling, prompt caching | 
| OpenAI | gpt-5 | 272K | $1.25 | $10.00 | Chat, responses API, reasoning, vision, function calling, prompt caching | 
| OpenAI | gpt-5-codex | 272K | $1.25 | $10.00 | Responses API mode | 
| Azure | azure/gpt-5-codex | 272K | $1.25 | $10.00 | Responses API mode | 
| Gemini | gemini-2.5-flash-image | 32K | $0.30 | $2.50 | Image generation (GA - Nano Banana) - $0.039/image | 
| ZhipuAI | glm-4.6 | - | - | - | Chat completions | 
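As a worked example of the per-token pricing in the table, the sketch below estimates the list-price cost of a single request. The prices are copied from the table; the dictionary and the function name `estimate_cost` are illustrative and are not part of LiteLLM.

```python
# USD per 1M tokens (input, output), copied from the table above.
PRICES_PER_MILLION = {
    "claude-haiku-4-5": (1.00, 5.00),
    "us.anthropic.claude-haiku-4-5-20251001-v1:0": (1.10, 5.50),
    "gpt-5-codex": (1.25, 10.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at list price (illustrative helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# 10,000 input tokens and 2,000 output tokens on claude-haiku-4-5:
# (10,000 * $1.00 + 2,000 * $5.00) / 1,000,000 = $0.02
print(estimate_cost("claude-haiku-4-5", 10_000, 2_000))
```

The same request on the US regional Bedrock model uses the $1.10 / $5.50 rates, so it comes out about 10% higher.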
Features
- OpenAI
  - GPT-5 return reasoning content via /chat/completions + GPT-5-Codex working on Claude Code - PR #15441
- Bedrock
  - Add anthropic.claude-haiku-4-5-20251001-v1:0 on Bedrock, VertexAI - PR #15581
  - Add Claude Haiku 4.5 support for Bedrock global and US regions - PR #15650
  - Add Claude Haiku 4.5 support for other Bedrock regions - PR #15653
  - Add JP Cross-Region Inference jp.anthropic.claude-haiku-4-5-20251001 - PR #15598
  - Fix: bedrock-pricing-geo-inregion-cross-region / add Global Cross-Region Inference - PR #15685
  - Fix: Support us-gov prefix for AWS GovCloud Bedrock models - PR #15626
  - Fix: GPT-OSS in Bedrock now supports streaming. Revert fake streaming - PR #15668
- Ollama
  - Fix(ollama/chat): correctly map reasoning_effort to think in requests - PR #15465
- CometAPI
  - Fix(cometapi): improve CometAPI provider support (embeddings, image generation, docs) - PR #15591
- Lemonade
  - Add new models to the lemonade provider - PR #15554
- WatsonX
  - Fix(pricing): fix pricing for various models in the watsonx model family - PR #15670
- ZhipuAI
  - Add glm-4.6 model to pricing configuration - PR #15679
- Vertex AI
  - Add Vertex AI Discovery Engine Rerank Support - PR #15532
 
Bug Fixes
- 
- Fix: Pricing for Claude Sonnet 4.5 in US regions is 10x too high - PR #15374
 
- 
- Change gpt-5-codex support in model_price json - PR #15540
 
- 
- Fix filtering headers for signature calcs - PR #15590
 
- General - Add native reasoning and streaming support flag for gpt-5-codex - PR #15569
 
LLM API Endpoints
Features
- 
- Feat: Add native litellm.ocr() functions - PR #15567
- Feat: Add /ocr route on LiteLLM AI Gateway - Adds support for native Mistral OCR calling - PR #15571
- Feat: Add Azure AI Mistral OCR Integration - PR #15572
- Feat: Native /ocr endpoint support - PR #15573
- Feat: Add Cost Tracking for /ocr endpoints - PR #15678
 
- 
- Fix: Dall-e-2 for Image Edits API - PR #15604
 
- 
- Feat: Allow calling /invoke, /converse routes through AI Gateway + models on config.yaml - PR #15618
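The OCR route listed above can be exercised with a plain HTTP request. The sketch below only builds the request against the proxy started by the Docker command in these notes (port 4000). The model name, the placeholder key, and the request body shape (borrowed from Mistral's OCR API) are assumptions that these notes do not confirm, so check the LiteLLM OCR documentation before relying on them.

```python
import json
import urllib.request

# Assumptions, not confirmed by these release notes:
PROXY_BASE_URL = "http://localhost:4000"      # proxy from the docker run command
API_KEY = "sk-1234"                           # placeholder virtual key
DEFAULT_MODEL = "mistral/mistral-ocr-latest"  # assumed model name


def build_ocr_request(document_url: str, model: str = DEFAULT_MODEL) -> urllib.request.Request:
    """Build (but do not send) a POST request to the proxy's /v1/ocr route."""
    payload = {
        "model": model,
        # Assumed body shape, modeled on Mistral's OCR API.
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        url=f"{PROXY_BASE_URL}/v1/ocr",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_ocr_request("https://example.com/sample.pdf")
print(request.get_method(), request.full_url)

# To send it against a running proxy:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```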
 
Bugs
- General
Management Endpoints / UI
Features
- Virtual Keys
- Teams - Feat: Allow Team Admins to export a report of the team spending - PR #15542
- Passthrough - Feat: Passthrough - allow admin to give access to specific passthrough endpoints - PR #15401
- SCIM v2 - Feat(scim_v2.py): if group.id doesn't exist, use external id + Passthrough - ensure updates and deletions persist across instances - PR #15276
- SSO
Logging / Guardrail / Prompt Management Integrations
Guardrails
- General
- Pillar Security
  - Feature: update Pillar Security integration to support no-persistence mode in LiteLLM proxy - PR #15599
 
Prompt Management
- General
- Small fix code snippet custom_prompt_management.md - PR #15544
 
Spend Tracking, Budgets and Rate Limiting
- Cost Tracking
- Budgets - Fix: improve budget clarity - PR #15682
 
Performance / Loadbalancing / Reliability improvements
- Router Optimizations
  - Perf(router): use shallow copy instead of deepcopy for model aliases - 10-100x faster than deepcopy on nested dict structures - PR #15576
  - Perf(router): optimize string concatenation in hash generation - Improves time complexity from O(n²) to O(n) - PR #15575
  - Perf(router): optimize model lookups with O(1) data structures - Replace O(n) scans with index map lookups - PR #15578
  - Perf(router): optimize model lookups with O(1) index maps - Use model_id_to_deployment_index_map and model_name_to_deployment_indices for instant lookups - PR #15574
  - Perf(router): optimize timing functions in completion hot path - Use time.perf_counter() for duration measurements and time.monotonic() for timeout calculations, providing 30-40% faster timing calls - PR #15617
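The index-map and shallow-copy items above can be illustrated with a small standalone sketch. The two map names come from the release notes; the deployment layout and helper functions are assumptions for illustration and do not reproduce the router's actual code.

```python
import copy

# Hypothetical deployment list; the real router's data layout may differ.
deployments = [
    {"model_name": "gpt-5", "model_info": {"id": "a1"}, "litellm_params": {"model": "openai/gpt-5"}},
    {"model_name": "gpt-5", "model_info": {"id": "b2"}, "litellm_params": {"model": "azure/gpt-5"}},
    {"model_name": "claude-haiku-4-5", "model_info": {"id": "c3"}, "litellm_params": {"model": "anthropic/claude-haiku-4-5"}},
]

# Build both index maps once, in a single O(n) pass.
model_id_to_deployment_index_map = {}
model_name_to_deployment_indices = {}
for index, deployment in enumerate(deployments):
    model_id_to_deployment_index_map[deployment["model_info"]["id"]] = index
    model_name_to_deployment_indices.setdefault(deployment["model_name"], []).append(index)


def get_deployment_by_id(model_id):
    """O(1) dictionary lookup instead of scanning the whole list."""
    index = model_id_to_deployment_index_map.get(model_id)
    return None if index is None else deployments[index]


def get_deployments_by_name(model_name):
    """Return every deployment registered under a model name."""
    return [deployments[i] for i in model_name_to_deployment_indices.get(model_name, [])]


# Shallow copy: a new top-level dict whose nested values are shared with the original.
model_aliases = {"fast": {"model": "claude-haiku-4-5"}}
aliases_copy = copy.copy(model_aliases)
aliases_copy["smart"] = {"model": "gpt-5"}  # does not touch the original dict

print(get_deployment_by_id("b2")["litellm_params"]["model"])
print(len(get_deployments_by_name("gpt-5")))
print("smart" in model_aliases, aliases_copy["fast"] is model_aliases["fast"])
```

A shallow copy is only a safe replacement for `deepcopy` when the nested values are not mutated afterwards, because both copies point at the same inner dictionaries.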
 
- SSL/TLS Performance - Feat(ssl): add configurable ECDH curve for TLS performance - Configure via ssl_ecdh_curve setting to disable PQC on OpenSSL 3.x for better performance - PR #15617
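For readers unfamiliar with pinning an ECDH curve, the sketch below shows the underlying idea with Python's standard `ssl` module. How LiteLLM applies its `ssl_ecdh_curve` setting internally is not described in these notes, and the curve name `prime256v1` is only an example value, so this is an assumption-laden illustration rather than the project's implementation.

```python
import ssl
from typing import Optional


def build_client_context(ecdh_curve: Optional[str] = None) -> ssl.SSLContext:
    """Create a default client TLS context, optionally pinned to one ECDH curve."""
    context = ssl.create_default_context()
    if ecdh_curve:
        # Restricts ECDH key exchange to the named curve.
        # Raises ValueError if the curve name is unknown to OpenSSL.
        context.set_ecdh_curve(ecdh_curve)
    return context


# "prime256v1" is an example curve name, not a recommendation from these notes.
context = build_client_context("prime256v1")
print(type(context).__name__)
```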
 
- Token Counter - Fix(token-counter): extract model_info from deployment for custom_tokenizer - PR #15680
- Performance Metrics - Add: perf summary - PR #15458
- CI/CD - Fix: CI/CD - Missing env key & Linter type error - PR #15606
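The timing change listed under Router Optimizations relies on two standard-library clocks: `time.perf_counter()` for measuring durations and `time.monotonic()` for deadlines. A minimal sketch of that split follows; the helper `run_with_deadline` is hypothetical and is not taken from the router.

```python
import time


def run_with_deadline(work, timeout_seconds: float):
    """Run work(), reporting its duration and whether the deadline had passed afterwards."""
    deadline = time.monotonic() + timeout_seconds  # monotonic clock: unaffected by wall-clock changes
    start = time.perf_counter()                    # high-resolution clock for durations
    result = work()
    elapsed = time.perf_counter() - start
    timed_out = time.monotonic() > deadline
    return result, elapsed, timed_out


result, elapsed, timed_out = run_with_deadline(lambda: sum(range(1000)), timeout_seconds=5.0)
print(result, timed_out)
```

Neither clock goes backwards when the system time is adjusted, which is why they suit duration and timeout logic better than `time.time()`.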
 
Documentation Updates
- Provider Documentation
- General - Fixed a few typos - PR #15267
 
New Contributors
- @jlan-nl made their first contribution in PR #15374
- @ImadSaddik made their first contribution in PR #15267
- @huangyafei made their first contribution in PR #15472
- @mubashir1osmani made their first contribution in PR #15468
- @kowyo made their first contribution in PR #15465
- @dhruvyad made their first contribution in PR #15448
- @davizucon made their first contribution in PR #15544
- @FelipeRodriguesGare made their first contribution in PR #15540
- @ndrsfel made their first contribution in PR #15557
- @shinharaguchi made their first contribution in PR #15598
- @TensorNull made their first contribution in PR #15591
- @TeddyAmkie made their first contribution in PR #15583
- @aniketmaurya made their first contribution in PR #15580
- @eddierichter-amd made their first contribution in PR #15554
- @konekohana made their first contribution in PR #15535
- @Classic298 made their first contribution in PR #15495
- @afogel made their first contribution in PR #15599
- @orolega made their first contribution in PR #15633
- @LucasSugi made their first contribution in PR #15634
- @uc4w6c made their first contribution in PR #15619
- @Sameerlite made their first contribution in PR #15658
- @yuneng-jiang made their first contribution in PR #15672
- @Nikro made their first contribution in PR #15680

