# v1.77.5-stable - MCP OAuth 2.0 Support
## Deploy this version

**Docker**

```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
docker.litellm.ai/berriai/litellm:v1.77.5-stable
```

**Pip**

```shell
pip install litellm==1.77.5
```
## Key Highlights

- **MCP OAuth 2.0 Support** - Enhanced authentication for Model Context Protocol integrations
- **Scheduled Key Rotations** - Automated key rotation capabilities for enhanced security
- **New Gemini 2.5 Flash & Flash-lite Models** - Latest September 2025 preview models with improved pricing and features
- **Performance Improvements** - 54% aggregate RPS improvement per instance
## Performance Improvements - 54% RPS Improvement
This release brings a 54% RPS improvement (1,040 → 1,602 RPS, aggregated) per instance.
The improvement comes from fixing O(n²) inefficiencies in the LiteLLM Router, primarily caused by repeated `in` membership checks against large lists inside loops; see the sketch below.
Tests were run with a database-only setup (no cache hits).
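To illustrate the class of fix (a hypothetical sketch, not the Router's actual code): a membership test against a Python list is O(n), so running one per loop iteration makes the whole pass O(n²), while hoisting the collection into a set makes each test O(1).

```python
# Hypothetical example of the O(n^2) pattern and its fix; the names
# below (deployments, cooldown) are illustrative, not LiteLLM internals.
deployments = [f"deployment-{i}" for i in range(10_000)]
cooldown = [f"deployment-{i}" for i in range(0, 10_000, 2)]

# O(n^2): `d not in cooldown` rescans the whole list on every iteration
healthy = [d for d in deployments if d not in cooldown]

# O(n): build a set once, then each membership test is O(1)
cooldown_set = set(cooldown)
healthy = [d for d in deployments if d not in cooldown_set]
```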
### Test Setup
All benchmarks were executed using Locust with 1,000 concurrent users and a ramp-up rate of 500 users/second. The environment was configured to stress the routing layer and eliminate caching as a variable.
**System Specs**

- CPU: 8 vCPUs
- Memory: 32 GB RAM

**Configuration (config.yaml)**

View the complete configuration: gist.github.com/AlexsanderHamir/config.yaml

**Load Script (no_cache_hits.py)**

View the complete load testing script: gist.github.com/AlexsanderHamir/no_cache_hits.py
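For reference, a minimal Locust script of the same general shape as the linked one might look like the following (the endpoint path, model name, and API key are illustrative assumptions, not values taken from the gist):

```python
from locust import HttpUser, task, between

class LiteLLMUser(HttpUser):
    # Short wait keeps 1,000 concurrent users generating sustained load
    wait_time = between(0.1, 0.5)

    @task
    def chat_completion(self):
        # POST to the proxy's OpenAI-compatible chat endpoint
        self.client.post(
            "/chat/completions",
            json={
                "model": "fake-openai-endpoint",
                "messages": [{"role": "user", "content": "hello"}],
            },
            headers={"Authorization": "Bearer sk-1234"},
        )
```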
## New Models / Updated Models

### New Model Support
| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|---|---|---|---|---|---|
| Gemini | gemini-2.5-flash-preview-09-2025 | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
| Gemini | gemini-2.5-flash-lite-preview-09-2025 | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
| Gemini | gemini-flash-latest | 1M | $0.30 | $2.50 | Chat, reasoning, vision, audio |
| Gemini | gemini-flash-lite-latest | 1M | $0.10 | $0.40 | Chat, reasoning, vision, audio |
| DeepSeek | deepseek-chat | 131K | $0.60 | $1.70 | Chat, function calling, caching |
| DeepSeek | deepseek-reasoner | 131K | $0.60 | $1.70 | Chat, reasoning |
| Bedrock | deepseek.v3-v1:0 | 164K | $0.58 | $1.68 | Chat, reasoning, function calling |
| Azure | azure/gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
| OpenAI | gpt-5-codex | 272K | $1.25 | $10.00 | Responses API, reasoning, vision |
| SambaNova | sambanova/DeepSeek-V3.1 | 33K | $3.00 | $4.50 | Chat, reasoning, function calling |
| SambaNova | sambanova/gpt-oss-120b | 131K | $3.00 | $4.50 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-coder-480b-a35b-v1:0 | 262K | $0.22 | $1.80 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-235b-a22b-2507-v1:0 | 262K | $0.22 | $0.88 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-coder-30b-a3b-v1:0 | 262K | $0.15 | $0.60 | Chat, reasoning, function calling |
| Bedrock | qwen.qwen3-32b-v1:0 | 131K | $0.15 | $0.60 | Chat, reasoning, function calling |
| Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-instruct-maas | 262K | $0.15 | $1.20 | Chat, function calling |
| Vertex AI | vertex_ai/qwen/qwen3-next-80b-a3b-thinking-maas | 262K | $0.15 | $1.20 | Chat, function calling |
| Vertex AI | vertex_ai/deepseek-ai/deepseek-v3.1-maas | 164K | $1.35 | $5.40 | Chat, reasoning, function calling |
| OpenRouter | openrouter/x-ai/grok-4-fast:free | 2M | $0.00 | $0.00 | Chat, reasoning, function calling |
| XAI | xai/grok-4-fast-reasoning | 2M | $0.20 | $0.50 | Chat, reasoning, function calling |
| XAI | xai/grok-4-fast-non-reasoning | 2M | $0.20 | $0.50 | Chat, function calling |
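All of these models are callable through the standard LiteLLM completion interface once the relevant provider credentials are configured. For example (the model choice here is illustrative, and a `GEMINI_API_KEY` environment variable is assumed):

```python
import litellm

# Call one of the newly added Gemini preview models via the SDK;
# litellm reads GEMINI_API_KEY from the environment.
response = litellm.completion(
    model="gemini/gemini-2.5-flash-preview-09-2025",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```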
### Features
- Gemini
- XAI
    - Add xai/grok-4-fast models - PR #14833
- Anthropic
- Bedrock
- Vertex AI
- SambaNova
    - Add SambaNova DeepSeek-V3.1 and gpt-oss-120b - PR #14866
- OpenAI
- OpenRouter
    - Add gpt-5 and gpt-5-codex to OpenRouter cost map - PR #14879
- VLLM
    - Fix vLLM passthrough - PR #14778
- Flux
    - Support Flux image edit - PR #14790
### Bug Fixes
- Anthropic
- OpenAI
    - Fix a bug where OpenAI image edit silently ignored multiple images - PR #14893
- VLLM
    - Fix vLLM provider's rerank endpoint path (/v1/rerank → /rerank) - PR #14938; see the sketch below
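For context, the endpoint that fix touches is exercised by calls like the following (a minimal sketch; the model name and api_base are placeholder assumptions, and `hosted_vllm/` is LiteLLM's prefix for self-hosted vLLM servers):

```python
import litellm

# Routed to <api_base>/rerank after PR #14938 (previously /v1/rerank)
response = litellm.rerank(
    model="hosted_vllm/BAAI/bge-reranker-base",
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
    ],
    api_base="http://localhost:8000",
)
print(response.results)
```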
### New Provider Support

- W&B Inference
    - Add W&B Inference to LiteLLM - PR #14416
## LLM API Endpoints

### Features

- General

### Bugs

- General