[Preview] v1.83.7.rc.1 - Per-User MCP OAuth, Team Spend Logs RBAC

Deploy this version​

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  docker.litellm.ai/berriai/litellm:main-v1.83.7.rc.1
```
Warning: Breaking change. The default Prometheus LATENCY_BUCKETS set has been reduced from 35 to 18 boundaries to lower Prometheus cardinality. Dashboards and PromQL queries that reference specific le= bucket values may stop matching. Review your alerts and dashboards before upgrading, and use the LATENCY_BUCKETS env override to restore the previous boundaries if needed (PR #25527).
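If you need to keep existing dashboards matching while you migrate, the override can be set at deploy time. This is a sketch only: LATENCY_BUCKETS is assumed to take a comma-separated list of upper bounds in seconds, and the values below are placeholders, not the previous 35-boundary default set. Substitute your own boundaries and check the LiteLLM Prometheus docs for the authoritative value format.

```shell
# Illustrative only: assumed format is a comma-separated list of histogram
# upper bounds in seconds. These values are placeholders, NOT the old default.
LATENCY_BUCKETS="0.05,0.1,0.25,0.5,1,2.5,5,10,30,60"

# docker run \
#   -e STORE_MODEL_IN_DB=True \
#   -e LATENCY_BUCKETS="$LATENCY_BUCKETS" \
#   -p 4000:4000 \
#   docker.litellm.ai/berriai/litellm:main-v1.83.7.rc.1

# Sanity-check the boundary count before deploying
echo "$LATENCY_BUCKETS" | tr ',' '\n' | wc -l
```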

Key Highlights​

  • Per-User MCP OAuth Tokens — Each end-user can now hold their own OAuth tokens for interactive MCP server flows, isolating credentials across users
  • Team Spend Logs RBAC — Teams with the /spend/logs permission can view team-wide spend logs from the UI and API
  • Bulk Team Permissions API — New POST /team/permissions_bulk_update endpoint for updating member permissions across many teams in one call
  • Azure Container Routing — Container routing, managed container IDs, and delete-response parsing for Azure Responses API containers
  • UI E2E Test Suite — Playwright-based end-to-end tests for proxy admin, team, and key management flows now run in CI

New Models / Updated Models​

New Model Support (14 new models)​

| Provider | Model | Context Window | Input ($/1M tokens) | Output ($/1M tokens) | Features |
|----------|-------|----------------|---------------------|----------------------|----------|
| AWS Bedrock (GovCloud) | bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0 | 200K | $3.30 | $16.50 | Chat, vision, tool use, prompt caching, reasoning |
| AWS Bedrock (GovCloud) | bedrock/us-gov-west-1/anthropic.claude-sonnet-4-5-20250929-v1:0 | 200K | $3.30 | $16.50 | Chat, vision, tool use, prompt caching, reasoning |
| AWS Bedrock (GovCloud) | us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 | 200K | $3.30 | $16.50 | Bedrock Converse, with above-200K tier pricing |
| Baseten | baseten/MiniMaxAI/MiniMax-M2.5 | - | $0.30 | $1.20 | Chat |
| Baseten | baseten/nvidia/Nemotron-120B-A12B | - | $0.30 | $0.75 | Chat |
| Baseten | baseten/zai-org/GLM-5 | - | $0.95 | $3.15 | Chat |
| Baseten | baseten/zai-org/GLM-4.7 | - | $0.60 | $2.20 | Chat |
| Baseten | baseten/zai-org/GLM-4.6 | - | $0.60 | $2.20 | Chat |
| Baseten | baseten/moonshotai/Kimi-K2.5 | - | $0.60 | $3.00 | Chat |
| Baseten | baseten/moonshotai/Kimi-K2-Thinking | - | $0.60 | $2.50 | Chat |
| Baseten | baseten/moonshotai/Kimi-K2-Instruct-0905 | - | $0.60 | $2.50 | Chat |
| Baseten | baseten/openai/gpt-oss-120b | - | $0.10 | $0.50 | Chat |
| Baseten | baseten/deepseek-ai/DeepSeek-V3.1 | - | $0.50 | $1.50 | Chat |
| Baseten | baseten/deepseek-ai/DeepSeek-V3-0324 | - | $0.77 | $0.77 | Chat |

Features​

  • AWS Bedrock
    • AWS GovCloud mode support (us-gov prefix routing) - PR #25254
    • Update GovCloud Claude Sonnet 4.5 pricing, raise max_tokens to 8192, and add prompt-caching costs
    • Skip dummy user continue message when assistant prefix prefill is set - PR #25419
    • Avoid double-counting cache tokens in Anthropic Messages streaming usage - PR #25517
  • Anthropic
    • Support advisor_20260301 tool type - PR #25525
  • Triton
    • Embedding usage estimation for self-hosted Triton responses - PR #25345
  • Baseten
    • Add pricing entries for 11 new Baseten-hosted models - PR #25358
  • Google Gemini / Vertex AI
    • Mark applicable Gemini 2.5/3 models with supports_service_tier
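As a sketch of the new GovCloud prefix routing, a request through a locally running proxy might look like the following. The proxy URL and API key are placeholders; the model id is the new GovCloud entry from the pricing table above, and the us-gov region prefix is what triggers GovCloud routing per PR #25254.

```shell
# Hypothetical request: localhost:4000 and sk-1234 are placeholders for your
# proxy URL and virtual key. The model id comes from the new-models table.
BODY='{
  "model": "bedrock/us-gov-east-1/anthropic.claude-sonnet-4-5-20250929-v1:0",
  "messages": [{"role": "user", "content": "hello"}]
}'

# curl -sS http://localhost:4000/v1/chat/completions \
#   -H "Authorization: Bearer sk-1234" \
#   -H "Content-Type: application/json" \
#   -d "$BODY"

# Validate the payload shape locally before sending
echo "$BODY" | python3 -m json.tool > /dev/null && echo "payload ok"
```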

LLM API Endpoints​

Features​

  • Responses API
    • Containers: Azure routing, managed container IDs, and delete-response parsing - PR #25287
    • WebSocket: append ?model= to backend WebSocket URL so model selection routes correctly - PR #25437
  • OpenAI / Files API
    • Add file content streaming support for OpenAI and related utilities - PR #25450
  • A2A
    • Default 60-second timeout when creating an A2A client - PR #25514
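The new file content streaming support can be exercised through the proxy's OpenAI-compatible Files route. This is a sketch under assumptions: the proxy URL, key, and file id below are placeholders, and GET /v1/files/{file_id}/content is the standard OpenAI-compatible path for retrieving file content.

```shell
# Hypothetical call: file-abc123, localhost:4000, and sk-1234 are placeholders.
FILE_ID="file-abc123"
URL="http://localhost:4000/v1/files/${FILE_ID}/content"

# Stream the content to disk rather than buffering it in memory:
# curl -sS "$URL" -H "Authorization: Bearer sk-1234" --output response.bin

echo "$URL"
```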

Bugs​

  • Responses API
    • Map refusal stop_reason to incomplete status in streaming - PR #25498
    • Fix duplicate keyword argument error in Responses WebSocket path - PR #25513
  • Router
    • Pass custom_llm_provider to get_llm_provider for unprefixed model names - PR #25334
    • Fix tag-based routing when encrypted_content_affinity is enabled - PR #25347
  • General
    • Ensure spend/cost logging runs when stream=True for web-search interception - PR #25424

Management Endpoints / UI​

Features​

  • Teams + Organizations
    • New POST /team/permissions_bulk_update endpoint for bulk permission updates across teams - PR #25239
    • New /spend/logs team member permission for viewing team-wide spend logs (UI + RBAC) - PR #25458
    • Align org and team endpoint permission checks - PR #25554
  • Virtual Keys
    • Align /v2/key/info response handling with v1 - PR #25313
  • Authentication / Routing
    • Allow JWT to override OAuth2 routing without requiring global OAuth2 enablement - PR #25252
    • Consolidate route auth for UI and API tokens - PR #25473
    • Use parameterized query for combined_view token lookup - PR #25467
  • Provider Credentials
    • Per-team / per-project credential overrides via model_config metadata - PR #24438
  • UI
    • Improve browser storage handling and Dockerfile consistency - PR #25384
    • Align v1 guardrail and agent list responses with v2 field handling - PR #25478
    • Flush Tremor Tooltip timers in user_edit_view tests - PR #25480
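A bulk permissions update might be issued as below. Only the endpoint path comes from these release notes; the request field names (team_ids, team_member_permissions) are assumptions for illustration, so check the API reference for the actual schema.

```shell
# Hypothetical payload for POST /team/permissions_bulk_update. Field names
# are assumptions; the endpoint path is from the release notes.
BODY='{
  "team_ids": ["team-alpha", "team-beta"],
  "team_member_permissions": ["/spend/logs"]
}'

# curl -sS -X POST http://localhost:4000/team/permissions_bulk_update \
#   -H "Authorization: Bearer sk-1234" \
#   -H "Content-Type: application/json" \
#   -d "$BODY"

# Validate the payload shape locally before sending
echo "$BODY" | python3 -m json.tool > /dev/null && echo "payload ok"
```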

Bugs​

  • Improve input validation on management endpoints - PR #25445
  • Harden file path resolution in skill archive extraction - PR #25475

AI Integrations​

Logging​

  • Ramp
    • Add Ramp as a built-in success callback - PR #23769
  • Langfuse
    • Preserve proxy key-auth metadata on /v1/messages Langfuse traces - PR #25448
  • Prometheus
    • Reduce default LATENCY_BUCKETS from 35 → 18 boundaries (see breaking-change note above) - PR #25527
  • General
    • S3 logging: retry with exponential backoff for transient 503/500 errors - PR #25530
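Enabling the new Ramp callback should follow the usual success_callback pattern in the proxy config. The litellm_settings / success_callback keys are existing LiteLLM config surface; the callback name "ramp" is an assumption based on the PR title, so confirm the registered name against the logging docs.

```shell
# Hypothetical proxy config enabling Ramp as a success callback.
# The callback name "ramp" is assumed from the PR title.
cat > /tmp/litellm_config.yaml <<'EOF'
litellm_settings:
  success_callback: ["ramp"]
EOF

# Then start the proxy against it:
# litellm --config /tmp/litellm_config.yaml

grep -c '"ramp"' /tmp/litellm_config.yaml
```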

Guardrails​

  • Optional skip system message in unified guardrail inputs - PR #25481
  • Inline IAM: apply guardrail support - PR #25241
  • Preserve dict HTTPException.detail and Bedrock context in guardrail errors - PR #25558

Spend Tracking, Budgets and Rate Limiting​

  • Session-TZ-independent date filtering for spend / error log queries - PR #25542
  • Batch-limit stale managed-object cleanup to prevent 300K+ row updates - PR #25258

MCP Gateway​

  • Per-user OAuth token storage for interactive MCP flows - PR #25441
  • Block arbitrary command execution via MCP stdio transport - PR #25343
  • Document missing MCP per-user token environment variables in config_settings - PR #25471

Performance / Loadbalancing / Reliability improvements​

  • Reduce Prometheus latency histogram cardinality (default buckets 35 → 18) - PR #25527
  • S3 retry with exponential backoff for transient errors - PR #25530

Documentation Updates​

  • Add Docker Image Security Guide covering cosign verification and deployment best practices - PR #25439
  • Document April townhall announcements - PR #25537
  • Document missing MCP per-user token env vars - PR #25471
  • Add "Screenshots / Proof of Fix" section to PR template - PR #25564

Full Changelog: https://github.com/BerriAI/litellm/compare/v1.83.3.rc.1...v1.83.7.rc.1