Releases: BerriAI/litellm
v1.31.5
What's Changed
- fix(proxy_server.py): cache master key check by @krrishdholakia in #2483
- fix(proxy_server.py): support cost tracking if general_settings is none by @krrishdholakia in #2382
- [Proxy-High Traffic] Fix using Langfuse in high traffic by @ishaan-jaff in #2484
Full Changelog: v1.31.4...v1.31.5
v1.31.4
[BETA] Thrilled to launch support for Cohere/Command-R on LiteLLM and the LiteLLM Proxy Server 👉 Start here: https://docs.litellm.ai/docs/providers/cohere (see the example after these highlights)
☎️ PR for using cohere tool calling in OpenAI format: #2479
⚡️ LiteLLM Proxy + @langfuse - High Traffic - support 80+/Requests per second with Proxy + Langfuse logging https://docs.litellm.ai/docs/proxy/logging
⚡️ New Models - Azure GPT-Instruct models https://docs.litellm.ai/docs/providers/azure#azure-instruct-models
🛠️ Fix for using DynamoDB + LiteLLM Virtual Keys
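A minimal sketch of calling Command-R through the LiteLLM SDK, assuming a Cohere API key is available; the exact model string to use is documented at the Cohere provider link above.
```python
import os
from litellm import completion

# Assumes a valid Cohere API key; placeholder value shown here.
os.environ["COHERE_API_KEY"] = "your-cohere-api-key"

response = completion(
    model="command-r",  # see the Cohere provider docs above for supported model strings
    messages=[{"role": "user", "content": "Write a one-line greeting"}],
)
print(response.choices[0].message.content)
```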
What's Changed
- (feat) support azure/gpt-instruct models by @ishaan-jaff in #2471
- [New-Model] Cohere/command-r by @ishaan-jaff in #2474
- (fix) patch dynamoDB team_model_alias bug by @ishaan-jaff in #2478
- fix(azure.py): support cost tracking for azure/dall-e-3 by @krrishdholakia in #2475
- fix(openai.py): return model name with custom llm provider for openai-compatible endpoints (e.g. mistral, together ai, etc.) by @krrishdholakia in #2473
Full Changelog: v1.30.2...v1.31.4
v1.31.3
What's Changed
- fix(router.py): support fallbacks / retries with sync embedding calls by @krrishdholakia in #2459
- Make `argon2-cffi` optional by @eladsegal in #2458
- fix(proxy_server.py): prevent user from deleting non-user owned keys by @krrishdholakia in #2448
- LiteLLM - improve memory utilization - don't use inMemCache on Router by @ishaan-jaff in #2461
- [Litellm-Proxy, Router] improve memory usage - don't store messages in memory, previous models in memory by @ishaan-jaff in #2462
- (docs) using litellm router by @ishaan-jaff in #2465
Full Changelog: v1.31.2...v1.31.3
v1.31.2
What's Changed
- Add missing `argon2-cffi` poetry dependency by @eladsegal in #2443
- (fix) fix default dockerfile startup by @ishaan-jaff in #2446
New Contributors
- @eladsegal made their first contribution in #2443
Full Changelog: v1.30.7...v1.31.2
v1.30.7
What's Changed
- fix(bedrock.py): enable claude-3 streaming by @krrishdholakia in #2425
- (docs) LiteLLM Proxy - use port 4000 in examples by @ishaan-jaff in #2416
- fix(proxy_server.py): use argon2 for faster api key checking by @krrishdholakia in #2394
- (fix) use python 3.8 on ci/cd by @ishaan-jaff in #2428
- tests - monitor memory usage with litellm by @ishaan-jaff in #2427
- feat: add cost tracking + caching for `/audio/transcription` calls by @krrishdholakia in #2426
Full Changelog: v1.30.6...v1.30.7
v1.30.6
What's Changed
- [Docs] Deploying litellm - litellm, litellm-database, litellm with redis by @ishaan-jaff in #2423
- feat(helm-chart): redis as cache managed by chart by @debdutdeb in #2420
New Contributors
- @debdutdeb made their first contribution in #2420
Full Changelog: v1.30.5...v1.30.6
v1.30.5
What's Changed
- feat(main.py): support openai transcription endpoints by @krrishdholakia in #2401 (see the sketch after this list)
- load balancing transcription endpoints by @krrishdholakia in #2405
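A minimal sketch of the new transcription support via the LiteLLM SDK, assuming an OpenAI API key is set in the environment; the file path and model name are placeholders.
```python
import litellm

# Assumes OPENAI_API_KEY is set; "audio.mp3" is a placeholder file path.
with open("audio.mp3", "rb") as audio_file:
    transcript = litellm.transcription(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
```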
Full Changelog: v1.30.4...v1.30.5
v1.30.4
1. Incognito Requests - Don't log anything - docs: https://docs.litellm.ai/docs/proxy/enterprise#incognito-requests---dont-log-anything
When `no-log=True`, the request is not logged to any callbacks and LiteLLM writes no server logs for it.
```python
import openai

client = openai.OpenAI(
    api_key="anything",             # proxy api-key
    base_url="http://0.0.0.0:8000"  # litellm proxy
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ],
    extra_body={
        "no-log": True
    }
)

print(response)
```
2. Allow user to pass `messages.name` for claude-3, perplexity
Note: Before this PR, these two providers would raise an error when the `name` param was passed.
LiteLLM SDK
```python
import litellm

response = litellm.completion(
    model="claude-3-opus-20240229",
    messages=[
        {"role": "user", "content": "Hi gm!", "name": "ishaan"},
    ]
)
```
LiteLLM Proxy Server
```python
import openai

client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:8000"
)

response = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[
        {"role": "user", "content": "Hi gm!", "name": "ishaan"},
    ]
)

print(response)
```
3. If the user starts the proxy with run_gunicorn, use cpu_count to select the optimal num_workers
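Purely illustrative sketch of cpu_count-based worker selection; the helper name and the `2n + 1` heuristic are assumptions for illustration, not necessarily what LiteLLM ships (see #2406 for the actual change).
```python
import multiprocessing

# Illustration only: a common gunicorn heuristic for sizing workers from CPU count.
# LiteLLM's actual default is set in PR #2406 and may differ.
def default_num_workers() -> int:
    return (2 * multiprocessing.cpu_count()) + 1

if __name__ == "__main__":
    print(f"num_workers={default_num_workers()}")
```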
4. AzureOpenAI - Pass api_version to litellm proxy per request
Usage - sending a request to litellm proxy
```python
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key="dummy",
    # I want to use a specific api_version, other than the default 2023-07-01-preview
    api_version="2023-05-15",
    # OpenAI Proxy Endpoint
    azure_endpoint="https://openai-proxy.domain.com"
)

response = client.chat.completions.create(
    model="gpt-35-turbo-16k-qt",
    messages=[
        {"role": "user", "content": "Some content"}
    ],
)
```
What's Changed
- [Feat] Support messages.name for claude-3, perplexity ai API by @ishaan-jaff in #2399
- docs: fix yaml typo in proxy/configs.md by @GuillermoBlasco in #2402
- [Feat] LiteLLM - use cpu_count for default num_workers, run locust load test by @ishaan-jaff in #2406
- [FEAT] AzureOpenAI - Pass `api_version` to litellm per request by @ishaan-jaff in #2403
- Add quickstart deploy with k8s by @GuillermoBlasco in #2409
- Update Docs for Kubernetes by @H0llyW00dzZ in #2411
- [FEAT-liteLLM Proxy] Incognito Requests - Don't log anything by @ishaan-jaff in #2408
- Fix Docs Formatting in Website by @H0llyW00dzZ in #2413
New Contributors
- @GuillermoBlasco made their first contribution in #2402
- @H0llyW00dzZ made their first contribution in #2411
Full Changelog: v1.30.3...v1.30.4
v1.30.3
Full Changelog: v1.30.2...v1.30.3
v1.30.2
🚀 LiteLLM Proxy - Proxy 100+ LLMs, Set Budgets and Auto-Scale with the LiteLLM CloudFormation Stack 👉 Start here: https://docs.litellm.ai/docs/proxy/deploy#aws-cloud-formation-stack
⚡️ Load Balancing - View Metrics about selected deployments in server logs
🔎 Proxy view better debug prisma logs / slack alerts
📖 Docs: setting load balancing config https://docs.litellm.ai/docs/proxy/configs (see the Router sketch below)
⭐️ PR for using cross account ARN with Bedrock, Sagemaker: #2179
https://github.com/BerriAI/litellm/releases/tag/v1.30.2
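A minimal sketch of the load-balanced setup these metrics refer to, using the LiteLLM Router SDK; the deployment names, keys, and endpoints are placeholders, and proxy users would express the same deployments in the config YAML linked above.
```python
from litellm import Router

# Two placeholder deployments in the same model group; the Router load-balances across them.
router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",  # model group name callers use
            "litellm_params": {
                "model": "azure/gpt-35-turbo",
                "api_key": "azure-key-placeholder",
                "api_base": "https://example-eu.openai.azure.com/",
                "api_version": "2023-07-01-preview",
            },
        },
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "gpt-3.5-turbo",
                "api_key": "openai-key-placeholder",
            },
        },
    ]
)

response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response)
```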
What's Changed
- test: reintegrate s3 testing by @krrishdholakia in #2386
- (docs) setting load balancing config by @ishaan-jaff in #2388
- feat: add release details to discord notification message by @DanielChico in #2387
- [FIX] Proxy better debug prisma logs by @ishaan-jaff in #2390
- [Feat] Load Balancing - View Metrics about selected deployments in server logs by @ishaan-jaff in #2393
- (feat) LiteLLM AWS CloudFormation Stack Template by @ishaan-jaff in #2391
Full Changelog: v1.30.1...v1.30.2

