Closed
Labels
bug (Something isn't working)
Description
What happened?
Hi team,
I am deploying the litellm/litellm Docker image and added the gemini-2.5-pro model with vertex_ai credentials. When I call chat/completions with this model and "stream": true, no reasoning content is returned; the stream stalls for a while and then only the final response arrives.
The image I am deploying is: https://hub.docker.com/r/litellm/litellm
version: v1.80.7.dev.3
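For completeness, the model is registered in my proxy config roughly like this (a sketch; the project and location values are placeholders for my real ones):

model_list:
  - model_name: gemini-2.5-pro
    litellm_params:
      model: vertex_ai/gemini-2.5-pro
      vertex_project: my-gcp-project   # placeholder
      vertex_location: us-central1     # placeholder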
The code I am using to call the LiteLLM proxy server:
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "LITELLM_DEPLOYMENT_ENDPOINT"  # base URL of the proxy deployment
MODEL = "gemini-2.5-pro"

url = f"{BASE_URL}/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
payload = {
    "model": MODEL,
    "stream": True,  # expect reasoning + answer deltas as SSE chunks
    "messages": [
        {"role": "user", "content": "How many 'r' in 'cherry'?"},
    ],
}
with requests.post(url, headers=headers, json=payload, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line:
            print(line)  # raw SSE lines; reasoning deltas never show up
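In case it helps triage, this is how I separate thinking tokens from answer tokens on the client side. A minimal sketch, assuming LiteLLM's OpenAI-compatible SSE format where reasoning arrives in delta.reasoning_content (field name as I understand it from the LiteLLM docs); with gemini-2.5-pro via vertex_ai, only the final content deltas ever arrive:

import json

def print_stream(resp):
    # Walk the SSE stream and split thinking tokens from answer tokens.
    for raw in resp.iter_lines(decode_unicode=True):
        if not raw or not raw.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = raw[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if not chunk.get("choices"):
            continue  # e.g. a trailing usage-only chunk
        delta = chunk["choices"][0].get("delta", {})
        # Per the LiteLLM docs (my understanding), thinking tokens land here:
        if delta.get("reasoning_content"):
            print("[reasoning]", delta["reasoning_content"])
        if delta.get("content"):
            print("[answer]", delta["content"])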
Relevant log output

Are you an ML Ops Team?
No
What LiteLLM version are you on?
v1.80.7.dev.3
Twitter / LinkedIn details
No response