Commit d5e8d45

NickLucche authored and xuebwang-amd committed
[Frontend] Skip unnecessary detokenization when token_id is requested (vllm-project#24236)
Signed-off-by: NickLucche <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
1 parent aba2b05 commit d5e8d45
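
This commit removes an unconditional tokenizer.decode() on the logprobs path when the client has asked for raw token IDs. Below is a minimal client-side sketch of that mode, assuming vLLM's OpenAI-compatible server and its return_tokens_as_token_ids extra parameter (which drives should_return_as_token_id in the diff further down); the base URL and model name are placeholders.

    from openai import OpenAI

    # Placeholder endpoint and model; adjust to your deployment.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
        messages=[{"role": "user", "content": "Hello!"}],
        logprobs=True,
        top_logprobs=1,
        # Assumption: vLLM's extra parameter for token-ID output. With it
        # set, tokens come back as "token_id:<id>" and, after this commit,
        # the server no longer detokenizes them first.
        extra_body={"return_tokens_as_token_ids": True},
    )

    # Each logprob entry's token is the literal string "token_id:<id>".
    print(resp.choices[0].logprobs.content[0].token)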

File tree

1 file changed: +2 -1 lines changed

vllm/entrypoints/openai/serving_chat.py

Lines changed: 2 additions & 1 deletion
@@ -1419,9 +1419,10 @@ def _create_chat_logprobs(
             step_top_logprobs = top_logprobs[i]
             if step_top_logprobs is None or step_top_logprobs.get(
                     token_id) is None:
-                token = tokenizer.decode(token_id)
                 if should_return_as_token_id:
                     token = f"token_id:{token_id}"
+                else:
+                    token = tokenizer.decode(token_id)

                 logprobs_content.append(
                     ChatCompletionLogProbsContent(
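
For readers outside the diff context, here is a minimal sketch of the patched branch, assuming a hypothetical _token_repr helper; in serving_chat.py the logic is inlined in _create_chat_logprobs.

    # Hypothetical helper illustrating the change; not part of the actual file.
    def _token_repr(tokenizer, token_id: int,
                    should_return_as_token_id: bool) -> str:
        if should_return_as_token_id:
            # Token-ID mode: emit the id directly; the decoded text would
            # be unused, so skipping tokenizer.decode() saves work per token.
            return f"token_id:{token_id}"
        # Text mode: decode only when the textual token is actually needed.
        return tokenizer.decode(token_id)

Before the patch, tokenizer.decode() ran unconditionally and its result was immediately overwritten in token-ID mode; the added else branch makes the decode conditional.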
