Description
Using openvino_genai to run inference on the Spark-TTS LLM (Qwen2 architecture): with OVModelForCausalLM the result is correct, but with openvino_genai every generated token is "!", i.e. token id 0 after tokenization.

openvino_genai version: 2025.3.0.0
Code to reproduce:

```python
import time

from transformers import AutoTokenizer
import openvino_genai

prompt = "<|task_tts|><|start_content|>量子力学(quantum mechanics)是描述原子尺度及以下微观世界行为的物理学分支,是现代物理学的两大支柱之一。<|end_content|><|start_global_token|><|bicodec_global_2391|><|bicodec_global_1229|><|bicodec_global_1008|><|bicodec_global_3279|><|bicodec_global_273|><|bicodec_global_1590|><|bicodec_global_1232|><|bicodec_global_2201|><|bicodec_global_1356|><|bicodec_global_2700|><|bicodec_global_972|><|bicodec_global_1061|><|bicodec_global_225|><|bicodec_global_3848|><|bicodec_global_3128|><|bicodec_global_3572|><|bicodec_global_758|><|bicodec_global_4095|><|bicodec_global_2290|><|bicodec_global_3325|><|bicodec_global_3445|><|bicodec_global_2683|><|bicodec_global_972|><|bicodec_global_3911|><|bicodec_global_1265|><|bicodec_global_3342|><|bicodec_global_3305|><|bicodec_global_253|><|bicodec_global_113|><|bicodec_global_3665|><|bicodec_global_507|><|bicodec_global_316|><|end_global_token|>"

pipe = openvino_genai.LLMPipeline("./spark_tts_ov/LLM", device="CPU")
tokenizer = AutoTokenizer.from_pretrained("./spark_tts_ov/LLM")

# Warm-up run
t0 = time.perf_counter()
result = pipe.generate("你好", max_new_tokens=10)
print(f"warm up time: {time.perf_counter() - t0:.4f}")

# Every streamed token comes back as "!" (token id 0)
for token in pipe.generate(prompt, stream=True, max_new_tokens=500):
    print(token)
    print(tokenizer.encode(token))
```
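For comparison, a minimal sketch of the OVModelForCausalLM baseline that produces correct output from the same export. The model directory, `device` argument, and the helper name `run_reference` are assumptions matching the repro above, not part of the original report; the guard lets the script degrade gracefully when the exported model is absent.

```python
from pathlib import Path

MODEL_DIR = "./spark_tts_ov/LLM"  # assumed to be the same export dir as above

def run_reference(prompt: str, max_new_tokens: int = 10):
    # Imports are local so the script still runs when optimum-intel
    # or the exported model is not available.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
    model = OVModelForCausalLM.from_pretrained(MODEL_DIR, device="CPU")
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt so only the newly generated ids are inspected.
    new_ids = output_ids[0][inputs["input_ids"].shape[1]:]
    return new_ids.tolist()

if Path(MODEL_DIR).exists():
    ids = run_reference("你好")
    print(ids)
    # If the genai bug also reproduced here, every id would be 0.
    print("all-zero ids:", all(i == 0 for i in ids))
else:
    print("model dir missing, skipping baseline run")
```

If this path prints non-zero token ids while the openvino_genai pipeline above yields only id 0, the exported IRs themselves are fine and the difference lies in the genai pipeline.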