gen_ai inference spark-tts LLM, result error. #2988

@liqianhao111

Description

Using openvino_genai to run inference on the Spark-TTS LLM (Qwen2 architecture). With OVModelForCausalLM the output is correct, but with openvino_genai every generated token is always "!", which is token ID 0 after decoding with the tokenizer.

openvino_genai version: 2025.3.0.0

code:
import time
prompt = "<|task_tts|><|start_content|>量子力学(quantum mechanics)是描述原子尺度及以下微观世界行为的物理学分支,是现代物理学的两大支柱之一。<|end_content|><|start_global_token|><|bicodec_global_2391|><|bicodec_global_1229|><|bicodec_global_1008|><|bicodec_global_3279|><|bicodec_global_273|><|bicodec_global_1590|><|bicodec_global_1232|><|bicodec_global_2201|><|bicodec_global_1356|><|bicodec_global_2700|><|bicodec_global_972|><|bicodec_global_1061|><|bicodec_global_225|><|bicodec_global_3848|><|bicodec_global_3128|><|bicodec_global_3572|><|bicodec_global_758|><|bicodec_global_4095|><|bicodec_global_2290|><|bicodec_global_3325|><|bicodec_global_3445|><|bicodec_global_2683|><|bicodec_global_972|><|bicodec_global_3911|><|bicodec_global_1265|><|bicodec_global_3342|><|bicodec_global_3305|><|bicodec_global_253|><|bicodec_global_113|><|bicodec_global_3665|><|bicodec_global_507|><|bicodec_global_316|><|end_global_token|>"
from transformers import AutoTokenizer
import openvino_genai

pipe = openvino_genai.LLMPipeline("./spark_tts_ov/LLM", device="CPU")
tokenizer = AutoTokenizer.from_pretrained("./spark_tts_ov/LLM")

# warm-up run
t0 = time.perf_counter()
result = pipe.generate("你好", max_new_tokens=10)
print(f"warm up time: {time.perf_counter() - t0:.4f}")

# generate() has no stream= argument; token-level streaming is done
# via a streamer callback that receives each decoded token
def streamer(token):
    print(token)
    print(tokenizer.encode(token))
    return False  # False means "continue generation"

pipe.generate(prompt, streamer=streamer, max_new_tokens=500)
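For reference, openvino_genai streams tokens through a streamer callback passed to `generate()`, not by iterating the return value (which is a plain string, so iteration would yield characters). A minimal pure-Python sketch of that callback contract, with a stub generator standing in for the pipeline (no model needed; token strings are illustrative):

```python
# Stub standing in for LLMPipeline.generate(prompt, streamer=...):
# calls the streamer once per decoded token; a truthy return stops generation.
def run_with_streamer(tokens, streamer):
    for t in tokens:
        if streamer(t):
            break

collected = []

def streamer(token):
    collected.append(token)
    return False  # keep generating

run_with_streamer(["量", "子", "力", "学"], streamer)
print(collected)  # → ['量', '子', '力', '学']
```

Returning `True` from the callback would end generation early, which is how a real streamer can cut off a run mid-stream.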
