First of all, I’d like to express my sincere gratitude to the Unsloth team for providing such an accessible environment where everyone can train and run language models, even with limited hardware resources. 🙇‍♂️
Thanks to your excellent notebook and documentation, I was able to train a TTS model smoothly, even as a junior developer with relatively little background knowledge in LLMs.
I trained a TTS model (orpheus-3b-0.1-ft) on a custom interview dataset that contains a large number of filler sounds such as “ah”, “um”, and “eh”.
As a result, the trained model sometimes unintentionally generates filler sounds in sentences where they don’t belong.
That is why I’m posting this discussion to kindly ask for your advice.
In this case, would it help the model learn better if I explicitly annotated filler sounds in the text dataset with custom tokens such as <filler_um> or <filler_ah>?
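For illustration, this is roughly what I mean by annotating; the `text` field name and the exact tag names are just hypothetical examples, not my actual dataset schema:

```python
# Hypothetical dataset entry, before and after annotating fillers with tags.
# The "text" field and the tag names are assumptions for illustration only.
before = {"text": "Um, I think the answer is, ah, forty-two."}
after = {"text": "<filler_um> I think the answer is <filler_ah> forty-two."}
```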
If I add such custom tags (e.g., <filler_um>, <filler_ah>, ...) to the text dataset, should I also manually update the tokenizer configuration, such as in tokenizer_config.json, to ensure they are properly recognized during training?
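For context, here is the approach I was considering: a minimal sketch using plain Hugging Face transformers (not Unsloth-specific), where the model id and the tag list are my assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id for orpheus-3b-0.1-ft; adjust as needed.
model_id = "canopylabs/orpheus-3b-0.1-ft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Register the filler tags as special tokens so the tokenizer keeps each
# tag as a single, unsplit token id.
filler_tags = ["<filler_um>", "<filler_ah>", "<filler_eh>"]
num_added = tokenizer.add_special_tokens({"additional_special_tokens": filler_tags})

# The embedding matrix needs new rows for the new token ids; these rows
# start randomly initialized and are learned during fine-tuning.
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer))

# Sanity check: each tag should map to exactly one token id.
print(tokenizer.convert_tokens_to_ids(filler_tags))
```

If this is roughly right, my understanding is that `tokenizer.save_pretrained(...)` would then write the updated special-token entries (including tokenizer_config.json) automatically, so I wouldn’t need to edit the file by hand; please correct me if that’s wrong.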
Thank you very much for taking the time to read my discussion.
I deeply appreciate all the work the Unsloth team has done to make this remarkable project available to the community. 🙏