Great model! Can you explain how to use the audio prefix described in the readme? Should the prefix be added directly in the text prompt? Can you provide some examples?
You upload a prefix speech sample in the Gradio interface (3-5 s is recommended, but you can go to 20 s or more if you like), transcribe it, and put the transcript into the text box. Then add the text you want the speech to continue with.
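To make the text side concrete, here is a minimal sketch of how the text box content could be assembled. It assumes the text box takes the prefix transcript followed directly by the continuation text in a single field, which is one reading of the description above; `build_prefix_prompt` is a hypothetical helper name, not part of the project.

```python
def build_prefix_prompt(prefix_transcript: str, continuation: str) -> str:
    """Join the transcript of the prefix audio and the new text into one prompt.

    Hypothetical helper: assumes the transcript comes first and the
    continuation follows it in the same text field.
    """
    parts = [prefix_transcript.strip(), continuation.strip()]
    # Drop empty parts so a missing transcript does not leave a stray space.
    return " ".join(p for p in parts if p)


if __name__ == "__main__":
    text = build_prefix_prompt(
        "Thanks for calling, how can I help?",  # what the prefix audio says
        "Let me look that up for you.",         # what should be spoken next
    )
    print(text)
```

The transcript should match what is actually said in the prefix audio, since that audio is what the generated speech continues from.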
This is typically not necessary, but it is helpful if you want a very high level of control over the generation or even more speaker cloning fidelity.
We typically recommend just putting a few milliseconds of pure silence there, which conditions the model to produce high-quality output.
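If you don't have a silent clip handy, one way to make one is with Python's standard library. This is only a sketch: the 100 ms duration, the 44100 Hz sample rate, the mono 16-bit format, and the `silence_prefix.wav` file name are all assumptions chosen for illustration, not values required by the model.

```python
import io
import wave


def make_silence_wav(duration_ms: int = 100, sample_rate: int = 44100) -> bytes:
    """Return the bytes of a mono 16-bit PCM WAV file containing pure silence."""
    n_frames = sample_rate * duration_ms // 1000
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)            # mono (assumption)
        wf.setsampwidth(2)            # 16-bit samples
        wf.setframerate(sample_rate)  # assumed sample rate
        wf.writeframes(b"\x00\x00" * n_frames)  # all-zero samples are silence
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical output file name; upload this file as the audio prefix.
    with open("silence_prefix.wav", "wb") as f:
        f.write(make_silence_wav())
```

With a silent prefix there is nothing to transcribe, so the text box only needs the text you want spoken.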
If you do put a speech sample there instead of silence, it can help to set the other conditioning inputs to match your prefix audio, or just set them to uncond (unconditional) and let the model figure it out.