Optional Prefix Audio #156

Open
MarisKay opened this issue Feb 25, 2025 · 10 comments

Comments

@MarisKay

MarisKay commented Feb 25, 2025

Not sure how this is supposed to work, but if the prefix is in a seductive, slow tone, the generated part is nothing like it. Shouldn't it pick up the pace, intonation, and feel?

@darkacorn
Contributor

Did you also put the transcription of the prefix audio at the start of the text?

@MarisKay
Author

MarisKay commented Feb 27, 2025

Umm, I guess not. How should I do that? Please give an example or point me to where I can read the how-to. Or wait, should I just type what the prefix says, without any special markings? I think I tried that once too, but the prefix voice was a different person's from the one in the source voice. The output was my prefix voice, followed by the rest of the text generated in my source voice with a totally different tempo and feel. The result sounded like two audio pieces from different people randomly stuck together.

@darkacorn
Contributor

You have the audio prefix as a WAV file, and its transcription has to be prepended to your regular text too; otherwise the prefix won't condition the output much.

@coezbek
Contributor

coezbek commented Feb 27, 2025

This is also explained in #14 and there is an implementation in #148

When using the prefix audio (not the reference audio) in the Gradio interface, you also need to enter the text of the prefix in the text box.
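
As a rough illustration of the advice above (a sketch only, not code from this repository): the text you submit should start with the exact words spoken in the prefix clip, followed by the new text you want generated. The helper name `build_prompt_text` and the example sentences are made up for this illustration; no special markers between the two parts are assumed.

```python
def build_prompt_text(prefix_transcript: str, target_text: str) -> str:
    """Return the prefix transcript followed by the text to generate.

    Hypothetical helper: it only prepares the string that would go into
    the text box. It does not call any TTS model.
    """
    # Collapse stray whitespace so the two parts join with a single space.
    prefix = " ".join(prefix_transcript.split())
    target = " ".join(target_text.split())
    if not prefix:
        return target
    if not target:
        return prefix
    return f"{prefix} {target}"


if __name__ == "__main__":
    # Assumed example: the prefix audio clip says "Come a little closer,"
    full_text = build_prompt_text(
        "Come a little closer,",
        "and let me tell you a secret.",
    )
    print(full_text)
```

The printed string is what you would paste into the text box while the matching clip is loaded as the prefix audio.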

@MarisKay
Author

Thanks, will try. So it should pick up the tone and feel of the prefix audio, even if it is spoken by a different voice than the one in the reference?

@xdevfaheem

> Umm, I guess not. How should I do that? Please give an example or point me to where I can read the how-to. Or wait, should I just type what the prefix says, without any special markings? I think I tried that once too, but the prefix voice was a different person's from the one in the source voice. The output was my prefix voice, followed by the rest of the text generated in my source voice with a totally different tempo and feel. The result sounded like two audio pieces from different people randomly stuck together.

I have added comments throughout the code in #148; check them out.

@MarisKay
Author

Okay, I added a prefix audio, transcribed it in front of my speech text, and generated. There is absolutely no link between the emotion, speed, and feel of the prefix audio and what comes after it in the generated result. None. It is like voices from two different life situations simply stuck together one after another. Either something is not working, or the implementation is too weak to deliver what it was intended to do.

@petermg

petermg commented Mar 17, 2025

> Okay, I added a prefix audio, transcribed it in front of my speech text, and generated. There is absolutely no link between the emotion, speed, and feel of the prefix audio and what comes after it in the generated result. None. It is like voices from two different life situations simply stuck together one after another. Either something is not working, or the implementation is too weak to deliver what it was intended to do.

I agree. I just tried this and all I am getting generated is the exact same audio I put in for the prefix. I don't think it's supposed to work that way. It seems to be broken.

@coezbek
Contributor

coezbek commented Mar 17, 2025

Can you share your code of what you tried?

@MarisKay
Author

I don't have any code; I used the Gradio interface included in the Git repository. There is no code. Everything was done through the UI.
