Adding Turkish language support #223
@g-hano Did you train with Turkish data?
Yes, I did. I had to update many lines, but I was finally able to train on 4 T4s. Extra training on 46k audio samples made the model produce clearer outputs.
Hello, and first of all, thank you for adding Turkish support to the TTS model. While reviewing your code, I noticed that you convert the text to lowercase during normalization. Unfortunately, a plain lower() converts uppercase "I" to "i", which is incorrect for Turkish: the lowercase of "I" is the dotless "ı". As a solution, the following change is needed: text.replace("I", "ı").lower(). Since you are using lowercase text, I recommend trying ytu-ce-cosmos/turkish-base-bert-uncased.
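To make the casing issue concrete, here is a minimal sketch of a Turkish-aware lowercasing helper. The function name `turkish_lower` is hypothetical and is not taken from this PR. It also maps the dotted capital "İ" (U+0130) to "i", which goes beyond the one-line fix suggested above: Python's default `str.lower()` turns "İ" into "i" followed by a combining dot (U+0307), which may end up as an unexpected token.

```python
def turkish_lower(text: str) -> str:
    """Lowercase text using Turkish casing rules for I and İ.

    Hypothetical helper for illustration; not the code from this PR.
    """
    # Handle the two Turkish-specific capitals first, because str.lower()
    # is locale-independent and would map "I" -> "i" and "İ" -> "i" + U+0307.
    text = text.replace("I", "\u0131")   # I -> ı (dotless i)
    text = text.replace("\u0130", "i")   # İ -> i (dotted i)
    return text.lower()


if __name__ == "__main__":
    print(turkish_lower("ISPARTA"))         # dotless ı expected for the first letter
    print(turkish_lower("\u0130STANBUL"))   # plain dotted i expected for the first letter
```

Doing the two replacements before `lower()` keeps the helper dependency-free. Whether the extra "İ" mapping is needed depends on how the downstream tokenizer treats combining marks, so treat it as an assumption to verify.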
Handle Turkish character 'I' and change tokenizer to ytu-ce-cosmos/turkish-base-bert-uncased
Thank you for your recommendations. I updated the code.
I created turkish.py and turkish_bert.py to support the Turkish language. Used dbmdz/bert-base-turkish-cased as the tokenizer.