Is there a way to edit the results of Whisper's transcription of audio files before you start training? #158
-
Hello, I am a complete novice just starting with voice cloning, and I have decided on Tortoise TTS as my tool of choice. After transcribing a bunch of samples using WhisperX large-v2, I went through the train.txt and validation.txt files and noticed some words were not properly recognized due to a heavy accent. My question is: can these errors simply be fixed by editing the text files, or will that cause issues because of the .json file? I would also like to ask whether it's advisable to remove the clips that Whisper could not transcribe (about 25% of the data set). Thank you.
-
Hello. Yes, you can simply edit the txt file; it won't cause any problems with the json. The json file only holds the parameters for the training, not the text itself.
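
If there are many lines to fix, editing by hand gets tedious, so here is a rough sketch of doing it with a small script. It assumes each line of train.txt / validation.txt looks like `audio/clip.wav|transcribed text` (check your own files first; that layout is an assumption, not something confirmed in this thread). The helper names `apply_corrections` and `rewrite_file` and the clip names in the usage note are made up for illustration.

```python
from pathlib import Path


def apply_corrections(lines, corrections, drop=()):
    """Return transcript lines with corrected text, skipping clips in `drop`.

    Assumes each line is "<audio path>|<text>" (an assumption about the
    file layout). `corrections` maps an audio path to its fixed text.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first "|" only, so a "|" inside the text survives.
        audio, sep, text = line.partition("|")
        if not sep:
            out.append(line)  # not in the assumed format: leave untouched
            continue
        if audio in drop:
            continue  # clip removed from the list entirely
        out.append(f"{audio}|{corrections.get(audio, text)}")
    return out


def rewrite_file(path, corrections, drop=()):
    """Back up `path` to `<name>.bak`, then write the corrected lines."""
    path = Path(path)
    original = path.read_text(encoding="utf-8")
    path.with_name(path.name + ".bak").write_text(original, encoding="utf-8")
    fixed = apply_corrections(original.splitlines(), corrections, drop)
    path.write_text("\n".join(fixed) + "\n", encoding="utf-8")
```

Usage would be something like `rewrite_file("train.txt", {"audio/clip_0007.wav": "the corrected sentence"})`, with the clip name being a placeholder. The optional `drop` argument only shows how clips could be taken out of the list mechanically; whether removing them is a good idea for training is a separate question that this sketch does not answer.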