NEMO ASR on the audio recorded with phone #2127
-
I have not experimented with this enough, but my first impression was that results were already worse with my studio microphone when dynamic compression and leveling were turned off. So to begin with I will use hardware dynamic compression. Once I am sure that this is the biggest issue, I will do the compression in software as well.
-
OK, thanks for replying. I now need to use NeMo on audio recorded with a phone or laptop microphone. Is there any way to improve the audio quality before giving it to the model? How can I get good-quality speech-to-text from phone recordings or Zoom audio?
-
Before going into development, first try some experiments. Take an audio processing tool that can do dynamic compression and limiting, and ideally proper noise reduction as well. Manually prepare a few quality variants of the same file and compare the results. Then you will have a clear picture of what makes sense to do and what does not.
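If you later want to script that experiment instead of using a GUI tool, a rough sketch could look like the one below. It is only an illustration: the compressor here is deliberately simplified and memoryless (no attack or release times, unlike a real hardware or plugin compressor), and the function names and default values are my own assumptions, not anything from NeMo. Samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math


def peak_normalize(samples, target_peak=0.9):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def compress(samples, threshold=0.5, ratio=4.0):
    """Simplified static compressor.

    The part of each sample's magnitude above `threshold` is divided by
    `ratio`. There is no envelope follower, so this is only a rough
    stand-in for a real dynamic range compressor.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(math.copysign(level, s))
    return out


def hard_limit(samples, ceiling=0.95):
    """Clamp every sample to the range [-ceiling, ceiling]."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def make_variants(samples):
    """Build a few processed variants of one recording for comparison."""
    return {
        "original": list(samples),
        "normalized": peak_normalize(samples),
        "compressed": peak_normalize(compress(samples)),
        "limited": hard_limit(samples),
    }
```

Each variant can then be written to its own file and transcribed with the same model, so the transcripts can be compared side by side.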
-
@kruthikakr Could you share some samples that you are seeing an issue with? Most of our training data consists of clean studio recordings but you should get reasonable transcripts for noisy audio say up to 20 dB SNR. Could you also share which model/models you experimented with?
You could fine-tune the model with noise augmentation to improve its noise robustness; we also have a noise-robust QuartzNet model that you could try. We have a new release coming up (in a couple of weeks) with better models trained on more conversational data that may fit your application.
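To make the SNR figure and the idea of noise augmentation concrete, here is a minimal sketch of mixing a noise recording into clean speech at a chosen SNR. It is not NeMo's augmentation code (NeMo's own augmentation configuration is not shown here); `rms`, `snr_db`, and `mix_at_snr` are hypothetical helper names, and samples are assumed to be plain lists of floats.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, computed from RMS levels."""
    return 20.0 * math.log10(rms(signal) / rms(noise))


def mix_at_snr(clean, noise, target_snr_db):
    """Add `noise` to `clean`, scaled so the result has the target SNR.

    Returns (mixed, scaled_noise). The noise must be at least as long
    as the clean signal; any extra noise samples are dropped.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[: len(clean)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent, cannot scale it to a target SNR")
    # Choose the gain so that rms(clean) / rms(gain * noise) == 10^(SNR/20).
    gain = rms(clean) / (noise_rms * 10.0 ** (target_snr_db / 20.0))
    scaled = [n * gain for n in noise]
    mixed = [c + n for c, n in zip(clean, scaled)]
    return mixed, scaled
```

Mixing the same clean utterance at, say, 30, 20, and 10 dB and transcribing each version gives a rough picture of where the transcripts start to degrade.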
-
1. We tried one Zoom-recorded audio file in m4a format (7.62 MB) and converted it with "ffmpeg" to the .wav format that NeMo requires (15 MB). If there is an option to pass m4a directly, please let us know the script. Both audio files are shared here (audio_data.zip).
3. Results using the "QuartzNet15x5NR-En" model: the transcript is pasted here.
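For reference, the conversion step described above can be scripted roughly as follows. This is a sketch, not the exact command used in this thread: it assumes `ffmpeg` is installed and on the PATH, and it assumes the model expects 16 kHz mono 16-bit PCM, which is worth checking against the model's documentation. The function names and file names are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Build an ffmpeg command that converts `src` to a mono PCM WAV file.

    -y           overwrite the output file if it exists
    -ac 1        downmix to one channel
    -ar RATE     resample to the given sample rate (16 kHz is an assumption)
    -c:a pcm_s16le  encode as 16-bit little-endian PCM
    """
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-ac", "1",
        "-ar", str(sample_rate),
        "-c:a", "pcm_s16le",
        dst,
    ]


def convert_to_wav(src, dst, sample_rate=16000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
```

Usage would be something like `convert_to_wav("meeting.m4a", "meeting.wav")`, where both file names are placeholders.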
-
Please also suggest a suitable tool or script for recording audio with a laptop or phone microphone so that we get a good transcript. It would be a great help, thank you.
-
@jbalam-nv Could you please check the sample audio and its transcript and give us some input on how to improve the result?
-
Question:
I am trying to run NeMo on audio recorded with a phone or in a Zoom meeting, but the text output is very bad. When the same audio is recorded with a good microphone, the text output is accurate.
Is there any reference for audio quality? How can I check the quality of the audio, and how can I use NeMo for real-world applications?
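As a starting point for checking a recording, a few basic properties can be read straight from the WAV file: sample rate, channel count, peak level, and how many samples sit at full scale (a sign of clipping). The sketch below uses only the Python standard library and assumes a 16-bit PCM WAV file; `inspect_wav` is a hypothetical helper, and these numbers are only a rough indicator, not an official quality reference for NeMo.

```python
import math
import struct
import wave


def inspect_wav(path):
    """Report basic properties of a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        sample_width = wf.getsampwidth()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    # Unpack little-endian signed 16-bit samples (all channels interleaved).
    count = len(raw) // 2
    samples = struct.unpack("<%dh" % count, raw[: count * 2])

    peak = max((abs(s) for s in samples), default=0) / 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / count) / 32768.0 if count else 0.0
    clipped = sum(1 for s in samples if abs(s) >= 32767)

    return {
        "channels": channels,
        "sample_rate": sample_rate,
        "duration_s": n_frames / float(sample_rate),
        "peak": peak,              # 1.0 means full scale
        "rms": rms,                # very low values suggest a quiet recording
        "clipped_samples": clipped,
    }
```

A very low RMS, many clipped samples, or a sample rate or channel count that differs from what the model was trained on are all worth ruling out before looking at the model itself.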