
Commit 9deee86

Updated Readme and requirements.txt
1 parent f367014 commit 9deee86

File tree

4 files changed: +22, -7 lines


README.md (+16 lines)
@@ -17,3 +17,19 @@ Reconnect first allows the user to listen to a sound file that contains a senten
Reconnect uses Microsoft Azure’s Speech-to-Text function to convert the user’s speech input into text. By comparing this text against the sentence provided to the user, Reconnect is able to determine whether the user’s pronunciation is adequately correct. Azure’s Text-to-Speech function is then used to generate a separate speech output from the same sentence. These two .wav files are then processed by Reconnect.
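
For readers unfamiliar with the Azure Speech SDK, the sketch below shows roughly how such a round trip could be wired up in Python. It is a minimal illustration, not the project's actual code: the key, region, and file names are placeholder assumptions.

```python
# Minimal sketch of the Azure round trip (illustrative only, not Reconnect's code).
# SPEECH_KEY, SPEECH_REGION and the .wav paths are placeholder assumptions.
import azure.cognitiveservices.speech as speechsdk

SPEECH_KEY = "your-azure-speech-key"
SPEECH_REGION = "southeastasia"

speech_config = speechsdk.SpeechConfig(subscription=SPEECH_KEY, region=SPEECH_REGION)

def recognize_user_audio(wav_path):
    """Transcribe the user's recorded .wav file into text (Speech-to-Text)."""
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
    result = recognizer.recognize_once()
    return result.text if result.reason == speechsdk.ResultReason.RecognizedSpeech else ""

def synthesize_reference_audio(sentence, out_path):
    """Generate a reference pronunciation .wav of the same sentence (Text-to-Speech)."""
    audio_config = speechsdk.audio.AudioOutputConfig(filename=out_path)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
    synthesizer.speak_text_async(sentence).get()

if __name__ == "__main__":
    print("Recognized:", recognize_user_audio("user_attempt.wav"))
    synthesize_reference_audio("The quick brown fox jumps over the lazy dog.", "reference.wav")
```
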
Reconnect uses the SciPy library to convert the sound files into audio data chunks. By using our self-developed algorithms to process the audio data’s amplitude, frequency, and breaks, Reconnect is able to determine the relative speed of vowel enunciation, and the presence of unnaturally long or short breaks between words and sentences.
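
As a rough illustration of that chunking step (the helper name, paths, and the mono downmix are assumptions, not the project's code), a .wav file can be read with SciPy and reduced to 0.1-second amplitude chunks, matching the rate/10 chunk size visible in reconnect_app/get_breaks1.py:

```python
# Illustrative sketch: average absolute amplitude over 0.1-second chunks.
import numpy as np
from scipy.io import wavfile

def wav_to_amplitude_chunks(path, chunks_per_second=10):
    rate, data = wavfile.read(path)         # rate in Hz, samples as a NumPy array
    if data.ndim > 1:                       # two-channel input: average down to mono
        data = data.mean(axis=1)
    chunk_size = rate // chunks_per_second  # samples per 0.1-second chunk
    n_chunks = len(data) // chunk_size
    samples = np.abs(data[: n_chunks * chunk_size].astype(float))
    # Mean absolute amplitude per chunk; near-zero chunks indicate breaks/silence.
    return samples.reshape(n_chunks, chunk_size).mean(axis=1), chunk_size

if __name__ == "__main__":
    amplitudes, size = wav_to_amplitude_chunks("user_attempt.wav")
    print(f"{len(amplitudes)} chunks of {size} samples each")
```
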
Finally, Reconnect compiles all of this feedback before presenting it to the user. The user is then given the opportunity to try again. The user can also type a sentence that he or she hopes to practice, and Reconnect will generate a sound file to facilitate the same learning process described above.

# Challenges we ran into
Since the team consisted of a sophomore and two freshmen with less technical backgrounds, we ran into a lot of difficulties. This was the first time we had ever worked with APIs, and it was difficult to get everything working together. In the beginning, we did not think about the number of channels in the audio input. For the text comparison, we also had to mind the lengths of the expected text and the received text: the typed text had to be preprocessed so that it did not contain any special characters, while the expected text had to be preprocessed to omit unintended filler words like “oh”, “umm”, etc. (a rough sketch of this preprocessing follows the list below). Since none of us had much experience in web development, a significant challenge was getting audio input from the user:

- Microphone input
- Two-channel audio files and wave-comparison algorithms for them

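
One plausible way to handle the text preprocessing mentioned above is sketched here; the function name and filler-word list are illustrative assumptions rather than Reconnect's actual code.

```python
# Illustrative sketch: normalize the typed sentence and the comparison text
# before matching them word by word. The filler-word list is an assumption.
import re

FILLER_WORDS = {"oh", "umm", "um", "uh"}

def normalize(text, drop_fillers=False):
    text = re.sub(r"[^a-z\s]", "", text.lower())  # strip punctuation/special characters
    words = text.split()
    if drop_fillers:
        words = [w for w in words if w not in FILLER_WORDS]  # drop hesitation words
    return words

expected = normalize("Hello, world! How are you?")
received = normalize("umm hello world how are you", drop_fillers=True)
print(expected == received)  # True: lengths now match, word for word
```
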
# Accomplishments that we're proud of
Despite all the challenges, we are proud that we successfully built Reconnect, an interactive platform where people can practice speaking to reconnect with the world. Helping thousands of people worldwide transition from impaired hearing to speaking effectively is a great source of satisfaction for our team.

# What we learned
Being new to hackathons, we were initially unsure whether we should go forward with this idea because of its technical complexity. It was only the second hackathon for all of us and our first time ever using any sort of API. However, we decided to take up the challenge, and in the end it worked. In addition to learning more about programming, using APIs, and building a website, we learned to think big and apply our knowledge to make an impact on people’s lives.

# What's next for Reconnect
- consulting with medical professionals to get effective strategies for speech reconstruction
- expanding to different languages
- turning Reconnect into an actual learning platform with the ability to track progress and try different strategies

WriteUpForLocalHackDay.docx (-15.1 KB, binary file not shown)

reconnect_app/get_breaks1.py (+6, -7 lines)
@@ -110,13 +110,13 @@ def check_sensibility_of_breaks(self, speaker_breaks, correct_breaks):
             if (speaker_end - speaker_start) > 1.5:
                 self.result["long_breaks"].append(speaker_breaks[i])
             elif (speaker_break_time - correct_break_time) > 0.30:
-                self.result["long_breaks"].append((speaker_breaks[i][0] + self.input_sound_start_snip, speaker_breaks[i][1] + self.input_sound_start_snip))
+                self.result["long_breaks"].append((speaker_start + self.input_sound_start_snip, speaker_end + self.input_sound_start_snip))
             elif (correct_break_time - speaker_break_time) > 0.30:
-                self.result["short_breaks"].append((speaker_breaks[i][0] + self.input_sound_start_snip, speaker_breaks[i][1] + self.input_sound_start_snip))
-            if (speaker_start - correct_start) > (last_time_difference + 0.5):
-                self.result["long_pronunciation"].append((speaker_breaks[i-1][1] + self.input_sound_start_snip, speaker_breaks[i][0] + self.input_sound_start_snip))
-            elif (correct_start - speaker_start) > (last_time_difference + 0.5):
-                self.result["short_pronunciation"].append((speaker_breaks[i-1][1] + self.input_sound_start_snip, speaker_breaks[i][0] + self.input_sound_start_snip))
+                self.result["short_breaks"].append((speaker_start + self.input_sound_start_snip, speaker_end + self.input_sound_start_snip))
+            if (speaker_start - correct_start) > (last_time_difference + 0.5) and i > 0:
+                self.result["long_pronunciation"].append((speaker_breaks[i-1][1] + self.input_sound_start_snip, speaker_start + self.input_sound_start_snip))
+            elif (correct_start - speaker_start) > (last_time_difference + 0.5) and i > 0:
+                self.result["short_pronunciation"].append((speaker_breaks[i-1][1] + self.input_sound_start_snip, speaker_start + self.input_sound_start_snip))
             last_time_difference = abs(correct_end - speaker_end)

     def remove_audio_wave_silence(self, audio_data, rate, min=None):
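
(Reading the hunk above: the rewritten lines reuse the speaker_start and speaker_end values already unpacked from speaker_breaks[i], and the added `and i > 0` guards appear intended to keep the pronunciation checks from indexing speaker_breaks[i-1] on the first iteration.)
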
@@ -165,7 +165,6 @@ def convert_audio_data_to_chunk_audio_data(self, audio_data, rate):
             chunk_audio_data[i] = chunk_audio_data[i] / (int(rate) / 10)
         return chunk_audio_data, int(rate/10)

-
 if __name__ == "__main__":
     # web_file="C:\Users\Samuel\PycharmProjects\speech_analysis\wave_comparison"
     #

~$iteUpForLocalHackDay.docx (-162 Bytes, binary file not shown)
