Home

Welcome to the AudFeature_extraction wiki!

for the file feature_extraction.py

The features it extracts, based on the pitch values, are:

min_pitch
max_pitch
mean_pitch
num_voice_breaks
percentage_breaks
speak_rate
num_pause
Total_dur_pause
num_rise
num_fall
duration_file (total duration of the audio file)
play_time

The logic for finding the different features is:

min_pitch = apply the min function over all values of the pitch NumPy array.
max_pitch = apply the max function over all values of the pitch NumPy array.
mean_pitch = apply the mean function over all values of the pitch NumPy array.
num_voice_breaks = whenever the pitch changes from zero to some value, or from some value to zero, there is a voice break. I count all such occurrences.
percentage_breaks = the total number of voice breaks divided by the length of the pitch array.
speak_rate = the speaking rate. I convert the speech to text, compute the play time by subtracting the pause time from the total duration, and then divide the number of words by the play time. The result is the number of words spoken per second.
num_pause = a pause is wherever the pitch is zero.
Total_dur_pause = find the times at which the pitch is zero and add up the corresponding durations.
duration_file = the total number of frames divided by the frame rate.
play_time = the duration of the audio file minus the pause time.
num_rise = a rise is counted when the pitch is increasing; pitch depends on the frequency as well as the amplitude.
num_fall = a fall is counted when the pitch is decreasing.
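
The logic above can be sketched roughly as follows. This is a minimal illustration, not the code in feature_extraction.py: the function names (pitch_features, wav_duration, speak_rate) and the frame_dur parameter are hypothetical, the pitch array is assumed to hold one value per frame with 0 where no pitch was detected, and counting each unbroken run of zero frames as one pause is an assumption. The speech-to-text step is left out, so the word count is passed in directly.

```python
import wave

import numpy as np


def pitch_features(pitch, frame_dur):
    """Summarise a per-frame pitch track (0 where no pitch was detected).

    frame_dur is the time covered by one pitch frame in seconds
    (assumed to be constant for the whole track).
    """
    pitch = np.asarray(pitch, dtype=float)
    if pitch.size == 0:
        raise ValueError("pitch array is empty")

    voiced = pitch > 0
    silent = ~voiced

    # A voice break is any zero -> value or value -> zero transition.
    num_voice_breaks = int(np.count_nonzero(voiced[1:] != voiced[:-1]))

    # Assumption: one pause = one unbroken run of zero-pitch frames.
    num_pause = int(silent[0]) + int(np.count_nonzero(silent[1:] & voiced[:-1]))

    # Rises and falls: compare neighbouring frames where both are voiced.
    step = np.diff(pitch)
    both_voiced = voiced[1:] & voiced[:-1]

    return {
        # min/max/mean run over the whole array, so zero frames are included;
        # filter with pitch[pitch > 0] first if only voiced frames should count.
        "min_pitch": float(pitch.min()),
        "max_pitch": float(pitch.max()),
        "mean_pitch": float(pitch.mean()),
        "num_voice_breaks": num_voice_breaks,
        "percentage_breaks": num_voice_breaks / pitch.size,
        "num_pause": num_pause,
        "Total_dur_pause": float(np.count_nonzero(silent) * frame_dur),
        "num_rise": int(np.count_nonzero(both_voiced & (step > 0))),
        "num_fall": int(np.count_nonzero(both_voiced & (step < 0))),
    }


def wav_duration(path):
    """duration_file: total number of frames divided by the frame rate."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def speak_rate(num_words, duration_file, total_dur_pause):
    """Words per second of play time (duration minus pause time)."""
    play_time = duration_file - total_dur_pause
    if play_time <= 0:
        raise ValueError("play time must be positive")
    return num_words / play_time
```

For example, pitch_features(pitch, 0.01) would treat each entry of pitch as a 10 ms frame, and speak_rate(len(text.split()), wav_duration(path), feats["Total_dur_pause"]) would combine the pieces once the transcript text is available.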

audio_graph.py

This file plots the audio in different representations, such as the spectrogram, spectral roll-off, spectral centroid, and MFCC. It uses the Python library librosa. To see the result, run:

foo@bar python audio_graph.py /audio/human.wav

The command above displays some of the important features as graphs:

Spectrogram
Zero-crossing rate
Zoomed in views
Spectral centroid
Spectral roll off
MFCC

The spectrogram is important because it can easily be used as an input feature to a neural network, which can then extract important features from it.

for file extra_feature_extract.py

This file contains code that extracts some of the major and most important features from the audio file. These features matter more than the others because together they cover most of what is needed when training a model.

The features it extracts are:

ZCR
Energy
Entropy of energy
Spectral centroid
Spectral Entropy
Spectral Flux
Spectral Roll off
MFCC
Chroma vector
Chroma deviation

The corresponding feature names are: 'zcr', 'energy', 'energy_entropy', 'spectral_centroid', 'spectral_spread', 'spectral_entropy', 'spectral_flux', 'spectral_rolloff', 'mfcc_1', 'mfcc_2', 'mfcc_3', 'mfcc_4', 'mfcc_5', 'mfcc_6', 'mfcc_7', 'mfcc_8', 'mfcc_9', 'mfcc_10', 'mfcc_11', 'mfcc_12', 'mfcc_13', 'chroma_1', 'chroma_2', 'chroma_3', 'chroma_4', 'chroma_5', 'chroma_6', 'chroma_7', 'chroma_8', 'chroma_9', 'chroma_10', 'chroma_11', 'chroma_12', 'chroma_std'. The values are displayed as an array.
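
To give a feel for what a few of these features measure, here is a small NumPy sketch for a single frame of audio. It is not the code in extra_feature_extract.py, whose exact definitions are not given here: the function name frame_features is hypothetical, and the formulas (ZCR as the fraction of sign changes, energy as the mean squared sample, a 90% roll-off threshold) are common textbook choices assumed for illustration.

```python
import numpy as np


def frame_features(frame, sr, rolloff_fraction=0.9):
    """Return a few short-term features for one frame of samples.

    The definitions below are assumed textbook versions, not necessarily
    the ones used by extra_feature_extract.py.
    """
    frame = np.asarray(frame, dtype=float)
    if frame.size < 2:
        raise ValueError("frame needs at least two samples")

    # ZCR: fraction of neighbouring sample pairs whose sign differs.
    negative = np.signbit(frame)
    zcr = np.count_nonzero(negative[1:] != negative[:-1]) / (frame.size - 1)

    # Energy: mean of the squared samples.
    energy = float(np.mean(frame ** 2))

    # Magnitude spectrum and the frequency (Hz) of each bin.
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sr)

    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = float(np.sum(freqs * mag) / (np.sum(mag) + 1e-12))

    # Spectral roll-off: lowest frequency below which the chosen
    # fraction of the total spectral energy lies.
    cumulative = np.cumsum(mag ** 2)
    idx = int(np.searchsorted(cumulative, rolloff_fraction * cumulative[-1]))
    rolloff = float(freqs[min(idx, freqs.size - 1)])

    return {
        "zcr": zcr,
        "energy": energy,
        "spectral_centroid": centroid,
        "spectral_rolloff": rolloff,
    }
```

Applying frame_features to successive frames of the signal and stacking the results column by column would give an array with one row per feature, similar in shape to the output described above.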
