Sentiment-Analysis-using-HMM

ASSIGNMENT 3

Background: Our sentiment tagging system uses the Viterbi algorithm to traverse a lattice of states, where each state represents a possible prediction of the sentiment over the context traversed so far. Sentiment tags are predicted sequentially as the algorithm moves forward through the sentence.

The HMM uses two kinds of probabilities to track the sentiment state of the current sentence. The emission probability ties a sentiment tag to a word through the word's frequency of occurrence in the training data. The transition probability gives the current state of the system conditioned on the state the system was in previously. The advantage is that if there is enough evidence that the system is no longer analyzing a positive/negative/neutral sentence (whatever the current state), it can transition to a better, more probable state. That decision takes into account the previous state of the system, the current probability, and the probability that the next word will trigger a transition. Under the first-order Markov assumption, each state depends only on the state immediately preceding it.
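
The write-up includes no code, so the following is only a minimal sketch of the decoding step it describes; the state names, the parameter dictionaries (`start_p`, `trans_p`, `emit_p`), and the small floor probability for unseen words are illustrative assumptions, not values from the assignment:

```python
import math

# Illustrative state inventory; the assignment's actual tag set may differ.
STATES = ["positive", "neutral", "negative"]

def viterbi(words, start_p, trans_p, emit_p):
    """Most probable state sequence for `words`, computed in log space.

    start_p[s]    -- probability of starting in state s
    trans_p[p][s] -- probability of moving from state p to state s
    emit_p[s][w]  -- probability of word w under state s
    """
    # trellis[t][s] = (log prob of best path ending in s, backpointer)
    trellis = [{s: (math.log(start_p[s]) +
                    math.log(emit_p[s].get(words[0], 1e-6)), None)
                for s in STATES}]
    for word in words[1:]:
        column = {}
        for s in STATES:
            # Best previous state from which to reach s at this position.
            prev = max(STATES,
                       key=lambda p: trellis[-1][p][0] + math.log(trans_p[p][s]))
            score = (trellis[-1][prev][0] + math.log(trans_p[prev][s]) +
                     math.log(emit_p[s].get(word, 1e-6)))
            column[s] = (score, prev)
        trellis.append(column)
    # Recover the path by following backpointers from the best final state.
    state = max(STATES, key=lambda s: trellis[-1][s][0])
    path = [state]
    for column in reversed(trellis[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))
```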

Calculating the emission probabilities is straightforward: the count of a word under a state divided by the count of that word across all possible states, i.e. the denominator sums the word's count over every state.
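
A small sketch of that count ratio, assuming word-level (word, tag) pairs are available (for instance, each word inheriting its sentence's label, or the per-word classes produced by the classifier described next):

```python
from collections import defaultdict

def emission_probs(tagged_words):
    """The ratio described above: the count of a word under a state,
    divided by the word's count summed over all states."""
    counts = defaultdict(lambda: defaultdict(int))  # counts[word][tag]
    for word, tag in tagged_words:
        counts[word][tag] += 1
    return {word: {tag: n / sum(tags.values()) for tag, n in tags.items()}
            for word, tags in counts.items()}
```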

The challenging and more abstract part was calculating the transition probabilities, since we only had a sentence-level sentiment and no sentiment attached to each word. We therefore improvised: using the dictionary of training words occurring under each tag, a Bayesian classifier assigns each word its own sentiment, and these per-word classes are then used to record transition counts. If the word at position t is tagged negative and the word at t+1 positive, the count for the negative-to-positive transition increases. With this simple mechanism we were able to calculate the transition probabilities and plug them in to predict a sentence-level sentiment for each line of a review.
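
A sketch of that two-step recipe; `word_counts` (the per-tag word dictionaries), `tag_totals`, and the add-one smoothing inside the classifier are assumptions, since the write-up does not spell out the exact estimator:

```python
from collections import defaultdict

def classify_word(word, word_counts, tag_totals, vocab_size):
    """Hypothetical Bayes step: choose the tag t maximising
    P(t) * P(word | t), estimated from the per-tag word dictionaries.
    Add-one smoothing keeps unseen words from zeroing every tag."""
    total = sum(tag_totals.values())
    return max(tag_totals, key=lambda t:
               (tag_totals[t] / total) *
               (word_counts[t].get(word, 0) + 1) / (tag_totals[t] + vocab_size))

def transition_probs(sentences, word_counts, tag_totals, vocab_size):
    """Tag every word with the classifier, then count tag-to-tag moves:
    a negative word followed by a positive one increments the
    negative -> positive cell, as described above."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tags = [classify_word(w, word_counts, tag_totals, vocab_size)
                for w in sentence]
        for prev, cur in zip(tags, tags[1:]):
            counts[prev][cur] += 1
    return {p: {c: n / sum(row.values()) for c, n in row.items()}
            for p, row in counts.items()}
```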

We also implemented an alpha-smoothing scheme to compensate for words that may not appear in the training data and are therefore out of vocabulary (OOV). For such words the smoothing function assigns a small smoothed likelihood, allowing them to compete and contribute to the sentiment of the sentence.
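
The exact smoothing formula is not given; a standard additive-smoothing sketch consistent with the description would be:

```python
def smoothed_emission(word, tag, counts, states, alpha):
    """Additive (alpha) smoothing over the count ratio described above.
    A word never seen in training gets a uniform 1 / len(states),
    so it can still compete in the Viterbi search."""
    tag_count = counts.get(word, {}).get(tag, 0)
    total = sum(counts.get(word, {}).get(s, 0) for s in states)
    return (tag_count + alpha) / (total + alpha * len(states))
```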

Varying the value of alpha has a direct effect on the accuracy of the system, as the scores below show:

| alpha | score   |
|-------|---------|
| 0.01  | 0.37908 |
| 0.05  | 0.38235 |
| 0.1   | 0.3862  |
| 1     | 0.38725 |
| 2     | 0.39706 |
| 10    | 0.40523 |
| 50    | 0.40359 |
| 100   | 0.40359 |